Saturday, November 1, 2008

fdc suspend/resume support

I've also updated fdc (the floppy disk controller driver) so that it supports suspend/resume even if a floppy drive is present. There is one caveat, though: the drive must be empty. Still, it's far easier to pull a disk out of a drive (if you even have one -- most floppy drives these days probably see very little, if any, use) than to disconnect the drive entirely.

You can download the file from the device drivers download page (look for a file named fdc-2008-11-01.tar.gz). A webrev of the changes is also posted.

For the curious, the reason for the above caveat is best stated by the following explanation extracted from a comment in the code:
Bad, bad, bad things will happen if someone changes the disk in the drive while it is mounted and the system is suspended. We have no way to detect that. (Undetected filesystem corruption. It's akin to changing the boot disk while the system is suspended. Don't do it!)

So we refuse to suspend if there is media present in the drive. This limits some of the usability of suspend/resume, but it certainly avoids this potential filesystem corruption from pilot error. Given the decreasing popularity of floppy media, we don't see this as much of a limitation.

Anyway, I suspect these changes may have little value to laptops, but may help a lot of desktops. It certainly addresses the problem of SUSPEND for my Dell Precision M390.
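
To make the policy in that comment concrete, here is a minimal user-space sketch of the check. It is a model only, not the fdc source: the struct, the function name, and the MODEL_* return codes are hypothetical stand-ins (a real driver would return DDI_SUCCESS or DDI_FAILURE from its suspend path and would query the hardware for media).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for DDI_SUCCESS / DDI_FAILURE. */
enum { MODEL_SUCCESS = 0, MODEL_FAILURE = -1 };

/* Hypothetical per-drive state; the real driver's soft state differs. */
struct model_drive {
	bool present;		/* a drive is attached to the controller */
	bool media_loaded;	/* a disk is currently in that drive */
};

/*
 * Decide whether a suspend request may proceed.  Suspend is refused
 * as soon as any attached drive holds media, because a disk swapped
 * while the system sleeps could not be detected on resume.
 */
static int
model_suspend(const struct model_drive *drives, int ndrives)
{
	for (int i = 0; i < ndrives; i++) {
		if (drives[i].present && drives[i].media_loaded)
			return (MODEL_FAILURE);
	}
	return (MODEL_SUCCESS);
}
```

With one attached but empty drive, model_suspend() returns MODEL_SUCCESS; load a disk into any attached drive and it returns MODEL_FAILURE, which is the point at which the system would decline to suspend.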

Update: Thanks to Jürgen Keil's testing, we have found a bug, which I've fixed. The webrev and the binaries have been updated accordingly. Look for the 11-04 release instead.

iprb updated with suspend/resume & quiesce

I've updated iprb (Intel Pro/100 Ethernet driver) to support suspend/resume and quiesce (fast reboot). I've also made some other changes, such as updating to the latest microcode from Intel, fixing some potential races, and improving the internal implementation of the statistics routine.

I'd like folks to try it out on their Intel or AMD platforms. The tarball can be downloaded from here (look for iprb-2008-11-01.tar.gz).

I still want to get this open sourced, but I haven't got approval from the lawyers yet. But here are the binary bits in case anyone is waiting for them.

(I'm not planning on porting this driver to SPARC. There are too many places in the code that would have to be changed to use endian-safe calls to ddi_get, and I think there is probably very little demand for a SPARC version of the driver. If you are one of the people who do want SPARC support for iprb, please let me know.)
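
For readers wondering what "endian-safe" means here: PCI devices typically lay out their registers and shared structures little-endian, while SPARC is big-endian, so a plain load of a multi-byte field gives the wrong value there. In a driver, the ddi_get16(9F) family handles this through the access handle. The following is only a user-space analogue of the idea, with hypothetical helper names, not code from iprb.

```c
#include <stdint.h>

/*
 * Read a 16-bit little-endian field from a byte buffer.  Assembling
 * the value byte by byte gives the same result on little-endian and
 * big-endian hosts, unlike a direct *(uint16_t *) load.
 */
static uint16_t
le16_read(const uint8_t *p)
{
	return ((uint16_t)(p[0] | ((uint16_t)p[1] << 8)));
}

/* Store a 16-bit value into a byte buffer in little-endian order. */
static void
le16_write(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)(v >> 8);
}
```

Every direct structure access in a driver written only for x86 would need to be replaced by something equivalent, which is why the port is more work than it sounds.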

Enjoy!

Tuesday, October 28, 2008

Boomer: Next Generation Audio for Solaris

I've posted draft inception materials for our PSARC case, 2008/318, for Boomer. Boomer is the code name for the effort to integrate a modern OSS-compatible API, and additional drivers, into Solaris.

PSARC inception review for this case is scheduled for November 5, 2008. Please check the ARC web page for call in details (any engineer in the community may dial in.) Be advised that the meeting is an engineering meeting, though. (PSARC provides engineering architectural review for projects.)

We have a working prototype for most of the bits we are ARC'ing at this point, and I expect to be opening up the source code tree, and binary test bits, in the coming weeks. Perhaps by Thanksgiving.

Monday, October 13, 2008

Sun position available in Austin

A new position in Austin, TX has opened in my group (onsite only). This is for a junior or mid-level administrator or QA type person -- great for someone who loves playing with new hardware, software, etc. No specific programming skills beyond everyday scripting are required (though perl and C would probably be useful.) Send replies back to me. (Full disclosure: I get a referral bonus for anyone I refer.)

Thursday, October 9, 2008

Microsoft & NetBSD

Some folks might be interested in this. I used to hack on NetBSD at my previous job. I recently had an inquiry from a recruiter at Microsoft (don't worry Neal, I'm not going anywhere! :-) which I thought I'd share here.

Location: Redmond, WA
Client: Microsoft.
Job Description:

Seeking a talented NetBSD software developer interested in helping Danger (a subsidiary of Microsoft) ship the next generation of Danger’s Sidekick platform. Specifically, we’re looking for NetBSD developers interested in commercializing the NetBSD platform for an embedded mobile computing device, focusing on performance and optimization, bug fixing, and integration with Danger’s higher-level platform code, with an emphasis on kernel and driver support.
Qualifications:
- 8+ years of software development experience
- 8+ years experience with C
- Strong understanding of OS concepts (particularly with NetBSD) such as multi-threaded program design and synchronization, processes & memory protection, etc.
- Strong communication skills
- Strong understanding of the NetBSD/GNU software development process and embedded development & debugging techniques
- Deep understanding of NetBSD, including timers, RPC, TCP/IP, etc.
- Experience with ARM processors highly desirable

So, not Windows CE or some other Windows platform. Cool. Wonder when they'll start posting job offers for Solaris engineers? (Probably as soon as Solaris is useful for small embedded devices -- i.e. I wouldn't hold my breath. :-)

Friday, October 3, 2008

New Stuff in OpenSolaris 2008.11

OpenSolaris 2008.11 is coming soon. Build 101 is the stabilization build before it, and as usual, new features are excluded from this build. So we can say pretty certainly what features are in, and what are out.

While there are a lot of new features coming in this release, there are some in particular that I've been more involved with.

  1. SDcard support. Numerous laptops can now use SDcard and MultiMediaCard media directly. A good way to know if your laptop is one of them is to search for pciclass,0805 in the output of prtconf -vp. While it isn't completely conclusive (in particular some models from Texas Instruments are schizophrenic here), if you see this, there's an excellent chance it will work. I'd always like to hear feedback about this feature -- if your unit works, or doesn't work, let me know. (Also, of course, the Tadpole SPARCLE models are supported.)
  2. AudioHD improvements. Notably, many more laptops will now have working audio. The audiohd driver is also updated to support Suspend/Resume.
  3. Fast Reboot. I participated as a consultant. I'll also be updating (post OpenSolaris 2008.11) additional drivers to support this feature. The upshot of this project is that a healthy system can reboot much more quickly now.
  4. Brussels (NIC Administration). I've participated in converting several drivers to Brussels, and in generally improving Brussels (I'm also the ARC sponsor for this work.) The upshot of this project is greatly improved manageability for network interfaces.
  5. Suspend (S3) Support. I helped review the conversions of several drivers, and provided fixes for several NIC devices.
  6. Bug Fixes. I've worked on a number of them, and of course, there are huge numbers of bugs that have been fixed in this release.

All said, I think OpenSolaris 2008.11 is going to be great -- I confess that I was skeptical about the earlier releases, but this release is shaping up to be really awesome.

Thursday, October 2, 2008

Ancient History Exhumed

Okay, maybe not so ancient (circa 2000), but I recently got an e-mail from the Sun IT group notifying me about changes that impacted web pages I set up for Alternate Pathing, which was the very first project I did that involved work within the Solaris kernel. Apparently they didn't notice that I'd left the company and returned.

Of course Alternate Pathing was canceled a long time ago, although someone may still be using it on older E10K Solaris 8 systems.

Here's the Sun internal URL to which the e-mail referred. The bug tracking pages are broken, since the scripts behind them were developed (by me) to talk to the BugTraq+ Sybase server, which Sun hasn't used in ~forever.

Monday, September 15, 2008

IOMMU comes to Solaris x86

This weekend, the code for the IOMMU for Solaris on Intel (PSARC 2008/560) was pushed. This has potentially profound ramifications for folks working on Solaris device drivers, and I thought I'd take some time to talk about them.

First off, it needs to be noted that we've had an IOMMU on Solaris SPARC pretty much for as long as we've had Solaris on SPARC. (In fact, most SPARC platforms have to use an IOMMU -- they have no choice.) But on x86 this technology is new.

The benefits that the IOMMU brings are manifold.

  1. It virtually guarantees that all DMA requests can be set up with a single DMA cookie, reducing complexity and eliminating the use of bounce buffers by the DMA framework -- even for old devices with unfortunate restrictions (such as an inability to perform dual address cycles on PCI -- i.e. no 64-bit support.) Such restrictions are actually fairly commonplace.
  2. It allows for strong isolation to be given for devices that can be accessed via other virtual machines or domains. This can prevent one misbehaving xVM domain from crashing others by misprogramming the DMA engine on a physical device.
  3. It allows for isolation of faulty devices, so that they cannot scribble into arbitrary PCI spaces, preventing a misbehaving device from accessing regions which it should not. This has major benefits for fault resilience, as well as diagnosability. (To be fair, I'm not 100% sure the code is in place yet to leverage all of this benefit, or integrate it with FMA.)
  4. It facilitates debugging of faulty device drivers. To give a concrete example, when I was working on an audio driver recently it took me a long time to figure out why the device was emitting white noise. It turned out that I had not initialized the DMA address register properly. With an IOMMU, instead of the device just getting random data, I'd have gotten a bus fault that would have contained information that I could have used to see that the device was trying to access some weird place in memory to which it had no right.

Now, these features don't come without some cost.

  1. Setup and tear-down of DMA operations is potentially significantly more expensive than with the simple translation layer previously used. Device drivers that assume such operations are inherently cheap may be in for a surprise.
  2. Drivers still have to retain the ability to operate in an environment without an IOMMU. Effectively, this means that they need to be prepared to see more than one DMA cookie or window per DMA region. Generally speaking, well written drivers should make no assumptions about the number of cookies used beyond the limits expressed in the DDI DMA attributes. (Namely, the framework is free to use any number of ddi_dma_cookie(9s) >= 1 and <= dma_attr_sgllen.)
  3. Drivers that require physical rather than virtual memory be used (i.e. that need to bypass the IOMMU) can request mappings using DDI_DMA_FORCE_PHYSICAL in the ddi_dma_attr(9S) dma_attr_flags field, but such requests are not guaranteed to succeed. A portable driver must retry such a request without the flag set, if the first attempt with it set fails.
  4. Generally, correctly written drivers will Just Work with the integration, without any changes to them. I would discourage driver authors from using DDI_DMA_FORCE_PHYSICAL to bypass the IOMMU unless they have specific performance requirements. (And, normally, there are better solutions such as reuse of mappings, so that DDI_DMA_FORCE_PHYSICAL remains unnecessary.)
  5. The IOMMU imposes (or should impose) a new test requirement -- namely that device drivers are tested on systems both with and without an IOMMU. While code that works on systems without an IOMMU is unlikely to notice the introduction of the IOMMU, the reverse is not true. If drivers were developed in the presence of an IOMMU, it is not unusual for them to fail on systems without an IOMMU, as the lack of an IOMMU often requires mappings to be made with multiple DMA cookies, especially for resources that cross page boundaries.
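
The page-boundary point in items 2 and 5 can be illustrated with a little arithmetic. Without an IOMMU, a buffer that is contiguous in virtual memory may be backed by physically scattered pages, so in the worst case it needs one cookie per page it touches. The sketch below is a simplified model under stated assumptions, not the DMA framework's actual algorithm: the 4096-byte page size and both function names are made up for illustration, and the real framework may merge physically adjacent pages into fewer cookies.

```c
#include <stddef.h>
#include <stdint.h>

#define	MODEL_PAGESIZE	4096u	/* assumed page size, for illustration only */

/*
 * Number of pages touched by the byte range [addr, addr + len).
 * This is an upper bound on the cookies a binding of that range
 * could need when every page is physically discontiguous.
 */
static size_t
pages_spanned(uintptr_t addr, size_t len)
{
	if (len == 0)
		return (0);
	uintptr_t first = addr / MODEL_PAGESIZE;
	uintptr_t last = (addr + len - 1) / MODEL_PAGESIZE;
	return ((size_t)(last - first) + 1);
}

/*
 * Would the worst case still fit within the scatter/gather limit
 * the driver advertised (dma_attr_sgllen in the real attributes)?
 */
static int
fits_sgllen(uintptr_t addr, size_t len, size_t sgllen)
{
	return (pages_spanned(addr, len) <= sgllen);
}
```

Note how a 1514-byte frame starting at offset 0x800 within a page stays inside that page, while the same frame starting at offset 0xf00 straddles two. A driver that was only ever tested behind an IOMMU, and quietly assumed one cookie per packet, would only break on the second layout.
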
I've not had a chance to play with the new framework myself yet, but I look forward to doing so. Also, be aware that the same feature set is coming soon for AMD platforms; see PSARC 2008/561.
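
Returning to the DDI_DMA_FORCE_PHYSICAL fallback from item 3 of the cost list, the retry logic might be shaped roughly as follows. This is a user-space model, not DDI code: MODEL_FORCE_PHYSICAL (with a made-up value), the MODEL_* return codes, the attribute struct, and the two "platform" functions are all hypothetical stand-ins for the real flag, the real DDI return codes, and whichever DDI call rejects the request.

```c
#include <stdbool.h>

/* Stand-in for DDI_DMA_FORCE_PHYSICAL; the value here is made up. */
#define	MODEL_FORCE_PHYSICAL	0x1u

enum { MODEL_OK = 0, MODEL_BADATTR = -1 };

/* Hypothetical, reduced version of the DMA attributes structure. */
struct model_attr {
	unsigned flags;
};

/* Hypothetical allocator signature standing in for the DDI call. */
typedef int (*model_alloc_fn)(const struct model_attr *attr);

/* Model platform that cannot bypass its IOMMU: rejects the flag. */
static int
model_platform_no_bypass(const struct model_attr *attr)
{
	return ((attr->flags & MODEL_FORCE_PHYSICAL) ?
	    MODEL_BADATTR : MODEL_OK);
}

/* Model platform that honors the bypass request. */
static int
model_platform_bypass_ok(const struct model_attr *attr)
{
	(void) attr;
	return (MODEL_OK);
}

/*
 * Try for physical addressing first; if that is refused, clear the
 * flag and retry so the driver still works through the IOMMU.
 */
static int
alloc_prefer_physical(model_alloc_fn alloc, struct model_attr *attr,
    bool *got_physical)
{
	attr->flags |= MODEL_FORCE_PHYSICAL;
	if (alloc(attr) == MODEL_OK) {
		*got_physical = true;
		return (MODEL_OK);
	}
	attr->flags &= ~MODEL_FORCE_PHYSICAL;
	*got_physical = false;
	return (alloc(attr));
}
```

The important design point is that the driver records which mode it ended up in (got_physical here) rather than assuming the first request succeeded, since the two modes can differ in how many cookies later bindings produce.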

Sunday, September 7, 2008

audiohd pushed

The latest & greatest audiohd driver, which includes vastly improved support for a large number of codecs, suspend/resume support, and a generic "codec parser" has now been "pushed" into build 98 of ON. Many thanks to the Beijing audio team, who worked long and hard to bring this project to fruition.

Tuesday, August 12, 2008

New audiohd driver posted

We've gotten a lot of good feedback from the previous posting of the audiohd driver, and the Beijing engineering team has come up with a new version (Aug 12, 2008.) Note that the latest audiohd driver will always be posted here.

One note: please use the obj32/ or obj64/ versions of the driver unless you are running a debug kernel. There are some binary dependencies where the debug drivers won't work on a production kernel.