IOMMU comes to Solaris x86

This weekend, the code for the IOMMU for Solaris on Intel (PSARC 2008/560) was pushed. This has potentially profound ramifications for folks working on Solaris device drivers, and I thought I'd take some time to talk about them.

First off, it needs to be noted that we've had an IOMMU on Solaris SPARC pretty much for as long as we've had Solaris on SPARC. (In fact, most SPARC platforms have to use an IOMMU -- they have no choice.) But on x86 this technology is new.

The benefits that IOMMU brings are many fold.

  1. It virtually guarantees that all DMA requests can be set up with a single DMA cookie, reducing complexity and eliminating the use of bounce buffers by the DMA framework -- even for old devices with unfortunate restrictions (such as an inability to perform dual address cycles on PCI -- i.e. no 64-bit support.) Such restrictions are actually fairly common place.
  2. It allows for strong isolation to be given for devices that can be accessed via other virtual machines or domains. This can prevent one misbehaving xVM domain from crashing others by misprogramming the DMA engine on a physical device.
  3. It allows for isolation of faulty devices, so that they cannot scribble into arbitrary PCI spaces, preventing a misbehaving device from accessing regions which it should not. This has major benefits for fault resilience, as well as diagnosability. (To be fair, I'm not 100% sure the code is in place yet to leverage all of this benefit, or integrate it with FMA.)
  4. It facilitates debugging of faulty device drivers. To give a concrete example, when I was working on an audio driver recently it took me a long time to figure why the device was emitting white noise. It turns out that I had not initialized the DMA address register properly. With IOMMU, instead of the device just getting random data, I'd have gotten a bus fault that would have contained information that I could have used to see that the device was trying to access some weird place memory to which it had no right.
Now, these features don't come without some cost.
  1. Setup and tear-down of DMA operations is potentially significantly more expensive than with the simple translation layer previously used. Device drivers that assume such operations are inherently cheap may be in for a surprise.
  2. Drivers still have to retain the ability to operate in an environment without an IOMMU. Effectively, this means that they need to be prepared to see more than one DMA cookie or window per DMA region. Generally speaking, well written drivers should make no assumptions about the number of cookies used beyond the limits expressed in the DDI DMA attributes. (Namely, the framework is free to use any number of ddi_dma_cookie(9s) >= 1 and <= dma_attr_sgllen.)
  3. Drivers that require physical rather than virtual memory be used (i.e. that need to bypass the IOMMU) can request mappings using DDI_DMA_FORCE_PHYSICAL in the ddi_dma_attributes(9s) dma_attr_flags field, but such requests are not guaranteed succeed. A portable driver must retry such a request without the flag set, if the first attempt with it set fails.
  4. Generally, correctly written drivers will Just Work with the integration, without any changes to them. I would discourage driver authors from disabling the use of DDI_DMA_FORCE_PHYSICAL unless they have specific performance requirements. (And, normally, there are better solutions such as reuse of mappings, so that DDI_DMA_FORCE_PHYSICAL remains unnecessary.)
  5. The IOMMU imposes (or should impose) a new test requirement -- namely that device drivers are tested on systems both with and without an IOMMU. While code that works on systems without an IOMMU is unlikely to notice the introduction of the IOMMU, the reverse is not true. If drivers were developed in the presence of an IOMMU, it is not unusual for them to fail on systems without an IOMMU, as the lack of an IOMMU often requires mappings to be made with multiple DMA cookies, especially for resources that page boundaries.
I've not had a chance to play with the new framework myself yet, but I look forward to doing so. Also, be aware that same feature set is coming soon for AMD platforms, see PSARC 2008/561.

Comments

Hi,

Yes, I been watching this and its much needed ! Great news indeed.

Where can one find out what drivers have been ported to make full use of this are and what is there status?
We should maybe setup some kind of list somewhere on opensolaris.org

Best Regards,
Edward.
Unknown said…
The short answer is, drivers don't have to be modified. The details are hidden from them in the DDI DMA framework. (Unless they choose to override with DDI_DMA_FORCE_PHYSICAL, of course.)

So, all drivers should benefit from IOMMU, starting immediately, at least on those platforms where an IOMMU is present and supported.
Unknown said…
As an update, the push for AMD happened last night. So both AMD and Intel platforms now have this ability.

Popular posts from this blog

An important milestone

SP (nanomsg) in Pure Go

GNU grep - A Cautionary Tale About GPLv3