Thursday, July 30, 2009

Fast Reboot & Panic

I noticed that Sherry Moore just posted a blog entry about Fast Reboot.

I wanted to take a few moments to mention a few things that I think folks should understand.

First off, the feature (fast reboot) is really useful. For manually initiated reboots done for administrative reasons (such as rebooting after installing new kernel bits or a critical patch), it's wonderful to skip past the various hardware-related initialization, and it can really help with the downtime costs associated with maintenance tasks like patching.

This is especially true for systems with lots of peripheral buses (SCSI, Infiniband, etc.) that take a long time for low-level BIOS to probe and test. In such situations, BIOS initialization can consume several minutes. Reducing this to a few seconds is a compelling idea.

However, there are some gotchas that in my opinion people should be aware of when using the variant of this that gets used on panic().

  • During a panic situation, all bets are off about kernel or hardware state. (This is why the code in the kernel called panic() after all -- it has deemed it unsafe to proceed.)
  • So, the nice safe quiesce(9e) entry points are not guaranteed to be called. Hopefully they will be, but not necessarily.
  • Some drivers may panic when they find hardware in a state that is beyond their ability to recover. So quiesce(9e) may be functionally unable to put the system back into a sane state.
  • Some hardware simply can't restore properly without a low level PCI reset, which the current fast reboot code skips.
  • If hardware is not quiesce(9e)'d properly, on the reboot, the new kernel can wind up in a situation where a device might be randomly scribbling (via DMA) to physical memory (this can lead to arbitrary data corruption of either kernel or user pages of memory), or might wind up with a stuck interrupt (which may exhibit as a hard hang of the machine).

Note that the above situations are not theoretical. I have hit these problems, involving various different bits of hardware ... certain framebuffers that require low level initialization, certain Ethernet parts that don't have a functional software reset mechanism, and a certain WiFi controller that can leave interrupts stuck.

However, all the situations described above are also quite unlikely to occur. They probably occur in fewer than 1% of all potential panic scenarios.

The upshot of this is that I would most definitely not use fast reboot on panic on any machine that is in production or which has critical data. It's a wonderful feature for kernel developers who trash their systems all the time and are accustomed to taking risks -- for such uses the shortened reboot time on a panic is a net win compared to the potential risks (which are virtually nil -- if a system I'm testing hard hangs or crashes a second time I can always just power cycle it), but in a production environment the expectations are different.

In such an environment, you really don't expect to see panic() occur (if it does, you're already on the path of a bug!), but when it does, you want to be 100% certain that you confine the potential damage and get back to a known safe (and good) state. This is why we panic() and reboot, after all. Eliding those time-consuming steps of low-level initialization might at first sound like an attractive way to get to higher uptimes, but if you analyze the situation carefully, it's a potentially riskier proposition that could (admittedly in unlikely cases) cause much greater downtime.

Now that you know the concerns, you are of course free to make your own assessment.

If you do want to turn off fast reboot on panic (and I do recommend that you leave regular fast reboot on, since as far as I can see there is no downside to making an administratively requested reboot go faster), then you can just use the following commands (which are taken from Sherry's blog posting):

# svccfg -s "system/boot-config:default" \
    setprop config/fastreboot_onpanic=false
# svcadm refresh svc:/system/boot-config:default
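
If you want to confirm that the change took effect, something like the following should work. This is only a sketch and is not from Sherry's posting; it assumes the property name shown above and the standard svcprop(1) utility, and I would expect it to print the current value of the property (false after the change above).

```shell
# Read back the fast-reboot-on-panic setting from the running snapshot
# of the boot-config service. Assumes the property name used above.
svcprop -p config/fastreboot_onpanic svc:/system/boot-config:default
```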

(Note that fast reboot on panic is enabled by default on OpenSolaris since build 112. However, since nobody should have deployed a system with this on in production -- we've had only development releases since then -- there is probably no urgency to go and immediately change your systems. Of course, that situation might be different if you're reading this blog post at some point in the future.)

Archived KCA 2009 Boomer Talk

As some people know, KCA 2009 was streamed live. The video of my talk on Boomer was archived (as were the other talks) and is available online.

You can watch it here.

Note that I highly recommend advancing to about 6:30 in the stream, because the first 6+ minutes are just me struggling with laptop/projector incompatibilities. (It would be nice if the folks that posted the video could crop that meaningless bit out.)

Other thoughts I had from looking back and reviewing this (which I did for the first time yesterday):

  • A live demo would make it more interesting.
  • Test the equipment compatibility first when presenting.
  • Turn off screen savers.
  • Rehearse the talk beforehand.
  • Some points were repeated, which was unfortunate since I had to rush or skip other points.
  • Practice talking more. Especially at the beginning, it seems like I didn't know what to do with my hands.
  • Don't be afraid to get out from behind the podium.
  • The podium itself was poorly situated ... the exit sign above my head was particularly apropos framing.

I'd be happy to hear other constructive criticisms, whether on Boomer, the paper, slides, or my presentation style. I don't get the chance to speak in public often, so when I do I'd like to be better at it than I think I managed this time.

Friday, July 24, 2009

audiovia97 pushed

I've pushed the audiovia97 device driver. Those of you with Via 82C686 south bridges will be able to make use of your on-board audio in OpenSolaris builds 121 and beyond. Enjoy.

Boomer Paper at KCA

I presented a paper covering Boomer at Kernel Conference Australia 2009. I've made the paper and slides available for your enjoyment.

Thursday, July 9, 2009

Off to Brisbane, Australia

I'm headed to Brisbane for the week -- I'm presenting at Kernel Conference Australia 2009. I hope to meet other like-minded UNIX kernel nerds there. Maybe do some snorkeling as well, although it's the off-season there. (But it can hardly be colder than the water in California...)

STREAMS and non-STREAMS in the same driver

Some of you kernel hackers may be interested to know about my case (PSARC 2009/380) that eliminates one of the ancient limitations of Solaris -- having STREAMS and non-STREAMS entry points in the same device driver. I've also implemented it and am waiting for the case to time out before I submit the RTI.

As part of this, I also implemented a set of changes to the audio stack -- the austr(7D) "special" node and driver go away, and the Sun audio personality sheds about 1,000 lines of complexity. I'll be pushing those changes later, once I've gotten code review feedback. (If you review the changes, please let me know!)

mii related fallout

So there was some fallout from my mii push. Elxl devices had a nasty panic, which I fixed in time for build 119. Some older i82557 (iprb) devices had problems as well; the fix for those went into build 120. To everyone affected, please accept my apologies. Build 120 should be stable for these devices.