Wednesday, July 7, 2010

ZFS disk monitoring...

So I've posted this on zfs-discuss at opensolaris dot org, but it's been suggested I mention it here too.

It turns out that the ZFS/FMA integration doesn't pick up on drive removals for most disk devices until the filesystem attempts to perform some I/O to the drive. This is rather unfortunate, because if a file system is not busy, you might suffer a loss of redundancy and not find out about it until too late.

It also means that you won't know about failures of hot spare devices until you need to put them into service, since by definition they are idle. (Note: as an exception, running periodic scrubs should detect this too, although scrubs are highly intrusive to the overall I/O load on the system and so probably should not be performed too often.)

I'm told the Oracle 7000 series appliances have a solution for this problem, but of course the source for that is not in OpenSolaris. (Apparently there are quite a few differences in the core OS between the 7000 series and vanilla OpenSolaris -- unfortunately we can't know because -- unlike with NexentaStor -- we don't have access to the kernel source tree!)

This is not good for folks who use ZFS with ordinary Solaris 10 or OpenSolaris... or with derivatives such as NexentaStor.

To address that problem, I've developed some code called "zfs-monitor" that periodically monitors the health of any physical vdev (disk) that is part of a ZFS pool (hot spare, log, or real device). This code is implemented as an FMA module. When a disk goes offline, zfs-monitor detects it and triggers an FMA event, which allows ZFS to do the right thing. This means if a disk goes away, even if it isn't in use, whatever action is appropriate will be performed. (The fault is logged in the FMA fault logs, and if appropriate, a hot spare will be recruited to replace the failed or offline device.)
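To make the idea of periodic probing concrete, here is a minimal sketch in Python. It is not the zfs-monitor code (which is an FMA module); the function names `probe_device` and `find_unreachable` are made up for illustration, and the assumption is simply that a missing disk shows up as a failure to open or read its device node.

```python
import os

def probe_device(path, nbytes=512):
    """Return True if `path` can be opened and read, False otherwise.

    Hypothetical helper: assumes a removed or dead disk fails on
    open() or read() of its device node.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return False
    try:
        os.read(fd, nbytes)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)

def find_unreachable(paths):
    """Return the subset of `paths` that failed the probe."""
    return [p for p in paths if not probe_device(p)]
```

A real monitor would run something like `find_unreachable` on a timer over every vdev in every pool, including idle hot spares, and raise a fault event for each path returned rather than just collecting them in a list.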

This code is part of NexentaStor 3.0.3. As there are some semantic differences of opinion (what constitutes device failure versus intentional removal by an administrator), the code is unlikely to be pushed into ON without further change. (At the same time, I've fixed a different problem in the ZFS FMRI parsing code, and I've submitted a request to get that fix integrated -- but I've not heard back from anyone at Oracle who is willing to sponsor the change yet.)

I'm happy to share the code for zfs-monitor to anyone who requests it. (In fact, you can examine the code in our open Mercurial repository directly!) Note that for it to work properly, you also will need the fix for the ZFS FMRI parsing bug just mentioned.

At Nexenta, we're committed to innovating and improving upon the great foundation of ZFS and OpenSolaris, and to the reasonable extent possible, we want to share those innovations with the greater OpenSolaris community. Hopefully changes like this demonstrate this commitment in a tangible fashion.

Monday, June 28, 2010

Looking for CIFS/AD expertise

(I know it's probably questionable using my blog for this, but I thought I'd post it here anyway. My apologies if anyone finds this offensive. I'll keep it brief in any case.)

I'm looking for a high-caliber developer, preferably with some kernel and/or OpenSolaris expertise, who's also got extensive knowledge of ActiveDirectory and CIFS. If that's you, or you know someone who fits that description, please contact me -- garrett at nexenta dot com. (No recruiters or agents please.)

Tuesday, June 15, 2010

Skype for Solaris

So I'm irked, really irked.

If we had Skype support for Solaris, I could probably ditch this half-baked mess of Linux hosts running VMware guests with OpenSolaris and Nexenta. I want just a single host OS for my development box.

Right now the single biggest barrier to running OpenSolaris on my desktop for my job at Nexenta is Skype. But this is silly, because Skype works in Linux, and the APIs should basically be compatible. Especially with the OSS layer that we already have in OpenSolaris these days via Boomer.

If someone at Skype sees this (good luck trying to find a contact on their web site!), and wants to work with me on it, I'd be happy to help them work through the issues of getting a native Skype port.

If anyone who has an "in" at Skype reads this post, please forward it to your in at Skype.

If any folks are paying for business services from Skype, feel free to let them know you want a Solaris client, and there is an expert on the Solaris audio stack waiting to help.

Thanks!

(On a side note, I'd also like to have VMware on Solaris. Yeah, I know about VirtualBox, but I need to support VMware for clients, and it would be a heck of a lot easier if I could just host VMware guests on my development head.)

LDRS 29, very cool

So, this past weekend my son and I went to LDRS 29, which is the event for the national high-powered rocketry club, Tripoli. We were there only one day and one night, but here were some cool highlights from Saturday:
  • Mass squat launch -- Timothy's Squat with an Aerotech G-67 redline motor flew very nicely, if a bit late off the pad. 28 other rocketeers had their rockets launch at roughly the same time.
  • Many wild squats. With the $29 specials from WhatsUpHobbies, lots of people were flying very unstable Squat rockets with I-140 skidmark engines. This configuration needs nose weight, as we found out for ourselves when we flew Timothy's with the same engine.
  • Four half-scale Patriots launched in 3 second intervals from a "box" launch vehicle -- much like a real Patriot. Very, very cool.
  • Drag race of six or seven N-impulse rockets. These are big rockets, lots of power.
  • Drag race between a number of very detailed rockets. There was a CATO about 20 feet off the pad, and unfortunately several other rockets were destroyed on ascent by the CATO. Cool to watch, but glad it wasn't one of my rockets.
  • Full scale O-impulse Patriot launch (and unfortunate catastrophic failure near the end of boost, flaming bits falling down all over the range.)
  • My own J-350W powered LOC-IV (with significant modifications in nose weight and fiberglass reinforced fins.) This was my Level 2 cert flight and it went brilliantly. (So I'm certified to fly level 2 high powered rockets - impulse up to and including L power.)
  • Our Big Daddy Estes rocket (typically D or E power) launch with an F-32 Blue Thunder engine -- believe it or not, this was one of the highlights of the day for me. As the launch control officer proclaimed -- "a little too much power for the rocket, but we like that!"
  • Flying a drag race between two D-21 powered 18mm rockets. Both were lost, and later found damaged.
  • A number of very cool night launches -- lots of creativity here on the part of the rocketeers.
  • Discovery Channel. This was a mixed bag... it was cool that they were there, but they did interfere with launch schedules quite a bit. Still, I think we'll be part of the ultimate show, which is supposed to air July 5. Looking forward to watching that.
There was one significant accident, involving an extremely high powered rocket on the far pads. A couple of people were unfortunately badly burned, and had to be medevaced, and our wishes go with them for their recovery.

As for Timothy and me, we're hooked. We'll be going to the November RocStock event as well, provided we can make the schedule work out.

Press release

Noticed this press release got posted to the Nexenta web site. /me preens. :-)

Friday, June 4, 2010

O_SYNC behavior not honored

UPDATE (6/21/2010): This problem is apparently solved in b142, and probably in other builds as well. I was unable to reproduce it with real hardware on b142.

Note that VMware does not honor cache flushing, so VMware (and possibly other v12n users) will potentially still see this issue.

So, it turns out that ZFS in recent builds (somewhere after build 134, apparently) has a critical bug ... O_SYNC writes are not really synchronous. This leads to potential data loss.
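For anyone unsure what is at stake, here is a minimal Python sketch of the contract an application relies on when it opens a file with O_SYNC. The helper name `sync_write` is made up for illustration, and this does not reproduce or detect the bug: when the filesystem honors O_SYNC, each write() returns only after the data is on stable storage, and that promise is exactly what is being broken.

```python
import os

def sync_write(path, data):
    """Write `data` to `path` with O_SYNC and return the byte count.

    Hypothetical helper: the caller assumes that once this returns,
    the data would survive a crash or power loss.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o600)
    try:
        written = 0
        # os.write may write fewer bytes than asked, so loop until done.
        while written < len(data):
            written += os.write(fd, data[written:])
    finally:
        os.close(fd)
    return written
```

Databases, mail servers, and NFS servers all lean on this guarantee, which is why a filesystem that acknowledges such a write early can lose data the application believes is safe.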

I've not yet figured out which change introduced the bug, but I hope to work on it next week.

In the meantime, I would strongly discourage use of post-134 binaries for anything where data integrity is important.

I've filed a P1 bug with Oracle for this issue. I'll be trying to nail it down further next week; if I'm able to fix it before Oracle can, I'll offer up my fix.

I'll post the CR number when I receive the number back.

I imagine that this bug, which is trivially reproducible, will be getting top priority from the ZFS engineers next week.


UPDATE: CR number is 6958848

The link to access it isn't available yet.

Great Falcon-9 Launch!

SpaceX, one of our greatest hopes for a commercial manned space program, has achieved a huge milestone with the successful maiden launch of Falcon-9 with a Dragon capsule today. This is the craft that may one day soon be used for ISS resupply, and perhaps even crew transport.

Even as Obama shuts down the US government's manned space program, the commercial sector is picking it up. This is a momentous day.

Congratulations to Elon Musk and the rest of the team at SpaceX!