Friday, June 4, 2010

O_SYNC behavior not honored

UPDATE (6/21/2010): This problem is apparently solved in b142, and probably in other builds as well. I was unable to reproduce it with real hardware on b142.

Note that VMware does not honor cache flushing, so VMware (and possibly other v12n users) will potentially still see this issue.

So, it turns out that ZFS in recent builds (somewhere after build 134, apparently) has a critical bug: O_SYNC writes are not really synchronous. This leads to potential data loss.
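
For readers unfamiliar with the flag, here is a minimal sketch of what an O_SYNC write looks like from an application, written in Python for brevity. The helper name `sync_write` is purely illustrative and is not the reproduction case for this bug; it assumes a POSIX system where `os.O_SYNC` is defined. When the flag is honored, the write call should not return until the data has reached stable storage, and that is the guarantee that appears to be broken here.

```python
import os

def sync_write(path, data):
    """Write data to path with O_SYNC set (hypothetical helper).

    With O_SYNC, each write() is supposed to block until the data is on
    stable storage, so nothing acknowledged here should be lost in a crash.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o600)
    try:
        written = 0
        while written < len(data):
            # os.write may write fewer bytes than requested; loop until done.
            written += os.write(fd, data[written:])
        return written
    finally:
        os.close(fd)
```

This only shows the call pattern. It does not by itself detect whether the data actually reached stable storage before the call returned.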

I've not yet figured out which change introduced the bug, but I hope to work on it next week.

In the meantime, I would strongly discourage use of post-134 binaries for anything where data integrity is important.

I've filed a P1 bug with Oracle for this issue. I'll be trying to nail it down further next week; if I'm able to fix it before Oracle can, I'll offer up my fix.

I'll post the CR number when I receive it.

I imagine that this bug, which is trivially reproducible, will be getting top priority from the ZFS engineers next week.


UPDATE: CR number is 6958848

The link to access it isn't available yet.

Great Falcon-9 Launch!

SpaceX, one of our greatest hopes for a commercial manned space program, has achieved a huge milestone with the successful maiden launch of Falcon-9 with a Dragon capsule today. This is the craft that may one day soon be used for ISS resupply, and perhaps even crew transport.

Even as Obama shuts down the US government's manned space program, the commercial sector is picking it up. This is a momentous day.

Congratulations to Elon Musk and the rest of the team at SpaceX!

Wednesday, June 2, 2010

audioens in VMware...

So, we have not had audio in OpenSolaris within VMware since... well, ever.

I've been doing some investigation. I'm seeing a situation where the VMware emulated audioens device behaves rather differently from the real hardware.

For one, it seems to insist on using real interrupts. In particular, the sample count registers do not appear to be updated unless one receives and acknowledges an interrupt (by toggling the interrupt enable bit). This means that this virtualized device will never be able to run "interrupt free" like the other audio devices (or real audio hardware).

For another, it appears that the audio device has some weird dependency on the relationship between the size of the audio buffer and the interrupt rate (the number of samples at which to interrupt). Using different values gives strange results.

Finally, I cannot, for the life of me, figure out how to cause the device to actually trigger an interrupt. I've been able to make some progress by simulating a soft interrupt at 100Hz, which is how the interrupt free framework works anyway, but from what I can tell, nothing is causing a real interrupt to be delivered. This is really strange. (Without this functionality, I am able to process audio at a reasonable rate, but it still stutters, and is not really suitable for real-world use.)
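
To make the polling idea concrete, here is a rough sketch in Python of how an interrupt-free engine can track playback by reading a sample counter from a periodic tick. This is not the driver code. The 16-bit counter width, the names `frames_elapsed` and `PollingEngine`, and the `read_counter` callable are all assumptions made for illustration.

```python
COUNTER_BITS = 16  # assumed width of the (hypothetical) frame counter register

def frames_elapsed(prev, curr, bits=COUNTER_BITS):
    """Frames played between two counter reads, allowing for one wraparound."""
    return (curr - prev) % (1 << bits)

class PollingEngine:
    """Tracks playback progress by polling instead of taking interrupts."""

    def __init__(self, read_counter):
        self.read_counter = read_counter  # callable returning the raw counter
        self.last = read_counter()
        self.total_frames = 0

    def tick(self):
        """Called from a periodic timer (100 Hz in the text above)."""
        now = self.read_counter()
        delta = frames_elapsed(self.last, now)
        self.last = now
        self.total_frames += delta
        return delta
```

At a 48 kHz sample rate (an assumed figure), a 100 Hz tick would expect to see about 480 new frames each time. If the counter only advances when an interrupt is acknowledged, as seems to be the case with the VMware device, every tick would instead read a delta of zero.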

My guess is that the virtual device has some weird dependencies that we don't know about. For example, while the hardware spec identifies registers as being 8, 16, or 32 bits wide, and we use those at the right bit widths, other FOSS drivers all seem to just blithely use 32-bit wide accesses. Is there a hidden dependency here?
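
As an illustration of why access width can matter, the sketch below models a register block as a plain byte array and compares 16-bit and 32-bit little-endian accesses at the same offset. The layout (two adjacent 16-bit registers at offsets 0 and 2) is hypothetical and is not taken from the audioens spec, and real device registers are not ordinary memory, so this only shows the general hazard.

```python
import struct

def write16(regs, off, val):
    """16-bit little-endian store at byte offset off."""
    struct.pack_into("<H", regs, off, val)

def write32(regs, off, val):
    """32-bit little-endian store at byte offset off."""
    struct.pack_into("<I", regs, off, val)

def read16(regs, off):
    """16-bit little-endian load at byte offset off."""
    return struct.unpack_from("<H", regs, off)[0]

def read32(regs, off):
    """32-bit little-endian load at byte offset off."""
    return struct.unpack_from("<I", regs, off)[0]

# Hypothetical layout: two adjacent 16-bit registers at offsets 0 and 2.
regs = bytearray(8)
write16(regs, 0, 0x1234)
write16(regs, 2, 0xABCD)

# A 32-bit read at offset 0 sees both registers at once.
combined = read32(regs, 0)

# A 32-bit write at offset 0 also overwrites the neighbor at offset 2.
write32(regs, 0, 0x0000FFFF)
neighbor_after = read16(regs, 2)
```

Here `combined` holds both register values packed together, and `neighbor_after` shows that the wide write replaced the second register as a side effect. An emulated device that was only ever exercised with 32-bit accesses could plausibly behave differently when given narrower ones, though that remains a guess.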

If any reader from VMware is seeing this, and can help me understand the behavior of the simulated device, I'd appreciate it. I'd like to make audio work in this environment, if possible. I'm pretty close, I think.

Actually, it seems kind of crazy that these environments emulate such complex audio hardware. (For example, sophisticated sample rate conversion hardware.) Much better, I think, would be a simple paravirtualization driver that just exposes a simple buffer and some control functions. If someone at VMware wants to work on that as a solution, I'd be happy to help with Solaris support for it. Since these things run isochronously, and chew up a fair bit of CPU when they run, such a solution would probably be quite useful. (For example, it's silly to perform multiple sample rate conversions in software... instead we could express native sample rates via a PV driver to the guest, ensuring only a single SRC operation is performed appropriately in the guest operating system.)


Tuesday, June 1, 2010

Well *That* Didn't Work Out So Well

You may recall my recent blog post about Windows 7 being surprisingly usable.

Well, I have to recant here.

I used Windows 7 for about a week and a half. While it *worked*, it was a pleasure to use. But after three BSODs in just that week and a half, I have abandoned it. I'm now running Ubuntu. (Why not OpenSolaris? Because I need the ability to host VMware and Skype, and I can't do that natively on OpenSolaris -- yet.)

Sure, I could have called up support -- but Microsoft support is provided by my computer manufacturer, and I didn't feel like spending 3 hours on the phone dealing with tech support while they tried to triage my problem. In the end, it was simply faster and easier for me to reinstall with Linux, even allowing for the time it took to download the media.

Sure, the problem might have been my virtualization software, or maybe it was a shoddy audio driver, or maybe it was brokenness in my graphics driver, or maybe it was the 3rd party antivirus software (which raises the question -- why doesn't Microsoft ship with built-in malware protection? You'd think, given all the heat that they've taken over this, that they would have figured out that they *need* a solution here that doesn't involve 3rd parties...).

The "automatic solution finder" that Windows 7 ships with was completely unhelpful; it didn't find any links. Google was not much help either... with everything from buggy hardware and drivers to overtemp problems being cited as root causes.

I'm sure that tech support would have had me running around in circles trying to solve the IRQL_NOT_LESS_OR_EQUAL blue screen. (I'm guessing, from my kernel expertise, that this is probably an assertion fault somewhere that an IRQ level is set unexpectedly high or low -- exactly the kind of problem I know how to fix in Solaris.) Probably plugging and unplugging hardware, unloading and reinstalling drivers and maybe other software, and generally burning an unmentionable amount of my precious time. Especially given that the hardware tech support I'd have been routed to was unlikely to have any real software clue (which is where I think this problem was most likely located.)

Again, faster and simpler to just dump the busted OS, and load something else.

And, with Linux (or any other FOSS), I have at least a fighting chance of trying to debug the problem myself. Sure, my kernel-fu is substantially higher than that of the average Joe home user, so my leanings are more towards something I can troubleshoot myself. But I will say this: I've not had a single panic ("oops" in Linux parlance) in the four days that I've been running Ubuntu 10.04 LTS (even though I'm running the "not recommended for general desktop use" 64-bit edition).

Microsoft, if you see this post, I hope you'll learn something from this.

Friday, May 21, 2010

Last Day

So, today's my last day as a Sun^WOracle employee.

While I'm excited to be starting at Nexenta, I want to reiterate what I've already said, which is that I've really enjoyed working with the great folks at S'noracle, and that they made this decision to leave quite difficult. It's been quite a ride over the years, and it's been fun and exciting. Thanks everyone!

Of course, my old e-mail address(es) at Oracle won't reach me after about noon today.

To reach me for matters pertaining to OpenSolaris, gdamore@opensolaris.org will continue to work. For matters pertaining to my new employer, Nexenta, you can use garrett@nexenta.com. My personal e-mail address of garrett@damore.org remains unchanged. Now please stand by while I go reinforce the spam filters...

New Computer

As part of the process of changing employers, I needed to get a new computer for the new job (and return the old desktop to Oracle.)

I wound up picking this one... I didn't seem to be able to build it any cheaper (as of the date of this post) myself. And guess what... someone goofed! Instead of the 3 GHz Core i7 950, it came with a 3.2 GHz Core i7 960. Bonus! (Other goofs relative to the ad: the system has 9 GB (though that's spelled out in the details), comes with a black aluminum chassis, and ships with a cheap Logitech keyboard.)

I'm still using the stock load of Windows 7, and I'm both surprised (and maybe a bit embarrassed) to admit that the Windows environment (especially when replacing IE with Chrome) is actually quite nice -- fast and usable. Maybe running this environment (and running OpenSolaris in a VM) might not be so bad after all! (Ok, I'll go find some soap to wash my mouth out for blaspheming....) If I do this, besides being able to use Skype for work, I'll be able to use my Phoenix RC flight simulator without having to resort to borrowing the wife's computer...


Engines arrived for Squat yesterday

The Squat is a short, 4" diameter high power rocket with a 54mm engine mount. My engines, 54mm hardware (including the higher end Aeropack retainer), and 38mm adapter arrived from What's Up Hobbies yesterday. Timothy's going to fly it at the LDRS mass launch on a G67 Redline -- this will be his first reloadable engine. Later that day I'll fly it on an I140 Skidmark, which represents both my first 54mm engine and my first Cesaroni engine.

Timothy and I put the rocket together last night; I must say, the higher end metal hardware and thicker fins on this rocket are definitely a step up even from the LOC IV I flew previously on my Level 1 flight (go to about 1:30 in the video link -- I haven't figured out how to edit the video file yet).

I also received the propellant for the J350W, which I'll be flying in my LOC IV as part of my Level 2 certification attempt. OpenRocket says the LOC IV will be approaching 700 mph with this particular engine! Guess I will be glassing the fins on it to help strengthen them for transonic speeds. (I'm open to alternative suggestions from the experts, as well.)
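
As a rough sanity check on "transonic", the 700 mph figure can be converted to a Mach number. The sketch below assumes standard sea-level air at 15 °C (288.15 K) and the ideal-gas speed of sound; the function names are just for illustration, and none of this comes from OpenRocket itself.

```python
import math

MPH_TO_MS = 0.44704  # exact conversion: 1 mph = 0.44704 m/s

def speed_of_sound(temp_k=288.15, gamma=1.4, r_air=287.05):
    """Ideal-gas speed of sound in air (m/s); defaults assume 15 C at sea level."""
    return math.sqrt(gamma * r_air * temp_k)

def mach_number(speed_mph, temp_k=288.15):
    """Mach number for a speed given in mph at the assumed air temperature."""
    return speed_mph * MPH_TO_MS / speed_of_sound(temp_k)

mach = mach_number(700)
```

Under those assumptions the result comes out a little above Mach 0.9, which is well into the transonic range. The actual temperature and altitude at the launch site would shift the number somewhat.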

LDRS is going to be fun, indeed!