Wednesday, October 27, 2010

New illumos logo

Today at the OpenStorage Summit 2010, I unveiled the new illumos logo. We will be updating our branding, which also includes a new font, and other elements, over the next few weeks.

There were other updates on illumos that I covered in this talk. I think the talk was recorded, but I'm not yet sure where the recording is or how to access it. I'll be sure to share that when I find out.

Wednesday, October 13, 2010

CFV: web/HTML/graphics people

I have an urgent need to renovate the illumos website. If you'd like to help the project out, and you have both time and talent, please let me know. A major overhaul of the site is in order, and we need someone willing to dedicate some time to it. There may be some funds available for the right person, but to be clear, illumos can't afford the services of a professional design bureau.

New implementation of printf

So I finally got tired of waiting for someone else to write an illumos replacement for the closed printf(1) binary from Oracle. I had thought this would be a trivial thing to do via ksh93/libcmd using a symbolic link, à la /usr/bin/alias.

Lo and behold, it wasn't! Why? Because ksh93 printf insists (like all ksh93 builtins) on having -- and - getopt style processing. This is fundamentally incompatible with legacy printf. (Why does it do this? So it can dump its builtin man page, e.g. printf --man, to the console. A feature I've railed against in the past.)

Here's what should happen:

% printf -v

Here's what ksh93 does:

garrett@thinkpad:~$ printf -v
ksh93: printf: -v: unknown option
Usage: printf [ options ] format [string ...]

Now there is an argument to be made that a script which relies on the legacy behavior is fundamentally broken. But it doesn't matter -- the scripts are in the field (there are real examples of them), and the legacy behavior must be preserved. Breaking these legacy scripts just so that we can dump printf --version output is... silly. This is a case where pragmatism wins over purity.
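To make the incompatibility concrete, here is a minimal Python sketch of the two argument-handling styles. It is not the ksh93 or illumos code. The function names are hypothetical, and it assumes that the legacy behavior is simply to treat the first argument as the format string, even when it starts with a dash.

```python
import getopt


def getopt_style(args):
    """Run getopt-style option parsing before looking for the format.

    Hypothetical stand-in for a builtin that defines no short options.
    """
    try:
        _opts, rest = getopt.getopt(args, "")
    except getopt.GetoptError as err:
        return "error: %s" % err
    return rest[0] if rest else ""


def legacy_style(args):
    """Assumed legacy behavior: the first argument is always the format."""
    return args[0] if args else ""


if __name__ == "__main__":
    print(legacy_style(["-v"]))        # format is taken literally
    print(getopt_style(["-v"]))        # rejected as an unknown option
    print(getopt_style(["--", "-v"]))  # works only with an explicit "--"
```

A script written against the legacy style has no reason to pass "--" first, which is why it breaks under the getopt style.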

Rather than try to rip this out and fight with the ksh93 community about "deviation from the upstream" (apparently the ksh93 folks view any changes we make in illumos or OpenSolaris as automatically toxic unless they originate from David Korn or Glenn Fowler), I've just gone ahead and implemented my own printf(1) on top of FreeBSD's. This will be the implementation in illumos.

I've added significantly to FreeBSD's code though. Specifically, I added handling of %n$ processing to get parameterized position handling. This is needed for internationalization -- it allows you to change the order of output as part of the output from something like gettext(1). (This is needed when you have to change word order to accommodate different natural language grammars.)

So my implementation is superior to FreeBSD's, and it's superior to the legacy closed binary version. Why? Because rather than a half-hearted attempt at processing positional parameters, my version really handles these, including full support for the usual format specifiers. For example:

New open code:

garrett@thinkpad{4}> printf '%2$1d %1$s\n' one 2 three 4
2 one
4 three

Old closed code:

garrett@master{22}> printf '%2$1d %1$s\n' one 2 three 4
134511600 one

Clearly the old behavior is just plain wrong. For the record, ksh93 does the right thing here too. (Although somewhat older versions of ksh93 would dump core on this command line.)
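The behavior in the "New open code" transcript (positional parameters, with the format reused until the arguments run out) can be modeled with a short Python sketch. This is an illustration only, not the illumos or FreeBSD code. The name sketch_printf is hypothetical, backslash escape processing is left out, and it assumes a missing argument formats as 0 or as an empty string.

```python
import re

# Matches "%%" or a conversion with an optional "n$" position,
# optional flags/width/precision, and a conversion letter.
_SPEC = re.compile(r"%(?:%|(?:(\d+)\$)?([-+ #0]*\d*(?:\.\d+)?)([diouxXsfeEgG]))")


def _convert(flags, conv, arg):
    """Format one argument; a missing argument becomes 0 or "" (assumption)."""
    if conv == "s":
        value = "" if arg is None else arg
    elif conv in "feEgG":
        value = float(arg or 0)
    else:
        value = int(arg or 0)  # non-numeric text raises ValueError here
    return ("%" + flags + conv) % (value,)


def sketch_printf(fmt, *args):
    """Expand fmt repeatedly, consuming max(n) arguments on each pass."""
    positions = [int(m.group(1)) for m in _SPEC.finditer(fmt) if m.group(1)]
    per_pass = max(positions, default=0)
    remaining = list(args)
    out = []
    while True:
        chunk, remaining = remaining[:per_pass], remaining[per_pass:]

        def repl(m):
            if m.group(0) == "%%":
                return "%"
            if m.group(1) is None:
                raise ValueError("sketch only handles %n$ conversions")
            idx = int(m.group(1)) - 1
            arg = chunk[idx] if idx < len(chunk) else None
            return _convert(m.group(2), m.group(3), arg)

        out.append(_SPEC.sub(repl, fmt))
        if per_pass == 0 or not remaining:
            break
    return "".join(out)


if __name__ == "__main__":
    # Same arguments as the transcript, with a real newline in the format.
    print(sketch_printf("%2$1d %1$s\n", "one", "2", "three", "4"), end="")
```

Each pass consumes as many arguments as the highest position named in the format, so four arguments and a highest position of 2 give two output lines.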

My diffs (which also include style and lint fixes required for illumos) relative to FreeBSD are online. You can also review a webrev of the changes that I hope to integrate into illumos. The license remains BSD, so the various BSD operating systems (or even Oracle) are free to incorporate these improvements if they like.

Friday, October 8, 2010

illumos gets global

I just pushed a major set of changes:

8 libc locale work needs updated license files
223 libc needs multibyte locale support for collation
225 libc locale binary files should be in native byte order
309 populate initial locales for illumos

As a result, illumos has gained base support for some 157 different locales, spanning 67 languages and 116 different territories. This includes nearly all the major languages of the world -- missing are Serbian, Javanese, Farsi, Malaysian, Burmese, and some languages spoken in central and west Africa. (Some of these will be very easy for someone else to add... let me know if you want one of these and are willing to do the work.)

The support for these locales includes full POSIX compliant collating support, which was completely absent in illumos before this integration.

Also included is a new open source implementation of localedef(1). To my knowledge, this new implementation is the only non-GNU version of localedef that is fully open, and this version is more fully functional than the GNU version. (The GNU localedef lacks full support for collation data.)

Other notes: this is only the base support for these locales. This will, for example, give localized output from "date". There is quite a lot of additional effort required to fully localize an illumos system, including support for input methods, fonts, and message catalogs for all the various applications. However, this base support makes doing that other work much more practical.

This integration adds nearly 2 million lines to illumos, although the vast majority of it is data from Unicode and the CLDR (Common Locale Data Repository). The new code that I've written provides the ability to import data directly from these sources, and includes a major overhaul of the underlying ctype and collation support in libc to properly support multibyte locales.

It's my belief that with this integration, one of the biggest feature gaps between illumos and Solaris is closed.

Sunday, October 3, 2010

Emacs & Gnome Terminal Co-existence Resolved

For many years, I've been stuck with old xterm, because it was the only one that honored my Meta keys in the same way that GNU emacs did. I could never figure out how to make gnome-terminal work, which always bothered me somewhat. (Notably, gnome-terminal has better Unicode support, which has lately become important to me.)

I finally found a reference that helped me out. I understood that the problem was conflicting ideas about modifier keys; gnome-terminal uses Mod1, but Emacs uses Mod4. What I didn't know was something I found out here, namely that Emacs only uses Mod4 if it exists. So a better solution for me is to simply clear Mod4 altogether, and both programs happily honor Mod1. (This leaves xterm hosed, but if gnome-terminal works, then I don't need xterm anymore.)

My resulting .xmodmap looks like this:

remove Lock = Caps_Lock
keysym Caps_Lock = Control_L
add Control = Control_L
clear Mod4

This makes my PC keyboard behave sensibly. Alt is Meta. Caps Lock is consigned to oblivion, and the large key that used to have that function is now much more usefully assigned to Control.

I'm posting this here in case anyone else has struggled with this particular annoyance in the past. The clear Mod4 trick was the surprise ticket. (What I'd really like is a way to tell programs which Modifier is "really" the Meta key, given that the programs can't seem to agree on this. And with just one preference -- redefining the numerous bindings in emacs for each sequence, while possible, is not my idea of a fun thing to do.)

The other thing I'd like is a standard way in illumos/OpenSolaris to integrate .xmodmap. Linux/Ubuntu seems to detect my .xmodmap and handle it nicely.