illumos gets global
I just pushed a major set of changes:
8 libc locale work needs updated license files
223 libc needs multibyte locale support for collation
225 libc locale binary files should be in native byte order
309 populate initial locales for illumos
As a result, illumos has gained base support for some 157 different locales, spanning 67 languages and 116 different territories. This includes nearly all the major languages of the world -- missing are Serbian, Javanese, Farsi, Malaysian, Burmese, and some languages spoken in central and west Africa. (Some of these will be very easy for someone else to add... let me know if you want one of these and are willing to do the work.)
The support for these locales includes full POSIX-compliant collation support, which was completely absent in illumos before this integration.
Also included is a new open source implementation of localedef(1). To my knowledge, this new implementation is the only non-GNU version of localedef that is fully open, and it is more fully functional than the GNU version. (The GNU localedef lacks full support for collation data.)
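To make the collation point concrete, here is a minimal sketch of how an application would pick up locale-aware ordering through the standard strcoll(3C) interface. The helper names (collate_cmp, sort_in_locale) are made up for illustration and are not part of this integration; only setlocale, strcoll, and qsort are standard calls.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/*
 * qsort comparator: compare two C strings using the collation
 * rules of the current LC_COLLATE locale.
 */
static int
collate_cmp(const void *a, const void *b)
{
	const char *const *sa = a;
	const char *const *sb = b;

	return (strcoll(*sa, *sb));
}

/*
 * Hypothetical helper: sort an array of strings in place under the
 * named locale. Returns 0 on success, or -1 if the locale is not
 * available (in which case the array is left untouched).
 */
int
sort_in_locale(const char *locale, const char **words, size_t n)
{
	if (setlocale(LC_COLLATE, locale) == NULL)
		return (-1);
	qsort(words, n, sizeof (words[0]), collate_cmp);
	return (0);
}
```

Called with "C", this sorts by raw byte value, so uppercase letters come before lowercase ones. Called with one of the new locales (the exact name, for example "de_DE.UTF-8", is an assumption about what is installed on your system), the order follows that locale's collation rules instead.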
Other notes: this is only the base support for these locales. It will, for example, give you localized output from "date". Quite a lot of additional effort is required to fully localize an illumos system, including support for input methods, fonts, and message catalogs for all the various applications. However, this base support makes doing that other work much more practical.
This integration adds nearly 2 million lines to illumos, although the vast majority of it is data from Unicode and the CLDR (Common Locale Data Repository). The new code I've written is what makes it possible to import data directly from these sources, and it includes a major overhaul of the underlying ctype and collation support in libc to properly support multibyte locales.
It's my belief that with this integration, one of the biggest feature gaps between illumos and Solaris is closed.
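As a rough illustration of what that base support means for a program like "date", the sketch below formats a date with strftime(3C) under a caller-supplied LC_TIME locale. The function name format_long_date is hypothetical, and this is not how date(1) itself is implemented; it only shows the standard interfaces that the locale data feeds.

```c
#include <locale.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper: format a broken-down time as
 * "<weekday> <day> <month> <year>" using the day and month names of
 * the named locale. Returns the number of bytes written (excluding
 * the terminating NUL), or 0 if the locale is unavailable or the
 * buffer is too small.
 */
size_t
format_long_date(const char *locale, const struct tm *tm, char *buf,
    size_t len)
{
	if (setlocale(LC_TIME, locale) == NULL)
		return (0);
	return (strftime(buf, len, "%A %d %B %Y", tm));
}
```

With the "C" locale the names come out in English. With one of the newly populated locales (the name you pass is an assumption about what is installed), %A and %B expand to that locale's weekday and month names.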