Tuesday, May 27, 2014

illumos, identification, and versions

Recently, there has been a bit of a debate on the illumos mailing lists, beginning I suppose with an initial proposal I made concerning the output from the uname(1) command on illumos, which today, when presented with "-s -r" as options, emits "SunOS 5.11".

A lot of the debate centers on various ways that software can or cannot identify the platform it is running on, and use that information to make programmatic decisions about how to operate. Most often these are decisions made at compile time, by tools such as autoconf and automake.

Clearly, it would be better for software not to rely on uname for programmatic uses, and detractors of my original proposal are correct that even in the Linux community, the value from uname cannot be relied upon for such use.  There are indeed better mechanisms to use, such as sysconf(3c), or the ZFS feature flags, to determine individual capabilities.  Indeed, the GNU autotools contains many individual tests for such things, and arguably discourages the use of uname except as a last resort.

Yet there can be no question that there are a number of packages that do make such use.  And changes to the output from uname become risky to such packages.

But perversely, not changing the output from uname also creates risk for such packages, as the various incarnations of SunOS 5.11 become ever less like one another.  Indeed, illumos != SunOS, and uname has become something of a lie over the past 4 years or so.

Clearly, the focus for programmatic platform determination -- particularly for enabling features or behaviors, should be to move away from uname.  (Actually, changing uname may actually help package maintainers in identifying this questionable behavior as questionable, although there is no doubt that such a change would be disruptive to them.)

But all this debate completely misses the other major purpose of uname's output, which is to identify the platform to humans.  Be they administrators, or developers, or distributors.  There is no question in mind that illumos' failure to self identify, and to have a regular "release" schedule (for whatever a release really means in this regard) is harmful.

The distributors, such as Joyent (SmartOS), would prefer that people only identify their particular distribution, and I believe that much of the current argument from them stems from two primary factors.  First, they see additional effort in any change, with no direct benefit to them.  Second, in some ways these distributors are disinclined to emphasize illumos itself.   Some of the messages sent (either over IRC or in the email thread) clearly portray some resentment towards the rest of the illumos ecosystem, especially some of the niche players), as well as a very heavy handed approach by one commercial concern towards the rest of the ecosystem.

Clearly, this flies in the spirit of community cooperation in which illumos was founded.  illumos should not be a slave to any single commercial concern, and nobody should be able to exercise a unilateral veto over illumos, or shut down conversations with a single "thermonuclear" post.

So, without regard to the merits, or lack thereof, of changing uname's output, I'm quite certain, with sufficient clear evidence of my own gathering, that illumos absolutely needs to have a regularly scheduled "releases" of illumos, with humanly understandable release identifiers.  The lack of both the releases, and the identifiers to go with them, hinders meaningful reviews, hurts distributors (particularly smaller ones), and makes support for the platform harder, particularly when the questions of support are considered across distributions.  All of these hinder adoption of the illumos platform; clearly an undesirable outcome.

Some would argue that we could use git tags for the identifiers.  From a programmatic standpoint, these would be easy to collect.  Although they have problems as well (particularly for distributions which neither use git, or use a private fork that doesn't preserve our git versions), there are worse problems.

Specifically, a humans aren't meant to derive meaning from something like "e3de96f25bd2ea4282eea2d1a86c1bebac8950cb".   While Neo could understand this, most of use merely mortal individuals simply can't understand such tags.  Worse, there is no implied sequencing here.  Is "e3de96f25bd2ea4282eea2d1a86c1bebac8950cb" newer than "d1007364f5b14efdd7d6ba27aa458669a6365d48" ?  You can't do a meaningful comparison without examining the actual git history.

This makes it hard to guess whether a given running release has a bug integrated or not.  It makes it hard to have conversations about the platform.  It even makes it hard for independent reviewers of the platform  to identify anything meaningful about the platform in the context of reviewing a distribution.

Furthermore, when we talk about the platform, its useful for version numbers to convey more than just serial meaning.  In particular, version numbers help set expectations for developers.  Historically Solaris (or rather SunOS, from which illumos is derived), set those expectations in the form of stability guarantees.   A given library interface might be declared as Stable, or Evolving, (or Committed or Uncommitted), or Obsolete.  This was a way to convey to developers the relative risks of an interface (in terms of interface change), and it set some ground rules for rate of change.  Indeed, Solaris (and SunOS) relied upon a form of semantic versioning, and many of the debates in Sun's architectural leadership for Solaris (PSARC) revolved around these commitments.

Yet today, the illumos distributors seem over willing to throw that bit of our heritage into the waste bin.  A trend, I fear, which ultimately leads to chaos, and an increase in the difficulty of adoption by ISVs and developers.

illumos is one component -- albeit probably the most important by far -- of a distribution.  Like all components, it is useful to be able to determine and talk about it.  This is not noticeably different than Linux, Xorg, gnome, or pretty much any of the other systems which you are likely to find as part of a complete Ubuntu or RedHat distribution.  In fact, our situation is entirely analogous other than we combine our libc and some key utilities and commands with the kernel.

Technically, in Solaris parlance, illumos is a consolidation.  In fact, this distinction has alway been clear.  And historically the way the consolidation is identified is with the contents of the utsname structure, which is what is emitted by uname.  

Furthermore, when we talk about the platform, its useful for version numbers to convey more than just serial meaning

In summary, regardless of whether we feel uname should return illumos or not, there is a critical need, although one not necessarily agreed upon by certain commercial concerns, for the illumos platform to have a release number at a minimum, and this release number must be useful to convey meaningful information to end-users, developers, and distributors alike.  It would be useful if this release number were obtainable in the traditional fashion (uname), but its more important that the numbers convey meaning in the same way across distributions (which means packaging metadata cannot be used, at least not exclusively).