Friday, July 11, 2014

POSIX 2008 locale support integrated (illumos)

A year in the making... and finally the code is pushed.  Hooray!

I've just pushed 2964 into illumos, which adds support for a bunch of new libc calls for thread-safe and thread-specific locales, as well as explicit locale_t objects.   Some of the interfaces added fall under the BSD/MacOS X "xlocale" class of functions.

Note that not all of the xlocale functions supplied by BSD/MacOS are present.  However, all of the routines that were added by POSIX 2008 for this class are present, and should conform to the POSIX 2008 / XPG Issue 7 standards.  (Note that we are not yet compliant with POSIX 2008; this is just a first step -- albeit a rather major one.)

The webrev is also available for folks who want to look at the code.

The new APIs are documented in newlocale(3c), uselocale(3c), etc.   (Sadly, man pages are not indexed yet so I can't put links here.)
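Just to give a flavor of how the new interfaces fit together, here's a minimal sketch (the locale name "de_DE.UTF-8" is only an example, and error handling is abbreviated):

    #include <locale.h>
    #include <string.h>
    #include <stdio.h>

    int
    main(void)
    {
        /* Build an explicit locale object instead of changing the global locale. */
        locale_t loc = newlocale(LC_ALL_MASK, "de_DE.UTF-8", (locale_t)0);
        if (loc == (locale_t)0) {
            perror("newlocale");
            return (1);
        }

        /* Pass the locale explicitly to one of the new _l routines... */
        (void) printf("strcoll_l: %d\n", strcoll_l("a", "b", loc));

        /* ...or install it for the calling thread only. */
        locale_t prev = uselocale(loc);
        /* locale-sensitive calls on this thread now use loc */
        (void) uselocale(prev);

        freelocale(loc);
        return (0);
    }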

Also, documentation for some APIs that was previously missing (e.g. strfmon(3c)) has now been added.

This project has taken over a year to integrate, but I'm glad it is now done.

I want to say a big -- huge -- thank you to Robert Mustacchi, who not only code reviewed a huge amount of change (and provided a great deal of useful and constructive feedback), but also contributed a rather large swath of man page content in support of this effort, working on his own spare time.  Thanks Robert!

Also, thanks to both Gordon Ross and Dan McDonald who also contributed useful review feedback and facilitated the integration of this project.  Thanks guys!

Any errors in this effort are mine, of course.  I would be extremely interested in hearing constructive feedback.  I expect there will be some performance impact (unavoidable, given that the standards were written to require a thread-specific check in every locale-sensitive routine), but I hope it will be minor.

I'm also extremely interested in feedback from folks who are making use of these new routines.  I'm told the Apache Standard C++ library depends on these interfaces -- I hope someone will try it out and let me know how it goes.   Also, if someone wants/needs xlocale interfaces that I didn't include in this effort, please drop me a note and I'll try to get to it.

As this is a big change, it is not entirely without risk.  I've done what I could to minimize that risk, and to test as much as I could.  If I missed something, please let me know, and I'll attempt to fix it in a timely fashion.

Thanks!

Tuesday, May 27, 2014

illumos, identification, and versions

Recently, there has been a bit of a debate on the illumos mailing lists, beginning I suppose with an initial proposal I made concerning the output from the uname(1) command on illumos, which today, when presented with "-s -r" as options, emits "SunOS 5.11".

A lot of the debate centers on various ways that software can or cannot identify the platform it is running on, and use that information to make programmatic decisions about how to operate. Most often these are decisions made at compile time, by tools such as autoconf and automake.

Clearly, it would be better for software not to rely on uname for programmatic uses, and detractors of my original proposal are correct that even in the Linux community, the value from uname cannot be relied upon for such use.  There are indeed better mechanisms, such as sysconf(3c) or the ZFS feature flags, to determine individual capabilities.  Indeed, the GNU autotools contain many individual tests for such things, and arguably discourage the use of uname except as a last resort.
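For example (a trivial sketch; note that _SC_NPROCESSORS_ONLN is a widely supported extension rather than a strict POSIX requirement), a program that wants to know how many CPUs are online can simply ask the running system, rather than trying to infer anything from the platform name:

    #include <unistd.h>
    #include <stdio.h>

    int
    main(void)
    {
        /* Ask for the capability directly, rather than inferring it from uname. */
        long ncpus = sysconf(_SC_NPROCESSORS_ONLN);
        long pagesize = sysconf(_SC_PAGESIZE);

        (void) printf("online CPUs: %ld, page size: %ld\n", ncpus, pagesize);
        return (0);
    }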

Yet there can be no question that there are a number of packages that do make such use.  And changes to the output from uname become risky to such packages.

But perversely, not changing the output from uname also creates risk for such packages, as the various incarnations of SunOS 5.11 become ever less like one another.  Indeed, illumos != SunOS, and uname has become something of a lie over the past 4 years or so.

Clearly, the focus for programmatic platform determination -- particularly for enabling features or behaviors -- should be to move away from uname.  (In fact, changing uname may actually help package maintainers identify this questionable behavior as questionable, although there is no doubt that such a change would be disruptive to them.)

But all this debate completely misses the other major purpose of uname's output, which is to identify the platform to humans -- be they administrators, developers, or distributors.  There is no question in my mind that illumos' failure to self-identify, and to have a regular "release" schedule (for whatever a release really means in this regard), is harmful.

The distributors, such as Joyent (SmartOS), would prefer that people only identify their particular distribution, and I believe that much of the current argument from them stems from two primary factors.  First, they see additional effort in any change, with no direct benefit to them.  Second, in some ways these distributors are disinclined to emphasize illumos itself.   Some of the messages sent (either over IRC or in the email thread) clearly portray some resentment towards the rest of the illumos ecosystem (especially some of the niche players), as well as a very heavy-handed approach by one commercial concern towards the rest of the ecosystem.

Clearly, this flies in the face of the spirit of community cooperation in which illumos was founded.  illumos should not be a slave to any single commercial concern, and nobody should be able to exercise a unilateral veto over illumos, or shut down conversations with a single "thermonuclear" post.

So, without regard to the merits, or lack thereof, of changing uname's output, I'm quite certain -- based on ample evidence of my own gathering -- that illumos absolutely needs regularly scheduled "releases", with humanly understandable release identifiers.  The lack of both the releases, and the identifiers to go with them, hinders meaningful reviews, hurts distributors (particularly smaller ones), and makes support for the platform harder, particularly when the questions of support are considered across distributions.  All of these hinder adoption of the illumos platform; clearly an undesirable outcome.

Some would argue that we could use git tags for the identifiers.  From a programmatic standpoint, these would be easy to collect.  They have problems as well (particularly for distributions which either don't use git, or use a private fork that doesn't preserve our git history), but there are worse problems.

Specifically, humans aren't meant to derive meaning from something like "e3de96f25bd2ea4282eea2d1a86c1bebac8950cb".   While Neo could understand this, most of us merely mortal individuals simply can't understand such tags.  Worse, there is no implied sequencing here.  Is "e3de96f25bd2ea4282eea2d1a86c1bebac8950cb" newer than "d1007364f5b14efdd7d6ba27aa458669a6365d48"?  You can't do a meaningful comparison without examining the actual git history.

This makes it hard to guess whether a given running release has a given fix integrated or not.  It makes it hard to have conversations about the platform.  It even makes it hard for independent reviewers to identify anything meaningful about the platform in the context of reviewing a distribution.

Furthermore, when we talk about the platform, it's useful for version numbers to convey more than just serial meaning.  In particular, version numbers help set expectations for developers.  Historically, Solaris (or rather SunOS, from which illumos is derived) set those expectations in the form of stability guarantees.   A given library interface might be declared as Stable, or Evolving (or Committed or Uncommitted), or Obsolete.  This was a way to convey to developers the relative risks of an interface (in terms of interface change), and it set some ground rules for the rate of change.  Indeed, Solaris (and SunOS) relied upon a form of semantic versioning, and many of the debates in Sun's architectural leadership for Solaris (PSARC) revolved around these commitments.

Yet today, the illumos distributors seem overly willing to throw that bit of our heritage into the waste bin.  A trend, I fear, which ultimately leads to chaos, and an increase in the difficulty of adoption by ISVs and developers.

illumos is one component -- albeit probably the most important by far -- of a distribution.  Like all components, it is useful to be able to identify it and talk about it.  This is not noticeably different from Linux, Xorg, gnome, or pretty much any of the other systems which you are likely to find as part of a complete Ubuntu or RedHat distribution.  In fact, our situation is entirely analogous, except that we combine our libc and some key utilities and commands with the kernel.

Technically, in Solaris parlance, illumos is a consolidation.  In fact, this distinction has always been clear.  And historically, the way the consolidation is identified is with the contents of the utsname structure, which is what is emitted by uname.
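For reference, a trivial sketch of how a program retrieves that identification today:

    #include <sys/utsname.h>
    #include <stdio.h>

    int
    main(void)
    {
        struct utsname un;

        /* uname(1) simply prints fields from this structure. */
        if (uname(&un) < 0) {
            perror("uname");
            return (1);
        }
        /* On illumos today this prints "SunOS 5.11", regardless of distribution. */
        (void) printf("%s %s\n", un.sysname, un.release);
        return (0);
    }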


In summary, regardless of whether we feel uname should return illumos or not, there is a critical need -- although one not necessarily agreed upon by certain commercial concerns -- for the illumos platform to have, at a minimum, a release number, and this release number must convey meaningful information to end-users, developers, and distributors alike.  It would be useful if this release number were obtainable in the traditional fashion (uname), but it's more important that the numbers convey meaning in the same way across distributions (which means packaging metadata cannot be used, at least not exclusively).

Sunday, April 6, 2014

SP protocols improved again!

Introduction


As a result of some investigations performed in response to my first performance tests for my SP implementation, I've made a bunch of changes to my code.

First off, I discovered that my code was rather racy.  When I started bumping up GOMAXPROCS, and used the -race flag to go test, I found lots of issues.

Second, there were failure scenarios where the performance fell off a cliff, as the code dropped messages, needed to retry, etc. 

I've made a lot of changes to fix the errors.  But, I've also made a major set of changes which enable a vastly better level of performance, particularly for throughput sensitive workloads. Note that to get these numbers, the application should "recycle" the Messages it uses (using a new Free() API... there is also a NewMessage() API to allocate from the cache), which will cache and recycle used buffers, greatly reducing the garbage collector workload.

Throughput


So, here are the new numbers for throughput, compared against my previous runs on the same hardware, including tests against the nanomsg reference itself.

Throughput Comparison (Mb/s)

transport     nanomsg 0.3beta   old gdamore/sp   new (1 thread)   new (2 threads)   new (4 threads)   new (8 threads)
inproc 4k     4322              5551             6629             7751              8654              8841
ipc 4k        9470              2379             6176             6615              5025              5040
tcp 4k        9744              2515             3785             4279              4411              4420
inproc 64k    83904             21615            45618            35044 (b)         44312             47077
ipc 64k       38929             7831 (a)         48400            65190             64471             63506
tcp 64k       30979             12598            34994            49608             53064             53432

(a) I think this poor result is from retries or resubmits inside the old implementation.
(b) I cannot explain this dip; perhaps unrelated activity or GC activity is to blame.

The biggest gains are with large frames (64K), although there are gains for the 4K size as well.  nanomsg still outperforms my code at the 4K size, but with 64K my message caching changes pay dividends, and my code actually beats nanomsg rather handily for the TCP and IPC cases.

I think for 4K, we're hurting due to inefficiencies in the Go TCP handling below my code.  My guess is that there is a higher per packet cost here, and that is what is killing us.  This may be true for the IPC case as well.  Still, these are very respectable numbers, and for some very real and useful workloads my implementation compares and even beats the reference.

The new code really shows some nice gains for concurrency, and makes good use of multiple CPU cores.

There are a few mysteries though.  Notes "a" and "b" point to two of them.  The third is that the IPC performance takes a dip when moving from 2 threads to 4.  It still significantly outperforms the TCP side though, and is still performing more than twice as fast as my first implementation, so I guess I shouldn't complain too much.

Latency


The latency has shown some marked improvements as well.  Here are new latency numbers.

Latency Comparison (usec/op)

transport   nanomsg 0.3beta   old gdamore/sp   new (1 thread)   new (2 threads)   new (4 threads)   new (8 threads)
inproc      6.23              8.47             6.56             9.93              11.0              11.2
ipc         15.7              22.6             27.7             29.1              31.3              31.0
tcp         24.8              50.5             41.0             42.7              42.9              42.9

All in all, the round trip times are reasonably respectable.  I am especially proud of how close I've come to the best inproc time -- a mere 330 nsec separates the Go implementation from the nanomsg native C version.  When you factor in the heavy use of goroutines, this is truly impressive.   To be honest, I suspect that most of those 330 nsec are actually lost in the extra data copy that my inproc implementation has to perform to simulate the "streaming" nature of real transports (i.e. data and headers are not separate on message ingress.)

There's a sad side to the story as well.  TCP handling seems to be less than ideal in Go.  I'm guessing that some effort may be needed to use larger TCP windows, and Nagle may be at play here as well (I've not checked.)  Even so, I've made a 20% improvement in latencies for TCP from my first pass.

The other really nice thing is near-linear scalability as threads (via bumping GOMAXPROCS) are added.  There is very, very little contention in my implementation.  (I presume some underlying contention for the channels exists, but this seems to be on the order of only a usec or so.)  Programs that utilize multiple goroutines are likely to benefit from this.

Conclusion


Simplifying the code to avoid certain indirection (extra passes through additional channels and goroutines), and adding a message pooling layer, have yielded enormous performance gains.  Go performs quite respectably in this messaging application, comparing favorably with a native C implementation.  It also benefits from additional concurrency.

One thing I really found was that it took some extra time to get my layering model correct.  I traded complexity in the core for some extra complexity in the Protocol implementations.  But this avoided a whole other round of context switches, and enormous complexity.  My use of linked lists, and the ugliest bits of mutex and channel synchronization around list-based queues, were removed.  While this means more work for protocol implementors, the reduction in overall complexity leads to marked performance and reliability gains.

I'm now looking forward to putting this code into production use.

Thursday, March 27, 2014

Names are Hard

So I've been thinking about naming for my pure Go implementation of nanomsg's SP protocols.

nanomsg is trademarked by the inventor of the protocols.  (He does seem to take a fairly loose stance on enforcement though -- he advocates using names that are derived from nanomsg, as long as it's clear that there is only one "nanomsg".)

Right now my implementation is known as "bitbucket.org/gdamore/sp".  While this works for code, it doesn't exactly roll off the tongue.  It's also a problem for folks wanting to write about this.  So the name can actually become a barrier to adoption.  Not good.

I suck at names.  After spending a day online with people, we came up with "illumos" for the other open source project I founded.  illumos has traction now, but even that name has problems.  (People want to spell it "illumOS", and they often mispronounce it as "illuminos" (note there are no "n"s in illumos).  And, worse, it turns out that the leading "i" is indistinguishable from the following "l"s -- like this: Illumos -- when used in many common sans-serif fonts -- which is why I never capitalize illumos.  It's also had a profound impact on how I select fonts.  Good-bye Helvetica!)

go-nanomsg already exists, btw, but it's a simple foreign-function binding, with a number of limitations, so I hope Go programmers will choose my version instead.

Anyway, I'm thinking of two options, but I'd like criticisms and better suggestions, because I need to fix this problem soon.

1. "gnanomsg" -- the "g" evokes "Go" (or possibly "Garrett" if I want to be narcissistic about it -- but I don't like vanity naming this way).  In pronouncing it, one could either use a silent "g" like "gnome" or "gnat", or to distinguish between "nanomsg" one could harden the "g" like in "growl".   The problem is that pronunciation can lead to confusion, and I really don't like that "g" can be mistaken to mean this is a GNU program, when it most emphatically is not a GNU.  Nor is it GPL'd, nor will it ever be.

2. "masago" -- this name distantly evokes "messaging ala go", is a real world word, and I happen to like sushi.  But it will be harder for people looking for nanomsg compatible layers to find my library.

I'm leaning towards the first.  Opinions from the community solicited.



Wednesday, March 26, 2014

Early performance numbers

I've added a benchmark tool to my Go implementation of nanomsg's SP protocols, along with the inproc transport, and I'll be pushing those changes rather shortly.

In the meantime, here's some interesting results:

Latency Comparison (usec/op)

transport   nanomsg 0.3beta   gdamore/sp
inproc      6.23              8.47
ipc         15.7              22.6
tcp         24.8              50.5


The numbers aren't all that surprising.  In Go, I'm using non-native interfaces, and my use of several goroutines to manage concurrency probably creates a higher number of context switches per exchange.  I suspect I might find my stuff does a little better with lots and lots of servers hitting it, where I can make better use of multiple CPUs (though one could write a C program that used threads to achieve the same effect).

The story for throughput is a little less heartening though:


Throughput Comparison (Mb/s)

transport   message size   nanomsg 0.3beta   gdamore/sp
inproc      4k             4322              5551
ipc         4k             9470              2379
tcp         4k             9744              2515
inproc      64k            83904             21615
ipc         64k            38929             7831 (?!?)
tcp         64k            30979             12598

I didn't try larger sizes yet; this is just a quick sample test, not an exhaustive performance analysis.  What is interesting is that the ipc case for my code is consistently low.  It uses the same underlying Go interfaces as TCP does, but I guess maybe we are losing some TCP optimizations.  (Note that the TCP tests were performed using loopback; I don't really have 40GbE on my desktop Mac. :-)

I think my results may be worse than they would otherwise be, because I use the equivalent of NN_MSG to dynamically allocate each message as it arrives, whereas the nanomsg benchmarks use a preallocated buffer.   Right now I'm not exposing an API to use preallocated buffers (but I have considered it!  It does feel unnatural though, and more of a "benchmark special".)
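For comparison, here's roughly what that dynamic-allocation style looks like on the C side with nanomsg (a sketch only; error handling is omitted):

    #include <stddef.h>
    #include <nanomsg/nn.h>

    void
    recv_one(int sock)
    {
        void *buf = NULL;

        /* NN_MSG asks nanomsg to allocate a buffer sized to the incoming message. */
        int n = nn_recv(sock, &buf, NN_MSG, 0);
        if (n >= 0) {
            /* ... process the n bytes at buf ... */
            nn_freemsg(buf);    /* hand the buffer back to the library */
        }
    }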

That said, I'm not unhappy with these numbers.  Indeed, it seems that my code performs reasonably well given all the cards stacked against it.  (Extra allocations due to the API, extra context switches due to extra concurrency using channels and goroutines in Go, etc.)

A little more detail about the tests.

All tests were performed using nanomsg 0.3beta and my current Go 1.2 tree, running on my Mac with MacOS X 10.9.2, on a 3.2 GHz Core i5.  The latency tests used full round trip timing using the REQ/REP topology, and a 111 byte message size.  The throughput tests were performed using PAIR.  (Good news, I've now validated PAIR works. :-)

The IPC was directed at a file path in /tmp, and TCP used 127.0.0.1 ports.

Note that my inproc transport tries hard to avoid copying, but does still copy due to a mismatch about header vs. body location.  I'll probably fix that in a future update (it's an optimization, and also kind of a benchmark special, since I don't think inproc gets a lot of performance-critical use; in Go, it would be more natural to use channels for that).

Monday, March 24, 2014

SP (nanomsg) in Pure Go

I'm pleased to announce that this past weekend I released the first version of my pure Go implementation of the SP protocols (scalability protocols, sometimes known by their reference implementation, nanomsg).  This allows them to be used even on platforms where cgo is not present.  It may be possible to use them in the playground (I've not tried yet!)

This is released under an Apache 2.0 license.  (It would be under an even more liberal BSD or MIT license, except that I want to offer -- and demand -- patent protection to and from my users.)

I've been super excited about Go lately.  And having spent some time with ØMQ in a previous project, I was keen to try doing some things with the successor nanomsg project.   (nanomsg is a message queuing and communications system, delivered as a single library.)

Martin (the creator of ØMQ) has written rather extensively about how he wishes he had written it in C instead of C++.  And with nanomsg, that is exactly what he has done.

And C is a great choice for implementing something that is intended to be a foundation for other projects.  But it's not ideal for some circumstances, and the use of async I/O in his C library tends to get in the way of Go's native support for concurrency.

So my pure Go version is available in a form that makes the best use of Go, and tries hard to follow Go idioms.  It doesn't support all the capabilities of Martin's reference implementation -- yet -- but it will be easy to add those capabilities.

Even better, I found it pretty easy to add a new transport layer (TLS) this evening.  Adding the implementation took less than a half hour.  The real work was in writing the test program, and fighting with OpenSSL's obtuse PKI support for my test cases.

Anyway, I encourage folks to take a look at it.  I'm keen for useful & constructive criticism.

Oh, and this work is stuff I've done on my own time over a couple of weekends -- and hence isn't affiliated with, or endorsed by, any of my employers, past or present.

PS: Yes, it should be possible to "adapt" this work to support native ØMQ protocols (ZTP) as well.  If someone wants to do this, please fork this project.  I don't think it's a good idea to try to support both suites in the same package -- there are just too many subtle differences.

Thursday, February 6, 2014

The Failed Promise

My dislike for C++ is well-known by those who know me.  As is my lack of fondness for Perl.

I have a new one though.  Java.  Oracle and Apple have conspired to kill it.

(I should say this much -- it's been a long time, about a decade, since I developed any Java code.  It's entirely possible that I remember the language with more fondness than it truly warrants.  C++ was once beautiful too -- before ANSI went and mutated it beyond all hope back in 1990 or thereabouts.)

Which is a shame, because Java started with such promise.  Write-once, run-anywhere.  Strongly typed, and a paradigm for OO that was so far superior to C++.  (Multiple inheritance is the bane of semi-competent C++ engineers, who often can't even properly cope with pointers to memory.)

For years, I bemoaned the fact that the single biggest weakness of Java was the fact that it was seen as a way to make more dynamic web pages.  (I remember HotJava -- what a revolutionary thing it was indeed.)  But even stand-alone apps struggled with performance (startup times were hideous for anyone starting a Swing app.)

Still, we suffered, because the write-once, run-anywhere promise was just too alluring.  All those performance problems were going to get solved by optimization, and faster systems.  And wasn't it a shame that people associated Java with applets instead of applications?  (Java WebStart tried to drive more of the latter, which should have been a good thing.)

But all the promise that Java seemed to offer is well and truly lost now.  And I don't think it can ever be regained. 

Here's a recent experience I had.

I had reason to go run some Java code to access an IPMI console on a SuperMicro system.  I don't run Windows.   The IPMI console app seems to be a mishmash of Java and native libraries produced by ATEN for SuperMicro.  Supposedly it supported MacOS X and Linux.

I lost several hours (several, as in more than two) trying to get a working IPMI console on my Mac.  I tried Java 7 from Oracle, Java 6 from Apple, I even tried to get a Linux version working in Ubuntu (and after a number of false starts I did actually succeed in the last attempt.)  All just to get access to a simple IPMI console.  Seriously?!?

What are the problems here?
  • Apple doesn't "support" Java officially anymore.
  • Apple disables Java by default on Safari even when it is installed.
  • Apple disables Java WebStart almost entirely.  (The only way to open an unsigned Java WebStart file -- has anyone ever even seen a signed .jnlp? -- is to download it, open it with a right-click in the Finder, and explicitly answer "Yes, I know it's not from an approved vendor, but open the darn thing anyway" -- even though Java also asks me the same question.  Several times.)
  • Oracle ships Java 7 without 32-bit support on Macs, so only Safari can use it (not Chrome).
  • Oracle Java 7 Update 51 has a new security manager that prevents most unsigned apps from running at all.  (The only way to get them to run at all is to track down the Java preferences pane and reduce the security setting.)
  • Developers like ATEN produce native libraries which means that their "Java" apps are totally unportable.
All of this is because people are terrified of the numerous bugs in Java.  Which is appropriate, since there have been bugs in various JVMs.  Running unverified code without some warning to the end-user is dangerous -- really these should be treated with the same caution due a native application.

But I should also not have to be a digital contortionist to access an application.  I am using a native IPMI console from an IP address that I entered by hand on a secured network.  I'd expect to have to answer the "Are you sure you want to run this unsigned app?" question (perhaps with an option to remember the setting) once.  But the barriers to actually executing the app are far too high.  (So high that I still have not had success in getting the SuperMicro IPMI console running on my Mac -- although I did eventually get it to work in a VM running Ubuntu.)

So, shame on Apple and Oracle for doing everything in their power to kill Java.  Shame on ATEN / SuperMicro for requiring Java in the first place, and for polluting it with native code libraries.  And for not getting their code signed.

And shame on the Java ecosystem for allowing this state of affairs to come about.

I'll most likely never write another line of Java, in spite of the fact that I happen to think the language is actually quite elegant for solving OO programming problems.  The challenges in deploying such applications are just far too high.  In the past, the problem was that the cost of the sandbox meant that application startups were slow.  Now, even though we have fast CPUs, we have traded 30-second application startup times for situations where it takes hours for even technical people to get an app running.

I guess it's probably better on Windows.

But if you want to use that as an excuse, then just write a NATIVE APPLICATION, and stop screwing around with this false promise of "run-anywhere". 

If a Javascript app won't work for you, then just bite the bullet and write a native app.  Heck, with the porting solutions available, it isn't that much work to make native apps portable to a variety of platforms.  (Though such solutions necessarily leave niche systems out in the cold.  Let's face it, the market for native desktop apps on SPARC systems running NetBSD is pretty tiny.)  But covering the 5 mainstream client platforms (Windows, Mac, iOS, Android, Linux) is pretty easy.  (Yes, illumos isn't in that list.  And maybe Linux shouldn't be either, although I think the number of folks running a desktop with Linux is probably several orders of magnitude larger than all other UNIXish platforms -- besides MacOS -- combined.)

And, RIP "write once, run-anywhere".  Congratulations Oracle, Apple.  You killed it.