Sunday, April 6, 2014

SP protocols improved again!

Introduction


As a result of some investigations performed in response to my first performance tests for my SP implementation, I've made a bunch of changes to my code.

First off, I discovered that my code was rather racy.  When I started bumping up GOMAXPROCS, and passed the -race flag to go test, I found lots of issues.
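For readers who have not used the race detector, here is a minimal sketch of the kind of shared state it checks. This is purely illustrative and is not one of the actual bugs in the SP code; the counter type and the run helper are hypothetical names. Removing the Lock/Unlock pair in inc turns this into a data race that `go test -race` (or `go run -race`) would be expected to flag once several goroutines are running.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical piece of shared state, for example a count of
// messages sent. It is not taken from the SP implementation.
type counter struct {
	mu sync.Mutex
	n  int
}

// inc increments the counter under the lock. Without the lock, concurrent
// callers would read and write n at the same time, which is a data race.
func (c *counter) inc() {
	c.mu.Lock()
	c.n++
	c.mu.Unlock()
}

// value returns the current count under the lock.
func (c *counter) value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// run starts the given number of goroutines, each calling inc perWorker
// times, waits for all of them, and returns the final count.
func run(workers, perWorker int) int {
	var c counter
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				c.inc()
			}
		}()
	}
	wg.Wait()
	return c.value()
}

func main() {
	fmt.Println(run(8, 1000)) // prints 8000
}
```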

Second, there were failure scenarios where the performance fell off a cliff, as the code dropped messages, needed to retry, etc. 

I've made a lot of changes to fix the errors.  But, I've also made a major set of changes which enable a vastly better level of performance, particularly for throughput sensitive workloads. Note that to get these numbers, the application should "recycle" the Messages it uses (using a new Free() API... there is also a NewMessage() API to allocate from the cache), which will cache and recycle used buffers, greatly reducing the garbage collector workload.
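To make the recycling idea concrete, here is a minimal sketch of a message cache. It is not the library's actual code: only the names NewMessage and Free come from the text above, while the Message layout, the channel-based free list, and the cache depth of 64 are assumptions made for illustration.

```go
package main

import "fmt"

// Message is a hypothetical stand-in for the library's message type.
type Message struct {
	Body []byte
}

// cache is a free list of previously used messages. The capacity (64) is an
// arbitrary choice for this sketch.
var cache = make(chan *Message, 64)

// NewMessage returns a message whose Body has length 0 and a capacity of at
// least size, reusing a cached message when a large enough one is available.
func NewMessage(size int) *Message {
	select {
	case m := <-cache:
		if cap(m.Body) >= size {
			m.Body = m.Body[:0]
			return m
		}
		// The cached buffer is too small; drop it and allocate below.
	default:
		// Nothing cached; allocate below.
	}
	return &Message{Body: make([]byte, 0, size)}
}

// Free hands a message back to the cache. If the cache is full, the message
// is dropped and left for the garbage collector.
func (m *Message) Free() {
	select {
	case cache <- m:
	default:
	}
}

func main() {
	m := NewMessage(4096)
	m.Body = append(m.Body, "hello"...)
	m.Free()

	// The next allocation of the same size reuses the message just freed.
	n := NewMessage(4096)
	fmt.Println(m == n, len(n.Body), cap(n.Body) >= 4096) // prints: true 0 true
}
```

The caller must not touch a message after calling Free, since another goroutine may already have picked it up from the cache.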

Throughput


So, here are the new numbers for throughput, compared against my previous runs on the same hardware, including tests against the nanomsg reference itself.

Throughput Comparison (Mb/s)

transport   nanomsg 0.3beta  old gdamore/sp  new (1 thread)  new (2 threads)  new (4 threads)  new (8 threads)
inproc 4k   4322             5551            6629            7751             8654             8841
ipc 4k      9470             2379            6176            6615             5025             5040
tcp 4k      9744             2515            3785            4279             4411             4420
inproc 64k  83904            21615           45618           35044 (b)        44312            47077
ipc 64k     38929            7831 (a)        48400           65190            64471            63506
tcp 64k     30979            12598           34994           49608            53064            53432

(a) I think this poor result is from retries or resubmits inside the old implementation.
(b) I cannot explain this dip; I think unrelated activity or GC activity may be to blame.

The biggest gains are with large frames (64K), although there are gains for the 4K size as well.  nanomsg still outperforms my code for the 4K size, but with 64K my message caching changes pay dividends and my code actually beats nanomsg rather handily for the TCP and IPC cases.

I think for 4K, we're hurting due to inefficiencies in the Go TCP handling below my code.  My guess is that there is a higher per packet cost here, and that is what is killing us.  This may be true for the IPC case as well.  Still, these are very respectable numbers, and for some very real and useful workloads my implementation compares well with, and even beats, the reference.

The new code really shows some nice gains for concurrency, and makes good use of multiple CPU cores.

There are a few mysteries though.  Notes "a" and "b" point to two of them.  The third is that the IPC performance takes a dip when moving from 2 threads to 4.  It still significantly outperforms the TCP side though, and is still performing more than twice as fast as my first implementation, so I guess I shouldn't complain too much.

Latency


The latency has shown some marked improvements as well.  Here are new latency numbers.

Latency Comparison (usec/op)

transport   nanomsg 0.3beta  old gdamore/sp  new (1 thread)  new (2 threads)  new (4 threads)  new (8 threads)
inproc      6.23             8.47            6.56            9.93             11.0             11.2
ipc         15.7             22.6            27.7            29.1             31.3             31.0
tcp         24.8             50.5            41.0            42.7             42.9             42.9

All in all, the round trip times are reasonably respectable. I am especially proud of how close I've come to the best inproc time -- a mere 330 nsec separates the Go implementation from the nanomsg native C version.  When you factor in the heavy use of goroutines, this is truly impressive.   To be honest, I suspect that most of those 330 nsec are actually lost in the extra data copy that my inproc implementation has to perform to simulate the "streaming" nature of real transports (i.e. data and headers are not separate on message ingress).

There's a sad side to the story as well.  TCP handling seems to be less than ideal in Go.  I'm guessing that some effort is made to use larger TCP windows, and Nagle may be at play here as well (I've not checked).  Even so, I've made a 20% improvement in latencies for TCP from my first pass.
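On the Nagle question: the Go net package documents that TCP connections default to "no delay" (that is, Nagle's algorithm is off unless SetNoDelay(false) is called), so it is probably not the culprit, but it is easy to rule out explicitly. The sketch below is independent of the SP code; echoOnce, roundTrip, and ping are hypothetical helper names. It sets the option on a loopback connection and performs one round trip.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// echoOnce accepts a single connection on l and echoes back everything it
// reads until the peer closes the connection.
func echoOnce(l net.Listener) {
	c, err := l.Accept()
	if err != nil {
		return
	}
	defer c.Close()
	io.Copy(c, c)
}

// roundTrip dials addr, explicitly disables Nagle's algorithm, sends msg,
// and returns the echoed reply.
func roundTrip(addr string, msg []byte) ([]byte, error) {
	c, err := net.Dial("tcp", addr)
	if err != nil {
		return nil, err
	}
	defer c.Close()

	// SetNoDelay(true) turns Nagle off. This is already the documented
	// default in Go; setting it here just removes any doubt.
	if tc, ok := c.(*net.TCPConn); ok {
		if err := tc.SetNoDelay(true); err != nil {
			return nil, err
		}
	}
	if _, err := c.Write(msg); err != nil {
		return nil, err
	}
	reply := make([]byte, len(msg))
	if _, err := io.ReadFull(c, reply); err != nil {
		return nil, err
	}
	return reply, nil
}

// ping starts a throwaway echo server on a loopback port chosen by the
// kernel and performs one round trip against it.
func ping(msg string) (string, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer l.Close()
	go echoOnce(l)

	reply, err := roundTrip(l.Addr().String(), []byte(msg))
	return string(reply), err
}

func main() {
	reply, err := ping("hello")
	fmt.Println(reply, err)
}
```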

The other really nice thing is near linear scalability when threads (via bumping GOMAXPROCS) are added.  There is very, very little contention in my implementation.  (I presume some underlying contention for the channels exists, but this seems to be on the order of only a usec or so.)  Programs that utilize multiple goroutines are likely to benefit well from this.

Conclusion


Simplifying the code to avoid certain indirection (extra passes through additional channels and goroutines), and adding a message pooling layer, have yielded enormous performance gains.  Go performs quite respectably in this messaging application, comparing favorably with a native C implementation.  It also benefits from additional concurrency.

One thing I really found was that it took some extra time to get my layering model correct.  I traded complexity in the core for some extra complexity in the Protocol implementations.  But this avoided a whole other round of context switches, and enormous complexity.  My use of linked lists, and the ugliest bits of mutex and channel synchronization around list-based queues, were removed.  While this means more work for protocol implementors, the reduction in overall complexity leads to marked performance and reliability gains.

I'm now looking forward to putting this code into production use.

Thursday, March 27, 2014

Names are Hard

So I've been thinking about naming for my pure Go implementation of nanomsg's SP protocols.

nanomsg is trademarked by the inventor of the protocols.  (He does seem to take a fairly loose stance with enforcement though -- since he advocates using names that are derived from nanomsg, as long as it's clear that there is only one "nanomsg".)

Right now my implementation is known as "bitbucket.org/gdamore/sp".  While this works for code, it doesn't exactly roll off the tongue.  It's also a problem for folks wanting to write about this.  So the name can actually become a barrier to adoption.  Not good.

I suck at names.  After spending a day online with people, we came up with "illumos" for the other open source project I founded.  illumos has traction now, but even that name has problems.  (People want to spell it "illumOS", and they often mispronounce it as "illuminos"; note there are no "n"s in illumos.  And, worse, it turns out that the leading "i" is indistinguishable from the following "l"s -- like this: Illumos -- when capitalized in many common sans-serif fonts, which is why I never capitalize illumos.  It's also had a profound impact on how I select fonts.  Good-bye Helvetica!)

go-nanomsg already exists, btw, but it's a simple foreign-function binding, with a number of limitations, so I hope Go programmers will choose my version instead.

Anyway, I'm thinking of two options, but I'd like criticisms and better suggestions, because I need to fix this problem soon.

1. "gnanomsg" -- the "g" evokes "Go" (or possibly "Garrett" if I want to be narcissistic about it -- but I don't like vanity naming this way).  In pronouncing it, one could either use a silent "g" like "gnome" or "gnat", or to distinguish it from "nanomsg" one could harden the "g" like in "growl".   The problem is that pronunciation can lead to confusion, and I really don't like that "g" can be mistaken to mean this is a GNU program, when it most emphatically is not GNU software.  Nor is it GPL'd, nor will it ever be.

2. "masago" -- this name distantly evokes "messaging à la Go", is a real-world word, and I happen to like sushi.  But it will be harder for people looking for nanomsg compatible layers to find my library.

I'm leaning towards the first.  Opinions from the community solicited.



Wednesday, March 26, 2014

Early performance numbers

I've added a benchmark tool to my Go implementation of nanomsg's SP protocols, along with the inproc transport, and I'll be pushing those changes rather shortly.

In the meantime, here are some interesting results:

Latency Comparison (usec/op)

transport  nanomsg 0.3beta  gdamore/sp
inproc     6.23             8.47
ipc        15.7             22.6
tcp        24.8             50.5


The numbers aren’t all that surprising.  Using Go, I’m using non-native interfaces, and my use of several goroutines to manage concurrency probably creates a higher number of context switches per exchange.  I suspect I might find my stuff does a little better with lots and lots of servers hitting it, where I can make better use of multiple CPUs (though one could write a C program that used threads to achieve the same effect).

The story for throughput is a little less heartening though:


Throughput Comparison (Mb/s)

transport  message size  nanomsg 0.3beta  gdamore/sp
inproc     4k            4322             5551
ipc        4k            9470             2379
tcp        4k            9744             2515
inproc     64k           83904            21615
ipc        64k           38929            7831 (?!?)
tcp        64k           30979            12598

I didn't try larger sizes yet; this is just a quick sample test, not an exhaustive performance analysis.  What is interesting is that the ipc case for my code is consistently low.  It uses the same underlying transport to Go as TCP, but I guess maybe we are losing some TCP optimizations.  (Note that the TCP tests were performed using loopback; I don't really have 40GbE on my desktop Mac. :-)

I think my results may be worse than they would otherwise be, because I use the equivalent of NN_MSG to dynamically allocate each message as it arrives, whereas the nanomsg benchmarks use a preallocated buffer.   Right now I'm not exposing an API to use preallocated buffers (but I have considered it!  It does feel unnatural though, and more of a "benchmark special".)
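The difference between the two receive styles can be sketched as follows. This is not the SP wire format or the library's API: the 4-byte length prefix and the function names (writeMsg, readAlloc, readInto, sendAndReceive) are assumptions made for illustration. readAlloc allocates a fresh slice per message, roughly what an NN_MSG-style receive does, while readInto reuses a caller-supplied buffer.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// writeMsg frames body with a 4-byte big-endian length prefix. This framing
// is an assumption for the sketch, not the SP wire format.
func writeMsg(w io.Writer, body []byte) error {
	if err := binary.Write(w, binary.BigEndian, uint32(len(body))); err != nil {
		return err
	}
	_, err := w.Write(body)
	return err
}

// readAlloc allocates a new slice for every message it reads.
func readAlloc(r io.Reader) ([]byte, error) {
	var n uint32
	if err := binary.Read(r, binary.BigEndian, &n); err != nil {
		return nil, err
	}
	body := make([]byte, n)
	_, err := io.ReadFull(r, body)
	return body, err
}

// readInto reads the next message into buf, allocating only when buf is too
// small. The returned slice aliases buf and is valid until the next call.
func readInto(r io.Reader, buf []byte) ([]byte, error) {
	var n uint32
	if err := binary.Read(r, binary.BigEndian, &n); err != nil {
		return nil, err
	}
	if int(n) > cap(buf) {
		buf = make([]byte, n)
	}
	buf = buf[:n]
	_, err := io.ReadFull(r, buf)
	return buf, err
}

// sendAndReceive frames each body onto an in-memory "wire", then reads them
// all back through a single reused 64K buffer.
func sendAndReceive(bodies ...string) ([]string, error) {
	var wire bytes.Buffer
	for _, b := range bodies {
		if err := writeMsg(&wire, []byte(b)); err != nil {
			return nil, err
		}
	}
	buf := make([]byte, 0, 64*1024)
	var out []string
	for range bodies {
		msg, err := readInto(&wire, buf)
		if err != nil {
			return nil, err
		}
		out = append(out, string(msg)) // copy out before buf is reused
	}
	return out, nil
}

func main() {
	got, err := sendAndReceive("first", "second")
	fmt.Println(got, err) // prints: [first second] <nil>
}
```

The reused-buffer style is cheaper for the garbage collector, at the price of an awkward ownership rule: the caller has to finish with each message before asking for the next one.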

That said, I'm not unhappy with these numbers.  Indeed, it seems that my code performs reasonably well given all the cards stacked against it.  (Extra allocations due to the API, extra context switches due to extra concurrency using channels and goroutines in Go, etc.)

A little more detail about the tests.

All tests were performed using nanomsg 0.3beta, and my current Go 1.2 tree, running on my Mac running MacOS X 10.9.2, on a 3.2 GHz Core i5.  The latency tests used full round trip timing using the REQ/REP topology, and a 111 byte message size.  The throughput tests were performed using PAIR.  (Good news, I've now validated PAIR works. :-)

The IPC was directed at a file path in /tmp, and TCP used 127.0.0.1 ports.
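For anyone wanting to reproduce the shape of the latency test without the SP library, a rough sketch of full round trip timing over a raw socket looks like this. It is not the benchmark tool itself: serveEcho and measure are hypothetical names, and plain echo stands in for REQ/REP. The 111 byte message size is the one quoted above.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"time"
)

// serveEcho accepts one connection and echoes fixed-size messages back until
// the peer closes the connection.
func serveEcho(l net.Listener, size int) {
	c, err := l.Accept()
	if err != nil {
		return
	}
	defer c.Close()
	buf := make([]byte, size)
	for {
		if _, err := io.ReadFull(c, buf); err != nil {
			return
		}
		if _, err := c.Write(buf); err != nil {
			return
		}
	}
}

// measure listens on (network, addr), connects to itself, and returns the
// mean round trip time over count exchanges of size-byte messages.
func measure(network, addr string, size, count int) (time.Duration, error) {
	if size <= 0 || count <= 0 {
		return 0, fmt.Errorf("size and count must be positive")
	}
	l, err := net.Listen(network, addr)
	if err != nil {
		return 0, err
	}
	defer l.Close()
	go serveEcho(l, size)

	c, err := net.Dial(network, l.Addr().String())
	if err != nil {
		return 0, err
	}
	defer c.Close()

	buf := make([]byte, size)
	start := time.Now()
	for i := 0; i < count; i++ {
		if _, err := c.Write(buf); err != nil {
			return 0, err
		}
		if _, err := io.ReadFull(c, buf); err != nil {
			return 0, err
		}
	}
	return time.Since(start) / time.Duration(count), nil
}

func main() {
	rtt, err := measure("tcp", "127.0.0.1:0", 111, 10000)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("tcp: %.1f usec/op\n", float64(rtt.Nanoseconds())/1000)
}
```

Passing "unix" and an unused path under /tmp instead of "tcp" and a loopback address exercises the IPC-style case. The numbers from a raw echo like this are only a floor; they will not match the SP figures, which include protocol headers and the library's own goroutines.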

Note that my inproc tries hard to avoid copying, but does still copy due to a mismatch about header vs. body location.  I'll probably fix that in a future update.  (It's an optimization, and also kind of a benchmark special, since I don't think inproc gets a lot of performance critical use.  In Go, it would be more natural to use channels for that.)
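As a rough illustration of the "just use channels" alternative, here is a minimal sketch of an in-process pair built from two channels. It has nothing to do with the SP inproc transport; the endpoint type and newPair are hypothetical names. Because a slice is passed by reference, no message body is copied.

```go
package main

import "fmt"

// endpoint is a hypothetical in-process endpoint: one channel per direction.
type endpoint struct {
	send chan<- []byte
	recv <-chan []byte
}

// newPair returns two connected endpoints. depth is the number of messages
// each direction can buffer before a sender blocks.
func newPair(depth int) (endpoint, endpoint) {
	ab := make(chan []byte, depth)
	ba := make(chan []byte, depth)
	return endpoint{send: ab, recv: ba}, endpoint{send: ba, recv: ab}
}

func main() {
	a, b := newPair(1)
	done := make(chan struct{})

	go func() {
		// Replier side: receive one request and send back a reply.
		req := <-b.recv
		b.send <- append([]byte("re: "), req...)
		close(done)
	}()

	a.send <- []byte("ping")
	fmt.Printf("%s\n", <-a.recv) // prints: re: ping
	<-done
}
```

The catch is ownership: once a slice has been sent, the sender must not modify it, since the receiver now sees the same underlying memory.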

Monday, March 24, 2014

SP (nanomsg) in Pure Go

I'm pleased to announce that this past weekend I released the first version of my pure Go implementation of the SP protocols (scalability protocols, sometimes known by the name of their reference implementation, nanomsg). This allows them to be used even on platforms where cgo is not present.  It may be possible to use them in the Go Playground (I've not tried yet!)

This is released under an Apache 2.0 license.  (It would be even more liberal BSD or MIT, except I want to offer -- and demand -- patent protection to and from my users.)

I've been super excited about Go lately.  And having spent some time with ØMQ in a previous project, I was keen to try doing some things in the successor nanomsg project.   (nanomsg is a single library message queue and communications library.)

Martin (creator of ØMQ) has written rather extensively about how he wishes he had written it in C instead of C++.  And with nanomsg, that is exactly what he has done.

And C is a great choice for implementing something that is intended to be a foundation for other projects.  But, it's not ideal for some circumstances, and the use of async I/O in his C library tends to get in the way of Go's native support for concurrency.

So my pure Go version is available in a form that makes the best use of Go, and tries hard to follow Go idioms.  It doesn't support all the capabilities of Martin's reference implementation -- yet -- but it will be easy to add those capabilities.

Even better, I found it pretty easy to add a new transport layer (TLS) this evening.  Adding the implementation took less than a half hour.  The real work was in writing the test program, and fighting with OpenSSL's obtuse PKI support for my test cases.

Anyway, I encourage folks to take a look at it.  I'm keen for useful & constructive criticism.

Oh, and this work is stuff I've done on my own time over a couple of weekends -- and hence isn't affiliated with, or endorsed by, any of my employers, past or present.

PS: Yes, it should be possible to "adapt" this work to support native ØMQ protocols (ZMTP) as well.  If someone wants to do this, please fork this project.  I don't think it's a good idea to try to support both suites in the same package -- there are just too many subtle differences.

Thursday, February 6, 2014

The Failed Promise

My dislike for C++ is well-known by those who know me.  As is my lack of fondness for Perl.

I have a new one though.  Java.  Oracle and Apple have conspired to kill it.

(I should say this much -- it's been a long time, about a decade, since I developed any Java code.  It's entirely possible that I remember the language with more fondness than it truly warrants.  C++ was once beautiful too -- before ANSI went and mutated it beyond all hope back in 1990 or thereabouts.)

Which is a shame, because Java started with such promise.  Write-once, run-anywhere.  Strongly typed, and a paradigm for OO that was so far superior to C++.  (Multiple inheritance is the bane of semi-competent C++ engineers, who often can't even properly cope with pointers to memory.)

For years, I bemoaned the fact that the single biggest weakness of Java was the fact that it was seen as a way to make more dynamic web pages.  (I remember HotJava -- what a revolutionary thing it was indeed.)  But even stand-alone apps struggled with performance (startup times were hideous for anyone starting a Swing app.)

Still, we suffered because the write-once, run-anywhere promise was just too alluring.  All those performance problems were going to get solved by optimization, and faster systems.  And wasn't it a shame that people associated Java with applets instead of applications?  (Java WebStart tried to drive more of the latter, which should have been a good thing.)

But all the promise that Java seemed to offer is well and truly lost now.  And I don't think it can ever be regained. 

Here's a recent experience I had.

I had reason to go run some Java code to access an IPMI console on a SuperMicro system.  I don't run Windows.   The IPMI console app seems to be a mishmash of Java and native libraries produced by ATEN for SuperMicro.  Supposedly it supported MacOS X and Linux.

I lost several hours (several, as in more than two) trying to get a working IPMI console on my Mac.  I tried Java 7 from Oracle, Java 6 from Apple, I even tried to get a Linux version working in Ubuntu (and after a number of false starts I did actually succeed in the last attempt.)  All just to get access to a simple IPMI console.  Seriously?!?

What are the problems here?
  • Apple doesn't "support" Java officially anymore.
  • Apple disables Java by default on Safari even when it is installed.
  • Apple disables Java webstart almost entirely.  (The only way to open an unsigned Java webstart file -- has anyone ever even seen a signed .jnlp? -- is to download it, open it with a right-click in the Finder, and explicitly answer "Yes, I know it's not from an approved vendor, but open the darn thing anyway."  Even though Java also asks me the same question.  Several times.)
  • Oracle ships Java 7 without 32-bit support on Macs, so only Safari can use it (not Chrome)
  • Oracle Java 7 Update 51 has a new security manager that prevents most unsigned apps from running at all.  (The only way to get them to run at all is to track down the Java preferences pane and reduce the security setting.)
  • Developers like ATEN produce native libraries which means that their "Java" apps are totally unportable.
All of this is because people are terrified of the numerous bugs in Java.  Which is appropriate, since there have been bugs in various JVMs.  Running unverified code without some warning to the end-user is dangerous -- really these should be treated with the same caution due a native application.

But, I should also not have to be a digital contortionist to access an application.  I am using a native IPMI console from an IP address that I entered by hand on a secured network.  I'd expect to have to answer the "Are you sure you want to run this unsigned app?" question (perhaps with an option to remember the setting) once.  But the barriers to actually executing the app are far too high.  (So high, that I still have not had success in getting the SuperMicro IPMI console running on my Mac -- although I did eventually get it to work in a VM running Ubuntu.)

So, shame on Apple and Oracle for doing everything in their power to kill Java.  Shame on ATEN / SuperMicro for requiring Java in the first place, and for polluting it with native code libraries.  And for not getting their code signed.

And shame on the Java ecosystem for allowing this state of affairs to come about.

I'll most likely never write another line of Java, in spite of the fact that I happen to think the language is actually quite elegant for solving OO programming problems.  The challenges in deploying such applications are just far too high.  In the past the problem was that the cost of the sandbox meant that application startups were slow.  Now, even though we have fast CPUs, we have traded 30 second application startup times for situations where it takes hours for even technical people to get an app running.

I guess it's probably better on Windows.

But if you want to use that as an excuse, then just write a NATIVE APPLICATION, and stop screwing around with this false promise of "run-anywhere". 

If a JavaScript app won't work for you, then just bite the bullet and write a native app.  Heck, with the porting solutions available, it isn't that much work to make native apps portable to a variety of platforms.  (Though such solutions necessarily leave niche systems out in the cold.  Let's face it, the market for native desktop apps on SPARC systems running NetBSD is pretty tiny.)  But covering the 5 mainstream client platforms (Windows, Mac, iOS, Android, Linux) is pretty easy.  (Yes, illumos isn't in that list.  And maybe Linux shouldn't be either, although I think the number of folks running a desktop with Linux is probably several orders of magnitude larger than all other UNIXish platforms -- besides MacOS -- combined.)

And, RIP "write once, run-anywhere".  Congratulations Oracle, Apple.  You killed it.

Wednesday, January 29, 2014

Sorry for the interruption....

Some of you may have noticed that my email, or my blog, or website, was down.

I discontinued my hosting service with BlueHost, and when I did, I mistakenly thought that by keeping my domain registration with them, I'd still have DNS services.  It was both a foolish mistake, and yet an easy one to make.  (I should have known better...)

Anyway, things are back to normal now, once the interweb's negative caches have timed out...

It does seem unfortunate that BlueHost:


  • Does not include DNS with domain registration, nor do they have a service for it unless you actually host content with them.
  • Did not back up my DNS zone data.  (So even if I paid them to reactivate hosting, I was going to have to re-enter my zone records by hand.)  This is just stupid.

So, I've moved my DNS, and when my registration expires, I'll be moving that to another provider as well.  (Dyn, in case you wondered.)

Saturday, January 18, 2014

Worst Company ... Ever

So I've not blogged in a while, but my recent experience with a major mobile provider was so astonishingly bad that I feel compelled to write about my experiences.  (Executive summary: never do business with T-mobile again!)

I had some friends out from Russia back in November, and they wanted to purchase an iPhone 5s for their son as a birthday gift.  Sadly, we couldn't get unlocked ones from the Apple Store at the time, because they simply lacked inventory.

So we went to the local T-mobile store here on San Marcos Blvd (San Marcos, CA.)  I knew that they were offering phones for full-price (not subsidized), and it seemed like a possible way for them to get a phone.  Tiffany was the customer service agent that "assisted" us.  She looked like she was probably barely out of high school, but anyway she seemed pleased to help us make a purchase.  I asked her no fewer than three times whether the phone was unlocked, and was assured that, yes, the phone was unlocked and would work fine with Russian GSM operators.  I was told that in order for T-mobile to sell us the phone, we had to purchase a 1-month prepaid plan for $50, and a $10 SIM card.  While I was a little annoyed at this, I understood that this was their prerogative, and the cost of wanting to buy the device "right now" and not being able to wait 2 weeks for Apple to ship one.  (My friends had less than a week left in California at this time.)  We were buying the phone outright, so it seemed like there ought to be no reason for the device to have a carrier lock on it.

So, of course, the phone was not unlocked.  It didn't work at all when it got to Russia.  Clearly Tiffany didn't have a clue about the product she sold us.  I guess I shouldn't have been too surprised. But we had put down $760 in cash for a device that now didn't work.  Worst of all, I was embarrassed, as was my friend, when his son got what was essentially a lemon for his birthday gift.

I was pretty upset, and went back to the store.  Tiffany was there, and apologized.  She offered to pay up to $25 for me to use a third party service to unlock the phone since "it was her fault".  Of course, the third party service stopped offering this for iPhones, and their cost for iPhone 5's was over $100 when they were offering it.

I tried to get T-mobile to unlock the phone.  I waited in the store for a manager for about 2 hours, but Tiffany finally got upset and called the police to get me to leave the store after I indicated that I preferred to wait on the premises for a manager.   I left the store premises, unhappy, but resolved to deal with this through the corporate office.

I went home, and called T-mobile customer service.  It turns out that T-mobile store is really a 3rd party company, even though they wear T-mobile shirts, and exclusively offer T-mobile products.  (More on that later.)

I spent quite a few hours on the phone with T-mobile.  Somewhere in the process, the customer service people called Tiffany, whereupon she immediately reversed her story, claiming she had informed me that the phone was locked.  This was just the first of a number of false and misleading statements that were made to me by T-mobile or its affiliates.   I spent several hours on the phone that day.  At the end of the call, I was told that T-mobile would unlock my phone only after at least 40 days had passed from the date of activation.   The person at corporate assured me that if I would just wait the 40 days, then I'd be able to get the phone unlocked.  Since the phone was already in Russia, I figured it would take at least that long to send it back and send a replacement from another carrier.

Oh one other thing.  During that first call, I discovered that Tiffany had actually taken the $50 prepaid plan we purchased, and kept it for herself or her friends.  We didn't figure that out until T-mobile customer service told me that I didn't have a plan at all.  Nice little scam.  While dishonest, I would not have begrudged the action had the phone actually worked.  Of course it didn't.

(During all this, on several different occasions, T-mobile customer service personnel tried to refer me to the same 3rd party unlocking service.  Because hey, if T-mobile can't get its own phones unlocked, maybe their customers should pay some shady third party service to get it done for them.  But it turns out that whatever backdoor deal that service had previously doesn't work anymore, because they've stopped doing it for Apple phones.)

So we waited 40 days.

And then we called to have the phone unlocked.

And T-mobile refused again.  This time I was told that I needed to have $50.01 or more, not just $50 on the plan.  After a few more hours on the phone and escalation up through a few levels of their management chain, they credited my account for $0.20, and then resubmitted the unlock request.  I was guaranteed that the phone would be unlocked.

Two days later, T-mobile denied the unlock request.

At this point, I was informed that the phone had to have T-mobile activity on it within 7 days of the unlock request.  This was simply not going to happen.  The phone was in Russia!  I complained, and I spent quite a few more hours on the phone with T-mobile.  It seems like the folks who control the unlocking process of their phones have nothing to do with the people who answer their customer service lines, nor those people's bosses, nor those people's bosses.  Astonishing!

At this point I had spent over a dozen hours trying to get T-mobile to unlock the darned thing.  T-mobile had got their money from me already, and had done pretty much everything possible to upset me.  The amount of money T-mobile spent dealing on the phone with me, trying to enforce a stupid policy, which wouldn't have been necessary if they had just admitted their mistake and fixed it (which would have cost them nothing whatsoever) is astonishing.  Talk about penny-wise and pound foolish.

At that time, T-mobile told me that I would be able to return the other phone, provided I got it back from Russia.  They agreed to this even though the usual 14 days "buyer's remorse" window had passed.

So at this point, I went and purchased a phone from Verizon (and paid a $50 premium because I was at BestBuy instead of the Apple store and the Verizon store wouldn't sell the phone unless it was part of a plan), and I sent it to Russia with my step-son, who was going there for Christmas.  That phone did work, and my step-son exchanged the T-mobile phone for it, bringing the T-mobile phone back.

The next day, I went to return the phone at the store where I bought it.  Sadly, Tiffany was there.  So was her manager, Erica.

After I spent about an hour at the store trying to return it, they agreed to take it back, minus a $50 restocking fee.  And as I paid cash, I had to provide my checking account information so they could do a deposit for me, which would take a few days.  I got a paper showing that they were sending me a refund, but nothing indicating the account number.  (Turns out it took a few calls from Erica -- who claimed to be the store manager and probably was about 10 minutes older than Tiffany -- the next day, since she apparently had no clue what she was doing and needed to get two separate pieces of information that she failed to collect while I was in the store.)

I did spend a bunch of time with customer service on the phone -- a few more hours I guess, trying to get that last $50.  At this point it was just a matter of principle.  I resented the whole thing, and I wanted as much of my money back as possible.  The customer service person tried to make it right, but because the store (Hit Mobile is the company apparently) is separate from T-mobile, the store's decision was final.  The store manager (Erica) refused to refund the restocking fee, and there was nothing T-mobile could do about it.  To their credit, they did offer me a $50 service credit if I was inclined to keep an account with them.  Needless to say, I have no interest in ever being a T-mobile customer, so I thanked them and declined... indicating that I'd prefer to have funds back in cash.  (Nobody ever mentioned the restocking fee; on the contrary I was told I'd receive the full purchase price less the $60 service plan and SIM card.  Again, story and reality don't match at T-mobile.)

I did eventually get $650 credited back to my account.

So, I was out $110, plus the extra $50, plus the 13+ hours of my time, plus the embarrassment and long turnaround time to get the replacement out to Russia.  All because T-mobile wouldn't fix a mistake they made, even though numerous people at that company recognized that it was clearly the Right Thing To Do.

Turns out that Verizon iPhones are all unlocked.  And I'm already a Verizon customer.  I was thinking about T-mobile's plans -- some of them are attractive on paper, and potentially could have saved me money relative to the rather premium prices I pay for Verizon.  Needless to say, I will not be changing my provider any time soon.  In spite of the high prices, I've always been dealt with honestly and fairly, and I've been happy with the service I've gotten at Verizon.

If for some reason, you decide to get a T-mobile phone, please, please avoid the store located on San Marcos Blvd.  In fact, I urge you to verify that the store is owned by T-mobile corporation.  It's my belief that I might have had more satisfactory results had I been dealing with just a single entity.  That said, numerous people at T-mobile corporate lied to me and made false promises.

I feel so strongly about this, that I'm happy to have spent the hour or so writing up my terrible experiences with them.  I hope that I save someone else these painful experiences.  I won't be unhappy if it also costs T-mobile many potential customers.

For the record, I also think it should be illegal to sell a carrier locked device without clearly indicating this is so as part of the transaction.  The practice of carrier locking devices makes sense when the cost of the device is being subsidized by the carrier as part of a service plan.  But it should be possible to purchase the device outright and remove any such lock.  T-mobile's practices here are a disservice to their customers, and their partners.  The handset manufacturers should apply pressure to them to get them to change their policy here.