Thursday, November 13, 2014

A better illumos...

If you follow illumos very closely, you may already know some of this.

A New Fork


Several months ago, I forked illumos-gate (the primary source code repository for the kernel and system components of illumos) into illumos-core.

I had started upstreaming my work from illumos-core into illumos-gate.  I've since ceased that effort, largely because I simply have no time for the various arguments that my work often generates.  I think this is largely because my vision for illumos is somewhat different from that of other folks, and sadly illumos proper lacks anything resembling a guiding vision now, which means that only entirely non-contentious changes can get integrated into illumos.

However, I still want to proceed apace with illumos-core, because I believe that work has real value, and I firmly believe that my vision for illumos is the one that will lead to greater adoption by users, and by distributors as well, since much of what I'm trying to achieve in illumos-gate is aimed at reducing barriers to adoption and to developers both of illumos itself and of systems that want to build on top of or integrate illumos.  (An example of reducing barriers to adoption -- I recently implemented a BSD compatible flock() within libc, which is sometimes used by applications developed for BSD or Linux.)

Relationship to Upstream


I do also invite other parties to cherry-pick from illumos-core into illumos-gate.  I suspect that a large number of the enhancements I've made, such as the support for the fexecve() function specified by POSIX 2008, are likely to be more widely useful.  Within illumos-core, I want to retain a high standard of quality, and facilitate the effort of upstreaming for those who want to make the effort to do so.

I do want to reiterate that unlike other projects that have forked from illumos, it is not my intent to divorce myself from the community -- rather I see this illumos-core as an experimental branch aimed at exploring new directions that I ultimately hope will be embraced by the wider illumos community some day; by doing this in a separate repository/branch/fork, illumos-core can drive towards these goals without getting mired in questions that would prevent progress on these goals within illumos-gate proper.

The focus here is on delivery, rather than on discussion.  (In fact, one of my taglines on social media has for many years been "Code first, questions later."  The illumos-core effort represents a return to that core value.)

Call for Participation


I'm also interested in having co-collaborators on this project.  The goals are large, and while I hope to achieve them someday even if I have to do it all myself, I'm certain that the project will move quite a lot faster with help.  Also, because of our lack of bureaucracy, I hope that illumos-core can be an easier path to integration than illumos-gate.  I just use a simple github pull-request for integration at present.

There is an opportunity for folks at all different technical levels to participate.  We need work that involves systems programming, but also there is work around documentation, research, shell scripting, test development and release engineering to be performed.  I'm happy to mentor folks who want to help out, based on their skill level.

And, of course, for folks who want to focus primarily on improving illumos-gate upstream, there is effort that could be spent to figure out what to cherry-pick and to do the various illumos-gate process wrangling steps to get those bits integrated.

Friday, October 17, 2014

Your language sucks...

As a result of work I've been doing for illumos, I've recently gotten re-engaged with internationalization, and the support for this in libc and localedef (I am the original author for our localedef.)

I've decided that human languages suck.  Some suck worse than others though, so I thought I'd write up a guide.  You can take this as "your language sucks if...", or perhaps a better view might be "your program sucks if you make assumptions this breaks..."

(Full disclosure, I'm spoiled.  I am a native speaker of English.  English is pretty awesome for data-processing, at least at the written level.  I'm not going to concern myself with questions about deeper issues like grammar, natural language recognition, speech synthesis, or recognition, automatic translation, etc.  Instead this is focused strictly on the most basic display and simple operations like collation (sorting), case conversion, and character classification.)

1. Too many code points. 

Some languages (from Eastern Asia) have way way too many code points.  There are so many that these languages can't actually fit into 16-bits all by themselves.  Yes, I'm saying that there are languages with over 65,000 characters in them!  This explosion means that generating data for languages results in intermediate lookup tables that are megabytes in size.  For Unicode, this impacts all languages.  The intermediate sources for the Unicode supported in illumos blow up to over 2GB when support for the additional code planes is included.

2. Your language requires me to write custom code for symbol names. 

Hangul Jamo, I'm looking at you.  Of all the languages in Unicode, only this one is so bizarre that it requires multiple lookup tables to determine the names of the characters, because the characters are made up of smaller bits of phonetic portions (vowels and consonants.)  It even has its own section in the basic conformance document for Unicode (section 3.12).  I don't speak Korean, but I had to learn about Jamo.

3. Your language's character set is continuing to evolve. 

Yes, that's Asia again (mostly China I think).   The rate at which new Asian characters are added rivals that of updates to the timezone database.  The approach your language uses is wrong!

4. Characters in your language are of multiple different cell widths. 

Again, this is mostly, but not exclusively, Asian languages.  Asian languages require 2 cells to display many of their characters.  But, to make matters far far worse, some times the number f code points used to represent a character is more than one, which means that the width of a character when displayed may be 0, 1, or 2 cells.   Worse, some languages have both half- and full-width forms for many common symbols.  Argh.

5. The width of the character depends on the context. 

Some widths depend on the encoding because of historical practice (Asia again!), but then you have composite characters as well.  For example, a Jamo vowel sound could in theory be displayed on its own.  But if it follows a leading consonant, then it changes the consonant character and they become a new character (at least to the human viewer).

6. Your language has unstable case conversions.

There are some evil ones here, and thankfully they are rare.  But some languages have case conversions which are not reversible!  Case itself is kind of silly, but this is just insane!  Armenian has a letter with this property, I believe.

7. Your language's collation order is context-dependent. 

(French, I'm looking at you!)  Some languages have sorting orders that depend not just on the character itself, but on the characters that precede or follow it.  Some of the rules are really hard.  The collation code required to deal with this generally is really really scary looking.

8. Your language has equivalent alternates (ligatures). 

German, your ß character, which stands in for "ss", is a poster child here.  This is a single code point, but for sorting it is equivalent to "ss".  This is just historical decoration, because it's "fancy".  Stop making my programming life hard.

9. Your language can't decide on a script. 

Some languages can be written in more than one script.  For example, Mongolian can be written using Mongolian script or Cyrillic.  But the winner (loser?) here is Serbian, which in some places uses both Latin and Cyrillic characters interchangeably! Pick a script already! I think the people who live like this are just schizophrenic.  (Given all the political nonsense surrounding language in these places, that's no real surprise.)

10. Your language has Titlecase. 

POSIX doesn't do Titlecase.  This happens because your language also uses ligatures instead of just allocating a separate cell and code point for each character.  Most people talk about titlecase used in a phrase or string of words.  But yes, titlecase can apply to a SINGLE CHARACTER.  For example, Dž is just such a character.

11. Your language doesn't use the same display / ordering we expect.

So some languages use right to left, which is backwards, but whatever.   Others, crazy ones (but maybe crazy smart, if you think about it) use back and forth bidirectional.  And still others use vertical ordering.  But the worst of them are those languages (Asia again, dammit!) where the orientation of text can change.  Worse, some cases even rotate individual characters, depending upon context (e.g. titles are rotated 90 degrees and placed on the right edge).  How did you ever figure out how to use a computer with this crazy stuff?

12. Your encoding collides control codes.

We use the first 32 or so character codes to mean special things for terminal control, etc.  If we can't use these, your language is going to suck over certain kinds of communication lines.

13. Your encoding uses conflicting values at ASCII code points.

ASCII is universal.  Why did you fight it?  But that's probably just me being mostly Anglo-centric / bigoted.

14. Your language encoding uses shift characters. 

(Code page, etc.)  Some East Asian languages used this hack in the old days.  Stateful encodings are JUST HORRIBLY BROKEN.   A given sequence of characters should not depend on some state value that was sent a long time earlier.

15. Your language encoding uses zero values in the middle of valid characters. 

Thankfully this doesn't happen with modern encodings in common use anymore.  (Or maybe I just have decided that I won't support any encoding system this busted.  Such an encoding is so broken that I just flat out refuse to work with it.)

Non-Broken Languages


So, there are some good examples of languages that are famously not broken.

a. English.  Written English has simple sorting rules, and a very simple character set.  Dipthongs are never ligatures.  This is so useful for data processing that I think it has had a great deal to do with why English is the common language for computer scientists around the world.  US-ASCII -- and English character set, is the "base" character set for Unicode, and pretty much all other encodings use ASCII encodings in the lower 7 bits.

b. Russian.  (And likely others that use Cyrillic, but not all of them!)  Russian has a very simple alphabet, strictly phonetic.  The number of characters is small, there are no composite characters, and no special sorting rules.  Hmm... I seem to recall that Russia (Soviet era) had a pretty robust computing industry.  And these days Russians mostly own the Internet, right?  Coincidence?  Or maybe they just don't have to waste a lot of time fighting with the language just to get stuff done?

I think there are probably others.  (At a glance, Geoergian looks pretty straight-forward.   I suspect that there are languages using both Cyrillic and Latin character sets that are sane.  Ethiopic actually looks pretty simple and sane too.  (Again, just from a text processing standpoint.)

But sadly, the vast majority of natural languages have written forms & rules that completely and utterly suck for text processing.

Sunday, October 12, 2014

My Problem with Feminism

I'm going to say some things here that may be controversial.  Certainly that headline is.  But please, bear with me, and read this before you judge too harshly.

As another writer said, 2014 has been a terrible year for women in tech.  (Whether in the industry, or in gaming.)  Arguably, this is not a new thing, but rather events are reaching a head.  Women (some at any rate) are being more vocal, and awareness of women's issues is up.  On the face of it, this should be a good thing.

And yet, we have incredible conflict between women and men.  And this is at the heart of my problem with "Feminism".

The F-Word


Don't get me wrong.  I strongly believe that women should be treated fairly and with respect; in the professional place they should receive the same level of professional respect -- and compensation! -- as their male counterparts can expect.  I believe this passionately -- as a nerd, I prefer to judge people on the merits of their work, rather than on their race, creed, gender, or sexual preference.  A similar principle applies to gaming -- after all, how do you really know the gender of the player on the other side of the MMO?  Does it even matter?  When did gaming become a venue for channeling hate instead of fun?

The problem with "feminism" is that instead of repairing inequality and trying to bring men and women closer together, so much of it seems to be divisive.  The very word itself basically suggests a gender based conflict, and I think this, as well as much of the recent approach, is counterproductive.

Instead of calling attention to inequalities and improper behaviors (lets face it, nobody wants to deal with sexual harassment, discrimination, or some of the very much worse behavior that a few terribly bad actors are guilty of), we've become focused on gender bias and "fixing" gender bias as a goal in and of itself, rather than instead focusing on fair and equal treatment for all.

Every day I'm inundated with tweets and Facebook postings extolling the terrible plight of women at the expense of men.  Many of these posts seem intended to make me either angry at men, or ashamed of being one.  This basically drives a wedge between people, even unconsciously, to the point that it has become impossible to avoid being a soldier on one side or the other of this war.  And don't get me wrong, it has indeed degenerated to a total war.

I don't think this is what most feminists or their advocates really want.  (Though, I think it is what some of them want.  The side of feminism has its bad actors who thrive on conflict just as much as the other side has.  Extremism is gender and color and religion blind, as we've ample evidence of.)

I think one thing that advocates for women in tech can do, is to pick a different term, and a different way of stating their goals, and perhaps a different approach.  I think we've reached the critical mass necessary for awareness, so the constant tweets about how terrible it is to be a woman are no longer helpful.

I'm not sure what "term" should replace feminism -- in the workplace I'd suggest "professionalism".  After all everyone wants to be treated professionally, not just women.  (Btw, I'd say that in the gaming community, the value should be "sportsmanship".  Sadly some will see that word is gender biased, but I don't ascribe to the notion that we have to completely change our language in order to be more politically correct.  You know what I mean.)

Likewise, instead of dog piling on the one person (as I'm sure will happen in response to this post) on someone who doesn't immediately appear to support the feminist agenda, perhaps a little more tolerance, and education should be used in the approach.  Focus should, IMO, be on public praise for the parties who are working to make conditions better.

Educate instead of punish.  Make allies instead of enemies.

Salary Gap


The salary gap issue that was raised recently by Microsoft is another case in point.

I don't agree with Satya Nadella's comments saying that women should not ask for raises, but I think many women are nearly as likely to get a raise upon requesting one as a man of similar accomplishments.  (Yes, it would be better if this statement could have been said without "nearly".)   Far too few women feel comfortable asking for a merit based raise in the first place -- that is something that should change. But using race or gender as a bias to demand pay increases is a recipe for further division.  Indeed, men may begin to wonder if women are being compensated unfairly because they are women, but in the reverse direction. 

Likewise, bringing up discrimination in a salary discussion puts the other party on the defensive.  It presumes to imply prior wrong-doing.  This may be the case, but it may well not be.  After all, I've known many men that were under compensated simply because they sold themselves short, or were not comfortable asking for more money.   Why look for a fight when there isn't one?  (I suspect this is what Satya was really trying to get at.)

None of this helps the cause of "professionalism", and probably not the cause of "feminism".

Average tech salary figures are easily obtainable.  If a worker, man or woman, feels under compensated -- for any reason -- then they should take it to his employer and ask for a correction.  But to presume that the reason is gender, starts the conversation from a point of conflict.

Far far better is to demand far pay based on work performance and merit, relative to industry norms as appropriate.   If an employer won't compensate fairly, just leave.  There is no shortage of tech jobs in the industry.  If you're a woman, maybe look for jobs at companies that employ (and successfully retain) women.  Ask the people who work at a prospective employer about conditions, etc.  That's true for minorities too!  Ultimately, an employer who discriminates will find itself at a severe competitive advantage, as both the discriminated-against parties, and their allies refuse to do business with them.

An employer is not obligated to pay you "more" because of your gender.  But they must also not pay you less because of gender.  And yet every company will generally try to pay as little as they think they can get away with.  So don't let them -- but keep discrimination out of the conversation unless there is really compelling proof of wrong doing.  (And if there is such evidence, I'd recommend looking elsewhere, and possibly explore stronger legal measures.)

And yes, I strongly strongly believe that most men feel as I do.  They support the notion that everyone should be treated equally and professionally, and would like to stamp out sexism in the workplace, but many of us are starting to show symptoms of battle fatigue, and even more of us just don't want to be involved in a conflict at all.   Frankly, I think a lot of us are annoyed at feminist attempts to draw us into the conflict, even though we do support many of the stated goals of equal pay, fair treatment, etc. etc.

Closing Thoughts

As for me, I support the plight of women who find themselves discriminated against based on their gender, and I would like to see more women in my industry.  And I've put my money where my mouth is. 

But at the same time, you won't find me supporting "feminism".  I want to heal the rift, and work with awesome people -- and I happen to believe at least half of the awesome people in the world are of a different gender than I am.  Why would I want to alienate them?

I happen to believe that many well meaning people of many causes damage their cause by basically forcing people to deal with their "diversity" first, instead of of being able to deal with people as people on their own merit.  Its so much harder to appreciate a person on her own merits, when at least half of what she is saying is that she's unfairly treated because of gender, race, sexual preference, etc.  This true for everyone.  Show me how you're excellent, and I promise to appreciate you for your awesomeness, and to treat you fairly and with the same respect I would for anyone of my own gender/race/sexual preference.

You are awesome because of your accomplishments/innovations/contributions, not because of your gender or race or sexual preference.

But, if you won't let me look past your race/gender/etc. identity, then please don't be offended if I don't see anything else.  If you want to be treated like a "person", then let me see the person instead of just some classification in an equal opportunity survey.

Thursday, October 2, 2014

Supporting Women in Open Source

Please have a look at Sage Weil's blog post on supporting the Ada Initiative, which supports women in open source development.

Sage is sponsoring an $8192 matching grant, to support women in open source development of open storage technology.

You may have heard my talk recently, where I expressed that there have been no female contributions to illumos (that includes ZFS by the way!)  This is kind of a tragedy; intelligence and creativity of at least half the population are simply not represented here, and we are worser for it.

If you want to try to do something about it, heres a small thing.  There's a week remaining to do so, so I encourage folks to step up.  ($3392 has already been granted.)

I'm making a donation myself, if you think supporting more women in open source is a worthwhile cause, please join me!

Sunday, September 7, 2014

Modernizing "less"

I have just spent an all-nighter doing something I didn't expect to do.

I've "modernized" less(1).  (That link is to the changeset.)

First off, let me explain the motivation.  We need a pager for illumos that can meet the requirements for POSIX IEEE 2003.1-2008 more(1).  We have a suitable pager (barely), in closed source form only, delivered into /usr/xpg4/bin/more.  We have an open source /usr/bin/more, but it is incredibly limited, hearkening back to the days of printed hard copy I think.  (It even has Microsoft copyrights in it!)

So closed source is kind of a no go for me.

less(1) looks attractive.  It's widely used, and has been used to fill in for more(1) to achieve POSIX compliance on other systems (such as MacOS X.)

So I started by unpacking it into our tree, and trying to get it to work with an illumos build system.

That's when I discovered the crazy contortions autoconf was doing that basically wound up leaving it with just legacy BSD termcap.   Ewww.   I wanted it to use X/Open Curses.

When I started trying to do that, I found that there were severe crasher bugs in less, involving the way it uses scratch buffer space.  I started trying to debug just that problem, but pretty soon the effort mushroomed.

Legacy less supports all kinds of crufty and ancient systems.   Systems like MS-DOS (actually many different versions with different compiler options!) and Ultrix and OS/2 and OS9, and OSK, etc.  In fact, it apparently had to support systems where the C preprocessor didn't understand #elif, so the #ifdef maze was truly nightmarish.  The code is K&R style C even.

I decided it was high time to modernize this application for POSIX systems.  So I went ahead and did a sweeping update.  In the process I ditched thousands of lines of code (the screen handling bits in screen.c are less than half as big as they were).

So, now it:


  • Speaks terminfo (X/Open Curses) instead of ancient BSD termcap
  • Uses glob(3C) instead of a hack involving the shell and a helper program (lessecho, which I've removed from my tree.)
  • Functions properly as /usr/bin/more, both with and without -e (even on broken xterms)
  • Is fully ANSI C (or ISO C, if you prefer)
  • Passes illumos' cstyle code style checks
  • Is lint(1) clean


There is more work to do in the future if someone wants to.  Here are the ideas for the future:


  • Internationalization.  This is a pretty easy task involving gettext().
  • Make less use getopt() instead of its byzantine option parser (it needed that for PC operating systems.  We don't need or want this complexity on POSIX.)
  • Fix its character set handling so it can use the mbstring and wcstring routines in the platform instead of relying on it's own implementation of UTF-8.  (This would make it support other multibyte locales.)
  • Make it support port events instead of sleeping when acting in "tail -f" mode.


If someone wants to pick up any of this work, let me know.  I'm happy to advise.  Oh, and this isn't in illumos proper yet.  It's unclear when, if ever, it will get into illumos -- I expect a lot of grief from people who think I shouldn't have forked this project, and I'm not interested in having  a battle with them.  The upstream has to be a crazy maze because of the platforms it has to support.  We can do better, and I think this was a worthwhile case.  (In any event, I now know quite a lot more about less internals than I did before.  Not that this is a good thing.)

Wednesday, August 20, 2014

It's time already

(Sorry for the political/religious slant this post takes... I've been trying to stay focused on technology, but sometimes events are simply too large to ignore...)

The execution of John Foley is just the latest.  But for me, its the straw that broke the camel's back. 

Over the past weeks, I've become extremely frustrated and angry.  The "radical Islamists" have become the single biggest threat to world peace since Hitler's Nazi's.  And they are worse than the Nazi's.  Which takes some doing.  (Nazi's "merely" exterminated Jews.  The Islamists want to exterminate everyone who does't believe exactly their own particular version of extreme religion.)

I'm not a Muslim.  I'm probably not even a Christian when you get down to it.  I do believe in God, I suppose.  And I do believe that God certainly didn't intend for one group of believes to exterminate another simply because they have different beliefs.

Parts of the Muslim world claim that ISIS and those of its ilk are a scourge, primarily, I think, because they are turning the rest of the world against Islam.  If that's true, then the entire Muslim world who rejects ISIS and radical fundamentalist Islam (and it's not clear to me that rejecting one is the same as the other) needs to come together and eliminate ISIS, and those who follow its beliefs or even sympathize with it. 

That hasn't happened.  I don't see a huge military invasion of ISIS territory by forces from Arabia, Indonesia, and other Muslim nations.  Why not?

I don't believe it is possible to be a peace loving person (Muslim or otherwise), and stand idly by (or advocate standing by) why the terrorist forces who want nothing more than to destroy the very fabric of human society work to achieve their evil ends.

Just as Nazi Germany and Imperial Japan were an Axis of Evil during our grandparents' generation, so now we have a new Axis of Evil that has taken root in the middle east.

It's time now to recognize that there is absolutely no chance for a peaceful coexistence with these people.  They are, frankly, subhuman, and their very existence is at odds with that of everyone everywhere else in the world.

It's time for those of us in civilized nations to stop with our petty nonsense bickering.  The actions taking place in Ukraine, unless you live there (an in many case even if you do live there), are a diversion.  Putin and Obama need to stop their petty bickering, and cooperate to eliminate the real threat to civilization, which is radical Islam.

To be clear, I believe that the time has now come for the rest of the world to pick up and take action, where the Muslim world has failed.  We need to clean house.  We can no longer cite "freedom of expression" and "freedom of religion" as reasons to let imam's recruit young men into death cults.  We must recognize that these acts of incitement to terrorism are indeed what they are, and the perpetrators have no more right to life and liberty than Charles Manson. 

These are forces that seek to overthrow from within, by recruitment, by terrorism, or by any means they can.  These are forces that place no value on human life.  These are forces with which are inimical to the very concept of civilization.

There can be no tolerance for them.  None, whatsoever. 

To be clear, I'm advocating that when a member of one of these organizations willing self identifies as such, we should simply kill them.  Wherever they are.  These are the enemy, and there is no separate battlefield, and they do not recognize "civilians" or "innocents"; therefore, like a cancer, radical Islam must be purged from the very earth, by any means necessary.

The militaries of the world should unit, and work together, to eradicate entrenched forces of radical Islam wherever it exists in the world.  This includes all those forms that practice Sharia law, where a man and woman can be stoned to death simply for marrying without parental consent, as well as those groups that seek to eliminate the state of Israel, that seek to kill those who don't believe exactly as they do, that would issue a fatwa demanding the death of a cartoonist simply for depicting their prophet,  and those who seek to reduce women to the status of mere cattle.

To be clear, we have to do the hard work, all nations of the world, to eliminate this scourge, and eliminate it at its source.  Mosques where radicalism are preached must no longer be sanctuaries.  Schools where "teachers" train their students in the killing of Christians and Jews, and that their God demands the death of "unbelievers" and rewards suicide bombers with paradise, need to be recognized as the training camps they are.  Even if the students are women and children.

Your right to free speech and to religion does not trump my right to live.  Nor, by the way, does it trump my own rights to free speech and religion.

I suppose this means that we have to be willing to accept some losses of combat, in the fight against radicalism.  We also have to accept that "collateral damage" is inevitable.  As with rooting out a cancer, some healthy cells are going to be destroyed.  But these losses have to be endured if the entire organism that is civilization is to survive. 

If this sounds like I'm a hawk, perhaps that's true.  I think, rather, I'm merely someone who wants to survive, and wants the world to be a place where my own children and grandchildren can live without having to endure a constant fear of nut jobs who want to kill them simply because they exist and think differently.

Btw, if Islam as a religion is to survive in the long run, it must see these forces purged.  Because otherwise the only end result becomes an all out war of survival between Muslims and the rest of the world.  And guess which side has the biggest armies and weapons? And who will be the biggest losers in a conflict between Muslims and everyone else?

So, it's time to choose a side.  There is no middle ground.  Radical Islam tolerates no neutrality.  So, what's it going to be?

As for me, I choose civilization and survival.  That means a world without radical Islam.  Period.

Friday, July 11, 2014

POSIX 2008 locale support integrated (illumos)

A year in the making... and finally the code is pushed.  Hooray!

I've just pushed 2964 into illumos, which adds support for a bunch of new libc calls for thread safe and thread-specific locales, as well as explicit locale_t objects.   Some of the interfaces added fall under the BSD/MacOS X "xlocale" class of functions.

Note that not all of the xlocale functions supplied by BSD/MacOS are present.  However, all of the routines that were added by POSIX 2008 for this class are present, and should conform to the POSIX 2008 / XPG Issue 7 standards.  (Note that we are not yet compliant with POSIX 2008, this is just a first step -- albeit a rather major one.)

The webrev is also available for folks who want to look at the code.

The new APIs are documented in newlocale(3c), uselocale(3c), etc.   (Sadly, man pages are not indexed yet so I can't put links here.)

Also, documentation for some APIs that was missing (e.g. strfmon(3c)) are now added.

This project has taken over a year to integrate, but I'm glad it is now done.

I want to say a big -- huge -- thank you to Robert Mustacchi who not only code reviewed a huge amount of change (and provided numerous useful and constructive feedback), but also contributed a rather large swath of man page content in support of this effort, working on is own spare time.  Thanks Robert!

Also, thanks to both Gordon Ross and Dan McDonald who also contributed useful review feedback and facilitated the integration of this project.  Thanks guys!

Any errors in this effort are mine, of course.  I would be extremely interested in hearing constructive feedback.  I expect there will be some minor level of performance impact (unavoidable due to the way the standards were written to require a thread-specific check on all locale sensitive routines), but I hope it will be minor.

I'm also extremely interested in feedback from folks who are making use of these new routines.  I'm told the Apache Standard C++ library depends on these interfaces -- I hope someone will try it out and let me know how it goes.   Also, if someone wants/needs xlocale interfaces that I didn't include in this effort, please drop me a note and I'll try to get to it.

As this is a big change, it is not entirely without risk.  I've done what I could to minimize that risk, and test as much as I could.  If I missed something, please let me know, and I'll attempt to fix in a timely fashion.

Thanks!

Tuesday, May 27, 2014

illumos, identification, and versions

Recently, there has been a bit of a debate on the illumos mailing lists, beginning I suppose with an initial proposal I made concerning the output from the uname(1) command on illumos, which today, when presented with "-s -r" as options, emits "SunOS 5.11".

A lot of the debate centers on various ways that software can or cannot identify the platform it is running on, and use that information to make programmatic decisions about how to operate. Most often these are decisions made at compile time, by tools such as autoconf and automake.

Clearly, it would be better for software not to rely on uname for programmatic uses, and detractors of my original proposal are correct that even in the Linux community, the value from uname cannot be relied upon for such use.  There are indeed better mechanisms to use, such as sysconf(3c), or the ZFS feature flags, to determine individual capabilities.  Indeed, the GNU autotools contains many individual tests for such things, and arguably discourages the use of uname except as a last resort.

Yet there can be no question that there are a number of packages that do make such use.  And changes to the output from uname become risky to such packages.

But perversely, not changing the output from uname also creates risk for such packages, as the various incarnations of SunOS 5.11 become ever less like one another.  Indeed, illumos != SunOS, and uname has become something of a lie over the past 4 years or so.

Clearly, the focus for programmatic platform determination -- particularly for enabling features or behaviors, should be to move away from uname.  (Actually, changing uname may actually help package maintainers in identifying this questionable behavior as questionable, although there is no doubt that such a change would be disruptive to them.)

But all this debate completely misses the other major purpose of uname's output, which is to identify the platform to humans.  Be they administrators, or developers, or distributors.  There is no question in mind that illumos' failure to self identify, and to have a regular "release" schedule (for whatever a release really means in this regard) is harmful.

The distributors, such as Joyent (SmartOS), would prefer that people only identify their particular distribution, and I believe that much of the current argument from them stems from two primary factors.  First, they see additional effort in any change, with no direct benefit to them.  Second, in some ways these distributors are disinclined to emphasize illumos itself.   Some of the messages sent (either over IRC or in the email thread) clearly portray some resentment towards the rest of the illumos ecosystem, especially some of the niche players), as well as a very heavy handed approach by one commercial concern towards the rest of the ecosystem.

Clearly, this flies in the spirit of community cooperation in which illumos was founded.  illumos should not be a slave to any single commercial concern, and nobody should be able to exercise a unilateral veto over illumos, or shut down conversations with a single "thermonuclear" post.

So, without regard to the merits, or lack thereof, of changing uname's output, I'm quite certain, with sufficient clear evidence of my own gathering, that illumos absolutely needs to have a regularly scheduled "releases" of illumos, with humanly understandable release identifiers.  The lack of both the releases, and the identifiers to go with them, hinders meaningful reviews, hurts distributors (particularly smaller ones), and makes support for the platform harder, particularly when the questions of support are considered across distributions.  All of these hinder adoption of the illumos platform; clearly an undesirable outcome.

Some would argue that we could use git tags for the identifiers.  From a programmatic standpoint, these would be easy to collect.  Although they have problems as well (particularly for distributions which neither use git, or use a private fork that doesn't preserve our git versions), there are worse problems.

Specifically, a humans aren't meant to derive meaning from something like "e3de96f25bd2ea4282eea2d1a86c1bebac8950cb".   While Neo could understand this, most of use merely mortal individuals simply can't understand such tags.  Worse, there is no implied sequencing here.  Is "e3de96f25bd2ea4282eea2d1a86c1bebac8950cb" newer than "d1007364f5b14efdd7d6ba27aa458669a6365d48" ?  You can't do a meaningful comparison without examining the actual git history.

This makes it hard to guess whether a given running release has a bug integrated or not.  It makes it hard to have conversations about the platform.  It even makes it hard for independent reviewers of the platform  to identify anything meaningful about the platform in the context of reviewing a distribution.

Furthermore, when we talk about the platform, its useful for version numbers to convey more than just serial meaning.  In particular, version numbers help set expectations for developers.  Historically Solaris (or rather SunOS, from which illumos is derived), set those expectations in the form of stability guarantees.   A given library interface might be declared as Stable, or Evolving, (or Committed or Uncommitted), or Obsolete.  This was a way to convey to developers the relative risks of an interface (in terms of interface change), and it set some ground rules for rate of change.  Indeed, Solaris (and SunOS) relied upon a form of semantic versioning, and many of the debates in Sun's architectural leadership for Solaris (PSARC) revolved around these commitments.

Yet today, the illumos distributors seem over willing to throw that bit of our heritage into the waste bin.  A trend, I fear, which ultimately leads to chaos, and an increase in the difficulty of adoption by ISVs and developers.

illumos is one component -- albeit probably the most important by far -- of a distribution.  Like all components, it is useful to be able to determine and talk about it.  This is not noticeably different than Linux, Xorg, gnome, or pretty much any of the other systems which you are likely to find as part of a complete Ubuntu or RedHat distribution.  In fact, our situation is entirely analogous other than we combine our libc and some key utilities and commands with the kernel.

Technically, in Solaris parlance, illumos is a consolidation.  In fact, this distinction has alway been clear.  And historically the way the consolidation is identified is with the contents of the utsname structure, which is what is emitted by uname.  

Furthermore, when we talk about the platform, its useful for version numbers to convey more than just serial meaning

In summary, regardless of whether we feel uname should return illumos or not, there is a critical need, although one not necessarily agreed upon by certain commercial concerns, for the illumos platform to have a release number at a minimum, and this release number must be useful to convey meaningful information to end-users, developers, and distributors alike.  It would be useful if this release number were obtainable in the traditional fashion (uname), but its more important that the numbers convey meaning in the same way across distributions (which means packaging metadata cannot be used, at least not exclusively).

Sunday, April 6, 2014

SP protocols improved again!

Introduction


As a result of some investigations performed in response to my first performance tests for my SP implementation, I've made a bunch of changes to my code.

First off, I discovered that my code was rather racy.  When I started bumping up GOMAXPROCS, and and used the -race flag to go test, I found lots of issues. 

Second, there were failure scenarios where the performance fell off a cliff, as the code dropped messages, needed to retry, etc. 

I've made a lot of changes to fix the errors.  But, I've also made a major set of changes which enable a vastly better level of performance, particularly for throughput sensitive workloads. Note that to get these numbers, the application should "recycle" the Messages it uses (using a new Free() API... there is also a NewMessage() API to allocate from the cache), which will cache and recycle used buffers, greatly reducing the garbage collector workload.

Throughput


So, here are the new numbers for throughput, compared against my previous runs on the same hardware, including tests against the nanomsg reference itself.

Throughput Comparision
(Mb/s)
transportnanomsg 0.3betaold gdamore/spnew
(1 thread)
new
(2 threads)
new
(4 threads)
new
(8 threads)
inproc 4k432255516629775186548841
ipc 4k947023796176661550255040
tcp 4k974425153785427944114420
inproc 64k83904216154561835044b4431247077
ipc 64k389297831a48400651906447163506
tcp 64k309791259834994496085306453432

a I think this poor result is from retries or resubmits inside the old implementation.
b I cannot explain this dip; I think maybe unrelated activity or GC activity may be to blame

The biggest gains are with large frames (64K), although there are gains for the 4K size as well.  nanomsg still out performs for the 4K size, but with 64K my message caching changes pay dividends and my code actually beats nanomsg rather handily for the TCP and IPC cases.

I think for 4K, we're hurting due to inefficiencies in the Go TCP handling below my code.  My guess is that there is a higher per packet cost here, and that is what is killing us.  This may be true for the IPC case as well.  Still, these are very respectable numbers, and for some very real and useful workloads my implementation compares and even beats the reference.

The new code really shows some nice gains for concurrency, and makes good use of multiple CPU cores.

There are a few mysteries though.  Notes "a" and "b" point to two of them.  The third is that the IPC performance takes a dip when moving from 2 threads to 4.  It still significantly outperforms the TCP side though, and is still performing more than twice as fast as my first implementation, so I guess I shouldn't complain too much.

Latency


The latency has shown some marked improvements as well.  Here are new latency numbers.

Latency Comparision
(usec/op)
transportnanomsg 0.3betaold gdamore/spnew
(1 thread)
new
(2 threads)
new
(4 threads)
new
(8 threads)
inproc6.238.476.569.9311.011.2
ipc15.722.627.729.131.331.0
tcp24.850.541.042.742.942.9

All in all, the round trip times are reasonably respectable. I am especially proud of how close I've come within the best inproc time -- a mere 330 nsec separates the Go implementation from the nanomsg native C version.  When you factor in the heavy use of go routines, this is truly impressive.   To be honest, I suspect that most of those 330 nsec are actually lost in the extra data copy that my inproc implementation has to perform to simulate the "streaming" nature of real transports (i.e. data and headers are not separate on message ingress.)

There's a sad side to story as well.  TCP handling seems to be less than ideal in Go.  I'm guessing that some effort is done to use larger TCP windows, and Nagle may be at play here as well (I've not checked.) Even so, I've made a 20% improvement in latencies for TCP from my first pass.

The other really nice thing is near linear scalability when threads (via bumping GOMAXPROCS) are added.  There is very, very little contention in my implementation.  (I presume some underlying contention for the channels exists, but this seems to be on the order of only a usec or so.)  Programs that utilize multiple goroutines are likely to benefit well from this.

Conclusion


Simplifying the code to avoid certain indirection (extra passes through additional channels and goroutines), and adding a message pooling layer, have yielded enormous performance gains.  Go performs quite respectably in this messaging application, comparing favorably with a native C implementation.  It also benefits from additional concurrency.

One thing I really found was that it took some extra time to get my layering model correct.  I traded complexity in the core for some extra complexity in the Protocol implementations.  But this avoided a whole other round of context switches, and enormous complexity.  My use of linked lists, and the ugliest bits of mutex and channel synchronization around list-based queues, were removed.  While this means more work for protocol implementors, the reduction in overall complexity leads to marked performance and reliability gains.

I'm now looking forward to putting this code into production use.

Thursday, March 27, 2014

Names are Hard

So I've been thinking about naming for my pure Go implementation of nanomsg's SP protocols.

nanomsg is trademarked by the inventor of the protocols.  (He does seem to take a fairly loose stance with enforcement though -- since he advocates using names that are derived from nanomsg, as long as its clear that there is only one "nanomsg".)

Right now my implementation is known as "bitbucket.org/gdamore/sp".  While this works for code, it doesn't exactly roll off the tongue.  Its also a problem for folks wanting to write about this.  So the name can actually become a barrier to adoption.  Not good.

I suck at names.  After spending a day online with people, we came up with "illumos" for the other open source project I founded.  illumos has traction now, but even that name has problems.  (People want to spell it "illumOS", and they often mispronounce it as "illuminos"  (note there are no "n"'s in illumos).  And, worse, it turns out that the leading "i" is indistinguishable from the following "l's" -- like this: Illumos --  when used in many common san-serif fonts -- which is why I never capitalize illumos.  Its also had a profound impact on how I select fonts.  Good-bye Helvetica!)

go-nanomsg already exists, btw, but its a simple foreign-function binding, with a number of limitations, so I hope Go programmers will choose my version instead.

Anyway, I'm thinking of two options, but I'd like criticisms and better suggestions, because I need to fix this problem soon.

1. "gnanomsg" -- the "g" evokes "Go" (or possibly "Garrett" if I want to be narcissistic about it -- but I don't like vanity naming this way).  In pronouncing it, one could either use a silent "g" like "gnome" or "gnat", or to distinguish between "nanomsg" one could harden the "g" like in "growl".   The problem is that pronunciation can lead to confusion, and I really don't like that "g" can be mistaken to mean this is a GNU program, when it most emphatically is not a GNU.  Nor is it GPL'd, nor will it ever be.

2. "masago" -- this name distantly evokes "messaging ala go", is a real world word, and I happen to like sushi.  But it will be harder for people looking for nanomsg compatible layers to find my library.

I'm leaning towards the first.  Opinions from the community solicited.



Wednesday, March 26, 2014

Early performance numbers

I've added a benchmark tool to my Go implementation of nanomsg's SP protocols, along with the inproc transport, and I'll be pushing those changes rather shortly.

In the meantime, here's some interesting results:

Latency Comparision
(usec/op)
transport nanomsg 0.3beta gdamore/sp
inproc6.238.47
ipc15.722.6
tcp24.850.5


The numbers aren’t all that surprising.  Using go, I’m using non-native interfaces, and my use of several goroutines to manage concurrency probably creates a higher number of context switches per exchange.  I suspect I might find my stuff does a little better with lots and lots of servers hitting it, where I can make better use of multiple CPUs (though one could write a C program that used threads to achieve the same effect).

The story for throughput is a little less heartening though:


Throughput Comparision
(Mb/s)
transport message size nanomsg 0.3beta gdamore/sp
inproc4k43225551
ipc4k94702379
tcp4k97442515
inproc64k8390421615
ipc64k389297831 (?!?)
tcp64k3097912598

I didn't try larger sizes yet, this is just a quick sample test, not an exhaustive performance analysis.  What is interesting is that the ipc case for my code is consistently low.  It uses the same underlying transport to Go as TCP, but I guess maybe we are losing some TCP optimizations.  (Note that the TCP tests were performed using loopback, I don't really have 40GbE on my desktop Mac. :-)

I think my results may be worse than they would otherwise be, because I use the equivalent of NN_MSG to dynamically allocate each message as it arrives, whereas the nanomsg benchmarks use a preallocated buffer.   Right now I'm not exposing an API to use preallocated buffers (but I have considered it!  It does feel unnatural though, and more of a "benchmark special".)

That said, I'm not unhappy with these numbers.  Indeed, it seems that my code performs reasonably well given all the cards stacked against it.  (Extra allocations due to the API, extra context switches due to extra concurrency using channels and goroutines in Go, etc.)

A litte more details about the tests.

All test were performed using nanomsg 0.3beta, and my current Go 1.2 tree, running on my Mac running MacOS X 10.9.2, on 3.2 GHz Core i5.  The latency tests used full round trip timing using the REQ/REP topology, and a 111 byte message size.  The throughput tests were performed using PAIR.  (Good news, I've now validated PAIR works. :-)

The IPC was directed at file path in /tmp, and TCP used 127.0.0.1 ports.

Note that my inproc tries hard to avoid copying, but does still copy due to a mismatch about header vs. body location.  I'll probably fix that in a future update (its an optimization, and also kind of a benchmark special since I don't think inproc gets a lot of performance critical use.  In Go, it would be more natural to use channels for that.

Monday, March 24, 2014

SP (nanomsg) in Pure Go

I'm pleased to announce that this past weekend I released the first version of my implementation of the SP (scalability protocols, sometimes known by their reference implementation, nanomsg) implemented in pure Go. This allows them to be used even on platforms where cgo is not present.  It may be possible to use them in playground (I've not tried yet!)

This is released under an Apache 2.0 license.  (It would be even more liberal BSD or MIT, except I want to offer -- and demand -- patent protection to and from my users.)

I've been super excited about Go lately.  And having spent some time with ØMQ in a previous project, I was keen to try doing some things in the successor nanomsg project.   (nanomsg is a single library message queue and communications library.)

Martin (creator of ØMQ) has written rather extensively about how he wishes he had written it in C instead of C++.  And with nanomsg, that is exactly what he is done.

And C is a great choice for implementing something that is intended to be a foundation for other projects.  But, its not ideal for some circumstances, and the use of async I/O in his C library tends to get in the way of Go's native support for concurrency.

So my pure Go version is available in a form that makes the best use of Go, and tries hard to follow Go idioms.  It doesn't support all the capabilities of Martin's reference implementation -- yet -- but it will be easy to add those capabilities.

Even better, I found it pretty easy to add a new transport layer (TLS) this evening.  Adding the implementation took less than a half hour.  The real work was in writing the test program, and fighting with OpenSSL's obtuse PKI support for my test cases.

Anyway, I encourage folks to take a look at it.  I'm keen for useful & constructive criticism.

Oh, and this work is stuff I've done on my own time over a couple of weekends -- and hence isn't affiliated with, or endorsed by, any of my employers, past or present.

PS: Yes, it should be possible to "adapt" this work to support native ØMQ protocols (ZTP) as well.  If someone wants to do this, please fork this project.  I don't think its a good idea to try to support both suites in the same package -- there are just too many subtle differences.

Thursday, February 6, 2014

The Failed Promise

My dislike for C++ is well-known by those who know me.  As is my lack of fondness for Perl.

I have a new one though.  Java.  Oracle and Apple have conspired to kill it.

(I should say this much -- its been a long time, about a decade, since I developed any Java code.  Its entirely possible that I remember the language with more fondness than it truly warrants.  C++ was once beautiful too -- before ANSI went and mutated it beyond all hope back in 1990 or thereabouts.)

Which is a shame, because Java started with such promise.  Write-once, run-anywhere.  Strongly typed, and a paradigm for OO that was so far superior to C++.  (Multiple inheritance is the bane of semi-competent C++ engineers, who often can't even properly cope with pointers to memory.)

For years, I bemoaned the fact that the single biggest weakness of Java was the fact that it was seen as a way to make more dynamic web pages.  (I remember HotJava -- what a revolutionary thing it was indeed.)  But even stand-alone apps struggled with performance (startup times were hideous for anyone starting a Swing app.)

Still, we suffered because of the write once, run-anywhere promise was just too alluring.  All those performance problems were going to get solved by optimization, and faster systems.  And wasn't it a shame that people associated Java with applets instead of applications?  (Java WebStart tried to drive more of the latter, which should have been a good thing.)

But all the promise that Java seemed to offer is well and truly lost now.  And I don't think it can ever be regained. 

Here's a recent experience I had.

I had reason to go run some Java code to access an IPMI console on a SuperMicro system.  I don't run Windows.   The IPMI console app seems to be a mishmash of Java and native libraries produced by ATEN for SuperMicro.  Supposedly it supported MacOS X and Linux.

I lost several hours (several, as in more than two) trying to get a working IPMI console on my Mac.  I tried Java 7 from Oracle, Java 6 from Apple, I even tried to get a Linux version working in Ubuntu (and after a number of false starts I did actually succeed in the last attempt.)  All just to get access to a simple IPMI console.  Seriously?!?

What are the problems here?
  • Apple doesn't "support" Java officially anymore.
  • Apple disables Java by default on Safari even when it is installed.
  • Apple disables Java webstart almost entirely.  (The only way to open an unsigned Java webstart file -- has anyone ever even seen a signed .jnlp?) is to download it, and open it with a right-click in the finder and explicitly answer "Yes, I know its not from an approved vendor, but open the darn thing anyway.  Even though Java also asks me the same question.  Several times.)
  • Oracle ships Java 7 without 32-bit support on Macs, so only Safari can use it (not Chrome)
  • Oracle Java 7 Update 51 has a new security manager that prevents most unsigned apps from running at all.  (The only way to get them to run at all is to track down the Java preferences pane and reduce the security setting.)
  • Developers like ATEN produce native libraries which means that their "Java" apps are totally unportable.
All of this is because people are terrified of the numerous bugs in Java.  Which is appropriate, since there have been bugs in various JVMs.  Running unverified code without some warning to the end-user is dangerous -- really these should be treated with the same caution due a native application.

But, I should also not have to be a digital contortionist to access an application.  I am using a native IPMI console from an IP address that I entered by hand on a secured network.  I'd expect to have to answer the "Are you sure you want to run this unsigned app?" question (perhaps with an option to remember the setting) once.  But the barriers to actually executing the app are far too high.  (So high, that I still have not had success in getting the SuperMicro IPMI console running on my Mac -- although I did eventually get it to work in a VM running Ubuntu.)

So, shame on Apple and Oracle for doing everything in their power to kill Java.  Shame on ATEN / SuperMicro for requiring Java in the first place, and for polluting it with native code libraries.  And for not getting their code signed.

And shame on the Java ecosystem for allowing this state of affairs to come about.

I'll most likely never write another line of Java, in spite of the fact that I happen to think the language is actually quite elegant for solving OO programming problems.  The challenges in deploying such applications are just far too high.  In the past the problem was that the cost of the sandbox meant that application startups are slow.  Now, even though we have fast CPUs, we have traded 30 second application startup times for situations where it takes hours for even technical people to get an app running.

I guess its probably better on Windows.

But if you want to use that as an excuse, then just write a NATIVE APPLICATION, and stop screwing around with this false promise of "run-anywhere". 

If a Javascript app won't work for you, then just bite the bullet and write a native app.  Heck, with the porting solutions available, it isn't that much work to make native apps portable to a variety of platforms.  (Though such solutions necessarily leave niche systems out in the cold.  Let's face it, the market for native desktop apps on SPARC systems running NetBSD is pretty tiny.)  But covering the 5 mainstream client platforms (Windows, Mac, iOS, Android, Linux) is pretty easy.  (Yes, illumos isn't in that list.  And maybe Linux shouldn't be either, although I think the number of folks running a desktop with Linux is probably several orders of magnitude larger than all other UNIXish platforms -- besides MacOS -- combined.)

And, RIP "write once, run-anywhere".  Congratulations Oracle, Apple.  You killed it.

Wednesday, January 29, 2014

Sorry for the interruption....

Some of you may have noticed that my email, or my blog, or website, was down.

I discontinued my hosting service with BlueHost, and when I did, I mistakenly thought that by keeping my domain registration with them, that I'd still have DNS services.  It was both a foolish mistake, and yet an easy one to make.  (I should have known better...)

Anyway, things are back to normal now, once the interweb's negative caches have timed out...

It does seem unfortunate that BlueHost:


  • Does not include DNS with domain registration, nor do they have a service for it unless you actually host content with them.
  • Did not backup my DNS zone data.  (So even if I paid them to reactivate hosting, I was going to have to re-enter my zone records by hand.)  This is just stupid.

So, I've moved my DNS, and when my registration expires, I'll be moving that to another provider as well.  (Dyn, in case you wondered.)

Saturday, January 18, 2014

Worst Company ... Ever

So I've not blogged in a while, but my recent experience with a major mobile provider was so astonishingly bad that I feel compelled to write about my experiences.  (Executive summary: never do business with T-mobile again!)

I had some friends out from Russia back in November, and they wanted to purchase an iPhone 5s for their son as a birthday gift.  Sadly, we couldn't get unlocked ones from the Apple Store at the time, because they simply lacked inventory.

So we went to the local T-mobile store here on San Marcos Blvd (San Marcos, CA.)  I knew that they were offering phones for full-price (not subsidized), and it seemed like a possible way for them to get a phone.  Tiffany was the customer service agent that "assisted" us.  She looked like she was probably barely out of high school,  but anyway she seemed pleased to help us make a purchase.  I asked her no fewer than three times whether the phone was unlocked, and was assured that, yes, the phone was unlocked and would work fine with Russian GSM operators.  I was told that in order for T-mobile to sell us the phone, we had to purchase a 1-month prepaid plan for $50, and a $10 sim card.  While I was little annoyed at this, I understood that this was their prerogative, and the cost of wanting to buy the device "right now" and not being able to wait 2 weeks for Apple to ship one.  (My friends had less than a week left in California at this time.)  We were buying the phone outright, so it seemed like there ought to be no reason for the device to have a carrier lock on it.

So, of course, the phone was not unlocked.  It didn't work at all when it got to Russia.  Clearly Tiffany didn't have a clue about the product she sold us.  I guess I shouldn't have been too surprised. But we had put down $760 in cash for a device that now didn't work.  Worst of all, I was embarrassed, as was my friend, when his son got what was essentially a lemon for his birthday gift.

I was pretty upset, and went back to the store.  Tiffany was there, and apologized.  She offered to pay up to $25 for me to use a third party service to unlock the phone since "it was her fault".  Of course, the third party service stopped offering this for iPhones, and their cost for iPhone 5's was over $100 when they were offering it.

I tried to get T-mobile to unlock the phone.  I waited in the store for a manager for about 2 hours, but Tiffany finally got upset and called the police to get me to leave the store after I indicated that I preferred to wait on the premises for a manager.   I left the store premises, unhappy, but resolved to deal with this through the corporate office.

I went home, and called T-mobile customer service.  It turns out that T-mobile store is really a 3rd party company, even though they wear T-mobile shirts, and exclusively offer T-mobile products.  (More on that later.)

I spent quite a few hours on the phone with T-mobile.  Somewhere in the process, the customer service people called Tiffany, whereupon she immediately reversed her story, claiming she had informed me that the phone was locked.  This was just the first of a number of false and misleading statements that were made to me by T-mobile or its affiliates.   I spent several hours on the phone that day.  At the end of the call, I was told that T-mobile would unlock my phone only after at least 40 days had passed from the date of activation.   The person at corporate assured me that if I would just wait the 40 days, then I'd be able to get the phone unlocked.  Since the phone was already in Russia, I figured it would take at least that long to send it back and send a replacement from another carrier.

Oh one other thing.  During that first call, I discovered that Tiffany had actually taken the $50 prepaid plan we purchased, and kept it for herself or her friends.  We didn't figure that out until T-mobile customer service told me that I didn't have a plan at all.  Nice little scam.  While dishonest, I would not have begrudged the action had the phone actually worked.  Of course it didn't.

(During all this, on several different occasions, T-mobile customer service personnel tried to refer me to the same 3rd party unlocking service.  Because hey, if T-mobile can't get its own phones unlocked, maybe their customers should pay some shady third party service to get it done for them.  But it turns out that whatever backdoor deal that service had previously doesn't work anymore, because they've stopped doing it for Apple phones.)

So we waited 40 days.

And then we called to have the phone unlocked.

And T-mobile refused again.  This time I was told that I needed to have $50.01 or more, not just $50 on the plan.  After a few more hours on the phone and escalation up through a few levels of their management chain, they credited my account for $0.20, and then resubmitted the unlock request.  I was guaranteed that the phone would be unlocked.

Two days later, T-mobile denied the unlock request.

At this point, I was informed that the phone had to have T-mobile activity on it within 7 days of the unlock request.  This was simply not going to happen.  The phone was in Russia!  I complained, and I spent quite a few more hours on the phone with T-mobile.  It seems like the folks who control the unlocking process of their phones have nothing to do with the people who answer their customer service lines, nor those people's bosses, nor those people's bosses.  Astonishing!

At this point I had spent over a dozen hours trying to get T-mobile to unlock the darned thing.  T-mobile had got their money from me already, and had done pretty much everything possible to upset me.  The amount of money T-mobile spent dealing on the phone with me, trying to enforce a stupid policy, which wouldn't have been necessary if they had just admitted their mistake and fixed it (which would have cost them nothing whatsoever) is astonishing.  Talk about penny-wise and pound foolish.

At that time, T-mobile told me that I would be able to return the other phone, provided I got it back from Russia.  They agreed to this even though the usual 14 days "buyer's remorse" window had passed.

So at this point, I went and purchased a phone from Verizon (and paid a $50 premium because I was at BestBuy instead of the Apple store and the Verizon store wouldn't sell the phone unless it was part of a plan), and I sent it to Russia with my step-son, who was going their for Christmas.  That phone did work, and my step-son exchanged the T-mobile phone for it, bringing the T-mobile phone back.

The next day, I went to return the phone at the store where I bought it.  Sadly, Tiffany was there.  So was her manager, Erica.

After spending about an hour at the store trying to return it, they agreed to take it back, minus a $50 restocking fee.  And as I paid cash, I had to provide my checking account information so they could do a deposit for me, which would take a few days.  I got a paper showing that they were sending me a refund, but nothing indicating the account number.  (Turns out it took a few calls from Erica -- who claimed to be the store manager and probably was about 10 minutes older than Tiffany -- the next day, since she apparently had no clue what she was doing and needed to get two separate pieces of information that she failed to collect while I was in the story.)

I did spend a bunch of time with customer service on the phone -- a few more hours I guess, trying to get that last $50.  At this point it was just a matter of principle.  I resented the whole thing, and I wanted as much of my money back as possible.  The customer service person tried to make it right, but because the store (Hit Mobile is the company apparently) is separate from T-mobile, the store's decision was final.  The store manager (Erica) refused to refund the stocking fee, and there was nothing T-mobile could do about it.  To their credit, they did offer me a $50 service credit if I was inclined to keep an account with them.  Needless to say, I have no interest in ever being a T-mobile customer, so I thanked them and declined... indicating that I'd prefer to have funds back in cash.  (Nobody ever mentioned the restocking fee; on the contrary I was told I'd receive the full purchase price less the $60 service plan and SIM card.  Again, story and reality don't match at T-mobile.)

I did eventually get $650 credited back to my account.

So, I was out $110, plus the extra $50, plus the 13+ hours of my time, plus the embarrassment and long turnaround time to get the replacement out to Russia.  All because T-mobile wouldn't fix a mistake they made, even though numerous people at that company recognized that it was clearly the Right Thing To Do.

Turns out that Verizon iPhones are all unlocked.  And I'm already a Verizon customer.  I was thinking about T-mobile's plans -- some of them are attractive on paper, and potentially could have saved me money relative the rather premium prices I pay for Verizon.  Needless to say, I will not be changing my provider any time soon.  In spite of the high prices, I've always been dealt with honestly and fairly, and I've been happy with the service I've gotten at Verizon.

If for some reason, you decide to get a T-mobile phone, please, please avoid the store located on San Marcos Blvd.  In fact, I urge you to verify that the store is owned by T-mobile corporation.  Its my belief that I might have had more satisfactory results had I been dealing with just a single entity.  That said, numerous people at T-mobile corporate lied to me and made false promises.

I feel so strongly about this, that I'm happy to have spent the hour or so writing up my terrible experiences with them.  I hope that I save someone else these painful experiences.  I won't be unhappy if it also costs T-mobile many potential customers.

For the record, I also think it should be illegal to sell a carrier locked device without clearly indicating this is so as part of the transaction.  The practice of carrier locking devices makes sense when the cost of the device is being subsidized by the carrier as part of a service plan.  But it should be possible to purchase the device outright and remove any such lock.  T-mobile's practices here are a disservice to their customers, and their partners.  The handset manufacturers should apply pressure to them to get them to change their policy here.