Saturday, June 9, 2018

Self Publishing Lessons

Over the past several weeks I've learned far more than I ever wanted to about the self-publishing process.  I'm posting some of my findings here in the hopes that they may help others.

TL;DR


If you're going with eBooks, and you should, consider using an author website to sell it "early", and once your book is finished, publish it with Kindle Direct Publishing and Smashwords.  Keep the author website/store up even afterward, so you can maximize returns.  Price your eBook between $2.99 and $9.99.

If you're going to go with print, start with Amazon Kindle Direct Publishing, unless you only need a small run of books printed in the USA (in which case TheBookPatch.com looks good).  Once your book is really done, and you're ready to branch out and see it available internationally and from other bookstores, publish it with Ingram Spark.

Get and use your own ISBNs (from MyIdentifiers.com -- buy 10 at a time), and make sure you opt out of Kindle Select!

More details are below.

eBook Formats


Let's start with ebook formats.  Be aware that I'm writing this from California, with no "nexus" elsewhere, and electronic goods (when they are purely electronic) are not taxable here.  That means I haven't worried much about accounting for VAT or sales tax, because I don't have to.

Leanpub


Leanpub is how I started out, but they've altered their terms several times.  Right now their royalties are no longer substantially better than anyone else's (they used to be), and you can get an even higher return selling through your own store (such as Selz.com).  The one thing they have over everyone else is their Markua tooling and their focus on helping authors in the early stages -- you can "publish" on Leanpub before the book is complete and sell copies early.  I'm not sure how useful Markua is to other people -- I don't use it at all.  They pay 80% royalties (so they do take a 20% cut) and allow free updates (to a limit most authors are unlikely to hit).  You won't need an ISBN at Leanpub.  Frankly, their business model seems a bit iffy right now, and I wouldn't want to put too many eggs in that basket.  Leanpub has very limited reach, and doesn't distribute anywhere else.

Author Website


The most cost-effective way to sell your ebooks is to open your own author store.  Sites like Selz.com will let you do this for free, and only charge reasonable transaction fees.  With this approach you can keep about 95% of each sale.  You can publish as soon as you want, send updates as often as you want, and don't need ISBNs or anything like that.  On the downside, you have to do a little more work to set things up.  You'll also have limited reach, and pretty much look like a fly-by-night operation.  If you want to "pre-publish" before a work is complete, this is a way to do that without paying that 20% to Leanpub.  You can also leave this store open, and point to it from your personal author pages, even after you are working with the larger distributors.

Ingram Spark


Ingram Spark's ebook distribution service gets broad reach through relationships with the various outlets.  You can use them to get to Apple, Kobo, even Amazon.  And yet I would not recommend doing this.  First off, they charge to set up the book, typically $25.  Then if you want to make a revision to the book, it's another $25.  And then you're typically going to set it up so that you get only 45% of the royalties (or 40% if you really messed up and didn't opt out of the Amazon agreement).  Furthermore, I found that their conversion of my ePub to Kindle format was inferior, leading to a poor reading experience on those devices.  (I have some complex layout and custom fonts, as the book is technical in nature.)  I had much better luck generating my own .mobi file and working with Amazon directly.  Their support also takes forever to get back to you -- I'm still waiting while I try to remove my eBook from their distribution.  In short, I would not use Ingram Spark for eBooks.  You will also need an ISBN if you use Ingram Spark.

Amazon (Kindle Direct Publishing)


Using Kindle Direct Publishing was pretty easy, and it let me provide a .mobi file that was optimized for Kindle, addressing the issues caused by converting from ePub.  (To be fair, most authors won't have these problems.)  If you want to reach Kindle readers (and you do!), you should just set up a KDP account.  One word of warning though -- don't opt in to Kindle Select!!  Amazon is great for distributing to Amazon customers, but you don't want to give away your exclusivity.  There is a weird set of rules about royalties with KDP though.  If you want their best rate of 70% (which isn't available in all markets, but is in the main ones), you need to set your List Price between $2.99 and $9.99, inclusive.  (Other thresholds apply for other currencies.)  Deducted from your 70% rate is the cost to deliver the file to the user, which turns out to be pretty cheap -- less than a dollar typically.  (But if you're only selling a $2.99 book, make sure you keep the file size down, or this will hurt your returns.)  You can opt for a flat 35% royalty instead, which might make sense if your book is heavy on content (a large file), and it's required if your book is priced outside those points.  (This is why you never see ebooks listed for $11.99 or something like that on Amazon.)
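To make the delivery-fee math concrete, here's a rough back-of-the-envelope sketch.  The per-megabyte fee below is just an illustrative figure I'm plugging in, and I'm using the "70% of (list price minus delivery cost)" formula as I understand it -- check Amazon's current published rates for your marketplace.

#include <stdio.h>

/* Rough sketch of the KDP 70% royalty math.  The per-MB delivery fee
 * here is illustrative only; actual fees vary by marketplace. */
int main(void)
{
    double list_price = 2.99;  /* must be $2.99 - $9.99 for the 70% rate */
    double file_mb    = 5.0;   /* size of the delivered file, in MB */
    double fee_per_mb = 0.15;  /* assumed delivery fee per megabyte */

    double delivery = file_mb * fee_per_mb;
    double royalty  = 0.70 * (list_price - delivery);

    printf("delivery fee $%.2f, royalty per sale $%.2f\n", delivery, royalty);
    return 0;
}

With those assumed numbers, a 5 MB file shaves roughly a quarter off the royalty on a $2.99 book, which is why file size matters so much at the bottom of the price range.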

Smashwords


I just set up my account with Smashwords, and I'm thrilled so far.  It looks like you'll get about 80% royalties through their own store, and 60% if your book is bought through one of their partners -- which includes just about everyone: Apple, Google Play, Kobo, etc.  This gets you pretty much everywhere except Amazon.  But you did set up a KDP account already, right?  They take their cut, pay out your royalty, and you're done.  There is one fairly severe drawback to Smashwords -- they want you to upload your manuscript as a specially formatted Word document.  (They do accept a direct ePub upload though, which you can use if you want.  I did this because I don't have a Word version of my book, and it would be difficult to produce one -- it was authored in Asciidoctor.)  You will need an ISBN to get into their expanded distribution program, of course.  They will offer to sell you one, but I recommend you not do that and use your own.  (Especially if you're uploading your own ePub.)

Direct Retailer Accounts


You can maximize royalties by setting up direct accounts with companies like Apple, Kobo, and Barnes & Noble.  In my experience, it just isn't worth it.  Dealing with all of these is a headache, and it takes forever.  Some, like the Google Play Store, are almost impossible to get into.  As the list grows, the share of your sales that each additional store represents diminishes, so consider whether that extra 10% royalty is worth the headache.  Some of these will need ISBNs, and the pricing and royalties will all vary, of course.

Printed Books


If you've spent a lot of time making a great book, you probably want to see it in print, right?  Nothing is quite the same for an author as being asked to sign a professionally bound physical copy of their own work.  Note that it takes some extra effort to set up a book for print -- you'll need a press-ready PDF (I had to purchase a copy of Adobe Acrobat DC so that I could properly preflight my files), and setting up the cover can be a challenge if you're not a designer.

Note that details such as print margins, paper weight, and hence cover sizes can vary between printers.  Be prepared to spend a lot of time if you decide to go down this road -- and to repeat that effort for each printer you use.

TheBookPatch.com


After doing some research, I decided to give these guys a shot at printing my first version.  I was really impressed with the quality -- while the first printing of my book had a number of issues, none of them were the fault of TheBookPatch; they were all on me.  The problem with these guys is that they are tiny.  Almost nobody has ever heard of them, and you won't be getting your book listed at places like Barnes & Noble.  Additionally, they are rather expensive, particularly if you want to send books overseas.  At one point I wanted to send one copy of my book to the Netherlands; the shipping cost was going to be about $80.  Needless to say, my relationship with TheBookPatch came to an abrupt end.  (I'd still recommend giving these guys a shot if you're printing books for your own use here in the USA.)  One big advantage is that I was able to put together an attractive cover using their website's cover designer, with no special skills.  You also don't need an ISBN to print through TheBookPatch.com.

Ingram Spark


Ingram Spark has the best rates internationally, and is reputed to have excellent print quality.  My book is available from them.  They charge $49 to set up a title, and $25 for each revision.  This is super annoying, so I wouldn't publish with them until and unless you know you're ready and need international distribution, or want to see your printed book available via Barnes & Noble or other retailers.  They're also slow.  I ordered 3 copies of my book a week ago, and they only confirmed today that they are shipping.  If you're serious about selling printed books widely, I would definitely go with them; but unless you anticipate the volume, I'd hold off.  You will need an ISBN as well.  With Ingram Spark, you set up your royalty rate, which is usually 45% of net.  Typically this means you'll get something like a 20-25% actual royalty, depending on the book.

Amazon KDP


Amazon's print-on-demand service is now available to authors directly through KDP.  After setting up the layout, and doing the work to ensure the quality is good -- which can take some effort -- it's pretty easy.  Amazon will sell you an ISBN if you want one -- I'm not sure whether they are required for print books or not.  (I already had one from my Ingram Spark journey.)  Amazon pays a much better royalty of 60% of net, and their printing costs for small runs seem fairly inexpensive, as is shipping.  For example, my 430 page, 2 lb (7.5"x9.25" paperback) book cost about $6 to print, and about $10 to ship.  That means that with my list price of $49.95, I can expect to receive about $20 per copy.  Amazon will also cut into their own margins to discount the book, to optimize the price.  Having said all that, I'm still waiting for my proof, which Amazon apologized for taking an extra day or two to print -- I should be getting it in a couple of days (I opted for the cheap shipping -- you can't use your Prime account to ship author proofs, which are made available to you at cost).  Their paper is thicker than Ingram's, so I had to redesign the cover, and their margins are stricter (my page numbers fell outside their required 0.5" margins), so I wound up having to redo the whole layout.  It would have been better if I had started with Amazon first.

There are other print-on-demand players, but I've heard enough complaints about print quality when using them that I just avoided them.  After all, if you're bothering to put your book into print, you want the results to reflect all the effort you put into it.

Monday, June 4, 2018

Altering the deal... again....

(No, this is not about GitHub or Microsoft... lol.)

Back in March (just a few months ago), I signed up on Leanpub to publish the NNG Reference Manual.  I was completely in the dark about how to go about self-publishing a book, and a community member pointed me at Leanpub.

Leanpub charged $99 to set up, back in March, and offered a 90% (minus 50 cents) royalty rate.  On top of that, they let me set the price anywhere from free, or from $0.99 up to $99, and buyers could choose what to pay within that range.  This looked great, although I was a bit hesitant to spend the $99, since there was no way to try their platform out.

Note that at this time I was not interested (and am still not interested) in their authoring tools based on Markua.  I had excellent tooling already in Asciidoctor, plus a bunch of home-grown tools (that I've since further expanded upon) to mark up and lay out the book, plus previewing, etc.

Everything was great, and I made sales ranging from $0.99 to $20.  Not a lot of sales, but enough to nearly recoup my $99 investment.  Now, I wasn't looking at this as a money making venture, but as a way to help support my work around NNG -- having a professionally produced reference manual was something I considered an important step for NNG.

Shortly after I created the book and published it, Leanpub changed the minimum price that buyers could pay to $4.99.  We're talking about a digital good here.  First time the deal was altered....

Then in April, they introduced a new SaaS pricing model, where I could have ditched the $99 fee.  So I'm feeling like a chump, but hey at least I have that 90% royalty rate, right?  (By this time I'd sold enough to cover that initial $99 outlay, thanks to generous supporters from the NNG community.)  Deal altered again.

Then they introduced a freemium model in May, where I really could have skipped that $99 outlay.  But they told me that I was grandfathered and could keep my 90% rate, so at least I was getting something for that $99 I spent originally.  Deal altered a third time?

Now, they've told me that they've changed their mind, and no, they aren't going to let me keep that grandfathered rate.  Deal altered again?!?

They posted a long essay explaining why they "had" to do this.  I get it, their old business model wasn't working.  But in the past 3 months they've made not one, not two, but three changes to their pricing and business model.  They've made promises, and gone back on their word.

But it's ok, because at 80% I'm making more than with Amazon, right?  Well, no, not really.  I won't repeat the calculations here, but it turns out that I would have made slightly more money with Amazon.  Now, that's partly due to the fact that my sales have been quite slow (as they were predicted to be -- this is a really niche book, a reference manual for a product that isn't even 1.0 yet).

The thing is, I'm slightly irked about the loss of income, but I'm much more angry about the lack of respect they've given us, their authors and customers.  Clearly, their promises don't carry much weight.  They've offered lifetime free Pro accounts to customers who were with them long enough to have at least $500 in royalties, but everyone else is out of luck.  As to those lifetime pro accounts -- well, it's "lifetime, or until we change our mind".   Which seems to occur about once a month.

Now, Leanpub isn't some big bad company, but the attitude and thinking reflected in how they've handled this process show clear alignment with the same thought processes those big bad companies have.  As an author, you're not a valued partner to them -- you're a source of revenue that requires very little effort on their part to support.

Obviously, I've started rethinking my use of Leanpub.

It seems like I can make use of Selz, which has really good support for selling digital goods like eBooks (it even has a Pay What You Want option!) and, with my small number of digital goods, will only charge me the transaction processing costs -- either 2.9% or 3.9% depending on location.  (Digital goods are not taxable in California.)  So what was I gaining from Leanpub again?

For Kindle and iBooks, dealing with Amazon and Apple directly also looks like a better deal than Leanpub.  You get their expanded distribution, and yes, you only get 70% royalties, but you don't have to pay any recurring fees.  Unless you're doing large volumes, the math works out better than any of the Leanpub paid plans.

(IngramSpark, where I have also posted the book, also works, but I've had less than satisfactory results with their ePub-to-mobi conversion, so I can't recommend using them for Kindle at least, and I think the royalties you get from dealing directly with Apple are superior anyway.)

This all seems like a lot of work, but I hope this helps other authors who might be considering using Leanpub.

(There is one feature which is nice on Leanpub, which is the ability to publish an incomplete work in progress, and then keep updating it.  But let's face it, you can do that equally well from your own website and something like Selz.)

Not Abandoning GitHub *yet*

The developer crowds are swarming off of GitHub in the wake of today's announcement that Microsoft has agreed to purchase GH for $7.5B.

I've already written why I think this acquisition is good for neither GitHub nor Microsoft.  I don't think it's good for anyone else either... but maybe at least it alerts us all to the dangers of having all our eggs in the same basket.

At the moment my repositories will not be moving.  The reason for this is quite simple -- while the masses race off of GitHub, desperate for another safe harbor, the panic this has created is overwhelming the alternative providers.  GitLab reported 10x growth.  While this might be good for GitLab, it's not good for people already on GitLab, as there was already a well-understood performance concern around GitLab.com.

At least in the short term, GitHub's load will decrease (at least once all the code repo exports are done), I think. 

The other thing is that Microsoft has come out and made some pretty strong promises about not altering the GitHub premise, and the "new leadership" over there is ostensibly quite different from the old.  (Having said that, there is a lot of bad blood and history between FOSS and Microsoft.  A lot of the current generation of millennials don't have that history, but some of us haven't forgotten when Steve Ballmer famously said "Linux is a cancer", and when Microsoft used every dirty trick in the book to try to kill all competitors, including open source software.  If Microsoft had had its way back in the 90s and 00s, the Internet would have been a company shanty-town, and Linus Torvalds would have been a refugee outlaw.)

Thankfully that didn't happen.

Microsoft is trying to clean up its image, and maybe it is reformed now, but the thing we all have to remember is that Microsoft is beholden first, foremost, and exclusively to its shareholders.  Rehabilitating its image is critical to business success today, but at its roots Microsoft still has those same obligations.

The past couple of years of good behavior doesn't undo decades of rottenness; many of us would have been thrilled to see Microsoft enter Chapter 11 as the just deserts for its prior actions.

Microsoft was losing mindshare to OSS and software like Git (and companies like GitHub). Purchasing GitHub is clearly an effort to become relevant again.   The real proof will be seen if Microsoft and GitHub are still as FOSS friendly in two years as they are today.  Promises made today are cheap.

But I'm willing to give them the benefit of the doubt, understanding that I retain the option to depart at any time.  I won't be creating *new* repositories there, and my private ones will be moving off of GitHub, because I don't want Microsoft to have access to my proprietary work.  (Probably they can still get it from backups at GitHub, but we do what we can...)

But my open source stuff is still there.  For now.

That means mangos, NNG, nanomsg, and tcell remain.  For now.

It's up to Microsoft and GitHub to see if they stay.

 - Garrett

Sunday, June 3, 2018

Microsoft Buying GitHub Would be Bad

So apparently Microsoft wants to buy GitHub.

This is a huge mistake for both companies, and would be tragic for pretty much everyone involved.

GitHub has become the open source hosting site for code, and for a number of companies it also hosts private repositories.  It's the place to be if you want your code to be found and used by other developers, and frankly, it's so much of a de facto standard for this purpose that many tools and services work better with GitHub.

GitHub was founded on the back of Git, which was invented by Linus Torvalds to solve source code management woes for the Linux kernel. (Previously the kernel used an excellent tool called BitKeeper for this job, but some missteps by the owners of BitKeeper drove the Linux team away from it.  It looks like GitHub is making similar, albeit different, commercial missteps.)

Microsoft already has their own product, Visual Studio Team Services, which competes with GitHub, but which frankly appeals mostly to Microsoft's own developer base.  I don't think it is widely used by Linux developers for example.


Implications for Open Source


Microsoft has been much more "open source friendly" of late, but I have to admit I still don't trust them.  I'm hardly alone in this.

It also is a breach of sorts of the unwritten trust that the open source community has placed in GitHub.  There is much bad blood between Microsoft and open source software.  Many of the most treasured open source systems exist directly in conflict with proprietary ones -- think of software like Samba, Wine, and OpenOffice, which were created as alternatives to Microsoft products.  If GitHub is acquired by Microsoft, these projects will feel compelled to abandon it.

As this happens, many tools and services tailored to GitHub (automated code review, CI/CD, etc.) are going to be rushing to find a way to support alternatives, as their client base runs screaming from GitHub.  (Back in February of 2016 I tried to leave GitHub because of philosophical differences of opinion with their leadership.  I abandoned the effort after discovering that too many of the external support services I used for these open source projects were either GitHub-only, or could only be converted away from GitHub with large amounts of additional effort and a big negative impact for my users.)

This is a watershed moment for GitHub.
I predict in as little as 6 months nobody will be creating new open source projects on GitHub.

Unfortunately, it's probably already too late for GitHub.  Unless they were to come out and immediately deny any acquisition attempts, and make some public announcements recognizing the trust they've been given, and asserting the importance of honoring it, nobody will trust them any more.


Implications for Commercial Use


This is also going to harm commercial customers, driving them away.

Microsoft has many commercial ventures which overlap with those of almost everyone doing anything in software.  GitHub being acquired by Microsoft will, in one fell swoop, make GitHub a direct competitor to vast swaths of its own customer base.  (Essentially, you're either a Microsoft competitor or a partner.  And often both.)

If you're using GitHub for private repositories, it probably is time to rethink that -- unless you trust Microsoft not to do evil.  (They've never even made any such promises.)  This also means, I think, that it might be time to reconsider hosting your private data with any third party.  GitLab and BitBucket look better, to be sure, but what's to prevent another large company from acquiring them?

It's time to reconsider the cost of hosting in the cloud.  I've been expecting a move back to on-premises storage and hosting for some time now, but this will only accelerate that.


Implications for Microsoft


Microsoft will spend quite a lot of money to acquire GitHub.  But instead of acquiring a goose that lays golden eggs, they are going to have one that needs to be fed and turns that into fecal material.

At the same time, while this may help bolster some of the technology in VSTS in the short term, the reality is that most of the best stuff isn't that hard to build, and most of what GitHub has can be done on any cloud based system with sufficient storage and compute.  Most of their tech is not tied to Windows, almost certainly.

The VSTS team will no doubt be impacted, and there will be a lot of pain and suffering in attempting to more tightly integrate VSTS with the newly adopted child.  I'm sure there are redundancies that will be eliminated, but I expect part of what is going to happen is a shift in focus from providing the best experience for Visual Studio developers and making things work well on Azure, to figuring out how to more tightly integrate GitHub's toolset into theirs.  Can you imagine trying to reconcile the differences between VSTS and GitHub's issue tracking systems?  Yikes!

The uncertainty will annoy customers, and I suspect it will drive them away from the existing VSTS stack.  When they leave, they probably won't be moving to GitHub.

Like the proverbial dog on the bridge that drops its bone to snatch at the reflection in the water, Microsoft's greed will leave it with none (at least in this space).

I'm sure that the founders and investors of GitHub will make a mint taking Microsoft's money.  Normally I'd applaud anyone with plans to part Microsoft from some of its funds.  But this move is just plain bad business.


Anti-Trust Violations?


As I mentioned above, Microsoft has their own product, Visual Studio Team Services, which competes with GitHub.  This alleged acquisition of GitHub seems to me to fly in the face of anti-trust rules.  Microsoft clearly has been trying to make inroads into the open source community with projects like Visual Studio Code and Linux support for VSTS, so I would hope that the regulatory bodies involved would examine this with great scrutiny.

Of course, if GitHub is for sale, many of the same concerns (aside from the antitrust ones) would apply to any large buyer.  It would be a Bad Thing (tm) if GitHub were to be acquired by Facebook, Google, or Amazon, for example, for most of the same reasons that being acquired by Microsoft would be bad.

Now please pardon me while I go back to setting up gogs on my own systems...

Tuesday, May 22, 2018

No, Nanomsg is NOT dead

There seems to be some pretty misleading information out on the Internet indicating that "nanomsg is dead".  The main culprit here is a "postmortem" by Drew Crawford.  Unfortunately, comments are apparently not working on that post, according to Drew himself.

The thing is this (apologies to Samuel Clemens):  "reports of the death of nanomsg have been greatly exaggerated".

So it's time to set the record straight.

I've been working hard on nanomsg, and the Scalability Protocols that are intrinsic to it, for quite some time.  It has been my full-time paid job for approximately the past year, and I had been working on this stuff part time for longer than that.

The main focus during this time has been a complete rewrite of the core library, known as NNG.  NNG, or nanomsg-next-gen, aims to be a far superior version of nanomsg, with significant new capabilities, greatly improved reliability, scalability, extensibility, and maintainability.  It is wire compatible with legacy nanomsg and mangos, and retains a backwards compatible API (though it also offers a newer API which should be quite a lot easier to use).
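To give a flavor of the newer API, here is a minimal sketch of a trivial request client (error handling trimmed, and the URL is obviously just a placeholder; the reference manual has the authoritative details):

#include <stdio.h>
#include <nng/nng.h>
#include <nng/protocol/reqrep0/req.h>

int main(void)
{
    nng_socket sock;
    char       msg[] = "ping";
    char *     reply = NULL;
    size_t     len;
    int        rv;

    /* Open a REQ socket, dial a peer, send a request, wait for the reply. */
    if ((rv = nng_req0_open(&sock)) != 0) {
        fprintf(stderr, "open: %s\n", nng_strerror(rv));
        return 1;
    }
    if (((rv = nng_dial(sock, "tcp://127.0.0.1:5555", NULL, 0)) != 0) ||
        ((rv = nng_send(sock, msg, sizeof(msg), 0)) != 0) ||
        ((rv = nng_recv(sock, &reply, &len, NNG_FLAG_ALLOC)) != 0)) {
        fprintf(stderr, "error: %s\n", nng_strerror(rv));
        nng_close(sock);
        return 1;
    }

    printf("got: %s\n", reply);
    nng_free(reply, len);   /* NNG_FLAG_ALLOC: we own the buffer and must free it */
    nng_close(sock);
    return 0;
}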

During all this time, I've continued to act as the maintainer for nanomsg, although at this point I'd say that nanomsg itself is in sustaining mode, as I'm very focused on having NNG stand in as a full replacement for nanomsg.

We've also published the first NNG book, which is really just the reference manual.  There are over 400 pages (actually about 650 in the 7.5"x9.25" printed edition, which I've not put up yet) of detailed API documentation available.  (Let me know if you're interested in the print edition -- it costs me about $35 to produce, but I'm willing to make it available for folks that are willing to pay for it.  Admittedly the electronic version is probably a lot more useful since it has working hyperlinks and supports searching.)  Oh, and by the way, the book also covers the legacy API used with legacy libnanomsg.

I'm working on the second NNG book, which will be much more of a "how-to", backed up with case studies and code.  (This will take some time to come to market.  At the moment these books are a secondary effort, since the time spent on them is time spent away from working on the code itself or on related commercial activities.)

There have been more contributors to NNG of late, and interest is picking up as NNG itself is on final approach to FCS.  (The first beta release, 1.0.0-beta.1, was released last week.  I expect to release a second beta today, and then the final release will probably come a week or so later, depending upon beta test results of course.)

The work I've done for NNG has also inspired me to make further improvements to mangos.  Over the course of the next few months you can expect to see further harmonization between these two projects as NNG gains support for the STAR protocol from mangos, and mangos gains some new capabilities (such as optional separable contexts to enable much easier development of concurrent applications.)

So, if you've heard that "nanomsg is dead", now you know better.  In fact, I'd venture to say that the project is healthier and more alive than it ever was.

Furthermore, in many respects the new NNG project is far more robust, scalable, and stable than I believe nanomsg or ZeroMQ have ever been.  (This is because NNG has been designed with a serious eye towards production readiness from the first line of code.  Every error case is carefully considered.)

If you haven't looked at any of this stuff lately, give it another look!


Tuesday, January 23, 2018

Why I'm Boycotting Crypto Currencies

Unless you've been living under a rock somewhere, you probably have heard about the cryptocurrency called "Bitcoin".  Lately it has skyrocketed in "value", and a number of other currencies based on similar mathematics have also arisen.  Collectively, these are termed cryptocurrencies.

The idea behind them is fairly ingenious, based upon the notion that by requiring the solution of "hard" problems (in the mathematical sense), the currency can limit how many "coins" are introduced into the economy.  Both the math and the social experiment behind them look really interesting on paper.

The trouble is that the explosion in value has created a number of problems, and as a result I won't be accepting any of these currencies for the foreseeable future.

First, the market for each of these currencies is controlled by a relatively small number of individuals who own a majority of the outstanding "coins".  Through collusion, these individuals can generate "fake" transactions, which appear to drive up demand for the coins, and thus lead to a higher "value" (in terms of what people might be willing to pay).  This is a bubble, and the bottom will fall right out if enough people try to sell their coins for hard currency.  As a result, I believe that the value of the coins is completely artificial, and while a few people might convert some of these coins into hard cash for a nice profit, the majority of coin holders are going to be left out in the cold.

Second, the "cost" of performing transactions for some of these currencies is becoming prohibitively expensive.  With most transactions of real currency, its just a matter of giving someone paper currency, or running an electronic transaction that normally completes in milliseconds.  Because of the math associated with cryptocurrencies, the work to sign block chains becomes prohibitive, such that for some currencies transactions can take a lot of time -- and processors are now nearly obliged to charge what would be extortionary rates just to cover their own costs (in terms of electricity and processing power used).

The environmental impact, and the monumental waste, caused by cryptocurrencies cannot be overstated.  We now have huge farms of machines running, consuming vast amounts of power, performing no useful work except to "mine" coins.  As time goes on, the amount of work needed to mine each coin grows significantly (an intentional aspect of the design), which means we are burning large amounts of power (much of it fossil-fuel generated!) to perform work that has no useful practical purpose.  Some might say something similar about mining precious metals or gems, but there are many real practical applications for metals like gold, silver, and platinum, and for gems like diamonds and rubies as well.

Finally, as anyone who wants to build a new PC has probably noticed, the cost of computing hardware, and specifically GPUs (graphics processing units, which can also be used to solve many numerical problems in parallel), has increased dramatically -- consumer-grade GPUs are generally only available today at about 2x-3x their MSRPs.  This is because the "miners" of cryptocurrencies have snapped up every available GPU.  The upshot is that this hardware has become prohibitively expensive for hobbyists and professionals alike.  Indeed, much of this hardware would be far better used in HPC arenas, where it could be applied to real-world problems -- genomic research towards finding a cure for cancer, protein folding, or any number of other interesting and useful problems whose solutions would benefit mankind as a whole.  It would not surprise me if a number of new HPC projects have been canceled or put on hold simply because the supply of suitable GPU hardware has been exhausted, putting those projects out of budget reach.

Eventually, when the bottom does fall out of these cryptocurrencies, all that GPU hardware will probably wind up filling landfills, as many people won't want to buy used GPUs, which may (or may not) have had their lifespans shortened.  (One hopes that at least this eWaste will be recycled, but we know that much eWaste winds up in landfills in third-world countries.)

Cryptocurrency mining is probably one of the most self-serving and irresponsible (to humanity and our environment) activities one can undertake today while still staying within the confines of the law (except in a few jurisdictions which have sensibly outlawed cryptocurrencies).

It's my firm belief that the world would be far better off if crypto-currencies had never been invented.

Wednesday, November 22, 2017

Small Business Accounting Software Woes

I'm so disappointed with the online accounting software options available to me, and I've spent far too much time over the past couple of days looking for an accounting solution for my new business.  The current state of affairs makes me wonder if just using a spreadsheet might be as easy.

I am posting my experiences here for two reasons.
  1. To inform others who might have similar needs, and
  2. To inform the hopefully smart people at these companies, so maybe they will improve their products.
Let me start with a brief summary of my needs:

  • Track time (esp. billable hours)
  • Tracked time should include date, and project/client, and some description of work performed.
  • Multiple currency support. I have international clients that I need to bill in their preferred currency.
  • Invoicing and payment tracking for above.
  • Payroll -- preferably integrated with someone like Gusto.
  • Support for two employees with plans to grow. 
  • Double-entry accounting (including bank reconciliation) for my accountant.
  • Affordable -- I'm a small business owner.
That's it. Nothing super difficult, right?  You'd think there would be dozens of contenders who could help me.

You'd be wrong.

Here's what I looked at, and their deficiencies:

Freshbooks 



I really like most of what Freshbooks has to offer, and this was my starting point.  Super easy to use, an integration with Gusto, and their invoicing solution is super elegant.  Unfortunately, their lack of reconciliation and double-entry accounting (or any of the other "real" accounting stuff) disqualifies them.  Adding to the problem, I already use them for my personal consulting business (where I've been a happy user), and they don't support multiple businesses on their "Classic Edition".

Then there is the whole confusion between "New Freshbooks" and "Classic Freshbooks".

This is a company that states they intend to keep two separate software stacks, with about 90% functionality overlap, running ~forever.  Why?  Because the classic edition has some features (and some integrations) that the new one lacks.  (I've been informed that my usage patterns indicate I should stay on the "Classic" edition forever because of my heavy use of time tracking.)  Some of us with real-world software engineering experience know how costly and hateful it is to have multiple simultaneous versions of a product in production.  Freshbooks' approach here, with no plans to merge the functionality, is about the most boneheaded decision I've seen engineering management take.

Being stuck on the "Classic Edition" makes me feel like a loser, but really it's a sign that their own product is the loser.  I have to believe at some point one product or the other is going to be a dead end.

Quickbooks Online


This is a product that is well recommended, and probably one of the most widely used. It has so much capability. It also lacks the "hacked together by a bunch of different engineering teams that didn't talk to each other" feeling that their desktop product has. (Yes, I have experience with Quickbooks Pro, too. Sad to say.)  It's probably a good thing I can't look at their code behind the curtain.

The biggest, maybe even only, failing they have for my use case is their inability to bill clients in a different currency.  Wait, they're multicurrency capable, right?  Uh, no, they aren't.  If I can't record my billable hours against a client in another country in their preferred currency, then whatever you think your "multicurrency" support is, it doesn't count.  I have international clients that demand billing in their local currency, so this is a non-starter for me.  This feature has been asked for before, and they have ignored it.  A major, and honestly unexpected, fail.

Cost-wise they aren't the cheapest, but this one missing feature is a showstopper for me; otherwise I'd probably have settled here.

Xero


Xero is another of the main companies, and sits in Gartner's magic quadrant as the leader in the sector.  I didn't actually try them out, though I did do some research.  Their shortcomings for me were price (multi-currency support requires me to pay $70/month, which is about 2x all the others) and the lack of time tracking.  Sure, I can add an integration from some other company like TSheets, for another $20/month.  But now this solution is about 3x the cost of everyone else.

One feature that Xero includes for that $70 is payroll processing -- but only for a handful of states (California is one), and I can't seem to find any reviews from folks who have used it.  If I want to use an outside company with a longer track record and broader coverage across states, like SurePayroll or Gusto or ADP, I will wind up paying double.

If Xero would change their menu somewhat (make it à la carte), we'd be able to work together: let me integrate with Gusto, and not have to pay exorbitant fees for multi-currency support.  Add time tracking and it would be even better.

Arguably I could stop being such a penny pincher and just go with Xero + TSheets or some such.  But outside of the crazy-expensive options for companies that can afford a full-time accountant (Sage, NetSuite, looking at you!), this was the most expensive option.  I'd also have to use Xero's payroll service, and I'm not sure I want to, given how little track record it has.

ZipBooks


At first blush, ZipBooks looked like a great option.  On paper they have everything I need -- they even partnered with Gusto, and claim to have multicurrency support.  Amazingly, they are even free.  Of course, if you elect to use some of their add-ons, you pay a modest fee, but from a pure price perspective, this looks like the cheapest.

Unfortunately, as I played with their system, I found a few major issues.  Their multi-currency support is a bit of an inconvenient joke.  They don't let you set a per-client currency.  Instead, you change the currency for the entire account, then generate invoices in that currency (or accept payments), then have to switch back to the home currency.  This is account-wide, so you'd better not have more than one person accessing the account at a time.  The whole setup feels really hinky, and to be honest I just don't trust it.

Second, their bank integration is (as of today) broken -- meaning the website gives me conflict errors before I can even select a bank (I wanted to see if my business bank -- a smaller regional bank -- is on their list).  So, not very reliable.

Finally, their support is nearly non-existent.  I sent several questions to them through their online support channel, and got back a message saying "ZipBooks usually responds in a day".  A day.  Other companies I looked at took maybe 10-20 minutes to respond -- and I still have not received a response from ZipBooks.

I need a service that supports real multicurrency invoicing, is reliable, and with reachable support. Three strikes for ZipBooks.  Damn, I really wanted to like these guys.

Kashoo


Kashoo was well reviewed, but I had some problems with them.  First, their only payroll integration is with SurePayroll.  I hate being locked in, although I could probably overlook this.  Second, they don't have any time tracking support.  Instead, they partner with Freshbooks -- but only the "Classic Edition" (and apparently with no plans to support the "New Freshbooks").  A red flag.

And that brings in the Freshbooks liability (only one company per account, so I can't have both my old consulting business and this new one on the same iOS device, for example), and I'd have to pay for the Freshbooks service too.

On the plus side, the Kashoo tech support (or pre-sales support?) was quite responsive.  I don't think they are far off the mark.

Wave Accounting 


Wave is another free option, but they offer payroll (full service in only five states) as an add-on.  (They also make money on payment processing, if you use that.)  Unfortunately, they lack support for integrations, time tracking, and multi-currency.  I'd like to say close but no cigar, but really in this case it's just "no cigar".  (I guess you get what you pay for...)

Zoho Books


Zoho Books is another strong option, and well regarded.  So far, it seems to have everything I need except any kind of payroll support.  I'd really love it if they would integrate with Gusto.  I was afraid that I would need to set up Zoho Projects and pay another service fee, but it looks -- at least so far from my trial -- like this won't be necessary.

So my feature request is for integration with Gusto.  In the meantime, I'll probably just handle payroll expenses by manually copying the data from Gusto.

Conclusion


So many, so close, and yet nothing actually hits the mark.  (These aren't all the options I looked at, but they are the main contenders.  Some weren't offered in the US, or were too expensive, or were self-hosted.)  For now I'm going to try Zoho.  I will try to update this in a few months when I have more experience.

Updates: (As of Nov. 30, 2017) 


  1. Zoho has since introduced Zoho Payroll, and they contacted me about it.  It's only available for California at this time, and has some restrictions.  I personally don't want to be an early adopter for my payroll processing service, so I'm going to stick with Gusto for now.   Zoho's representative did tell me that they welcome other payroll processing companies to develop integrations for Zoho Books.   I hope Gusto will take notice.
  2. ZipBooks also contacted me.  They apologized for the delays in getting back to me -- apparently their staff left early for Thanksgiving weekend.  They indicated that they have fixed whatever bug caused me to be unable to link my bank account.  Their COO also contacted me, and we had a long phone call, mostly to discuss my thoughts and needs around multi-currency support.  I'm not quite ready to switch to them, but I'd keep a close eye on them.  They do need to work to improve their initial customer service experience, in my opinion.
  3. It looks like my own multi-currency needs may be vanishing, as my primary external customer has agreed to be billed in USD and to pay me in USD.  That said, I want to keep the option open for the future, as I may have other international customers in the future.
  4. None of the other vendors reached out to me, even though I linked to them on Twitter.  The lack of response itself is "significant" in terms of customer service, IMO. 

Tuesday, November 14, 2017

TLS close-notify .... what were they thinking?

Close-Notify Idiocy?


TLS (and presumably SSL) requires that implementations send a special disconnect message, "close-notify", when closing a connection.  The precise language (from TLS v1.2) reads:

The client and the server must share knowledge that the connection is ending in order to avoid a truncation attack.  Either party may initiate the exchange of closing messages.

close_notify
This message notifies the recipient that the sender will not send any more messages on this connection.  Note that as of TLS 1.1, failure to properly close a connection no longer requires that a session not be resumed.  This is a change from TLS 1.0 to conform with widespread implementation practice.

Either party may initiate a close by sending a close_notify alert.  Any data received after a closure alert is ignored.

Unless some other fatal alert has been transmitted, each party is required to send a close_notify alert before closing the write side of the connection.  The other party MUST respond with a close_notify alert of its own and close down the connection immediately, discarding any pending writes.  It is not required for the initiator of the close to wait for the responding close_notify alert before closing the read side of the connection.

This has to be one of the stupider designs I've seen.

The stated reason for this is to prevent a "truncation attack", where an attacker terminates the session by sending a clear-text disconnect (TCP FIN) message, presumably just before you log out of some sensitive service, say GMail.

The stupid thing here is that this is for web apps that want to send a logout and don't want to wait for confirmation that the logout occurred before telling the user it succeeded.  So this logout is unlike every other RPC.  What...?!?

Practical Exploit?


It's not even clear how one would use this attack to compromise a system... an attacker won't be able to hijack the actual TLS session unless they already pwned your encryption.  (In which case, game over, no need for truncation attacks.)  The idea in the truncation attack is that one side (the server?) still thinks the connection is alive, while the other (the browser?) thinks it is closed.  I guess this could be used to cause extra resource leaks on the server... but that's what keep-alives are for, right?

Bugs Everywhere


Of course, close-notify is the source of many bugs (pretty much none of them security critical) in TLS implementations.  Go ahead, Google it... I'll wait...  Java, Microsoft, and many others have struggled to implement this part of the RFC.

Even the TLS v1.1 authors recognized that "widespread implementation practice" is simply to ignore this part of the specification and close the TCP channel.

So you may be asking yourself: why don't implementations send the close-notify?  After all, sending a single message seems pretty straightforward, right?

Semantic Overreach


Well, the thing is that on many occasions, the application is closing down.  Historically, operating systems would just close() their file descriptors on exit().  Even for long-running applications, the quick way to abort a connection is... close().  With no notification.  Application developers expect that close() is a non-blocking operation on network connections (and most everywhere else) [1].

Guess what: you now cannot exit your application without sending this message -- not without breaking the RFC, anyway.  That's right, this RFC changes the semantics of exit(2).  Whoa.

That's a little presumptive, dontcha think?

Requiring implementations to send this message means that close() now grows some kind of new semantic, where the application has to stop and wait for the message to be delivered -- which means TCP has to be flowing and healthy.  The only other RFC-compliant behavior is to block and wait for it to flow.

What happens if the other side is stuck, and doesn't read, leading to a TCP flow control condition?  You can't send the message, because the kernel TCP code won't accept it -- write() would block, and if you're in a non-blocking or event driven model, the event will simply never occur.  Your close() now blocks forever.

Defensively, you must insert a timeout somehow -- in violation of the RFC -- otherwise your TCP session could block forever.  And now you have to decide how long to hold the channel open.  You've already decided (for whatever other reason) to abort the session, but now you have to wait a while... how long is too long?  And meanwhile this open TCP connection sits around consuming buffer space, an open file descriptor, and perhaps other resources....
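For what it's worth, the defensive approach ends up looking something like the following sketch (OpenSSL used purely for illustration, and assuming the underlying socket is non-blocking): attempt the close_notify, give the peer a bounded window to accept it, then tear the connection down regardless.

#include <openssl/ssl.h>
#include <poll.h>
#include <unistd.h>

/* Best-effort close_notify with a hard deadline.  This knowingly violates
 * a strict reading of the RFC: if the peer is wedged (flow controlled),
 * we give up after timeout_ms and close the TCP socket anyway. */
static void tls_close_bounded(SSL *ssl, int fd, int timeout_ms)
{
    int rv = SSL_shutdown(ssl);            /* try to send close_notify */

    if (rv < 0 && SSL_get_error(ssl, rv) == SSL_ERROR_WANT_WRITE) {
        struct pollfd pfd = { .fd = fd, .events = POLLOUT, .revents = 0 };
        if (poll(&pfd, 1, timeout_ms) > 0) {
            (void) SSL_shutdown(ssl);      /* one more try, then give up */
        }
    }

    /* Do not wait around for the peer's responding close_notify. */
    SSL_free(ssl);
    close(fd);
}

Even this much is more ceremony than a plain close() -- which is rather the point.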

A Bit of Sanity


The sensible course of action, treating a connection abort for any reason as an implicit close notification, was simply "not considered" from what I can tell.

In my own application protocols, when using TLS, I may violate this RFC with prejudice.  But then I also am not doing stupid things in the protocol like TCP connection reuse.  If you close the connection, all application state associated with that connection goes away.  Period.  Kind of... logical, right?

Standards bodies be damned.

[1] The exception here is historical tape devices, which might actually perform operations like rewinding the tape automatically upon close(). I think this semantic is probably lost in the mists of time for most of us.

Wednesday, November 8, 2017

CMake ExternalProject_Add In Libraries

First off, I'm a developer of open source application libraries, some of which are fairly popular.

TL;DR: Library developers should not use ExternalProject_Add, but should instead rely on find_package(), requiring their downstream developers to pre-install the dependencies.

I recently decided to try to add TLS v1.2 support to one of my messaging libraries, which is written in C and configured via CMake.



The best way for me to do this -- so I thought -- would be to add a dependency in my project as a subproject, bringing in a third-party (also open source) library -- Mbed TLS.

Now, the Mbed TLS project is also configured by CMake, so you'd think it would be relatively straightforward to include their work in my own.  You'd be mistaken.

CMake includes a capability for configuring external projects, even downloading their source code (or checking it out via git), called ExternalProject.

This looks super handy -- and it almost is.  (And for folks using CMake to build applications I'm sure this works out well indeed.)

Unfortunately, this facility needs a lot of work still -- it only runs at build time, not configuration time.

It also isn't immediately obvious that ExternalProject_Add() just creates the custom target, without making any dependencies upon that target.  I spent a number of hours trying to understand why my ExternalProject was not getting configured.  Hip hip hurray for CMake's amazing debugging facilities... not.  It's sort of like trying to debug some bastard mix of m4, shell, and Python.  Hint: add_dependencies() is the clue you need -- may this knowledge save you the hours its lack cost me.  Otherwise, enjoy the spaghetti.
Bon appétit, CMake lovers!

So once you're configuring the dependent library, how are you going to link your own library against the dependent?

Well, if you're building an application, you just link (hopefully statically), have the link resolved at compile time, and forget about it forever more.

But if you're building a library the problem is harder.  You can't include the dependent library directly in your own.  There's no portable way to "merge" archive libraries or even dynamic libraries.

Basically, your consumers are going to be stuck having to link against the dependent libraries as well as your own (and in the right order too!)  You want to make this easier for folks, but you just can't. 
(My kingdom for a C equivalent to the Golang solution to this problem.  No wonder Pike et al. got fed up with C and invented Go!)

And Gophers everywhere rejoiced!

Making matters worse, the actual library (or libraries -- the aforementioned TLS software actually ships three of them: libmbedcrypto, libmbedx509, and libmbedtls) is located somewhere deeply nested in the build directory.  Your poor consumers are never going to be able to figure it out.

There are two solutions:

a) Install the dependency as well as your own library (and tell users where it lives, perhaps via pkgconfig or somesuch).

b) Just forget about this and make users pre-install the dependency explicitly themselves, and pass the location to your configuration tool (CMake, autotools, etc.) explicitly.

Of these two, "a" is easier for end users -- as long as the application software doesn't also want to use functions in that library (perhaps linking against a *different* copy of the library).  If this happens, the problem can become kind of intractable to solve.

So we basically punt, and make the user deal with this -- which these days is handled on many systems by packaging systems like Debian's apt, pkg-add, and brew.

After having worked in Go for so long (and admittedly in kernel software, which has none of these silly userland problems), the current state of affairs here in C is rather disappointing.

Does anyone out there have any other better ideas to handle this (I mean besides "develop in Y", where Y is some language besides C)?

Licensing... again....

Let me start by saying this... I hate the GPL.  Oh yeah, and a heads up, I am just a software engineer, and not a lawyer.  Having said that....

I've released software under the GPL, but I never will again.  Don't get me wrong, I love open source, but the GPL's license terms are unaccountably toxic, creating an island that I am pretty sure the original GPL authors never intended.


My Problem....


So I started by contemplating a licensing change for a new library I'm working on: moving from the very loose and liberal MIT license to something with a few characteristics I like -- namely patent protection and a "built-in" contributor agreement.  I'm speaking, of course, of the well-respected and well-regarded Apache License 2.0.

The problem is, I ran into a complete and utter roadblock.

I want my software to be maximally usable by as many folks as possible.

There is a large installed base of software released under the GPLv2.  (Often without the automatic upgrade clause.)

Now I'm not a big fan of "viral licenses" in general, but I get that folks want to have a copy-left that prevents folks from including their work in closed source projects.  I get it, and it's not an entirely unreasonable position to hold, even if I think it limits adoption of such licensed software.

My problem is that the GPLv2's terms are incredibly strict, prohibiting any other license terms from being applied by any other source in the project.  This means that you can't mix GPLv2 with much of anything else, except the very most permissive licenses.  The Apache License's patent grant and protection clauses break GPLv2 compatibility.  (In another, older circumstance, the CDDL had similar issues, which still block ZFS from being distributed with the Linux kernel proper.  The CDDL also had a fairly benign choice-of-venue clause for legal action, which was likewise deemed incompatible with the GPLv2.)

So at the end of the day, GPLv2 freezes innovation, and it has limited my own choices, because I would like to let people with GPLv2 projects use my libraries.  We even have an ideological agreement -- the FSF actually recommends the Apache License 2.0!  And yet I can't use it; I'm stuck with the very much inferior MIT license in order to let GPLv2 folks play in the pool.

Wait, you say, what about the GPLv3?  It fixed these incompatibilities, right?  Well, yeah, but then it went and added other constraints on use which are even more chilling than the GPLv2's.  (The anti-Tivoization clause, which is one of the more bizarre things I've seen in any software license, applies only to equipment intended primarily for "consumer premises".  What??)

The GPL is the FOSS movement's worst enemy, in my opinion.  Sure, Linux is everywhere, but I believe that this is in spite of the GPLv2 license, rather than a natural byproduct of it.  The same result could have been achieved under a liberal license, or a file-based copyleft.

GPL in Support of Proprietary Ecosystems


In another turn of events, the GPL is now being used by commercial entities in a bait-and-switch.  In this scheme, they hook the developer on their work under the GPL.  But when the developer wants to add some kind of commercial capability and keep that source confidential, the developer cannot do so -- unless the developer pays the original author a fee for a special commercial license.  For a typical example, have a look at the WolfSSL license page.

Now all that is fine and dandy, and legal as you please.  But in this case, the GPL isn't being used to promote open source at all.  Instead, it has become an enabler for the monetization of closed source, and frankly leads to a richer proprietary software ecosystem.  I don't think this is what the original GPL authors intended.

Furthermore, because the author of this proprietary software needs to be able to relicense the code under commercial terms, they are very unlikely to accept contributions from third parties (e.g. external developers) -- unless those contributors are willing to perform a copyright assignment or sign a contributor agreement giving the commercial entity very broad relicensing rights.

So instead of becoming an enabler for open collaboration, the GPL just becomes another tool in the pockets of commercial interests.

The GPL Needs to Die

If you love open source, and you want to enhance innovation, please, please don't license your stuff under the GPL unless you have no other choice.  If you can relicense your work under other terms, please do so!  Look for a non-viral license with the patent protections needed by both you and your downstreams.  I recommend either the Mozilla Public License (if you need a copyleft on your own code) or the Apache License (which is liberal but offers better protections than BSD, MIT, or similar alternatives).

Monday, October 24, 2016

MacOS X Mystery (Challenge)

(Maybe my MacOS X expert friends will know the answer.)

This is a mystery that I cannot seem to figure out.  I think it's a bug in the operating system, but I cannot find the solution, or even explain the behavior to my satisfaction.

Occasionally, a shell window (iTerm2) will appear to "forget" my identity.

For example:

% whoami
501

That's half right... The same command in another window is more correct:

% whoami
garrett

Further, id -a reports differently:

The broken window:

% id -a
uid=501 gid=20(staff) groups=20(staff),501,12(everyone),61(localaccounts),79(_appserverusr),80(admin),81(_appserveradm),98(_lpadmin),33(_appstore),100(_lpoperator),204(_developer),395(com.apple.access_ftp),398(com.apple.access_screensharing),399(com.apple.access_ssh)

The working one:

% id -a
uid=501(garrett) gid=20(staff) groups=20(staff),501(access_bpf),12(everyone),61(localaccounts),79(_appserverusr),80(admin),81(_appserveradm),98(_lpadmin),33(_appstore),100(_lpoperator),204(_developer),395(com.apple.access_ftp),398(com.apple.access_screensharing),399(com.apple.access_ssh)

It appears that the shell (and this broken behavior seems to be inherited by child shells, by the way) somehow loses the ability to map numeric Unix ids to login names.

So I tried another command:

% dscl . -read /Users/garrett
Operation failed with error: eServerError

The same works properly in my other window (I'm not posting the entire output, since it's really long).

I am wondering what could possibly be different.  The behavior doesn't seem to depend on environment variables (I've tried stripping those out).

I'm thinking that there is something in the process table (in the MacOS X equivalent of the uarea?) that gives me access to directory services -- and that this is somehow clobbered.  As indicated, whatever the thing is, it appears to be inherited across fork(2).
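
If anyone wants to reproduce the narrower symptom, here's a tiny C check of the same lookup -- this is just the getpwuid() call that whoami and id ultimately rely on, so running it from both a broken window and a working one should show whether the failure is in the libc / Open Directory path rather than in the shell itself:

#include <stdio.h>
#include <pwd.h>
#include <unistd.h>

int main(void)
{
    uid_t uid = getuid();
    struct passwd *pw = getpwuid(uid);  /* resolved via directory services on MacOS X */

    if (pw == NULL) {
        printf("getpwuid(%d): no name mapping\n", (int)uid);
    } else {
        printf("getpwuid(%d) -> %s\n", (int)uid, pw->pw_name);
    }
    return 0;
}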

I thought maybe I could figure this out with DTrace or dtruss... but Apple has crippled DTrace on this platform, and this is one of those binaries that I am unable to introspect.  Arrgh!

sudo dtruss dscl . -read /Users/garrett
Password:
dtrace: system integrity protection is on, some features will not be available

dtrace: failed to execute dscl: dtrace cannot control executables signed with restricted entitlements

Btw, I'm running the latest MacOS X:

% uname -a
Darwin Triton.local 16.0.0 Darwin Kernel Version 16.0.0: Mon Aug 29 17:56:20 PDT 2016; root:xnu-3789.1.32~3/RELEASE_X86_64 x86_64

So, for my MacOS X expert friends -- anyone know how directory services really works?  (As in how it works under the hood?)  I don't think we're in UNIX land anymore, Toto!

Security Advice to IoT Firmware Engineers

Last Friday (October 21, 2016), a major DDoS attack brought down a number of sites across the Internet.  My own employer was amongst those affected by the widespread DNS outage.

It turns out that the sheer scale (millions of unique botnet members) was made possible by the IoT, and rather shoddy engineering practices.

It's time for device manufacturers and firmware engineers to "grow up" and learn how to properly engineer these things for the hostile Internet, so that they don't have to issue recalls when their customers' devices are weaponized by attackers without their owners' knowledge.

This post is meant to offer some advice to firmware engineers and manufacturers, in the hope that it may help them prevent their devices from being used in these kinds of attacks in the future.


Passwords


Passwords are the root of most of the problems, and so much of the advice here is about improving the way these are handled.


No Default Passwords


The idea of using a simple default user name and password, like "admin/admin", is a practice from the 90's, intended to make life easier for service personnel and to avoid having to manage many different passwords.  Unfortunately, bad default usernames and passwords are probably the single biggest problem.  It's far worse in an IoT world, where there are many thousands, or even millions, of devices that share the same user name and password.

The proper solution is to allocate a unique password to each and every device.  Much like we already manage unique MAC addresses, we need every device to have a unique password.  (Critically, though, the password must not be derived from the MAC address.)

My advice is to have a small amount of ROM that is factory-burned with either a unique password, or a numeric key that can be used to create one.  If you have enough memory to store a dictionary in the generic firmware -- say 32k words -- you can get very nice, human-manageable default passwords by storing just a 64-bit random number in ROM and using four 15-bit chunks of it as indexes into the dictionary.  That's 60 bits of total entropy, which is plenty to ensure that every device has its own password.

Then you have nice, human-parseable passwords like "bigger-stampede-plasma-pandering".  These can be printed on the same sticker that MAC addresses are typically printed on.  (You could also accept a hexadecimal representation of the underlying 64-bit value, or just use that instead of human-readable passwords if you are unable to accommodate an English dictionary.  Devices localized for use in other countries could use locale-appropriate dictionaries as well.)
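
Roughly, the derivation looks like this.  This is a sketch only: the 32,768-entry wordlist[] is a hypothetical built-in dictionary, and read_rom_seed() stands in for however your hardware exposes the factory-burned 64-bit value.

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

extern const char *wordlist[32768];     /* hypothetical built-in dictionary */
extern uint64_t read_rom_seed(void);    /* hypothetical: factory-burned 64-bit value */

/* Turn the 64-bit seed into "word-word-word-word", 15 bits per word. */
void default_password(char *buf, size_t len)
{
    uint64_t seed = read_rom_seed();
    size_t off = 0;

    for (int i = 0; i < 4 && off < len; i++) {
        uint16_t idx = (seed >> (i * 15)) & 0x7FFF;   /* index into the 32k dictionary */
        off += snprintf(buf + off, len - off, "%s%s",
            i ? "-" : "", wordlist[idx]);
    }
}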


Mandatory Authorization Delay


Second, IoT devices should inject a minimum delay after every password authentication attempt, whether it succeeds or fails.  Just a few seconds is enough to substantially slow down dictionary attacks against poorly chosen end-user passwords.  (A 2 second delay means that only 1800 attempts can be made per hour under automation; 5 seconds reduces that to 720.  It will be difficult to iterate a million passwords against a device that does this.)
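
Something as simple as the following is enough; verify_password() here is just a placeholder for whatever check the device already performs.

#include <stdbool.h>
#include <unistd.h>

#define AUTH_DELAY_SECS 2

extern bool verify_password(const char *user, const char *pass);  /* placeholder */

bool authenticate(const char *user, const char *pass)
{
    bool ok = verify_password(user, pass);

    /* Pay the delay on success and failure alike, so the attacker learns
     * nothing from timing and gets throttled either way. */
    sleep(AUTH_DELAY_SECS);
    return ok;
}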


Strong Password Enforcement


User-chosen passwords should not be a single dictionary word; indeed, the default should be a randomly generated password using the same dictionary approach above (generate a 64-bit random number, break it into chunks, and index into a stock dictionary).  It may be necessary to provide an end-user override, but it should be somewhat difficult to get at, and when activated it should display a large warning about the compromise to security that user-chosen passwords typically represent.
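
A sketch of the dictionary-word part of that check, assuming the same hypothetical wordlist[] as in the default-password example (a real implementation would also want length and character-class checks):

#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

extern const char *wordlist[32768];   /* hypothetical built-in dictionary */

/* Reject a candidate password that is exactly one dictionary word. */
bool is_single_dictionary_word(const char *pass)
{
    for (size_t i = 0; i < 32768; i++) {
        if (strcasecmp(pass, wordlist[i]) == 0) {
            return true;
        }
    }
    return false;
}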


Networks


Dealing with the network, and securing the use of the network, is the other part of the problem that IoT vendors need to get right.


Local Network Authentication Only


IoT devices generally know the network they are on; if the device has a separate management port or LAN-only port (like a WiFi router), it should, by default, only allow administrator access from that port.

Devices with only a single port, or that sit on a WiFi network, should by default prevent administrator access from "routed" networks.  That is, by default, devices should not allow login attempts from a remote IP address that is not on a local subnet.  While this won't stop many attacks (especially those on public WiFi), it makes attacking the devices from a global botnet, or managing them as part of a global botnet, that much harder.  (Again, there has to be a provision to disable this limitation, but doing so should present a warning.)
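
For IPv4 the check is just a mask comparison against each directly attached interface.  In this sketch the interface address and netmask are assumed to come from the device's own network configuration; walking multiple interfaces and handling IPv6 are left out.

#include <stdbool.h>
#include <netinet/in.h>

/* True if the peer shares a subnet with the given interface. */
bool peer_is_local(struct in_addr peer, struct in_addr ifaddr, struct in_addr ifmask)
{
    return (peer.s_addr & ifmask.s_addr) == (ifaddr.s_addr & ifmask.s_addr);
}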


Encrypted Access Only


Use of unsecured channels (HTTP or telnet) is unacceptable in this day and age.  TLS and/or SSH are the preferred alternatives, and will let your customers deploy these devices somewhat more securely.

Secure All Other Ports


Devices should disable any network services that are not specifically part of the service they offer, or intrinsic to their management.  System administrators have been doing this on their systems for decades now, but it seems some firmware still ships with stock services enabled that can be used as attack vectors.


Don't Advertise Yourself


This one is probably the hardest.  mDNS and device discovery over "standard" networks is one of the ways that attackers find devices to target.  It's far better to have this disabled by default -- if discovery is needed during device configuration, it can be enabled briefly while the device is being configured.  Having a "pairing" button to let end-users enable this briefly is useful -- but mDNS should be used only with caution.


Secure Your Channel Home


Devices often want to call home for reporting, or for web-centric command and control (e.g. remote management of your thermostat).  This is one of the major attack vectors.  (If you can avoid calling home altogether, so much the better!)

Users must be able to disable this function (in fact, it should be disabled by default).  Furthermore, the channel must be properly secured end to end through your network, with provision for dealing with a compromise (e.g. leaked private keys on the server side).  Get a security expert to review your protocols, and your internal security practices.


Mesh Securely


Building local mesh networks of devices, e.g. to create a local cloud, means having strong pairing technology.  The strongest forms of this require administrator action to approve -- just like pairing a bluetooth keyboard or other peripheral.

If you want to automate secure mesh provisioning, you have to have secure networking in place -- technologies like VPN or ZeroTier can help build networking layers that are secure by default.


Don't Invent Your Own Protocols


The roadside is littered with the corpses of products that attempted to invent their own protocols, or to use cryptography in non-standard ways.  The best example of this is WEP, which took a relatively secure cipher (RC4 was not considered broken at the time), but deployed it naively and brokenly.  RC4 got a very bad rap for this, but it was actually WEP that was broken.  (Since then, RC4 itself has been shown to have some weaknesses, but those findings are relatively new compared to the brokenness that was WEP.)


General Wisdoms


Next we have some advice that most people should already be aware of, but that bears repeating.


Don't Rely on Obscurity


It's an old adage that "security by obscurity is no security at all".  Yet we often see naive engineers trying to harden systems by making them more obscure.  This really doesn't help anything long-term, and can actually hinder security efforts by giving a false sense of security or creating barriers to security analysis.


Audit


Get an independent security expert to audit your work.  Special focus should be paid to the items pointed out above.  This should include a review of the product, as well as your internal practices around engineering, including secure coding, use of mitigation technologies, and business practices for dealing with keying material, code signing, and other sensitive data.

Saturday, May 14, 2016

Microsoft Hates My Name (Not Me, Just My Name)

In order to debug nanomsg problems on Windows, I recently installed a copy of Windows 8.1 in a VMWare guest VM, along with Visual Studio 14 and CMake 3.5.2.  (Yes, I've entered a special plane of Hell, reserved just for people who try to maintain cross-platform open source software.  I think this one might be the tenth plane, the one Dante skipped because it was just too damned horrible.)

Every time I tried to build, I got bizarre errors from the CMake / build process ... like this:

Cannot evaluate the item metadata "%(FullPath)

It turns out that when I created my account, using the "easy" installation in VMWare, it created my Windows account using my full name: "Garrett D'Amore".  It also turns out that the software is buggy, and can't cope with the apostrophe in my full name when it appears in a filesystem path.

Moving the project directory to C:\Projects\nanomsg solved the problem.

Really, Microsoft?  This is 2016.  I expected programs to struggle, and expected to find bugs (often root exploits -- all hackers should try using punctuation in their login and personal names) caused by the apostrophe in my name, back in the 1990s.  Not in this decade.

Not only that, but the error message was so incredibly cryptic that it took a Google search to figure out that it was a problem with the path.  (Other people encountered this error with paths longer than 260 characters.  I knew that wasn't my problem, but I hypothesized, and proved, that it was my name.)  I have no idea how to file a bug against Visual Studio with Microsoft.  I'm not a paying user, so maybe I shouldn't complain, and I really have no recourse.  Still, they need to fix this.

Normally, I'd never intentionally create a path with an apostrophe in it, but in this case I was being lazy and just accepted some defaults.  I staunchly refuse to change my name because some software is too stupid to cope with it -- this is a pet peeve of mine.

We're in the new millennium, and have been for a decade and a half.  Large numbers of folks with heritage from countries like Italy, France, and Ireland have this character in their surnames.  (And more recently -- since the 1960s! -- the African-American community has been using this character in first names too.)  If your software can't accommodate this common character in names, then it's broken, and you need to fix it.  There are literally millions of us who are angered by this sort of brokenness every day; do us all a favor and make your software just a little less rage-inducing by letting us use the names we were born with, please.