Tuesday, October 16, 2012

GNU grep - A Cautionary Tale About GPLv3

My company, DEY Storage Systems, is in the process of creating a new product around the illumos operating system. As you might imagine, this product includes a variety of open and proprietary source code. The product itself is not delivered as a separate executable, but as a complete product. We don't permit our customers to crack it open, both to protect our IP and to protect our support and release engineering organizations -- our software releases consist only of a single file, and we don't supply tools or source for other parties to modify that file.

One of the pieces that we wanted to integrate into the tree is an excellent little piece of software called ZooKeeper, produced by the Apache Software Foundation. Like illumos, ZooKeeper has a non-viral license, which makes it nice for integration into our product.

However, I discovered that as part of our integration, one of my engineers had decided to integrate GNU grep. Why? Because the startup script for ZooKeeper apparently makes use of some GNU grep extensions that are not in illumos' grep. (The need for us to enhance illumos' grep utility, and the need to teach folks not to rely on non-portable GNU extensions, are both topics for another time.)

Fortunately, I caught this in time, because that one little "worm", used simply to support a startup script, could have created a situation where we would have been required to open source our entire product.

Now if you're Richard Stallman -- this is your goal -- all source code is "liberated".

But if you're a business trying to create value, this is actively harmful. It's almost impossible to build a product around GPLv3 code unless the only way you create value is through selling support. (There are a variety of business reasons this is a bad model: with open code, third parties can start "selling support" for your product, possibly giving your product a bad name, and generally leeching off your engineering effort. In the end, without proprietary code, there is vastly reduced economic incentive to innovate -- you can't use innovation in software to gain a competitive advantage when all software is open. I would argue that even the innovation that occurs in Linux exists largely due to economic pressures arising from attempts to compete against proprietary systems.)

So, while the correction here is not big for us -- just a small change to a startup script for now -- the hazard of these viral licenses cannot be overstated. If you make your living, as I do, writing software, then you should be very wary of the lure of code carrying the GPLv3 -- it's a potential license bomb that may have nasty consequences for you later.

And if you're a producer of open source software -- as I am -- be wary of your dependencies as well. While your code may be licensed under reasonably generous terms that support commercial uses, if your dependencies include things like GPLv3 GNU grep, you may find that your product gains narrower acceptance than you intended. Inclusion (by dependency or otherwise) of a GPLv3 product in your own product means that you are willing to abdicate any decisions about the licensing of your own product.

I'll get off my soapbox now...

25 comments:

Lucian said...

At worst, you'd have had to remove the offending code. There's no magical "you must open-source all your stuff" aspect to the GPL, whichever version.

Justin Ryan said...

You're completely misinformed about Free Software. You do not have to open source your product because it relies on Zookeeper, whose startup scripts rely on GNU Grep.

You simply have to provide a means for obtaining the source to the version of GNU Grep you use.

Robert said...

I believe you are incorrect. The source you would have to give out would have been the GREP source not your product source.

Did your lawyers give you that answer?

Kirk Strauser said...

That's the strangest interpretation of the GPL I've yet heard. Who told you that packaging the standalone grep program would cause your other code not linked to it to be subject to its license? Whoever misrepresented the GPL so badly did you no favors.

Harris Newman said...

Why don't you just "grep -iR GPL *" on all your source code tree to find any issues prior to release?
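A scan like that can also be scripted so it runs before every release. The following is only a minimal sketch, assuming that license headers mention "GPL" or "GNU General Public License" in plain text; the function name and pattern are hypothetical, and a text scan will not notice a binary dependency that ships without its license text, so hits (and misses) still need human review.

```python
import os
import re
import sys

# Hypothetical marker pattern; real license headers vary, so treat hits as leads to review.
LICENSE_PATTERN = re.compile(r"GNU General Public License|GPL", re.IGNORECASE)


def find_license_mentions(root):
    """Return sorted (path, line_number, line) tuples for lines that mention the GPL."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" lets the scan pass over binary files without crashing.
                with open(path, "r", encoding="utf-8", errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        if LICENSE_PATTERN.search(line):
                            hits.append((path, number, line.strip()))
            except OSError:
                # Unreadable files are skipped rather than aborting the whole scan.
                continue
    return sorted(hits)


if __name__ == "__main__":
    tree = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, number, line in find_license_mentions(tree):
        print(f"{path}:{number}: {line}")
```

Run against the top of a source tree, it prints one line per match in the same path:line style that grep -n uses.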

Garrett D'Amore said...

You all need to go back and re-read the GPLv3. It is pretty darn clear here in its guards against Tivoization. Thank you for making me do the same -- I got a surprise. (More in a sec.)

So if you include it, you have to include directions about how to rebuild the product with a different version (customer modified) of the source (for grep in this case).

Have a look at this:

http://www.gnu.org/licenses/gpl-faq.html#Tivoization

However, it turns out that for a product *not* intended for deployment in a dwelling, it looks like the GPLv3 doesn't provide any anti-Tivoization clauses. (This is really strange -- it seems to offer a different level of requirements for a router intended for the home consumer than one intended for a datacenter -- how bizarre!) Strangely, the Affero license seems to address this for network services, but there is a wide gap for product deliveries into non-consumer markets!

It may be time to have the lawyers do a review of GPLv3 for us.

Still, it's safest to avoid this mess altogether. When we have to have lawyers interpret the boundaries of the Work (which in this case is a single binary file), and whether a product is a User Product or not, it's not a good sign.

fdsafads said...

As someone making proprietary products using free and open source software/tools, you should have very little issue with the concept of leeching.

Take a step back for a moment and look where you are coming from. Copyright holders have just as much right to prevent you from using their work if you plan on not giving back, as you do selling proprietary products based on the hard work of others. It all comes down to morals.

Garrett D'Amore said...

About proprietary vs. free morals... yes. (And by the way we contribute significantly to open source -- I started the illumos project after all!) The GPL isn't so much a software license as a religious platform. It limits innovation by requiring that all derivatives -- even something that derives simply by linking or unmodified inclusion -- must "give back". Sometimes that's the right stance for software to take. But don't complain if I warn about the dangers of including such products in closed source systems. GPLv3 is still very dangerous for commercial consumers -- although it may be less so than I thought since the dangers seem only to apply to "User Products".

Alan Coopersmith said...

If you don't supply any sources, then aren't you violating the CDDL requirement to provide sources for all CDDL licensed files? (See section 3.1 of CDDL 1.0.)

Kyle Summerfield said...

Pretty funny/sad that your developer team was helped because grep's source was free, freedom protected by the GPL, and now that you've got your project done, forget free software. That's _exactly_ why the GPL exists, to stop companies and people like you from taking free work, benefiting from it, and then stopping others from benefiting.

Garrett D'Amore said...

Alan: we will have sources for CDDL stuff posted -- but actually all we have to do is give a link to that stuff (or our open illumos tree) to our customers. No sweat.

The problem is that the viral nature of GPLv3 can extend itself (through its anti-tivoization clause) to make it difficult or impossible to include GPLv3 in a project unless you're going to give enough of the source away to allow someone to replace that component. (There are ways to achieve this without giving source, but it's ugly. And IMO it leaves too much to be interpreted by lawyers. And again, it appears that this restriction strangely applies only to the consumer space -- which was something I didn't realize until today.)

Kyle: wow. Thanks for that bit of GPL zealotry. We -- I personally and companies that I have worked for -- have written and given away a lot of software. That doesn't mean I want to give it *all* away.

I venture to believe that if I (and my company) left the open source community, and stopped contributing altogether, the open source community would be poorer for it.

I believe that each contributor should have a reasonable right to decide whether their code contributions should be open source or not. GPL zealots would do as much as they can to deny me that right, by robbing me of the advantage of using open source. I retain that right -- and I elect not to use GPLv3 software (so that we derive no "benefit" from it) so that I can retain the privilege of choosing how the code *I* write should get licensed.

(For the record, I am a fan of more restricted copyleft. The CDDL used in illumos is great here -- as are the Apache and Mozilla licenses. They keep the core stacks open and minimize the chance of commercial fragmentation, while still allowing those subsystems to be used in other commercial endeavors that would build unique value on top of them.)

I'll go one further too -- I think that if Linux were GPLv3, there would be no commercial uptake of it in the various embedded arenas. I doubt there would even be such a thing as Android.

printf.net said...

"So if you include it, you have to include directions about how to rebuild the product with a different version (customer modified) of the source (for grep in this case)."

You're correct here. Yet in the main post you wrote:

"could have created a situation where we would have been required to open source our entire product."

There is no circumstance under which this could have created a situation where you were required to open source your entire product, and that's why everyone is disagreeing with what you wrote, and they're right. Your entire product is not GNU grep.

Giving access to install a modified GNU grep -- even under GPLv3's anti-tivoization provisions, even if you're providing a User Product -- has simply nothing to do with "open sourcing your entire product".

Patrick said...

I don't think anyone realized that we (ZooKeeper) were using GNU grep extensions. Please feel free to create a jira and if possible submit a patch to address it. Thanks. https://issues.apache.org/jira/browse/ZOOKEEPER

Cathal said...

I understand your frustration, but you're really making it sound like this is the GPL's fault. It's not. It's your company's fault for assuming you can use software on your terms, not those of the creator.

From another perspective, you're trying to impose *your* desires on the GPL-licensed Grep utility; you want it to be free, and pliable, and to have no obligations in return. If you want that, look elsewhere; the makers of Grep demand that they get something in return. Thankfully, they are community minded folk, and what they get in return is the knowledge that you will, in turn, contribute to the code-commons.

By all means warn people who want to keep source closed not to use the GPL. Just don't make it sound like the GPL is at fault here; it's doing exactly what the code writers want it to do; prevent people making use of their hard work without any compensation (in this case, shared communal compensation).

I GPL my source code for *exactly* this reason: I don't want to see anyone, no matter how much open source code they otherwise write/contribute, using my work for closed source work from which I see no communal benefit. And if you don't agree with my outlook, that's fine; don't use my code. GPL, hard at work. :)

Sam said...

Garrett: Thank you very much for pointing out that anti-Tivoization in the GPL3 actually only covers *home* usage. Previously I've considered the GPL3 unworkable because I assumed that the clause covered any kind of product, and I don't think it's safe to allow the user to modify, for example, the engine management software in a car. Even with the best intentions it would be possible to endanger lives through incompetence when working on heavy machinery like that.

However, it turns out the GPL3 guys realised that and the requirement is only for *home* devices, which have far less potential for harmful effects than heavy machinery such as cars. I'm still not confident that the wording is rock solid, since I'm not a lawyer and it's never been tried in court, but we have to start somewhere and I'd be much happier to use the GPL3 for software in the future now that I've realised that the anti-Tivoization clause is at least fundamentally sound.

Unknown said...

The usual example of tivoization is hardware that checks a signature and shuts down if the signature doesn't match. So how do you see the position of GNU grep in Solaris 11? If someone replaces 'ggrep' with a newer version compiled from source, a later 'pkg fix ggrep' would put back the original GNU grep. Is this tivoization or something else?

Something else perhaps, since one can pkg uninstall ggrep and install their own without fear of their own being replaced again by pkg fix.

So this differs from your product in that one can't uninstall GNU grep and install their own. Hence the tivoization. Right?

I'm just looking at the issue from another perspective.

A Person said...

Curious that you didn't mention the BSD license as favorable. Is there a reason for that?

Peter Musgrave said...

Pretty hilarious to see all of this GPLv3 religious zealotry. GPLv3 is an okay license, but it really is a bad choice for business. I know, it's tired, everyone says "if the GPL is so much less friendly, why did it win in the adoption category, why does Linux outmatch BSD". Well, because Linux is GPLv2. That's why. Also, it is a strawman to say Linux wins. Actually, BSD wins. OS X is the proof. Also, look at FreeBSD, they're dumping GCC because of GPLv3 and moving to LLVM/Clang. There is a fork of OpenBSD now that is dumping GCC and the GPLv3 mess. Stallmanites think the world can't survive without GCC & GNU, well, they're wrong. As De Gaulle said, "graveyards are full of indispensable men". Well, /dev/null is full of indispensable code. Stallman/FSF/GNU like to act like they can strong-arm people into buying their motto "WE'LL FORCE YOU TO BE THE TYPE OF FREE WE SAY IS FREE". Bad move. The hacker community has dumped bullies before, and we'll dump them again. We'll make new compilers, new parser/generators, and we'll make new operating systems. Far from the theocracy of the FSF.

Unknown said...

"The problem is that the viral nature of GPLv3 can extend itself (through its anti-tivoization clause) to make it difficult or impossible to include GPLv3 in a project unless you're going to give enough of the source away to allow someone to replace that component."

That's not a bug, that's a feature.

miine said...

GPL3 states that if the tool is used in an "arm's length" manner, then the calling tool IS GPL3ed too.

Had the same issue with mbuffer, whose included version in OpenIndiana is the GPL3 version while an older source version is GPL2...

In the end it comes down to the definition of "arm's length" -- I can only recommend not using any GPL3 tools/source, as it is totally unclear what this means in court.

Marnen Laibow-Koser said...

"Its almost impossible to build a product around GPLv3 code unless the only way you create value is through selling support."

I'm no fan of the GPL, but this is utterly incorrect, at least in the hardware field (which I gather is where you're selling). If you have a hardware device (like a TiVo or a router) with user-modifiable code on it, consumers will still buy it. Most won't modify it, but those that want to will find the ability to do so a selling point. If anything, you get *more* hardware sales out of it.

Look at the router market: non-tivoization is a selling point for sure. If I buy a router and then put aftermarket firmware on it, everyone benefits: the manufacturer makes a sale, and I get firmware that better suits my needs. I probably wouldn't consider buying a router where I couldn't run aftermarket firmware, so if the manufacturer locked that ability down, they'd *lose* my purchase.

Garrett D'Amore said...

If you're selling hardware, sure. But what if your value is in software? GPLv3 basically rejects the idea that anyone should create any unique value, or have any right to that unique value, that they create in software -- basically preventing the creation of value through software, unless somehow it ties to something else that has intrinsic value in the world (like a piece of hardware).

With the commoditization of hardware, there is little reason for anyone who doesn't produce something really unique in hardware -- i.e. the chip vendors themselves -- to ever invest in software. At the end of the day a Tivo is nothing more than a capture card, a hard disk, and a regular computer. Oh, and *software*, which is what sets the Tivo apart from other systems.

So yeah, if I'm a vendor like Tivo, I'm not going to be able to use GPLv3 components without sacrificing any chance of me being able to create my own value. And if I'm someone like oh say a games publisher, I also can't do that. Imagine if some AAA title had to be open source and freely redistributable just because an author picked up a GPLv3 string tokenizer or somesuch.

This is why GPLv3 is *toxic* for use in creating commercial software, and why I believe every business involved in the creation of *products* should avoid it like the plague. If your entire business is support, yeah, go for it. Or if your entire business is just "cheaper hardware", and you don't produce any software of your own (or that's not where your value is), sure, go for it. But those are typically race-to-the-bottom knock-off vendors, rather than anyone actually improving the state of the art.

Don't get me wrong; I love open source. I use copyleft licenses heavily (CDDL and MPL both). But the scope of their impact is limited to just their own implementations, which is as it should be. Under those licenses, it's impossible for a 100-line chunk of code to put a 10-million-line effort at risk, the way GPLv3 considerations can.

Viral licenses like GPL serve a particular socialist agenda, that all software should be free, and that agenda is directly contrary to the business goals of thousands of companies and billions of dollars of investment (including, btw, engineering salaries!). For that reason, I think the GPL (especially v3) is selecting against itself -- ultimately I expect GPLv3 to become extinct except for a few projects of limited value and scope. We're already seeing this with clang.

Marnen Laibow-Koser said...

I was specifically addressing the hardware angle, because I thought that was the market niche you were in. I agree that it's tough (not impossible) to make a viable commercial software-only product with GPL, and I use other licenses (typically MIT) for most of my work.

Dren Kajm said...

If you don't want to use free software, try creating everything on your own and see how much profit you have.

Garrett D'Amore said...

Dren -- you clearly didn't bother to read the post in its entirety. I have created *huge* amounts of software, and I share liberally. Go google me to see what I've given back -- those of you who want to bitch about just taking and not giving back should ask yourselves -- how much have *you* given back? I suspect for most of you I've got you beat hands down.

Some of us who contribute software that is free *want* it to be used commercially. This has historically been true for UNIX, for X11, BSD networking (which is the basis for all modern TCP/IP implementations), etc. Frankly without the ability to have that stuff in commercial products, UNIX (and btw consequently Linux) would never have gained popularity, and we'd all be stuck trying to figure out how to use Windows or VMS derived systems. So please don't confuse your social platform of "mandatory sharing" with free software. Indeed, in "guaranteeing freedoms", I consider that copyleft licenses (especially viral ones like GPLv3) are substantially *less* free than other more liberal licenses.

Proprietary software does have a place, and at various times my salary has been paid for by it. (And having that salary has enabled me to spend time working on free software too!) GNU GPLv3 and such software mix very very poorly.

Going back and looking at this, it's become clear that most of the folks who responded that all I have to do is release the sources to grep are wrong as well. That would be true *only* if I wanted to open up the entire product to modification -- recall this was a single firmware image. I didn't, for some very strong business reasons -- most of which related to supportability, but some were driven by competitive concerns. Getting forced into this because of grep being used in a startup script is just plain stupid, particularly since the GPLv3 contribution here was trivially worked around -- we did eliminate the offending software.

Now, in the case of that firmware image -- it's kind of moot. The company, DEY, went bankrupt and the product never went into production. The reasons for this had nothing to do with technology or licensing, however. (The product itself was complete, but we ran out of money before we could successfully market it.)