Wednesday, November 8, 2017

CMake ExternalProject_add In Libraries

First off, I'm a developer of open source application libraries, some of which are fairly popular.

TLDR: Library developers should not use ExternalProject_Add, but instead rely on FindPackage, demanding that their downstream developers pre-install their dependencies.

I recently decided to try to add TLS v1.2 support to one of my messaging libraries, which is written in C and configured via CMake.

The best way for me to do this -- so I thought -- would be to add a dependency in my project using a sub project, bringing in a 3rd party (also open source) library -- Mbed TLS.

Now the Mbed TLS project is also configured by CMake, so you'd think this would be relatively straight-forward to include their work in my own.  You'd be mistaken.

CMake includes a capability for configuring external projects, even downloading their source code (or checking out the stuff via git) called ExternalProjects.

This looks super handy -- and it almost is.  (And for folks using CMake to build applications I'm sure this works out well indeed.)

Unfortunately, this facility needs a lot of work still -- it only runs at build time, not configuration time.

It also isn't immediately obvious that ExternalProject_Add() just creates the custom target, without making any dependencies upon that target.  I spent a number of hours trying to understand why my ExternalProject was not getting configured.  Hip hip hurray for CMake's amazing debugging facilities... notIt's sort of like trying to debug some bastard mix of m4, shell, and Python.  Hint, Add_Dependencies() is the clue you need, may this knowledge save you hours lack of it cost me.  Otherwise, enjoy the spaghetti.
Bon Apetit, CMake lovers!

So once you're configuring the dependent library, how are you going to link your own library against the dependent?

Well, if you're building an application, you just link (hopefully statically), have the link resolved at compile time, and forget about it forever more.

But if you're building a library the problem is harder.  You can't include the dependent library directly in your own.  There's no portable way to "merge" archive libraries or even dynamic libraries.

Basically, your consumers are going to be stuck having to link against the dependent libraries as well as your own (and in the right order too!)  You want to make this easier for folks, but you just can't. 
(My kingdom for a C equivalent to the Golang solution to this problem.  No wonder Pike et. al. got fed up with C and invented Go!)

And Gophers everywhere rejoiced!

Making matters worse, the actual library (or more, as in the aforementioned TLS software there are actually 3 separate libraries -- libmbedcrypto, libmbedx509, and libmbedtls) is located somewhere deeply nested in the build directory.   Your poor consumers are never gonna be able to figure it out.

There are two solutions:

a) Install the dependency as well as your own library (and tell users where it lives, perhaps via pkgconfig or somesuch).

b) Just forget about this and make users pre-install the dependency explicitly themselves, and pass the location to your configuration tool (CMake, autotools, etc.) explicitly.

Of these two, "a" is easier for end users -- as long as the application software doesn't also want to use functions in that library (perhaps linking against a *different* copy of the library).  If this happens, the problem can become kind of intractable to solve.

So, we basically punt, and make the user deal with this.  Which tests days for many systems is handled by packaging systems like debian, pkg-add, and brew.

After having worked in Go for so long (and admittedly in kernel software, which has none of these silly userland problems), the current state of affairs here in C is rather disappointing.

Does anyone out there have any other better ideas to handle this (I mean besides "develop in Y", where Y is some language besides C)?

1 comment:

solvingj said...

RE: dependency management. IMO, the problems you are having come from the lack of a package manager. Many projects have made the CMake mechanism work in the absence of a package manager, but it's an example of clever engineering to compensate for the right thing. Every other language has a package manager to bridge the gap between libraries and projects because it's the right thing.

Fortunately, there are a handful of projects trying to solve this problem in C and C++ alike. They're all still finding their footing, but the ball is rolling in the right direction.

RE: MBedTLS. My team has been trying to work with the mbedTLS CMakeLists project and it's pretty bad. We work with a LOT of 3rd party projects CMakeLists and this one has a lot of small but frustrating problems. I've opened tickets, and the person who responded didn't seem to understand some fundamentals. It's so bad that it's currently stuck in limbo. It's unfortunate, because the library itself has potential. We're now about to start looking toward other crypto libs instead. There seem to be many, wolfssl, boringssl are on the list.