Modernizing "less"
I have just spent an all-nighter doing something I didn't expect to do.
I've "modernized" less(1). (That link is to the changeset.)
First off, let me explain the motivation. We need a pager for illumos that can meet the requirements for POSIX IEEE 2003.1-2008 more(1). We have a suitable pager (barely), in closed source form only, delivered into /usr/xpg4/bin/more. We have an open source /usr/bin/more, but it is incredibly limited, hearkening back to the days of printed hard copy I think. (It even has Microsoft copyrights in it!)
So closed source is kind of a no go for me.
less(1) looks attractive. It's widely used, and has been used to fill in for more(1) to achieve POSIX compliance on other systems (such as MacOS X.)
So I started by unpacking it into our tree, and trying to get it to work with an illumos build system.
That's when I discovered the crazy contortions autoconf was doing that basically wound up leaving it with just legacy BSD termcap. Ewww. I wanted it to use X/Open Curses.
When I started trying to do that, I found that there were severe crasher bugs in less, involving the way it uses scratch buffer space. I started trying to debug just that problem, but pretty soon the effort mushroomed.
Legacy less supports all kinds of crufty and ancient systems. Systems like MS-DOS (actually many different versions with different compiler options!) and Ultrix and OS/2 and OS9, and OSK, etc. In fact, it apparently had to support systems where the C preprocessor didn't understand #elif, so the #ifdef maze was truly nightmarish. The code is K&R style C even.
I decided it was high time to modernize this application for POSIX systems. So I went ahead and did a sweeping update. In the process I ditched thousands of lines of code (the screen handling bits in screen.c are less than half as big as they were).
So, now it:
There is more work to do in the future if someone wants to. Here are the ideas for the future:
If someone wants to pick up any of this work, let me know. I'm happy to advise. Oh, and this isn't in illumos proper yet. It's unclear when, if ever, it will get into illumos -- I expect a lot of grief from people who think I shouldn't have forked this project, and I'm not interested in having a battle with them. The upstream has to be a crazy maze because of the platforms it has to support. We can do better, and I think this was a worthwhile case. (In any event, I now know quite a lot more about less internals than I did before. Not that this is a good thing.)
I've "modernized" less(1). (That link is to the changeset.)
First off, let me explain the motivation. We need a pager for illumos that can meet the requirements for POSIX IEEE 2003.1-2008 more(1). We have a suitable pager (barely), in closed source form only, delivered into /usr/xpg4/bin/more. We have an open source /usr/bin/more, but it is incredibly limited, hearkening back to the days of printed hard copy I think. (It even has Microsoft copyrights in it!)
So closed source is kind of a no go for me.
less(1) looks attractive. It's widely used, and has been used to fill in for more(1) to achieve POSIX compliance on other systems (such as MacOS X.)
So I started by unpacking it into our tree, and trying to get it to work with an illumos build system.
That's when I discovered the crazy contortions autoconf was doing that basically wound up leaving it with just legacy BSD termcap. Ewww. I wanted it to use X/Open Curses.
When I started trying to do that, I found that there were severe crasher bugs in less, involving the way it uses scratch buffer space. I started trying to debug just that problem, but pretty soon the effort mushroomed.
Legacy less supports all kinds of crufty and ancient systems. Systems like MS-DOS (actually many different versions with different compiler options!) and Ultrix and OS/2 and OS9, and OSK, etc. In fact, it apparently had to support systems where the C preprocessor didn't understand #elif, so the #ifdef maze was truly nightmarish. The code is K&R style C even.
I decided it was high time to modernize this application for POSIX systems. So I went ahead and did a sweeping update. In the process I ditched thousands of lines of code (the screen handling bits in screen.c are less than half as big as they were).
So, now it:
- Speaks terminfo (X/Open Curses) instead of ancient BSD termcap
- Uses glob(3C) instead of a hack involving the shell and a helper program (lessecho, which I've removed from my tree.)
- Functions properly as /usr/bin/more, both with and without -e (even on broken xterms)
- Is fully ANSI C (or ISO C, if you prefer)
- Passes illumos' cstyle code style checks
- Is lint(1) clean
There is more work to do in the future if someone wants to. Here are the ideas for the future:
- Internationalization. This is a pretty easy task involving gettext().
- Make less use getopt() instead of its byzantine option parser (it needed that for PC operating systems. We don't need or want this complexity on POSIX.)
- Fix its character set handling so it can use the mbstring and wcstring routines in the platform instead of relying on it's own implementation of UTF-8. (This would make it support other multibyte locales.)
- Make it support port events instead of sleeping when acting in "tail -f" mode.
If someone wants to pick up any of this work, let me know. I'm happy to advise. Oh, and this isn't in illumos proper yet. It's unclear when, if ever, it will get into illumos -- I expect a lot of grief from people who think I shouldn't have forked this project, and I'm not interested in having a battle with them. The upstream has to be a crazy maze because of the platforms it has to support. We can do better, and I think this was a worthwhile case. (In any event, I now know quite a lot more about less internals than I did before. Not that this is a good thing.)
Comments
I've just also pushed another substantial set of fixes, which I think bring the utility into 100% compliance with XPG7/POSIX 2008. I don't think any other more utility on the planet is conformant (they all get the "s" command wrong from my testing.)
It will take a seriously dedicated effort for someone to notice that this more is "not" the same more we have had all along. Check out the repo.
thanks
Did you use version control while doing the changes?
If not, please first import the original less (if it has a git repo, even better) and apply your changes to it, even if only in one commit, it will help understanding what changed.
https://bitbucket.org/gdamore/less-fork/
This has the latest changes. I've verified full POSIX compliance, and I've fixed quit a number of other bugs, including a few potential crashers.
oh well.
It's not that important for a pager, but think of all the utilities that run constantly on a server..
No disagreement from me. I didn't write this code. I just improved it.
That said, a private copy of the bss page, or a page of stack -- shouldn't make much difference either way.
Typically even on a busy server there aren't more than a few copies of less running. (vi is another story!)
Now as far as the rest of the coreutils... well, bleh. Hopefully the BSD code is better, and even more so the illumos code.
Feel free to offer up diffs for improvements, btw!
I'll take a look at the illumos code :)