Monday, April 28, 2008

XEmacs is Dead. Long Live XEmacs!

 "We're going to get lynched, aren't we?" — Phouchg


And you thought I'd given up on controversial blogs. Hah!

Preamble

This must be said: Jamie Zawinski is a hero. A living legend. A major powerhouse programmer who, among his many other accomplishments, wrote the original Netscape Navigator and the original XEmacs. A guy who can use the term "downward funargs" and then glare at you just daring you to ask him to explain it, you cretin. A dude with arguably the best cat-picture blog ever created.

I've never met him, but I've been in awe of his work since 1993-ish, a time when I was still wearing programming diapers and needing them changed about every 3 hours.

Let's see... that would be 15 years ago. I've been slaving away to become a better programmer for fifteen years, and I'm still not as good — nowhere near as good, mind you — as he was then. I still marvel at his work, and his shocking writing style, when I'm grubbing around in the guts of the Emacs-Lisp byte-compiler.

It makes you wonder how many of him there are out there. You know, programmers at that level. He can't be the only one. What do you suppose they're all working on? Or do they all eventually make 25th level and opt for divine ascension?

In any case, I'm sad that I have to write the obit on one of his greater achievements. Sorry, man. Keep up the cat blog.

Forking XEmacs

I have to include a teeny history lesson. Bear with me. It's short.

XEmacs was a fork of the GNU Emacs codebase, created about 17 years ago by a famous-ish startup called Lucid Inc., which, alas, went Tango Uniform circa 1994. As far as I know, their two big software legacies still extant are a Lisp environment now sold by LispWorks, and XEmacs.

I'd also count among their legacies an absolutely outstanding collection of software essays called Patterns of Software, by Lucid's founder, Richard P. Gabriel. I go back and re-read them every year or so. They're that good.

Back when XEmacs was forked, there were some fireworks. Nothing we haven't seen many times before or since. Software as usual. But there was a Great Schism. Nowadays it's more like competing football teams. Tempers have cooled. At least, I think they have.

As for the whole sordid history of the FSF-Emacs/XEmacs schism, you can read about it online. I'm sure it was a difficult decision to make. There are pros and cons to forking a huge open-source project. But I think it was the right decision at the time, just as decommissioning it is the right decision today, seventeen years later.

XEmacs dragged Emacs kicking and screaming into the modern era. Among many other things, XEmacs introduced GUI widgets, inline images, colors in terminal sessions, variable-size fonts, and internationalization. It also brought a host of technical innovations under the hood. And XEmacs has always shipped with a great many more packages than GNU Emacs, making it more of a turnkey solution for new users.

XEmacs was clearly an important force helping to motivate the evolution of GNU Emacs during the mid- to late-1990s. GNU Emacs was always playing catch-up, and the dev team led by RMS (an even more legendary hacker-hero) complained that XEmacs wasn't really playing on a level field. The observation was correct, since XEmacs was using a Bazaar-style development model, and could move faster as a direct consequence.

A lot of people were switching over to XEmacs by the mid-1990s: the fancy widgets and pretty colors attracted GNU Emacs users like moths to a bug-zapper.

Problem was, it could actually zap you.

The downside of the Bazaar

I personally tried to use XEmacs many times over a period of many years. I was jealous of its features.

However, I never managed to use XEmacs for very long, because it crashed a lot. I tried it on every platform I used between ~1996 and 2001, including HP-UX, SunOS, Solaris, Ultrix, Linux, Windows NT and Windows XP. XEmacs would never run for more than about a day under moderate use without crashing.

I've argued previously that one of the most important survival traits of a software system is that it should never reboot. Emacs and XEmacs are at the leading edge of living software systems, but XEmacs has never been able to take advantage of this property because even though it can live virtually forever, it's always tripping and falling down manholes.

Clumsy XEmacs. Clumsy!

I assume its propensity for inopportune heart attacks is a function of several things, including (a) old-school development without unit tests, (b) the need to port it to a gazillion platforms, including many that nobody actually uses, (c) a culture of rapid addition of new features. There are probably other factors as well.

I'm just speculating though. All I know is that it's always been very, very crashy. It doesn't actually matter what the reasons are, since there's no excuse for it.

Interestingly, most XEmacs users I've talked to say they don't notice the crashing. I'm sure this is because it's all relative. XEmacs doesn't crash any more often than Firefox, for instance. Probably less often. When Firefox crashes I make a joke about it and restart it, because the crashing rarely has an impact. It even restores your state properly most of the time, so it's just a minor blip, an almost trivial inconvenience, so long as whatever text field you happen to be editing has an auto-save feature. And most of the good ones do.

XEmacs may crash even less than Eclipse and IntelliJ. Crashing editors usually aren't a big problem. Programmers all learn the hard way to save their buffers frequently. For me, saving is like punctuation; I save whenever my typing pauses, out of reflex. Doesn't matter whether it's Emacs or NeoOffice or Gmail or... do I use any other apps? Oh yeah, or the GIMP. When I pause, I save, and if you're a programmer I bet you do too. So occasional crashes may seem OK.

Another reason the crashes aren't called out more often is that most Emacs and XEmacs users are at best casual users. They open up an {X}Emacs session whenever they need to edit a file, and close it when they're done. It's just Notepad with colors and multi-level Undo.

If your average session length is shorter than the editor's MTBF, then yeah, you're not going to notice much crashing.
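To put a rough number on that, here is a minimal sketch. It assumes crashes arrive at a constant average rate (an exponential failure model), which is an assumption made for illustration and not something measured on any real editor. The function name and the sample session lengths are hypothetical.

```python
import math


def crash_probability(session_hours: float, mtbf_hours: float) -> float:
    """Chance of at least one crash during a session.

    Assumes crashes arrive at a constant average rate, so the time to the
    next crash is exponentially distributed with mean `mtbf_hours`.
    """
    return 1.0 - math.exp(-session_hours / mtbf_hours)


# Hypothetical numbers: an editor with a one-day MTBF.
MTBF_HOURS = 24.0

# A casual user who opens the editor for 15 minutes to fix one file.
casual = crash_probability(session_hours=0.25, mtbf_hours=MTBF_HOURS)

# A resident user who leaves the same session running for a week.
resident = crash_probability(session_hours=24.0 * 7, mtbf_hours=MTBF_HOURS)

print(f"15-minute session: {casual:.1%} chance of a crash")
print(f"week-long session: {resident:.1%} chance of a crash")
```

Under that model the casual user sees a crash in roughly one session out of a hundred, while the week-long session is all but guaranteed to die at least once. Same editor, same MTBF, very different experience.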

In contrast, your more... ah, seasoned (read: fanatical) Emacs users gradually come to live in it. Anything you can't do from within Emacs is an annoyance. It's like having to drive to a government building downtown to take care of some random paperwork they should have been offering as an online service a decade ago. You can live with it, but you're annoyed.

Even Firefox, the other big place I live, really wants to be Emacs. Tabs don't scale. Tabbed browsing was revolutionary in the same way adding more tellers to a bank was revolutionary: it's, like, 4x better. w00t. Emacs offers the unique ability to manage its open buffers in another first-class buffer, as a list. Imagine what your filesystem life would be like if the only view of a directory was one tab per file. Go to your pictures directory and watch it start vomiting tabs out like it tried to swallow a box of chiclets. Fun!

I feel sad when I see Eclipse users with fifty open tabs, an army of helpful termites eating away at their screen real-estate and their time.

I have a feeling I've veered off course somewhere... where was I? Oh yeah. Crashing.

So XEmacs has never been a particularly good tool for serious Emacs users because even though it's written in C, it crashes like a mature C++ application. You know the drill: major faceplants, all the fugging time.

Your ability to become an inhabitant of Emacs is gated by how stable it is. GNU Emacs has always been famously stable. Sure, the releases happen less frequently than presidential inaugurations. Sure, for a long time it always lacked some XEmacs feature or other. But it's really, really stable. Its MTBF is measurable in weeks (or even months, depending on what you're doing with it) as opposed to hours or days.

Emacs, like Firefox, can be configured to back up your state periodically, so that in theory it can recover after a crash. That's part of the problem: you didn't actually have to configure Firefox to get that behavior. It does it automatically. And to be honest, I've never had much luck with the Emacs save-state facilities. I'm a pretty competent elisp hacker these days, but desktop.el has never worked reliably for me. I could probably get it to work, but I've always found it easier to write specialized startup "scripts" (lisp functions) that load up particular favorite configurations.

If I can't get desktop-save working, I'd guess that fewer than 1/10th of 1 percent of Emacs users use that feature. So crashes blow everything away.

If the state isn't being auto-saved, the next best thing is for it not to crash.

XEmacs never got that right.

Don't get me wrong...

I just realized I'm going to get screamed at by people who think I'm just an XEmacs-hater slash GNU-fanboy.

Make no mistake: I'm a fan of XEmacs. I think it was a great (or at least, necessary) idea in 1991. I think the execution, aside from the stability issue, was top-notch. I think it had a good architecture, by and large, at least within the rather severe constraints imposed by Emacs Lisp. I think it spurred competition in a healthy way.

I think the XEmacs development team, over the years, has consisted of engineers who are ALL better than I am, with no exceptions. And I even like certain aspects of the interface better, even today now that GNU Emacs has caught and surpassed XEmacs in features. For instance, I like the XEmacs "apropos" system better.

If you're going to scream at me for irrational reasons, it really ought to be for the right irrational reasons. Legitimate dumb reasons for screaming at me include: you're lazy and don't want to learn anything new; you invested a lot of time in XEmacs and don't see why you should be forced to switch; you are a very slow reader, causing you to skip three out of every five words I write, resulting in your receipt of a random approximation of my blog content, with a high error bar; you're still mad about my OS X blog. All good bad reasons.

Heck, you could even scream for rational reasons. Perhaps you have a philosophical beef with the FSF or GPL3. Perhaps XEmacs still has some vestiges of feature support that do not yet exist in GNU Emacs, and you truly can't live without them. I would think you're being a teeny bit uptight, but I would respect your opinion.

Whatever you do, just don't yell at me for thinking I'm dissing XEmacs or taking some sort of religious stance. Far from it. I just want a unified Emacs-o-cratic party.

XEmacs vs. GNU Emacs today

GNU Emacs pulled into the lead in, oh... I'd say somewhere around maybe 2002? 2003? I wasn't really keeping track, but one day I noticed Emacs had caught up.

Even today I maintain XEmacs/FSF-Emacs compatibility for my elisp files – some 50k lines of stuff I've written and maybe 400k lines of stuff I've pilfered from EmacsWiki, friends, and other sources. I still fire up XEmacs whenever I need to help someone get un-stuck, or to figure out whether some package I've written can be coerced to run, possibly in restricted-feature mode, under XEmacs.

For years I chose stability over features. And then one day GNU Emacs had... well, everything. Toolbars, widgets, inline images, variable fonts, internationalization, drag-and-drop in and out of the OS clipboard (even on Windows), multi-tty, and a long laundry-list of stuff I'd written off as XEmacs-only.

And it was still stable. Go figure.

I don't have the full feature-compatibility list. Does it even exist? You know, those tables that have little red X's if the Evil Competitor product is missing some feature your product offers, and little green checkmarks, and so on. We ought to make one of those. It would be useful to know what (if any) XEmacs features are preventing the last holdouts from migrating to FSF Emacs.

But for the past five years or so, just about every time an XEmacs user on a mailing list has mentioned a feature that's keeping them from switching, it's been solved.

If GNU Emacs isn't a perfect superset of XEmacs yet, I'm sure we could get it there if we had the big unified-platform carrot dangling in front of us. And I bet it's pretty close already.

Features and stability aside, XEmacs is looking pretty shabby in the performance department. Its font-lock support has never been very fast, and a few years back GNU Emacs took a giant leap forward. XEmacs can take 4 or 5 seconds or longer to fontify a medium-sized source file. Sure, it shows that big progress bar in the middle of the screen, so you know it's not dead, but when you're used to it being almost instantaneous, coming back to XEmacs is a real shocker.

And XEmacs has bugs. Man, it has a lot of bugs. I can't begin to tell you how many times I've had to work around some horrible XEmacs problem. It has bugs (e.g. in its fontification engine and cc-engine) that have been open for years, and they can be really painful to work around. I've had to take entire mode definitions and if-xemacs them, using an ancient version of the mode for XEmacs because nothing even remotely recent will run.

You may not notice the bugs, but as elisp developers, we feel the pain keenly.

Fundamental incompatibilities

As if issues with stability, performance and bugs weren't enough, XEmacs has yet another problem, which is that its APIs for dealing with UI elements (widgets and input events, but also including things like text properties, overlays, backgrounds and other in-buffer markup) are basically completely different from their GNU-Emacs counterparts. The two Emacsen share a great deal of common infrastructure at the Lisp level: they have mostly compatible APIs for dealing with files, buffers, windows, subprocesses, errors and signals, streams, timers, hooks and other primitives.

But their APIs range from mildly to completely different for keyboard and mouse handling, menus, scrollbars, foreground and background highlighting, dialogs, images, fonts, and just about everything else that interfaces with the window system.

The GUI and display code for any given package can be a significant fraction of the total effort, and it essentially has to be rewritten from scratch when porting from GNU Emacs to XEmacs or vice-versa. Unsurprisingly, many package authors just don't do it. The most famous example I can think of is James Clark's nxml-mode, which claims it'll never support XEmacs. I found that pretty shocking, since I thought it was basic Emacs etiquette to try to support XEmacs, and here James was cutting all ties, all public about it and everything. Wow.

But I totally understand, since I really don't want to rewrite all the display logic for my stuff either.

I'll be the first to admit: the API discrepancies are not XEmacs's fault. I can't see how they could be, given that for nearly all these features, XEmacs had them first.

For a developer trying to release a productivity package, it doesn't really matter whose fault it is. You target the platform that will have the most users. I don't know what XEmacs's market share is these days, but I'd be very surprised if it's more than 30%. That's a big number, but when you're an elisp hacker creating an open-source project in your limited spare time, that number can start looking awfully small. Teeny, even.

XEmacs should drop out of the race

At this point it's becoming painful to watch. GNU Emacs is getting all the superdelegates. That warmonger VIM is sitting back and laughing at us. But XEmacs just won't quit!

I'm sure there are a few old-timers out there who still care about the bad blood that originally existed between the two projects. To everyone else it's ancient history. As far as I can tell, there has been an atmosphere of polite (if subdued) cooperation between the two projects. Each of them has incorporated some compatibility fixes for the other, although it's still mostly up to package authors to do the heavy lifting of ensuring compatibility, especially for display code.

I haven't seen any XEmacs/GNU-Emacs flamewars in a long time, either. We're all just *Emacs users, keeping our community alive in the face of monster IDEs that vomit tabs, consume gigabytes of RAM, and attract robotic users who will probably never understand the critical importance of customizing and writing one's own tools.

When the Coke/Pepsi discussion comes up these days, it's usually an XEmacs user asking, in all seriousness, whether they should transition to GNU Emacs, and if so, would someone volunteer to help migrate their files and emulate their favorite behaviors.

Yes, someone will volunteer. I promise.

The dubious future of Emacs

I've got good news and bad news.

The good news is: Emacs is a revolutionary, almost indescribably QWAN-infused software system. Non-Emacs users and casual users simply can't appreciate how rich and rewarding it is, because they have nothing else to compare it to. There are other scriptable applications and systems out there — AppleScript, Firefox, things like that. They're fun and useful. But Emacs is self-hosting: writing things in it makes the environment itself more powerful. It's a feedback loop: a recursive, self-reinforcing, multiplicative effect that happens because you're enhancing the environment you're using to create enhancements.

When you write Emacs extensions, sometimes you're automating drudgery (always a good thing), sometimes you're writing new utilities or apps, and sometimes you're customizing the behavior of existing utilities. This isn't too much different from any well-designed scriptable environment. But unlike in other environments, sometimes you're improving your editing tools and/or your programming tools for Emacs itself. This notion of self-hosting software is something I've been wanting to blog more about, someday when I understand it better.

Eclipse and similar environments want to be self-hosting, but they're not, because Java is not self-hosting. In spite of Java's smattering of dynamic facilities, Java remains as fundamentally incapable of self-hosting as C++. Self-hosting only works if the code can "fold" on itself and become more powerful while making itself smaller and cleaner. I'm not really talking about macros here, even though that's probably the first thing you thought of. I'm thinking more along the lines of implementing JITs and supercompilers in the hosted runtime, rather than in the C++ or Java "hardware" substrate, which is where everyone puts them today.

I suspect (without proof) that in self-hosted environments, you can eventually cross a threshold where your performance gains from features implemented in the hosted environment outpace the gains from features in the substrate, because of this self-reinforcing effect: if code can make _itself_ faster and smarter, then it will be faster and smarter at making itself faster and smarter. In C++ and Java, making this jump to the self-reinforcing level is essentially intractable because, ironically, they have so many features (or feature omissions) for the sake of performance that they get in their own way.

To be sure, Emacs, the current crop of popular scripting languages, and other modestly self-hosting environments are all pretty far from achieving self-reinforcing performance. But Emacs has achieved it for productivity – at least, for the relatively small percentage of Emacs users who learn enough elisp to take advantage of it. There are just enough of us doing it to generate a steady supply of new elisp hackers, and the general-purpose artifacts we produce are usually enough to keep the current crop of casual users happy.

The bad news: the competition isn't the IDEs

I've argued that Emacs is in a special self-reinforcing software category. For productivity gains, that category can only be occupied by editors, by definition, and Emacs is currently way ahead of any competition in most respects. So most Emacs users have felt safe in the assumption that IDEs aren't going to replace Emacs.

Unfortunately, Emacs isn't immunized against obsolescence. It still needs to evolve, and evolve fast, if it's going to stay relevant. The same could be said of any piece of software, so this shouldn't be news. But it's particularly true for Emacs, because increasing numbers of programmers are being lured by the false productivity promises of IDEs.

They really are false promises: writing an Eclipse or IntelliJ (or God help you, Visual Studio) plugin is a monumental effort, so almost nobody does it. This means there's no community of building and customizing your own tools, which has long been the hallmark of great programmers. Moreover, the effort to create a plugin is high enough that people only do it for really significant applications, whereas in Emacs a "plugin" can be any size at all, from a single line of code up through enormous systems and frameworks.

Emacs has the same learning-curve benefit that HTML had: you can start simple and gradually work your way up, with no sudden step-functions in complexity. The IDEs start you off with monumental API guides, tutorials, boilerplate generators, and full-fledged manuals, at which point your brain switches off and you go over to see what's new on reddit. ("PLEASE UPMOD THIS PIC ITS FUNNY!")

And let's not even get into the Million Refactorings yet. It's a blog I've been working on for years, and may never finish, but at some point I'd like to try to show IDE users, probably through dozens or even hundreds of hands-on examples I've been collecting, that "refactoring" is an infinite spectrum of symbol manipulation, and they have, um, twelve of them. Maybe it's thirteen. Thirteen out of infinity – it's a start!

Programmers are being lured to IDEs, but the current crop of IDEs lacks the necessary elements to achieve self-hosting. So the only damage to Emacs (and to programmers in general) is that the bar is gradually going down: programmers are no longer being taught to create their own tools.

IDEs are draining users away, but it's not the classic fat-client IDEs that are ultimately going to kill Emacs. It's the browsers. They have all the power of a fat-client platform and all the flexibility of a dynamic system. I said earlier that Firefox wants to be Emacs. It should be obvious that Emacs also wants to be Firefox. Each has what the other lacks, and together they're pretty damn close to the ultimate software package.

If Emacs can't find a way to evolve into (or merge with) Firefox, then Firefox or some other extensible browser is going to eclipse Emacs. It's just a matter of time. This wouldn't be a bad thing, per se, but there's a good chance it would be done poorly, take forever, and wind up being less satisfying than if Emacs were to sprout browser-like facilities.

Emacs as a CLR

So Emacs needs to light a fire and hurry up and get a better rendering engine. Port to XUL, maybe? I don't know, but it's currently too limited in the application domains it can tackle. I realize this is a very hard problem to solve, but it needs to happen, or at some point a rendering engine will emerge with just enough editing power to drain the life from Emacs.

Emacs also needs to take a page from the JVM/CLR/Parrot efforts and treat itself as a VM (that's what it is, for all intents) and start offering first-class support for other languages. It's not that there's anything wrong with Lisp; the problem is X programmers. They only want to use X, so you have to offer a wide range of options for X. Emacs could be written in any language at all, take your pick, and it wouldn't be good enough.

RMS had this idea a long, long time ago (when he was making the rather controversial point that Tcl isn't a valid option for X), and it eventually led to Guile, which led more or less nowhere. Not surprising; it's a phenomenally difficult challenge. There are really only two VMs out there that have achieved even modest success with hosting multiple languages: the CLR and the JVM. CLR's winning that race, although it's happening in a dimension (Windows-land) that most of us don't inhabit. Parrot is... trying really hard. Actually, I should probably mention LLVM, which (like Parrot) was designed from the ground up for multi-language support, but took a lighter-weight approach. So let's call it four.

In any case, it's a very small group of VMs, and they still haven't quite figured out how to do it: how to get the languages to interoperate, how to get languages other than the first to perform decently, and so on.

This is clearly one of the hardest technical challenges facing our industry for the next 10 years, but it's also one of the most obviously necessary. And Emacs is going to have to play that game. I'm not talking about hacked-together process bridges like PyMacs or el4r, either — I mean first-class support and all that it entails.

I've mentioned the rendering engine and the multi-language support; the last major hurdle is concurrency. I don't know the answer here, either, but it needs an answer. Threads may be too difficult to support with the current architecture, but there are other options, and someone needs to start thinking hard about them. Editing is becoming a complicated business — too complicated for hand-rolling state machines.

Compete or die

So Emacs has some very serious changes ahead.

Let's face it: we're not going to see real change unless ALL the Emacs developers out there – today's crop of JWZs – band together to make it happen. But today we're divided. Two groups of brilliant C hackers working on separate, forked code bases? That's bad. Two groups of maniacal elisp hackers working on incompatible packages, or at best wasting time trying to achieve compatibility? Also bad.

Developers are starting to wake up and realize that the best "mainstream" extensible platform (which excludes Emacs, on account of the Lisp) is Firefox or any other non-dead browser (which excludes IE). Dynamic typing wins again, as it always will. Dynamic typing, property-based modeling and non-strict text protocols won the day for the web, and have resisted all incursions from heavyweight static replacements. And somehow the web keeps growing, against all the predictions and lamentations of the static camp, and it still works. And now the browsers are starting to sprout desktop-quality apps and productivity tools. It won't be long, I think, before the best Java development environment on the planet is written in JavaScript.

Emacs has to compete or die. If Firefox ever "tips" and achieves even a tenth of the out-of-the-box editing power of Emacs, not just for a specific application but for all web pages, widgets, text fields and system resources, Emacs is going to be toast. I may be the last rat on the ship, but I'm sure not going down with it; even _I_ will abandon Emacs if Firefox becomes a minimally acceptable extensible programmer's editor. This is a higher bar than you probably think, but it could happen.

We no longer need XEmacs to spur healthy competition. The competition is coming in hard from entirely new sources. What we need now is unity.

Then why not unify behind XEmacs?

I threw this in just in case you blew through the article, which I'd find perfectly understandable. To summarize, I've argued that XEmacs has a much lower market share, poorer performance, more bugs, much lower stability, and at this point probably fewer features than GNU Emacs. When you add it all up, it's the weaker candidate by a large margin.

Hence there's only one reasonable strategy: Hill, er, I mean XEmacs has to drop out of the race.

I'm really sorry about this. I'm a close personal friend of XEmacs, but I just can't endorse it anymore. I used to be a laissez-faire kinda guy, as long as you were using some flavor of Emacs. But at this point, if you're using XEmacs you're actively damaging not only your long-term productivity, but mine as well. So I'd like to ask you to think long and hard about switching. Soon.

If you're a local Emacs-Lisp guru, please offer your services to XEmacs users who would like to switch over. The more pain-free the migration is, the faster it will happen.

If you're a graphic artist, consider making a nice, tasteful "Euthanize XEmacs!" logo. Not that message, precisely, but something along those lines. Make sure it's tasteful. Perhaps "XEmacs is dead – long live XEmacs"? Yes, I think that would do nicely.

If you happen to know someone on the XEmacs development team, send them some chocolates, or movie tickets, or something. A thank-you, even. We should honor their service. But those guys are the most qualified on the planet to step in and start helping drive GNU Emacs forward, assuming the FSF team will have them. Emacs is in very bad shape indeed if they will not.

If you're a local system administrator, consider sudo rm -rf xemacs. Sorry, I mean consider politely asking your emacs-users mailing list if they might be willing to set a timeline for deprecating XEmacs, thereby starting the most massive flamewar in your company's history. Can't hurt!

If you're seeing red, and you skipped most of this article so you could comment on how amazingly lame this idea is, I recommend taking a little walk and getting some fresh air first.

If you're RMS, thank you for making Emacs and all that other stuff, and for keeping it free. Please be nice to those who wish to help. You're scary to the point of unapproachability, just 'cuz you're you.

XEmacs team, JWZ, and XEmacs package authors: thank you all for helping drive progress in the greatest piece of software of all time. I can only hope that someday I may have chops like that.

Now how about we turn this into the most famous reverse-fork in history?

Thursday, April 24, 2008

Settling the OS X focus-follows-mouse debate

I recently switched to using OS X full-time for all my client-side computing. Still using Linux on the backends, of course, at home and at work, but I now use Macs for my client machines.

I'm not a Mac fanboy. I'm sort of a wannabe Mac fanboy, but I'm not familiar enough with the OS yet (either as a user or as a programmer) to really rave about it. I will say this: it was kinda fun turning off that last Windows box for the last time.

My main reason for switching was that I'm getting old and the fonts look nicer. Pretty stupid reason, isn't it? I thought so too. But getting old kinda sneaks up on you. I've gone from preferring six-point font when I was twelve to 20-point font now that I'm 40. So at least for me, my ideal font point size appears to be (age/2).

That sucks.

One day I noticed that I could actually read the screen when I was browsing in the Apple store, and I did some experimentation and found that yes, I can actually read normal-person's fonts on the Mac. And they look kinda nice, too – the antialiasing engine seems to be smarter (and faster) than the ones I've seen on Windows and Linux.

So there ya go. Fonts. And now I have to learn all this new stuff, like what all those weird little symbols mean on the keys, and how to use the Finder, and what a "DMG file" is, and other stuff. But the screen looks soooo nice, so it's worth it. How do they do it? It's not just the fonts. OS X windows look whiter and cleaner than their Windows/Linux cousins running on the same display with similar video hardware. It's a mystery to me, but it's kinda cool.

Almost pain-free migration

I've been using a work-issued MacBook Pro laptop for the past year, and that helped a lot with the transition, since when you're on the road trying to get some work done, you have no choice but to figure out how to do basic OS tasks. So that was a nice, slow, reasonably pain-free way to teach myself the basic skills you need.

I only bring this up because I know a lot of programmers (myself included) who've tried Macs repeatedly and run away scared. If you stick with it a little longer, it's not too bad! Particularly with their latest OS X release (Leopard), it's gotten a lot easier to do basic configuration for people accustomed to Linux.

For starters, it comes with a good X11 implementation, and there is a MacPorts project that ports all your favorite Unix stuff. And they're not lame half-broken ports like the ones you have to live with in Cygwin. For instance, in a Bash shell running inside Emacs you can ssh into a Linux box and not get a bunch of greebly control characters.

And OS X is Unix, with a BSD layer derived largely from FreeBSD, so all your normal favorite Unix stuff pretty much works the same, or at least as much as you can expect across different flavors of Unix.

The only real reason I was using Windows at all, to be honest, was for hardware-device compatibility and for multimedia. The Mac has drivers for everything I cared about (my router, my printer, my camera, etc.) and beats Windows hands-down on any sort of multimedia, so it was becoming clear that Windows wasn't buying me much anymore.

I suppose I could do a blow-by-blow guide for how a Unix-and-Windows user can configure their Mac for maximum happiness. If anyone's interested, anyway. Not today, though. The bottom line is that pretty much everything you don't like is configurable... with one ugly exception. And I don't mean "Mary Ann on Gilligan's Island Ugly", either. I mean Ugly ugly.

The Big Focus Issue

Ever since the Dawn of Time (Jan 1, 1970), people have been bitching about the lack of focus-follows-mouse on Mac computers. They started complaining about it fourteen years before the first Mac was even released, that's how bad it was.

Every time they bring it up on Mac forums, the Mac users with non-Unix backgrounds ask "what's that?" And then a bunch of wrong answers start flying around, with a few right answers interspersed but drowned out in the noise.

So let me tell you what it is first, in case you're not from a Unix background. Focus-follows-mouse means that when you move the mouse cursor, the window under the cursor gets the keyboard focus. But saying that confuses Mac people who all assume that "focused" is synonymous with "foreground", because that's the way it works on the Mac.

The confusion stems from the fact that focus-follows-mouse comes in not one, but two, yes that's right, two yummy flavors.

Flavor #1: autofocus — in this flavor, reminiscent perhaps of a sweet juicy mandarin orange, the window under the mouse gets the keyboard focus but does not come to the front. This allows you to interact with a partially-obscured window. It's especially useful when you have a terminal or shell window open, and it's running a background process that you want to observe... you guessed it, in the background! You leave a little bit of the bottom and/or side of the window uncovered so you can keep an eye on the output.

Real-life use case: let's say you're a programmer who writes in C++. You will, of course, spend most of your working day playing Solitaire and reading reddit, because C++ is too goddamned stupid to do anything but gigantic, slow batch compiles of the entire dependency universe. So you have at least four windows open at any given time: your editor, your compile shell, your browser, and your Solitaire game. You've spent a lot of time adjusting your window configuration to be "just right", and unless you have a 30-inch screen (for instance, because you work for Google), your windows overlap.

Watching your compile status is like checking your rear-view mirror; you do it every 7 seconds or so, even though you know the compiler will take a minimum of 15 minutes. It's like a slow-motion train wreck that you just can't tear your eyes from, even while playing Solitaire and reading reddit. And every once in a while you'll need to enter a command (e.g. "make", after you've fixed the umpteenth compiler warning about doing a perfectly valid type conversion). The last thing you want is to have to click the window to bring it to the front just so you can type "make", because then you'll need to go futz around with your window configuration again to get the window to go back to whatever Z-location it used to be in the window stack.

I know it doesn't sound like a big effort, but programmers are really, really lazy, and they like to minimize motion. They'd use feeder tubes if the Health Department would let them.

So in the autofocus flavor, it's important that the window that gets the focus does not automatically come to the front.

Flavor #2: autoraise — in this pungent flavor, somewhat evocative of a slightly overripe Durian fruit left in the tropical sun for about nine hours, moving the mouse into a new window automatically brings that window to the front. In the especially horrible default configuration, it comes to the front instantly, so the act of moving your mouse across the screen makes it look like that old "rectangles" screen saver, and your window configuration is utterly obliterated in under a second.
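
The difference between the two flavors is easy to see in a toy model. This is only a sketch of the concept, not anything from a real window manager: WindowStack and both function names are made up for illustration, and the stack is just an array of window ids ordered front to back.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WINDOWS 8

/* Toy window stack: ids[0] is the frontmost window, ids[count-1] the rearmost. */
typedef struct {
    int ids[MAX_WINDOWS];
    size_t count;
    int focused;   /* id of the window that receives keystrokes */
} WindowStack;

/* Flavor #1 (autofocus): keyboard focus moves, stacking order is untouched. */
void pointer_entered_autofocus(WindowStack *s, int id) {
    s->focused = id;
}

/* Flavor #2 (autoraise): focus moves AND the window jumps to the front. */
void pointer_entered_autoraise(WindowStack *s, int id) {
    size_t pos = 0;
    while (pos < s->count && s->ids[pos] != id) pos++;
    assert(pos < s->count);              /* id must already be in the stack */
    for (; pos > 0; pos--)               /* shift everything in front of it back by one */
        s->ids[pos] = s->ids[pos - 1];
    s->ids[0] = id;
    s->focused = id;
}
```

With three windows stacked as 10, 20, 30, pointing at window 30 under autofocus leaves the order alone and only changes who gets the keystrokes. Under autoraise the same gesture reorders the stack to 30, 10, 20, which is exactly the layout-wrecking behavior described above.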

Many programmers feel that autofocus is a delicate butterfly and autoraise is a big, stinky buffalo. That's just how they feel about it. No accounting for taste. I, for one, think of autoraise as a big, stinky, deceased buffalo carcass that someone thoughtfully dragged into my living room while I was on vacation, probably towards the beginning of the vacation, and then they turned up my thermostat to 110°F, closed the windows and tossed a Durian fruit at the wall for good measure.

But maybe it's just me.

So one of the most annoying aspects of the whole "how do I get focus-follows-mouse behavior on my Mac" debate is that everyone assumes you mean autoraise. There are a number of packages out there, most of them commercial, that offer autoraise as a feature, and Mac users point you to these products and then get all smugly about how they've solved your problem and how Macs still rule the universe, when in fact the problem is still festering away.

It's no wonder people still use Linux as their UI. That one feature alone keeps hordes of programmers from switching. (And yes, you can get the behavior on Windows using their TweakUI power tools, so some programmers use Windows as a Linux shell with a decent media player.)

Super-Stevey to the rescue (well, almost)

Given that I switched quite recently to the Mac, I'm still reeling from the lack of focus-follows-mouse behavior. To help you put yourself in my shoes, imagine that your latest operating system upgrade (whatever OS you happen to be running) includes a new mandatory feature wherein each time you click on a window to focus it, a loud alarm goes off ("BLONK! BLONK! BLONK! ...") and you have to open the System menu and select "silence window alarm" to shut it up.

That's what not having autofocus is like to people who've been using it for the past 10 to 30 years (in my case, 20 years). BLONK! BLONK! BLONK! I'm serious. It's that bad. Not exaggerating even a tiny bit.

I'm sure you could eventually get used to this behavior, and even find yourself arguing on newsgroups that you rather like the blonk blonk sound, since it reminds you that you've recently chosen to switch to another application or to another window within the current application, plus it's really not that big a deal because you can just open the System menu and turn it off.

It's amazing how so many people choose to rationalize stuff they're forced to live with. Why not just admit it sucks? Sometimes stuff sucks! C'mon, admit it! Jeez!

But even if you eventually managed to rationalize it, you'd be pretty fugging pissed off the first, oh, ten thousand or so times it happened to you after the upgrade.

So the other day, after the 100th or so BLONK! BLONK! BLONK! alarm when I innocently tried to type into a different window, I found myself quietly contemplating the pros and cons of getting an assault rifle, heading down to the Apple campus, and making my opinions known to all until the SWAT team took me out. And then I thought: "hey, as attractive as that option sounds right now, I have a better idea: why not fix it myself? I make occasional claims to being a programmer, right? How blonking hard can it be? I'll budget one evening for it."

I actually wound up spending 2 evenings on it, since although coming up to speed on the Apple tools and APIs was almost trivial, this particular issue turned out to be thorny in a variety of unexpected ways.

But I did get it working, for some definition of "working", and now I'm in a position to settle the debate for the foreseeable future, which I estimate to be about the next five to seven years in this case.

The Definitive Answer

Short version: you can almost get it working, but not quite, on account of an arguable bug in one of the Carbon (that is, Mac C++) APIs. What I got working was a system-wide autofocus mode that unfortunately only re-routes unmodified keys to the window under the cursor. You can fake the Shift modifier by translating the key code manually, but the Control, Command and Alt/Option keys never make it through to background applications. (They do get delivered to the foreground app if your mouse is there, so the bug only happens for background apps.)
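
Faking Shift can be pictured as a lookup from the unshifted character to the shifted one, done by hand before the key is re-posted as an "unmodified" key. The post doesn't show how the translation was actually done, so treat this as a sketch under assumptions: it hardcodes a US keyboard layout, and apply_shift_us is a made-up name.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Returns the character a US-layout keyboard produces when Shift is held
   while typing `unshifted`. Characters with no shifted form come back unchanged. */
char apply_shift_us(char unshifted) {
    static const char plain[]   = "`1234567890-=[]\\;',./";
    static const char shifted[] = "~!@#$%^&*()_+{}|:\"<>?";
    assert(sizeof plain == sizeof shifted);   /* the two tables must line up */

    if (unshifted == '\0') return unshifted;  /* strchr would match the terminator */
    if (islower((unsigned char)unshifted))
        return (char)toupper((unsigned char)unshifted);

    const char *hit = strchr(plain, unshifted);
    return hit ? shifted[hit - plain] : unshifted;
}
```

Nothing comparable works for Control, Command or Option, since those don't map to a different printable character. That is why only Shift can be faked this way.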

So if your use case is limited to, say, typing commands into a command shell, and you don't need to use the control, alt or command keys, then you can have working autofocus! In fact, you can even make it so that only your terminal windows (or any application list of your choice) get the autofocus behavior, and all other applications get the normal must-click-to-focus behavior. So even my "failed" attempt might yet hold some small utility for us ancient Unix hackers. I'll play with it for a while and see.
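
Restricting autofocus to a chosen set of applications amounts to an allow-list check before redirecting a key. A minimal sketch, assuming the owning application's name is already known as a C string; the list contents and the name wants_autofocus are placeholders, not values from the actual app.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical allow-list; edit to taste. */
static const char *const AUTOFOCUS_APPS[] = { "Terminal", "iTerm" };

/* True if keystrokes should be redirected to this app when the pointer is over it. */
bool wants_autofocus(const char *app_name) {
    assert(app_name != NULL);
    size_t n = sizeof AUTOFOCUS_APPS / sizeof AUTOFOCUS_APPS[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(app_name, AUTOFOCUS_APPS[i]) == 0)
            return true;
    return false;   /* everything else keeps click-to-focus */
}
```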

Long Version: As close as I came in my little 2-day effort, I now believe it's unlikely that autofocus will ever be available on Macs, sad as that news will be for thousands of would-be Mac users. And not just any users: they're programmers, all potentially capable of learning to write Mac applications and collectively enhancing the value of Apple's platform. So it's kind of a big opportunity cost for Apple. But there are both technical issues and design issues that make it a serious problem to support autofocus on Macs.

It's probably not impossible, but the cost is high enough that when their OS engineers think about tackling it, they'll probably decide it's not worth the effort, since the company seems to fail to appreciate just how big a stumbling block the lack of autofocus-sans-autoraise really is for so many competent Unix programmers out there.

So, Apple OS engineers, I'm not saying you're not smart. I'm just saying you're not smart enough. ;-)

Just kidding, of course, and I'll dispense with the child psychology. Here's why I think they're not going to fix it. The rest of this blog entry consists of boring technical details, so if you're getting antsy, please feel completely free to skip to the very end.

How it works

First, one caveat: I'm not a Mac programmer. I don't even play one on TV. I just downloaded Xcode (their development toolkit) for the first time three days ago. I've never written any Mac programs before this one, not even an AppleScript script, and I only started looking at their APIs a couple days ago. So I might be wrong about some or all of this.

The first problem you encounter is that there's no public Mac API for getting any sort of usable handle to a running application so you can interact with it programmatically. This is apparently for security reasons. I won't harp on this decision, although it does seem odd to deny sophisticated (read: sudo-enabled) users the choice of loading privileged apps into their system. Any application can run amok with your filesystem, personal data and network connection, so it seems odd that you'd arbitrarily choose not to let them also run amok with your other running apps.

In any case, there's a loophole. Apple, out of sheer generosity, goodwill, and the kindness of their heart o' hearts, and also partly because United States Federal Law requires it, but mostly out of sheer generosity, goodwill and the kindness of their hearts, has provided a set of "Accessibility APIs" that give you a certain federally mandated level of remote control over running applications in the system.

OS X actually has two more or less discrete sets of APIs: one for C/C++ (Carbon), and one for Objective-C (Cocoa). Cocoa incidentally also happens to be the API you use for Python and Ruby scripting on the Mac; I took a detour for a few hours and learned the basics of RubyCocoa, and was quite pleased at how well it worked.

One of the reasons I took the RubyCocoa detour was that the subset of the Accessibility APIs I needed for implementing autofocus is fairly cumbersome to explore using C++ and Xcode. I made an executive decision to spend (and potentially waste) some time seeing if I could make faster progress using one of the scripting APIs, because I was encountering bugs and/or unexpected behavior that called for some exploratory programming.

Carbon offers an abstraction called an AXUIElementRef, which is a proxy object representing a UI element (e.g. an application, a window, or widget) in any running app on the system. This subsystem is designed and implemented entirely using the Properties Pattern, which, as it happens, I'll be blogging about at length in the very near future. Normally this pattern is quite flexible, and I can fully understand their reasons for using it here: it gave them legal compliance with an absolute minimum of effort.

But the Properties pattern is healthiest in a dynamic environment that lets you poke around reflectively to get the names of properties, fetch their values, traverse parent links, and so on. Carbon provides APIs for manipulating all these UI-element properties with C++, but it really is cumbersome: lots of casting, lots of wrappers, lots of recompilation every time you want to try just one more thing. Call me spoiled, but I only budgeted a day for this feature!

So I learned a bit of RubyCocoa, and it appears – as far as I can tell – that the relevant Accessibility APIs are only available through Carbon, and not through Cocoa, which means if you want to use them, you can't use Objective-C, Ruby or Python. Or at least I couldn't find a way. If I'm wrong, someone please correct me, since I'd really like an experimentation platform that handles Carbon APIs that have no Cocoa equivalents.

Really Grubby Details

I told you it'd be boring! What are you still doing here?

OK, whatever. You're a glutton for punishment, I tell ya.

There are three basic components to the focus-follows-mouse solution:
  1. Create an "event tap" to filter all system keypresses.
  2. Find the frontmost window under the mouse.
  3. Redirect the keypresses to that window.
That's all there is to it. This is the solution I envisioned before I'd even downloaded Xcode, and unsurprisingly, it appears to be the only reasonable way to accomplish the task in OS X. I mean, how else would you do it?

The event-tap API is straightforward, with just one teeny, minor exception almost not worth mentioning, which is that it doesn't work. It compiles, runs, and fails silently. This took me several hours to figure out. It turns out that event taps are considered to be part of the Accessibility APIs, and for security reasons, your process either has to be running as root, or you have to enable "assistive technologies" in the Universal Access section of System Preferences. I stumbled across this in some random newsgroup after a LOT of searching. In retrospect it was kinda there in the API documentation, but they didn't make it super clear.

Whew! There went several hours down the drain, but now I had a C program that fired up and listened for keypresses, printing them to stdout. The event-tap API gives you the option of swallowing the keypresses (or changing the event, or even returning a new event to replace the old one), so it's plenty flexible enough for our needs.
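
The swallow-or-pass decision can be modeled without any Mac headers. In this toy version the callback returns the event to let it through or NULL to swallow it, which mirrors how an active event tap's callback is expected to behave. KeyEvent, TapState and autofocus_tap are illustrative names, not the real Quartz types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { unsigned short keycode; bool key_down; } KeyEvent;  /* toy stand-in for a real event */

typedef struct {
    bool redirect_enabled;   /* true when the pointer is over a background window */
    int redirected;          /* how many events were rerouted so far */
} TapState;

/* Return the event to pass it through unchanged, or NULL to swallow it. */
KeyEvent *autofocus_tap(KeyEvent *event, void *context) {
    TapState *st = context;
    assert(event != NULL && st != NULL);
    if (!st->redirect_enabled)
        return event;        /* pointer is over the foreground app: leave the key alone */
    st->redirected++;        /* a real tap would re-post the key to the background app here */
    return NULL;             /* swallow it so the foreground app never sees the key */
}
```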

Next, I needed to find the window under the cursor, which first meant finding the global cursor position. This also turned out to be surprisingly non-obvious. The best solution I found, from someone's blog, was to create a NULL event and then get its mouse coordinates. So intuitive! Just like Mom used to do it!

Sigh.

Once you have your mouse coordinates, you use a "hit test" to find the window at that screen position. It's one of two Carbon-only APIs you need: you create a proxy for the system-wide UI object with AXUIElementCreateSystemWide, and pass it to the global hit-test function AXUIElementCopyElementAtPosition to get the UI element under the mouse.
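
Conceptually, a hit test is just "walk the windows front to back and return the first one containing the point". The sketch below shows that idea with plain rectangles. It is not how AXUIElementCopyElementAtPosition is implemented, and the Window struct and hit_test name are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { double x, y, w, h; int id; } Window;   /* toy window rectangle */

/* windows[] is ordered front to back. Returns the frontmost window containing
   the point, or NULL if the point lands on the bare desktop. */
const Window *hit_test(const Window *windows, size_t count, double px, double py) {
    assert(windows != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        const Window *w = &windows[i];
        if (px >= w->x && px < w->x + w->w &&
            py >= w->y && py < w->y + w->h)
            return w;
    }
    return NULL;
}
```

Because the search stops at the first match, a partially covered window is only returned when the pointer is over one of its visible parts, which is the behavior autofocus needs.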

Then it gets a little ugly, though not terribly so. These AXUIElementRef objects have all their information in property lists. This would be trivial to navigate in RubyCocoa, but the AXUI API set doesn't seem to exist in Cocoa — specifically the parts that deal with "any running application" rather than "your application".

So you grub around in the object and its parent chain, clunkily printing stuff in C++ and releasing reference-counts, until you find an ancestor with a "role" attribute of "application". That's the element you need for delivering keyboard events.
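
Stripped of the reference counting and casting, the parent-chain walk looks roughly like this. It's a sketch on a made-up Element struct rather than real AXUIElementRef objects, and the plain "application" role string follows the wording above rather than whatever constant the real API uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Element {
    const char *role;               /* e.g. "button", "window", "application" */
    const struct Element *parent;   /* NULL at the top of the chain */
} Element;

/* Walks up from e until it finds an element whose role is "application".
   Returns NULL if the chain ends without one. */
const Element *owning_application(const Element *e) {
    for (; e != NULL; e = e->parent) {
        assert(e->role != NULL);
        if (strcmp(e->role, "application") == 0)
            return e;
    }
    return NULL;
}
```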

We can deliver the keyboard event to the unfocused window, through this poorly-documented API call:
AXError AXUIElementPostKeyboardEvent(AXUIElementRef application,
                                     CGCharCode keyChar,
                                     CGKeyCode virtualKey,
                                     Boolean keyDown);

All good so far. Excluding the time spent figuring out the access-control problem with event taps, and the time spent playing with RubyCocoa, I'm only about five hours into the whole endeavor.

Oh yeah, and the time spent dicking with Xcode trying to figure out how to add a library build target to the executable. I've done it two or three times now and still can't remember how I did it.

The AXUIElementPostKeyboardEvent documentation points you to its non-Accessible cousin CGPostKeyboardEvent, which is also more or less undocumented. Can you tell they really don't want you to use this stuff? This cousin function has a teeny bit of explanation in its header file CGRemoteOperation.h, which Xcode provides no easy way of locating via search. You can look for it in Spotlight, hoping you'll get lucky and it won't hang like it did for me just now. Or you can do what I did and just use Google Desktop search to pop to it instantly.

The explanation in the header file says:
/*
 * Synthesize keyboard events. Based on the values entered,
 * the appropriate key down, key up, and flags changed events are generated.
 * If keyChar is NUL (0), an appropriate value will be guessed at, based on the
 * default keymapping.
 *
 * All keystrokes needed to generate a character must be entered, including
 * SHIFT, CONTROL, OPTION, and COMMAND keys. For example, to produce a 'Z',
 * the SHIFT key must be down, the 'z' key must go down, and then the SHIFT
 * and 'z' key must be released:
 *     CGPostKeyboardEvent( (CGCharCode)0, (CGKeyCode)56, true ); // shift down
 *     CGPostKeyboardEvent( (CGCharCode)'Z', (CGKeyCode)6, true ); // 'z' down
 *     CGPostKeyboardEvent( (CGCharCode)'Z', (CGKeyCode)6, false ); // 'z' up
 *     CGPostKeyboardEvent( (CGCharCode)0, (CGKeyCode)56, false ); // 'shift up
 */

That's it. That's what they give you. Open questions about the explanation include (a) why are they passing capital-'Z' if they already reported that the shift key was down, (b) if there's a guesser when you pass NULL, why do you need to pass 'Z', (c) how do you get the char code for a given key code, and is it the same on all keyboards, and (d) why didn't they include a "THE FIRST PARAGRAPH IS A LIE" disclaimer around the first paragraph?
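
For reference, the sequence the header comment describes (modifier down, key down, key up, modifier up) can be written out as data. This sketch only builds the list of posts; it never calls CGPostKeyboardEvent. KeyPost and keystroke_posts are hypothetical names, and the key codes 56 and 6 are simply the ones that appear in the header's example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { unsigned short keycode; bool key_down; } KeyPost;

enum { KEYCODE_SHIFT = 56 };   /* value taken from the header comment's example */

/* Fills out[] with the posts needed for one keystroke and returns how many
   were written: 4 when Shift must be held, otherwise 2. */
size_t keystroke_posts(unsigned short keycode, bool with_shift, KeyPost out[4]) {
    assert(out != NULL);
    size_t n = 0;
    if (with_shift) out[n++] = (KeyPost){ KEYCODE_SHIFT, true };    /* shift down */
    out[n++] = (KeyPost){ keycode, true };                          /* key down   */
    out[n++] = (KeyPost){ keycode, false };                         /* key up     */
    if (with_shift) out[n++] = (KeyPost){ KEYCODE_SHIFT, false };   /* shift up   */
    return n;
}
```

Calling keystroke_posts(6, true, buf) yields the same four steps, in the same order, as the 'Z' example in the header.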

Open questions be damned. We fearlessly press on, and just pass "whatever" and see what happens. Specifically, I always pass NULL for the char code, and pass the key code I got from the event tap callback as-is.

And it worked! Sort of! I start my little app (which has no UI), move the mouse into a window from a non-foreground application, and I can type into it!

Except I can only type unmodified keys. It's completely ignoring my keyboard event posts for Shift, Control, Alt and Command. That's the lie part. They said they'd generate flags changed events. They lied.

After some more painful C++ experimentation, I find that the call does NOT ignore modifier keys when posting to (a) the focused application or (b) the system-wide application, which just posts to the focused app. The call only drops modifier keys on the floor for non-focused apps.

There's a big ol' thread about this exact problem from six years ago on the Apple accessibility-dev mailing list. Six years! I read every last word of the thread.

The first takeaway is that Apple OS engineers don't want you to do stuff that they don't want you to do, and they specifically define "stuff they don't want you to do" as "stuff they don't think you want to do." This is actually endemic to Apple forums in general. Whenever someone says "I want focus follows mouse behavior!", some people inevitably reply that "you really don't want to do this". It's that whole "we designed it the right way for everyone" mentality that turns off so many would-be Mac users.

For what it's worth, the Apple engineers really were trying to be helpful in this guy's situation, and I know how hard it can be to respond to a mailing list in the capacity of "developer representing the company". But it took them a long time to understand his needs, because (and I'm speculating here) they implemented the Accessibility APIs only because their Mom told them to, and they don't truly appreciate at a deep level what it means to have a disability, and how important it is for many people to be able to choose a different UI paradigm.

And yes, I am taking the arguably un-PC position that having your fingers hardwired to focus-follows-mouse from 20 years of use can be viewed as a minor disability of sorts. I'll be the first to admit that it's not the kind of "real" disability the government probably had in mind when they mandated the Accessibility APIs.

But I did switch to the Mac because my eyes are slowly beginning to fail. Ironic that I should be forced to trade one disability for another.

The underlying issue

OK, let's assume for the moment that Apple really does have our best interests at heart, and that they can get over the painful notion that their usability test findings may not actually apply to 100% of all users 100% of the time.

Even if they wanted to help fix it — and in this case, all they'd need to do is NOT drop the modifier keys on the floor when you call AXUIElementPostKeyboardEvent — there are some deep-rooted architectural issues at play here, and I finally "got it" while reading that thread.

The problem is that Macs, always and forever, have put the menu bar of the focused application at the top of the screen. The menu bars of unfocused applications are hidden and are not in any way user-interactible.

As you might expect, this UI assumption has been baked into the Mac APIs from the very beginning. Programmers will take advantage of any axiom they can in order to get things working, so over time this has turned from a UI design assumption into an architectural "feature".

In particular, when an app is in the background, its menu structure may not be intact, and the app may be in a state that assumes it will not be receiving any keyboard input. One concrete example mentioned in the thread was that when an app is in the background, the child menu items do not have parent links (although the parents still have pointers to the children.)

This has serious ramifications for focus-follows-mouse. There are certain built-in hotkeys that can activate menu entries, and apps are also free to define their own. If you try to activate a menu in a background application, it could in theory wind up crashing the app, if the app is assuming an intact menu structure and is traversing bottom-up rather than top-down.
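
To make the failure mode concrete, here's a toy menu tree with a bottom-up walk. The thread doesn't show Apple's actual data structures, so MenuItem and menu_depth are invented. The point is only that code walking child-to-parent has to tolerate a missing link, and code written under the "I'm always in the foreground" assumption might not bother.

```c
#include <assert.h>
#include <stddef.h>

typedef struct MenuItem {
    const char *title;
    const struct MenuItem *parent;   /* may be NULL while the app is in the background */
} MenuItem;

/* Counts ancestors by walking bottom-up. The NULL check makes a severed
   parent link end the walk early instead of dereferencing a bad pointer. */
size_t menu_depth(const MenuItem *item) {
    assert(item != NULL);
    size_t depth = 0;
    for (const MenuItem *p = item->parent; p != NULL; p = p->parent)
        depth++;
    return depth;
}
```

A version that fetched item->parent->parent without checking would be fine in the foreground and would blow up the moment a hotkey arrived while the links were severed.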

You could attempt a knee-jerk solution by allowing Control and Alt through but denying the Command modifier, since that's the most common menu-activation modifier (I think). But there's another class of applications (Emacs included) that dynamically generates at least part of its menu structure based on the data content. For instance, the Emacs Imenu package generates a list of jump targets from a source-code buffer. Even typing a new function definition could still trigger a rebuild of the Imenu, which (for all anyone knows) could crash Emacs.

You could of course ask app developers to fix this on an application-by-application basis, but there are generations of legacy apps that can never be fixed. The only way to guarantee that pressing a key could never crash an application would be to fix the OS X user-interface architecture to normalize application behavior for foreground and background operation. This could be hard. It could expose its own set of difficult or even intractable problems for legacy apps. Or maybe it's really easy. I don't know, since I can't see their code. But I suspect it's not easy.

And Apple has no real motivation to fix it, because their UI was designed for "everyone". People who would use focus-follows-mouse are presumably a tiny minority, so even if they're mostly programmers the cost/benefit likely isn't there.

It is, of course, completely fixable if Apple really wanted to fix it. I've heard many war stories over the years from Microsoft folks who've had to put compatibility hacks into OS releases, some quite extensive, in order to support popular applications that relied on undocumented OS behavior that suddenly broke. Imagine those poor guys that had to implement perfect DOS/Windows emulation on NT, for example. I suspect that by comparison, fixing focus-follows-mouse would be relatively straightforward.

But I predict it won't happen in the next 5 to 7 years, unless the government suddenly decides that this API is required for properly assistive technologies.

It's interesting that you can get so close using the existing APIs: I have true focus-follows-mouse behavior implemented for non-modified keys. Sure, the window doesn't actually focus, so some applications don't even show a cursor. But if you're willing to live with occasional glitches, the feature works great.

In any event, what's weirdest about all this is that the API lets you send non-modifier keys to the background app, because as I pointed out, it's still possible for vanilla keys to crash applications! If the state of the app is materially different when it's unfocused, and the app isn't expecting keyboard input when unfocused, then it could crash. Dropping the modifier keys on the floor may reduce the probability of Badness, but it certainly doesn't eliminate the possibility.

Was it an accident that they let any keys through at all? That would surprise me greatly, since the OS engineers seem determined to close undocumented behavior loopholes. But if they had a reason (perhaps a legal reason) to permit unmodified keys through, what was it about the reason that lets them drop the modifier keys?

I wish I knew.

Open letter to the Apple Wish Fairy

If I could wave a magic wand, I'd ask for them to fix the API to pass the modifier keys along to the app, and just put a note in the docs that Bad Things could happen, so Buyer Beware.

They already do this for those exact APIs anyway. CGPostKeyboardEvent's documentation says: "This function is not recommended for general use because of undocumented special cases and undesirable side effects." And this is for the API that only talks to the focused foreground app! The AXUI version is obviously double-buyer-beware, and apps can't even use it without prompting for the superuser password, unless the user has purposely enabled assistive technologies.

There would be bugs, yes. Some applications would have to push out new releases to properly support focus-follows-mouse, and some legacy apps would never be fixed. But you could disable the behavior on an app-by-app basis, or just take a "Doctor, it hurts when I do this" approach.

Unfortunately I don't have a magic wand; all I have is my distinctly nonmagical blog, which I'm using to soothe myself via copious whining.

Epilogue

This whole thing has been an interesting lesson in how the government can actually force companies to open up their locked-down systems. The whole "you'll have it our way, and like it" mentality is crumbling with these assistive technologies. I hope the feds mandate opening things up even further, since we're only partway there so far.

In the interim, I'm sure I'll eventually get used to life without autofocus. BLONK! My 30-inch screen helps, as does Spaces, since it's easier to give windows their own non-overlapping real estate.

I might even give autoraise another try. Some developers have implemented it in a horrible way, by generating a mouse click when you move the mouse into a window, which often results in activating UI objects unintentionally just by moving the mouse. Ouch. But there might be some commercial implementations that do it "right", or I can just hack my autofocus app to do it that way. Combined with an autoraise delay and minimizing my overlapping windows, it might just work. We'll see.

And everything else so far about the Mac has seemed pretty nice, or when it's been not nice, at least it's been fixable with a little configuration effort. I liked the OS X APIs on the whole, and the RubyCocoa thing is pretty sweet. I might even wind up writing some native clients — something I thought I'd never do again, given how awful my Windows native-client experiences have been over the years.

So I'll keep using my Macs. They're all just plumbing for Emacs, anyway. And now my plumbing has nicer fonts.

If you skipped straight to the end...

...you didn't miss much. See you next time!