Tuesday, June 26, 2007

Rhino on Rails

What a day. Apparently getting John Lam'd is worse than getting Slashdotted. My team has been laughing at me all day; we have no idea how I get into messes like this.

Rather than reply individually to the crush of email, I guess I'll just do a bulk update.

But first: is anyone else as nonplussed as I am that of all the amazing things that were discussed at Foo Camp, my little improvised talk -- complete with a picture of me that for some reason looks as if I'd just crawled out of a tent 20 minutes prior, hung over and disoriented and wondering why I was in a field in the middle of Sebastopol, CA, only to find that the night before I'd apparently signed up to give a 10am talk -- winds up being splashed all over everyone's blog, not to mention my inbox?

I mean, Foo Camp was truly amazing. There were just insanely brilliant people there, all presenting slick, well-prepared talks and roundtable discussions on vitally important topics such as improving democracy through technology, measuring the health of social networks, addictive collaboration in programming competitions, and the future of Open Source software. And here I was making the technological equivalent of armpit noises in an attempt to cover my complete lack of preparation, not to mention lack of a shower, and somehow that's all I hear about when I get back.

Sigh.

So here's the deal. I work at Google. No, let me put it a bit more accurately: I am privileged to work at Google. It's a privilege. It's something that I have no expectations about. If you were a slovenly teenager with a rich uncle who for some reason liked you and let you play golf at his $250k-per-year membership country club for free whenever you liked, you couldn't have lower expectations than I do about how long it might last. Not because my rich uncle won't last, mind you, but because teenagers just have a way of ruining situations like that. You just enjoy it while you can.

In addition to working at Google, I also happen (independently) to have this ridiculously over-subscribed blog with almost no actual content and far too many lowbrow booger jokes. Sometimes people seem to get a little confused, and think that maybe I (the teenager) have some sort of influence with my magnanimous uncle Google, or that I speak on his behalf, or some other such nonsense.

Let's get real for a sec, folks. I'm nobody at Google. I'm a very comfy, very small fish in a sea full of large, toothy sea creatures, if I may (*ahem*) be permitted to invest the metaphor with its most flattering possible meaning. And I certainly don't speak for anyone but myself.

With that said, I figured I'd give you the outline of my Foo Camp talk, which will hopefully answer most of the questions that have been flying around since I got back.

By way of background, I'm on a really cool project in Kirkland -- one that I can't talk about, alas, except to say that I think it's the veritable cat's meow. But I can't talk about it. So it goes.

Among other things, this meowing project requires a snazzy, Ajax-enabled web frontend. It's part of a larger herd of projects purring along on Java infrastructure, so our project also has to run on the JVM.

One of the (hundreds of) cool things about working for Google is that they let teams experiment, as long as it's done within certain broad and well-defined boundaries. One of the fences in this big playground is your choice of programming language. You have to play inside the fence defined by C++, Java, Python, and JavaScript.

My project's playground is actually a bit smaller than that, since it has to run on the JVM. But that's still a pretty big playground, and trying out (or creating) new web frameworks is totally fair game for experimentation, even if you're in a relatively high-profile domain.

Another boundary we have to play within is software engineering, including unit testing, documentation, code refactoring, security, scalability, internationalization, and a host of other non-negotiable criteria.

I hope you're beginning to see, at least faintly, why I love working at Google. It's because the code base is clean. And anything that takes more than a week of effort requires a design document, with specific sections that have to be filled out, and with feedback from primary and secondary reviewers of your choice. The net result is that for any significant piece of code at Google, you can find almost a whole book about it internally, and a well-written one at that.

I've never seen anything like it before, to be honest. I don't think you can get that kind of software-engineering discipline without doing it right from the start, and creating a culture that enforces and reinforces that discipline as the organization grows.

Whatever. I digress. My point is that as long as you play inside the fences, and you don't foolishly disregard the wealth of infrastructure for distributed computing that Google offers, then you have a great deal of leeway for experimenting with new approaches.

As it happens, I kinda like Rails. I like Ruby, too, independent of Rails. But for making web pages, Rails is the nicest thing I've used. And given the surprising number of titles available about our heroine Rails and her mute little woodland creature sidekicks Prototype and Scriptaculous, not to mention the vast number of Rails clones out there in other languages, I'd say DHH was really on to something unique and significant when he created it. Kudos.

Ruby is on the other side of the fence. We can see it over there, rolling around in the grass and pretending it's having fun, hoping we'll invite it in. Sometimes it even comes snuffling right up to the edge of our playground, and we have to throw a giant knife switch to engage the high-voltage coils, and there's a loud hum while Ruby's hair stands all the way up, and it backs away slowly. It tried digging a hole once, but... well, you get the idea.

The thing is, Google's decision to limit the number of languages allowed for production work is actually pretty smart. It's one of those things that might sting for a few weeks after you start (and not to put too fine a point on it, but why aren't you sending me your resume?), and if you're a bumbling fool like me it might take you a little longer to figure it out. But programming languages have a core of standard functionality that everyone knows, followed by a long tail of murky semantics that become increasingly difficult to figure out, especially for dynamic languages with no actual spec like Perl, Python, Ruby and their ilk. Google very prudently keeps the number of languages as small as practical, so that we can build up large groups of experts on the semantics of the languages we've chosen.

It's also to cut down on the combinatorial explosion of components required for language interoperability -- a huge tax at other companies I've been at, a tax that Google has managed, by and large, to minimize.

Google has set up an environment that makes it pretty easy for people to transfer between groups, and even between sites. You have to be a grown-up about it, of course, and not ditch your team during some critical crunch time. But provided you exercise mature judgment in the matter, you can pretty much switch teams whenever you want, and it doesn't cause the company to grind to a halt, as such a policy would at other places I've worked or visited.

It's NOT easy to set up that kind of environment, and it's certainly not easy to make it scale. There are many ingredients to the full recipe, far beyond the scope of this little blog entry, but one of the keys is to make the code base homogeneous across projects, teams, sites, and even (to the maximum extent practical) languages. That helps minimize the learning curve, and thus the resistance, when people are moving between groups.

So now you have all the key factors that went into my decision to port Ruby on Rails. I like Rails, and wanted to use it on our fancy new project in Kirkland. I needed to be on the JVM for interoperability with the rest of our code base. (Take my word on this one -- it's not something I could have solved with RPCs; I needed to be running on the JVM itself for this.) Being on the JVM rules out C++ and (native) Python.

My only choices, then, were Java, Jython and Mozilla Rhino, which is an implementation of server-side JavaScript on the JVM.

Looking at the Rails APIs, which typically pass around hash literals as parameters, it didn't seem like Java was going to be a particularly good match. Ruby has no method overloading, and both Ruby and Rails have all sorts of conventions for passing varargs or block (function) parameters. Java's different enough from Ruby that the impedance mismatch might have killed the effort, so I decided to stick with a dynamic language.
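To make the hash-literal point concrete, here's a rough sketch of how a Rails-style helper call maps onto a JavaScript object literal. The `linkTo` function, its defaults, and its URL layout are all hypothetical, made up for illustration; none of it is taken from the actual port.

```javascript
// Hypothetical helper, loosely modeled on Rails' link_to.
// The name, defaults, and URL layout are assumptions for illustration only.
function linkTo(text, options) {
  // Treat a missing options argument like an empty hash.
  var opts = options || {};
  var controller = opts.controller || "home";
  var action = opts.action || "index";
  var href = "/" + controller + "/" + action;
  if (opts.id !== undefined) {
    href += "/" + opts.id;
  }
  return '<a href="' + href + '">' + text + "</a>";
}

// Ruby:       link_to "Edit", :controller => "posts", :action => "edit", :id => 7
// JavaScript: the object literal plays the role of the Ruby hash.
var html = linkTo("Edit", { controller: "posts", action: "edit", id: 7 });
```

The call site stays about as readable as the Ruby original, which is much harder to pull off with Java's fixed, positional method signatures.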

Jython seemed like the obvious choice at first, except that it hasn't had much momentum since around 2001. It used to be arguably the best non-Java JVM language around; Jim Hugunin did an amazing job with it. Unfortunately it's had almost no love in the past six years, and it's now lagging CPython by several releases (2.2 vs. soon-to-be 2.6).

Rhino, in contrast, has a great deal of momentum. It's been around even longer than Jython; it began life as a port of the SpiderMonkey (Netscape/Mozilla) JavaScript C engine, and it was written with an eye for performance. The Rhino code base reads almost like C code: it avoids allocation and does as much as possible with jump tables to avoid the overhead of virtual method lookups. It has two code paths: a bytecode interpreter that runs in a tight loop, and an optimizing Java bytecode compiler that turns many expensive-ish JavaScript property lookups into Java local or instance-variable lookups. It's a pretty serious piece of software.

When you start digging into Rhino, you find unexpected depth. JavaScript (unlike Perl, Python and Ruby, at least today) actually has a real specification, and Rhino follows it rigorously, aiming for complete SpiderMonkey compatibility within the bounds allowed by the different language platforms. Rhino also offers rich configurability, has well-defined multi-threading semantics, has a full set of hooks for debugging and profiling, and much more besides. There's a lot under the hood.

Oh, and Sun is now bundling it with the JDK; it's the JavaScript engine behind javax.script in Java 6. With that kind of endorsement it would have been hard to justify using anything else, even if some language other than Jython or JavaScript had been up for consideration.

Of course, the downside is that it's JavaScript, right?

JavaScript is admittedly a bit quirky, but keep in mind that this is server-side JavaScript we're talking about, and on the JVM to boot. So libraries aren't an issue; there are plenty. And browser incompatibilities aren't an issue either -- there's only one Rhino, and it works as advertised. But we've been able to go a step further and make some fundamental extensions that have made it almost Ruby-esque.

As just one example, in client-side JavaScript there's currently no way to define a property that's non-enumerable, which implies that you can't add new functions or properties to Object.prototype, since they'll suddenly start showing up in your object literals (which are JavaScript's hashes, more or less). But in Rhino you just create a Java method that calls into the Rhino runtime to define non-enumerable properties, and you can extend Object.prototype to your heart's content. So we went wild, and added pretty much every interesting Ruby (and Python) built-in method on the built-in classes (Object, String, Array and the like).
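Here's a small sketch of the enumerability problem and the non-enumerable fix. One loud caveat: it uses `Object.defineProperty`, which comes from the later ECMAScript 5 standard, as a stand-in for the Java-side call into the Rhino runtime described above, so it illustrates the idea rather than how we actually did it. The `leaky` and `isEmpty` names are hypothetical.

```javascript
// Collect every name a for...in loop sees on an object (own and inherited).
function enumerableNames(obj) {
  var names = [];
  for (var name in obj) {
    names.push(name);
  }
  return names;
}

var hash = { a: 1, b: 2 };

// Plain assignment creates an enumerable property, so it shows up in every "hash".
Object.prototype.leaky = function () { return "oops"; };
var polluted = enumerableNames(hash);  // now includes "leaky"
delete Object.prototype.leaky;

// Hypothetical Ruby-style method, defined as non-enumerable.
// Assumption: an ES5-or-later engine that provides Object.defineProperty.
Object.defineProperty(Object.prototype, "isEmpty", {
  value: function () { return Object.keys(this).length === 0; },
  enumerable: false,
  writable: true,
  configurable: true
});
var clean = enumerableNames(hash);     // only "a" and "b"
```

With the property hidden from `for...in`, object literals keep behaving like plain hashes even though every object now answers to `isEmpty()`.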

JavaScript has also begun evolving again after a long hiatus, and its 1.7 features in Firefox 2, mostly borrowed from Python, are really pretty snazzy.
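For a taste of what those Python-flavored features look like, here's a short sketch covering `let`, destructuring assignment, and generators. It's written in the syntax that was later standardized in ECMAScript 2015 so it runs on current engines; that's an assumption on my part, and the JavaScript 1.7 spelling differs in places (generators there were plain functions containing `yield`). Array comprehensions, another 1.7 feature, never made it into the standard and aren't shown. The function names are made up for illustration.

```javascript
// Destructuring assignment with a block-scoped let, much like Python tuple unpacking.
function swap(pair) {
  let [first, second] = pair;
  return [second, first];
}

// A generator, much like a Python generator function.
// ES2015 spells this function*; JavaScript 1.7 used a plain function with yield.
function* fibonacci(limit) {
  let [a, b] = [0, 1];
  while (a <= limit) {
    yield a;
    [a, b] = [b, a + b];
  }
}

var swapped = swap([1, 2]);            // [2, 1]
var fibs = Array.from(fibonacci(20));  // [0, 1, 1, 2, 3, 5, 8, 13]
```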

Rhino doesn't have those features yet, so we started a 20% project to add them, and people all over the company are signing up to help.

Did I mention how cool it is to work for Google? Well, I mentioned it a few months back to Norris Boyd, the original author of Rhino, and he thought it sounded like a fine idea. Now that he works for Google too, the Rhino 20% work is charging along like a large, gray, scaly African land-beast, if I do say so myself.

And then there's JavaScript 2.0, but the Foo Camp crowd heckled me mercilessly when I brought that up, so we didn't get to talk about it much. That's what I get...

Last thing on the menu today: I'll address the Rails port briefly, since I know there are questions. There really isn't all that much to say, though. Ports are kinda boring when you get right down to it.

I ported essentially all of ActionView, ActionController and Railties, plus a teeny bit of ActiveRecord, although for the most part we were already constrained in how we talk to our backends. Long-term, I think the story is going to be Hibernate.

I didn't use a code generator, since I've never met one that I liked. I just used Good Ole Emacs and a whole lot of keyboard macros. Initially I started by trying to do my own implementations of the Rails "spec", as it were, based on the Pickaxe book, but I pretty quickly found that Rails is its own most succinct implementation, with very few exceptions. (I think the routing engine could be cleaner, and there are a few spots here and there where the metaprogramming was a bit gratuitous, but otherwise the Rails code base is very clean and remarkably well-tested.)

We did some extensions here and there for security, performance, and internationalization support, but I'm guessing I shouldn't talk too much about them just yet. We've also been building some nice tools support, including spiffy debuggers and other IDE functionality for Eclipse and Emacs. But we've also tried to stay as Rails-compatible as practical, so that new developers on our team (in case you were thinking of applying, hint hint) can get started just by picking up a Rails book.

We've talked about open-sourcing our "Rhino on Rails", but so far it's been no more than idle chit-chat. For one thing, work on the framework plays second fiddle to work on the actual application, so it's progressing slowly, mostly as 20% work. For another, we're dependent on various other Google infrastructure components that are themselves in the process of being open-sourced, although I don't know if any of them have been announced yet.

The real key, though, is watching to see what happens in the Rails-ish web framework space on the JVM. There are plenty of other Rails clones out there in varying stages of maturity, including JRuby on Rails, Grails, Phobos, TrimPath, and probably half a dozen others.

If one of them emerges over the next couple of years as the dominant player, and it meets Google's internal quality bar for security, performance, scalability, internationalization support, and so on, then we'd be fools not to consider the possibility of migrating to it.

But platforms (like languages and IDEs) take years to flesh out, and they evolve fairly glacially, whether as open-source or as home-grown internal projects. They need teams of users to build entire applications with them, and during each feedback cycle the framework (or IDE, or language) gets saddled with a few more years of work.

I have absolutely no idea how it's all going to play out. All I know is that Rhino on Rails is working out very nicely for our little team up in Kirkland, and that curious folks at other sites inside Google are keeping an eye on it to see whether it turns into something cool enough to try out.

And that's about it. Did I forget anything? If so, I'm sure someone will let me know, probably by tying a nice note to a nice red brick and hurling it through my window. You should see the email I get sometimes.

I promise to give you a status update in exactly one year, provided someone reminds me. I can almost guarantee that at our current rate of framework progress, nothing interesting or newsworthy will happen to our Rails clone between now and then.

I encourage you to check the Rhino site if you want to be looped in on (or participate in) the Rhino enhancements. It's fun work, and they'd love to have your help.

And with that, I'm signing off to spend a little time working on my super-secret home project that should be ready for release later this year. This one's a bona-fide magic trick, which of course means I can't give it away. Let's just call it "NBE" for now, shall we? Yes, I think NBE sums it up nicely.

Betcha can't wait.