9.08.2006

language invention

Today I gave a small talk about some of the most interesting work that we were able to do over the years at my current job, or at least a subset of it.

The best times were when real computer science principles were put into practice. The pinnacle of that, in my mind, was the invention of a small expression language, replete with constants, variables, mathematical operators, comparators, string literals, boolean operators, arbitrary nesting of parentheses, and functions with an arbitrary number of arguments (each argument potentially being any arbitrary sub-expression).

Hand-rolling a parser for such a language would be a perverse exercise, but the simple application of some time-tested tools made the problem approachable and, in retrospect, even trivial.

There are two main tools that we used.

The first was a conceptual tool: context-free grammars. These were first invented by linguists to model the structure of natural languages, and adopted by computer scientists in the early days of experimenting with programming languages. You can think of these as rigorous mathematical formalizations of the same idea behind the grammar rules you learned in junior high school.

(Do they even teach diagramming anymore?)
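To make the idea of a context-free grammar a little more concrete, here is a minimal sketch in Python. It is a toy grammar for arithmetic only, not the grammar of the language described in this post, and every name in it (GRAMMAR, derive, the nonterminal names) is made up for illustration. Each rule says what a nonterminal may be rewritten into, and a derivation just applies rules until only terminals remain.

```python
# Hypothetical toy grammar (an assumption for illustration, not the real one).
# Each nonterminal maps to a list of alternative productions; each production
# is a list of symbols. Any symbol that is not a key in GRAMMAR is a terminal.
GRAMMAR = {
    "expr":   [["term", "+", "expr"], ["term"]],
    "term":   [["factor", "*", "term"], ["factor"]],
    "factor": [["(", "expr", ")"], ["NUMBER"]],
}


def derive(symbols, choices):
    """Perform a leftmost derivation.

    For each index in `choices`, replace the leftmost nonterminal in
    `symbols` with the chosen alternative from GRAMMAR.
    """
    symbols = list(symbols)
    for choice in choices:
        for i, sym in enumerate(symbols):
            if sym in GRAMMAR:
                # Rewrite the leftmost nonterminal, then move to the next choice.
                symbols[i:i + 1] = GRAMMAR[sym][choice]
                break
        else:
            raise ValueError("no nonterminal left to expand")
    return symbols


# expr -> term + expr -> factor + expr -> NUMBER + expr
#      -> NUMBER + term -> NUMBER + factor -> NUMBER + NUMBER
sentence = derive(["expr"], [0, 1, 1, 1, 1, 1])
print(sentence)  # ['NUMBER', '+', 'NUMBER']
```

A parser runs this process in reverse: given the tokens, it works out which rules must have been applied, and the record of those rule applications is the parse tree.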

The second tool was a parser-generator. These are an incredible invention: they read in the aforementioned grammar, automagically generate a provably valid parser for the language, and provide a way for you to subsequently define the semantics of your language.

The particular parser-generator that we used was SableCC, which had a nice, novel way of allowing one to define the semantics of the language they wished to invent: while most other parser-generators expected us to modify generated source code, SableCC merely generated an interface that allowed us to implement a visitor, in the sense of the visitor design pattern.
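For anyone who hasn't met the visitor pattern, here is a rough sketch of the shape of the idea in Python. This is not SableCC's generated API (SableCC emits Java), and the node classes and visitor method names below are hypothetical. The point is only that the tree classes stay untouched while the semantics live in separate visitor classes.

```python
from dataclasses import dataclass
import operator


# Tree node classes: stand-ins for what a parser-generator would emit.
# Each node only knows how to hand itself to a visitor.
@dataclass
class Num:
    value: float

    def accept(self, visitor):
        return visitor.visit_num(self)


@dataclass
class Var:
    name: str

    def accept(self, visitor):
        return visitor.visit_var(self)


@dataclass
class BinOp:
    op: str
    left: object
    right: object

    def accept(self, visitor):
        return visitor.visit_binop(self)


@dataclass
class Call:
    func: str
    args: list

    def accept(self, visitor):
        return visitor.visit_call(self)


class Evaluator:
    """One set of semantics: compute the value of an expression."""

    OPS = {"+": operator.add, "-": operator.sub,
           "*": operator.mul, "<": operator.lt}

    def __init__(self, variables, functions):
        self.variables = variables
        self.functions = functions

    def visit_num(self, node):
        return node.value

    def visit_var(self, node):
        return self.variables[node.name]

    def visit_binop(self, node):
        return self.OPS[node.op](node.left.accept(self),
                                 node.right.accept(self))

    def visit_call(self, node):
        args = [arg.accept(self) for arg in node.args]
        return self.functions[node.func](*args)


class Printer:
    """A second set of semantics over the same, unmodified tree classes."""

    def visit_num(self, node):
        return str(node.value)

    def visit_var(self, node):
        return node.name

    def visit_binop(self, node):
        return f"({node.left.accept(self)} {node.op} {node.right.accept(self)})"

    def visit_call(self, node):
        return f"{node.func}({', '.join(a.accept(self) for a in node.args)})"


# Hand-built tree for: max(x, 2 * 3) + 1
tree = BinOp("+", Call("max", [Var("x"), BinOp("*", Num(2), Num(3))]), Num(1))

result = tree.accept(Evaluator({"x": 4}, {"max": max}))
text = tree.accept(Printer())
print(result)  # 7
print(text)    # (max(x, (2 * 3)) + 1)
```

Adding a new behavior (type checking, say) means writing another visitor class, and regenerating the parser never clobbers that work.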

Slick.

Inventing languages in this manner used to be de rigueur among computer programmers who had become somewhat advanced in their technique. Today most need for small custom languages (configuration languages, typically) is satisfied by XML, which is much easier to wrap one's mind around. This renders the work facile, but with the tradeoff of rendering the resulting language inelegant -- and rendering the programmer flaccid.

Mind you, I'm not saying this as someone who considers himself an expert in the subject, but as someone who felt the rush of sudden experience points just from having gone through the exercise once.

One day Vint Cerf came to speak to us at work, and I got the opportunity to be among a few who attended a small breakfast with him. Mostly I just listened, drinking in as much of his luminosity as I could. Eventually, however, his eyes fell upon me and he asked me to describe what I did.

I told him. He sat back, smiled, and said that he was very happy to hear that language invention was still alive and well in the corporate world.

I feel very fortunate to have had the opportunity to practice real computer science on the job -- it's so much more rewarding than charting the path of one's career by popular shrinkwrap, entrenched brands, or skyrocketing buzzwords.

It's the road less travelled, but the view is lovely.

8 comments:

jesse said...

\o/

good times

Jon Conradt said...

Where I work now we have an XML document for the layout and back these with C# objects written for the behaviors. We don't want XML authors messing with the C# so every little bit of customization requires another tweak of another C# file. It is blob all over again.

I have begun advocating a different approach that would (again) incorporate an expression engine into the XML. I explained that initially you need to implement a fair number of functions (take Excel as a starting point) and then the amount of custom C# you need to write eventually drops off. You end up with the XML authors having enough power to do their work without having low enough access to really cause trouble.

The immediate criticism was that this would lead to lots and lots of XML files. This of course leads to the next big improvement in our XML language -- including libraries. The XML authors quickly figure out that they can save themselves time and pain by creating libraries of common features. That was a great thing to watch.

Finally, when one separates the layout from the presentation with something like XSL/CSS then you have a truly powerful and flexible tool.

The sad thing about getting to this point is that you have only one big challenge left -- making it faster. True, you could port it to another language (Objective C?) but in the end you have created something that will likely stand the test of time and there won't be any really fun problems left.

In my new job I have found that there are always places for real computer science. These places are often overlooked and whitewashed by people in too much of a hurry to replace the brute force solution with an elegant one. However, there are, in any organization, allies who, given the support and encouragement, will take the leap and do something really cool.

Working with those kinds of people is the greatest reward I could ever imagine receiving from a workplace.

podunk said...

That's so funny, Jon. There's some old axiom about all systems inevitably becoming emacs as their feature set grows over time. Maybe the same is true for g.

Jesse, good to see you posting here too. Good times, indeed.

I hope my new job will have moments as rewarding as those I had working with you all.

pahari said...

I was fortunate to be a part of this presentation. I started off as one of the gung-ho types, needing or aspiring to know all the new buzzwords and using what little I could gather to make a point (maybe to let people see that I knew, a fallacy), however ridiculous it may be. Now I think I am wiser.
I think I realize the rush. "It's the road less travelled, but the view is lovely." is an apt description of the coziness u feel being among thinkers, rather than the "users". I do believe that sadly the "popular shrinkwrap, entrenched brands, or skyrocketing buzzwords" increasingly overshadow pure practitioners of science. What worries me is that the shrillness with which the above is sounded does make the less fortunate (who are more inclined to identify with the target and thus less puritan) decide to forego the quest to create.
The basic building blocks of computer science are increasingly (and surprisingly) sounding more alien, to the extent that people like u may be considered out of step with the current "trends" by a growing number who, I am pretty sure, are afraid and intimidated by the extra resolve it takes to create something newer and better. But then they lose the pure joy of practising pure science, unmolested and pristine.

It has been a pleasure.

podunk said...

Pahari, language of the mountain people. Cool pseudonym for you.

I'm happy that you called me on the shrill tone of that line. (I know I can always count on you to shoot straight.) It sounded that way to my ears too, but I couldn't figure out a more subtle way to convey it.

Thank you for posting, my friend. Your thoughts are potent.

pahari said...

"What worries me is that , the shrillnes with which the above is sounded" -- that was not aimed at your comments, but directed towards people who expound the other side of the argument. I believe I was misunderstood. There is no subtle way to state what u stated :)

podunk said...

Ah, yes. Now I see what you were getting at with that comment. So true, too.

lauri said...

As the humble user of one of your earlier languages, I salute you all! Full marks and stand-up job!