Thursday, October 31, 2013

Backward Is Forward

The ML section of Dan Grossman's Coursera course is now behind us, and Racket is coming up fast. But I'm still musing about ML and syntax.

One thing Professor Grossman mentioned in passing was the |> operator, which he says is predefined in F# and can easily be defined in ML as:

infix |>
fun x |> f = f x
That is, it reverses the order of function application: the argument comes first, then the function. This lets you write what feel like Unix-style pipelines:
strs |> (filter (length _ < 3)) |> (map toUpper)
This seems somehow easier to read than:
map toUpper (filter (length _ < 3) strs)
But why is it easier to read? I suppose it's because deep in our hearts we think of functions as little machines, and we like the picture of feeding an input to the first machine, which then feeds its output to the next machine, and so on in a little assembly line.
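To make the assembly-line picture concrete, here's a Python sketch of the same idea (the Pipe class, and the use of | to stand in for |>, are my own invention, not anything standard):

```python
# A minimal sketch of the |> idea in Python. Pipe is a hypothetical
# wrapper; its | operator feeds the wrapped value to a function and
# rewraps the result, so data flows left to right through the "machines".
class Pipe:
    def __init__(self, value):
        self.value = value

    def __or__(self, f):          # x | f plays the role of x |> f
        return Pipe(f(self.value))

strs = ["to", "be", "or", "not"]
result = (Pipe(strs)
          | (lambda xs: [s for s in xs if len(s) < 3])
          | (lambda xs: [s.upper() for s in xs])).value
print(result)  # ['TO', 'BE', 'OR']
```

Reading left to right, the input goes through the filter machine and then the map machine, just like the pipeline version above.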

One reason why object-oriented programming Refuses to Die, even after repeated declarations of Total Victory by the Functional Programming Army, may be that it naturally chains operations in this backwards order. Compare:

strs |> (filter (length _ < 3)) |> (map toUpper)
strs.filter(_.length < 3).map(_.toUpper)
There is, of course, more going on in the OO version than a reordering of the functional version; the objects in question are actually dispatching the methods after the periods. But visually, the OO version doesn't do much more than substitute . for |>, and it retains the readability advantages over the forwards-facing FP version (assuming you agree that using |> does, indeed, enhance readability).

If you read my previous post on concise syntax, you might not have cared for the fact that I wrote a pattern-matching expression as:

(pat1, expr1 | pat2, expr2 | ... | patn, exprn) foo
that is, putting the value to match at the end. But with the pipeline operator, you can say:
foo |> (pat1, expr1 | pat2, expr2 | ... | patn, exprn)
which looks a lot more traditional (maybe only if your traditions are of very recent origin).

But all this brings up the question: Why does function application normally put the function first, anyway? Why isn't the default the other way around?

Aha, you say, Forth! Or at least something like it; Arcane Sentiment has an amusing post speculating on how languages might have evolved in a postfix world. And also one on the pipeline operator itself.

Here I have to insert a confession: I sometimes have difficulty keeping track of the ordering in the usual function composition operator. I like to read f ∘ g as “f after g” instead of the officially sanctioned “f composed with g”, because putting the f before the g somehow suggests that the f happens first.

Forth, of course, lays out the functions in the order they are applied. g f means f ∘ g. In fact, rather than thinking of a program in such a language as being a series of operations on a stack, you can think of it as a composition of functions—curiously, it comes out looking just the same.
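The two readings are easy to sketch in Python (the helper names compose and pipeline are made up for illustration):

```python
# compose spells the function that runs *second* first, as mathematics does;
# pipeline lists the functions in the order they are actually applied.
def compose(f, g):
    return lambda x: f(g(x))      # f after g: g runs first, despite f coming first

def pipeline(g, f):
    return lambda x: f(g(x))      # Forth-style: listed in applied order

inc = lambda n: n + 1
dbl = lambda n: n * 2

print(compose(dbl, inc)(3))   # 8, i.e. "dbl after inc"
print(pipeline(inc, dbl)(3))  # 8, the same function spelled forwards
```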

You might think that all the backwards-facing languages are untyped or dynamically typed, but if you just imagine looking at all your forwards-facing statically typed code in the mirror, you realize that that doesn't have to be the case (backwards everything parse just). And so you might be interested in Jon Purdy's statically typed concatenative language Kitten, something like the love child of Forth and Haskell. He explains how the type-checking works in this post.

In Forth (or Kitten), the function application operator is still just whitespace, as in ML or Haskell. I wonder, would either language let you define a <| that applies a function in the usual order? I don't recall enough Forth, or understand enough Kitten, to be sure.

And where does infix fit into all this? If forwards (or "head-on-the-left") languages can accommodate infix operators, surely the stack-based head-on-the-right ones can too. You can also imagine a language where the forwards and backwards function application operators are equally concise (<| and |>? \ and /? ◁ and ▷?), where you'd have no inherent direction at all, tossed around at the whim of whatever operator you've landed on most recently. Perhaps it would be fun to execute programs in such a language twice, once from each end, swapping the precedences of the forwards and backwards operators.

For all that forwards and backwards function application seem both to be useful (perhaps even equally so), I've never seen anyone suggest that function definition should, in general, be reversible—that is, that you should be able to declare your parameter after the function body:

x + y <= y fn <= x fn
At first glance, this appears to be a case of That's Just Wrong. But it looks a lot more reasonable if you include the bindings for the parameters immediately afterwards, as in a Haskell-style where clause:
x + y
where x = 1
      y = 2
And maybe there's an argument somewhere for reading a backwards definition as:
x + y forSome y forSome x
but, if so, I'm not going to be the one to make it.

Wednesday, October 30, 2013

The Language Is Dead. Long Live the Language!

All Souls Day is almost upon us, so it seems a fine time to be contemplating everything that has died, and everything that will.

I attended a meetup a couple of months ago billed, in part, as a discussion of the future of Scala. As you might imagine, that's a sufficiently broad topic to be hijacked in interesting ways. But one of the questions that surprised me by its frequency was: does Scala have a future?

Having spent a number of years working on a language whose niche was far smaller than Scala's, I'm accustomed to thinking of Scala as having an enormous community and being extremely well-established. It doesn't seem very likely that it could all just go away, at least not on short notice.

But I've been around long enough to recognize that everything does, without exception, ultimately go away. Scala will too. But Scala's demise probably won't come about because Martin Odersky and everyone he's inspired to work on the language will just decide to leave the world of software and take a vow of silence in a monastery somewhere.

The most likely scenario for the End of Scala is not that it instantly disappears (wouldn't all the code written in it keep running?), and we all have to start over at the beginning using the original version of Fortran from 1957. Instead, I'd expect that Scala's ideas get incorporated into something that the Scala community likes even better—a prospect that is not particularly grim, and maybe even thrilling.

Given that Scala has been less resistant to change than some languages, the Next Big Thing after Scala might just be a newer version of Scala. Then again, it might not. But those of us who, through Scala, have acquired a much richer vision of what programming is all about will never regret having learned it.

Sunday, October 27, 2013

A Brief Ceremony

Warning: This post makes use of unconventional notation. If you believe that the One True Meaning of the ASCII characters was revealed for all time to Saints Kernighan and Ritchie, or if you are otherwise firmly attached to the syntax of a particular programming language, read at risk of disgust and disgruntlement.

As I mentioned previously, I've been learning Standard ML as part of Dan Grossman's Coursera course. As usual when one starts to pick up a new language, ML seems remarkably clean and simple—and, as usual, I'm not sure whether that's because it is all that clean and simple, or whether I'm just learning the easy parts first, because that's generally where you start.

The parts of ML we've covered so far in class are focused on the functional basics: declaring a function, invoking a function, passing a function to another function—in short, the lambda calculus. And contemplating the lambda calculus in its many guises is always as pleasurable as watching fluffy clouds drift by in the summer sky.

But one thing that strikes me about ML is that its take on these basics is a bit verbose compared with other languages I've seen. For example:

(* ML *)
let val a = b
    val c = d
in  e
end
seems a lot wordier than Scala's:
// Scala
val a = b
val c = d
as well as requiring deeper indentation. And fn to introduce an anonymous function has a lot more visual weight than Haskell's simple backslash.

So all this got me to wondering: just how terse a notation can I come up with for declaring and applying functions? ML and Haskell already use whitespace as the application operator—and that's zero characters if you're not counting whitespace, so I can't imagine how you'd do better.

How about function declaration? All the languages I've ever encountered have distinct syntax for declaring a named function and binding an anonymous function to a name. For example, in Scala you have:

def foo (a: Int, b: Int) = bar
val foo = (a: Int, b: Int) => bar
and similarly in ML (currying rather than tupling, to keep up with fashion):
fun foo a b = bar
val foo = fn a => fn b => bar
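For what it's worth, Python exhibits the same redundancy. Here's a sketch of the two spellings, curried by hand to match the ML version (the names are arbitrary):

```python
# Named-function declaration...
def foo(a):
    return lambda b: a + b

# ...versus binding an anonymous function to a name. The two are
# interchangeable here, just as def/fun and the val forms are.
foo2 = lambda a: lambda b: a + b

print(foo(1)(2), foo2(1)(2))  # 3 3
```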
Is there a way to make an analog to the val versions above so simple that the def/fun forms are unnecessary? If so, binding an identifier needs to be pretty concise. The best way I can think of to approach this is to think of the binding operator as an infix operator, as in a = b, with no introductory let or val. However, the equals sign usually has a pretty important role to play in comparisons, and I'd rather not overload it. Ideal would be ≡, which is Unicode character hex 2261 (“identical to”), but the gods of ASCII did not see fit to endow us with that character, and for regrettably many tools, venturing outside of 7-bit territory is still a leap into the mojibake abyss. So I've provisionally settled on:
a ~ b
which defines the identifier a to have the value of the expression b in whatever follows, up to the next scope-closing delimiter.

The next step in overthrowing the tyranny of the fun keyword is an infix lambda operator (about which the author of the Arcane Sentiment blog has done some philosophizing). Although most of the languages I know require some sort of introductory prefix when declaring an anonymous function, in Scala you can get away with:

a => b
provided that the type inference winds are blowing your way.

However, a two-character infix lambda looks pretty hefty to me, particularly if you're going to chain it in a curried function declaration like:

a => b => c => d
On the same basis, I reject /\ (even though it looks like a capital lambda). In fact, even single-character choices with a vague A-frame resemblance to the letter lambda seem to clamor for too much attention when you lay them out:
a \ b \ c \ d    -- does this look like quoted spaces to you?
a / b / c / d    -- would you do this to your division operator?
a % b % c % d    -- a lambda with a bug squashed on it?
What I'm really looking for is something as lightweight as the series comma used when declaring C-style tupled parameters. And then it occurred to me—the comma itself is up for grabs, as in:
a, b, c, d
So now we can turn something like:
val foo = fn a => fn b => bar
into:
foo ~ a, b, bar

There remains one more thicket of verbosity to hack through before we can pose beneath the Mission Accomplished banner, and that is pattern matching. Pattern matching is a sign of civilization as surely as the motorcar and the wireless, and I'd hate to try to present any examples longer than half a line without it. Pattern matching syntax is one of ML's less attractive features, as it requires not one but two keywords to get going:

case foo of
    pat1 => expr1
  | pat2 => expr2
  | patn => exprn
Scala sort of starts out improving on that by having an infix match keyword:
foo match {
  case pat1 => expr1
  case pat2 => expr2
  case patn => exprn
}
but then you've got the curly braces and the repeated case keyword, and it all goes to hell.

One way to approach the patterns you're matching is to treat each as a function parameter (oh, didn't I mention that all function parameters are patterns? No, because you already knew that). So given a value to match and a sequence of functions, the pattern matcher's task is to find the first function in the sequence that would be legal to call with the given value as an argument. Indeed, you might say that your sequence of functions is itself a function, and the value to match is its argument, and that's precisely how we'll do it:

f ~ f1 | f2 | ... | fn
is the function that first tries to call f1, then if that doesn't work, f2, and so on up to fn. So we've reduced the above Scala and ML examples to:
(pat1, expr1 | pat2, expr2 | ... | patn, exprn) foo
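Here's a Python sketch of that "ordered choice of functions" idea. The names alternatives and MatchFail, and the use of an exception to signal "my pattern doesn't fit this value," are my own hypothetical encoding, not the notation above:

```python
# A match expression as an ordered sequence of one-pattern functions:
# try each alternative on the value; an alternative rejects the value
# by raising MatchFail, and the first one that succeeds wins.
class MatchFail(Exception):
    pass

def alternatives(*fs):
    def matcher(value):
        for f in fs:
            try:
                return f(value)
            except MatchFail:
                continue
        raise MatchFail(f"no pattern matched {value!r}")
    return matcher

def nil_case(xs):          # plays the role of the pattern []
    if xs != []:
        raise MatchFail()
    return "empty"

def cons_case(xs):         # plays the role of the pattern a :: as
    if xs == []:
        raise MatchFail()
    head, *tail = xs
    return f"starts with {head}"

describe = alternatives(nil_case, cons_case)
print(describe([]))         # empty
print(describe([1, 2, 3]))  # starts with 1
```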

Now we're just about far enough along to fold something leftwards like:

foldl ~ f, z, ([L], z | a :: as, foldl f (f z a) as)
except that the L in [L] needs some excusing. ASCII just doesn't have enough matching delimiters. At a minimum, you'll want convenient groupers for lists and tuples. And if you start doing anything useful, you'll quickly want shorthand for sets and maps as well. Parentheses and curly braces are off limits, because I'm giving them their usual Scala meaning (mutatis mutandis), and less than and greater than are comparison operators (this isn't XML). That leaves an awful lot of work for the square brackets, so I'm letting the first token after the [ disambiguate, as in:
[T "This" "is" "a" "tuple"]
[L a list]
[S 1 2 3 1]                           -- a set of *three* elements
[M key1 -> value1 key2 -> value2]     -- and a map
As you can see, within square brackets, whitespace is treated as an item separator rather than function application (infix operators are still recognized). You can always parenthesize to call your favorite function:
[T (cos x + sin y) (sin x + cos y)]
And if you're really insecure about having lost your tupling comma, you can use double comma as an infix operator to make a synonym for the above like:
(cos x + sin y),,(sin x + cos y)
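Stripping the notation away, the foldl above is just the familiar left fold. As a sanity check, here it is transcribed into plain recursive Python:

```python
# The foldl sketched above, in ordinary recursion: the empty-list case
# (the [L] pattern) returns the accumulator z, and the cons case folds
# the rest of the list with the updated accumulator.
def foldl(f, z, xs):
    if xs == []:
        return z
    a, *rest = xs
    return foldl(f, f(z, a), rest)

print(foldl(lambda acc, x: acc + x, 0, [1, 2, 3, 4]))  # 10
```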

One last thought before putting this post to bed. It would be a wonderful universe if we could dispense entirely with a built-in if then else in favor of pattern matching, but even:

(true, expr1 | false, expr2) cond
is awfully talky (not to mention backwards), and I don't see how to trim it down without betraying my logic. So, infixally once more:
cond ? expr1 ! expr2
where exclamation mark takes over for C's colon. The colon is reserved for specifying types, about which, mercifully, I have given no thought whatsoever.
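As a sketch of why the pattern-matching conditional really does suffice, here's the idea in Python (cond_match is a hypothetical name; the branches are wrapped in thunks so that, like a real if-then-else, only the chosen one is evaluated):

```python
# if/then/else recovered from a "match" on booleans: pick a branch by
# the condition, and force only the branch that was picked.
def cond_match(true_branch, false_branch):
    def on(cond):
        return true_branch() if cond else false_branch()
    return on

pick = cond_match(lambda: "yes", lambda: "no")
print(pick(3 < 5))  # yes
```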

Wednesday, October 2, 2013

Classes (Not the OO Kind)

Labor Day is well behind us, but it's not too late to go back to school. The proliferation of free online classes means that if you have the time and the curiosity, there's no reason to be ignorant about anything anymore.

The first massive open online courses (MOOCs) I signed up for were Peter Norvig and Sebastian Thrun's AI course and Andrew Ng's machine learning class, offered by Stanford in the fall of 2011. I joined after hearing a news story on NPR reporting that signups were exceeding all expectations, into the tens of thousands of students (the AI course ultimately had about 160,000). I figured that this was an experiment unlikely to be repeated, and I'd better make sure I took advantage of it. Both courses were a lot of fun, and I found myself more motivated to learn and complete the work than if I'd just read books on those topics.

I've taken several more courses since then, both from Coursera and from Udacity, the two companies created after the successes of the Stanford courses in the fall of 2011 (Professor Ng cofounded Coursera, and Professor Thrun Udacity). None of the classes has been graded quite as rigorously as a typical university undergraduate course, mainly because the constraints of automatic grading (required for classes with thousands of students) mean that a lot of test questions are multiple-choice. That said, the MOOCs have still been a very satisfying experience—I've learned plenty and, I think, retained as much as in a typical in-person class.

Right now I'm taking four Coursera courses: Tim Roughgarden's Algorithms Part 2 (I somehow managed to graduate without taking an algorithms class, but it's never too late); Martin Odersky's Functional Programming in Scala (so far a bit elementary, but everyone I've talked with who uses Scala seems to think it's worthwhile); Michael Genesereth's General Game Playing (I've always wondered how that works); and Dan Grossman's Programming Languages (an excuse to learn ML, Racket, and Ruby, none of which I know).

I still don't understand how so many MOOCs are being offered at absolutely no cost, and maybe it's a business model that won't last. Get 'em while they're hot! (I'm not sure what the Coursera signup deadlines are like, but the General Game Playing and Programming Languages classes have just begun, if you want to join me.)

And finally, if there's something you want to study for which no one has set up a MOOC, the modern world offers you another study alternative besides curling up in a corner with a book. I mentioned previously that the Scala Study Meetup Group is working its way through a textbook; the Bay Area Categories and Types group has been doing the same with Benjamin Pierce's Basic Category Theory for Computer Scientists (I still can't claim to understand much of category theory despite attending the meetups faithfully, but it's a start).

So if there's something you want to learn, don't sit around bemoaning your ignorance. Join a class or start a meetup group, have some fun, and find out just how much more there is to be ignorant about.