Tuesday, June 30, 2009

Actors and Concurrent Computation

Carl Hewitt was in fine form when he spoke about the Actor model of concurrent computation at the SDForum SAM SIG meeting on "Actor Architectures for Concurrent Computation". The meeting was a panel with three speakers. Carl Hewitt, Emeritus Professor in Electrical Engineering and Computer Science at the Massachusetts Institute of Technology. Hewitt and his students invented Actors in the 1970's. Frank Sommers of Artima Software, is an active writer in the area of information technology and computing and is currently writing a book on Actors in the Scala programming language. Robey Pointer is a software engineer at Twitter, and is an enthusiastic user of Scala and Actors at Twitter. Bill Venners is president of Artima Software and an author of a book on Scala moderated the proceedings.

Hewitt had retired when the advent of multi-core processors and distributed parallelism renewed interest in the Actor model. He has come out of retirement and is currently visiting Stanford. Hewitt described the genesis of the Actor methodology in the Alan Kay's Smalltalk 72, where every object was autonomous and you acted on an object by sending it messages. Later versions of Smalltalk moves in a different direction. The most important aspect of the Actor model is that it decouples the sender from the communications. In practice this allows the Scala implementation to scale to millions of Actors engaged in concurrent communication (more on this later).

Hewitt spoke with great flourish on a number of other topics, including his determination that the Actor model should be properly represented on Wikipedia and spooks and the internet archive. He spent some time with the unbounded non-determinism in the Actor model versus other concurrency formalisms that only support bounded non-determinism. An audience member challenged him to explain this better and citing Map-Reduce. Amusingly, Hewitt answered by describing the parallelism in Map-Reduce as like Stalin. Stalin has three deputies and each of those deputies has three deputies. Stalin tells his deputies what to do, and those deputies tell their deputies what to do and so on. Thus the business of the state can proceed in parallel. Map-Reduce is a sequential algorithm which is speeded up by parallelism. There is no non-determinism. This is parallelism as opposed to concurrency.

Next Frank Sommers spoke on how Actors are used in Scala. The good news is that Actors are implemented in Scala and Hewitt much preferred the Scala implementation over the Erlang implementation of Actors. The bad news is that there are a number of issues with the Scala implementation. For example, a Scala program cannot exit from a "receive" statement. Another issue is that messages are supposed to be immutable, however the current implementation may not ensure this. These and other issues are being worked on, and the next version of Actors in Scala will be much better.

Finally, Robey Pointer talked about how he is using Scala. He implements message queuing systems that deals with large numbers of long lived connections where each connection is mostly idle but has sporadic bursts of activity. Robey has implemented this system in many different ways. For example, a straight thread implementation and a lot of tuning got up to 5000 thread based connections working at one time, however this fell well short of his goal of supporting millions of connections. A thread pool implementation with a few hundred threads worked better but the code became unwieldy and more complicated than it should have been. Now he has an Actor based implementation in Scala that does scale to the millions of connections and yet the code remains straightforward and small.

He also showed us how Actors can be mixed-in with thread based synchronization to solve problems for which even Actors are too heavyweight. I am in two minds about this. On the one hand, there are legitimate uses for this low level synchronization (as discussed in my PhD thesis). On the other hand, thread based concurrency is awful as I keep promising to explain in another post. Also to do it safely, you need to understand is great detail how Actors are implemented in Scala, and one reason for adopting a high level construct like Actors is that it should hide gory implementation details.

After the meeting I spoke with Carl Hewitt. We agreed that sending a message needs to have a low overhead. It should have a similar costs to calling a procedure. Computers have specialized instructions to support procedure calls and they need specialized instructions to support message passing. We did this for the Tranputer, although that was before its time, and it is eminently possible for Actors. All we need is for a high level concurrency formalism like Actors to get enough traction that the chip manufacturers become interested in supporting it.

No comments: