On June 24, we had a thought provoking set of presentations at the SDForum SAMsig, arranged by our departing Co-Chair Paul O’Rourke. The presentations ended with how Twitter is scaling today with the rewriting of the backend infrastructure from Ruby on Rails to a new language called Scala. Notably, a queuing system, kestrel, that mediates between Twitter’s web user interface and the processing of “Tweet following” and sending tweets was written in Scala and implements Actor. This implementation is much simplified from other implementations, is more reliable, and scales to handle billions of tweets.
3 themes came out in the meeting, that pointed to Twitter’s switch to Scala:
- Actor model enables simple programming of applications involving concurrency
- Scala language features make programming fun and interesting
- JVM has solid reliability and thread scaling
Created by Carl Hewitt some 35 years ago at MIT, the Actor model is surfacing as a good way to think about and to implement systems that involve multiple simultaneous communications, such as chat and Twitter. Actors are objects that do not operate on the same content that changes (mutable states) and communicate with each other via messages (message passing). As the result, the actor model eliminates many of the headaches that programmers face when solving the concurrency issues involving millions of senders and receivers of messages.
At SAMsig, Hewitt reviewed some of these issues and the history of the creation of the Actor model.
Scala is a recently created language that runs on top of the Java Virtual Machine (JVM) and thus uses all the facilities of the Java environment. It takes advantage of the proven reliability, performance, and other capabilities offered by the JVM. However, many programmers find coding in Java could be tedious with its formal requirements. Scala makes it fun for programmers with its simplicity of passing functions (functional programming like LISP) and pattern matching (more general and powerful than C case statement). Scala also implements Actor using the multi-threading capabilities of JVM, while removes the complexity of thread communication for the programmers.
At SAMsig, Frank Sommers presented many of the features of Scala in a short 40 minutes. There was much to take in in a short time.
JVM reliability and multi-threading
JVM has proven to be reliable and can scale easily to take advantage of the latest multi-core processors and large cluster of servers running together in the data center. Robey Pointer indicated that by using Scala to write the Twitter queuing system, kestrel, he was able to take advantage of all the goodness in the JVM, without having to write in Java. And yet, Java could be a backup language if Scala fails. Furthermore, by using Actor within Scala, the coding was much simplified from similar code written in Java, as the Actor enforces a share-nothing form of communication, designed for concurrent environment. The code size for kestrel is estimated at half the size of similar code written in Java. Without the complexity of managing threads and locks explicitly in kestrel, along with smaller code size, the support and maintenance of the code is much easier. In fact, kestrel runs on multiple servers and processes billions of tweets without failing. System responsiveness is fast. Pointer’s slides are here.
The combination of Actor, Scala, and JVM makes the kestrel queueing system and Twitter reliable, scalable, and fast. By writing the code in Scala and using Actors, the code is easier to develop and simpler to maintain. Such combination of these elements points to the continuing innovation happening with software. This serves as a good example for us, as we endeavor to develop new products, that we consider new advances in software (such as Scala), rather than just being stuck with doing things the same old way.
Copyright (c) 2009 by Waiming Mok