Concurrency and Erlang
I gave a talk titled Concurrency and Erlang at Linux.conf.au in January, 2007. Thank you so much to everybody who attended it and made me feel welcome: I got overwhelmingly positive feedback from it, and it was an honour to speak at such an esteemed conference. This also serves as a homepage for the very similar Erlang and Concurrency talk that I gave to the Sydney Linux Users’ Group (SLUG) in July 2006, since the two talks were more-or-less the same.
- View the official talk homepage on the Linux.conf.au site.
- Download the (annotated) slides in PDF format (~9.5MB). Note: These slides are slightly more updated than the official ones that I gave to the Linux.conf.au committee.
Resources and Links
- Wikipedia’s information about Tim Sweeney’s Unreal Engine series, which powers a pretty extraordinary list of games, from first-person shooters to massively multiplayer online role-playing games. Small plug: BioShock and Mass Effect, two games I’m so hanging out for in 2007, use the Unreal Engine 3.
- The Company of Heroes E3 Trailer and in-game engine demo, which I showed briefly during my talk. zomgwtf that’s some fantastic graphics.
Concurrency and General Programming Articles
- Added on 11/02/2007: Edward A. Lee’s essay about “The Problem with Threads” is a great introductory essay for those of you who don’t have a lot of experience with multithreading, and don’t yet believe it to be intractable. Highly recommended.
- Lambda the Ultimate’s discussion about Tim Sweeney’s talk at the PoPL (Principles of Programming Languages) conference in 2006, titled The Next Mainstream Programming Languages. Highly recommended.
- Software and the Concurrency Revolution by Herb Sutter (well-respected in the C++ community) and James Larus.
- Why implicit parallelism in functional languages is not quite the solution to everything.
- An interview with Tom Leonard of Valve Software about multi-threaded challenges (the creators of that somewhat decent-selling Half-Life 2 game).
- The Solution to C++ Threading is Erlang by Todd Hoff, a good article on why shared-state concurrency is difficult and why message-passing semantics are just a ton better.
- An interview with John Carmack (you know, the author of those small games named DOOM and Quake), where he speaks a little about problems dealing with concurrency.
- Wikipedia’s page about the Actor model for concurrency (which is employed by Erlang). Steve Dekorte has a page about the history of support for actors in programming languages, with some links to other sites that talk about actor implementations.
- Futures are another interesting idea that can be used with either shared-state concurrency or message passing: the concept is to return from any RPC calls immediately, but the variable is lazy so that any attempts to access it will block until the RPC call returns.
- Dan Kegel’s classic C10K problem article, which discusses finding solutions to the problem of a single machine serving 10,000 clients simultaneously. Even though the article was written in 1999, all the techniques there are still applicable today. (Of course, then I just snigger when I think that Erlang can handle 10,000 concurrent connections with relative ease.)
- Added on 31/01/2007: An excellent Slashdot discussion about multi-core computing (in the context of IBM’s chief architect claiming that developing software for multi-core systems will be hard). Most of the Score:5 comments I read are pretty good, and mention things such as Erlang, new programming paradigms, that most applications are not CPU-bound, and, of course, the horror of using mutable shared state. Perhaps the Slashdot comments are an indication that the industry is maturing and finally waking up (or perhaps not at all…).
- Added on 12/04/2007: You know it’s mainstream news when it hits news.com.com.com.com: apparently game developers now have to adapt to a multi-core world. Naw, really?
- Added on 24/04/2007: Mark-Jason Dominus’s classic article “Why I Hate Advocacy” is a must-read for all programmers: “you have to lead people, not drive them before you”. Highly recommended.
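To make the futures idea from the list above a bit more concrete, here's a minimal sketch in Python (picked purely for illustration; none of the linked articles are Python-specific). It uses the standard library's concurrent.futures module, and slow_remote_call is a made-up stand-in for an RPC, not a real API.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_remote_call(x):
    """Hypothetical stand-in for an RPC: sleeps briefly, then returns a result."""
    time.sleep(0.1)
    return x * 2


with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() hands back a Future straight away; the call runs in the background.
    future = pool.submit(slow_remote_call, 21)
    # The caller is free to do other work here...
    # ...and only blocks when it actually asks for the value.
    answer = future.result()

print(answer)  # prints 42
```

The caller only pays for the wait at the point where it touches the value, which is the whole appeal of the idea.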
Software Transactional Memory, or STM, is the last bastion of hope for people who, for whatever reason, must still use shared mutable state in concurrency. Wikipedia has a great introductory page about it, and Simon Peyton Jones has the best papers about it. STM’s still a very active research topic so it’s not recommended you use it in production work unless you’re really confident with it, but it’s a promising future direction for shared-state concurrency.
- Simon Peyton Jones’s STM page has a lot of links to his research.
- There’s also a new draft paper named Beautiful Concurrency that Simon’s working on.
Added on 08/02/2007: Patrick Logan rips software transactional memory to shreds in a blog posting. His most valid point is that STM is basically untested right now, whereas message-passing and share-nothing concurrency have been used successfully in many systems for many years. Don’t take the article’s opinion as gospel, though! Yes, Patrick’s a smart and experienced developer, but Cale Gibbard (who responds in the comments to the article) is also a smart guy, and Simon Peyton Jones, a leader in STM research, is a very smart guy, as is Tim Sweeney, who supports STM and has his fair share of experience with threading issues. Read the article and be aware of the issue, though. Given the choice, I’d personally prefer share-nothing and message-passing vs transactional memory, but the two aren’t really mutually exclusive. (You can implement message-passing on top of STM, if you like.)
Added on 09/02/2007: The Lambda the Ultimate community discusses Patrick Logan’s post about STM (intelligently, as usual).
Here’s some Erlang tutorials:
- Getting Started with Erlang.
- An Erlang Course.
- Best practices for Erlang development: a recommended read since it introduces you to some different philosophies that Erlang takes, such as why Erlang does not encourage “programming defensively”.
- Added on 24/01/2007: a tutorial on how to build an OTP application in Erlang.
- Added on 06/02/2007: Thinking in Erlang, an excellent ~30-page introduction for C/C++/Java/Python programmers who haven’t dealt with the functional programming paradigm before.
- Added on 04/03/2007: The Pragmatic Programmers (famous for their “Pragmatic Programmer” and “Agile Web Development with Rails” books) have released a book named Programming Erlang. The book appears to have two chapters dedicated to the very important OTP libraries, too. Yeah, now there’s actually a modern book you can learn from—I can’t wait to get my hands on this either!
- Added on 18/04/2007: Dave Thomas (of The Pragmatic Programmer fame) writes two great introductory articles about Erlang on his blog: article un, article deux.
- and, of course, the Erlang homepage.
… and here’s some general articles about Erlang:
- Performance Measurements of Threads in Java vs Processes in Erlang.
- 20000 users connected to a single ejabberd server.
- The Wikipedia entry on Erlang.
- An IBM developerWorks article about Erlang.
- Erlang SMP Performance on a Sun Fire T2000 (that rather lovely 32-core 2U rackmount server).
- Joe Armstrong shows how to write a fault-tolerant server in Erlang in, uhh, about 200 lines.
- Added on 17/02/2007: Jay Nelson writes an excellent, informative email about why he chose Erlang (over Java + JSP) for implementing a web server and game server. He covers a lot of reasons, from the merits of evaluating different programming languages, to the kickass stuff that the Erlang OTP Platform provides you that no other language does.
- Added on 06/03/2007: The MMORPG Vendetta Online is now using an Erlang-based backend server… excellent, Smithers. Includes mandatory link to Lambda the Ultimate discussion (that of course started talking about LISP vs Erlang vs C++!)
You have the choice of either using real kernel-level threads in C, or using a library that implements purely userspace threading (like Erlang, a.k.a. green threads). Userspace threads are very appealing since they’re so cheap and they arguably interact better with system libraries (a lot of which simply aren’t threadsafe), but unfortunately a big reason to use concurrency in C is to have background threads that perform I/O, which userspace threads don’t help you with at all since you’re still running one user process (from the point of view of the kernel) for all your threads. It’s not very hard to write a lightweight message-passing implementation in C, though: just email me if you’d like some details on how to do this. (In GUI applications, it’s really easy: just send a new event back to the main loop, and let the main thread pick it up and distribute the event to whatever thread should handle it.)
- Protothreads for C.
- GNU Pth (a.k.a. GNU portable threads), which implements userspace threading in a fairly portable library.
In general, Google around for protothreads, user-space threads, green threads, and actors in combination with C, libraries and the like to search for more interesting C libraries.
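The GUI main-loop trick described above can be sketched in a few lines. This is Python rather than C, purely to keep it short, and it's a rough sketch under assumptions: main_events, background_io and run_main_loop are names invented for illustration, and a real GUI toolkit would supply its own event queue and loop. Here the handlers also run on the main thread instead of being forwarded to other threads.

```python
import queue
import threading

# The main thread owns this queue; it plays the role of a GUI event loop.
main_events = queue.Queue()


def background_io(name):
    """Worker thread: pretend to do some I/O, then post an event back."""
    main_events.put(("io_done", name))


def run_main_loop(expected_events):
    """Pick up events on the main thread and hand each one to its handler."""
    log = []
    handlers = {"io_done": lambda payload: log.append("finished " + payload)}
    for _ in range(expected_events):
        kind, payload = main_events.get()  # blocks until an event arrives
        handlers[kind](payload)
    return log


workers = [threading.Thread(target=background_io, args=(n,)) for n in ("a", "b")]
for w in workers:
    w.start()
log = run_main_loop(expected_events=len(workers))
for w in workers:
    w.join()
print(sorted(log))
```

The workers never touch shared state directly: the only thing they share with the main thread is the queue, which does its own locking.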
(This section added on 22/02/2007).
- Here’s an implementation of Protothreads for Objective-C.
- I’m very grateful to one of my readers for a pointer to an implementation of Erlang-style message queues for Objective-C: each thread has its own message queue, and sending to another thread’s message queue is asynchronous for the sending process. The receiving process can process the messages at its leisure. (Of course, I only found out about this after spending a day writing my own implementation of thread-local message queues… Murphy’s Law strikes again!)
Note that Mac OS X 10.5 (Leopard) will augment the very method with a bunch of new methods to perform selectors on other threads besides the main thread. But, uhh, don’t tell anybody, because I think that’s information under NDA right now or something. (Ssssh.)
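For a feel of what per-thread message queues look like, here's a small sketch in Python (not Objective-C, and not the implementation linked above). The Actor class is a hypothetical helper written for this example: each thread gets its own mailbox, sending just enqueues and returns, and the receiver drains the mailbox whenever it gets around to it.

```python
import queue
import threading


class Actor(threading.Thread):
    """A thread with its own mailbox (hypothetical helper, not a library class)."""

    def __init__(self, handler):
        super().__init__()
        self.mailbox = queue.Queue()
        self.handler = handler

    def send(self, message):
        # Asynchronous for the sender: enqueue and return straight away.
        self.mailbox.put(message)

    def run(self):
        # The receiver works through its mailbox at its own pace; None means stop.
        while True:
            message = self.mailbox.get()
            if message is None:
                break
            self.handler(message)


received = []
printer = Actor(handler=received.append)
printer.start()
for word in ("hello", "from", "another", "thread"):
    printer.send(word)
printer.send(None)  # ask the actor to shut down
printer.join()
print(received)
```

Using None as a shutdown message is just a convenience for the sketch; a real implementation would probably want a dedicated stop message and some error handling around the handler.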
- A mind-blowing article by Andrei Alexandrescu (of Modern C++ Design fame) about using the volatile qualifier for compiler-checked mutex locking.
- A paper on implementing software transactional memory in C++, titled Lowering the Overhead of Nonblocking Software Transactional Memory.
- As usual, you can hack up C++ templates to support just about anything, including futures.
- Added on 23/01/2007: A rather interesting-looking C++ framework named Channel provides (what else?) a crazy C++ template library to support (a)synchronous distributed message passing and event dispatching.
- Added on 13/02/2007: The Kent C++ CSP Library, a library modelling Tony Hoare’s Communicating Sequential Processes idea (very similar to Erlang-style actors/concurrency) in C++.
- Added on 16/03/2007: Boost is gunning to have a new coroutine library available for evaluation at the end of the Google 2007 Summer of Code run.
- Added on 09/04/2007: The Await & Locks library provides the excellent await (atomic wait) synchronisation primitives often taught in (good) concurrency textbooks. This is much more sane than using mutexes, semaphores and condition variables for managing shared-state concurrency. Compare the pipe example on that webpage to one that you’d have to write with mutexes and condition variables!
- An introductory article about JavaSpaces.
- I’ve also heard some good things about Jini, Sun’s Java networking technology, which can be used as a message-passing infrastructure to build applications on. I don’t know enough about it to judge whether it’d be useful or not, so check it out yourself and see!
- The Stackless Python Homepage, which adds Erlang-style message passing to Python.
- An excellent Introduction to Concurrent Programming with Stackless Python.
- Note: I did some informal benchmarks with Stackless Python, and the runtime turned out to be pretty good indeed: it was roughly 4-5x slower than Erlang on code that achieved similar goals. (Note that this is actually quite a compliment: Erlang’s messaging system is, as you’d expect, bloody fast.) I think it’d be more than feasible to use Stackless in heavily concurrent projects.
(This section added on 15/02/2007.)
(This section added on 03/02/2007.)
Haskell has all of the same primitives and ingredients that Erlang has to support pure userspace threading and message-passing. Arguably, Haskell could be an even better choice than Erlang in the future for two reasons. First, it’s actually possible to omit threading from the core language and add it via a library (as some people have already done), which means that you can choose your threading model at your pleasure: kernel threads? Userspace threads? It’s up to you. Second, the type system can be used to check invariants and concurrency properties if you wield it correctly, and also forces a very clear separation of functions that do and do not manipulate state. GHC already supports many concurrency primitives (including the nicest sort of mutable state primitive that I’ve seen yet), including using a combination of purely userspace threads with kernel threads.
I haven’t done any threading in Haskell for a long time and no doubt the state of the art has moved a long way forward in the past few years, so Google around and check out the Haskell wiki yourself: there’s bound to be a plethora of information out there. Keep in mind that Software Transactional Memory is also available in Haskell!
Blogs and good programming language sites:
- Patrick Logan often talks about concurrency and distributed programming issues, and has insightful comments on everything from Python to LISP, Erlang, Smalltalk and Java.
- Lambda the Ultimate is an excellent site about programming languages that has an enormous number of interesting threads covering everything from concurrency issues to type systems.
- gamearchitect.net has a lot of interesting articles on dealing with the complexity problems faced by games programmers, including a few about concurrency.
- And, of course, there’s my own blog, where I write a whole bunch of my own opinionated crap about coding.
There’s a lot of information that I wanted to disseminate during the talk, but there’s just not enough time in 30 minutes to talk about too much stuff! A few people commented that I was making Erlang out to be the best thing ever. Please don’t take it that way: I’ve had experience with quite a few languages (C, C++, Python, Ruby, Erlang, Haskell, Objective-C) and all of them have their place and merits. Erlang is a stellar choice for writing highly distributed or concurrent applications, or for anything to do with network servers. Due to its lack of good GUI bindings, I wouldn’t use it for a normal desktop application unless the domain’s really quite specialised: stick with a more mainstream language for that kind of app, though still try to avoid shared-state concurrency if possible!
If there’s anything you’d like me to clarify or ask about at all, please feel free to email me via the link at the bottom of this page.