This blog has moved to Medium

Subscribe via email


Lock Freedom is overrated (sometimes)

Locking and synchronization is expensive in a multi-threaded environment. When you’re writing a component that should scale to 10K concurrent requests, all your data structures must be lock-free, right?

Wrong.

It is true that if you have a single object that is synchronized in every request, locking that object will severely impact performance when concurrency goes up. However, you should analyze the actual usage pattern and data structures to determine the actual contention.

For instance, suppose you are writing a messaging application, where clients write and read from their respective inboxes. The application as a whole is meant to support 10,000s concurrent clients. The supported operations are writing an object to someone’s inbox, reading objects from a specific inbox, and subscribing to notifications. With this performance profile, it’s seems like a good idea to make the global data structure that maps client’s IDs to inboxes lock-free, otherwise concurrent searches to find a specific inbox will block one another.

However, if any specific inbox is not meant to carry a significant load, there is no harm in implementing the Inbox data structure (that is responsible for one specific Inbox/Message Queue) without restricting yourself to lock-free data structures. Locking when there is no contention is not an expensive operation!

Writing and using lock-free structures is usually harder than locking, except for simple scenarios. In order to maintain class invariants between the different data structures you use, sometimes a lock is simple and effective. You should be familiar with your language’s lock-free options, but don’t be swayed by the hype and “coolness” of lock-freedom, and use it when it’s actually required or where it saves you code. If it complicates your design, try to defer it until you’re convinced you actually need it.

4 Comments

  1. Oren Ellenbogen:

    I might be missing something, but you’re performance profiling looks a bit off.
    It seems that you’ll always need to call the map (MANY MANY times) to in order to do any operation. It doesn’t really matter if the value of the map is not used that often (thus it might be okay to lock the inbox object), it’s the map that you’re trying to avoid contention on.

  2. ripper234:

    This is exactly my point.

    The map is the global object, so it deserves special attention. I would start with a concurrent hashmap for its implementation. Another option is to have several (say 100) maps, each with its own lock – so the contention for each lock is much smaller (you can calculate which map holds which inbox by hashing its ID).

    An individual inbox, however, will get almost zero contention – so its implementation can be very naive. The gist here is to think carefully about how much contention each data structure will see, and choose its implementation accordingly.

  3. Oren Ellenbogen:

    looks like after you edited the post my comment became obsolete.. 🙂

  4. Tomer Gabel:

    Concurrency 101: think before you do. This applies to correct data structure choices as much as it does concurrency/synchronization layout, trading safety and simplicity for performance and complexity.

    In simpler terms: don’t do anything I wouldn’t do… 🙂