Ruby Mutex Mayhem (Part 2)

ruby-mutex-mayhem-part-2

Introduction

Last month I shared a simple but effective strategy of how to create a Redis Mutex in your Ruby applications to handle cross-process/cross server synchronisation. As a quick recap, the problem this is solving is if you need to use locking synchronisation to control access across multiple systems to a resource. In this case a standard Ruby Mutex won't help you as you are bound within the constraints of a single process. And a file mutex (for example) won't help you because you're bound within the constraints of a single server (NAS implementations aside :)

Now, looking at the RedisMutex implementation we shared last week, there is an obvious but slightly tricky in-efficiency with it. To illustrate this let me refresh your memory by outlining the essential workflow below:

1. try and take the lock (atomically) --> fail
2. wait <recheck frequency seconds>> 
3. try and take the lock (atomically) --> fail
4. wait <recheck frequency seconds>> 
5. try and take the lock (atomically) --> success!

So what you see is that we have a recheck frequency seconds during which time we are waiting (for simplicity lets call this recheck_wait from here on). If we make recheck_wait very small then we are going to end up hitting our Redis server a lot (causing unnecessary load there) and if we make recheck_wait too big then we end up with periods of time where the lock is released, but we are still waiting! To get around this we can extend our RedisMutex class to make use of Redis Key Event Notifications!

As with the last post, I'll talk through the way that we implemented the code below. Please feel free to use it or reach out if you have any feedback or comments. Remember to always understand the code you're running and using it is at your own risk; Cloud 66 can not be held liable.

This time you'll need a Redis Server running at least version 2.8 to use this Mutex.

Implementation

1. Redis Initializer

Once a redis connection is in a subscription mode, it can only perform subscription related actions - so we need a pool of redis connections to use exclusively for our subscription events. To manage this pool we use Mike Perham's excellent connection_pool gem. (Unrelated: we're huge fans of Mike's sidekiq gem too)
Redis needs to be told that it should raise key events for us. This should be done when your application starts up. You can choose which events to receive notifications on - we need the following ones:
K: Keyspace events published with __keyspace@DB
g: Generic commands like DEL, EXPIRE, RENAME, ...
x: Expired events (events generated every time a key expires)

# Redis Initializer Implementation
require 'connection_pool'

# this is a pool for our regular redis connections
$conn_pool = ConnectionPool.new do 
  Redis.new(:host => redis_server, :port => 6379)
end

# this is a pool for our subscription redis connections
# note: choose size and timeout that suit your own needs
# if your size is not large enough you may run out of connections
size = 5 
timeout = 5
$subs_pool = ConnectionPool.new(size: size, timeout: timeout) do
  Redis.new(:host => redis_server, :port => 6379)
end

# tell redis we want notifications for specific key events
$subs_pool.with do |conn| 
  conn.config(:set, 'notify-keyspace-events', 'Kgx')
end

2. Mutex Implementation

Instead of waiting a set time for keys to expire we can now simply subscribe to specific Redis del or expired events that are raised for our key; and move on in that case.

# cross-process/cross-server mutex
class RedisMutex

  attr_accessor :global_scope,
                :max_lock_time

  LOCK_ACQUIRER = "return redis.call('setnx', KEYS[1], 1) == 1 and redis.call('expire', KEYS[1], KEYS[2]) and 1 or 0"
  KEY_SPACE_PREFIX = ' __keyspace@0__ :'
  DEL_OR_EXPIRE_EVENTS = Set.new(['del', 'expired'])

  def initialize(global_scope, max_lock_time)
    # the global scope of this mutex (i.e "resource")
    @global_scope = global_scope
    # max time in seconds to hold the mutex
    # (in case of greedy deadlock)
    @max_lock_time = max_lock_time
  end

  def synchronise(local_scope = :global, &block)
    # get the lock
    acquire(local_scope)
    begin
      # execute the actions
      return block.call
    ensure
      # release the lock
      release(local_scope)
    end
  end

  private

  # attempt to acquire the lock
  def acquire(local_scope = :global)
    # construct the mutex key; the local scope
    # of this mutex (i.e "resource_id")
    mutex_key = "#{@global_scope}.#{local_scope}"

    # while statement will either get the lock and
    # set the expiry on the lock or do neither and return 0
    while $conn_pool.with {|conn| conn.eval(LOCK_ACQUIRER, [mutex_key, @max_lock_time])} != 1 do
      # wait and try again (until we can get in)
      $subs_pool.with do |conn| 
        conn.subscribe("#{KEY_SPACE_PREFIX}#{mutex_key}") do |on|
          on.subscribe do |channel, count|
            # unsubscribe if the key was
            # delete before we got here
            key_exists = $conn_pool.with do |pool_conn|
               pool_conn.exists(key_name)
            end            
            conn.unsubscribe unless key_exists
          end
          on.message do |channel, message|
            # unsubscribe if we get a del
            # or expired event for the key
            if DEL_OR_EXPIRE_EVENTS.include?(message)
              conn.unsubscribe 
            end
          end
        end
      end
    end
  end

  # release the lock
  def release(local_scope = :global)
    # return value indicating whether the lock was currently held
    mutex_key = "#{@global_scope}.#{local_scope}"
    return $conn_pool.with {|conn| conn.del(mutex_key)} == 1
  end
end

How to use it

Now, see the simple usage sample below (exactly the same as we did it before)

MY_MUTEX = RedisMutex.new('server_access', 10.seconds)
...
MY_MUTEX.synchronise(server.id) do
  # do some stuff here that needs to be synchronised
  # for this resource (across all application instances) 
end

Conclusion

In this blog post, we took our previous implementation of a multithreading/cross process or server synchronization solution, and enhanced it for better throughput performance.

Happy coding!

Check out part I - 'Ruby Mutex Mayhem'.