Saturday, February 7, 2015

Extending Scredis, a Scala Redis Client, to Write Binary Keys

In my last post I discussed the possibility of reducing Redis key space usage for UUID-based keys by storing them as byte arrays, and by converting key namespaces from human-readable strings to integers. The conclusion of that post was that it is a bad idea(tm), but I'm going to do it anyway. TL;DR: if you just want the code, see the pull request.


The Current Scredis Interface


Creating and modifying keys with Scredis is wonderfully simple. It looks like this:
package scredis.commands

import org.scalatest._
import org.scalatest.concurrent._
import scredis._
import scredis.protocol.requests.StringRequests._
import scredis.util.TestUtils._

class BlogExampleSpec extends WordSpec
  with GivenWhenThen
  with BeforeAndAfterAll
  with Matchers
  with ScalaFutures {

  private val client = Client()
  private val SomeKey = "someKey"
  private val SomeValue = "HelloWorld!虫àéç蟲"

  Set.toString when {
    "setting a key that does not exist" should {
      "succeed" in {
        client.set(SomeKey, SomeValue)
        client.get(SomeKey).futureValue should contain(SomeValue)
      }
    }
  }

}
OK, great. Let's take that a step further and write our UUID values as byte arrays rather than UTF-8 strings. To do that, we will implement a Scredis Reader and Writer for the java.util.UUID type:
package scredis.commands

import java.nio.ByteBuffer
import java.util.UUID

import org.scalatest._
import org.scalatest.concurrent._
import scredis._
import scredis.protocol.requests.StringRequests._
import scredis.serialization.{Reader, Writer}
import scredis.util.TestUtils._

class BlogExampleSpec extends WordSpec
  with GivenWhenThen
  with BeforeAndAfterAll
  with Matchers
  with ScalaFutures {

  private val client = Client()
  private val SomeKey = UUID.randomUUID()
  private val SomeValue = UUID.randomUUID()

  // Decodes a 16-byte big-endian array back into a UUID
  implicit val uuidReader: Reader[UUID] = new Reader[UUID] {
    protected def readImpl(bytes: Array[Byte]): UUID =
      if (bytes.length != 16) {
        null // not a 16-byte UUID payload
      } else {
        // ByteBuffer is big-endian by default, matching the Writer below
        val bb = ByteBuffer.wrap(bytes)
        val msb = bb.getLong()
        val lsb = bb.getLong()
        new UUID(msb, lsb)
      }
  }

  implicit val uuidWriter = new Writer[UUID] {
    protected def writeImpl(value: UUID): Array[Byte] = {
      val bb = ByteBuffer.wrap(new Array[Byte](16))
      bb.putLong(value.getMostSignificantBits)
      bb.putLong(value.getLeastSignificantBits)
      bb.array()
    }
  }

  Set.toString when {
    "setting a key that does not exist" should {
      "succeed" in {
        client.set(SomeKey.toString, SomeValue)
        client.get[UUID](SomeKey.toString).futureValue should contain(SomeValue)
      }
    }
  }

}
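To see the conversion in isolation, without a Redis connection or Scredis on the classpath, here is a minimal sketch of the same round trip. The `UuidCodec` object and its method names are hypothetical, made up for this illustration, and the decoder returns an `Option` instead of `null`, which is a deliberate departure from the Reader above.

```scala
import java.nio.ByteBuffer
import java.util.UUID

// Hypothetical helper for illustration; not part of Scredis.
object UuidCodec {
  // Packs the two 64-bit halves of the UUID into 16 big-endian bytes.
  def toBytes(id: UUID): Array[Byte] = {
    val bb = ByteBuffer.allocate(16)
    bb.putLong(id.getMostSignificantBits)
    bb.putLong(id.getLeastSignificantBits)
    bb.array()
  }

  // Returns None when the payload is not exactly 16 bytes.
  def fromBytes(bytes: Array[Byte]): Option[UUID] =
    if (bytes.length != 16) None
    else {
      val bb = ByteBuffer.wrap(bytes)
      val msb = bb.getLong()
      val lsb = bb.getLong()
      Some(new UUID(msb, lsb))
    }
}
```

For any `id`, `UuidCodec.fromBytes(UuidCodec.toBytes(id))` should come back as `Some(id)`, and anything that is not 16 bytes long decodes to `None`.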
Pretty cool: we just got our 36-byte UUID value down to a 16-byte array. However, our goal was to have both UUID-based keys and values. To do that, we have to update Scredis to support Readers and Writers for keys, the same as it does for values. That's a pretty big interface change, so I can't paste it all here, but you can review the PR to see how it's done. As an example, here's how the interface of the set command changed:
  def set[W: Writer](
    key: String,
    value: W,
    ttlOpt: Option[FiniteDuration] = None,
    conditionOpt: Option[scredis.Condition] = None
  ): Future[Boolean]
becomes:
  def set[K: Writer, W: Writer](
    key: K,
    value: W,
    ttlOpt: Option[FiniteDuration] = None,
    conditionOpt: Option[scredis.Condition] = None
  ): Future[Boolean]
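If the `[K: Writer, W: Writer]` context bounds look opaque, the following is a rough sketch of the mechanism, assuming nothing about Scredis internals. The `Writer` trait here is a simplified stand-in defined only for this example (it is not `scredis.serialization.Writer`), and `KeyWriterSketch.encode` is a hypothetical function that just serializes both sides of the pair.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets
import java.util.UUID

// Simplified stand-in for a serialization type class; NOT the Scredis trait.
trait Writer[A] {
  def write(value: A): Array[Byte]
}

object Writer {
  // Instances in the companion object are found by implicit search automatically.
  implicit val stringWriter: Writer[String] = new Writer[String] {
    def write(value: String): Array[Byte] = value.getBytes(StandardCharsets.UTF_8)
  }

  implicit val uuidWriter: Writer[UUID] = new Writer[UUID] {
    def write(value: UUID): Array[Byte] = {
      val bb = ByteBuffer.allocate(16)
      bb.putLong(value.getMostSignificantBits)
      bb.putLong(value.getLeastSignificantBits)
      bb.array()
    }
  }
}

object KeyWriterSketch {
  // The context bounds ask the compiler for a Writer[K] and a Writer[W],
  // so the key is serialized by its own type class instance, just like the value.
  def encode[K: Writer, W: Writer](key: K, value: W): (Array[Byte], Array[Byte]) =
    (implicitly[Writer[K]].write(key), implicitly[Writer[W]].write(value))
}
```

With this in place, `KeyWriterSketch.encode(UUID.randomUUID(), "hello")` picks the UUID writer for the key and the string writer for the value, while a `String` key keeps working unchanged.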
Now that we have our handy new key writer interface, we can come back to our test spec and write our UUID key the way we want:
  Set.toString when {
    "setting a key that does not exist" should {
      "succeed" in {
        client.set(SomeKey, SomeValue)
        client.get[UUID, UUID](SomeKey).futureValue should contain(SomeValue)
      }
    }
  }
Sweet. Now both sides of the KV pair are binary encoded, and we've shaved 40 bytes off our original 72 bytes of data. The only thing left to do would be to add a binary namespace to our key, but I'll leave that to your imagination. Please don't do this at home, kids. As discussed in the previous post, the maintainability and debuggability of your data store are not worth sacrificing to save a few bytes :)
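For anyone who wants to check the 72-to-32 arithmetic, here is a small sketch that tallies the payload sizes. The `SizeTally` object is hypothetical, and it counts only the raw key and value bytes, not whatever per-key overhead Redis itself adds.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.util.UUID

// Hypothetical helper for illustration only.
object SizeTally {
  // Canonical text form: 32 hex digits plus 4 hyphens, all ASCII.
  def textSize(id: UUID): Int = id.toString.getBytes(UTF_8).length

  // Binary form: two 64-bit longs.
  val BinarySize: Int = 2 * java.lang.Long.BYTES

  def main(args: Array[String]): Unit = {
    val key = UUID.randomUUID()
    val value = UUID.randomUUID()
    val asText = textSize(key) + textSize(value)
    val asBinary = 2 * BinarySize
    // prints "text: 72 bytes, binary: 32 bytes, saved: 40 bytes"
    println(s"text: $asText bytes, binary: $asBinary bytes, saved: ${asText - asBinary} bytes")
  }
}
```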