How to turn 36 bytes into 16
de305d54-75b4-431b-adb2-eb6b9e546014
As a UTF-8 string, this form takes up one byte per character, or 36 bytes: 32 for the hexadecimal digits and 4 for dashes. What a waste! A UUID is simply a 16-octet (16-byte) integer. The largest value it can hold is 2^128 - 1, which gives 2^128 possible values:
340282366920938463463374607431768211456
As a UTF-8 string, this would be 39 bytes. Not only is that worse than 36 bytes, it's also close to the typical case: even the midpoint of the range (2^128/2) is a 39-digit number. But! Who says we have to represent this as a human-readable string? If we store the UUID as a raw byte array, it is of course only 16 bytes:
��£¬¾ý�ȹɀɱ��ʜͶϋ͍
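To make the three sizes concrete, here is a minimal sketch using Python's standard `uuid` module. The variable names are mine, and Python is simply a convenient choice for illustration; any language with a UUID type can do the same conversion.

```python
import uuid

text = "de305d54-75b4-431b-adb2-eb6b9e546014"
u = uuid.UUID(text)

# Three encodings of the same 128-bit value.
as_string = text.encode("utf-8")          # hex digits plus dashes
as_decimal = str(u.int).encode("utf-8")   # the integer written out in base 10
as_bytes = u.bytes                        # the raw 16 octets

print(len(as_string), len(as_decimal), len(as_bytes))  # 36 39 16
print(as_bytes.hex())  # de305d5475b4431badb2eb6b9e546014

# The conversion is lossless: the canonical string comes back unchanged.
assert str(uuid.UUID(bytes=as_bytes)) == text
```

Note that the raw form is not printable, which is why the byte array above looks like line noise when it is dumped into a page of text.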
Boom. We just reduced our data store size by something on the order of 50%. It won't reach the full 1 - 16/36 ≈ 56% reduction because there's overhead for every KV pair and there's potentially some other data stored (boolean, int, etc.), but that's still a huge win. Now my single instance of Redis can last me about twice as long (assuming a linear growth curve) before I need to worry about sharding, etc. Normally I wouldn't recommend relying on anything less than an order-of-magnitude gain for an architectural decision, but this is really just a thought experiment :)
Are you crazy?
Let's go further
standardUser-de305d54-75b4-431b-adb2-eb6b9e546014
groupNotification-de305d54-75b4-431b-adb2-eb6b9e546015
messageContent-de305d54-75b4-431b-adb2-eb6b9e546016
For these keys we're using about 16 bytes each just for namespacing. With 16 bytes we could represent 2^128 namespaces! (Conveniently the same size as a UUID :) How many namespaces do we really need for keys? 256? 65,536? Let's go with the latter. 2 bytes as a byte array gives us our 65k prefixes. We store these in an enum somewhere in the common lib for our project and bingo, our prefixes take 87.5% less space (2 bytes instead of 16):
�¬-de305d54-75b4-431b-adb2-eb6b9e546015
ɱ�-de305d54-75b4-431b-adb2-eb6b9e546015
�Ͷ-de305d54-75b4-431b-adb2-eb6b9e546015
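One way the shared enum might look, as a rough sketch. The `Namespace` class, its member values, and the `prefix` / `namespace_of` helpers are all hypothetical names invented for this illustration; the only real requirement is that each ID fits in 2 bytes.

```python
from enum import IntEnum


class Namespace(IntEnum):
    """Hypothetical registry of key namespaces; each value must fit in 2 bytes."""
    STANDARD_USER = 1
    GROUP_NOTIFICATION = 2
    MESSAGE_CONTENT = 3


def prefix(ns: Namespace) -> bytes:
    # Fixed-width, big-endian encoding: always exactly 2 bytes.
    return int(ns).to_bytes(2, "big")


def namespace_of(key: bytes) -> Namespace:
    # The first 2 bytes of any key identify its namespace.
    return Namespace(int.from_bytes(key[:2], "big"))


key = prefix(Namespace.GROUP_NOTIFICATION) + b"-" + b"de305d54-75b4-431b-adb2-eb6b9e546015"
print(len(key))           # 39 (2 + 1 + 36)
print(namespace_of(key))  # the GROUP_NOTIFICATION member
```

Keeping the enum in one shared place matters more than the encoding itself: once two services disagree about what ID 2 means, the keys are unreadable in every sense.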
When we combine the two approaches together, we turn an average key length of 16 + 1 + 36 = 53 bytes into 2 + 1 + 16 = 19 bytes, for an average savings of 64%. Awesome:
�¬-��£¬¾ý�ȹɀɱ��ʜͶϋ͍
ɱ�-��£¬¾ý�ȹɀɱ��ʜͶϋ͍
�Ͷ-��£¬¾ý�ȹɀɱ��ʜͶϋ͍
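Putting both tricks together might look something like the sketch below. `compact_key` and `parse_key` are hypothetical helpers, and the namespace ID `2` is an arbitrary stand-in for whatever the project's enum assigns to `groupNotification`.

```python
import uuid
from typing import Tuple

KEY_LENGTH = 2 + 1 + 16  # namespace prefix + dash + raw UUID


def compact_key(namespace_id: int, uuid_text: str) -> bytes:
    """Build a 19-byte key: 2-byte namespace, a dash, then the 16 raw UUID bytes."""
    return namespace_id.to_bytes(2, "big") + b"-" + uuid.UUID(uuid_text).bytes


def parse_key(key: bytes) -> Tuple[int, str]:
    """Recover (namespace_id, canonical UUID string) by fixed offsets."""
    if len(key) != KEY_LENGTH or key[2:3] != b"-":
        raise ValueError("not a compact key")
    return int.from_bytes(key[:2], "big"), str(uuid.UUID(bytes=key[3:]))


readable = b"groupNotification-de305d54-75b4-431b-adb2-eb6b9e546015"
compact = compact_key(2, "de305d54-75b4-431b-adb2-eb6b9e546015")

print(len(readable), len(compact))  # 54 19
print(parse_key(compact))           # (2, 'de305d54-75b4-431b-adb2-eb6b9e546015')
```

The parser slices by position rather than splitting on the dash, because the raw UUID bytes can themselves contain 0x2D. With both fields fixed-width the dash is purely cosmetic, and dropping it would save one more byte.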
Yea, this is crazy
Bah humbug. As much as I hate to admit it, this is right: the grist just isn't worth the grind. You win this time, antirez!
"Very short keys are often not a good idea. There is little point in writing "u1000flw" as a key if you can instead write "user:1000:followers". The latter is more readable and the added space is minor compared to the space used by the key object itself and the value object. While short keys will obviously consume a bit less memory, your job is to find the right balance."
That said - in my next post I'm still going to implement an extension to Scredis to do this anyway, just for kicks.