Key/Value Server Part One

Creating a lightweight in-memory key/value storage system.

In part one we create a key/value server that stores data in memory. In part two we'll add a few lines of code to make the server persist its data to disk on shutdown and restore them again on startup. Finally, in part three, we'll refactor the code so that it conforms to Erlang/OTP conventions.

Complete code for this tutorial is available in the GitHub repository.

Motivation

The native dict module in Erlang provides an excellent data structure for storing pairs of keys and values, but there are some use cases where it can be cumbersome to manage the state of a dictionary. For example, some applications have cross-cutting concerns, like caching, which require access to state in many places. In such scenarios, one must transform the dictionary into a key/value server which can be accessed by any client willing to make a remote procedure call.

Client API

Although "remote procedure call" is a big scary phrase, the client API for the key/value server is exceedingly simple. Once the server is running, any client can save a value into the store by calling

The value returned to the client is a unique string containing the hash of the stored value. When a client wishes to retrieve the value again, it can pass the hash as the only argument to

which returns the stored value or, if the key was not found, the atom not_found. Likewise, the value can be removed from the server by passing the hash into

which always returns the atom ok, even if the key is not found.

That rpc/1 function being used in the client API is a very simple send/receive block that reduces the amount of repeated code. Here's what it looks like:

The three simple functions above, along with start/0 and stop/0 are the entire client API for this project. Let's look now at the server; it's almost as simple as the client.

The Server Implementation

The first operation that every server needs to handle is starting itself up. Our server has a simple function called start which takes no arguments.

Starting the crypto OTP application is required to create the hashed keys. Next, we spawn a server process with an empty dictionary as it's only argument and register that process with the name kv. Doing so allows our rpc function (above) to talk with the server without having to keep track of the server process.

Inside the spawned server process, it runs the server_loop/1 function, which takes a dictionary as its only argument, then immediately enters a receive block. The receive block has four patterns to match the four operations that can be requested by a client. First, we need to store values:

We create a hash of the value, then use that hash as the key to save with the value in the state dictionary, Dict. We then respond to the client a message containing the new Hash that uniquely represents the stored value, V, and loop back to process the next incoming message.

When a client asks to lookup a value by its hash key, the server checks to see if the key is stored in the state dictionary, Dict, and returns to the client the associated value or the atom not_found if the key doesn't exist:

A request to delete a key/value pair is almost the same as a lookup:

If the key exists in the state dictionary, we remove it and update the state for the next server loop. Whether the key was in the dictionary or not, we send the client a simple reply with the atom ok.

The final server operation is to stop running:

In addition to the server loop, there are two utility functions that handle creation of the stringified hashes:

There are two steps to creating each hash:

  1. Calculate a 160-bit SHA1 binary hash of the data value.
  2. Convert the 160-bit binary hash to a HEX-encoded string.

The complicated aspect is formatting the binary hash as a string. The magic of that operation is due to the format string "~40.16.0b". The three parts of that string have special meanings:

And that's all the code for the entire client-server system! (Actually, if you look in the git repo you'll find a bunch of unit tests too, but those don't count.) Let's try it out in the Erlang shell:

> erl
Eshell V6.3  (abort with ^G)
1> c(kv).
{ok,kv}
2> kv:start().
ok
3> kv:lookup("Hello").
not_found
4> K1 = kv:store("Cole lives in Canada.").
"66de350ed22175cca006caf8ea8fd1bf80f44e1c"
5> kv:lookup(K1).
"Cole lives in Canada."
6> kv:stop().
{stop}

Next Steps

In the next part of this series, we'll adjust this key/value storage server to write its state to disk at shutdown, then reload that data when it starts up again.