Quick way to emulate a database in Rust


Recently I was working with shuttle.rs, a Backend-as-a-Service platform built on Rust, to build a simple web app with Axum and Postgres. I have a GET handler and a POST handler that read from and write to the database, which is provisioned by Shuttle:

async fn axum(
    #[shuttle_shared_db::Postgres] pool: PgPool,
) -> shuttle_axum::ShuttleAxum {
    let router =
        Router::new()
            .route("/posts/:id", get(read_post))
            .route("/new-post", post(create_post));

    Ok(router.into())
}

async fn read_post() {
    // Read from database
}

async fn create_post() {
    // Write to database
}

(I’m omitting the extractors for id and body for brevity. Normally the read_post handler will be given the id from the request path, and the create_post handler will be given the request body.)

While wiring the server’s routes up to the frontend, I wanted to test these handlers, but I dreaded the prospect of spinning up a whole Postgres instance and doing all the schema migrations that entails. Moreover, I would need to modify the handlers read_post and create_post to pass in the PgPool from sqlx by using closures, of which I am not very fond.

All I need is a global mutable key-value store. For testing purposes, it does not need to persist, i.e., it can be an in-memory struct.

Easy enough: just create a HashMap in the axum function and pass it to all handlers, which also lets us avoid global state.

type Store = HashMap<String, Bytes>;

async fn axum(...) {
    let store: Store = HashMap::new();
    let router =
        Router::new()
            .route("/posts/:id", get(read_post))
            .route("/new-post", post(create_post))
            .with_state(store);

    Ok(router.into())
}

async fn read_post(State(store): State<Store>) {
    // Read from store
}

async fn create_post(State(mut store): State<Store>) {
    // Write to store
}

This compiles, but how does Axum pass store to each handler? The lifetime of store is bounded by the axum function, but the handlers live much longer than that, so they can’t be borrowing the value held in store. Nor can store be moved into a handler, since there are two handlers here and a value can only be moved once. The only possibility is that the state is cloned for a handler every time it is invoked. That is quite inefficient and, worse, it means create_post writes to its own throwaway copy, so read_post never sees the new post. What we want is a global singleton database that all handlers can access.
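To make the cloning problem concrete, here is a minimal sketch that uses only the standard library. The functions write_to_clone and write_to_shared are hypothetical stand-ins for a handler that receives a cloned state, and I use Vec<u8> in place of Bytes so that nothing outside std is needed. The second half wraps the map in Arc<Mutex<...>>, which is one common way to make clones share a single map; it is shown only for contrast and is not the approach taken in the rest of this post.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Assumption: Vec<u8> stands in for Bytes to keep the sketch std-only.
type Store = HashMap<String, Vec<u8>>;

/// Hypothetical stand-in for a handler that is handed its own clone of the
/// state. The insert lands in the clone, which is dropped on return.
fn write_to_clone(mut store: Store) {
    store.insert("1".to_string(), b"hello".to_vec());
}

/// Same idea, but the state is a cheap-to-clone handle to one shared map.
fn write_to_shared(store: Arc<Mutex<Store>>) {
    store
        .lock()
        .expect("lock poisoned")
        .insert("1".to_string(), b"hello".to_vec());
}

/// Returns (entries in the original map after a write to its clone,
///          entries in the shared map after a write through a cloned handle).
fn demo_clone_vs_shared() -> (usize, usize) {
    let plain: Store = HashMap::new();
    write_to_clone(plain.clone());

    let shared = Arc::new(Mutex::new(Store::new()));
    write_to_shared(Arc::clone(&shared));

    let shared_len = shared.lock().expect("lock poisoned").len();
    (plain.len(), shared_len)
}
```

Calling demo_clone_vs_shared() returns (0, 1): the write to the cloned HashMap is lost, while the write through the shared handle is visible to everyone holding that handle.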

Hmm. How about we switch to static variables?

Static variables are indeed guaranteed to exist throughout the program’s life and their memory locations are fixed, so that achieves our singleton requirement:

use std::collections::HashMap;

static mut STORE: Store = HashMap::new();

error[E0015]: cannot call non-const fn `HashMap::<String, Bytes>::new` in statics
  |
3 | static mut STORE: Store = HashMap::new();
  |                           ^^^^^^^^^^^^^^
  |
  = note: calls in statics are limited to constant functions, tuple structs and tuple variants
  = note: consider wrapping this expression in `Lazy::new(|| ...)` from the `once_cell` crate: https://crates.io/crates/once_cell

Unfortunately, we cannot call non-const functions, such as HashMap::new(), in a const or static context. HashMap::new() is not a const fn because it builds a RandomState, which seeds the hasher with random keys that are only available at runtime.1 To declare a global variable like this, we would normally reach for the lazy_static! macro, or for OnceCell or Lazy from the once_cell crate, but since Rust 1.70.0 we can do it without pulling in an additional dependency, using OnceLock from the standard library:

use std::collections::HashMap;
use std::sync::OnceLock;

static STORE: OnceLock<Store> = OnceLock::new();
// ...
async fn read_post() {
    let db = STORE.get_or_init(|| HashMap::new());
    // Read from db
}
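As a quick sanity check that get_or_init() really gives us a singleton, here is a small std-only sketch. The INIT_CALLS counter and the store() and init_calls_after_two_reads() helpers are names I made up for illustration, and Vec<u8> again stands in for Bytes.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Assumption: Vec<u8> stands in for Bytes to keep the sketch std-only.
type Store = HashMap<String, Vec<u8>>;

static STORE: OnceLock<Store> = OnceLock::new();
// Hypothetical counter that records how often the init closure runs.
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

fn store() -> &'static Store {
    STORE.get_or_init(|| {
        INIT_CALLS.fetch_add(1, Ordering::SeqCst);
        HashMap::new()
    })
}

/// Reads the store twice and reports how many times it was initialized.
fn init_calls_after_two_reads() -> usize {
    let first = store();
    let second = store();
    // Both calls hand back a reference to the very same map.
    assert!(std::ptr::eq(first, second));
    INIT_CALLS.load(Ordering::SeqCst)
}
```

The closure runs on the first call only, so init_calls_after_two_reads() returns 1, and every caller gets a shared &'static reference to the same map.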

With this OnceLock<Store>, we can still only read db, not write to it, since get_or_init() only hands out a shared (immutable) reference. If we were to write static mut STORE instead, it would require unsafe to use, as a mutable static is not thread-safe:

error[E0133]: use of mutable static is unsafe and requires unsafe function or block
 --> src/main.rs:9:14
  |
9 |     let db = STORE.get_or_init(|| HashMap::new());
  |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ use of mutable static
  |
  = note: mutable statics can be mutated by multiple threads: aliasing violations or data races will cause undefined behavior

Instead, we can wrap the HashMap in a Mutex, which makes it thread-safe yet still modifiable without static mut:

use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

static STORE: OnceLock<Mutex<Store>> = OnceLock::new();

async fn read_post() {
    let db = STORE
        .get_or_init(|| Mutex::new(HashMap::new()))
        .lock()
        .expect("Failed to get database");
    // Read from db
}

async fn create_post() {
    let mut db = STORE
        .get_or_init(|| Mutex::new(HashMap::new()))
        .lock()
        .expect("Failed to get database");
    // Write to db
}

A little bit of refactoring and we get:

use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

fn db() -> &'static Mutex<Store> {
    static STORE: OnceLock<Mutex<Store>> = OnceLock::new();
    STORE.get_or_init(|| Mutex::new(HashMap::new()))
}

async fn read_post() {
    let db = db().lock().expect("Failed to get database");
    // Read from db
}

async fn create_post() {
    let mut db = db().lock().expect("Failed to get database");
    // Write to db
}

And there you go: a simple and digestible function that returns a global mutable singleton key-value in-memory store.2
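For completeness, here is a sketch of what the handler bodies might look like with the comments filled in, written against std only so it can be run without Axum. The helpers save_post, load_post, and write_from_threads are hypothetical names of mine, and Vec<u8> stands in for Bytes.

```rust
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};
use std::thread;

// Assumption: Vec<u8> stands in for Bytes to keep the sketch std-only.
type Store = HashMap<String, Vec<u8>>;

fn db() -> &'static Mutex<Store> {
    static STORE: OnceLock<Mutex<Store>> = OnceLock::new();
    STORE.get_or_init(|| Mutex::new(HashMap::new()))
}

/// Hypothetical body of `create_post`: insert or overwrite a post.
fn save_post(id: &str, body: &[u8]) {
    let mut store = db().lock().expect("Failed to get database");
    store.insert(id.to_string(), body.to_vec());
}

/// Hypothetical body of `read_post`: return a copy of the post, if any.
fn load_post(id: &str) -> Option<Vec<u8>> {
    let store = db().lock().expect("Failed to get database");
    store.get(id).cloned()
}

/// Writes `n` posts from `n` threads, then counts how many can be read back.
fn write_from_threads(n: usize) -> usize {
    let handles: Vec<_> = (0..n)
        .map(|i| thread::spawn(move || save_post(&format!("post-{i}"), b"hello")))
        .collect();
    for handle in handles {
        handle.join().expect("writer thread panicked");
    }
    (0..n)
        .filter(|i| load_post(&format!("post-{i}")).is_some())
        .count()
}
```

After write_from_threads(4), all four posts can be read back from any thread. One caveat if you move this into real async handlers: keep the lock scoped as above and drop the guard before any .await, because a std MutexGuard is not Send and a handler future that holds one across an await point will not be Send either.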

Footnotes

  1. However, Vec::new() is const fn and can be called in static context. This is because Vec::new() needs nothing from the runtime (it doesn’t allocate until you push elements into it), while HashMap::new() calls RandomState::new(), which picks random hash keys at runtime.

  2. Thanks to this StackOverflow answer for inspiring this blog post.