diff --git a/Database-Design.md b/Database-Design.md
index dab6b5f..c25b03d 100644
--- a/Database-Design.md
+++ b/Database-Design.md
@@ -1,3 +1,54 @@
-Using sled.
+# Database Design
 
-Tree -> key subspace -> delineator -> key name = value
\ No newline at end of file
+The dicebot uses [Sled](https://sled.rs/). Sled is a fast, embedded key-value database that stores keys and values as byte arrays.
+
+Why Sled?
+
+* Mostly because I felt like it.
+
+## Guiding Principles
+
+Sled essentially operates as a bunch of `Map<[u8], [u8]>`s (that is, maps from byte slice keys to byte slice values), which means both keys and values can be pretty much any arbitrary type. To keep things simple, we follow these guidelines:
+
+* Separate `Tree`s for different types of data. A `Tree` in Sled is an isolated keyspace.
+* Strongly typed access to keys, converted into `Vec<u8>` keys by private code.
+* Keys are UTF-8 strings, sometimes separated by `0xfe` and `0xff` delimiter bytes. The bytes `0xfe` and `0xff` never occur in valid UTF-8, which makes them useful delimiters.
+
+### Trees, Keys, and Delimiters
+
+Sled supports opening a number of `Tree` instances, each of which acts as an isolated set of key-value pairs. We use trees to categorize data, much like tables in a relational SQL database.
+
+* In general, a single tree should store one kind of data.
+* Ideally, a tree does not use more than one key format.
+
+Keys in a tree are how data is queried, and thus should make sense for the data type. A key often requires some levels of sub-categorization. For example, user variables are defined per room. Thus, the key for a user variable is composed of the `username`, `room ID`, and the actual name of the variable. The delimiters `0xff` and `0xfe` are used for this sub-categorization.
+
+* The `0xff` delimiter is used to split logically separate parts of the key. In the case of user variables, this separates the where (username + room ID) from the what (variable name).
+* The `0xfe` delimiter is used to split related parts of the key into narrower categories. For user variables, the username and room ID are split by `0xfe`.
+
+A key format should be designed with the intended queries in mind. The format rarely matters when looking up a single value, but it is important when scanning multiple keys. Using delimiters enables clever use of Sled's API: a key can be partially built up to a delimiter, and the `scan_prefix` function then finds all entries whose keys begin with that prefix.
+
+### Values
+
+The value stored in a tree can be anything, including serialized structs. Following the guidelines above, one tree should store one kind of data.
+
+* For types with a size known at compile time (simple types or anything whose size in memory is not variable), zero-copy serialization using the `zerocopy` crate is preferred.
+* For complex structs with arbitrary sizes (strings, lists, etc.), the `bincode` crate is preferred.
+
+Where possible, higher-level Sled APIs should be used. Batches, `compare_and_swap`, and `fetch_and_update` are all very useful. In cases where this is not possible, transactions should be used to preserve atomicity.
+
+# High-Level Architecture Overview
+
+How things are designed.
+
+# Design: User Variables
+
+This documents the database design of user-defined variables.
+
+## Database Schema
+User variables are implemented with two trees:
+
+* `room_user_variables`: Key format `<username> 0xfe <room ID> 0xff <variable name>`. Value type is `i32`.
+* `room_user_variable_count`: Key format `<username> 0xfe <room ID>`. Value type is `i32`.
+
+The `room_user_variables` tree contains the actual variables and their values, as defined by the user. The `room_user_variable_count` tree keeps track of how many variables a user has defined on a per-room basis. APIs update both trees atomically in a transaction.
\ No newline at end of file
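
The delimiter scheme for user variable keys can be illustrated with a small sketch. It is not the dicebot's actual code: the names `user_room_prefix`, `variable_key`, and the constants are hypothetical. To keep it free of external crates, a `BTreeMap` stands in for a Sled tree, and the local `scan_prefix` function only imitates what Sled's `scan_prefix` does on a real `Tree`.

```rust
use std::collections::BTreeMap;

// Hypothetical names for the two delimiter bytes described in the document.
/// Splits logically separate parts of a key (the "where" from the "what").
const PART_DELIMITER: u8 = 0xff;
/// Splits related parts of a key into narrower categories.
const SUB_DELIMITER: u8 = 0xfe;

/// Builds the prefix shared by every variable of one user in one room:
/// username bytes, 0xfe, room ID bytes, 0xff.
fn user_room_prefix(username: &str, room_id: &str) -> Vec<u8> {
    let mut key = Vec::with_capacity(username.len() + room_id.len() + 2);
    key.extend_from_slice(username.as_bytes());
    key.push(SUB_DELIMITER);
    key.extend_from_slice(room_id.as_bytes());
    key.push(PART_DELIMITER);
    key
}

/// Builds the full key for one variable: the prefix plus the variable name.
fn variable_key(username: &str, room_id: &str, variable_name: &str) -> Vec<u8> {
    let mut key = user_room_prefix(username, room_id);
    key.extend_from_slice(variable_name.as_bytes());
    key
}

/// Stand-in for a prefix scan over an ordered byte-keyed store (assumption:
/// a BTreeMap replaces the Sled tree here). Keys sharing a prefix are
/// contiguous in byte order, so we start at the prefix and stop at the
/// first key that no longer begins with it.
fn scan_prefix<'a>(
    tree: &'a BTreeMap<Vec<u8>, i32>,
    prefix: &'a [u8],
) -> impl Iterator<Item = (&'a Vec<u8>, &'a i32)> + 'a {
    tree.range(prefix.to_vec()..)
        .take_while(move |(key, _)| key.starts_with(prefix))
}
```

Because the prefix ends with the `0xff` delimiter, a scan for one user and room cannot accidentally match a different room whose ID merely starts with the same characters. With Sled itself, passing the same prefix bytes to `scan_prefix` on the `room_user_variables` tree would play the role of the stand-in function.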