How do you design for relational data in a non-relational database, i.e. NoSQL database, like Amazon’s DynamoDB or mongoDB?  Pretty much all data has some sort of relationship or connection to other data, so how do we model this in a non-relational database?

You have a couple of options.

Here are some solutions that I thought of whole designing the db structure on my new startup, Route Scout.

  1. Mock a relational db design (not good.)
    You could use a typical relational design by storing an array of ID’s in the parent table that references records in another table.  Since you don’t have access to join operators when querying the table, you have to make multiple read queries to get back all of the data.  You may even have to make one query for every record if your db doesn’t support some kind of IN operator (Dynamodb doesn’t).  This is slow and costly, so lets think of a better solution.
  2. De-normalize your data (gasp!)
    Replicate a subset of the data from the child table in the parent table.  This makes reads very quick and is the way to go.  Read below…

De-normalization Design Pattern

In our de-normalized database, we’ll store some data from a child table in each related table.  We only store the data that we need to query, so our database doesn’t get too bloated.  When you query the parent table, you’ll get back all of the related data because it’s stored with each record in the parent table.  One query gets you all of the data you need.  And it’s super fast because the database isn’t making any joins.

You need to think about the different queries you’ll need to run.  The types of queries you’ll need will direct the design of your database, so make sure you thoroughly think about the different use cases when you start designing your database.  Let’s go through an example:

Example

Say we need two tables: Users and Trips.  Users can plan zero or more trips.  This is a simplified version of the actual database used by Route Scout.

Use Cases

  1. List a user’s trips (display the trip destination name and photo)
  2. Display a trip, along with the user who planned it (we only need the user’s ID, name, and photo)
  3. Create or edit a trip

DB Schema

From the use cases, we design a simple schema as shown below:

No-SQL DB blog post (3)

We’ve duplicated some of the User data in the Trips table, and some of the Trip data in the Users table.  I call the subset of duplicated data “summary” data.  This allows us to make very simple queries and get all of the data we need.   Using DynamoDB, I store the summary data in JSON format.

Actual Data

Here’s an example of the actual data stored in DynamoDB.  Notice how the trip & user summary data is stored as a JSON string.  Nice and compact!
Trip DB table

routescout-data

Fast Reads, Slow Writes

Reads are extremely fast because you’re typically making only one db query to get back a record and the related data that you need.  Writes on the other hand, are slow.  When you create or edit a child record, you need to update that record in all places where it’s stored.

Update all duplicated data in parallel

You can execute these multiple write queries in parallel to save time.  Note that if you’re on DynamoDB, you’ll need a high write capacity to execute these calls at the same time.

With this approach, the write methods in your business layer become a bit more complex than in a traditional relational database.  Unit testing very critical here to ensure data is updated in all places in which it’s stored.

Example

Update trip destination title to “Breckenridge, CO” for a trip with 3 travelers, user A, B, and C, execute each of the following in parallel:

  • Update trip record
    trip:
    { ID: "123", Destination: { Title: "Breckenridge, CO" }, User: { ID: "user-A" }, Travelers: [ { ID: "user-B" }, { ID: "user-C" } ] }
  • Update trip summary on user A
    user:
    { ID: "user-A", Trips: { Title: "Breckenridge, CO" } }
  • Update trip summary on user B
    user: 
    { ID: "user-B", Trips: { Title: "Breckenridge, CO" } }
  • Update trip summary on user B
    user: 
    { ID: "user-C", Trips: { Title: "Breckenridge, CO" } }

Conclusion

Working with a NoSQL database like Amazon’s DynamoDB or MongoDB can be tricky, especially coming from a traditional relational database background like SQL Server or MySQL.  The benefits of a NoSQL db are clear in my opinion: easy to scale horizontally, cheap to operate with high-availability even at low-volume using a cloud-hosted service, and flexible schemas (each row can have different fields).  It takes a bit of patience to learn the nuances and best practices of a NoSQL database, but it’s worth it for sure.  With proper planning and thought when designing your db structure, you’ll end up with a database that’s easy to work with and very quick.  Make sure you unit test all of the methods that interact with the database!

Have a question or comment?  I’d love to hear your feedback.

Follow me @bobbychaz

Like what you read?  Share this post with your friends below.