Skip to content

1) Avoiding a lossy database

Imagine a system that stores product data.

Suppose someone decides to delete a product from our database.

DELETE FROM product WHERE product.id = 1;

In a traditional database, this record is now gone for good, and unrecoverable (except for restoring from backups, which is expensive, time-consuming and notoriously unreliable!).

One common workaround is the use of a status column:

UPDATE product SET status = 'UNAVAILABLE'
WHERE product.id = 1;

The downside of this approach is that all queries to the product table now need to be aware of this workaround and add explicit clauses to their queries.

SELECT * FROM product WHERE status <> 'UNAVAILABLE'

Another downside is that we no longer have any historic record of when a status changed.

This is a trivial example but one that clearly demonstrates the fragility and complexity of managing time in data systems.

Using an immutable database

Using an immutable database we keep everything, including the history of any change to the database. Therefore, we can get back deleted data.

For example, let’s set up a scenario by inserting some product records:

Let’s pretend the day we are inserting these records is 2024-01-01.

2024-01-01
Run
Open in xt-play

Let’s query these products:

Run
Open in xt-play

A month later, someone deletes the product.

2024-02-01
Run
Open in xt-play

Let’s check that the bicycle is no longer in our database:

Run
Open in xt-play

The product is gone! Oh no!

However, don’t worry. Since the database is immutable, we can make a historical query different time. We can do this by adding a qualifier to the query:

Run
Open in xt-play

Conclusion

We’ve shown that it’s possible to use standard SQL to make historical queries against an immutable database, to bring back deleted data.

Now try out part 2.