Using change data capture to perform flexible aggregations with DynamoDB and Druid

DynamoDB is often a perfect fit as the primary, operational system of record store for many types of application. It is fast, maintenance free and (if you use it well) economical. However it cannot perform aggregations or provide analytics on the data it holds. Reflecting the same data in another store like Apache Druid is commonplace. The below video demonstrates this idea in operation. The DynamoDB system of record is updated and Apache Druid is then used to perform aggregations on up to date values....

August 7, 2022 · Alex Reid

DynamoDB pagination with page numbers in URLs

Back when we all used SQL databases, it was common to paginate through large result sets by appending LIMIT offset, rows per page to a SELECT query. Depending on the schema, data volume and database engine, this was inefficient to varying degrees. On smaller result sets and with the right indexes, it was… posssibly OK. On larger result sets, the high page numbers would get progressively slower. Databases like DynamoDB prevent this inefficiency by handling pagination differently....

October 27, 2021 · Alex Reid

Filtering and pagination with Cloud Bigtable

In the previous series of posts, we built a data model capable of filtering and paginating product comments with DynamoDB. This post explores how we could solve the same problem with Cloud Bigtable. You might wonder why another technology is now being discussed. It is my belief that a lot of the thinking that goes into a data model design is somewhat portable, whether it be DynamoDB, Cloud Bigtable, Cassandra, HBase, or maybe even Redis....

December 2, 2020 · Alex Reid

Storage and retrieval of comment statistics in DynamoDB, using index overloading and sparse indexes

This series of posts demonstrates efficient filtering and pagination with DynamoDB. Part 1: Duplicating data with Lambda and DynamoDB streams to support filtering Part 2: Using global secondary indexes and parallel queries to reduce storage footprint and write less code Part 3: How to make pagination work when the output of multiple queries have been combined Part 4: Storage and retrieval of comment statistics using index overloading and sparse indexes...

November 25, 2020 · Alex Reid

DynamoDB pagination when multiple queries have been combined

This series of posts demonstrates efficient filtering and pagination with DynamoDB. Part 1: Duplicating data with Lambda and DynamoDB streams to support filtering Part 2: Using global secondary indexes and parallel queries to reduce storage footprint and write less code Part 3: How to make pagination work when the output of multiple queries have been combined Part 4: Storage and retrieval of comment statistics using index overloading and sparse indexes...

November 24, 2020 · Alex Reid

Filtering with GSIs and parallel queries in DynamoDB

This series of posts demonstrates efficient filtering and pagination with DynamoDB. Part 1: Duplicating data with Lambda and DynamoDB streams to support filtering Part 2: Using global secondary indexes and parallel queries to reduce storage footprint and write less code Part 3: How to make pagination work when the output of multiple queries have been combined Part 4: Storage and retrieval of comment statistics using index overloading and sparse indexes...

November 21, 2020 · Alex Reid

Filtering without using filters in DynamoDB

It never ceases to amaze me just how much is possible through the seemingly constrained model that DynamoDB gives us. It’s a fun puzzle to try to support access patterns beyond a simple key value lookup, or the retrieval of an ordered set of items. The NoSQL gods teach us to store data in a way that mirrors our application’s functionality. This is often achieved by duplicating data so that it appears in multiple predefined sets for inexpensive retrieval....

November 9, 2020 · Alex Reid