DynamoDB vs Fauna: Terminology and features
Introduction
DynamoDB’s roots originate from an Amazon internal project called "Dynamo", a database built to address the many growing e-commerce needs of the online shopping service. Inspired by Dynamo and its techniques, DynamoDB provides a database with most operations and management automated behind the scenes. Of the many NoSQL databases out there, perhaps DynamoDB is the closest to Fauna, where both databases share a similar value proposition as "serverless databases". While DynamoDB’s on-demand pricing/scaling model lends itself to the "serverless" philosophy, it misses the mark on developer experience when it comes to multi-region transactions, schema flexibility, geo-distribution, burst scaling, and required developer operations.
As for business viability: evolving a schema over time with DynamoDB can be an arduous experience given the lack of support for relations and joins. As applications grow, they will likely encounter significant technical debt, which typically rears its head in features that cannot be changed without recreating the entire table. DynamoDB better serves mature and proven businesses, where all data and CRUD operations are well understood ahead of time. Furthermore, DynamoDB’s query API doesn’t provide extensive functionality for computations, aggregates, etc., requiring either long-term storage (and updating) of such calculated results or a costly layer of server-side logic (along with an increased surface area for bugs).
With the recent wave of serverless adoption, Fauna seeks to be the uncompromising data API for client-serverless applications. Fauna offers a functional query language on top of a strongly consistent distributed database engine capable of modeling relational, document, and graph data. Fauna's value proposition improves on traditional database offerings, by converting the underlying database management infrastructure into a Data API that is well-suited for direct client-side usage, allowing backend developers to focus on the server-side logic which matters the most. The notion of database developer operations does not exist with Fauna. Developers are allowed to fully focus on application specific work, without the burden of maintaining throughput and capacity on an API and database.
Terminology
For clarity, here is the terminology that each technology uses to describe itself:
DynamoDB | Fauna | Explanation |
---|---|---|
Item | Document | An individual record in the database. |
Table | Collection | A container for items/documents. |
Partition Key | Not Applicable | DynamoDB requires users to choose a partition key that determines how data is grouped and distributed among partitions. How you choose this key impacts DynamoDB’s scalability. In Fauna, distribution is handled automatically by hashing each document’s ID, without impacting scale, so users have one less thing to worry about. |
Partition Metadata System | Node | In DynamoDB, the Partition Metadata System contains a mapping of items to their respective partitions. In a Fauna cluster, every node has a consistent copy of this information. |
Transaction | Transaction | DynamoDB transactions are only ACID-compliant within a single region. Fauna supports transactions on all cluster configurations across multiple partitions. |
Read Capacity Unit (RCU) | Not Applicable | Each DynamoDB RCU allows for one strongly consistent read, or two eventually consistent reads, per second (for an item up to 4 KB). RCUs are primarily relevant to Provisioned Mode tables; however, they’re still somewhat relevant to tables utilizing On-Demand Mode, as RCUs still operate under the hood and can limit burst scalability. |
Write Capacity Unit (WCU) | Not Applicable | A DynamoDB WCU reserves throughput capacity for one write per second (for an item up to 1 KB). WCUs are primarily relevant to Provisioned Mode tables; however, they’re still somewhat relevant to tables utilizing On-Demand Mode, as WCUs still operate under the hood and can limit burst scalability. |
Read Request Unit (RRU) | Read Op | While DynamoDB RCUs vary in relevance to both capacity modes, RRUs are only relevant to On-Demand Mode tables, specifically with regard to the distinct pricing model. Simply put, RRUs are a unit of measurement for expended reads. Similar to RRUs, a Fauna read op is just a billing indicator and does not provision throughput. Where they differ is in how RRU expenditure can vary based on the desired level of consistency. |
Write Request Unit (WRU) | Write Op | DynamoDB WRUs, like RRUs, measure expended writes; though they do not have variable usage determined by strength of consistency. Like WRUs, a Fauna write op is just a billing indicator and does not provision throughput. |
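
To make the capacity-unit arithmetic in the table concrete, here is a minimal Python sketch. It assumes the item-size rounding described in AWS's public documentation (reads metered in 4 KB steps, writes in 1 KB steps); the function names are illustrative and not part of any SDK.

```python
import math

READ_UNIT_BYTES = 4 * 1024   # assumption: one read unit covers up to 4 KB
WRITE_UNIT_BYTES = 1 * 1024  # assumption: one write unit covers up to 1 KB


def read_units(item_bytes: int, strongly_consistent: bool) -> float:
    """Read units consumed by a single read of an item of the given size."""
    units = max(1, math.ceil(item_bytes / READ_UNIT_BYTES))
    # An eventually consistent read costs half of a strongly consistent one.
    return units if strongly_consistent else units / 2


def write_units(item_bytes: int) -> int:
    """Write units consumed by a single write of an item of the given size."""
    return max(1, math.ceil(item_bytes / WRITE_UNIT_BYTES))
```

Under these assumptions, reading a 6,000-byte item costs two units with strong consistency and one unit with eventual consistency, which is the variability in RRU expenditure that the table describes.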
Query APIs
Rather than replace SQL with another query language, the DynamoDB creators opted for a simple API with a handful of operations. Specifically, the API lets developers create and manage tables along with their indexes, perform CRUD operations, stream data changes/mutations, and finally, execute CRUD operations within ACID transactions. While DynamoDB doesn’t support complex querying, this tradeoff leads to reduced latency on a broad set of operations, as the database doesn’t need to process or interpret a query language.
Fauna offers its take on a complex query language in the form of the Fauna Query Language (FQL): a flexible, highly composable, and expressive query language. While FQL is distinctly different from SQL, developers familiar with the popular functional programming paradigm will feel right at home. Readers well-versed in SQL might be interested in this in-depth explanation of FQL, written specifically for SQL users.
Indexes
Both Fauna and DynamoDB support indexes which can store subsets of data (i.e. projection), optionally, with a specified order and/or uniqueness constraint. However, this is where the similarities end, as Fauna indexes can perform and persist computations, combine data from multiple collections, ensure strict serializability for reads/writes, and more. To further elaborate, Fauna's indexes can handle multiple sources, sort fields, match fields, and returned values. This differs from DynamoDB where indexes are constructed for a single table and can only match with one attribute, along with the ability to sort on one attribute.
Given its indexing flexibility and support for relational data, Fauna is a powerful tool for the evolution of applications over time.
When using indexes in DynamoDB, careful consideration and forethought are required to avoid technical debt, unnecessary expenses, or, in the worst case, throttling. When strongly consistent queries are desired in an index, DynamoDB allows for a maximum of 5 Local Secondary Indexes (LSI), each of which enables sorting on an attribute specified at index creation (i.e. sort key). Developers should know that Local Secondary Indexes can only be created at the same time that a table is created, and cannot be deleted afterwards (without deleting the table); no such quantity or creation limits exist for Fauna indexes.
Should eventually consistent queries suffice, DynamoDB offers Global Secondary Indexes (GSI), which allow for querying on a different primary key (and optionally, a different sort key). As for billing, all write operations to a DynamoDB table will be multiplied and applied to relevant indexes, resulting in elevated expenses; Fauna doesn’t charge for index entry updates.
Finally, while GSI throughput is separate from tables, LSI throughput is not. Users of DynamoDB must keep in mind that LSIs are multipliers of traffic, resulting in more dramatic peaks. LSI usage can cripple both Provisioned and On-Demand tables if not properly planned for by manually elevating traffic peaks or adjusting an Auto Scaling plan. This differs from Fauna, where all accommodations for traffic and throughput are not the user’s concern, rather, these factors are handled automatically behind the scenes.
Schema design
DynamoDB is largely schemaless, where the few predefined details are the partition key and/or sort key, along with any Local Secondary Indexes. Like many of its NoSQL siblings, DynamoDB lacks support for relational data and is best designed with a denormalized schema in mind, improving read and write performance over traditional relational databases with numerous complex relationships. To satisfy relational needs, DynamoDB places heavy emphasis on denormalization and single-table design, where developers are responsible for maintaining some form of referential integrity among shared/related data. While DynamoDB and its best practices lend themselves well to mature applications with proven scope and data needs, they do not bode well for extensive schema evolution.
DynamoDB’s value is best realized when migrating from a deployed/overloaded database that already satisfies a product or project’s needs. Even with extensive planning and design ahead of time, it’s far from uncommon to completely iterate on a database model due to a variety of external factors (e.g. a business pivot); if not a complete redesign, then usually an iteration on partition keys to avoid "hot" partitions. Developers risk significant technical debt if they build an application and schema using DynamoDB without confidence in their understanding (and longevity of their understanding) of an application’s scope.
Fauna, in contrast, inherits much of the iterable relational data modeling of traditional RDBMS, while also meeting the scaling promises of NoSQL (more on this later). While denormalization is a perfectly viable approach, developers are encouraged to take advantage of Fauna's first-class relational and referential support. In addition to document and relational data, Fauna also accommodates graph-like patterns and time-series data, along with advanced multi-tenant capabilities allowing for parent-child relationships among databases. With Fauna, schema iteration is very forgiving, unlike with DynamoDB, and developers get the relational data capabilities they already know and love.
Transactional model
Support for serializable multi-item read and write transactions exists in DynamoDB with the caveat that they’re only ACID-compliant within the region they occur in. In particular, developers using multi-region deployments of DynamoDB may witness dirty reads or phantom reads among concurrent transactions in separate regions; writes affecting the same data in separate regions are resolved using "last writer wins" reconciliation, where DynamoDB makes a best effort to determine the last writer. The region limitation is fine for applications which depend on a single-region backend, however when utilizing Global Tables (a multi-region deployment of DynamoDB), successful transactions in one replica table will not provide ACID guarantees for other replica tables, due to the delay between replication/propagation of changes. Such a limitation is not uncommon among applications today, where the solution is usually to direct all transactions to a single region, or to store data based on the location of its origin (e.g. a ride-sharing app might store San Francisco trip/user data in a us-west-2 database, and nowhere else). Keep in mind however, that DynamoDB Global Tables do not allow for designating partial replication among regions (i.e. all replica tables will eventually contain the same data); instead, developers themselves must deploy individual DynamoDB instances in each desired region.
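The "last writer wins" reconciliation described above can be sketched in a few lines of Python. This is an illustration only, not DynamoDB's implementation: the `Write` type, the millisecond timestamp field, and the tie-break on region name are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    region: str
    value: str
    timestamp_ms: int  # hypothetical per-write timestamp


def last_writer_wins(a: Write, b: Write) -> Write:
    """Keep the write with the later timestamp.

    Ties are broken by region name (an assumption, so the result is deterministic).
    """
    return max(a, b, key=lambda w: (w.timestamp_ms, w.region))


# Two regions update the same item concurrently.
west = Write("us-west-2", "seats_left=0", timestamp_ms=1_000)
east = Write("us-east-1", "seats_left=1", timestamp_ms=1_005)
winner = last_writer_wins(west, east)
```

Here `winner` is the `us-east-1` write, and the `us-west-2` write is discarded even though its transaction committed successfully in its own region. That silent loss is what the single-region ACID caveat amounts to in practice.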
With global distribution in mind, Fauna offers strictly serializable transactions, where the strictness provides the additional guarantee of a real-time serial order for transactions. This is a critical distinction for geo-distribution, where variance in the order of propagated transactions can impact the final state between replicas. Fauna achieves this degree of isolation and distribution with heavy inspiration from Calvin, a cutting-edge approach to distributed transactions and replication.
Consistency models
By default, DynamoDB uses eventually consistent reads unless specified otherwise. Strong consistency is available to some degree but only within the context of a single region (i.e. strongly consistent reads only consider writes from the region they read from). This may cause confusion when working with Global Tables, as developers are allowed to make read requests parameterized with "strong consistency" in one region, while writes from another region are only propagated eventually (often in a second or less). Additionally, strongly consistent reads can result in throttling if developers aren’t careful, as only the leader node can satisfy strongly consistent reads; DynamoDB leader nodes are also the only nodes responsible for writes in a partition (unlike Fauna, where every node is a Query Coordinator and can perform writes, etc.), thus they are the most trafficked nodes and critical to the function of a partition.
Fauna offers strong forms of consistency and isolation across the board in all operations, not only within individual regions but globally. By default, indexes and read-only operations maintain snapshot or serializable isolation (for lower latencies); however, developers are free to re-introduce strict serializability should they desire it. Additionally, indexes which aren’t serialized will re-evaluate their relevant documents (providing strong consistency and isolation) when used within a transaction. Essentially, Fauna requires all writes to be strictly serializable for strong consistency across the globe, while letting applications use slightly weaker forms of isolation and consistency for less-critical reads, enabling lower read latency. This approach protects against inconsistent write results/transactions, which are far more consequential to a business than a stale read. For more information regarding consistency models and their tradeoffs, read this piece written by one of the original Calvin authors.
Storage
It remains unclear exactly what storage engines are utilized under the hood of DynamoDB today. At the time of the Dynamo whitepaper’s publishing, Dynamo utilized a handful of storage engines (most notably the Berkeley Database (BDB) Transactional Data Store and MySQL) through a "pluggable persistence component". Many years have passed since the paper’s publishing however, and there’s no public documentation guaranteeing these storage engines are still in use. Compression, while an AWS recommended practice for storing large attributes, is not natively implemented.
Fauna uses an LSM tree-based storage engine that provides LZ4 compression. By default, Fauna stores the last 30 days of history for each collection (can be as long as desired or even indefinite), and temporal queries may use any point-in-time snapshot within that history. These temporal queries also offer valuable rollback capabilities for applications and their backends, a luxury which often isn’t afforded outside of a full blown database recovery. Finally, temporal storage provides simple recovery after accidental data loss and streamlined integration debugging.
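As a rough illustration of what a point-in-time read over retained history involves, the Python sketch below keeps every version of a document and answers "what was the value at time T". It is a toy model under stated assumptions (integer timestamps, in-order writes, a hypothetical `VersionedDocument` class) and says nothing about how Fauna's storage engine is actually implemented.

```python
import bisect


class VersionedDocument:
    """Toy document that retains all of its past versions."""

    def __init__(self):
        self._timestamps = []  # write times, kept in ascending order
        self._values = []      # value written at the matching timestamp

    def write(self, ts: int, value) -> None:
        """Record a new version; writes are assumed to arrive in timestamp order."""
        self._timestamps.append(ts)
        self._values.append(value)

    def read_at(self, ts: int):
        """Return the value as of time ts, or None if nothing existed yet."""
        i = bisect.bisect_right(self._timestamps, ts)
        return self._values[i - 1] if i else None
```

A rollback in this model is just a new write whose value is `read_at(earlier_ts)`, which is why retained history makes recovery from accidental changes comparatively simple.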
Security
Like many AWS products, DynamoDB inherits the excellent AWS Identity and Access Management (IAM) feature. With it, developers can specify coarse and granular user permissions, applicable to the entire DynamoDB API. Furthermore, developers can specify conditions which must be met before granting permissions (e.g. an IAM policy only grants access to items where the client’s UserID matches an item’s UserId). Authentication and authorization aside, DynamoDB also offers encryption at rest using 256-bit Advanced Encryption Standard (AES-256) and three decryption key types, each with varying customer control. Finally, DynamoDB’s security and compliance are audited by several third parties, as is standard for many AWS products.
Fauna offers varying levels of access control, however the requirement of authentication is constant; Fauna cannot be left accidentally unprotected. Developers have the freedom to define coarse role-based permissions, more specific identity-based permissions, and even finer attribute-based access control (ABAC). With reserved FQL functions, clients can easily authenticate (e.g. Login) and be provided secure access tokens for database connection. Lastly, Fauna's multi-tenancy features provide even further protection through the natural database hierarchies that spawn themselves (i.e. authenticated users of a database cannot necessarily access a parent or sibling database). With its out-of-the-box tooling, Fauna meets the authentication and authorization requirements for a wide variety of applications, eliminating the need for equivalent custom solutions.
Fault tolerance
DynamoDB relies on AWS Availability Zones (AZ), replication, and long-term storage to protect against data loss or service failure. Table partitions consist of three nodes stored in separate AZs, one of which is a Leader Node capable of accepting writes and strongly consistent reads, while the remaining two nodes simply provide additional replication storage and eventually consistent reads. Customers of DynamoDB should know that Leader Nodes are potential bottlenecks in their application, should they perform too many writes and/or strongly consistent reads to a partition. This differs from Fauna, where every node’s read and write capabilities are equal, thus no single node can be a bottleneck.
Fauna relies on its unique transactional protocol derived from Calvin and its multi-cloud topology to achieve fault tolerance.
Because Fauna is strongly consistent, all transactions are first made durable in the transaction log before being applied to data nodes, so that hardware outages neither affect data correctness nor cause data loss. If a node is down, the length of the transaction log is extended so that the node can apply the transactions it missed when it comes back online. In Fauna, as long as you receive a commit for a transaction, you are guaranteed total safety against data loss.
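The log-first ordering described above can be sketched as follows. This is a deliberately simplified, in-memory Python model: the `LoggedStore` class and its method names are hypothetical, a Python list stands in for a durable replicated log, and nothing here reflects Fauna's actual code.

```python
class LoggedStore:
    """Toy data node that applies writes only after they are in the log."""

    def __init__(self):
        self.log = []      # ordered transaction log (stand-in for durable storage)
        self.data = {}     # current state of this data node
        self.applied = 0   # index of the next log entry to apply

    def commit(self, writes: dict) -> int:
        """Append to the log first; the commit can be acknowledged at this point."""
        self.log.append(dict(writes))
        return len(self.log) - 1

    def apply_pending(self) -> None:
        """Replay every log entry this node has not applied yet, in log order."""
        while self.applied < len(self.log):
            self.data.update(self.log[self.applied])
            self.applied += 1
```

If the node is unavailable between `commit` and `apply_pending`, the committed writes simply wait in the log. Calling `apply_pending` on recovery replays them in order, so an acknowledged commit is never lost in this model.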
Also, within Fauna's architecture, a functioning system must contain at least three replicas. A logical replica contains numerous nodes. All nodes can simultaneously serve as a query coordinator, data replica, and log replica, with no individual node being a single point of failure. Should a node fail or perform poorly as a data replica, temporarily or permanently, Fauna will smartly redirect reads to a non-local copy until that node becomes available again. Because Fauna's nodes are distributed across multiple cloud platforms, users are shielded from cloud provider outages as well as regional outages within a single provider.
Scalability
Both DynamoDB and Fauna provide abstractions over traditional server hardware specs with "serverless" pricing and consumption models. Along with these new serverless concepts, both databases aim to absorb the responsibility of scaling to customer needs, however DynamoDB still leaves significant operational work and overhead for customers. While DynamoDB is a managed service, you remain responsible for the bulk of the operational heavy lifting. If you’re using DynamoDB, you have to think upfront about volume and scale, continuously manage these parameters, and explicitly provision your capacity to meet your needs.
Understanding both the consumption model and data distribution concepts is particularly critical when using DynamoDB, as even though there are scaling features to better accommodate traffic, they all expose windows where throttling or failure is possible; in particular, developers should be familiar with DynamoDB’s Read Capacity Units (RCUs), Write Capacity Units (WCUs), capacity modes, table partitioning behavior, partition limits, and "hot" partitions.
In contrast, Fauna is built to auto-scale without any input from customers. You are never responsible for provisioning capacity or tweaking parameters to achieve a desired level of throughput. It works on a utility model, much like your electrical outlet: plug in and go, and never worry about running out of resources at peak times. Fauna achieves this by maintaining several consistent, full replicas of customer data. A replica consists of several geographically aware nodes, each with a partition of the full dataset in a single local environment. As mentioned earlier, all nodes share the same set of capabilities (query coordinator, data replica, and log replica), each able to perform reads and writes. Fauna scales its services behind the scenes by adding more full-copy replicas or adding more nodes to a single replica, which requires no additional downtime, manual configuration, or changes to drivers. As a customer of Fauna, you can assume infinite capacity and march on.
Fauna is multi-region by default (in fact, it’s global), with uncompromising consistency and isolation, as was elaborated on earlier. Provisioning throughput or capacity is not a concern nor a reality for customers, where the only information of relevance is the pricing/consumption model. Specifically, Fauna's consumption model primarily focuses on read and write ops, which are almost identical to DynamoDB’s RRUs and WRUs, where they’re simply a metric for representing on-demand usage.
Operations
While the responsibilities of traditional database operations have been abstracted away, DynamoDB customers still have a handful of DynamoDB-specific responsibilities, with the two major items being (1) designing an optimal partition key (including read/write sharding if needed), and (2) specifying one of two capacity modes along with their parameters. Developers can implement and tweak DynamoDB deployments through the AWS CLI, AWS Management Console, AWS SDK, NoSQL Workbench, or directly through the DynamoDB low-level API.
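As an example of the write sharding mentioned above, one common approach is to append a suffix to a hot partition key so that its writes spread over several key values. The Python sketch below assumes a random-suffix scheme; the shard count, the `#` separator, and the function names are all hypothetical choices for illustration.

```python
import random

SHARD_COUNT = 10  # hypothetical; in practice chosen per workload


def sharded_key(base_key: str, rng: random.Random) -> str:
    """Pick one of SHARD_COUNT partition-key values for a write to base_key."""
    return f"{base_key}#{rng.randrange(SHARD_COUNT)}"


def all_shard_keys(base_key: str) -> list[str]:
    """Every key a reader must query to reassemble the logical item set."""
    return [f"{base_key}#{i}" for i in range(SHARD_COUNT)]
```

The trade-off is visible in `all_shard_keys`: writes become cheaper to distribute, but every read of the logical key now fans out into `SHARD_COUNT` queries that the application has to merge itself.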
For the responsibilities which don’t fall under the customer’s jurisdiction, many fundamental operations (e.g. partition splitting) are performed by DynamoDB’s internally developed tool, Auto Admin. Additionally, DynamoDB is known to rely on several AWS services to achieve certain functionality (e.g. Auto Scaling uses CloudWatch, SNS, etc.), though the exact scope of this is unknown.
Further elaborating on Auto Admin, the tool is akin to an automated database administrator (DBA), responsible for managing DynamoDB’s Partition Metadata System, Partition Repair, Table Provisioning, and more. Although it isn’t consistently documented, it appears that Auto Admin shares some partition splitting and provisioning functionality with DynamoDB’s Adaptive Capacity, where the most obvious example of this is Adaptive Capacity’s ability to isolate frequently accessed items.
Much of Fauna's infrastructure management relies on the same Calvin-inspired protocol and consistency mechanisms provided to customers, with the addition of some internal process scheduling. Changes to a deployment are performed within the safety of a single transaction, where the final state is once again evaluated by Fauna, before being applied. The internal transactions used for scaling Fauna deployments allow for easy-to-reason-with and seamless migration of data between nodes.
In conclusion, it’s worth highlighting that developer operations occur seamlessly with zero downtime and user maintenance; developers are free to focus on what matters most, building an excellent application.
Jepsen tests
Jepsen tests, along with their associated tools and writing, are widely respected among database engineers. The results of a Jepsen test aren’t a simple pass-fail, but are more akin to diagnoses and postmortems; specifically, they compare a database’s behavior to its value propositions and elaborate on promises that aren’t sufficiently met. Although DynamoDB lacks an official Jepsen test, it’s one of the most popular NoSQL databases in use today and, as an AWS product, is likely to be heavily audited, tested, and scrutinized.
Fauna's goal with Jepsen has been to conduct an exhaustive investigation to identify and fix any errors in the implementation, integrate the resulting tests into continuous integration, and to have a trusted third party verify both public consistency claims and the effectiveness of the core architecture. The current Fauna Jepsen report, which covers versions 2.5.4 and 2.6.0 and represents three months of detailed work, clearly shows Fauna's commitment to providing users with a seamlessly-correct datastore.
"Fauna is based on peer-reviewed research into transactional systems, combining Calvin’s cross-shard transactional protocol with Raft’s consensus system for individual shards. We believe Fauna's approach is fundamentally sound… Calvin-based systems like Fauna could play an important future role in the distributed database landscape."
Summary
DynamoDB aims to provide a fully managed, multi-region, and multi-master, serverless database for internet-scale applications. However, each one of these value propositions has a caveat. While traditional database resources are managed for customers, provisioning and/or designing around DynamoDB’s abstract capacity units is still required. Multi-region and multi-master deployment is available, but at the cost of strong consistency and isolation. Serverless scaling is achievable, but only if developers design an optimal partition key and strategize throughput escalation.
DynamoDB schemas often have little room to grow given their lack of support for relational data (an almost essential function for evolving applications); the heavy emphasis on single-table design to support relational-like access patterns leaves customers with the responsibility of maintaining the correctness of denormalized data. Finally, customers are required to build their applications with numerous inconsistencies, conflicts, and race conditions in mind, or risk producing odd and unpredictable errors caused by DynamoDB.
On the other hand, Fauna promises a highly flexible, zero-maintenance, globally distributed datastore as an API. With a data model that lets you work with both documents and relations, Fauna simplifies your initial development as well as the ongoing evolution of your application; there is no need to write application logic to handle odd bugs, errors, and race conditions found in many databases with poor consistency and isolation. Transactional by design, Fauna ensures that you’re not locked into limitations when using transactions: your data is always consistent, and there are no constraints placed on you to shard your data in a specific way, or limit the number of keys you use. With Fauna you never have to worry about typical database tasks such as provisioning, sharding, correctness, replication, etc. Consequently, developers find Fauna more flexible to use, and are completely free from the backend heavy lifting that is required when using DynamoDB.
Appendix
Deep Dive into DynamoDB Scalability Architecture
Behind the scenes, DynamoDB distributes data and throughput for a table among partitions, each outfitted with 10 GB of storage. Data distribution to partitions relies on a table’s partition key, and throughput is specified with Read Capacity Units (RCUs) and Write Capacity Units (WCUs), which set the upper limits of a partition’s read and write capacity per second (a single RCU allows for either 2 eventually consistent reads or 1 strongly consistent read). Note that while RCUs and WCUs are specified at the table level, in actuality, DynamoDB does not directly limit throughput for tables. Instead, these limits apply to a table’s underlying partitions, where RCUs and WCUs are evenly distributed among all. This even distribution of throughput used to be a common concern, as it often led to over-provisioning to meet the needs of "hot" partitions (partitions with disproportionate traffic compared to their peers).
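A small worked example shows why evenly split throughput leads to over-provisioning. The Python sketch below models only the even split described above and deliberately ignores Burst and Adaptive Capacity; the function names are illustrative.

```python
def per_partition_limit(table_units: int, partitions: int) -> float:
    """Share of the table's capacity units that each partition receives."""
    return table_units / partitions


def throttled_units(partition_traffic: list[float], table_units: int) -> float:
    """Units per second rejected when every partition gets an equal share."""
    limit = per_partition_limit(table_units, len(partition_traffic))
    return sum(max(0.0, traffic - limit) for traffic in partition_traffic)
```

With 1,000 WCUs over four partitions, each partition gets 250. Traffic of `[400, 100, 100, 100]` totals only 700 units, well under the table's 1,000, yet `throttled_units` reports 150 units rejected on the hot partition. Avoiding that in this simplified model would require provisioning 1,600 WCUs for 700 units of real traffic.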
DynamoDB has both Burst Capacity and Adaptive Capacity to address hot partition traffic. Burst Capacity utilizes unused throughput from the past 5 minutes to meet sudden spikes in traffic, and Adaptive Capacity borrows throughput from partition peers for sustained increases in traffic. DynamoDB has also extended Adaptive Capacity’s feature set with the ability to isolate frequently accessed items in their own partitions. Note that partitions have a hard limit of 3000 RCUs and 1000 WCUs, meaning a frequently accessed item which is isolated in its own partition cannot satisfy an access pattern that exceeds the partition’s hard limits. This is unlikely to be an issue for most applications, however should it arise, Global Tables or a similar implementation can resolve it (strongly consistent reads will still be limited however). Despite DynamoDB releasing several features and improvements targeting hot partitions, they still have a negative impact on a table’s performance, though this consequence is not as significant as it once was. Essentially, the scaling performance of DynamoDB revolves around the design quality of a partition key.
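The idea behind Burst Capacity, banking unused throughput for up to five minutes and spending it on a spike, can be approximated with a token-bucket-style model. The Python sketch below is that approximation under stated assumptions (one-second steps, a single partition, a hypothetical `simulate_burst` function); it is not DynamoDB's actual algorithm.

```python
BURST_WINDOW_SECONDS = 300  # "the past 5 minutes" from the text


def simulate_burst(provisioned: int, demand_per_second: list[int]) -> list[int]:
    """Return the units throttled in each second under a banked-capacity model."""
    banked = 0
    cap = provisioned * BURST_WINDOW_SECONDS  # at most 5 minutes of savings
    throttled = []
    for demand in demand_per_second:
        available = provisioned + banked
        served = min(demand, available)
        throttled.append(demand - served)
        # Whatever was not used this second is carried forward, up to the cap.
        banked = min(cap, available - served)
    return throttled
```

With 100 provisioned units and demand of `[0, 0, 300, 300]`, the first spike is absorbed entirely by the two idle seconds that preceded it, while the second spike is throttled by 200 units because the bank is empty. This is why bursts help with short spikes but not with sustained increases.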
To make throughput scaling on tables easier, DynamoDB offers On-Demand Mode and Auto Scaling (under the Provisioned Mode). On-Demand Mode is DynamoDB’s more recent and abstract take on hands-free scaling of throughput, where traffic is well accommodated for, up to double the table’s previously recorded peak. DynamoDB may require 30 minutes before adjusting to a new peak, therefore developers should be wary of traffic which surpasses twice their previous peak, as throttling can occur. One final note regarding On-Demand Mode is the distinct pricing model, where sustained traffic could result in costs up to 6.94x that of provisioned capacity. Despite being DynamoDB’s best solution for rapid and automatic scaling, the significantly higher cost suggests On-Demand Mode is best suited only for applications which have unpredictable or unknown workloads. Auto Scaling, which is only available under the Provisioned Mode, is DynamoDB’s first iteration on convenient throughput scaling. It raises or lowers read and write capacity based on sustained usage, leaving spikes in traffic to be handled by a partition’s Burst and Adaptive Capacity features.
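The "double the previous peak" rule for On-Demand Mode reduces to a simple check, sketched below in Python. The function names are illustrative, and the model takes the rule exactly as stated in the text without accounting for the adjustment period.

```python
def on_demand_ceiling(previous_peak: float) -> float:
    """Traffic level the text says is accommodated: twice the prior peak."""
    return 2 * previous_peak


def may_throttle(previous_peak: float, current_traffic: float) -> bool:
    """True when traffic exceeds what the previous peak allows for."""
    return current_traffic > on_demand_ceiling(previous_peak)
```

For a table whose previous peak was 5,000 requests per second, traffic up to 10,000 is accommodated, while a jump to 12,000 risks throttling until DynamoDB has adjusted to the new peak.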
Additionally, it’s the customer’s responsibility to specify upper and lower bounds to throughput, along with a target utilization rate to specify a consumption ratio which should consistently be met (e.g. 70% of throughput should be used as often as possible). While Auto Scaling and Provisioned Mode are more cost-efficient than DynamoDB’s On-Demand Mode, they don’t handle unforeseen spikes in traffic (which surpass the table’s current overall throughput capacity) as well as On-Demand Mode does. This is due to the watermarks which trigger Auto Scaling’s functionality requiring sustained increases or decreases in traffic. In summary, developers have many parameters and provisioning ops they must keep in mind while using DynamoDB, despite the layers of abstractions (e.g. RCUs and WCUs).
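The relationship between target utilization and the customer-supplied bounds can be written as a short calculation. The Python sketch below is a simplified model of target tracking: the `target_capacity` function and its parameters are hypothetical, and it ignores the sustained-traffic watermarks that decide when scaling is actually triggered.

```python
def target_capacity(consumed: int, target_percent: int,
                    min_capacity: int, max_capacity: int) -> int:
    """Capacity that puts sustained consumption at the target utilization.

    The result is rounded up and clamped to the customer's bounds.
    """
    # Integer ceiling division avoids floating-point rounding surprises.
    desired = -(-(consumed * 100) // target_percent)
    return max(min_capacity, min(max_capacity, desired))
```

With a 70% target, sustained consumption of 700 units calls for 1,000 provisioned units. The same calculation for 5,000 consumed units is capped at the configured maximum, at which point excess traffic is throttled regardless of the target.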
Regarding geo-distribution, DynamoDB offers Global Tables: multi-region deployments of replica tables. Each replica table is capable of accepting both reads and writes, with writes eventually replicating to sibling replica tables (usually under a second). Conflicts in Global Tables are resolved with a "last writer wins" strategy, where all replica tables agree on the latest write and update accordingly. Customers ought to keep in mind that replication under Global Tables will increase throughput traffic among the replica tables, with the primary concern being WCUs as they’re the lower throughput limit (1000 WCUs vs 3000 RCUs).