DBaaS Evaluation: Couchbase Cloud, MongoDB Atlas, and Amazon DynamoDB

Our new report compares the performance of three NoSQL databases as a service (DBaaS) under update-heavy, scan, pagination, and JOIN workloads.

Why database as a service?

NoSQL solutions span a wide variety of database technologies developed in response to rapidly growing data sets and to new demands on how data is structured and accessed. Relational databases, in comparison, were not designed to provide the scalability and agility needed by modern high-load apps. While NoSQL systems can help to achieve the performance required by today's workloads, they also have complex structures with multiple components.

Compared to traditional databases, NoSQL systems are more scalable and provide superior performance. However, the performance increase comes at a cost in both operational expenses and manpower. NoSQL solutions require significant effort and time for deployment, configuration management, and support. Performance drops caused by lag or other factors can lead to major issues that require lengthy troubleshooting.

Global NoSQL database market (Source)

As an alternative, database as a service (DBaaS) emerged as a fully managed solution deployed in the cloud. NoSQL DBaaS providers operate the databases for their customers, so companies that need to make the most of limited resources can still benefit from the technology. Managed NoSQL databases thus remove operational responsibilities from engineers, enabling organizations to focus manpower on development. Most DBaaS offerings follow a pay-per-usage model, so customers can control their operational expenses by setting resource limits.

Today, we are announcing the results of our latest NoSQL DBaaS benchmark (check out the official press release). The new study compares the performance of three NoSQL DBaaS systems: Couchbase Cloud, MongoDB Atlas, and Amazon DynamoDB. This post sheds some light on the study.

 

What was compared?

Each database was evaluated for relative performance in terms of latency and throughput. The evaluation was conducted on three cluster configurations (6, 9, and 18 nodes) and under four different workloads.

Couchbase Cloud works as a key-value store for operations on single documents and as a schemaless document store that is accessed through N1QL queries. For each cluster size of 6, 9, and 18 nodes, r5.2xlarge instances were used. Each node was configured to run the Data, Index, and Query services. The Data service is the most fundamental of all Couchbase services, providing access to data in memory and on disk. The Index service supports the creation of primary and global secondary indexes on items stored within Couchbase Server. The Query service supports querying data by means of N1QL, a SQL-like query language, and depends on both the Index and Data services.

MongoDB Atlas makes use of a hierarchical cluster topology. For each cluster size, a config server was deployed as a three-member replica set (a separate machine, not counted in a cluster). Each shard was deployed as a three-member replica set (one primary, two secondaries). MongoDB’s routers were deployed on each node for each shard.

Amazon DynamoDB is a NoSQL DBaaS that lets users choose between two capacity modes for processing reads and writes: on-demand and provisioned. For this evaluation, the provisioned mode was chosen so that the maximum number of reads and writes per second could be specified as individual settings. Autoscaling was disabled to maintain parity with MongoDB Atlas and Couchbase Cloud.
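As a rough illustration of what provisioned mode asks the user to specify, the sketch below estimates read and write capacity units for a steady workload. It relies on DynamoDB's published unit definitions: one read capacity unit covers one strongly consistent read per second (or two eventually consistent reads) of an item up to 4 KB, and one write capacity unit covers one write per second of an item up to 1 KB. The function name and the target rate in the usage lines are hypothetical and are not values from the report.

```python
import math

def provisioned_capacity(ops_per_sec, read_fraction, item_bytes, consistent_read=False):
    """Estimate (read capacity units, write capacity units) for a steady workload.

    Hypothetical helper for illustration only; it is not how the report
    sized its DynamoDB tables.
    """
    reads_per_sec = ops_per_sec * read_fraction
    writes_per_sec = ops_per_sec * (1 - read_fraction)

    # Reads are billed in 4 KB steps; eventually consistent reads cost half.
    read_units_per_op = math.ceil(item_bytes / 4096)
    if not consistent_read:
        read_units_per_op /= 2

    # Writes are billed in 1 KB steps.
    write_units_per_op = math.ceil(item_bytes / 1024)

    return (math.ceil(reads_per_sec * read_units_per_op),
            math.ceil(writes_per_sec * write_units_per_op))

# Hypothetical target: 10,000 ops/sec, half reads, items of exactly 1 KB.
rcu, wcu = provisioned_capacity(10_000, 0.5, 1024)
print(rcu, wcu)
```

Note that an item only slightly larger than 1 KB (for example, 1 KB of field data plus its key) rounds up to two write units per update, which doubles the write capacity needed.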

To ensure the evaluation results are repeatable and verifiable, the benchmark was conducted on Amazon Elastic Compute Cloud (EC2) instances. For added consistency, we utilized the Yahoo! Cloud Serving Benchmark (YCSB)—an open-source specification and program suite for evaluating retrieval and maintenance capabilities of computer programs.

Components of the YCSB client

 

Workload with 50% reads and 50% updates

One of the scenarios was an update-heavy workload that simulated typical actions performed on an e-commerce website: 50% reads and 50% updates. This is a basic key-value workload executed with the following settings:

  • The read/update ratio was 50%–50%.
  • The Zipfian request distribution was used.
  • The data set size was scaled in accordance with the cluster size: 50 million records (each 1 KB in size, consisting of 10 fields and a key) on a 6-node cluster, 100 million records on a 9-node cluster, and 200 million records on an 18-node cluster.
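These settings map roughly onto YCSB core workload properties. The sketch below builds such a property set for each cluster size. The property names (recordcount, readproportion, updateproportion, requestdistribution, fieldcount, fieldlength) are standard YCSB core workload properties, but the exact file used in the study is not shown in this post, and fieldlength=100 is an assumption derived from 10 fields adding up to about 1 KB.

```python
# Records loaded per cluster size, as listed in the settings above.
RECORDS_BY_NODES = {6: 50_000_000, 9: 100_000_000, 18: 200_000_000}

def ycsb_properties(nodes):
    """Return YCSB core workload properties for the 50/50 read/update scenario."""
    return {
        "recordcount": RECORDS_BY_NODES[nodes],
        "readproportion": 0.5,
        "updateproportion": 0.5,
        "scanproportion": 0,
        "insertproportion": 0,
        "requestdistribution": "zipfian",
        "fieldcount": 10,
        "fieldlength": 100,  # assumption: 10 fields x 100 bytes is about 1 KB
    }

def to_properties_text(props):
    """Render the properties in the key=value form that YCSB reads."""
    return "\n".join(f"{key}={value}" for key, value in props.items())

print(to_properties_text(ycsb_properties(6)))
```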

The following queries were used to perform the update-heavy workload.

Couchbase N1QL

  READ

    collection.get(id, getOptions().timeout(kvTimeout))

  UPDATE

    collection.upsert(id, content,
        upsertOptions().timeout(kvTimeout)
            .expiry(documentExpiry)
            .durability(persistTo, replicateTo))

MongoDB Query

  READ

    db.ycsb.find({_id: $1})

  UPDATE

    db.ycsb.update(
        { _id: $1 },
        { $set: { fieldN: $2 } })

Amazon DynamoDB

  READ

    {
        "TableName": "usertable",
        "Key": {
            "firstname": {
                "_id": "$1"
            }
        },
        "ConsistentRead": "false"
    }

  UPDATE

    {
        "TableName": "usertable",
        "Key": { "_id": { "S": "$1" } },
        "AttributeUpdates": {
            "$2": {
                "Value": { "S": "$3" },
                "Action": "PUT"
            }
        }
    }

Configurations of each DBaaS and a detailed description of the testing environment can be found in the full report.

 

Performance for the update-heavy workload

On 6-node clusters, Couchbase Cloud and Amazon DynamoDB performed similarly, with throughputs of 33,460 ops/sec and 30,400 ops/sec, respectively. MongoDB Atlas trailed both at 19,144 ops/sec. On 9-node and 18-node clusters, Couchbase Cloud significantly outperformed MongoDB Atlas and Amazon DynamoDB.

Throughput and latency for the update-heavy workload

Couchbase Cloud was able to process up to 119,000 ops/sec on a 9-node cluster and 168,908 ops/sec on an 18-node cluster, while Amazon DynamoDB managed 46,344 ops/sec on a 9-node cluster and 54,344 ops/sec on an 18-node cluster. MongoDB Atlas hit the throughput limit on a 9-node cluster with 27,544 ops/sec. The 18-node cluster throughput of MongoDB Atlas grew constantly and did not appear to hit a throughput limit.
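To put these figures in perspective, the short sketch below computes how much each system's throughput grew relative to its own 6-node result. It uses only the numbers quoted in this post. The helper name is ours, and the ratios are a simple illustration rather than a metric from the report. Keep in mind that the data set was also scaled up with the cluster size.

```python
# Throughput in ops/sec, as quoted in this post, keyed by cluster size.
THROUGHPUT = {
    "Couchbase Cloud": {6: 33_460, 9: 119_000, 18: 168_908},
    "Amazon DynamoDB": {6: 30_400, 9: 46_344, 18: 54_344},
    "MongoDB Atlas": {6: 19_144, 9: 27_544},  # no 18-node figure is quoted here
}

def scaling_factor(system, nodes, baseline_nodes=6):
    """Throughput at `nodes` divided by throughput at the baseline cluster size."""
    results = THROUGHPUT[system]
    return results[nodes] / results[baseline_nodes]

for system, results in THROUGHPUT.items():
    for nodes in sorted(results):
        if nodes != 6:
            factor = scaling_factor(system, nodes)
            print(f"{system}: {nodes} nodes = {factor:.2f}x the 6-node throughput")
```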

Because of a significant number of failed operations, Amazon DynamoDB produced unstable results. On each cluster configuration, Amazon DynamoDB had a 40% failure rate for update operations and a 1% failure rate for read operations.

Based on our evaluation, NoSQL DBaaS systems did not perform as well as their self-managed counterparts deployed from software binaries. However, this loss in performance is a natural tradeoff for not having to handle the complexities of manual deployment and the associated operational costs. This way, companies using NoSQL DBaaS can save on costs and keep engineers focused on development instead of maintenance.
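For a sense of scale, the sketch below combines the two failure rates under the 50/50 read/update mix of this workload. It is plain arithmetic on the percentages quoted above. The post does not say whether the reported throughput counts failed operations, so the result should not be read as a correction to the throughput figures.

```python
def success_fraction(read_share, read_failure, update_failure):
    """Fraction of all operations that succeed for a given read/update mix."""
    update_share = 1 - read_share
    return read_share * (1 - read_failure) + update_share * (1 - update_failure)

# 50% reads failing 1% of the time, 50% updates failing 40% of the time.
overall = success_fraction(0.5, 0.01, 0.40)
print(f"{overall:.1%} of operations succeed")
```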

 

More performance results?

To learn more about how we configured each solution, as well as how each performed in the evaluation, check out the full version of the comparison. In addition to the update-heavy workload, the study also compares each of the DBaaS systems across other scenarios: short-range scan (95% scans and 5% updates), pagination (a query with a single filtering option to which an offset and a limit are applied), and JOIN (with grouping and ordering applied).

The findings show that hardly any DBaaS can perfectly fit all the requirements of every use case. Every solution has advantages and disadvantages that matter to varying degrees, depending on the specific criteria to meet. Fundamentally, however, DBaaS helps engineers to reduce the time spent on deployment, configuration, and support.

Download the full report here.

 


This blog post was written by Artsiom Yudovin and Carlo Gutierrez,
edited by Alex Khizhniak.