Featured Benchmarks: VoltDB, NuoDB, Xeround, TokuDB, Clustrix, and MemSQL
Benchmarking NewSQL databases
The best way to see how a NewSQL database will work with your application is to run a benchmark. Unfortunately, a meaningful benchmark is much harder to set up than an ad hoc test, and the emerging technologies and exotic architectures found in NewSQL databases make things even more complicated. These systems seek to provide the scalable online transaction processing (OLTP) performance of NoSQL products while maintaining the ACID guarantees of a traditional single-node database system. To get a realistic view of expected performance, you need to know exactly how your users interact with the application. In addition, you have to find the right tools and techniques to automate the process.
Before investing your time into testing, check out these recent benchmarks of six popular NewSQL databases.
VoltDB
In his blog post, Henning Diedrich provides test results for VoltDB used with an Erlang driver in an online gaming application. Running on a single core (-smp +S 1) with a 12-node VoltDB server cluster listening on the other side, the Erlang driver demonstrated a throughput of 26,500 transactions per second (TPS) or more per core. When fully utilizing a 16-core cluster instance as the client node, it routinely reached a throughput of 260,000 TPS per machine (the CPU specs can be found in the original blog post). With eight client nodes connected to a 12-node VoltDB cluster, each client node averaged 109,689 TPS, and the total for the whole cluster was 877,519 TPS.
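As a quick sanity check, the figures quoted above can be cross-checked with a few lines of Python. This is only a sketch of the arithmetic: the variable names are illustrative, and the numbers are the ones reported in the text, not new measurements.

```python
# Figures quoted in the text for the VoltDB / Erlang driver benchmark.
single_core_tps = 26_500        # one Erlang scheduler on one core
sixteen_core_tps = 260_000      # one fully utilized 16-core client machine
client_nodes = 8
avg_tps_per_client = 109_689
reported_total_tps = 877_519

# How much faster a 16-core client is than a single core.
core_scaling = sixteen_core_tps / single_core_tps

# Total implied by the per-client average; the small gap to the
# reported total comes from rounding the average to a whole number.
estimated_total_tps = client_nodes * avg_tps_per_client
rounding_gap = reported_total_tps - estimated_total_tps

print(f"16 cores vs. 1 core: {core_scaling:.2f}x")
print(f"Estimated cluster total: {estimated_total_tps:,} TPS")
print(f"Gap to reported total: {rounding_gap} TPS")
```

The ratio shows that the client scaled to roughly ten times its single-core throughput on 16 cores, so the per-core figure should not simply be multiplied by the core count when estimating capacity.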
As Kristóf Kovács, a software architect and blogger, wrote, although VoltDB is not widely known, it is a worthy solution that is best suited to use cases where you need to act fast on massive amounts of incoming data. Examples include point-of-sale data analysis and factory control systems.
NuoDB
A recent benchmark of NuoDB demonstrated a significant improvement over previous results, in which NuoDB Starlings release 1.0 scaled across 24 machines and reached 1.09 million TPS. According to NuoDB’s official blog, this time the system achieved 1.84 million TPS running on 32 machines in a private cloud. The results were obtained with the Yahoo! Cloud Serving Benchmark (YCSB), an open-source framework used to test data stores for highly distributed, cloud-based applications.
NuoDB focuses on providing scalability and high performance in private clouds. It is the industry’s first and, so far, only patented, elastically scalable cloud data management system (CDMS).
Xeround vs. Amazon RDS
A recently updated benchmark published by Avi Kapuya compares the performance of the Xeround cloud database with that of Amazon Relational Database Service (RDS). The test results show that Amazon RDS is better at dealing with small numbers of concurrent users. However, its performance decreases significantly as the number of users grows. The Xeround cloud database, on the other hand, becomes faster with more concurrent users, which makes it a better option than Amazon RDS for workloads with many concurrent users.
The reason for Xeround’s success is the high level of parallelism achieved by its creators, which makes this database suitable for applications that require high concurrency.
Percona Server TokuDB vs. InnoDB
Vadim Tkachenko, who leads Percona’s development team, has recently published a comparison of Percona Server TokuDB vs. InnoDB. The benchmarking revealed that, over time, InnoDB’s performance declines steadily from 24,000 TPS to 18,000 TPS. In addition, InnoDB cannot complete a five-hour run. After three hours, the disk is full as the size of the data hits the 210 GB mark with 234,238,440 inserted records. TokuDB averages around 14,000 TPS with some periodic drops into the 10,000 TPS area. After a five-hour run, the size of the data in TokuDB is roughly 50 GB with 276,934,863 records.
The results of the test reveal that InnoDB is somewhat faster, but its throughput declines over time. Mr. Tkachenko predicts that, eventually, InnoDB’s performance would drop to 14,000 TPS and below. Unfortunately, the SSD used for testing was not large enough to run the benchmark long enough to confirm this. TokuDB, in turn, was able to compress the dataset to 25% of the original size, which is a great advantage over InnoDB. In fact, it would be possible to fill TokuDB’s tables with one billion rows using the same disk size. With InnoDB, we would need 1 TB of disk space, and the resulting performance would likely be the same or worse.
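The disk-space claim can be reproduced from the reported figures. The sketch below assumes decimal gigabytes and storage that grows linearly with the row count. Neither assumption comes from the original benchmark, so the output is a rough estimate rather than a measured result.

```python
GB = 10**9  # assumption: the reported sizes are decimal gigabytes


def bytes_per_row(data_gb, rows):
    """Average on-disk footprint of one row."""
    return data_gb * GB / rows


def projected_gb(per_row_bytes, rows):
    """Disk space needed for `rows` rows, assuming linear growth."""
    return per_row_bytes * rows / GB


# Figures reported in the text.
innodb_row = bytes_per_row(210, 234_238_440)   # disk full after three hours
tokudb_row = bytes_per_row(50, 276_934_863)    # after the full five-hour run

one_billion = 1_000_000_000
innodb_1b = projected_gb(innodb_row, one_billion)
tokudb_1b = projected_gb(tokudb_row, one_billion)
size_ratio = tokudb_row / innodb_row

print(f"InnoDB: {innodb_row:.0f} bytes/row, {innodb_1b:.0f} GB for 1B rows")
print(f"TokuDB: {tokudb_row:.0f} bytes/row, {tokudb_1b:.0f} GB for 1B rows")
print(f"TokuDB row footprint relative to InnoDB: {size_ratio:.0%}")
```

Under these assumptions, one billion rows need about 900 GB in InnoDB, which matches the 1 TB estimate, while TokuDB stays below the 210 GB that filled the test disk.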
Clustrix
An older blog post by Vadim Tkachenko also explored Clustrix benchmarks running under TPC-C MySQL workloads with high concurrency. According to the test results, as the number of nodes increases, Clustrix’s performance grows almost linearly. It demonstrates great scalability, automatically distributing data and increasing the data/memory ratio. However, performance is poor under a workload with a small number of threads.
Thus, great scalability, out-of-the-box fault tolerance, and auto-rebalancing make Clustrix a very good option for workloads with high concurrency. Still, it is probably not the best choice for single-thread or low-concurrency workloads.
MemSQL vs. PostgreSQL
In his blog post, front-end/back-end developer Manuel van Rijn describes a performance test he ran to find the best database for a Ruby on Rails application. According to the test results, on average, PostgreSQL spends 14.312 milliseconds to process 6,850 queries. The average result for MemSQL is 6.635 milliseconds per 6,850 queries, not taking into account the first result. This makes MemSQL 7.677 milliseconds faster than PostgreSQL, or roughly 2.2 times as fast.
The author concludes that MemSQL is blazing fast and lets you use all MySQL client tools. In addition, since it is compatible with MySQL, troubleshooting becomes much easier. The cons include a huge plancache folder (5.5 GB) created by the database. Another downside of MemSQL is that you cannot create users, at least in the developer edition used for the benchmark.
Testing a NewSQL data store on your own?
Naturally, every solution has its pros and cons. When benchmarking a NewSQL database, it is important to simulate the exact conditions and workloads users will create for your system. You should also take into account that NewSQL solutions are built for processing small and frequent requests and focus on providing fast response times. To automate the process of benchmarking, look into nonfunctional testing techniques, specifically stress, load, and scalability testing. Finally, keep in mind that the best technique or software for your application will depend on what it actually does.
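A load test of this kind can be sketched with nothing but the Python standard library. The harness below is a minimal illustration, not a replacement for a dedicated benchmarking tool: `run_load_test` and `fake_query` are hypothetical names, and `fake_query` only sleeps for about a millisecond where a real test would execute a representative transaction against your database.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_test(operation, concurrency, requests_per_worker):
    """Call `operation` from `concurrency` threads; report throughput and latency."""

    def worker(_worker_id):
        latencies = []
        for _ in range(requests_per_worker):
            start = time.perf_counter()
            operation()
            latencies.append(time.perf_counter() - start)
        return latencies

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        per_worker = list(pool.map(worker, range(concurrency)))
    wall_elapsed = time.perf_counter() - wall_start

    latencies = sorted(lat for batch in per_worker for lat in batch)
    total = len(latencies)
    p95_index = min(total - 1, int(0.95 * total))
    return {
        "concurrency": concurrency,
        "requests": total,
        "tps": total / wall_elapsed,
        "mean_ms": statistics.mean(latencies) * 1000,
        "p95_ms": latencies[p95_index] * 1000,
    }


def fake_query():
    """Hypothetical stand-in for one database round trip (about 1 ms)."""
    time.sleep(0.001)


if __name__ == "__main__":
    # Scalability test: repeat the same workload at rising concurrency.
    for concurrency in (1, 4, 16):
        result = run_load_test(fake_query, concurrency, requests_per_worker=50)
        print(
            f"{result['concurrency']:>2} threads: "
            f"{result['tps']:.0f} TPS, "
            f"mean {result['mean_ms']:.2f} ms, "
            f"p95 {result['p95_ms']:.2f} ms"
        )
```

Repeating the same workload at several concurrency levels is what exposes the behavior described in the benchmarks above, where some systems lead at low concurrency and others only pull ahead as the number of concurrent clients grows.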
If you have performance measurements for other NewSQL databases, feel free to share them with us.