Independent Infrastructure Performance Benchmarking

Bash
Cloud-Native
Ruby

A cloud infrastructure provider turned to Altoros to run independent performance tests on its virtual machines and to recommend ways of making the system more efficient.

About the project

Our assessment showed that the system’s performance was in fact 20-30% higher than the customer’s own results indicated. Our engineers also drew up a list of recommendations on how to improve the system’s efficiency and gain a competitive advantage.

The need

A third-party company was considering Altoros’s customer for its big data project and wanted to verify that the system could handle the required load. Altoros’s main task was to provide an independent assessment of the Linux and custom OS clusters and to recommend ways of improving the system’s overall performance.

The challenge

The customer reported that in-house tests of the cluster showed the system could process 1 TB of data in 16 minutes and 30 seconds. A standard Hadoop distribution was deployed on 100 Red Hat Linux virtual machines, each with a dual-core CPU, 10 GB of RAM, and 6 TB of disk space. The cloud-native engineers at Altoros had to replicate the tests and verify the results.
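To put the reported figure in perspective, it can be converted into average throughput. The following is a minimal sketch, assuming 1 TB means 10^12 bytes and that the load was spread evenly across the 100 virtual machines; neither assumption is stated in the customer's report.

```python
# Figures reported from the customer's in-house test.
DATA_BYTES = 10**12             # assumption: 1 TB taken as 10^12 bytes
ELAPSED_SECONDS = 16 * 60 + 30  # 16 minutes 30 seconds
NODE_COUNT = 100                # virtual machines in the cluster


def throughput_mb_per_s(data_bytes, seconds):
    """Average throughput in megabytes (10^6 bytes) per second."""
    return data_bytes / seconds / 10**6


cluster_mb_s = throughput_mb_per_s(DATA_BYTES, ELAPSED_SECONDS)
# Assumption: the work is distributed evenly over all nodes.
per_node_mb_s = cluster_mb_s / NODE_COUNT

print(f"cluster:  {cluster_mb_s:.1f} MB/s")
print(f"per node: {per_node_mb_s:.2f} MB/s")
```

Under these assumptions the cluster averages roughly 1 GB/s in aggregate, or about 10 MB/s per virtual machine.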

The solution

Altoros tested both the Linux and custom OS clusters in the customer’s public cloud, varying parameters such as:

  • block size
  • gzip and LZO compression
  • the number of mappers and reducers
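A parameter sweep like this can be described as a grid of job settings. The sketch below is only an illustration: the block sizes and reducer counts are placeholder values, and the property names are those used by Hadoop 2.x and later, which may not match the version deployed in this project. It varies the reducer count only, since the number of map tasks in Hadoop is largely determined by the input split (and therefore block) size.

```python
from itertools import product

# All values below are illustrative placeholders, not the settings used in the tests.
BLOCK_SIZES_MB = [64, 128, 256]
CODECS = {
    "none": None,
    "gzip": "org.apache.hadoop.io.compress.GzipCodec",
    "lzo": "com.hadoop.compression.lzo.LzoCodec",  # requires the separate hadoop-lzo library
}
REDUCER_COUNTS = [3, 6]


def job_properties(block_mb, codec_class, reducers):
    """Build the Hadoop properties for one benchmark run."""
    props = {
        "dfs.blocksize": str(block_mb * 1024 * 1024),
        "mapreduce.job.reduces": str(reducers),
        "mapreduce.map.output.compress": "true" if codec_class else "false",
    }
    if codec_class:
        props["mapreduce.map.output.compress.codec"] = codec_class
    return props


def test_matrix():
    """Yield (codec name, properties) for every combination in the grid."""
    for block_mb, (name, codec), reducers in product(
        BLOCK_SIZES_MB, CODECS.items(), REDUCER_COUNTS
    ):
        yield name, job_properties(block_mb, codec, reducers)


runs = list(test_matrix())
# Print the first two combinations as -D options for a Hadoop job command line.
for name, props in runs[:2]:
    print(" ".join(f"-D{key}={value}" for key, value in props.items()))
```

Each generated set of `-D` options could then be passed to a benchmark job such as TeraSort, so that every combination is run and timed under otherwise identical conditions.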

The Linux clusters demonstrated similar results whether gzip and LZO compression were enabled or disabled. However, when LZO compression was enabled on the custom OS cluster, its performance improved by 20%. Changing the number and ratios of Map and Reduce tasks (from three to six) during query processing had little effect on the Linux cluster, while the custom OS cluster performed better with six map tasks.

We also analyzed how much time was spent on each task of the Map and Reduce jobs on the Linux cluster. Profiling with Starfish showed that most of the time was spent in the shuffle phase, when I/O increased. The test was carried out using 100 GB of TeraSort data.

To meet security requirements, developers at Altoros delivered an Apache Kafka-based policy protocol.

The outcome

According to Altoros’s tests, a virtual machine with Ubuntu Linux installed processed 1 TB of TeraSort test data in 13.65 minutes, which is 1.2 times faster than the customer’s test results. Featuring enhanced CPU bursting and improved disk input/output speed, virtual machines with the custom OS installed completed the same task in 6 minutes, which is 2.75 times faster than the results of the initial benchmarking.
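The speedup factors quoted above follow directly from the reported run times. As a quick check, assuming the customer's in-house result of 16 minutes 30 seconds is the baseline for both comparisons:

```python
BASELINE_MIN = 16.5  # customer's in-house result: 16 min 30 s
RESULTS_MIN = {"Linux": 13.65, "custom OS": 6.0}  # run times measured by Altoros


def speedup(baseline, measured):
    """How many times faster the measured run is than the baseline."""
    return baseline / measured


for name, minutes in RESULTS_MIN.items():
    print(f"{name}: {speedup(BASELINE_MIN, minutes):.2f}x")
```

This gives about 1.21x for the Linux machines and 2.75x for the custom OS machines, consistent with the figures in the text.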

The tests revealed that non-optimized Linux machines become unstable if a cluster exceeds a certain size. The customer’s team can later use the reports, instructions, and scripts provided by Altoros to replicate the test results or to improve the system’s stability.

Technology stack

Server platforms

Ubuntu, CentOS

Client platforms/Application servers

Ubuntu, MacOS, Windows

Programming languages

Ruby, Bash, Scala

Technologies

Apache Hadoop, Ganglia, Opscode Chef

Want to develop something similar?

Ryan Meharg

Technical Director

ryan.m@altoros.com

650 265-2266

4900 Hopyard Rd. Suite 100 Pleasanton, CA 94588