Hadoop + GPU: Boost Performance of Your Big Data Project by 50x-200x?

This tech study explores the bottlenecks that arise when offloading Hadoop calculations from a CPU to a GPU and suggests which libraries and frameworks to use.

Hadoop has changed the way we deal with big data, improving processing performance many times over. The question is, can we make it work even faster? What about offloading calculations from a CPU to a GPU, which is designed to perform complex mathematical tasks? In theory, a GPU could perform calculations 50–100 times faster than a CPU.

This paper explores the idea in detail, providing an overview of performance bottlenecks and suggesting the tools to use.
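To see why data transfer matters for the theoretical figure, here is a minimal back-of-the-envelope sketch. It assumes a simple model in which the GPU shortens only the compute portion of a job, while copying data to and from the GPU adds a fixed cost. The function name `effective_speedup` and all numbers are hypothetical illustrations, not measurements from this study.

```python
def effective_speedup(cpu_seconds, kernel_speedup, transfer_seconds):
    """Overall speedup when compute moves to the GPU but data must still be copied.

    cpu_seconds      -- time the job takes on the CPU alone (hypothetical)
    kernel_speedup   -- how much faster the GPU runs the calculation itself
    transfer_seconds -- extra time spent moving data to and from the GPU
    """
    gpu_seconds = cpu_seconds / kernel_speedup + transfer_seconds
    return cpu_seconds / gpu_seconds


if __name__ == "__main__":
    # A 100-second CPU job with a GPU that computes 50x faster (assumed values).
    for transfer in (0.0, 1.0, 10.0):
        speedup = effective_speedup(100.0, 50.0, transfer)
        print(f"transfer={transfer:>4}s -> overall speedup {speedup:.1f}x")
```

Under these assumed numbers, 10 seconds of transfer overhead reduces a 50x compute speedup to roughly 8x overall, which is why the data path deserves as much attention as the calculation itself.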

Key take-aways:

  • View a diagram of the data flow between CPU, GPU, HDD, and memory
  • Discover the impact of data transfer
  • See real-life performance results achieved by different projects
  • Get a list of tools and libraries for using GPU capabilities with Hadoop