{"id":52389,"date":"2012-12-06T15:42:07","date_gmt":"2012-12-06T12:42:07","guid":{"rendered":"https:\/\/www.altoros.com\/blog\/?p=52389"},"modified":"2020-03-30T17:02:34","modified_gmt":"2020-03-30T14:02:34","slug":"hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2","status":"publish","type":"post","link":"https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/","title":{"rendered":"Hadoop on Windows Azure: Hive vs. JavaScript for Processing Big Data"},"content":{"rendered":"<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_79_2 counter-hierarchy ez-toc-counter ez-toc-transparent ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of Contents<\/p>\n<span class=\"ez-toc-title-toggle\"><a href=\"#\" class=\"ez-toc-pull-right ez-toc-btn ez-toc-btn-xs ez-toc-btn-default ez-toc-toggle\" aria-label=\"Toggle Table of Content\"><span class=\"ez-toc-js-icon-con\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #999;color:#999\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewBox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #999;color:#999\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewBox=\"0 0 24 24\" version=\"1.2\" baseProfile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/span><\/a><\/span><\/div>\n<nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li 
class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/#Performance_evaluation\" >Performance evaluation<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/#Further_reading\" >Further reading<\/a><\/li><\/ul><\/nav><\/div>\n<h3><span class=\"ez-toc-section\" id=\"Performance_evaluation\"><\/span>Performance evaluation<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2020\/03\/wp_hadoop_big_1.png\" alt=\"\" width=\"158\" height=\"173\" class=\"alignright size-full wp-image-52390\" \/>For some time, Microsoft did not offer a solution for processing big data in cloud environments. SQL Server is good for storage, but its ability to analyze terabytes of data is limited. Hadoop, which was designed for this purpose, is written in Java and was not available to .NET developers. So, Microsoft launched the <strong>Hadoop on Windows Azure<\/strong> service to make it possible to distribute the load and speed up big data computations.<\/p>\n<p>The R&amp;D engineers of Altoros evaluated two out-of-the-box ways of processing big data with Hadoop on Windows Azure\u2014<em>Hive querying and JavaScript implementations<\/em>\u2014and compared their performance.<\/p>\n<p>For the research, we created eight types of queries in both languages and measured how fast they were processed. 
Since we wanted to test how the system would handle big data, we downloaded information on US Air Carrier Flight Delays from the Windows Azure Marketplace and generated a data set of 9.15 GB.<\/p>\n<p>The <a href=\"https:\/\/www.altoros.com\/research-papers\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data\/\">evaluation<\/a> reveals how additional grouping parameters in a query and the type of arithmetic operation affect throughput. It also shows how the number of MapReduce tasks relates to the speed of calculations. In addition, the paper draws conclusions on how the HDFS block size (8 MB, 64 MB, and 256 MB) influences performance. The paper includes two tables and three graphs with the findings.<\/p>\n<p>&nbsp;<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Further_reading\"><\/span>Further reading<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<ul>\n<li><a href=\"https:\/\/www.altoros.com\/research-papers\/hadoop-distributions-cloudera-vs-hortonworks-vs-mapr\/\">Hadoop Distributions: Cloudera vs. Hortonworks vs. MapR<\/a><\/li>\n<li><a href=\"https:\/\/www.altoros.com\/blog\/performance-of-raid-arrays-on-windows-azure-an-alternative-to-horizontal-scaling\/\">Performance of RAID Arrays on Windows Azure: an Alternative to Horizontal Scaling<\/a><\/li>\n<li><a href=\"https:\/\/www.altoros.com\/blog\/hadoop-distributions-comparison-and-top-5-trends\/\">Hadoop Distributions: Comparison and Top 5 Trends<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Performance evaluation<\/p>\n<p>For some time, Microsoft did not offer a solution for processing big data in cloud environments. SQL Server is good for storage, but its ability to analyze terabytes of data is limited. 
Hadoop, which was designed for this purpose, is written in Java and was not available to .NET [&#8230;]<\/p>\n","protected":false},"author":34,"featured_media":52394,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","footnotes":"","_links_to":"","_links_to_target":""},"categories":[7],"tags":[894,895],"class_list":["post-52389","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-news-and-opinion","tag-benchmarking","tag-research-and-development"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.6 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Hadoop on Windows Azure: Hive vs. JavaScript for Processing Big Data | Altoros<\/title>\n<meta name=\"description\" content=\"The blog post shares the performance comparison of Hive querying and JavaScript implementations for processing big data with Hadoop on Windows Azure.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Hadoop on Windows Azure: Hive vs. JavaScript for Processing Big Data | Altoros\" \/>\n<meta property=\"og:description\" content=\"Performance evaluation For some time, Microsoft did not offer a solution for processing big data in cloud environments. SQL Server is good for storage, but its ability to analyze terabytes of data is limited. 
Hadoop, which was designed for this purpose, is written in Java and was not available to .NET [...]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/\" \/>\n<meta property=\"og:site_name\" content=\"Altoros\" \/>\n<meta property=\"article:published_time\" content=\"2012-12-06T12:42:07+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2020-03-30T14:02:34+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2012\/12\/Hadoop-on-Windows-Azure.png\" \/>\n\t<meta property=\"og:image:width\" content=\"640\" \/>\n\t<meta property=\"og:image:height\" content=\"360\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Alena Vasilenko\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Alena Vasilenko\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/\",\"url\":\"https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/\",\"name\":\"Hadoop on Windows Azure: Hive vs. 
JavaScript for Processing Big Data | Altoros\",\"isPartOf\":{\"@id\":\"https:\/\/www.altoros.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2012\/12\/Hadoop-on-Windows-Azure.png\",\"datePublished\":\"2012-12-06T12:42:07+00:00\",\"dateModified\":\"2020-03-30T14:02:34+00:00\",\"author\":{\"@id\":\"https:\/\/www.altoros.com\/blog\/#\/schema\/person\/019e8147b835bc8f1b4abd8a4fa42c7f\"},\"breadcrumb\":{\"@id\":\"https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/#primaryimage\",\"url\":\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2012\/12\/Hadoop-on-Windows-Azure.png\",\"contentUrl\":\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2012\/12\/Hadoop-on-Windows-Azure.png\",\"width\":640,\"height\":360},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.altoros.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Hadoop on Windows Azure: Hive vs. 
JavaScript for Processing Big Data\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.altoros.com\/blog\/#website\",\"url\":\"https:\/\/www.altoros.com\/blog\/\",\"name\":\"Altoros\",\"description\":\"Insight\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.altoros.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.altoros.com\/blog\/#\/schema\/person\/019e8147b835bc8f1b4abd8a4fa42c7f\",\"name\":\"Alena Vasilenko\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.altoros.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2019\/06\/alena-vasilenko-author-e1561752194994-96x96.jpg\",\"contentUrl\":\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2019\/06\/alena-vasilenko-author-e1561752194994-96x96.jpg\",\"caption\":\"Alena Vasilenko\"},\"description\":\"Alena Vasilenko is Communications Manager at Altoros. She has a proven track record of supporting R&amp;D engineers in their research activities on such topics as big data and cloud computing, translating the research results into easy-to-understand stories.\",\"url\":\"https:\/\/www.altoros.com\/blog\/author\/alena-vasilenko\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Hadoop on Windows Azure: Hive vs. 
JavaScript for Processing Big Data | Altoros","description":"The blog post shares the performance comparison of Hive querying and JavaScript implementations for processing big data with Hadoop on Windows Azure.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/","og_locale":"en_US","og_type":"article","og_title":"Hadoop on Windows Azure: Hive vs. JavaScript for Processing Big Data | Altoros","og_description":"Performance evaluation For some time, Microsoft did not offer a solution for processing big data in cloud environments. SQL Server is good for storage, but its ability to analyze terabytes of data is limited. Hadoop, which was designed for this purpose, is written in Java and was not available to .NET [...]","og_url":"https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/","og_site_name":"Altoros","article_published_time":"2012-12-06T12:42:07+00:00","article_modified_time":"2020-03-30T14:02:34+00:00","og_image":[{"width":640,"height":360,"url":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2012\/12\/Hadoop-on-Windows-Azure.png","type":"image\/png"}],"author":"Alena Vasilenko","twitter_misc":{"Written by":"Alena Vasilenko","Est. reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/","url":"https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/","name":"Hadoop on Windows Azure: Hive vs. 
JavaScript for Processing Big Data | Altoros","isPartOf":{"@id":"https:\/\/www.altoros.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/#primaryimage"},"image":{"@id":"https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/#primaryimage"},"thumbnailUrl":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2012\/12\/Hadoop-on-Windows-Azure.png","datePublished":"2012-12-06T12:42:07+00:00","dateModified":"2020-03-30T14:02:34+00:00","author":{"@id":"https:\/\/www.altoros.com\/blog\/#\/schema\/person\/019e8147b835bc8f1b4abd8a4fa42c7f"},"breadcrumb":{"@id":"https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/#primaryimage","url":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2012\/12\/Hadoop-on-Windows-Azure.png","contentUrl":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2012\/12\/Hadoop-on-Windows-Azure.png","width":640,"height":360},{"@type":"BreadcrumbList","@id":"https:\/\/www.altoros.com\/blog\/hadoop-on-windows-azure-hive-vs-javascript-for-processing-big-data-2\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.altoros.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Hadoop on Windows Azure: Hive vs. 
JavaScript for Processing Big Data"}]},{"@type":"WebSite","@id":"https:\/\/www.altoros.com\/blog\/#website","url":"https:\/\/www.altoros.com\/blog\/","name":"Altoros","description":"Insight","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.altoros.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.altoros.com\/blog\/#\/schema\/person\/019e8147b835bc8f1b4abd8a4fa42c7f","name":"Alena Vasilenko","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.altoros.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2019\/06\/alena-vasilenko-author-e1561752194994-96x96.jpg","contentUrl":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2019\/06\/alena-vasilenko-author-e1561752194994-96x96.jpg","caption":"Alena Vasilenko"},"description":"Alena Vasilenko is Communications Manager at Altoros. 
She has a proven track record of supporting R&amp;D engineers in their research activities on such topics as big data and cloud computing, translating the research results into easy-to-understand stories.","url":"https:\/\/www.altoros.com\/blog\/author\/alena-vasilenko\/"}]}},"_links":{"self":[{"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/posts\/52389","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/users\/34"}],"replies":[{"embeddable":true,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/comments?post=52389"}],"version-history":[{"count":9,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/posts\/52389\/revisions"}],"predecessor-version":[{"id":52392,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/posts\/52389\/revisions\/52392"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/media\/52394"}],"wp:attachment":[{"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/media?parent=52389"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/categories?post=52389"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/tags?post=52389"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}