{"id":20122,"date":"2016-09-08T21:27:59","date_gmt":"2016-09-09T05:27:59","guid":{"rendered":"https:\/\/www.altoros.com\/blog\/?p=20122"},"modified":"2021-03-18T12:20:43","modified_gmt":"2021-03-18T09:20:43","slug":"processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights","status":"publish","type":"post","link":"https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/","title":{"rendered":"Processing Data on IBM Bluemix: Streaming Analytics, Apache Spark, and BigInsights"},"content":{"rendered":"<p>Essentially, the Internet of Things is about collecting and exchanging data, which then can be used in many different ways. Equipment fault monitoring, predictive maintenance, or real-time diagnostics are only a few of the possible scenarios. Dealing with all this information, however, creates certain challenges for the field of the IoT, and stream processing of huge amounts of data is among them.<\/p>\n<p>In this article, we compare <a href=\"https:\/\/www.altoros.com\/blog\/tag\/ibm-bluemix\/\">IBM Bluemix<\/a> services for real-time processing of data streams\u2014IBM InfoSphere Streams, a managed Apache Spark service, and IBM BigInsights for Apache Hadoop.<\/p>\n<p>&nbsp;<\/p>\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_82_2 counter-hierarchy ez-toc-counter ez-toc-transparent ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of Contents<\/p>\n<span class=\"ez-toc-title-toggle\"><a href=\"#\" class=\"ez-toc-pull-right ez-toc-btn ez-toc-btn-xs ez-toc-btn-default ez-toc-toggle\" aria-label=\"Toggle Table of Content\"><span class=\"ez-toc-js-icon-con\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #999;color:#999\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" 
viewBox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #999;color:#999\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewBox=\"0 0 24 24\" version=\"1.2\" baseProfile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/span><\/a><\/span><\/div>\n<nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/#Streaming_Analytics\" >Streaming Analytics<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/#Apache_Spark_on_Bluemix\" >Apache Spark on Bluemix<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/#IBM_BigInsights_for_Apache_Hadoop\" >IBM BigInsights for Apache Hadoop<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/#Conclusions\" >Conclusions<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-5\" 
href=\"https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/#Further_reading\" >Further reading<\/a><\/li><\/ul><\/nav><\/div>\n<h3><span class=\"ez-toc-section\" id=\"Streaming_Analytics\"><\/span>Streaming Analytics<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p><a href=\"https:\/\/cloud.ibm.com\/catalog\/services\/streaming-analytics\" target=\"_blank\" rel=\"noopener noreferrer\">IBM InfoSphere Streams<\/a> is a platform for stream processing. It provides a wide set of connectors to different data stream sources and has good support of standard stream processing algorithms and approaches. For programming with this technology, a special Eclipse-based IDE\u2014IBM Streams Studio\u2014and a set of domain-specific languages are included. The main language used for processing data streams is <a href=\"https:\/\/www.ibm.com\/support\/knowledgecenter\/SSCRJU_3.2.0\/com.ibm.swg.im.infosphere.streams.spl-introductory-tutorial.doc\/doc\/tutorial-container.html\" target=\"_blank\" rel=\"noopener noreferrer\">Streams Processing Language<\/a> (SPL).<\/p>\n<p>Below is an example of event processing using IBM Streams SPL:<\/p>\n<pre class=\"brush: plain; title: ; notranslate\" title=\"\">composite HelloWorld {                                                   \r\n  graph                                                                  \r\n    stream&lt;rstring message&gt; Hi = Beacon() {                              \r\n      param iterations : 1u;                                             \r\n      output Hi : message = &quot;Hello, world!&quot;;                             \r\n    }                                                                    \r\n    () as Sink = Custom(Hi) {                                            \r\n      logic onTuple Hi : printStringLn(message);                         \r\n    }                                                                    
\r\n}<\/pre>\n<p><\/p>\n<p><center><a href=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/01\/ibm-bluemix-ibm-streams-studio.png\"><img decoding=\"async\" src=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/01\/ibm-bluemix-ibm-streams-studio.png\" alt=\"ibm-bluemix-ibm-streams-studio\" width=\"640\" class=\"aligncenter size-full wp-image-20138\" \/><\/a><small>IBM Streams Studio<\/small><\/center><\/p>\n<p>The advantages of IBM Streams:<\/p>\n<ul>\n<li>Large set of connectors<\/li>\n<li>Good support for standard text processing and pattern matching algorithms<\/li>\n<li>High scalability and performance<\/li>\n<\/ul>\n<p>The disadvantages:<\/p>\n<ul>\n<li>It requires learning new programming languages to get started.<\/li>\n<li>IBM Streams Studio is difficult to install, and it does not support all desktop platforms.<\/li>\n<li>It is not possible to develop without the proprietary IBM Streams Studio.<\/li>\n<li>You cannot reuse app logic implemented in SPL on other platforms or in other apps.<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Apache_Spark_on_Bluemix\"><\/span>Apache Spark on Bluemix<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p><a href=\"https:\/\/spark.apache.org\" target=\"_blank\" rel=\"noopener noreferrer\">Apache Spark<\/a> is a fast and general engine for large-scale data processing. On Bluemix, it is available as part of two services: <em>Apache Spark<\/em> and <em>BigInsights for Apache Hadoop<\/em>.<\/p>\n<p>The <a href=\"https:\/\/cloud.ibm.com\/catalog\/services\/apache-spark\" target=\"_blank\" rel=\"noopener noreferrer\">Spark service on Bluemix<\/a> consists of a separate Spark master and worker nodes, and it also provides <a href=\"http:\/\/jupyter.org\" target=\"_blank\" rel=\"noopener noreferrer\">Jupyter<\/a>, an interactive code editor. As the default file storage, the service uses Swift-based IBM Bluemix Object Storage. 
However, this integration is still incomplete: additional code is required to configure and apply parameters in Spark. Currently, the Spark service supports Scala, R, and Python. For stream processing, you can work with <a href=\"https:\/\/spark.apache.org\/streaming\/\" target=\"_blank\" rel=\"noopener noreferrer\">Spark Streaming<\/a>.<\/p>\n<p><center><a href=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/01\/ibm-bluemix-apache-spark-jupyter.png\"><img decoding=\"async\" src=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/01\/ibm-bluemix-apache-spark-jupyter.png\" alt=\"ibm-bluemix-apache-spark-jupyter\" width=\"640\" class=\"aligncenter size-full wp-image-20136\" \/><\/a><small>Jupyter editor for the Bluemix Spark service<\/small><\/center><\/p>\n<p>The advantages of Apache Spark on Bluemix:<\/p>\n<ul>\n<li>Good documentation<\/li>\n<li>Large community<\/li>\n<li>Support for existing Python, R, and Scala libraries<\/li>\n<li>No need to learn a new language<\/li>\n<li>High performance and scalability<\/li>\n<li>Reliability<\/li>\n<\/ul>\n<p>The disadvantages:<\/p>\n<ul>\n<li style=\"margin-bottom:12px;\">Integration with other Bluemix services, such as Object Storage, IBM BigInsights (Hadoop), and IBM Message Hub, is not perfect.<\/li>\n<li style=\"margin-bottom:12px;\">You need a lot of time to get started. At the moment, it is quite challenging to integrate different data services with Spark: several of the required steps are difficult and not well documented. Developing a real stream processing flow requires even more time, because all stream providers and data stores have to be integrated.<\/li>\n<li style=\"margin-bottom:12px;\">It is problematic to program. Internal integrations with other services, including third-party libraries, are quite complicated to develop and debug. 
You have to use the Jupyter editor for debugging and another IDE for development, which requires constantly synchronizing code between the two.\n<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<h3><span class=\"ez-toc-section\" id=\"IBM_BigInsights_for_Apache_Hadoop\"><\/span>IBM BigInsights for Apache Hadoop<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p><a href=\"https:\/\/cloud.ibm.com\/catalog\/services\/biginsights-for-apache-hadoop\" target=\"_blank\" rel=\"noopener noreferrer\">IBM BigInsights<\/a> enables you to create a Hadoop cluster with one click. It comes with an Apache Spark service pre-installed on all cluster nodes. Unfortunately, BigInsights does not provide any web-based interface for uploading and managing Spark jobs. You can submit jobs to this cluster only via the command line. Debugging these jobs is not trivial either. BigInsights offers the Ambari web console for monitoring cluster health.<\/p>\n<p><center><a href=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/01\/ibm-bluemix-biginsights-for-apache-hadoop-ambari.png\"><img decoding=\"async\" src=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/01\/ibm-bluemix-biginsights-for-apache-hadoop-ambari.png\" alt=\"ibm-bluemix-biginsights-for-apache-hadoop-ambari\" width=\"640\" class=\"aligncenter size-full wp-image-20137\" \/><\/a><small>Ambari cluster manager for the BigInsights service<\/small><\/center><\/p>\n<p>The advantages of IBM BigInsights for Apache Hadoop:<\/p>\n<ul>\n<li>You can quickly create a Hadoop cluster with good integration between all components.<\/li>\n<li>You can store streams in a Hadoop-based data lake as well as process them.<\/li>\n<\/ul>\n<p>The disadvantages include:<\/p>\n<ul>\n<li>Integration with other Bluemix services is weak.<\/li>\n<li>There is no web console for creating, managing, and executing stream processing jobs or for interactive programming of stream processing workflows.<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<h3><span class=\"ez-toc-section\" 
id=\"Conclusions\"><\/span>Conclusions<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>As described above, Bluemix has three services for programmable processing of data streams: IBM Streams, Apache Spark, and IBM BigInsights for Hadoop. The comparison is summarized in the table below.<\/p>\n<style type=\"text\/css\">\n.myTable { background-color:white;border-collapse:collapse; } \n.myTable th { background-color:#E0E0E0;color:black;width:25%; } \n.myTable td, .myTable th { text-align:left;vertical-align:top;padding:5px;border:1px solid #989898; }\n<\/style>\n<p><center><small><\/p>\n<table class=\"myTable\" width=\"90%\">\n<thead>\n<tr>\n<th>Criteria<\/th>\n<th>IBM Streams<\/th>\n<th>Apache Spark on Bluemix<\/th>\n<th>IBM BigInsights for Hadoop<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Languages<\/td>\n<td>SPL, Java<\/td>\n<td>Scala, R, Python<\/td>\n<td>Scala, R, Python<\/td>\n<\/tr>\n<tr>\n<td>Development tools<\/td>\n<td>IBM Streams Studio<\/td>\n<td>Web-based Jupyter or any desktop IDE<\/td>\n<td>Any desktop IDE<\/td>\n<\/tr>\n<tr>\n<td>Integration with file storage<\/td>\n<td>Difficult<\/td>\n<td>Difficult<\/td>\n<td>Simple (the HDFS provided by BigInsights can be used)<\/td>\n<\/tr>\n<tr>\n<td>Integration with other Bluemix services<\/td>\n<td>Poor<\/td>\n<td>Poor<\/td>\n<td>Poor<\/td>\n<\/tr>\n<tr>\n<td>Reuse of third-party libraries<\/td>\n<td>Difficult<\/td>\n<td>Simple<\/td>\n<td>Simple<\/td>\n<\/tr>\n<tr>\n<td>Time to start for beginners<\/td>\n<td>Long<\/td>\n<td>Long<\/td>\n<td>Long<\/td>\n<\/tr>\n<tr>\n<td>Community<\/td>\n<td>Small<\/td>\n<td>Large<\/td>\n<td>Large<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><\/small><\/center><\/p>\n<p>Among the services, Apache Spark and IBM BigInsights for Hadoop are the easiest and most comfortable to work with. 
IBM Streams is more difficult to use, but it has good support for techniques and algorithms for <a href=\"https:\/\/www.ibm.com\/developerworks\/library\/bd-streamstextanalytics\/\" target=\"_blank\" rel=\"noopener noreferrer\">stream text processing<\/a> and pattern matching. However, all three share the same issue: their integration with other Bluemix services could be much better. Solving this problem would make the Bluemix data stream processing solutions more appealing to users.<\/p>\n<p>&nbsp;<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Further_reading\"><\/span>Further reading<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<ul>\n<li><a href=\"https:\/\/www.altoros.com\/blog\/using-ibm-analytics-for-apache-spark-in-java-scala-apps-on-bluemix\/\">Bluemix Tutorial: Using IBM Analytics for Apache Spark in Java\/Scala Apps<\/a><\/li>\n<li><a href=\"https:\/\/www.altoros.com\/blog\/using-spark-streaming-apache-kafka-and-object-storage-for-stream-processing-on-bluemix\/\">Using Spark Streaming, Apache Kafka, and Object Storage for Stream Processing on Bluemix<\/a><\/li>\n<\/ul>\n<hr>\n<p><center><small>The post was written by <a href=\"https:\/\/www.altoros.com\/blog\/author\/ilya-drabenia\/\">Ilya Drabenia<\/a> and edited by <a href=\"https:\/\/www.altoros.com\/blog\/author\/viktoryia-fedzkovich\/\">Viktoria Fedzkovich<\/a>.<\/small><\/center><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Essentially, the Internet of Things is about collecting and exchanging data, which then can be used in many different ways. Equipment fault monitoring, predictive maintenance, or real-time diagnostics are only a few of the possible scenarios. 
Dealing with all this information, however, creates certain challenges for the field of the [&#8230;]<\/p>\n","protected":false},"author":71,"featured_media":29489,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","footnotes":"","_links_to":"","_links_to_target":""},"categories":[214],"tags":[873,187,117],"class_list":["post-20122","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tutorials","tag-cloud-native","tag-ibm-bluemix","tag-iot"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Processing Data on IBM Bluemix: Streaming Analytics, Apache Spark, and BigInsights | Altoros<\/title>\n<meta name=\"description\" content=\"This blog post compares the services across ease of use, integration with file storage,  supported programming languages and development tools, etc.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Processing Data on IBM Bluemix: Streaming Analytics, Apache Spark, and BigInsights | Altoros\" \/>\n<meta property=\"og:description\" content=\"Essentially, the Internet of Things is about collecting and exchanging data, which then can be used in many different ways. Equipment fault monitoring, predictive maintenance, or real-time diagnostics are only a few of the possible scenarios. 
Dealing with all this information, however, creates certain challenges for the field of the [...]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/\" \/>\n<meta property=\"og:site_name\" content=\"Altoros\" \/>\n<meta property=\"article:published_time\" content=\"2016-09-09T05:27:59+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2021-03-18T09:20:43+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2016\/09\/ibm-bluemix-data-streaming_version-2.gif\" \/>\n\t<meta property=\"og:image:width\" content=\"640\" \/>\n\t<meta property=\"og:image:height\" content=\"430\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/gif\" \/>\n<meta name=\"author\" content=\"Ilya Drabenia\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Ilya Drabenia\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\\\/\"},\"author\":{\"name\":\"Ilya Drabenia\",\"@id\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/#\\\/schema\\\/person\\\/48c2eaf6d86abc9fa945bc3860fa2dc2\"},\"headline\":\"Processing Data on IBM Bluemix: Streaming Analytics, Apache Spark, and BigInsights\",\"datePublished\":\"2016-09-09T05:27:59+00:00\",\"dateModified\":\"2021-03-18T09:20:43+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\\\/\"},\"wordCount\":932,\"commentCount\":0,\"image\":{\"@id\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/wp-content\\\/uploads\\\/2016\\\/09\\\/ibm-bluemix-data-streaming_version-2.gif\",\"keywords\":[\"Cloud-Native\",\"IBM 
Bluemix\",\"IoT\"],\"articleSection\":[\"Tutorials\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/www.altoros.com\\\/blog\\\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\\\/\",\"url\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\\\/\",\"name\":\"Processing Data on IBM Bluemix: Streaming Analytics, Apache Spark, and BigInsights | Altoros\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/wp-content\\\/uploads\\\/2016\\\/09\\\/ibm-bluemix-data-streaming_version-2.gif\",\"datePublished\":\"2016-09-09T05:27:59+00:00\",\"dateModified\":\"2021-03-18T09:20:43+00:00\",\"author\":{\"@id\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/#\\\/schema\\\/person\\\/48c2eaf6d86abc9fa945bc3860fa2dc2\"},\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.altoros.com\\\/blog\\\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/processing-data-on-ibm-bluemix-str
eaming-analytics-apache-spark-and-biginsights\\\/#primaryimage\",\"url\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/wp-content\\\/uploads\\\/2016\\\/09\\\/ibm-bluemix-data-streaming_version-2.gif\",\"contentUrl\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/wp-content\\\/uploads\\\/2016\\\/09\\\/ibm-bluemix-data-streaming_version-2.gif\",\"width\":640,\"height\":430},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Processing Data on IBM Bluemix: Streaming Analytics, Apache Spark, and BigInsights\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/#website\",\"url\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/\",\"name\":\"Altoros\",\"description\":\"Insight\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/#\\\/schema\\\/person\\\/48c2eaf6d86abc9fa945bc3860fa2dc2\",\"name\":\"Ilya Drabenia\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/wp-content\\\/uploads\\\/2017\\\/01\\\/1108467-150x150.jpg\",\"url\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/wp-content\\\/uploads\\\/2017\\\/01\\\/1108467-150x150.jpg\",\"contentUrl\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/wp-content\\\/uploads\\\/2017\\\/01\\\/1108467-150x150.jpg\",\"caption\":\"Ilya Drabenia\"},\"description\":\"Ilya Drabenia is a Technical Lead at Altoros. 
He has broad experience in building software architectures, including design and development of complex solutions. Ilya is passionate about microservices, domain-driven design, as well as scalable and parallel algorithms. He also holds an MSc degree in Computer Science. See his profile on GitHub.\",\"url\":\"https:\\\/\\\/www.altoros.com\\\/blog\\\/author\\\/ilya-drabenia\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Processing Data on IBM Bluemix: Streaming Analytics, Apache Spark, and BigInsights | Altoros","description":"This blog post compares the services across ease of use, integration with file storage,  supported programming languages and development tools, etc.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/","og_locale":"en_US","og_type":"article","og_title":"Processing Data on IBM Bluemix: Streaming Analytics, Apache Spark, and BigInsights | Altoros","og_description":"Essentially, the Internet of Things is about collecting and exchanging data, which then can be used in many different ways. Equipment fault monitoring, predictive maintenance, or real-time diagnostics are only a few of the possible scenarios. 
Dealing with all this information, however, creates certain challenges for the field of the [...]","og_url":"https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/","og_site_name":"Altoros","article_published_time":"2016-09-09T05:27:59+00:00","article_modified_time":"2021-03-18T09:20:43+00:00","og_image":[{"width":640,"height":430,"url":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2016\/09\/ibm-bluemix-data-streaming_version-2.gif","type":"image\/gif"}],"author":"Ilya Drabenia","twitter_misc":{"Written by":"Ilya Drabenia","Est. reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/#article","isPartOf":{"@id":"https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/"},"author":{"name":"Ilya Drabenia","@id":"https:\/\/www.altoros.com\/blog\/#\/schema\/person\/48c2eaf6d86abc9fa945bc3860fa2dc2"},"headline":"Processing Data on IBM Bluemix: Streaming Analytics, Apache Spark, and BigInsights","datePublished":"2016-09-09T05:27:59+00:00","dateModified":"2021-03-18T09:20:43+00:00","mainEntityOfPage":{"@id":"https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/"},"wordCount":932,"commentCount":0,"image":{"@id":"https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/#primaryimage"},"thumbnailUrl":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2016\/09\/ibm-bluemix-data-streaming_version-2.gif","keywords":["Cloud-Native","IBM 
Bluemix","IoT"],"articleSection":["Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/","url":"https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/","name":"Processing Data on IBM Bluemix: Streaming Analytics, Apache Spark, and BigInsights | Altoros","isPartOf":{"@id":"https:\/\/www.altoros.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/#primaryimage"},"image":{"@id":"https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/#primaryimage"},"thumbnailUrl":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2016\/09\/ibm-bluemix-data-streaming_version-2.gif","datePublished":"2016-09-09T05:27:59+00:00","dateModified":"2021-03-18T09:20:43+00:00","author":{"@id":"https:\/\/www.altoros.com\/blog\/#\/schema\/person\/48c2eaf6d86abc9fa945bc3860fa2dc2"},"breadcrumb":{"@id":"https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/#primaryimage","url":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2016\/09\/ibm-bluemix-data-streaming_version-2.gif","contentUrl":"https:\/\/www.altoros.com\/blog\/wp
-content\/uploads\/2016\/09\/ibm-bluemix-data-streaming_version-2.gif","width":640,"height":430},{"@type":"BreadcrumbList","@id":"https:\/\/www.altoros.com\/blog\/processing-data-on-ibm-bluemix-streaming-analytics-apache-spark-and-biginsights\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.altoros.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Processing Data on IBM Bluemix: Streaming Analytics, Apache Spark, and BigInsights"}]},{"@type":"WebSite","@id":"https:\/\/www.altoros.com\/blog\/#website","url":"https:\/\/www.altoros.com\/blog\/","name":"Altoros","description":"Insight","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.altoros.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.altoros.com\/blog\/#\/schema\/person\/48c2eaf6d86abc9fa945bc3860fa2dc2","name":"Ilya Drabenia","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/01\/1108467-150x150.jpg","url":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/01\/1108467-150x150.jpg","contentUrl":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/01\/1108467-150x150.jpg","caption":"Ilya Drabenia"},"description":"Ilya Drabenia is a Technical Lead at Altoros. He has broad experience in building software architectures, including design and development of complex solutions. Ilya is passionate about microservices, domain-driven design, as well as scalable and parallel algorithms. He also holds an MSc degree in Computer Science. 
See his profile on GitHub.","url":"https:\/\/www.altoros.com\/blog\/author\/ilya-drabenia\/"}]}},"_links":{"self":[{"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/posts\/20122","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/users\/71"}],"replies":[{"embeddable":true,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/comments?post=20122"}],"version-history":[{"count":22,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/posts\/20122\/revisions"}],"predecessor-version":[{"id":60702,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/posts\/20122\/revisions\/60702"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/media\/29489"}],"wp:attachment":[{"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/media?parent=20122"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/categories?post=20122"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/tags?post=20122"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}