{"id":51769,"date":"2014-01-14T18:04:37","date_gmt":"2014-01-14T15:04:37","guid":{"rendered":"https:\/\/www.altoros.com\/blog\/?p=51769"},"modified":"2020-04-21T15:31:47","modified_gmt":"2020-04-21T12:31:47","slug":"madlib-a-solution-for-big-data-analytics-from-pivotal","status":"publish","type":"post","link":"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/","title":{"rendered":"MADlib: A Solution for Big Data Analytics from Pivotal"},"content":{"rendered":"<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_79_2 counter-hierarchy ez-toc-counter ez-toc-transparent ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of Contents<\/p>\n<span class=\"ez-toc-title-toggle\"><a href=\"#\" class=\"ez-toc-pull-right ez-toc-btn ez-toc-btn-xs ez-toc-btn-default ez-toc-toggle\" aria-label=\"Toggle Table of Content\"><span class=\"ez-toc-js-icon-con\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #999;color:#999\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewBox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #999;color:#999\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewBox=\"0 0 24 24\" version=\"1.2\" baseProfile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/span><\/a><\/span><\/div>\n<nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-3'><a 
class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/#General_overview\" >General overview<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/#Installation\" >Installation<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/#Clustering\" >Clustering<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/#Linear_regression\" >Linear regression<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/#About_the_author\" >About the author<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-6\" href=\"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/#Further_reading\" >Further reading<\/a><\/li><\/ul><\/nav><\/div>\n<h3><span class=\"ez-toc-section\" id=\"General_overview\"><\/span>General overview<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p><img decoding=\"async\" src=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2014\/01\/Madlib-logo.png\" alt=\"\" width=\"150\"  class=\"alignright size-full wp-image-51776\" \/>There are a number of data analytics solutions that support the MapReduce principle and can work with NoSQL databases. 
However, most enterprises still rely on mature SQL data stores and, therefore, need traditional analytics solutions to provide in-depth analysis of their business-critical data.<\/p>\n<p><a href=\"https:\/\/pivotal.io\/products\/madlib\" target=\"_blank\" rel=\"noopener noreferrer\">MADlib<\/a> is a scalable in-database analytics library that features sophisticated mathematical algorithms for SQL-based systems. The solution was developed jointly by researchers from UC Berkeley and engineers from Pivotal (formerly EMC\/Greenplum). It can be considered an enterprise alternative to Hadoop for machine learning, data mining, and statistics tasks. In addition, MADlib supports time series rows, which Hadoop could not process appropriately. This greatly extends the capabilities for building prediction systems. For more information, watch a <a href=\"https:\/\/www.youtube.com\/watch?v=gI7-wqf6pAs\" target=\"_blank\" rel=\"noopener noreferrer\">video overview<\/a> from Pivotal, read <a href=\"https:\/\/dsf.berkeley.edu\/papers\/vldb09-madskills.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">the introduction<\/a> to MADlib, or visit the <a href=\"https:\/\/madlib.apache.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">product page<\/a>.<\/p>\n<p>With some experience in <a href=\"https:\/\/www.wolfram.com\/mathematica\/\" rel=\"noreferrer noopener\" target=\"_blank\">Wolfram Mathematica<\/a>, we were tempted to compare the two products. A presentation claiming high performance and great scalability for MADlib\u2019s built-in machine learning algorithms boosted our curiosity even further. Below is one of the slides taken from that presentation. 
The solution is supposed to process billions of rows in minutes. Impressive math!<\/p>\n<p><center><a href=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2020\/03\/Figure01.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2020\/03\/Figure01.jpg\" alt=\"\" width=\"550\" height=\"383\" class=\"aligncenter size-full wp-image-51770\" \/><\/a><small>The MADlib data processing method<\/small><\/center><\/p>\n<h3><span class=\"ez-toc-section\" id=\"Installation\"><\/span>Installation<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>There is a detailed official <a href=\"https:\/\/github.com\/madlib\/madlib\" target=\"_blank\" rel=\"noopener noreferrer\">installation guide<\/a>, so we faced no difficulties while installing the product. We used MADlib v1.3, the most recent version at the time, and PostgreSQL v9.2 on a single CentOS node. When the installation was finished, we wanted to check whether the algorithms worked properly. For that purpose, it is good to have test data samples, but they were not included. So, right after installation, we had to spend some time finding data to test the solution. It would be great if such sample data sets were added to enable users to play with the product and see how it works.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Clustering\"><\/span>Clustering<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>First, we wanted to see how MADlib deals with basic tasks. We decided to start with <a href=\"https:\/\/www.altoros.com\/blog\/evaluating-the-apriori-algorithm-vs-k-means-clustering-for-a-recommendation-engine\/\">k-means clustering<\/a>. A classic <a href=\"http:\/\/archive.ics.uci.edu\/ml\/datasets\/Wine\" target=\"_blank\" rel=\"noopener noreferrer\">wine data set<\/a> downloaded from the UCI archive served as the test data. Initially, there were two files with wine characteristics. 
They were merged into a single table that had 6,497 records and 14 columns.<\/p>\n<p>According to the <a href=\"http:\/\/devdoc.madlib.net\/v0.4\/group__grp__kmeans.html\" target=\"_blank\" rel=\"noopener noreferrer\">notes<\/a> in the developer documentation, the data had to be presented in the following way before starting the algorithm.<\/p>\n<div class=\"code\">\n<pre>{TABLE|VIEW} data_points (\r\n    ...\r\n    [ point_id INTEGER, ]\r\n    point_coordinates {SVEC|FLOAT[]|INTEGER[]},\r\n    ...\r\n)<\/pre>\n<\/div>\n<p>The coordinates of a point are to be stored as an array in a single column of the table. Usually, however, each coordinate is stored in a separate column, so the data has to be transformed into a table in which all coordinates of a point are stored in one array column. For a data science specialist with little experience in PostgreSQL, this turned into a challenging task.<\/p>\n<p>After the data had been presented in the required way, the algorithm started easily. The results (only centroids) were compared against the results demonstrated by the same algorithm included in Wolfram Mathematica. The figure below demonstrates the clustering results of Wolfram Mathematica (black centroids) vs. MADlib (red centroids). 
Although there were some slight deviations, they were within acceptable limits.<\/p>\n<p><center><a href=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2020\/03\/Figure02.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2020\/03\/Figure02.jpg\" alt=\"\" width=\"449\" height=\"589\" class=\"aligncenter size-full wp-image-51771\" \/><\/a><small>Comparison of k-means algorithms<\/small><\/center><\/p>\n<h3><span class=\"ez-toc-section\" id=\"Linear_regression\"><\/span>Linear regression<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>To evaluate the <a href=\"https:\/\/www.altoros.com\/blog\/using-linear-regression-in-tensorflow\/\">linear regression<\/a> algorithm, the same table with wine characteristics was utilized. Although the <a href=\"https:\/\/madlib.apache.org\/docs\/latest\/group__grp__linreg.html\" target=\"_blank\" rel=\"noopener noreferrer\">documentation<\/a> did not define the expected type of the input matrix, it included an example of how to call the function. From that example, we concluded that no data transformations were required.<\/p>\n<p>The system used the names of the columns with dependent and independent variables as input data for this algorithm, which is quite natural for this kind of task. The whole data set was divided into training and test samples (6,400 and 97 records, respectively). The algorithm successfully handled the task. The predicted results (the blue curve) were compared against the real ones (the purple curve).<\/p>\n<p><center><a href=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2020\/03\/Figure03.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2020\/03\/Figure03.jpg\" alt=\"\" width=\"585\" height=\"390\" class=\"aligncenter size-full wp-image-51772\" \/><\/a><small>MADlib linear regression<\/small><\/center><\/p>\n<p>The eventual results were quite predictable. 
Below, you can see two line charts that show models built with MADlib (the purple curve) and Wolfram Mathematica (the blue curve). Since the two charts overlap, they appear as a single blue-purple line.<\/p>\n<p><center><a href=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2020\/03\/Figure04.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2020\/03\/Figure04.jpg\" alt=\"\" width=\"585\" height=\"390\" class=\"aligncenter size-full wp-image-51773\" \/><\/a><small>A comparison of linear regression models<\/small><\/center><\/p>\n<p>In addition to clustering and linear regression, we also examined MADlib\u2019s implementation of <a href=\"https:\/\/en.wikipedia.org\/wiki\/Singular_value_decomposition\" rel=\"noreferrer noopener\" target=\"_blank\">singular value decomposition<\/a> (SVD) of a sparse matrix and time series analysis. If you are interested in the results, drop us a line.<\/p>\n<p>Have you had a chance to work with MADlib? What was your experience?<\/p>\n<p>&nbsp;<\/p>\n<h3><span class=\"ez-toc-section\" id=\"About_the_author\"><\/span>About the author<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<div>\n<div style=\"float: right;\"><a href=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2014\/01\/sofia-parfenovich.png\"><img decoding=\"async\" src=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2014\/01\/sofia-parfenovich.png\" alt=\"\" width=\"120\" class=\"alignnone size-full wp-image-53382\" \/><\/a><\/div>\n<div style=\"width: 600px;\"><small><strong><a href=\"https:\/\/www.linkedin.com\/in\/sofya-krainova-18a20556\/\" rel=\"noopener noreferrer\" target=\"_blank\">Sofia Parfenovich<\/a><\/strong> is a Data Scientist at Altoros. She is interested in creating association rules for mining large volumes of data with Hadoop and other MapReduce tools. 
Sofia has strong experience in time-series forecasting, building trading strategies, and data analysis.<\/small><\/div>\n<\/div>\n<p>&nbsp;<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Further_reading\"><\/span>Further reading<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<ul>\n<li><a href=\"https:\/\/www.altoros.com\/blog\/building-stock-trading-strategies-20-faster-with-hadoop\/\">Building Stock Trading Strategies: 20% Faster with Hadoop<\/a><\/li>\n<li><a href=\"https:\/\/www.altoros.com\/blog\/using-k-means-clustering-in-tensorflow\/\">Implementing k-means Clustering with TensorFlow<\/a><\/li>\n<li><a href=\"https:\/\/www.altoros.com\/research-papers\/hadoop-based-movie-recommendation-engine-a-comparison-of-the-apriori-algorithm-vs-the-k-means-method\/\">Hadoop-based Movie Recommendation Engine: A Comparison of the Apriori Algorithm vs. the k-means Method<\/a><\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<hr\/>\n<p><center><small>The blog post was written by Sofia Parfenovich and edited by <a href=\"https:\/\/www.altoros.com\/blog\/author\/alex\/\">Alex Khizhniak<\/a>.<\/small><\/center><\/p>\n","protected":false},"excerpt":{"rendered":"<p>General overview<\/p>\n<p>There are a number of data analytics solutions that support the MapReduce principle and are able to work with NoSQL databases. 
However, most enterprises still rely on mature SQL data stores and, therefore, need traditional analytics solutions to provide in-depth analysis of their business-critical data.<\/p>\n<p>MADlib is a scalable in-database [&#8230;]<\/p>\n","protected":false},"author":5,"featured_media":53955,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","footnotes":"","_links_to":"","_links_to_target":""},"categories":[7],"tags":[748,895],"class_list":["post-51769","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-news-and-opinion","tag-machine-learning","tag-research-and-development"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.6 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>MADlib: A Solution for Big Data Analytics from Pivotal | Altoros<\/title>\n<meta name=\"description\" content=\"The blog post explores the MADlib library, featuring its performance clustering and linear regression results in comparison to Wolfram Mathematica.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"MADlib: A Solution for Big Data Analytics from Pivotal | Altoros\" \/>\n<meta property=\"og:description\" content=\"General overview There are a number of data analytics solutions that support the MapReduce principle and are able to work with NoSQL databases. However, most enterprises still rely on mature SQL data stores and, therefore, need traditional analytics solutions to provide in-depth analysis of their business-critical data. 
MADlib is a scalable in-database [...]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/\" \/>\n<meta property=\"og:site_name\" content=\"Altoros\" \/>\n<meta property=\"article:published_time\" content=\"2014-01-14T15:04:37+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2020-04-21T12:31:47+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2014\/01\/MADlib-a-Solution-for-Big-Data-Analytics-from-Pivotal6.gif\" \/>\n\t<meta property=\"og:image:width\" content=\"640\" \/>\n\t<meta property=\"og:image:height\" content=\"360\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/gif\" \/>\n<meta name=\"author\" content=\"Alex Khizhniak\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Alex Khizhniak\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/\",\"url\":\"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/\",\"name\":\"MADlib: A Solution for Big Data Analytics from Pivotal | 
Altoros\",\"isPartOf\":{\"@id\":\"https:\/\/www.altoros.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2014\/01\/MADlib-a-Solution-for-Big-Data-Analytics-from-Pivotal6.gif\",\"datePublished\":\"2014-01-14T15:04:37+00:00\",\"dateModified\":\"2020-04-21T12:31:47+00:00\",\"author\":{\"@id\":\"https:\/\/www.altoros.com\/blog\/#\/schema\/person\/3d914db6ad1b2908c32c0dc5dcabc420\"},\"breadcrumb\":{\"@id\":\"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/#primaryimage\",\"url\":\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2014\/01\/MADlib-a-Solution-for-Big-Data-Analytics-from-Pivotal6.gif\",\"contentUrl\":\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2014\/01\/MADlib-a-Solution-for-Big-Data-Analytics-from-Pivotal6.gif\",\"width\":640,\"height\":360},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.altoros.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"MADlib: A Solution for Big Data Analytics from 
Pivotal\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.altoros.com\/blog\/#website\",\"url\":\"https:\/\/www.altoros.com\/blog\/\",\"name\":\"Altoros\",\"description\":\"Insight\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.altoros.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.altoros.com\/blog\/#\/schema\/person\/3d914db6ad1b2908c32c0dc5dcabc420\",\"name\":\"Alex Khizhniak\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.altoros.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/06\/druzya-edit1-150x150.jpg\",\"contentUrl\":\"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/06\/druzya-edit1-150x150.jpg\",\"caption\":\"Alex Khizhniak\"},\"description\":\"Alex Khizhniak is Director of Technical Content Strategy at Altoros and a cofounder of a local Java User Group. Managing distributed teams since 2004, he has gained experience as a journalist, an editor-in-chief, a technical writer, a technology evangelist, a project manager, and a product owner. Alex is obsessed with AI\/ML, data science, data integration, ETL\/DWH, data quality, databases (SQL\/NoSQL), big data, IoT, and BI. The articles and industry reports he created or helped to publish reached out to 3,000,000+ tech-savvy readers. Some of the pieces were covered on TechRepublic, ebizQ, NetworkWorld, CIO.com, etc. Find him on Twitter at @alxkh.\",\"sameAs\":[\"https:\/\/x.com\/https:\/\/twitter.com\/alxkh\"],\"url\":\"https:\/\/www.altoros.com\/blog\/author\/alex\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"MADlib: A Solution for Big Data Analytics from Pivotal | Altoros","description":"The blog post explores the MADlib library, featuring its performance clustering and linear regression results in comparison to Wolfram Mathematica.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/","og_locale":"en_US","og_type":"article","og_title":"MADlib: A Solution for Big Data Analytics from Pivotal | Altoros","og_description":"General overview There are a number of data analytics solutions that support the MapReduce principle and are able to work with NoSQL databases. However, most enterprises still rely on mature SQL data stores and, therefore, need traditional analytics solutions to provide in-depth analysis of their business-critical data. MADlib is a scalable in-database [...]","og_url":"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/","og_site_name":"Altoros","article_published_time":"2014-01-14T15:04:37+00:00","article_modified_time":"2020-04-21T12:31:47+00:00","og_image":[{"width":640,"height":360,"url":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2014\/01\/MADlib-a-Solution-for-Big-Data-Analytics-from-Pivotal6.gif","type":"image\/gif"}],"author":"Alex Khizhniak","twitter_misc":{"Written by":"Alex Khizhniak","Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/","url":"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/","name":"MADlib: A Solution for Big Data Analytics from Pivotal | Altoros","isPartOf":{"@id":"https:\/\/www.altoros.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/#primaryimage"},"image":{"@id":"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/#primaryimage"},"thumbnailUrl":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2014\/01\/MADlib-a-Solution-for-Big-Data-Analytics-from-Pivotal6.gif","datePublished":"2014-01-14T15:04:37+00:00","dateModified":"2020-04-21T12:31:47+00:00","author":{"@id":"https:\/\/www.altoros.com\/blog\/#\/schema\/person\/3d914db6ad1b2908c32c0dc5dcabc420"},"breadcrumb":{"@id":"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/#primaryimage","url":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2014\/01\/MADlib-a-Solution-for-Big-Data-Analytics-from-Pivotal6.gif","contentUrl":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2014\/01\/MADlib-a-Solution-for-Big-Data-Analytics-from-Pivotal6.gif","width":640,"height":360},{"@type":"BreadcrumbList","@id":"https:\/\/www.altoros.com\/blog\/madlib-a-solution-for-big-data-analytics-from-pivotal\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.altoros.com\/blog\/"},{"
@type":"ListItem","position":2,"name":"MADlib: A Solution for Big Data Analytics from Pivotal"}]},{"@type":"WebSite","@id":"https:\/\/www.altoros.com\/blog\/#website","url":"https:\/\/www.altoros.com\/blog\/","name":"Altoros","description":"Insight","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.altoros.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.altoros.com\/blog\/#\/schema\/person\/3d914db6ad1b2908c32c0dc5dcabc420","name":"Alex Khizhniak","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.altoros.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/06\/druzya-edit1-150x150.jpg","contentUrl":"https:\/\/www.altoros.com\/blog\/wp-content\/uploads\/2017\/06\/druzya-edit1-150x150.jpg","caption":"Alex Khizhniak"},"description":"Alex Khizhniak is Director of Technical Content Strategy at Altoros and a cofounder of a local Java User Group. Managing distributed teams since 2004, he has gained experience as a journalist, an editor-in-chief, a technical writer, a technology evangelist, a project manager, and a product owner. Alex is obsessed with AI\/ML, data science, data integration, ETL\/DWH, data quality, databases (SQL\/NoSQL), big data, IoT, and BI. The articles and industry reports he created or helped to publish reached out to 3,000,000+ tech-savvy readers. Some of the pieces were covered on TechRepublic, ebizQ, NetworkWorld, CIO.com, etc. 
Find him on Twitter at @alxkh.","sameAs":["https:\/\/x.com\/https:\/\/twitter.com\/alxkh"],"url":"https:\/\/www.altoros.com\/blog\/author\/alex\/"}]}},"_links":{"self":[{"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/posts\/51769","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/comments?post=51769"}],"version-history":[{"count":28,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/posts\/51769\/revisions"}],"predecessor-version":[{"id":51814,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/posts\/51769\/revisions\/51814"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/media\/53955"}],"wp:attachment":[{"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/media?parent=51769"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/categories?post=51769"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.altoros.com\/blog\/wp-json\/wp\/v2\/tags?post=51769"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}