Machine Learning Expansion: Azure ML Studio and BigML

This blog post compares the two solutions across seven parameters, including distribution model, supported languages, data formats and sources, and pricing.

Today, machine learning (ML) differs tremendously from what it was a few decades ago. ML algorithms are now used for fraud detection, recommendation systems, web search, spam filtering, speech and image recognition, equipment failure prediction, and other problems.

In this article, we explore two popular solutions that represent the variety of available options: Microsoft Azure ML Studio and BigML.

 

Microsoft Azure Machine Learning

Microsoft Azure ML Studio is a cloud-based machine learning service for developing, testing, and deploying predictive analytics models.

The service’s web interface enables users to create experiments by dragging and dropping data sets and Studio modules on the working canvas. When a predictive experiment is ready, you can publish it as an Azure web service available for consumption from web, mobile, and custom desktop applications, as well as from Excel. Web services can also be published on Azure Marketplace.
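As a rough illustration of consuming a published experiment from a custom application, the sketch below builds an HTTP request for a request-response web service. The JSON layout (an `Inputs` object holding `ColumnNames` and `Values`), the `input1` input name, and the bearer-token header are assumptions based on the sample code Studio typically shows for a published service; the URL, API key, and column names are placeholders. Check the API help page of your own service for the exact schema.

```python
import json
import urllib.request


def build_scoring_request(url, api_key, column_names, rows):
    """Build (but do not send) a POST request for a published web service.

    The body layout is an assumption; verify it against the API help page
    that Studio generates for your own service.
    """
    body = {
        "Inputs": {
            # "input1" is assumed to be the name of the web service input.
            "input1": {"ColumnNames": column_names, "Values": rows}
        },
        "GlobalParameters": {},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


# Placeholder endpoint, key, and feature columns (all hypothetical).
request = build_scoring_request(
    "https://example.invalid/workspaces/WORKSPACE_ID/services/SERVICE_ID/execute?api-version=2.0",
    "YOUR_API_KEY",
    ["age", "income"],
    [["35", "52000"]],
)

# Sending is left commented out because the endpoint above is not real:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Building the request separately from sending it makes the payload easy to inspect before any call is billed as a production API transaction.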

Azure ML Studio:

  • Does not require installation and is accessed through a web browser
  • Provides a range of built-in algorithms that solve different machine learning tasks, including classification, regression, clustering, and text and image recognition
  • Has modules for data I/O, preparation, and visualization, as well as modules that help to train and evaluate model performance and accuracy
  • Allows for using custom R and Python scripts
  • Supports versioning and collaboration
  • Offers integration with other Azure services, such as Azure Blob Storage, Azure SQL, and Azure HDInsight

Sample experiments for retail forecasting, credit risk detection, speech and face recognition, and other popular problems are available in the Cortana Analytics Gallery.

 

BigML

BigML is a cloud-based machine learning service for analyzing data and building predictive solutions. It provides a web interface that guides users through the process of uploading data, creating models, and making predictions based on them.

To manage the service resources programmatically, BigML offers a RESTful API along with a set of API bindings for different programming languages. In addition, users can interact with BigML through a command-line tool, BigMLer.
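To give a feel for the programmatic route, here is a minimal sketch that prepares the REST calls for a typical chain: data set from a source, model from a data set, prediction from a model. The base URL, the `username=...;api_key=...` query string, and the payload keys are assumptions drawn from BigML's public API conventions and may differ between API versions; all resource IDs and the input field name are hypothetical. In a real workflow, each ID comes from the response to the previous call.

```python
import json
import urllib.request

BIGML_URL = "https://bigml.io"  # assumed base URL of the REST API


def build_create_request(resource, payload, username, api_key):
    """Build (but do not send) a POST request that creates a BigML resource."""
    auth = "username={};api_key={}".format(username, api_key)
    return urllib.request.Request(
        "{}/{}?{}".format(BIGML_URL, resource, auth),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical IDs; in practice each one is read from the previous response.
steps = [
    ("dataset", {"source": "source/SOURCE_ID"}),
    ("model", {"dataset": "dataset/DATASET_ID"}),
    ("prediction", {"model": "model/MODEL_ID", "input_data": {"age": 35}}),
]

requests_to_send = [
    build_create_request(resource, payload, "USERNAME", "API_KEY")
    for resource, payload in steps
]

for req in requests_to_send:
    print(req.get_method(), req.full_url)
```

The language bindings and BigMLer wrap the same kind of calls, so they are usually the more convenient choice once credentials are set up.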

BigML:

  • Does not require installation and is accessed through a web browser
  • Solves regression and classification tasks
  • Offers solutions for clustering analysis, anomaly detection, and association discovery
  • Provides high-quality data visualization tools
  • Saves history of performed tasks

BigML has a Gallery with sample models for sales predictions, loan risks, currency rates, and more.

 

Comparing Azure ML Studio and BigML

The following overview compares Azure ML Studio and BigML across seven parameters.

Distribution model

  • Azure ML: A cloud-based service
  • BigML: A cloud-based service

Client languages

  • Azure ML: R and Python
  • BigML: Open-source API bindings for Python, Node.js, Java, Clojure, C#, R, Ruby, and PHP

UI

  • Azure ML: Azure ML Studio, a web-based interface
  • BigML: BigML, a web-based interface

Data formats and sources

Azure ML, available options for data input:

  • Importing CSV, TSV, ARFF, SVMLight, .RData, and TXT files (Note: The files can be zipped (.zip).)
  • Using data from external sources:
    • Azure Blob Storage (CSV, TSV, ARFF, and Excel)
    • Azure SQL DB
    • Azure Table
    • Hadoop via HiveQL
    • OData feed
    • Web URL via HTTP (CSV, TSV, ARFF, and SVMLight)
  • Creating inline data sources in the CSV, TSV, ARFF, and SVMLight formats

BigML, available options for data input:

  • Uploading CSV and ARFF files (Note: The files can be gzipped (.gz) or compressed (.bz2), while a .zip archive can contain only one file.)
  • Importing data sources from:
    • Azure Marketplace
    • Dropbox
    • Google Cloud Storage
    • Google Drive
  • Creating a data source from a URL
  • Creating inline data sources in the CSV format

Samples

  • Azure ML: A number of sample data sets and experiments are included by default. Also browse Cortana Analytics Gallery for samples.
  • BigML: The service has a Gallery with sample models.

Documentation

  • Azure ML: Azure ML Studio Documentation
  • BigML: BigML documentation

Pricing as of Q1 2016

  • Azure ML: Standard tier: $9.99 per seat (monthly), $1 per Studio experiment hour, and $2 per production API compute hour with $0.50 per 1,000 production API transactions
  • BigML: Subscription plans: $30–$300 per month, $75–$750 per quarter, and $240–$2,400 per year
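To see how the Azure ML Standard tier prices quoted above add up, here is a small worked estimate. The usage figures (seats, hours, transactions) are hypothetical, and the sketch assumes that transactions are billed pro rata rather than in whole blocks of 1,000, which the source does not specify.

```python
from decimal import Decimal

# Azure ML Standard tier prices as of Q1 2016, taken from the comparison above.
SEAT_PER_MONTH = Decimal("9.99")
EXPERIMENT_HOUR = Decimal("1.00")
API_COMPUTE_HOUR = Decimal("2.00")
PER_1000_TRANSACTIONS = Decimal("0.50")


def estimate_monthly_cost(seats, experiment_hours, api_compute_hours, api_transactions):
    """Estimate one month of Standard tier cost in USD.

    Arguments are expected to be integers. Transactions are assumed to be
    billed pro rata (an assumption, not a documented billing rule).
    """
    return (
        seats * SEAT_PER_MONTH
        + experiment_hours * EXPERIMENT_HOUR
        + api_compute_hours * API_COMPUTE_HOUR
        + Decimal(api_transactions) / 1000 * PER_1000_TRANSACTIONS
    )


# Hypothetical usage: 2 seats, 10 experiment hours, 5 API compute hours,
# and 20,000 production API transactions in a month.
cost = estimate_monthly_cost(
    seats=2, experiment_hours=10, api_compute_hours=5, api_transactions=20000
)
print(cost)  # 19.98 + 10.00 + 10.00 + 10.00 = 49.98
```

With this usage profile, the estimate lands between the lowest and highest BigML monthly subscription prices, so the cheaper option depends on how heavily the service is used.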

 

Conclusion

The emergence of cloud-based services, such as Azure ML and BigML, democratizes machine learning. Although both tools solve machine learning problems, Azure ML and BigML differ in the level of complexity and their use cases.

BigML, for instance, states that it aims to make machine learning accessible to everyone. Indeed, people with little programming knowledge can build classification and regression models using the product UI. Initially, the tasks the web service covered were limited to decision trees. Later, however, BigML introduced solutions for clustering analysis, anomaly detection, and association discovery.

Azure ML Studio, in turn, offers a greater number of algorithms, as well as the ability to extend Studio’s functionality with custom R and Python scripts. In general, no programming is required to create predictive models, but R and Python skills help you get the most out of the web service. Azure ML Studio is itself part of Cortana Analytics Suite, Microsoft’s ecosystem for big data and advanced analytics.

Note: Keep in mind that both services are growing rapidly. By the time you read this, new functionality might have been introduced.

 

About the authors

Siarhei Charnichkou works in the field of web development. For more than 5 years, he has been delivering enterprise-grade solutions that address various business needs, including customer relationship management software, as well as content management and reporting systems. Siarhei is especially experienced in .NET, database design and implementation, and JavaScript-based front-end development.

Victoria Fedzkovich strives for effective technical communication at Altoros. As a professional with 5+ years of experience in technical and scientific writing, she creates content for user guides, manuals, and technical overviews. Victoria is currently focused on the intersection of the Cloud Foundry PaaS and big data.


This blog post was written by Siarhei Charnichkou and Victoria Fedzkovich,
edited by Alex Khizhniak and Sophia Turol.