Custom Data Science Training

Develop a custom program based on your team's needs, goals, and expertise

Build a custom Data Science program specific to your team's needs

  • Design your custom program by choosing appropriate topics to meet the needs of your team
  • This tailored course is intended for Software Engineers, Data Analysts, Data Engineers, and anyone else planning to bootstrap and operate predictive models in research and production.
  • It will help your team understand how everything fits together to run artificial intelligence applications and learn how to start from data understanding, continue with feature engineering, and end up creating a production-grade machine learning model.
On-site at your office
Tailor-made to fit your level and needs
Hands-on
Instructor-led

Why Enroll

  • Understand how to leverage your data to forecast future events and behaviors and to discover deeper business insights
  • Get the practical skills required to spin up a predictive analytics pipeline
  • Learn how to successfully architect, develop and manage Data Science projects

Design Your Training Agenda

Step 1

Select the checkboxes to include the topics you want us to cover. You can save this form at any time and continue later.

1. Machine Learning

Theory

  • An introduction to machine learning tasks and definitions
  • Core principles of building machine learning algorithms
  • A diversity of machine learning algorithms: from linear regression to random forest
  • Core Python packages for machine learning

Practice

  • Linear and logistic regressions
  • k-nearest neighbors and k-means
  • Decision trees and random forest
  • Handling classification, regression, and clustering tasks

*Packages of choice are Pandas/NumPy/scikit-learn
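
To give a feel for the k-nearest neighbors item in the practice list above, here is a minimal sketch in plain Python. It is only an illustration: the training itself names scikit-learn as the package of choice, and the function name `knn_predict` and the toy points are assumptions made for this example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair every training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Count the labels of the k closest points and return the most frequent one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): two well-separated groups of 2D points.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]

near_origin = knn_predict(points, labels, (0.5, 0.5))
near_far_group = knn_predict(points, labels, (5.5, 5.5))
```

A library implementation would typically add faster neighbor search and distance weighting; the sketch keeps only the voting idea.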

Theory

  • LASSO/Ridge (regularization)
  • PCA/SVD (dimensionality reduction)
  • Advanced clustering algorithms, such as DBSCAN, expectation-maximization (different similarity approaches to data)
  • Naive Bayes (The Bayes theorem)
  • Complex ensembling schemes, gradient boosting, stacking (iterative refinement)
  • Algorithmic hyperparameter tuning

Practice

  • LASSO
  • PCA
  • DBSCAN, expectation-maximization, agglomerative clustering, mean shift
  • Naive Bayes
  • Gradient boosting machine, stacking
  • Tree-structured Parzen estimator

*Packages of choice are Pandas/NumPy/scikit-learn/HyperOpt/XGBoost

Theory

  • Feature engineering
  • Dealing with missing data and outliers
  • Dealing with imbalanced classification
  • Advanced validation schemes
  • Handling of model versioning
  • CRISP-DM as a major machine learning development methodology

Practice

  • Feature engineering: polynomial and logarithmic features, combinations of features; periodic feature encoding; target encodings
  • Imbalanced classification: advanced metrics for classification, threshold tuning, over- and undersampling (SMOTE)
  • DBSCAN, expectation-maximization, agglomerative clustering, mean shift
  • Missing data handling: imputation of missing values using k-nearest neighbors or decision trees
  • Advanced validation: cross-validation for time series

*Packages of choice are Pandas/NumPy/scikit-learn
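
Two of the practice topics above, periodic feature encoding and cross-validation for time series, can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the course material: the function names `encode_periodic` and `expanding_window_splits` are hypothetical, and the split scheme shown is one simple expanding-window variant rather than the behavior of any particular library.

```python
import math


def encode_periodic(value, period):
    """Map a cyclic value (e.g. hour of day) to a (sin, cos) point on the unit circle."""
    angle = 2 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


def expanding_window_splits(n_samples, n_splits):
    """Yield (train_indices, test_indices) so that training data always precedes test data."""
    fold_size = n_samples // (n_splits + 1)
    if fold_size == 0:
        raise ValueError("not enough samples for the requested number of splits")
    for k in range(1, n_splits + 1):
        train_end = k * fold_size
        # The last fold absorbs any leftover samples at the end of the series.
        test_end = train_end + fold_size if k < n_splits else n_samples
        yield list(range(train_end)), list(range(train_end, test_end))


# Hour 23 and hour 0 end up close together, unlike in the raw integer encoding.
late_evening = encode_periodic(23, 24)
midnight = encode_periodic(0, 24)
noon = encode_periodic(12, 24)

splits = list(expanding_window_splits(n_samples=10, n_splits=3))
```

The point of the split helper is that no fold ever trains on observations that come after its test window, which ordinary shuffled cross-validation does not guarantee.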

2. Data Science Applications

Algorithmic text processing is a vast area for neural network applications. From text classification to text understanding, there are many successful applications of machine learning. We'll look at basic NLP techniques and at state-of-the-art applications of NLP:

  • Bag-of-Words approach to text related tasks
  • Sequential approach using RNN architectures
  • Embeddings as richer and dense representations of words
  • State of the Art: contextual embeddings and attention mechanism

Computer Vision is a huge field that accounts for many of the successes of deep learning, starting with the neural network win in the ImageNet competition in 2012. We'll dive into some of its most widely used applications:

  • Image-specific data transformations
  • Object detection using YOLO/SSD models
  • Image segmentation using U-Net/LinkNet/R-CNN algorithms
  • Architectures for real-time image processing

Transaction data is a highly prevalent type of dataset, especially in telecom and banking. The purpose of this module is to show an approach for retrieving useful insights from this data.

  • Data preparation of transactional data
  • Time series specific family of algorithms
  • Statistical and Neural Network approaches for this task

Reinforcement Learning generalizes the whole concept of machine learning while making it possible to solve some intricate problems. In this module we'll explain the concept of reinforcement learning and guide you from the basic algorithms that support it to the methods that lay the foundation for the latest state-of-the-art results. We'll go through this set of algorithms:

  • Markov Decision Process
  • Multi-Armed Bandit Algorithms
  • Q-learning
  • Policy algorithms
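
As a small illustration of the Q-learning item in the list above, the sketch below trains a Q-table on a toy corridor: the agent starts in state 0, can move left or right, and receives a reward of 1 on reaching the last state. The environment, the function names `train_q_table` and `greedy_policy`, and all hyperparameter values are assumptions made for this example, not part of the course material.

```python
import random


def train_q_table(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning on a corridor; actions are 0 = left, 1 = right."""
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # step cap so an unlucky episode cannot run forever
            # Epsilon-greedy action choice, with random tie-breaking.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(state - 1, 0) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # The goal is terminal, so it contributes no future value.
            future = 0.0 if next_state == goal else max(q[next_state])
            # Q-learning update: move the estimate toward reward + discounted best next value.
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = next_state
            if state == goal:
                break
    return q


def greedy_policy(q):
    """Best action per non-terminal state according to the learned Q-table."""
    return [0 if left > right else 1 for left, right in q[:-1]]


q_table = train_q_table()
policy = greedy_policy(q_table)
```

After training, the greedy policy moves right in every non-terminal state, and the learned values shrink by the discount factor with each extra step away from the goal.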

3. Deep Learning

We’ll look at surprisingly strong machine learning techniques that have recently become very popular, and will cover the following topics:

  • Structure of neural networks, feedforward neural networks
  • How neural networks are trained
  • Ways to control the neural network training process

Practice

  • Neural networks for supervised learning with Keras

*Packages of choice are Pandas/NumPy/scikit-learn/Keras/TensorFlow

Convolution as the core of the neural network layer for spatial data processing. Topics for the day:

  • Image features and representation learning
  • A convolution layer and a deep convolutional network
  • Supporting layers for convolutional neural networks
  • State-of-the-art architectures for image processing
  • Transfer learning and fine tuning

We will:

  • Build a convolutional neural network from scratch to learn image classification
  • Fine-tune existing networks to perform image-related tasks on different data sets

*Packages of choice are Pandas/NumPy/scikit-learn/Keras/TensorFlow

Neural network architecture for sequential data modelling. Topics for the day:

  • Examples of sequential data and related machine learning tasks
  • The vanilla recurrent neural network architecture and its limitations
  • Advanced recurrent neural network layer architectures

We will implement:

  • Character- and word-level natural language models
  • Fine-tune existing networks to perform image-related tasks on different data sets

*Packages of choice are Pandas/NumPy/scikit-learn/Keras/TensorFlow

Neural network architectures developed to solve non-standard tasks such as representation learning and data generation. Topics are:

  • Autoencoder architecture blueprint
  • Properties of autoencoder representations
  • Generative Adversarial Networks

4. Big Data

In this module you'll learn:

  • Basic knowledge and principles of Hadoop (YARN, HDFS)
  • Main concepts of Spark, such as RDDs, shared variables, persistence, Spark architecture, and Spark under the hood
  • Core principles of Spark SQL
  • Basic principles of how Spark MLlib works

This module covers real-time data processing based on the following engines:

  • Spark Streaming: processing data in real time
  • Spark Structured Streaming: a stream processing engine built on the Spark SQL engine
  • Apache Kafka: a stream processing platform

Additionally, you will learn about:

  • Integrating Kafka with Spark Streaming
  • Integrating Kafka with Spark Structured Streaming

The following topics will be covered in this module:

  • Introduction to Apache Cassandra, its foundations, and its operation
  • The concept of DSE Analytics (Spark + Cassandra) and DSE Solo Analytics
  • Main principles of SparkConnector by DSE

Step 2

Fill in the form to complete your submission:

First Name*
Last Name*
Email*
Phone
Your company name*
Your Message (optional)

Our customers

Here is what our customers say about us
Arpad Rozsas
Biggest value of the course? Combination of conceptual and practical contents. Also, shared personal experiences and views were particularly valuable.
Ramesh Balasubramanian
Great experience. Very knowledgeable and friendly trainers. Biggest value of the course: practical examples/issues the trainers provided based on their experience.
Mark Forest and Gavin Liu
What did I like most at the training? Both high level and the details of machine learning.
Mark Forest and Gavin Liu
I enjoyed the class and learned a lot even though there was so much content jammed into a very small time. The most enjoyable was deep neural nets and seeing some of the largest examples. The most valuable thing professionally will probably be the classification clustering that I have learned K-NN probably.

Contact us

Contact me if you have any questions or want to request a quote

Alexandra Mironova

Training Coordinator

Headquarters

830 Stewart Dr., Suite 119, Sunnyvale, CA 94085