Data Science Course Syllabus

Introduction to Data Science

  • Introduction to Data Analytics
  • Introduction to Business Analytics
  • Understanding Business Applications
  • Data Types and Data Models
  • Types of Business Analytics
  • Evolution of Analytics
  • Data Science Components
  • Data Scientist Skillset
  • Univariate Data Analysis
  • Introduction to Sampling

Basic Operations in R Programming

  • Introduction to R programming
  • Types of Objects in R
  • Naming standards in R
  • Creating Objects in R
  • Data Structures in R
  • Matrix, Data Frame, String, Vectors
  • Understanding Vectors & Data input in R
  • Lists, Data Elements
  • Creating Data Files using R

Data Handling in R Programming

  • Basic Operations in R – Expressions, Constant Values, Arithmetic, Function Calls, Symbols
  • Sub-setting Data
  • Selecting (Keeping) Variables
  • Excluding (Dropping) Variables
  • Selecting Observations and Selection using Subset Function
  • Merging Data
  • Sorting Data
  • Adding Rows
  • Visualization using R
  • Data Type Conversion
  • Built-In Numeric Functions
  • Built-In Character Functions
  • User Built Functions
  • Control Structures
  • Loop Functions

Introduction to Statistics

  • Basic Statistics
  • Measures of central tendency
  • Types of Distributions
  • ANOVA
  • F-Test
  • Central Limit Theorem & applications
  • Types of variables
  • Relationships between variables
  • Central Tendency
  • Measures of Central Tendency
  • Kurtosis
  • Skewness
  • Arithmetic Mean / Average
  • Merits & Demerits of Arithmetic Mean
  • Mode, Merits & Demerits of Mode
  • Median, Merits & Demerits of Median
  • Range
  • Concept of Quantiles, Quartiles, Percentiles
  • Standard Deviation
  • Variance
  • Calculate Variance
  • Covariance
  • Correlation

Introduction to Statistics – 2

  • Hypothesis Testing
  • Multiple Linear Regression
  • Logistic Regression
  • Market Basket Analysis
  • Clustering (Hierarchical Clustering & K-means Clustering)
  • Classification (Decision Trees)
  • Time Series Analysis (Simple Moving Average, Exponential smoothing, ARIMA+)

Introduction to Probability

  • Standard Normal Distribution
  • Normal Distribution
  • Geometric Distribution
  • Poisson Distribution
  • Binomial Distribution
  • Parameters vs. Statistics
  • Probability Mass Function
  • Random Variable
  • Conditional Probability and Independence
  • Unions and Intersections
  • Finding Probability of dataset
  • Probability Terminology
  • Probability Distributions

Data Visualization Techniques

  • Bubble Chart
  • Sparklines
  • Waterfall chart
  • Box Plot
  • Line Charts
  • Frequency Chart
  • Bimodal & Multimodal Histograms
  • Histograms
  • Scatter Plot
  • Pie Chart
  • Bar Graph
  • Line Graph

Introduction to Machine Learning

  • Overview & Terminologies
  • What is Machine Learning?
  • Why Learn?
  • When is Learning required?
  • Data Mining
  • Application Areas and Roles
  • Types of Machine Learning
  • Supervised Learning
  • Unsupervised Learning
  • Reinforcement learning

Machine Learning Concepts & Terminologies

  • Steps in developing a Machine Learning application
  • Key tasks of Machine Learning
  • Modelling Terminologies
  • Learning a Class from Examples
  • Probability and Inference
  • PAC (Probably Approximately Correct) Learning
  • Noise
  • Noise and Model Complexity
  • Triple Trade-Off
  • Association Rules
  • Association Measures

Regression Techniques

  • Concept of Regression
  • Best Fitting line
  • Simple Linear Regression
  • Building regression models using Excel
  • Coefficient of determination (R-Squared)
  • Multiple Linear Regression
  • Assumptions of Linear Regression
  • Variable transformation
  • Reading coefficients in MLR
  • Multicollinearity
  • VIF
  • Methods of building Linear regression model in R
  • Model validation techniques
  • Cook's Distance
  • Q-Q Plot
  • Durbin-Watson Test
  • Kolmogorov-Smirnov Test
  • Homoskedasticity of error terms
  • Logistic Regression
  • Applications of logistic regression
  • Concept of odds
  • Concept of Odds Ratio
  • Derivation of logistic regression equation
  • Interpretation of logistic regression output
  • Model building for logistic regression
  • Model validations
  • Confusion Matrix
  • Concept of ROC Curve and AUC
  • KS Test

Market Basket Analysis

  • Applications of Market Basket Analysis
  • What are Association Rules
  • Overview of Apriori algorithm
  • Key terminologies in MBA
  • Support
  • Confidence
  • Lift
  • Model building for MBA
  • Transforming sales data to suit MBA
  • MBA Rule selection
  • Ensemble modelling applications using MBA

Time Series Analysis (Forecasting)

  • Model building using ARIMA, ARIMAX, SARIMAX
  • Data De-trending & data differencing
  • KPSS Test
  • Dickey-Fuller Test
  • Concept of stationarity
  • Model building using exponential smoothing
  • Model building using simple moving average
  • Time series analysis techniques
  • Components of time series
  • Prerequisites for time series analysis
  • Concept of Time series data
  • Applications of Forecasting

Decision Trees using R

  • Understanding the Concept
  • Internal decision nodes
  • Terminal leaves
  • Tree induction: Construction of the tree
  • Classification Trees
  • Entropy
  • Selecting Attribute
  • Information Gain
  • Partially learned tree
  • Overfitting
  • Causes of Overfitting
  • Overfitting Prevention (Pruning) Methods
  • Reduced Error Pruning
  • Decision trees – Advantages & Drawbacks
  • Ensemble Models

K Means Clustering

  • Parametric Methods Recap
  • Clustering
  • Direct Clustering Method
  • Mixture densities
  • Classes vs. Clusters
  • Hierarchical Clustering
  • Dendrogram interpretation
  • Non-Hierarchical Clustering
  • K-Means
  • Distance Metrics
  • K-Means Algorithm
  • K-Means Objective
  • Color Quantization
  • Vector Quantization

Tableau Analytics

  • Tableau Introduction
  • Data connection to Tableau
  • Calculated fields, hierarchy, parameters, sets, groups in Tableau
  • Various Visualization Techniques in Tableau
  • Map based visualization using Tableau
  • Reference Lines
  • Adding Totals, Subtotals, Captions
  • Advanced Formatting Options
  • Using Combined Field
  • Show Filter & Use various filter options
  • Data Sorting
  • Create Combined Field
  • Table Calculations
  • Creating Tableau Dashboard
  • Action Filters
  • Creating Story using Tableau

Analytics using Tableau

  • Clustering using Tableau
  • Time series analysis using Tableau
  • Simple Linear Regression using Tableau

R integration in Tableau

  • Integrating R code with Tableau
  • Creating statistical model with dynamic inputs
  • Visualizing R output in Tableau
  • Case Study 1: Real-time project with Twitter Data Analytics
  • Case Study 2: Real-time project with Google Finance
  • Case Study 3: Real-time project with IMDB Website

To join Data Science training in Chennai, please reach us at +91 86818 84318.