Data Science Course Syllabus
Introduction to Data Science
- Introduction to Data Analytics
- Introduction to Business Analytics
- Understanding Business Applications
- Data Types and Data Models
- Types of Business Analytics
- Evolution of Analytics
- Data Science Components
- Data Scientist Skillset
- Univariate Data Analysis
- Introduction to Sampling
Basic Operations in R Programming
- Introduction to R programming
- Types of Objects in R
- Naming standards in R
- Creating Objects in R
- Data Structures in R
- Matrix, Data Frame, String, Vectors
- Understanding Vectors & Data input in R
- Lists, Data Elements
- Creating Data Files using R
Data Handling in R Programming
- Basic Operations in R – Expressions, Constant Values, Arithmetic, Function Calls, Symbols
- Sub-setting Data
- Selecting (Keeping) Variables
- Excluding (Dropping) Variables
- Selecting Observations and Selection using Subset Function
- Merging Data
- Sorting Data
- Adding Rows
- Visualization using R
- Data Type Conversion
- Built-In Numeric Functions
- Built-In Character Functions
- User Built Functions
- Control Structures
- Loop Functions
Introduction to Statistics
- Basic Statistics
- Measure of central tendency
- Types of Distributions
- ANOVA
- F-Test
- Central Limit Theorem & applications
- Types of variables
- Relationships between variables
- Central Tendency
- Measures of Central Tendency
- Kurtosis
- Skewness
- Arithmetic Mean / Average
- Merits & Demerits of Arithmetic Mean
- Mode, Merits & Demerits of Mode
- Median, Merits & Demerits of Median
- Range
- Concept of Quantiles, Quartiles, and Percentiles
- Standard Deviation
- Variance
- Calculate Variance
- Covariance
- Correlation
Introduction to Statistics – 2
- Hypothesis Testing
- Multiple Linear Regression
- Logistic Regression
- Market Basket Analysis
- Clustering (Hierarchical Clustering & K-means Clustering)
- Classification (Decision Trees)
- Time Series Analysis (Simple Moving Average, Exponential smoothing, ARIMA+)
Introduction to Probability
- Standard Normal Distribution
- Normal Distribution
- Geometric Distribution
- Poisson Distribution
- Binomial Distribution
- Parameters vs. Statistics
- Probability Mass Function
- Random Variable
- Conditional Probability and Independence
- Unions and Intersections
- Finding Probabilities from a Dataset
- Probability Terminology
- Probability Distributions
Data Visualization Techniques
- Bubble Chart
- Sparklines
- Waterfall chart
- Box Plot
- Line Charts
- Frequency Chart
- Bimodal & Multimodal Histograms
- Histograms
- Scatter Plot
- Pie Chart
- Bar Graph
- Line Graph
Introduction to Machine Learning
- Overview & Terminologies
- What is Machine Learning?
- Why Learn?
- When is Learning required?
- Data Mining
- Application Areas and Roles
- Types of Machine Learning
- Supervised Learning
- Unsupervised Learning
- Reinforcement Learning
Machine Learning Concepts & Terminologies
- Steps in developing a Machine Learning application
- Key tasks of Machine Learning
- Modelling Terminologies
- Learning a Class from Examples
- Probability and Inference
- PAC (Probably Approximately Correct) Learning
- Noise
- Noise and Model Complexity
- Triple Trade-Off
- Association Rules
- Association Measures
Regression Techniques
- Concept of Regression
- Best Fitting line
- Simple Linear Regression
- Building regression models using Excel
- Coefficient of determination (R-Squared)
- Multiple Linear Regression
- Assumptions of Linear Regression
- Variable transformation
- Reading coefficients in MLR
- Multicollinearity
- VIF
- Methods of building Linear regression model in R
- Model validation techniques
- Cook's Distance
- Q-Q Plot
- Durbin-Watson Test
- Kolmogorov-Smirnov Test
- Homoskedasticity of error terms
- Logistic Regression
- Applications of logistic regression
- Concept of odds
- Concept of Odds Ratio
- Derivation of logistic regression equation
- Interpretation of logistic regression output
- Model building for logistic regression
- Model validations
- Confusion Matrix
- Concept of ROC Curve and AUC
- KS Test
Market Basket Analysis
- Applications of Market Basket Analysis
- What are Association Rules
- Overview of Apriori algorithm
- Key terminologies in MBA
- Support
- Confidence
- Lift
- Model building for MBA
- Transforming sales data to suit MBA
- MBA Rule selection
- Ensemble modelling applications using MBA
Time Series Analysis (Forecasting)
- Model building using ARIMA, ARIMAX, SARIMAX
- Data De-trending & data differencing
- KPSS Test
- Dickey-Fuller Test
- Concept of stationarity
- Model building using exponential smoothing
- Model building using simple moving average
- Time series analysis techniques
- Components of time series
- Prerequisites for time series analysis
- Concept of Time series data
- Applications of Forecasting
Decision Trees using R
- Understanding the Concept
- Internal decision nodes
- Terminal leaves
- Tree induction: Construction of the tree
- Classification Trees
- Entropy
- Selecting Attribute
- Information Gain
- Partially learned tree
- Overfitting
- Causes of Overfitting
- Overfitting Prevention (Pruning) Methods
- Reduced Error Pruning
- Decision trees – Advantages & Drawbacks
- Ensemble Models
K Means Clustering
- Parametric Methods Recap
- Clustering
- Direct Clustering Method
- Mixture densities
- Classes vs. Clusters
- Hierarchical Clustering
- Dendrogram interpretation
- Non-Hierarchical Clustering
- K-Means
- Distance Metrics
- K-Means Algorithm
- K-Means Objective
- Color Quantization
- Vector Quantization
Tableau Analytics
- Tableau Introduction
- Data connection to Tableau
- Calculated fields, hierarchy, parameters, sets, groups in Tableau
- Various Visualization Techniques in Tableau
- Map based visualization using Tableau
- Reference Lines
- Adding Totals, Subtotals, Captions
- Advanced Formatting Options
- Using Combined Field
- Show Filter & Use various filter options
- Data Sorting
- Create Combined Field
- Table Calculations
- Creating Tableau Dashboard
- Action Filters
- Creating Story using Tableau
Analytics using Tableau
- Clustering using Tableau
- Time series analysis using Tableau
- Simple Linear Regression using Tableau
R integration in Tableau
- Integrating R code with Tableau
- Creating statistical model with dynamic inputs
- Visualizing R output in Tableau
- Case Study 1: Real-time project with Twitter Data Analytics
- Case Study 2: Real-time project with Google Finance
- Case Study 3: Real-time project with IMDB Website
To enroll in Data Science training in Chennai, please reach us at +91 86818 84318.