5918 Ratings

21059 Learners

**Data Science and Data Analytics – Python / R / SAS Training in Bangalore**

**Learn Data Science, Deep Learning, and Machine Learning using Python / R / SAS with Live Machine Learning and Deep Learning Projects**

Duration: 3 months, weekends only (3 hours each on Saturdays and Sundays)

Real-time projects, assignments, and scenarios are part of this course.

Data sets, installation support, interview preparation, and the option to repeat sessions for up to 6 months are all included in this course.

- Highly practical, hands-on training
- 25000+ man-hours of real-time projects and scenarios
- Corporate trainers with 10 to 20+ years of real-time experience
- Professionals trained by highly experienced professionals
- 100% quality assurance in training
- 10000+ placement records and tie-ups with 180+ MNCs and consultancies

There are no formal prerequisites for this course; however, previous operating system administration experience will be very beneficial.

**Introduction to Data Science with Python**

- What is analytics & Data Science?
- Common Terms in Analytics
- Analytics vs. Data warehousing, OLAP, MIS Reporting
- Relevance in industry and need of the hour
- Types of problems and business objectives in various industries
- How leading companies are harnessing the power of analytics?
- Critical success drivers
- Overview of analytics tools & their popularity
- Analytics Methodology & problem solving framework
- List of steps in Analytics projects
- Identify the most appropriate solution design for the given problem statement
- Project plan for Analytics project & key milestones based on effort estimates
- Build Resource plan for analytics project

- Why Python for data science?
- Overview of Python – Starting with Python
- Introduction to installation of Python
- Introduction to Python Editors & IDEs (Canopy, PyCharm, Jupyter, Rodeo, IPython, etc.)
- Understand Jupyter Notebook & Customize Settings
- Concept of Packages/Libraries – Important packages (NumPy, SciPy, scikit-learn, pandas, Matplotlib, etc.)
- Installing & loading Packages & Name Spaces
- Data Types & Data objects/structures (strings, Tuples, Lists, Dictionaries)
- List and Dictionary Comprehensions
- Variable & Value Labels – Date & Time Values
- Basic Operations – Mathematical – string – date
- Reading and writing data
- Simple plotting
- Control flow & conditional statements
- Debugging & Code profiling
- How to create classes and modules, and how to call them
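
As a small illustration of the "List and Dictionary Comprehensions" topic above, here is a minimal sketch. The `scores` data and the 60-point pass mark are made up for the example and are not part of the course material.

```python
# Hypothetical sample data: learner names mapped to quiz scores.
scores = {"asha": 72, "ravi": 58, "meena": 91}

# List comprehension: names of learners who scored at least 60.
passed = [name for name, score in scores.items() if score >= 60]

# Dictionary comprehension: rescale each score to a fraction between 0 and 1.
fractions = {name: score / 100 for name, score in scores.items()}

print(passed)
print(fractions)
```

Both forms build a new collection in a single expression; an equivalent `for` loop with `append` or key assignment would give the same result.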

Scientific Distributions Used In Python For Data Science

NumPy, pandas, scikit-learn, statsmodels, NLTK

- Importing Data from various sources (CSV, TXT, Excel, Access, etc.)
- Database Input (Connecting to a database)
- Viewing Data objects – subsetting Data, methods
- Exporting Data to various formats
- Important Python modules: pandas, Beautiful Soup

- Cleansing Data with Python
- Data Manipulation steps (Sorting, filtering, duplicates, merging, appending, subsetting, derived variables, sampling, Data type conversions, renaming, formatting, etc.)
- Data manipulation tools (Operators, Functions, Packages, control structures, Loops, arrays, etc.)
- Python Built-in Functions (Text, numeric, date, utility functions)
- Python User Defined Functions
- Stripping out extraneous information
- Normalizing data
- Formatting data
- Important Python modules for data manipulation (pandas, NumPy, re, math, string, datetime, etc.)

- Introduction to exploratory data analysis
- Descriptive statistics, Frequency Tables and summarization
- Univariate Analysis (Distribution of data & Graphical Analysis)
- Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
- Creating Graphs (bar, pie, line chart, histogram, boxplot, scatter, density, etc.)
- Important Packages for Exploratory Analysis (NumPy arrays, Matplotlib, seaborn, pandas, scipy.stats, etc.)

- Basic Statistics – Measures of Central Tendencies and Variance
- Building blocks – Probability Distributions – Normal distribution – Central Limit Theorem
- Inferential Statistics – Sampling – Concept of Hypothesis Testing
- Statistical Methods – Z/t-tests (one sample, independent, paired), Analysis of variance, Correlations and Chi-square
- Important modules for statistical methods: NumPy, SciPy, Pandas
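
The measures of central tendency and variance listed above, plus a one-sample t statistic, can be sketched with Python's standard `statistics` module. The `orders` sample and the hypothesized mean `mu0` are invented for illustration; in the course these steps would more likely use NumPy, SciPy, or pandas.

```python
import math
import statistics

# Hypothetical sample: daily order counts for one week.
orders = [12, 15, 11, 19, 15, 22, 18]

mean_orders = statistics.mean(orders)      # arithmetic mean
median_orders = statistics.median(orders)  # middle value after sorting
mode_orders = statistics.mode(orders)      # most frequent value
sample_var = statistics.variance(orders)   # sample variance (n - 1 denominator)
sample_sd = statistics.stdev(orders)       # sample standard deviation

# One-sample t statistic against an assumed population mean mu0.
mu0 = 14
t_stat = (mean_orders - mu0) / (sample_sd / math.sqrt(len(orders)))

print(mean_orders, median_orders, mode_orders)
print(round(sample_var, 3), round(sample_sd, 3), round(t_stat, 3))
```

The t statistic on its own is not a test result; it still has to be compared against a t distribution with `len(orders) - 1` degrees of freedom to obtain a p-value.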

- Concept of model in analytics and how it is used?
- Common terminology used in analytics & Modelling process
- Popular modelling algorithms
- Types of Business problems – Mapping of Techniques
- Different Phases of Predictive Modelling

- Need for structured exploratory data
- EDA framework for exploring the data and identifying any problems with the data (Data Audit Report)
- Identify missing data
- Identify outliers
- Visualize the data trends and patterns

- Need of Data preparation
- Consolidation/Aggregation – Outlier treatment – Flat Liners – Missing values- Dummy creation – Variable Reduction
- Variable Reduction Techniques – Factor & PCA Analysis

- Introduction to Segmentation
- Types of Segmentation (Subjective Vs Objective, Heuristic Vs. Statistical)
- Heuristic Segmentation Techniques (Value Based, RFM Segmentation and Life Stage Segmentation)
- Behavioural Segmentation Techniques (K-Means Cluster Analysis)
- Cluster evaluation and profiling – Identify cluster characteristics
- Interpretation of results – Implementation on new data

- Introduction – Applications
- Assumptions of Linear Regression
- Building Linear Regression Model
- Understanding standard metrics (Variable significance, R-square/Adjusted R-square, Global hypothesis, etc.)
- Assess the overall effectiveness of the model
- Validation of Models (Re running Vs. Scoring)
- Standard Business Outputs (Decile Analysis, Error distribution (histogram), Model equation, drivers etc.)
- Interpretation of Results – Business Validation – Implementation on new data

- Introduction – Applications
- Linear Regression Vs. Logistic Regression Vs. Generalized Linear Models
- Building Logistic Regression Model (Binary Logistic Model)
- Understanding standard model metrics (Concordance, Variable significance, Hosmer-Lemeshow Test, Gini, KS, Misclassification, ROC Curve, etc.)
- Validation of Logistic Regression Models (Re running Vs. Scoring)
- Standard Business Outputs (Decile Analysis, ROC Curve, Probability Cut-offs, Lift charts, Model equation, Drivers or variable importance, etc)
- Interpretation of Results – Business Validation – Implementation on new data

- Introduction – Applications
- Time Series Components (Trend, Seasonality, Cyclicity and Level) and Decomposition
- Classification of Techniques (Pattern-based vs. Patternless)
- Basic Techniques – Averages, Smoothing, etc.
- Advanced Techniques – AR Models, ARIMA, etc
- Understanding Forecasting Accuracy – MAPE, MAD, MSE, etc
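
The forecasting accuracy measures named above can be written out directly. This is a minimal sketch using only the standard library; the function name `forecast_errors` and the sample numbers are hypothetical, and MAD is taken here to mean the mean absolute deviation of the forecast errors.

```python
def forecast_errors(actual, forecast):
    """Return (MAPE in percent, MAD, MSE) for two equal-length sequences."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equal in length")
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    # MAPE divides by each actual value, so it is undefined if any actual is 0.
    mape = 100 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    mad = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    return mape, mad, mse


# Hypothetical actual demand and the corresponding forecasts.
actual = [100, 200, 400]
forecast = [110, 190, 400]
mape, mad, mse = forecast_errors(actual, forecast)
print(mape, mad, mse)
```

MAPE is scale-free and easy to explain to business users, while MSE penalizes large misses more heavily, so the measures can rank the same set of models differently.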

- Introduction to Machine Learning & Predictive Modelling
- Types of Business problems – Mapping of Techniques – Regression vs. classification vs. segmentation vs. Forecasting
- Major Classes of Learning Algorithms – Supervised vs Unsupervised Learning
- Different Phases of Predictive Modelling (Data Pre-processing, Sampling, Model Building, Validation)
- Overfitting (Bias-Variance Trade off) & Performance Metrics
- Feature engineering & dimension reduction
- Concept of optimization & cost function
- Overview of gradient descent algorithm
- Overview of Cross validation (Bootstrapping, K-Fold validation, etc.)
- Model performance metrics (R-square, Adjusted R-square, RMSE, MAPE, AUC, ROC curve, recall, precision, sensitivity, specificity, confusion matrix)
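
To make the "cost function" and "gradient descent" topics above concrete, here is a minimal sketch that fits a straight line by gradient descent on the mean squared error. The function name `fit_line`, the learning rate, the step count, and the toy data are all assumptions chosen for the example.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current line on every training point.
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum(residual ** 2) with respect to w and b.
        grad_w = (2 / n) * sum(r * x for r, x in zip(residuals, xs))
        grad_b = (2 / n) * sum(residuals)
        # Step against the gradient to reduce the cost.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1, so the fit should recover w near 2 and b near 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = fit_line(xs, ys)
print(round(w, 4), round(b, 4))
```

The learning rate matters: a value that is too large makes the updates diverge, while a value that is too small needs many more steps to converge.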

- What is segmentation & Role of ML in Segmentation?
- Concept of Distance and related math background
- K-Means Clustering
- Expectation Maximization
- Hierarchical Clustering
- Spectral Clustering and Density-Based Clustering (DBSCAN)
- Principal Component Analysis (PCA)

- Decision Trees – Introduction – Applications
- Types of Decision Tree Algorithms
- Construction of Decision Trees through Simplified Examples; Choosing the “Best” attribute at each Non-Leaf node; Entropy; Information Gain, Gini Index, Chi Square, Regression Trees
- Generalizing Decision Trees; Information Content and Gain Ratio; Dealing with Numerical Variables; other Measures of Randomness
- Pruning a Decision Tree; Cost as a consideration; Unwrapping Trees as Rules
- Decision Trees – Validation
- Overfitting – Best Practices to avoid
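
Entropy and information gain, which the list above names as criteria for choosing the "best" attribute at a node, can be sketched as follows. The helper names and the tiny label set are hypothetical, and the sketch covers only categorical class labels.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(labels, groups):
    """Entropy reduction obtained by splitting `labels` into `groups`."""
    total = len(labels)
    weighted = sum(len(group) / total * entropy(group) for group in groups)
    return entropy(labels) - weighted


# Hypothetical node with two classes, and a candidate split that separates them.
labels = ["yes", "yes", "no", "no"]
split = [["yes", "yes"], ["no", "no"]]
print(entropy(labels), information_gain(labels, split))
```

A tree builder would compute this gain for every candidate attribute and split on the one with the highest value; the Gini index can be substituted for entropy in the same structure.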

- Concept of Ensembling
- Manual Ensembling Vs. Automated Ensembling
- Methods of Ensembling (Stacking, Mixture of Experts)
- Bagging (Logic, Practical Applications)
- Random forest (Logic, Practical Applications)
- Boosting (Logic, Practical Applications)
- Ada Boost
- Gradient Boosting Machines (GBM)
- XGBoost

- Motivation for Neural Networks and Its Applications
- Perceptron and Single Layer Neural Network, and Hand Calculations
- Learning in a Multi-Layered Neural Net: Back Propagation and Conjugate Gradient Techniques
- Neural Networks for Regression
- Neural Networks for Classification
- Interpretation of Outputs and Fine-tuning the models with hyperparameters
- Validating ANN models

**Introduction to Data Science with R**

- What is analytics & Data Science?
- Common Terms in Analytics
- Analytics vs. Data warehousing, OLAP, MIS Reporting
- Relevance in industry and need of the hour
- Types of problems and business objectives in various industries
- How leading companies are harnessing the power of analytics?
- Critical success drivers
- Overview of analytics tools & their popularity
- Analytics Methodology & problem solving framework
- List of steps in Analytics projects
- Identify the most appropriate solution design for the given problem statement
- Project plan for Analytics project & key milestones based on effort estimates
- Build Resource plan for analytics project
- Why R for data science?

- Introduction R/R-Studio – GUI
- Concept of Packages – Useful Packages (Base & Other packages)
- Data Structure & Data Types (Vectors, Matrices, factors, Data frames, and Lists)
- Importing Data from various sources (txt, dlm, Excel, sas7bdat, db, etc.)
- Database Input (Connecting to a database)
- Exporting Data to various formats
- Viewing Data (Viewing partial data and full data)
- Variable & Value Labels – Date Values

- Data Manipulation steps
- Creating New Variables (calculations & Binning)
- Dummy variable creation
- Applying transformations
- Handling duplicates
- Handling missing values
- Sorting and Filtering
- Subsetting (Rows/Columns)
- Appending (Row appending/column appending)
- Merging/Joining (Left, right, inner, full, outer etc)
- Data type conversions
- Renaming
- Formatting
- Reshaping data
- Sampling
- Data manipulation tools
- Operators
- Functions
- Packages
- Control Structures (if, if else)
- Loops (Conditional, iterative loops, apply functions)
- Arrays
- R Built-in Functions (Text, Numeric, Date, utility)
- Numerical Functions
- Text Functions
- Date Functions
- Utilities Functions
- R User Defined Functions
- R Packages for data manipulation (base, dplyr, plyr, data.table, reshape, car, sqldf, etc)

- Introduction to exploratory data analysis
- Descriptive statistics, Frequency Tables and summarization
- Univariate Analysis (Distribution of data & Graphical Analysis)
- Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
- Creating Graphs (bar, pie, line chart, histogram, boxplot, scatter, density, etc.)
- R Packages for Exploratory Data Analysis (dplyr, plyr, gmodels, car, vcd, Hmisc, psych, doBy, etc.)
- R Packages for Graphical Analysis (base, ggplot2, lattice, etc.)

- Basic Statistics – Measures of Central Tendencies and Variance
- Building blocks – Probability Distributions – Normal distribution – Central Limit Theorem
- Inferential Statistics – Sampling – Concept of Hypothesis Testing
- Statistical Methods – Z/t-tests (one sample, independent, paired), ANOVA, Correlations and Chi-square

- Concept of model in analytics and how it is used?
- Common terminology used in analytics & modelling process
- Popular modelling algorithms
- Types of Business problems – Mapping of Techniques
- Different Phases of Predictive Modelling

**Data Preparation**

- Need of Data preparation
- Consolidation/Aggregation – Outlier treatment – Flat Liners – Missing values- Dummy creation – Variable Reduction
- Variable Reduction Techniques – Factor & PCA Analysis

- Introduction to Segmentation
- Types of Segmentation (Subjective Vs Objective, Heuristic Vs. Statistical)
- Heuristic Segmentation Techniques (Value Based, RFM Segmentation and Life Stage Segmentation)
- Behavioral Segmentation Techniques (K-Means Cluster Analysis)
- Cluster evaluation and profiling – Identify cluster characteristics
- Interpretation of results – Implementation on new data

- Introduction – Applications
- Assumptions of Linear Regression
- Building Linear Regression Model
- Understanding standard metrics (Variable significance, R-square/Adjusted R-square, Global hypothesis, etc.)
- Assess the overall effectiveness of the model
- Validation of Models (Re running Vs. Scoring)
- Standard Business Outputs (Decile Analysis, Error distribution (histogram), Model equation, drivers etc.)
- Interpretation of Results – Business Validation – Implementation on new data

- Introduction – Applications
- Linear Regression Vs. Logistic Regression Vs. Generalized Linear Models
- Building Logistic Regression Model (Binary Logistic Model)
- Understanding standard model metrics (Concordance, Variable significance, Hosmer-Lemeshow Test, Gini, KS, Misclassification, ROC Curve, etc.)
- Validation of Logistic Regression Models (Re running Vs. Scoring)
- Standard Business Outputs (Decile Analysis, ROC Curve, Probability Cut-offs, Lift charts, Model equation, Drivers or variable importance, etc)
- Interpretation of Results – Business Validation – Implementation on new data

- Introduction – Applications
- Time Series Components (Trend, Seasonality, Cyclicity and Level) and Decomposition
- Classification of Techniques (Pattern-based vs. Patternless)
- Basic Techniques – Averages, Smoothing, etc.
- Advanced Techniques – AR Models, ARIMA, etc
- Understanding Forecasting Accuracy – MAPE, MAD, MSE, etc

- Introduction to Machine Learning & Predictive Modeling
- Types of Business problems – Mapping of Techniques – Regression vs. classification vs. segmentation vs. Forecasting
- Major Classes of Learning Algorithms – Supervised vs Unsupervised Learning
- Different Phases of Predictive Modeling (Data Pre-processing, Sampling, Model Building, Validation)
- Overfitting (Bias-Variance Trade off) & Performance Metrics
- Feature engineering & dimension reduction
- Concept of optimization & cost function
- Overview of gradient descent algorithm
- Overview of Cross validation (Bootstrapping, K-Fold validation, etc.)
- Model performance metrics (R-square, Adjusted R-square, RMSE, MAPE, AUC, ROC curve, recall, precision, sensitivity, specificity, confusion matrix)

- What is segmentation & Role of ML in Segmentation?
- Concept of Distance and related math background
- K-Means Clustering
- Expectation Maximization
- Hierarchical Clustering
- Spectral Clustering and Density-Based Clustering (DBSCAN)
- Principal Component Analysis (PCA)

- Decision Trees – Introduction – Applications
- Types of Decision Tree Algorithms
- Construction of Decision Trees through Simplified Examples; Choosing the “Best” attribute at each Non-Leaf node; Entropy; Information Gain, Gini Index, Chi Square, Regression Trees
- Generalizing Decision Trees; Information Content and Gain Ratio; Dealing with Numerical Variables; other Measures of Randomness
- Pruning a Decision Tree; Cost as a consideration; Unwrapping Trees as Rules
- Decision Trees – Validation
- Overfitting – Best Practices to avoid

- Concept of Ensembling
- Manual Ensembling Vs. Automated Ensembling
- Methods of Ensembling (Stacking, Mixture of Experts)
- Bagging (Logic, Practical Applications)
- Random forest (Logic, Practical Applications)
- Boosting (Logic, Practical Applications)
- Ada Boost
- Gradient Boosting Machines (GBM)
- XGBoost

- Motivation for Neural Networks and Its Applications
- Perceptron and Single Layer Neural Network, and Hand Calculations
- Learning in a Multi-Layered Neural Net: Back Propagation and Conjugate Gradient Techniques
- Neural Networks for Regression
- Neural Networks for Classification
- Interpretation of Outputs and Fine-tuning the models with hyperparameters
- Validating ANN models

- Motivation for Support Vector Machine & Applications
- Support Vector Regression
- Support vector classifier (Linear & Non-Linear)
- Mathematical Intuition (Kernel Methods Revisited, Quadratic Optimization and Soft Constraints)
- Interpretation of Outputs and Fine-tuning the models with hyperparameters
- Validating SVM models

- What is KNN & Applications?
- KNN for missing value treatment
- KNN for solving regression problems
- KNN for solving classification problems
- Validating KNN model
- Model fine-tuning with hyperparameters

- Concept of Conditional Probability
- Bayes Theorem and Its Applications
- Naïve Bayes for classification
- Applications of Naïve Bayes in Classifications

- Taming big text; Unstructured vs. Semi-structured Data; Fundamentals of information retrieval; Properties of words; Creating Term-Document (TxD) Matrices; Similarity measures; Low-level processes (Sentence Splitting, Tokenization, Part-of-Speech Tagging, Stemming, Chunking)
- Finding patterns in text: text mining, text as a graph
- Natural Language processing (NLP)
- Text Analytics – Sentiment Analysis using R
- Text Analytics – Word cloud analysis using R
- Text Analytics – Segmentation using K-Means/Hierarchical Clustering
- Text Analytics – Classification (Spam/Not spam)
- Applications of Social Media Analytics
- Metrics (Measures and Actions) in social media analytics
- Examples & Actionable Insights using Social Media Analytics
- Important R packages for Machine Learning (caret, H2O, randomForest, nnet, tm, etc.)
- Fine-tuning the models using hyperparameters, grid search, piping, etc.

**Introduction To Analytics**

- Analytics World
- Introduction to Analytics
- Concept of ETL
- SAS in advanced analytics

- Global Certification: Induction and walk through
- Getting Started
- Software installation
- Introduction to GUI
- Different components of the language
- All programming windows
- Concept of Libraries and Creating Libraries
- Variable Attributes – (Name, Type, Length, Format, Informat, Label)
- Importing Data and Entering data manually

- Understanding Datasets
- Descriptor Portion of a Dataset (Proc Contents)
- Data Portion of a Dataset
- Variable Names and Values
- Data Libraries

**Understanding Data Step Processing**

- Data Step and Proc Step
- Data step execution
- Compilation and execution phase
- Input buffer and concept of PDV

- Importing Raw Data Files
- Column Input and List Input and Formatted methods
- Delimiters, Reading missing and non-standard values
- Reading one to many and many to one records
- Reading Hierarchical files
- Creating raw data files and put statement
- Formats / Informats

- Importing and Exporting Data (Fixed Format / Delimited)
- Proc Import / Delimited text files
- Proc Export / Exporting Data
- Datalines / Cards;
- Atypical importing cases (mixing different style of inputs)
- Reading Multiple Records per Observation
- Reading “Mixed Record Types”
- Sub-setting from a Raw Data File
- Multiple Observations per Record
- Reading Hierarchical Files

- Concept of SAS library and SAS Catalog
- Variable Types in SAS
- Reading Data stored external to SAS
- Importing Data by using Proc Import
- Data Step SAS statements
- SAS Functions
- Appending and Merging using SAS
- SAS Procedures such as Proc Means, Proc Univariate, Proc Append, Proc Freq and Proc Export
- SAS SQL
- SAS Macros

- One Sample t-test of comparing means
- Two Sample t-test of comparing means
- One Way ANOVA
- Assumptions of ANOVA Modeling
- n-way ANOVA
- ANOVA Post Hoc Studies

- Apply the principles of honest assessment to model performance measurement
- Assess classifier performance using the confusion matrix
- Model selection and validation using training and validation data
- Create and interpret graphs (ROC, lift, and gains charts) for model comparison and selection
- Establish effective decision cut-off values for scoring

- Understanding and Exploring Data
- Introduction to basic Procedures – Proc Contents, Proc Print

- Understanding and Exploring Data
- Operators and Operands
- Conditional Statements (Where, If, If then Else, If then Do and select when)
- Difference between WHERE and IF statements and limitation of WHERE statements
- Labels, Commenting
- System Options (OBS, FIRSTOBS, NOOBS, etc.)

- Data Manipulation
- Proc Sort – with options / De-Duping
- Accumulator variable and By-Group processing
- Explicit Output Statements
- Nesting Do loops
- Do While and Do Until Statement
- Array elements and Range

- Combining Datasets (Appending and Merging)
- Concatenation
- Interleaving
- Proc Append
- One To One Merging
- Match Merging
- IN= option – Controlling the merge and indicator variables

- Engineering or management graduates, postgraduates, and freshers who want to build a career in the data science industry or become data scientists
- Engineers who want to use a distributed computing engine for batch or stream processing, or both
- Analysts who want to leverage Spark for analyzing interesting datasets
- Data scientists who want a single engine for analyzing and modelling data as well as productionizing it
- MBA graduates or business professionals who are looking to move to a heavily quantitative role
- Engineering graduates and professionals who want to understand basic statistics and lay a foundation for a career in data science
- Working professionals or fresh graduates who have mostly worked in descriptive analytics, or have no work experience yet, and want to make the shift to being data scientists
- Professionals who have worked mostly with tools like Excel and want to learn how to use R for statistical analysis
