Data Science and Data Analytics TRAINING IN THRISSUR

Rating: 4.8/5 (2546 reviews), 1880 learners

Overview

Data Science and Data Analytics – Python / R / SAS Training in Pune

Learn Data Science, Deep Learning, and Machine Learning using Python / R / SAS, with live Machine Learning and Deep Learning projects.

Duration: 3 months, weekends only (3 hours each on Saturday and Sunday)

Real-time projects, assignments, and scenarios are part of this course.

Data sets, installation support, interview preparation, and the option to repeat sessions for up to 6 months are all included in this course.

Why Radical Technologies

  • Highly practical, hands-on training
  • 25000+ man-hours of real-time projects and scenarios
  • Corporate trainers with 10 to 20+ years of real-time experience
  • Professionals trained by highly experienced professionals
  • 100% quality assurance in training
  • 10000+ placement records and tie-ups with 180+ MNCs and consultancies

AUDIENCE

  • Engineering or management graduates and post-graduate freshers who want to build a career in the data science industry or become data scientists.
  • Engineers who want to use a distributed computing engine for batch processing, stream processing, or both.
  • Analysts who want to leverage Spark for analyzing interesting datasets.
  • Data scientists who want a single engine for analyzing and modelling data as well as productionizing it.
  • MBA graduates or business professionals who are looking to move to a heavily quantitative role.
  • Engineering graduates and professionals who want to understand basic statistics and lay a foundation for a career in data science.
  • Working professionals or fresh graduates who have worked mostly in descriptive analytics, or have no work experience yet, and want to make the shift to being data scientists.
  • Professionals who’ve worked mostly with tools like Excel and want to learn how to use R for statistical analysis.

Course Curriculum

Course description

One-time classroom registration fee: 1000/-

Classroom training batch schedules:

Location | Day/Duration | Date | Time | Type
Aundh | Weekday | 28/01/2021 | 12:00 PM | Demo Batch

Online training batch schedules:

Mode | Day/Duration | Start Date | End Date | Price
Online | 1 Day | 30/01/2021 | 30/01/2021 | ₹ 0.00

Data Science Training & Certification in Pune 

Learn Data Science, Deep Learning, and Machine Learning with Python / R / SAS, including live Machine Learning and Deep Learning projects.

Duration of Data Science Training: 80 hrs

Batch type: Weekdays / Weekends

Mode of Data Science Training: Classroom / Online / Corporate Training

Real-time projects, assignments, and scenarios are part of this Data Science training.

The course prepares you to become a certified data scientist and includes complete placement support.

Trainer: Experienced data science consultant

Want to Be a Future Data Scientist?

Data Science Training Introduction: This course does not require a prior quantitative or mathematics background. It starts by introducing basic concepts such as the mean, median, and mode, and eventually covers all aspects of an analytics or data science career, from analyzing and preparing raw data to visualizing your findings. If you’re a programmer or a fresh graduate looking to switch into an exciting new career track, or a data analyst looking to make the transition into the tech industry, this course will teach you the basic to advanced techniques used by real-world industry data scientists.

Data Science, Statistics with Python / R / SAS: This course is an introduction to Data Science and Statistics using R, Python, or SAS. It covers both the theory behind statistical concepts and their practical implementation in R / Python / SAS. If you’re new to Python, don’t worry: the course starts with a crash course. Whether you’ve done some programming before or are new to programming, you should pick it up quickly. This course shows you how to get set up on Microsoft Windows-based PCs; the sample code will also run on macOS or Linux desktop systems.

Data Science Analytics: Using Spark and Scala you can analyze and explore your data in an interactive environment with fast feedback. The course will show how to leverage the power of RDDs and Data frames to manipulate data with ease.

Machine Learning and Data Science : Spark’s core functionality and built-in libraries make it easy to implement complex algorithms like Recommendations with very few lines of code. We’ll cover a variety of datasets and algorithms including PageRank, MapReduce and Graph datasets.

Data Science Real life examples: Every concept is explained with the help of examples, case studies and source code in R wherever necessary. The examples cover a wide array of topics and range from A/B testing in an Internet company context to the Capital Asset Pricing Model in a quant finance context. 

Course Content

Introduction to Data Science with Python

  • What is analytics & Data Science?
  • Common Terms in Analytics
  • Analytics vs. Data warehousing, OLAP, MIS Reporting
  • Relevance in industry and need of the hour
  • Types of problems and business objectives in various industries
  • How leading companies are harnessing the power of analytics?
  • Critical success drivers
  • Overview of analytics tools & their popularity
  • Analytics Methodology & problem solving framework
  • List of steps in Analytics projects
  • Identify the most appropriate solution design for the given problem statement
  • Project plan for Analytics project & key milestones based on effort estimates
  • Build Resource plan for analytics project

Python Essentials

  • Why Python for data science?
  • Overview of Python- Starting with Python
  • Introduction to installation of Python
  • Introduction to Python Editors & IDEs (Canopy, PyCharm, Jupyter, Rodeo, IPython, etc.)
  • Understand Jupyter notebook & Customize Settings
  • Concept of Packages/Libraries – Important packages (NumPy, SciPy, scikit-learn, Pandas, Matplotlib, etc.)
  • Installing & loading Packages & Name Spaces
  • Data Types & Data objects/structures (strings, Tuples, Lists, Dictionaries)
  • List and Dictionary Comprehensions
  • Variable & Value Labels –  Date & Time Values
  • Basic Operations – Mathematical – string – date
  • Reading and writing data
  • Simple plotting
  • Control flow & conditional statements
  • Debugging & Code profiling
  • How to create classes and modules, and how to call them
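
As an illustration of the "List and Dictionary Comprehensions" topic above, here is a minimal sketch. The `module_hours` data is hypothetical and exists only for this example; it is not taken from the course.

```python
# Hypothetical sample data: module names mapped to a length in hours.
module_hours = {"Python Essentials": 8, "Statistics": 6, "Regression": 10}

# List comprehension: names of modules longer than 6 hours.
long_modules = [name for name, hours in module_hours.items() if hours > 6]

# Dictionary comprehension: convert every module length to minutes.
module_minutes = {name: hours * 60 for name, hours in module_hours.items()}

print(long_modules)
print(module_minutes)
```

Both forms build a new collection in one expression, which is why they often replace short `for` loops that only append or assign.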

Scientific Libraries Used In Python For Data Science

NumPy, pandas, scikit-learn, statsmodels, NLTK

Accessing/Importing And Exporting Data Using Python Modules  

  • Importing Data from various sources (CSV, TXT, Excel, Access, etc.)
  • Database Input (Connecting to database)
  • Viewing Data objects – subsetting Data, methods
  • Exporting Data to various formats
  • Important Python modules: Pandas, Beautiful Soup
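
The import, subset, and export steps listed above can be sketched with only the standard-library `csv` module. This swaps in `csv` for Pandas so the example has no dependencies; the sample records and column names are hypothetical.

```python
import csv
import io

# Hypothetical CSV content; in practice this would come from a file on disk.
raw_csv = "name,city,score\nAsha,Pune,82\nRavi,Thrissur,91\n"

# Import: parse the CSV text into a list of dictionaries, one per row.
rows = list(csv.DictReader(io.StringIO(raw_csv)))

# Subset: keep rows with a score above 85 (CSV values arrive as strings).
high_scores = [row for row in rows if int(row["score"]) > 85]

# Export: write the subset back out as tab-delimited text.
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer, fieldnames=["name", "city", "score"], delimiter="\t", lineterminator="\n"
)
writer.writeheader()
writer.writerows(high_scores)
exported = buffer.getvalue()
print(exported)
```

Replacing `io.StringIO` with `open(path, newline="")` gives the same flow against real files.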

Data Manipulation – Cleansing – Munging using python modules

  • Cleansing Data with Python
  • Data Manipulation steps (Sorting, filtering, duplicates, merging, appending, subsetting, derived variables, sampling, Data type conversions, renaming, formatting, etc.)
  • Data manipulation tools (Operators, Functions, Packages, control structures, Loops, arrays, etc.)
  • Python Built-in Functions (Text, numeric, date, utility functions)
  • Python User Defined Functions
  • Stripping out extraneous information
  • Normalizing data
  • Formatting data
  • Important Python modules for data manipulation (Pandas, NumPy, re, math, string, datetime, etc.)

Data Analysis – Visualization Using Python

  • Introduction to exploratory data analysis
  • Descriptive statistics, Frequency Tables and summarization
  • Univariate Analysis (Distribution of data & Graphical Analysis)
  • Bivariate Analysis(Cross Tabs, Distributions & Relationships, Graphical Analysis)
  • Creating Graphs (bar, pie, line chart, histogram, boxplot, scatter, density, etc.)
  • Important Packages for Exploratory Analysis (NumPy arrays, Matplotlib, seaborn, Pandas, scipy.stats, etc.)

Introduction to Statistics

  • Basic Statistics – Measures of Central Tendencies and Variance
  • Building blocks – Probability Distributions – Normal distribution – Central Limit Theorem
  • Inferential Statistics – Sampling – Concept of Hypothesis Testing
  • Statistical Methods – Z/t-tests (one sample, independent, paired), Analysis of Variance, Correlations and Chi-square
  • Important modules for statistical methods: NumPy, SciPy, Pandas
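
To make the measures of central tendency, variance, and the one-sample t-test above concrete, here is a small sketch using only the standard library. The sample values and the hypothesized mean of 13 are assumptions chosen for illustration.

```python
import math
import statistics

# Hypothetical sample: daily order counts over eight days.
sample = [12, 15, 11, 15, 14, 18, 15, 12]

mean = statistics.mean(sample)          # arithmetic mean
median = statistics.median(sample)      # middle value of the sorted data
mode = statistics.mode(sample)          # most frequent value
variance = statistics.variance(sample)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(sample)      # sample standard deviation

# One-sample t statistic against an assumed population mean of 13:
# t = (sample mean - hypothesized mean) / (s / sqrt(n))
hypothesized_mean = 13
t_stat = (mean - hypothesized_mean) / (std_dev / math.sqrt(len(sample)))

print(mean, median, mode, round(variance, 3), round(t_stat, 3))
```

Turning the t statistic into a p-value needs the t distribution, which is available in SciPy (listed above) but not in the standard library, so that step is left out here.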

Introduction to Predictive Modelling

  • Concept of model in analytics and how it is used?
  • Common terminology used in analytics & Modelling process
  • Popular modelling algorithms
  • Types of Business problems – Mapping of Techniques
  • Different Phases of Predictive Modelling

Data Exploration For Modelling

  • Need for structured exploratory data
  • EDA framework for exploring the data and identifying any problems with the data (Data Audit Report)
  • Identify missing data
  • Identify outliers
  • Visualize the data trends and patterns

Data Preparation

  • Need of Data preparation
  • Consolidation/Aggregation – Outlier treatment – Flat Liners – Missing values- Dummy creation – Variable Reduction
  • Variable Reduction Techniques – Factor & PCA Analysis

Segmentation: Solving Segmentation Problems

  • Introduction to Segmentation
  • Types of Segmentation (Subjective Vs Objective, Heuristic Vs. Statistical)
  • Heuristic Segmentation Techniques (Value Based, RFM Segmentation and Life Stage Segmentation)
  • Behavioural Segmentation Techniques (K-Means Cluster Analysis)
  • Cluster evaluation and profiling – Identify cluster characteristics
  • Interpretation of results – Implementation on new data
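
The K-Means cluster analysis mentioned above alternates an assignment step and an update step. The following is a simplified one-dimensional sketch with fixed starting centroids, not a production implementation; the `spend` values and initial centroids are hypothetical.

```python
def k_means_1d(values, centroids, iterations=10):
    """Plain k-means on one-dimensional data with caller-supplied start centroids."""
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical monthly spend per customer.
spend = [20, 22, 25, 80, 85, 90, 300, 320]
centroids, clusters = k_means_1d(spend, centroids=[20, 80, 300])
print(centroids)
print(clusters)
```

Profiling then means describing each cluster (for example low, medium, and high spenders), and new customers can be scored by assigning them to the nearest centroid.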

Linear Regression: Solving Regression Problems

  • Introduction – Applications
  • Assumptions of Linear Regression
  • Building Linear Regression Model
  • Understanding standard metrics (Variable significance, R-square/Adjusted R-square, Global hypothesis, etc.)
  • Assess the overall effectiveness of the model
  • Validation of Models (Re running Vs. Scoring)
  • Standard Business Outputs (Decile Analysis, Error distribution (histogram), Model equation, drivers etc.)
  • Interpretation of Results – Business Validation – Implementation on new data
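
As a sketch of building a linear regression model and reading R-square and Adjusted R-square, the example below fits a single-predictor model by ordinary least squares in plain Python. The `ad_spend` and `sales` figures are hypothetical.

```python
def fit_simple_linear_regression(x, y):
    """Ordinary least squares for one predictor: y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical data: advertising spend versus sales.
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

intercept, slope = fit_simple_linear_regression(ad_spend, sales)
r2 = r_squared(ad_spend, sales, intercept, slope)

# Adjusted R-square penalizes extra predictors; here p = 1 predictor.
n, p = len(sales), 1
adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
print(intercept, slope, r2, adjusted_r2)
```

Scoring new data is then just `intercept + slope * new_x`, which corresponds to the "implementation on new data" step.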

Logistic Regression : Solving Classification Problems

  • Introduction – Applications
  • Linear Regression Vs. Logistic Regression Vs. Generalized Linear Models
  • Building Logistic Regression Model (Binary Logistic Model)
  • Understanding standard model metrics (Concordance, Variable significance, Hosmer-Lemeshow Test, Gini, KS, Misclassification, ROC Curve, etc.)
  • Validation of Logistic Regression Models (Re running Vs. Scoring)
  • Standard Business Outputs (Decile Analysis, ROC Curve, Probability Cut-offs, Lift charts, Model equation, Drivers or variable importance, etc)
  • Interpretation of Results – Business Validation – Implementation on new data

Time Series Forecasting : Solving Forecasting Problems

  • Introduction – Applications
  • Time Series Components( Trend, Seasonality, Cyclicity and Level) and Decomposition
  • Classification of Techniques(Pattern based – Pattern less)
  • Basic Techniques – Averages, Smoothing, etc.
  • Advanced Techniques – AR Models, ARIMA, etc
  • Understanding Forecasting Accuracy – MAPE, MAD, MSE, etc
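
The averaging technique and the accuracy measures (MAD, MSE, MAPE) listed above can be sketched as follows. The `demand` series and the window of 2 are assumptions for illustration only.

```python
def moving_average_forecast(series, window):
    """Forecast each point as the mean of the previous `window` observations."""
    return [
        sum(series[i - window:i]) / window
        for i in range(window, len(series))
    ]


def forecast_errors(actual, forecast):
    """Return MAD, MSE, and MAPE (in percent) for paired actual/forecast values."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    mad = sum(abs(e) for e in errors) / n
    mse = sum(e ** 2 for e in errors) / n
    mape = 100 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    return {"MAD": mad, "MSE": mse, "MAPE": mape}


# Hypothetical monthly demand with a steady upward trend.
demand = [100, 110, 120, 130, 140, 150]
forecast = moving_average_forecast(demand, window=2)

# The first `window` points have no forecast, so compare from index 2 onward.
metrics = forecast_errors(demand[2:], forecast)
print(forecast)
print(metrics)
```

On trending data like this, a plain moving average lags behind every time, which is one motivation for the trend-aware techniques such as ARIMA.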

Machine Learning : Predictive Modelling

  • Introduction to Machine Learning & Predictive Modelling
  • Types of Business problems – Mapping of Techniques – Regression vs. classification vs. segmentation vs. Forecasting
  • Major Classes of Learning Algorithms -Supervised vs Unsupervised Learning
  • Different Phases of Predictive Modelling (Data Pre-processing, Sampling, Model Building, Validation)
  • Overfitting (Bias-Variance Trade off) & Performance Metrics
  • Feature engineering & dimension reduction
  • Concept of optimization & cost function
  • Overview of gradient descent algorithm
  • Overview of Cross validation(Bootstrapping, K-Fold validation etc)
  • Model performance metrics (R-square, Adjusted R-square, RMSE, MAPE, AUC, ROC curve, recall, precision, sensitivity, specificity, confusion matrix)
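
For the classification metrics above, a minimal sketch shows how precision, recall (sensitivity), and specificity all derive from the four cells of a confusion matrix. The `actual` and `predicted` labels are hypothetical.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    return tp, tn, fp, fn


# Hypothetical labels: 1 = positive class, 0 = negative class.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, tn, fp, fn = confusion_counts(actual, predicted)
precision = tp / (tp + fp)      # of predicted positives, how many were right
recall = tp / (tp + fn)         # also called sensitivity
specificity = tn / (tn + fp)    # of actual negatives, how many were caught
accuracy = (tp + tn) / len(actual)
print(tp, tn, fp, fn, precision, recall, specificity, accuracy)
```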

Data Science Unsupervised Learning : Segmentation

  • What is segmentation & Role of ML in Segmentation?
  • Concept of Distance and related math background
  • K-Means Clustering
  • Expectation Maximization
  • Hierarchical Clustering
  • Spectral Clustering and DBSCAN
  • Principal Component Analysis (PCA)

Data Science Supervised Learning :- Decision Trees

  • Decision Trees – Introduction – Applications
  • Types of Decision Tree Algorithms
  • Construction of Decision Trees through Simplified Examples; Choosing the “Best” attribute at each Non-Leaf node; Entropy; Information Gain, Gini Index, Chi Square, Regression Trees
  • Generalizing Decision Trees; Information Content and Gain Ratio; Dealing with Numerical Variables; other Measures of Randomness
  • Pruning a Decision Tree; Cost as a consideration; Unwrapping Trees as Rules
  • Decision Trees – Validation
  • Overfitting – Best Practices to avoid
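
Choosing the "best" attribute at a node comes down to comparing impurity before and after a candidate split. The sketch below computes entropy, Gini index, and information gain for one hypothetical split; the labels are made up for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1 - sum((c / total) ** 2 for c in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical node with 4 "yes" and 4 "no" cases, split into two children.
parent = ["yes"] * 4 + ["no"] * 4
children = [["yes", "yes", "yes", "no"], ["yes", "no", "no", "no"]]

gain = information_gain(parent, children)
print(entropy(parent), gini(parent), round(gain, 4))
```

A tree-building algorithm would evaluate this gain for every candidate attribute and split on the one with the highest value.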

Supervised Learning :- Ensemble Learning

  • Concept of Ensembling
  • Manual Ensembling Vs. Automated Ensembling
  • Methods of Ensembling (Stacking, Mixture of Experts)
  • Bagging (Logic, Practical Applications)
  • Random forest (Logic, Practical Applications)
  • Boosting (Logic, Practical Applications)
  • Ada Boost
  • Gradient Boosting Machines (GBM)
  • XGBoost

Supervised Learning :- Artificial Neural Network – ANN

  • Motivation for Neural Networks and Its Applications
  • Perceptron and Single Layer Neural Network, and Hand Calculations
  • Learning in a Multi-Layered Neural Net: Back Propagation and Conjugate Gradient Techniques
  • Neural Networks for Regression
  • Neural Networks for Classification
  • Interpretation of Outputs and Fine tune the models with hyper parameters
  • Validating ANN models

Supervised Learning :- Support Vector Machines

  • Motivation for Support Vector Machine & Applications
  • Support Vector Regression
  • Support vector classifier (Linear & Non-Linear)
  • Mathematical Intuition (Kernel Methods Revisited, Quadratic Optimization and Soft Constraints)
  • Interpretation of Outputs and Fine tune the models with hyper parameters
  • Validating SVM models

Supervised Learning :-KNN

  • What is KNN & Applications?
  • KNN for missing value treatment
  • KNN For solving regression problems
  • KNN for solving classification problems
  • Validating KNN model
  • Model fine tuning with hyper parameters
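
KNN for classification can be sketched in a few lines: measure the distance to every training point, take the `k` closest, and return the majority label. The training points and labels below are hypothetical, and `k` is the hyperparameter to tune.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-feature training data with two classes.
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["low", "low", "low", "high", "high", "high"]

print(knn_classify(points, labels, query=(2, 2)))
print(knn_classify(points, labels, query=(6, 5)))
```

For regression, the same neighbour search applies, with the vote replaced by the average of the neighbours' target values.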

Supervised Learning :- Naive Bayes

  • Concept of Conditional Probability
  • Bayes Theorem and Its Applications
  • Naïve Bayes for classification
  • Applications of Naïve Bayes in Classifications
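
A short worked example of Bayes' theorem, the building block of Naïve Bayes classification. All probabilities here are assumed values for a hypothetical spam filter, not measured figures.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """Bayes' theorem: P(A | B) = P(B | A) * P(A) / P(B)."""
    # Total probability of the evidence B across both cases (A and not A).
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence


# Assumed values: 20% of mail is spam; the word "offer" appears in
# 60% of spam messages and in 5% of non-spam messages.
p_spam = 0.20
p_word_given_spam = 0.60
p_word_given_ham = 0.05

p_spam_given_word = posterior(p_spam, p_word_given_spam, p_word_given_ham)
print(round(p_spam_given_word, 2))
```

Naïve Bayes extends this to many words by multiplying the per-word likelihoods, under the "naïve" assumption that words are conditionally independent given the class.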

Text Mining And Analytics

  • Taming big text, Unstructured vs. Semi-structured Data; Fundamentals of information retrieval, Properties of words; Creating Term-Document (TxD);Matrices; Similarity measures, Low-level processes (Sentence Splitting; Tokenization; Part-of-Speech Tagging; Stemming; Chunking)
  • Finding patterns in text: text mining, text as a graph
  • Natural Language processing (NLP)
  • Text Analytics – Sentiment Analysis using Python
  • Text Analytics – Word cloud analysis using Python
  • Text Analytics – Segmentation using K-Means/Hierarchical Clustering
  • Text Analytics – Classification (Spam/Not spam)
  • Applications of Social Media Analytics
  • Metrics (Measures, Actions) in social media analytics
  • Examples & Actionable Insights using Social Media Analytics
  • Important Python modules for Machine Learning (scikit-learn, statsmodels, SciPy, NLTK, etc.)
  • Fine tuning the models using hyperparameters, grid search, pipelines, etc.
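
To illustrate tokenization, the term-document matrix, and similarity measures from the text mining topics above, here is a minimal standard-library sketch. The three documents are hypothetical, and whitespace splitting stands in for a real tokenizer such as the ones in NLTK.

```python
import math
from collections import Counter

# Hypothetical mini-corpus.
documents = [
    "free offer win money now",
    "meeting schedule for project review",
    "win free money offer",
]

# Tokenization: lowercase and split on whitespace (a deliberate simplification).
tokenized = [doc.lower().split() for doc in documents]
vocabulary = sorted({token for doc in tokenized for token in doc})

# Term-document matrix: one row per term, one column per document.
counts = [Counter(doc) for doc in tokenized]
term_document = [[c[term] for c in counts] for term in vocabulary]


def document_vector(index):
    """Column of the term-document matrix for one document."""
    return [row[index] for row in term_document]


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


sim_0_2 = cosine_similarity(document_vector(0), document_vector(2))
sim_0_1 = cosine_similarity(document_vector(0), document_vector(1))
print(round(sim_0_2, 4), sim_0_1)
```

The two promotional messages score close to 1 while the unrelated pair scores 0, which is the signal that clustering and spam classification build on.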

OR

DATASCIENCE WITH R COURSE CONTENT

  • What is analytics & Data Science?
  • Common Terms in Analytics
  • Analytics vs. Data warehousing, OLAP, MIS Reporting
  • Relevance in industry and need of the hour
  • Types of problems and business objectives in various industries
  • How leading companies are harnessing the power of analytics?
  • Critical success drivers
  • Overview of analytics tools & their popularity
  • Analytics Methodology & problem solving framework
  • List of steps in Analytics projects
  • Identify the most appropriate solution design for the given problem statement
  • Project plan for Analytics project & key milestones based on effort estimates
  • Build Resource plan for analytics project
  • Why R for data science?

Data Importing / Exporting

  • Introduction R/R-Studio – GUI
  • Concept of Packages – Useful Packages (Base & Other packages)
  • Data Structure & Data Types (Vectors, Matrices, factors, Data frames,  and Lists)
  • Importing Data from various sources (txt, dlm, excel, sas7bdat, db, etc.)
  • Database Input (Connecting to database)
  • Exporting Data to various formats
  • Viewing Data (Viewing partial data and full data)
  • Variable & Value Labels –  Date Values

Data Manipulation

  • Data Manipulation steps
  • Creating New Variables (calculations & Binning)
  • Dummy variable creation
  • Applying transformations
  • Handling duplicates
  • Handling missing values
  • Sorting and Filtering
  • Subsetting (Rows/Columns)
  • Appending (Row appending/column appending)
  • Merging/Joining (Left, right, inner, full, outer etc)
  • Data type conversions
  • Renaming
  • Formatting
  • Reshaping data
  • Sampling
  • Data manipulation tools
  • Operators
  • Functions
  • Packages
  • Control Structures (if, if else)
  • Loops (Conditional, iterative loops, apply functions)
  • Arrays
  • R Built-in Functions (Text, Numeric, Date, utility)
  • Numerical Functions
  • Text Functions
  • Date Functions
  • Utilities Functions
  • R User Defined Functions
  • R Packages for data manipulation (base, dplyr, plyr, data.table, reshape, car, sqldf, etc)

Data Analysis – Visualization

  • Introduction to exploratory data analysis
  • Descriptive statistics, Frequency Tables and summarization
  • Univariate Analysis (Distribution of data & Graphical Analysis)
  • Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
  • Creating Graphs (bar, pie, line chart, histogram, boxplot, scatter, density, etc.)
  • R Packages for Exploratory Data Analysis (dplyr, plyr, gmodels, car, vcd, Hmisc, psych, doBy, etc.)
  • R Packages for Graphical Analysis (base, ggplot2, lattice, etc.)

Introduction To Statistics

  • Basic Statistics – Measures of Central Tendencies and Variance
  • Building blocks – Probability Distributions – Normal distribution – Central Limit Theorem
  • Inferential Statistics -Sampling – Concept of Hypothesis Testing
  • Statistical Methods – Z/t-tests( One sample, independent, paired), Anova, Correlations and Chi-square

Predictive Modelling

  • Concept of model in analytics and how it is used?
  • Common terminology used in analytics & modelling process
  • Popular modelling algorithms
  • Types of Business problems – Mapping of Techniques
  • Different Phases of Predictive Modelling

Data Exploration For Modeling

Data Preparation

  • Need of Data preparation
  • Consolidation/Aggregation – Outlier treatment – Flat Liners – Missing values- Dummy creation – Variable Reduction
  • Variable Reduction Techniques – Factor & PCA Analysis

Segmentation: Solving Segmentation Problems

  • Introduction to Segmentation
  • Types of Segmentation (Subjective Vs Objective, Heuristic Vs. Statistical)
  • Heuristic Segmentation Techniques (Value Based, RFM Segmentation and Life Stage Segmentation)
  • Behavioral Segmentation Techniques (K-Means Cluster Analysis)
  • Cluster evaluation and profiling – Identify cluster characteristics
  • Interpretation of results – Implementation on new data

Linear Regression: Solving Regression Problems

  • Introduction – Applications
  • Assumptions of Linear Regression
  • Building Linear Regression Model
  • Understanding standard metrics (Variable significance, R-square/Adjusted R-square, Global hypothesis ,etc)
  • Assess the overall effectiveness of the model
  • Validation of Models (Re running Vs. Scoring)
  • Standard Business Outputs (Decile Analysis, Error distribution (histogram), Model equation, drivers etc.)
  • Interpretation of Results – Business Validation – Implementation on new data

Logistic Regression: Solving Classification Problems

  • Introduction – Applications
  • Linear Regression Vs. Logistic Regression Vs. Generalized Linear Models
  • Building Logistic Regression Model (Binary Logistic Model)
  • Understanding standard model metrics (Concordance, Variable significance, Hosmer-Lemeshow Test, Gini, KS, Misclassification, ROC Curve, etc.)
  • Validation of Logistic Regression Models (Re running Vs. Scoring)
  • Standard Business Outputs (Decile Analysis, ROC Curve, Probability Cut-offs, Lift charts, Model equation, Drivers or variable importance, etc)
  • Interpretation of Results – Business Validation – Implementation on new data

Time Series Forecasting: Solving Forecasting Problems

  • Introduction – Applications
  • Time Series Components( Trend, Seasonality, Cyclicity and Level) and Decomposition
  • Classification of Techniques(Pattern based – Pattern less)
  • Basic Techniques – Averages, Smoothing, etc.
  • Advanced Techniques – AR Models, ARIMA, etc
  • Understanding Forecasting Accuracy – MAPE, MAD, MSE, etc

Machine Learning – Predictive Modeling – Basics

  • Introduction to Machine Learning & Predictive Modeling
  • Types of Business problems – Mapping of Techniques – Regression vs. classification vs. segmentation vs. Forecasting
  • Major Classes of Learning Algorithms -Supervised vs Unsupervised Learning
  • Different Phases of Predictive Modeling (Data Pre-processing, Sampling, Model Building, Validation)
  • Overfitting (Bias-Variance Trade off) & Performance Metrics
  • Feature engineering & dimension reduction
  • Concept of optimization & cost function
  • Overview of gradient descent algorithm
  • Overview of Cross validation(Bootstrapping, K-Fold validation etc)
  • Model performance metrics (R-square, Adjusted R-square, RMSE, MAPE, AUC, ROC curve, recall, precision, sensitivity, specificity, confusion matrix)

Unsupervised Learning: Segmentation

  • What is segmentation & Role of ML in Segmentation?
  • Concept of Distance and related math background
  • K-Means Clustering
  • Expectation Maximization
  • Hierarchical Clustering
  • Spectral Clustering and DBSCAN
  • Principal Component Analysis (PCA)

Supervised Learning: Decision Trees

  • Decision Trees – Introduction – Applications
  • Types of Decision Tree Algorithms
  • Construction of Decision Trees through Simplified Examples; Choosing the “Best” attribute at each Non-Leaf node; Entropy; Information Gain, Gini Index, Chi Square, Regression Trees
  • Generalizing Decision Trees; Information Content and Gain Ratio; Dealing with Numerical Variables; other Measures of Randomness
  • Pruning a Decision Tree; Cost as a consideration; Unwrapping Trees as Rules
  • Decision Trees – Validation
  • Overfitting – Best Practices to avoid

Supervised Learning: Ensemble Learning

  • Concept of Ensembling
  • Manual Ensembling Vs. Automated Ensembling
  • Methods of Ensembling (Stacking, Mixture of Experts)
  • Bagging (Logic, Practical Applications)
  • Random forest (Logic, Practical Applications)
  • Boosting (Logic, Practical Applications)
  • Ada Boost
  • Gradient Boosting Machines (GBM)
  • XGBoost

Supervised Learning: Artificial Neural Networks (ANN)

  • Motivation for Neural Networks and Its Applications
  • Perceptron and Single Layer Neural Network, and Hand Calculations
  • Learning in a Multi-Layered Neural Net: Back Propagation and Conjugate Gradient Techniques
  • Neural Networks for Regression
  • Neural Networks for Classification
  • Interpretation of Outputs and Fine tune the models with hyper parameters
  • Validating ANN models

Supervised Learning: Support Vector Machines

  • Motivation for Support Vector Machine & Applications
  • Support Vector Regression
  • Support vector classifier (Linear & Non-Linear)
  • Mathematical Intuition (Kernel Methods Revisited, Quadratic Optimization and Soft Constraints)
  • Interpretation of Outputs and Fine tune the models with hyper parameters
  • Validating SVM models

Supervised Learning: KNN

  • What is KNN & Applications?
  • KNN for missing value treatment
  • KNN For solving regression problems
  • KNN for solving classification problems
  • Validating KNN model
  • Model fine tuning with hyper parameters

Supervised Learning: Naïve Bayes

  • Concept of Conditional Probability
  • Bayes Theorem and Its Applications
  • Naïve Bayes for classification
  • Applications of Naïve Bayes in Classifications

Text Mining & Analytics

  • Taming big text, Unstructured vs. Semi-structured Data; Fundamentals of information retrieval, Properties of words; Creating Term-Document (TxD);Matrices; Similarity measures, Low-level processes (Sentence Splitting; Tokenization; Part-of-Speech Tagging; Stemming; Chunking)
  • Finding patterns in text: text mining, text as a graph
  • Natural Language processing (NLP)
  • Text Analytics – Sentiment Analysis using R
  • Text Analytics – Word cloud analysis using R
  • Text Analytics – Segmentation using K-Means/Hierarchical Clustering
  • Text Analytics – Classification (Spam/Not spam)
  • Applications of Social Media Analytics
  • Metrics (Measures, Actions) in social media analytics
  • Examples & Actionable Insights using Social Media Analytics
  • Important R packages for Machine Learning (caret, H2O, randomForest, nnet, tm, etc.)
  • Fine tuning the models using hyperparameters, grid search, pipelines, etc.

Project

Case Studies

OR

DATASCIENCE TRAINING WITH S-A-S COURSE CONTENT

Introduction To Analytics

  • Analytics World
    • Introduction to Analytics
    • Concept of ETL
    • S-A-S in advanced analytics
  • Global Certification: Induction and walk through
    • Getting Started
    • Software installation
    • Introduction to GUI
    • Different components of the language
    • All programming windows
    • Concept of Libraries and Creating Libraries
    • Variable Attributes – (Name, Type, Length, Format, Informat, Label)
    • Importing Data and Entering data manually
  • Understanding Datasets
    • Descriptor Portion of a Dataset (Proc Contents)
    • Data Portion of a Dataset
    • Variable Names and Values
    • Data Libraries

Base S-A-S – Accessing The Data

  • Understanding Data Step Processing
    • Data Step and Proc Step
    • Data step execution
    • Compilation and execution phase
    • Input buffer and concept of PDV
  • Importing Raw Data Files
    • Column Input and List Input and Formatted methods
    • Delimiters, Reading missing and non standard values
    • Reading one to many and many to one records
    • Reading Hierarchical files
    • Creating raw data files and put statement
    • Formats / Informat
  • Importing and Exporting Data (Fixed Format / Delimited)
  • Proc Import / Delimited text files
  • Proc Export / Exporting Data
  • Datalines / Cards;
  • Atypical importing cases (mixing different style of inputs)
    • Reading Multiple Records per Observation
    • Reading “Mixed Record Types”
    • Sub-setting from a Raw Data File
    • Multiple Observations per Record
    • Reading Hierarchical Files
  • Concept of SAS library and SAS Catalog
  • Variable Types in SAS
  • Reading Data stored external to SAS
  • Importing Data by using Proc Import
  • Data Step SAS statements
  • SAS Functions
  • Appending and Merging using SAS
  • SAS Procedures like proc means, proc Univariate, proc append, proc freq and proc export.
  • SAS SQL
  • SAS Macros

Hypothesis Testing and ANOVA

  • One Sample t-test of comparing means
  • Two Sample t-test of comparing means
  • One Way ANOVA
  • Assumptions of ANOVA Modeling
  • n-way ANOVA
  • ANOVA Post Hoc Studies

Measure Model Performance

  • Apply the principles of honest assessment to model performance measurement
  • Assess classifier performance using the confusion matrix
  • Model selection and validation using training and validation data
  • Create and interpret graphs (ROC, lift, and gains charts) for model comparison and selection
  • Establish effective decision cut-off values for scoring

Data Understanding, Managing And Manipulation

  • Understanding and Exploring Data
    • Introduction to basic Procedures – Proc Contents, Proc Print
    • Operators and Operands
    • Conditional Statements (Where, If, If then Else, If then Do and select when)
    • Difference between WHERE and IF statements and limitation of WHERE statements
    • Labels, Commenting
    • System Options (OBS, FIRSTOBS, NOOBS, etc.)
  • Data Manipulation
    • Proc Sort – with options / De-Duping
    • Accumulator variable and By-Group processing
    • Explicit Output Statements
    • Nesting Do loops
    • Do While and Do Until Statement
    • Array elements and Range
  • Combining Datasets (Appending and Merging)
    • Concatenation
    • Interleaving
    • Proc Append
    • One To One Merging
    • Match Merging
    • IN = Controlling merge and Indicator

Data Mining With Proc SQL

  • Introduction to Databases
  • Introduction to Proc SQL
  • Basics of General SQL language
  • Creating table and Inserting Values
  • Retrieve & Summarize data
  • Group, Sort & Filter
  • Using Joins (Full, Inner, Left, Right and Outer)
  • Reporting and summary analysis
  • Concept of Indexes and creating Indexes (simple and composite)
  • Connecting S-A-S to external Databases
  • Implicit and Explicit pass through methods

Macros For Automation

  • Macro Parameters and Variables
  • Different types of Macro Creation
  • Defining and calling a macro
  • Using call Symput and Symget
  • Macros options (mprint symbolgen mlogic merror serror)

Fundamentals Of Statistics

  • Basic Statistics – Measures of Central Tendencies and Variance
  • Building blocks – Probability Distributions – Normal distribution – Central Limit Theorem
  • Inferential Statistics -Sampling – Concept of Hypothesis Testing
  • Statistical Methods – Z/t-tests (one sample, independent, paired), ANOVA, Correlations and Chi-square
  • Levels of Measurement and Variable types
  • Descriptive Statistics and Picturing Distributions
  • Confidence Interval for the Mean

Introduction To Predictive Modelling

  • Introduction to Predictive Modeling
  • Types of Business problems – Mapping of Techniques
  • Different Phases of Predictive Modeling

Data Preparation

  • Need of Data preparation
  • Data Audit Report and Its importance
  • Consolidation/Aggregation – Outlier treatment – Flat Liners – Missing values- Dummy creation – Variable Reduction
  • Variable Reduction Techniques – Factor & PCA Analysis

Segmentation

  • Introduction to Segmentation
  • Types of Segmentation (Subjective Vs Objective, Heuristic Vs. Statistical)
  • Heuristic Segmentation Techniques (Value Based, RFM Segmentation and Life Stage Segmentation)
  • Behavioural Segmentation Techniques (K-Means Cluster Analysis)
  • Cluster evaluation and profiling
  • Interpretation of results – Implementation on new data

Linear Regression

  • Introduction – Applications
  • Assumptions of Linear Regression
  • Building Linear Regression Model
  • Understanding standard metrics (Variable significance, R-square/Adjusted R-square, Global hypothesis, etc.)
  • Validation of Models (Re running Vs. Scoring)
  • Standard Business Outputs (Decile Analysis, Error distribution (histogram), Model equation, drivers etc.)
  • Interpretation of Results – Business Validation – Implementation on new data

Logistic Regression

  • Introduction – Applications
  • Linear Regression Vs. Logistic Regression Vs. Generalized Linear Models
  • Building Logistic Regression Model
  • Understanding standard model metrics (Concordance, Variable significance, Hosmer-Lemeshow Test, Gini, KS, Misclassification, etc.)
  • Validation of Logistic Regression Models (Re running Vs. Scoring)
  • Standard Business Outputs (Decile Analysis, ROC Curve, Probability Cut-offs, Lift charts, Model equation, Drivers, etc.)
  • Interpretation of Results – Business Validation – Implementation on new data

 Time Series Forecasting

  • Introduction – Applications
  • Time Series Components (Trend, Seasonality, Cyclicity and Level) and Decomposition
  • Classification of Techniques (Pattern based – Pattern less)
  • Basic Techniques – Averages, Smoothing, etc.
  • Advanced Techniques – AR Models, ARIMA, etc
  • Understanding Forecasting Accuracy – MAPE, MAD, MSE, etc
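
The accuracy measures named in the last bullet can be illustrated with a short sketch. This is only an illustration, not course material: the function name `forecast_errors` and the sample numbers are assumptions, and MAD is taken here to mean the mean absolute deviation of the forecast errors.

```python
def forecast_errors(actual, forecast):
    """Return MAD, MSE and MAPE (in percent) for two equal-length sequences."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    mad = sum(abs(e) for e in errors) / n   # mean absolute deviation
    mse = sum(e * e for e in errors) / n    # mean squared error
    # MAPE is undefined when an actual value is zero; this sketch assumes none are.
    mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"MAD": mad, "MSE": mse, "MAPE": mape}


# Hypothetical demand figures, used only to exercise the function.
actual = [100, 200, 400]
forecast = [110, 190, 380]
metrics = forecast_errors(actual, forecast)
print(metrics)
```

MAD and MSE are expressed in the units of the data (or their square), while MAPE is a percentage, which is why it is often preferred when comparing series of different scale.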

 Introduction To Machine Learning

  • Statistical learning vs. Machine learning
  • Major Classes of Learning Algorithms – Supervised vs Unsupervised Learning
  • Concept of Overfitting and Underfitting (Bias-Variance Trade-off) & Performance Metrics
  • Types of Cross validation (Train & Test, Bootstrapping, K-Fold validation, etc.)
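
As a rough illustration of K-Fold validation, the sketch below splits sample indices into k folds, each used once as the test set. The helper name `k_fold_indices` is hypothetical, and the folds are contiguous and unshuffled to keep the example small; practical work would normally shuffle first or use a library splitter.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print("train:", train, "test:", test)
```

Every sample lands in exactly one test fold, so each observation is used for validation once and for training k-1 times.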

 Regression & Classification Model Building

  • Recursive Partitioning (Decision Trees)
  • Ensemble Models (Random Forest, Bagging & Boosting)
  • K-Nearest neighbours

OR

ADVANCED BIG DATA SCIENCE COURSE CONTENT

Introduction To Data Science

  • What is Data Science?
  • Why Python for data science?
  • Relevance in industry and need of the hour
  • How leading companies are harnessing the power of Data Science with Python?
  • Different phases of a typical Analytics/Data Science project and the role of Python
  • Anaconda vs. Python

 Python Essentials (Core)

  • Overview of Python- Starting with Python
  • Introduction to installation of Python
  • Introduction to Python Editors & IDEs (Canopy, PyCharm, Jupyter, Rodeo, IPython, etc.)
  • Understand Jupyter notebook & Customize Settings
  • Concept of Packages/Libraries – Important packages (NumPy, SciPy, scikit-learn, Pandas, Matplotlib, etc.)
  • Installing & loading Packages & Name Spaces
  • Data Types & Data objects/structures (strings, Tuples, Lists, Dictionaries)
  • List and Dictionary Comprehensions
  • Variable & Value Labels –  Date & Time Values
  • Basic Operations – Mathematical – string – date
  • Reading and writing data
  • Simple plotting
  • Control flow & conditional statements
  • Debugging & Code profiling
  • How to create class and modules and how to call them?
  • Scientific distributions used in Python for Data Science – NumPy, SciPy, pandas, scikit-learn, statsmodels, NLTK, etc.

 Accessing/Importing And Exporting Data Using Python Modules

  • Importing Data from various sources (CSV, TXT, Excel, Access, etc.)
  • Database Input (Connecting to database)
  • Viewing Data objects – subsetting, methods
  • Exporting Data to various formats
  • Important Python modules: Pandas, Beautiful Soup

 Data Manipulation – Cleansing – Munging Using Python Modules

  • Cleansing Data with Python
  • Data Manipulation steps (Sorting, filtering, duplicates, merging, appending, subsetting, derived variables, sampling, Data type conversions, renaming, formatting, etc.)
  • Data manipulation tools (Operators, Functions, Packages, control structures, Loops, arrays, etc.)
  • Python Built-in Functions (Text, numeric, date, utility functions)
  • Python User Defined Functions
  • Stripping out extraneous information
  • Normalizing data
  • Formatting data
  • Important Python modules for data manipulation (Pandas, Numpy, re, math, string, datetime etc)

 Data Analysis – Visualization Using Python

  • Introduction to exploratory data analysis
  • Descriptive statistics, Frequency Tables and summarization
  • Univariate Analysis (Distribution of data & Graphical Analysis)
  • Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
  • Creating Graphs – Bar/pie/line chart/histogram/boxplot/scatter/density, etc.
  • Important Packages for Exploratory Analysis (NumPy Arrays, Matplotlib, seaborn, Pandas, scipy.stats, etc.)

 Basic Statistics & Implementation Of Stats Methods In Python

  • Basic Statistics – Measures of Central Tendencies and Variance
  • Building blocks – Probability Distributions – Normal distribution – Central Limit Theorem
  • Inferential Statistics – Sampling – Concept of Hypothesis Testing
  • Statistical Methods – Z/t-tests (One sample, independent, paired), ANOVA, Correlation and Chi-square
  • Important modules for statistical methods: Numpy, Scipy, Pandas

 Python: Machine Learning – Predictive Modeling – Basics

  • Introduction to Machine Learning & Predictive Modeling
  • Types of Business problems – Mapping of Techniques – Regression vs. classification vs. segmentation vs. Forecasting
  • Major Classes of Learning Algorithms – Supervised vs Unsupervised Learning
  • Different Phases of Predictive Modeling (Data Pre-processing, Sampling, Model Building, Validation)
  • Overfitting (Bias-Variance Trade-off) & Performance Metrics
  • Feature engineering & dimension reduction
  • Concept of optimization & cost function
  • Concept of gradient descent algorithm
  • Concept of Cross validation (Bootstrapping, K-Fold validation, etc.)
  • Model performance metrics (R-square, RMSE, MAPE, AUC, ROC curve, recall, precision, sensitivity, specificity, confusion matrix)
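
Several of the classification metrics listed above follow directly from the four cells of a binary confusion matrix. The sketch below shows that relationship under simple assumptions: labels are 0/1, the function name `classification_metrics` and the sample labels are made up for illustration, and every denominator is assumed to be non-zero.

```python
def classification_metrics(y_true, y_pred):
    """Compute confusion-matrix counts and derived metrics for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "confusion": {"TP": tp, "TN": tn, "FP": fp, "FN": fn},
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),          # recall is the same as sensitivity
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical labels, used only to exercise the function.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
result = classification_metrics(y_true, y_pred)
print(result)
```

Note that recall and sensitivity are two names for the same quantity, so the list above contains one fewer distinct metric than it appears to.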

 Machine Learning Algorithms & Applications – Implementation In Python

  • Linear & Logistic Regression
  • Segmentation – Cluster Analysis (K-Means)
  • Decision Trees (CART/C5.0)
  • Ensemble Learning (Random Forest, Bagging & Boosting)
  • Artificial Neural Networks (ANN)
  • Support Vector Machines (SVM)
  • Other Techniques (KNN, Naïve Bayes, PCA)
  • Introduction to Text Mining using NLTK
  • Introduction to Time Series Forecasting (Decomposition & ARIMA)
  • Important Python modules for Machine Learning (scikit-learn, statsmodels, SciPy, NLTK, etc.)
  • Fine-tuning the models using hyperparameters, grid search, pipelines, etc.

Project – Consolidate Learnings

  • Applying different algorithms to solve business problems and benchmarking the results

Introduction To Big Data

  • Introduction and Relevance
  • Uses of Big Data analytics in various industries such as Telecom, E-commerce, Finance and Insurance
  • Problems with Traditional Large-Scale Systems

 Hadoop(Big Data) Eco-System

  • Motivation for Hadoop
  • Different types of projects by Apache
  • Role of projects in the Hadoop Ecosystem
  • Key technology foundations required for Big Data
  • Limitations and Solutions of existing Data Analytics Architecture
  • Comparison of traditional data management systems with Big Data management systems
  • Evaluate key framework requirements for Big Data analytics
  • Hadoop Ecosystem & Hadoop 2.x core components
  • Explain the relevance of real-time data
  • Explain how to use Big Data and real-time data as a Business planning tool

 Hadoop Cluster-Architecture-Configuration Files

  • Hadoop Master-Slave Architecture
  • The Hadoop Distributed File System – Concept of data storage
  • Explain different types of cluster setups (Fully distributed, Pseudo-distributed, etc.)
  • Hadoop cluster set up – Installation
  • Hadoop 2.x Cluster Architecture
  • A Typical enterprise cluster – Hadoop Cluster Modes
  • Understanding cluster management tools like Cloudera Manager/Apache Ambari

 Hadoop-HDFS & MapReduce (YARN)

  • HDFS Overview & Data storage in HDFS
  • Get data into Hadoop from the local machine (Data Loading Techniques) and vice versa
  • Map Reduce Overview (Traditional way Vs. MapReduce way)
  • Concept of Mapper & Reducer
  • Understanding MapReduce program Framework
  • Develop MapReduce Program using Java (Basic)
  • Develop MapReduce program with streaming API (Basic)
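
The Mapper and Reducer concept can be sketched in plain Python with the classic word count. This runs in a single process and does not use Hadoop at all; the function names `mapper`, `shuffle` and `reducer` and the input lines are assumptions chosen only to mirror the three phases of a MapReduce job.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(key, values):
    """Reduce phase: combine the values for one key into a single count."""
    return key, sum(values)


# Hypothetical input; on a cluster each line could come from a different HDFS block.
lines = ["big data is big", "data is everywhere"]
mapped = [pair for line in lines for pair in mapper(line)]
counts = dict(reducer(key, values) for key, values in shuffle(mapped).items())
print(counts)
```

In a real job the framework performs the shuffle between nodes, so only the map and reduce logic is written by the developer.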

 Data Integration Using Sqoop & Flume

  • Integrating Hadoop into an Existing Enterprise
  • Loading Data from an RDBMS into HDFS by Using Sqoop
  • Managing Real-Time Data Using Flume
  • Accessing HDFS from Legacy Systems

 Data Analysis Using Pig

  • Introduction to Data Analysis Tools
  • Apache PIG – MapReduce Vs Pig, Pig Use Cases
  • PIG’s Data Model
  • PIG Streaming
  • Pig Latin Program & Execution
  • Pig Latin : Relational Operators, File Loaders, Group Operator, COGROUP Operator, Joins and COGROUP, Union, Diagnostic Operators, Pig UDF
  • Writing Java UDFs
  • Embedded Pig in Java
  • PIG Macros
  • Parameter Substitution
  • Use Pig to automate the design and implementation of MapReduce applications
  • Use Pig to apply structure to unstructured Big Data

 Data Analysis Using Hive

  • Apache Hive – Hive Vs. PIG – Hive Use Cases
  • Discuss the Hive data storage principle
  • Explain the File formats and Records formats supported by the Hive environment
  • Perform operations with data in Hive
  • Hive QL: Joining Tables, Dynamic Partitioning, Custom Map/Reduce Scripts
  • Hive Script, Hive UDF
  • Hive Persistence formats
  • Loading data in Hive – Methods
  • Serialization & Deserialization
  • Handling Text data using Hive
  • Integrating external BI tools with Hadoop Hive

 Data Analysis Using Impala

  • Impala & Architecture
  • How Impala executes Queries and its importance
  • Hive vs. PIG vs. Impala
  • Extending Impala with User Defined functions

 Introduction To Other Ecosystem Tools

  • NoSQL database – Hbase
  • Introduction to Oozie

Spark: Introduction

  • Introduction to Apache Spark
  • Streaming Data Vs. In Memory Data
  • Map Reduce Vs. Spark
  • Modes of Spark
  • Spark Installation Demo
  • Overview of Spark on a cluster
  • Spark Standalone Cluster

 Spark: Spark In Practice

  • Invoking Spark Shell
  • Creating the Spark Context
  • Loading a File in Shell
  • Performing Some Basic Operations on Files in Spark Shell
  • Caching Overview
  • Distributed Persistence
  • Spark Streaming Overview (Example: Streaming Word Count)

 Spark: Spark Meets Hive

  • Analyze Hive and Spark SQL Architecture
  • Analyze Spark SQL
  • Context in Spark SQL
  • Implement a sample example for Spark SQL
  • Integrating Hive and Spark SQL
  • Support for JSON and Parquet File Formats
  • Implement Data Visualization in Spark
  • Loading of Data
  • Hive Queries through Spark
  • Performance Tuning Tips in Spark
  • Shared Variables: Broadcast Variables & Accumulators

 Spark Streaming

  • Extract and analyze data from Twitter using Spark Streaming
  • Comparison of Spark and Storm – Overview

 Spark GraphX

  • Overview of GraphX module in Spark
  • Creating graphs with GraphX

 Introduction To Machine Learning Using Spark

  • Understand Machine learning framework
  • Implement some of the ML algorithms using Spark MLlib

 Project

  • Consolidate all the learnings
  • Working on Big Data Project by integrating various key components

Projects :-

Python Projects

  • Random password generator – Mini
  • CLI based scientific calculator – Mini
  • Instagram bot – Mini
  • Expense Tracker – Mini
  • Site connectivity checker – Mini
  • Lawn Tennis Match Highlight (Can be extended to any sport) – Major
  • NLP library – Major

 

Deep Learning Projects

  • Churn Modelling using ANN – Mini
  • Image Classification – Mini
  • Image classification using Transfer learning – Major
  • Sentence Classification using RNN, LSTM, GRU – Mini
  • Sentence Classification using word embeddings – Major
  • Object Detection using YOLO – Major

 

Machine Learning Projects

  • EDA on movies database – Mini
  • House price prediction using Regression – Mini
  • Predict survival on the Titanic using Classification – Mini
  • Image Clustering – Mini
  • Document Clustering – Mini
  • Twitter US Airline Sentiment – Major
  • Restaurant revenue prediction – Major
  • Disease Prediction – Major

 

Note: The projects above may vary depending on the trainer.

Training Options

Live Online Training

  • Highly practical oriented training
  • Installation of Software On your System
  • 24/7 Email and Phone Support
  • 100% Placement Assistance until you get placed
  • Global Certification Preparation
  • Trainer Student Interactive Portal
  • Assignments and Projects Guided by Mentors
  • And Many More Features

Course completion certificate and Global Certifications are part of all our Master Programs

Live Classroom Training

  • Weekend / Weekdays / Morning / Evening Batches
  • 80:20 Practical and Theory Ratio
  • Real-life Case Studies
  • Easy catch-up if you miss any sessions
  • PSI | Kryterion | Redhat Test Centers
  • Lifetime Video Classroom Access (coming soon)
  • Resume Preparations and Mock Interviews
  • And Many More Features

Course completion certificate and Global Certifications are part of all our Master Programs

Exam & Certification

Course Reviews

I had a wonderful experience at Radical Technologies, where I did training in Hadoop development under the guidance of Shanit Sir. He started from the very basics and covered and shared everything he knew in this field. He was brilliant and had a lot of experience in this field. We did hands-on work for every topic we covered, and that’s the most important thing because honestly theoretical knowledge cannot land you a job.
Rohit Agrawal Hadoop
I have recently completed Linux course under Anand Sir and can assuredly say that it is definitely the best Linux course in Pune. Since most of the Linux courses from other sources are strictly focused on clearing the certification, they will not provide an insight into real-world server administration, but that is not the case with Anand Sir’s course. Anand Sir being an experienced IT infrastructure professional has an excellent understanding of how a data center works and all these information is seamlessly integrated into his classes.
Manu Sunil Linux
I took the Oracle DBA course under Chetan sir’s guidance, and it was a very good learning experience overall. They not only provide theoretical knowledge but also conduct a lot of practical sessions, which are really fruitful. The teaching is clear and crisp, which makes it easier to understand. Overall I had a great time for around 2 months; they really train you well. They also make it a point to clear all your doubts and provide clear, in-depth concepts, so I hope to join again sometime.
Reema banerjee Oracle DBA
I have completed Oracle DBA 11g from Radical technology pune. Excellent trainer (chetna gupta ). The trainer kept the energy level up and kept us interested throughout. Very practical, hands on experience. Gave us real-time examples, excellent tips and hints. It was a great experience with Radical technologies.
Mrudul Bhokare Oracle DBA
Linux learning with Anand sir is truly different experience… I don’t have any idea about Linux and system but Anand sir taught with scratch…He has a great knowledge and the best trainer…he can solve all your queries related to Linux in very simple way and giving nice examples… 100 🌟 to Anand Sir.
Harsh Singh Parihar Linux

Why Radical Technologies is the best

Radical Technologies is steadily progressing and offers the best possible services. Recognition of Radical Technologies is rising steeply as demand grows rapidly.

  • Creative
  • Innovative
  • Student Friendly
  • Practical Oriented
  • Valued Certification

Training FAQs

Similar Courses
