## About Data Science

A data scientist is someone who is better at statistics than any software engineer and better at software engineering than any statistician. Data scientists are responsible for discovering insights from massive amounts of structured and unstructured data to help shape or meet specific business needs and goals. The data scientist's role in data analysis is becoming increasingly important as businesses rely more heavily on big data and data analytics to drive decision-making.


## Who Can Join the Data Science Course?

- IT Professionals
- Data Analysts, Business Analysts
- Functional Managers
- Graduates, Post Graduates

Anyone else with an interest in Data Science is also free to attend our workshop.

## Who Can Be a Data Scientist?

A data scientist is a sort of ‘jack-of-all-trades’ for data crunching. The three main skills a data scientist needs are mathematics/statistics, computer programming literacy, and knowledge of a particular business.

## What Is the Pay Scale of a Data Scientist?

A data scientist earns an average salary of Rs 6,00,000 per year. Experience influences the income for this role. The salary for a data scientist abroad can range anywhere from $100,000 to $120,000.

Figure: Data Science Salaries (PayScale Report)

## What Is the Requisite Skill Set to Be a Data Scientist?

Expertise in mathematics, technical and programming skills, and business and strategy awareness combine to form data science.

# Data Science Curriculum (Long term)

**PART 1 – INTRODUCTION TO DATA SCIENCE**

- What is Data Science? – Introduction.
- Importance of Data Science.
- Demand for Data Science Professionals.
- Brief Introduction to Big Data and Data Analytics.
- Lifecycle of Data Science.
- Tools and Technologies Used in Data Science.
- Comprehensive R Archive Network (CRAN)
- Demo of Installing R on Windows from the CRAN Website
- Installing RStudio on Windows OS
- Setting Up R Workspace.
- Getting Help for R – How to Use the Help System
- Installing Packages – Loading and Unloading Packages

**PART 2 – STATISTICS**

**Fundamentals of Math and Probability**

- Basic understanding of linear algebra, matrices, vectors
- Addition and multiplication of matrices
- Fundamentals of probability
- Probability distribution function and cumulative distribution function.

**Class Hands-on:**

- Problem solving using R for vector manipulation
- Problem solving for probability assignments
**Descriptive Statistics**

- Describe or summarise a set of data
- Measures of central tendency and measures of dispersion.
- The mean, median, mode, kurtosis and skewness
- Computing standard deviation and variance.
- Types of distribution.

**Class Hands-on:**

- Five-number summary and box plot
- Histogram and bar chart
- Exploratory analytics: R methods
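
As a rough illustration of the five-number summary listed above, here is a minimal Python sketch using only the standard library. The class hands-on itself uses R; Python is used here only for illustration. The function name `five_number_summary` and the sample values are assumptions, and quartile conventions differ between tools, so other software may report slightly different Q1 and Q3 values.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # method="inclusive" interpolates between observed data points, so the
    # quartiles never fall outside the observed range.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

# Hypothetical sample data, chosen only for illustration.
scores = [2, 4, 4, 5, 7, 9, 10, 12, 15]
summary = five_number_summary(scores)
print(summary)  # (2, 4.0, 7.0, 10.0, 15)
```

These five values are what a box plot draws: the box spans Q1 to Q3, the line inside marks the median, and the whiskers reach toward the minimum and maximum.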

**Inferential Statistics**

- What is inferential statistics
- Different types of sampling techniques
- Central Limit Theorem
- Point estimate and interval estimate
- Creating a confidence interval for a population parameter
- Characteristics of the Z-distribution and T-distribution
- Basics of hypothesis testing
- Types of tests and rejection region
- Types of errors in hypothesis testing: Type I and Type II errors
- P-value and Z-score method
- T-test, Analysis of Variance (ANOVA) and Analysis of Covariance (ANCOVA)
- Regression analysis in ANOVA

**Class Hands-on:**

- Problem solving for the Central Limit Theorem (CLT)
- Problem solving for hypothesis testing
- Problem solving for T-test, Z-score test
- Case study and model run for ANOVA, ANCOVA

**Hypothesis Testing**

- Basics of hypothesis testing
- Types of tests and rejection region
- Types of errors: Type I errors, Type II errors
- P-value method, Z-score method
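
To make the P-value and Z-score methods above more concrete, here is a small sketch of a two-sided one-sample z-test in Python. The function name `one_sample_z_test` and all numbers are hypothetical, and the sketch assumes the population standard deviation is known, which is the textbook condition for a z-test.

```python
import math
from statistics import NormalDist

def one_sample_z_test(sample_mean, pop_mean, pop_sd, n):
    """Return (z, two-sided p-value) for H0: the true mean equals pop_mean."""
    standard_error = pop_sd / math.sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # Two-sided p-value: probability of a |Z| at least this large under H0.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical numbers, for illustration only.
z, p = one_sample_z_test(sample_mean=52.0, pop_mean=50.0, pop_sd=10.0, n=100)

alpha = 0.05                 # significance level (Type I error rate)
reject_null = p < alpha      # equivalent to |z| falling in the rejection region
print(f"z = {z:.2f}, p = {p:.4f}, reject H0: {reject_null}")
# z = 2.00, p = 0.0455, reject H0: True
```

With a 5% significance level the rejection region is roughly |z| > 1.96, so the Z-score method and the P-value method lead to the same decision here.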

**PART 3 – UNDERSTANDING AND IMPLEMENTING MACHINE LEARNING**

**Introduction To Machine Learning**

- What is Machine Learning?
- What is the Challenge?
- Introduction to Supervised Learning, Unsupervised Learning
- What is Reinforcement Learning?
**Linear Regression**

- Introduction to Linear Regression
- Linear Regression with Multiple Variables
- Disadvantages of Linear Models
- Interpretation of Model Outputs
- Understanding Covariance and Collinearity
- Understanding Heteroscedasticity

**Case Study:**

- Application of Linear Regression for Housing Price Prediction
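
As a hedged sketch of what the housing price case study involves at its simplest, the following Python code fits a one-variable linear regression by ordinary least squares. The function name and the area/price figures are made up for illustration; a real case study would use an actual dataset and more than one predictor.

```python
def fit_simple_linear_regression(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x); the 1/n factors cancel out.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: house area (square metres) and price (arbitrary units).
areas = [50, 60, 80, 100, 120]
prices = [110, 130, 170, 210, 250]

slope, intercept = fit_simple_linear_regression(areas, prices)
predicted_price = slope * 90 + intercept  # prediction for a 90 m^2 house
print(slope, intercept, predicted_price)
```

The slope is read as the expected change in price per extra square metre, which is the kind of interpretation covered under "Interpretation of Model Outputs".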

**Logistic Regression**

- Introduction to Logistic Regression
- Why Logistic Regression
- Introducing the notion of classification
- Cost function for logistic regression
- Application of logistic regression to multi-class classification.
- Confusion Matrix, Odds Ratio and ROC Curve
- Advantages and Disadvantages of Logistic Regression.

**Case Study:**

- To classify an email as spam or not spam using logistic regression.
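
The pieces named above (the sigmoid, the cost function, and the confusion matrix) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the scores stand in for the output of an already trained spam model, and the helper names are hypothetical.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(y_true, y_prob):
    """Average cross-entropy cost used to train logistic regression."""
    eps = 1e-12  # keeps log() away from 0
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)

def confusion_matrix(y_true, y_pred):
    """Count true/false positives and negatives for binary labels."""
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(1 for t, p in pairs if t == 1 and p == 1),
        "fp": sum(1 for t, p in pairs if t == 0 and p == 1),
        "tn": sum(1 for t, p in pairs if t == 0 and p == 0),
        "fn": sum(1 for t, p in pairs if t == 1 and p == 0),
    }

# Hypothetical model scores (w.x + b) for four emails; 1 = spam, 0 = not spam.
scores = [2.0, -1.0, 0.5, -3.0]
labels = [1, 0, 0, 0]

probs = [sigmoid(s) for s in scores]
preds = [1 if p >= 0.5 else 0 for p in probs]
cm = confusion_matrix(labels, preds)
print(cm, round(log_loss(labels, probs), 3))
```

Here the third email is a false positive: it is not spam, but its score of 0.5 maps to a probability above the 0.5 threshold.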

**Decision Trees And Supervised Learning**

- Decision Tree – data set
- How to build a decision tree?
- Understanding the CART (Classification and Regression Trees) Model
- Classification Rules
- Overfitting Problem
- Stopping Criteria And Pruning
- How to find the final size of trees?
- Modeling a decision tree.
- Naive Bayes
- Random Forests and Support Vector Machines
- Interpretation of Model Outputs

**Case Study:**

- Business Case Study for CART Model
- Business Case Study for Random Forest
- Business Case Study for SVM

**Unsupervised Learning**

- Hierarchical Clustering
- k-Means algorithm for clustering – groupings of unlabeled data points.
- Principal Component Analysis (PCA) – Data
- Independent Component Analysis (ICA)
- Anomaly Detection
- Recommender System – collaborative filtering algorithm

**Case Study:** Recommendation engine for e-commerce/retail chain

**Introduction to Deep Learning**

- Neural Network
- Understanding Neural Network Model
- Understanding Tuning of Neural Network

**Case Study:** Case study using a neural network

**Natural Language Processing**

- Introduction to Natural Language Processing (NLP).
- Word Frequency Algorithms for NLP
- Sentiment Analysis

**Case Study:** Twitter data analysis using NLP
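
A word frequency count, the simplest of the NLP techniques listed above, might look like the following sketch. The tweets are invented examples, the function name is hypothetical, and the tokenizer is a deliberately crude regular expression; a real Twitter analysis would also need to handle hashtags, mentions, URLs and stop words.

```python
import re
from collections import Counter

def word_frequencies(texts):
    """Count lowercase word occurrences across a list of strings."""
    counts = Counter()
    for text in texts:
        # Keep runs of letters and apostrophes; drop punctuation and digits.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts

# Invented example tweets, for illustration only.
tweets = [
    "Data science is fun",
    "Data science is hard work",
    "Fun with data!",
]

counts = word_frequencies(tweets)
print(counts.most_common(3))
```

Frequency tables like this are a common starting point for sentiment analysis, where counts of positive and negative words feed a scoring rule or a classifier.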

**Apache Spark Analytics**

- What is Spark
- Introduction to Spark RDD
- Introduction to Spark SQL and DataFrames
- Using R-Spark for machine learning

**Hands-on:**

- Installation and configuration of Spark
- Hands-on Spark RDD programming
- Hands-on Spark SQL and DataFrame programming
- Using R-Spark for machine learning programming

**PART 4 – R PROGRAMMING BASICS**

**R Basics, Background**

- Comprehensive R Archive Network (CRAN)
- Demo of Installing R on Windows from the CRAN Website
- Installing RStudio on Windows OS
- Setting Up R Workspace.
- Getting Help for R – How to Use the Help System
- Installing Packages – Loading and Unloading Packages

**Getting Familiar with Basics**

- Operators in R – Arithmetic, Relational, Logical and Assignment Operators
- Variables, Types of Variables, Using Variables
- Conditional Statements, ifelse(), switch
- Loops: For Loops, While Loops, Using Break Statement, Switch

**The R Programming Language – Data Types**

- Creating data objects from the keyboard.
- How to make different types of data objects.
- Types of data structures in R
- Arrays and Lists – Create, access the elements
- Vectors – Create vectors, vectorized operations, power of vectorized operations
- Matrices – Building the first matrices, matrix operations, subsetting, visualising subsets
- Data Frames – Create and filter data frames, building and merging data frames.

**Functions And Importing Data into R**

- Function Overview – Naming Guidelines
- Argument Matching, Functions with Multiple Arguments
- Additional Arguments using Ellipsis, Lazy Evaluation
- Multiple Return Values
- Functions as Objects, Anonymous Functions
- Importing and exporting data in R – importing from files like Excel, CSV and Minitab.
- Import from URL and Excel files
- Import from database.

**Data Descriptive**

- Statistics, Tabulation, Distribution
- Summary Statistics for Matrix Objects
- apply() Command
- Converting an Object into a Table
- Histograms, Stem and Leaf Plot, Density
- Normal Distribution

**Graphics in R – Types of Graphics**

- Bar Chart, Pie Chart, Histograms – Create and edit.
- Box Plots – Basics of box plots, create and edit
- Visualisation in R using ggplot2.
- More About Graphs: Adding Legends to Graphs, Adding Text to Graphs, Orienting the Axis Label.

**PART 5 – PYTHON FOR DATA SCIENCE**

**Python Programming Basics**

- Installing Jupyter Notebooks
- Python Overview
- Python 2.7 vs Python 3
- Python Identifiers
- Various Operators and Operator Precedence
- Getting Input from User, Comments, Multi-line Comments.

**Making Decisions And Loop Control**

- Simple if Statement, if-else Statement, if-elif Statement.
- Introduction to while Loops.
- Introduction to for Loops, Using continue and break.

**Python Data Types: Lists, Tuples, Dictionaries**

- Python Lists, Tuples, Dictionaries
- Accessing Values
- Basic Operations
- Indexing, Slicing, and Matrices
- Built-in Functions & Methods
- Exercises on Lists, Tuples and Dictionaries

**Functions And Modules**

- Introduction to Functions
- Why Define Functions
- Calling Functions
- Functions with Multiple Arguments.
- Anonymous Functions – Lambda
- Using Built-In Modules, User-Defined Modules, Module Namespaces, Iterators and Generators.

**File I/O And Exception Handling**

- Opening and Closing Files
- open Function, file Object Attributes
- close() Method, read, write, seek.
- Exception Handling, the try-finally Clause
- Raising Exceptions, User-Defined Exceptions
- Regular Expressions – Search and Replace
- Regular Expression Modifiers
- Regular Expression Patterns, re module.

**NumPy**

- Introduction to NumPy.
- Array Creation, Printing Arrays
- Basic Operations – Indexing, Slicing and Iterating
- Shape Manipulation – Changing shape, stacking and splitting of arrays
- Vector stacking

**Pandas And Matplotlib**

- Introduction to Pandas
- Importing data into Python
- Pandas DataFrames, indexing DataFrames, basic operations with DataFrames, renaming columns, subsetting and filtering a DataFrame.
- Matplotlib – Introduction, plot(), Controlling Line Properties, Working with Multiple Figures, Histograms

**Introduction to Tableau/Spotfire**

- Connecting to data source
- Creating dashboard pages
- How to create calculated columns
- Different charts

**Hands-on:**

- Hands-on connecting to a data source and data cleansing
- Hands-on with various charts
- Hands-on deployment of a predictive model in visualisation

### Course Highlights

1. A Dedicated Portal For Practicing.

2. Real-Time Project Data Models to Work On

3. 1-1 Mentorship

4. Internship Offers for Freshers.

5. Weekly Assignments.

6. Weekly Doubt Sessions

7. Advanced Curriculum

8. Certificates on Successful Completion of Project.

9. Resume Preparation Tips

10. Interview Guidance And Support.

11. Dedicated HR Team for Job Support And Placement Assistance.

12. Experienced Trainers.