Basics of Data Science and Machine Learning in R

Course code: DAR001

Duration: 44 Hours

About this course

This course is delivered largely through hands-on practice sessions. Data science goes beyond traditional analytics and statistical models to support building predictive models. Participants will learn the concepts behind data science and the motivation for applying it across disciplines. The course also provides the foundation skills to build various types of predictive models, covers the machine learning algorithms used to build them, and explains how predictive modelling and machine learning are related. A key aspect of the course is that participants work through real-world use cases for each machine learning algorithm covered, so that they can analyse, conceptualize and build predictive models on their own.


This course is intended for students and working professionals who want to learn data science, machine learning and R. It is a foundation course in data science that imparts the skills needed to begin working on machine learning projects.


A basic understanding of mathematics and analytical thinking is required.

Course Objectives

This course provides the foundation knowledge needed in analytics and data science:

    - Introduction to analytics and data science and their applications today
    - Using R software and its facilities
    - Producing reports using R
    - Basic mathematics
    - Various machine learning algorithms
    - Predictive models
    - Project work applying each of the machine learning algorithms learnt

Course Curriculum

1. Welcome to the Course - 4 hours

Introduction to Data Science, Machine Learning, Applications, R, Data Collection, Data Types & RStudio.

2. Basic Statistics and other fundamentals - 4 hours

Basic Statistics, Measures of Spread (Dispersion), Outlier Treatment, Data Visualization.

Probability Theory, Probability Distributions: Binomial distribution, Normal distribution.

3. Getting Started with R - 4 hours

R Installation & Introduction, Cleaning Datasets, Treating Missing Values, Transforming Variables.
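
As an illustration only, the cleaning steps named in this module might look like the sketch below. The `sales` data frame and its column names are hypothetical and are not taken from the course material; median imputation and a log transform are just one possible choice for each step.

```r
# Hypothetical toy data; not taken from the course material.
sales <- data.frame(
  region  = c("north", "south", "south", "east", "north"),
  revenue = c(1200, NA, 950, 40000, 1100),
  stringsAsFactors = FALSE
)

# Inspect the data: count missing values in each column.
missing_per_column <- colSums(is.na(sales))

# Treat missing values: replace NA revenue with the median of the observed values.
revenue_median <- median(sales$revenue, na.rm = TRUE)
sales$revenue[is.na(sales$revenue)] <- revenue_median

# Transform variables: log-scale the skewed revenue and encode region as a factor.
sales$log_revenue <- log(sales$revenue)
sales$region <- factor(sales$region)

print(missing_per_column)
print(sales)
```

The median is used here rather than the mean because the toy data contains one very large value, which would pull a mean-based replacement upwards.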

4. Linear Regression - 4 hours

Regression – Linear regression, coefficient of determination (R²), Multiple linear regression.
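
A minimal sketch of these topics in R follows. It assumes the built-in `mtcars` dataset as a stand-in, since the course's own data is not specified here, and the choice of `mpg`, `wt` and `hp` as variables is purely illustrative.

```r
# mtcars ships with base R; it stands in for the course's own data.

# Simple linear regression: fuel economy as a function of weight.
simple_fit <- lm(mpg ~ wt, data = mtcars)
r_squared <- summary(simple_fit)$r.squared

# Recompute R^2 from its definition: 1 - SS_residual / SS_total.
ss_res <- sum(residuals(simple_fit)^2)
ss_tot <- sum((mtcars$mpg - mean(mtcars$mpg))^2)
r_squared_manual <- 1 - ss_res / ss_tot

# Multiple linear regression: add horsepower as a second predictor.
multi_fit <- lm(mpg ~ wt + hp, data = mtcars)

print(r_squared)
print(r_squared_manual)
print(coef(multi_fit))
```

The manual calculation should agree with the value reported by `summary()`, which is a useful check when first learning what the coefficient of determination measures.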

5. Making a predictive model using Linear Regression - 4 hours

House Price Prediction using Linear Regression in R (Case Study)

6. Logistic Regression - 4 hours

Logistic regression: theory and implementation in R.
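
For illustration, a logistic regression in base R could be sketched as below. The built-in `mtcars` dataset and the 0.5 classification threshold are assumptions made for this example and are not part of the course material.

```r
# Model the probability that a car has a manual transmission (am = 1)
# from its weight, using the built-in mtcars data as a stand-in.
logit_fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Predicted probabilities on the training data.
predicted_prob <- predict(logit_fit, type = "response")

# Convert probabilities to class labels with an assumed 0.5 threshold.
predicted_class <- ifelse(predicted_prob > 0.5, 1, 0)

# Compare predictions with the observed labels.
confusion <- table(actual = mtcars$am, predicted = predicted_class)
accuracy <- mean(predicted_class == mtcars$am)

print(coef(logit_fit))
print(confusion)
print(accuracy)
```

Accuracy here is measured on the same data used to fit the model, so it is optimistic; a held-out test set would give a fairer estimate.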

7. Logistic Regression Case Study - 4 hours

Titanic Survival prediction in R using Logistic Regression

8. Cluster Analysis - 4 hours

K-means clustering: theory and implementation in R.
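
As a rough sketch of K-means in R, the example below clusters the built-in `iris` measurements. The dataset, the choice of three clusters and the seed are assumptions for illustration only.

```r
# Fix the random seed so the random initial centres are reproducible.
set.seed(42)

# Use the four numeric columns and standardise them so that no single
# measurement dominates the Euclidean distance.
features <- scale(iris[, 1:4])

# Fit K-means with 3 clusters; nstart = 25 tries 25 random starts
# and keeps the solution with the lowest within-cluster sum of squares.
km <- kmeans(features, centers = 3, nstart = 25)

# Cluster sizes and how the clusters line up with the known species.
print(km$size)
print(table(cluster = km$cluster, species = iris$Species))
```

The species labels are not used for fitting; they appear only in the final table as a sanity check on the clusters found.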

9. Grouping of Clusters - 4 hours

Grouping of Clusters for Digital Marketing (using K-means)

10. Decision Sciences - 4 hours

Decision Sciences using decision trees, with a practical problem statement.
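
A minimal decision tree sketch follows. It assumes the `rpart` package (a recommended package included with standard R distributions) and the built-in `iris` dataset; the course itself does not state which package or problem statement it uses.

```r
# rpart is assumed here; the course may use a different tree package.
library(rpart)

# Fit a classification tree predicting species from all four measurements.
tree_fit <- rpart(Species ~ ., data = iris, method = "class")

# Predict class labels on the training data and measure accuracy.
predicted <- predict(tree_fit, iris, type = "class")
training_accuracy <- mean(predicted == iris$Species)

# Print the fitted splits as text, then the training accuracy.
print(tree_fit)
print(training_accuracy)
```

The printed tree lists each split rule in order, which makes this kind of model easy to explain to a non-technical audience.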

11. Final Evaluation of Projects based on real-world use cases - 4 hours

Projects are based on real data and apply the machine learning techniques covered in the course.