Advanced Statistics for Data Science Specialization

Course Title:

Advanced Statistics for Data Science Specialization



Course Description:

Fundamental concepts in probability, statistics, and linear models are the primary building blocks of data science work. Learners aspiring to become biostatisticians or data scientists will benefit from the foundational knowledge offered in this specialization, which explains the mechanics behind key modeling tools in data science, such as least squares and linear regression. The specialization starts with the Mathematical Biostatistics Boot Camps, which cover concepts and methods used in biostatistics applications, ranging from probability, distributions, and likelihood to hypothesis testing and case-control sampling. It then covers linear models for data science, moving from a linear algebraic and mathematical treatment of least squares to statistical linear models, including multivariate regression, using the R programming language. These courses give learners a firm foundation in the linear algebraic treatment of regression modeling, which strengthens an applied data scientist's general understanding of regression models.



Course instructional level:


Advanced

Course Duration:


3 months / 6 months
Hours: 45

Course coordinator:


Prof. Nilanjan Nandy

Course coordinator's profile(s):


Course Contents:



Module/Topic name, with sub-topics:

  1. Mathematical Biostatistics Boot Camp 1
       1. Introduction, Probability, Expectation, and Random Vectors
       2. Conditional Probability, Bayes’ Rule, Likelihood, Distributions, and Asymptotics
       3. Confidence Intervals, Bootstrapping, and Plotting
       4. Binomial Proportions and Logs

  2. Mathematical Biostatistics Boot Camp 2
       1. Hypothesis testing
       2. Two Binomials
       3. Discrete Data Setting
       4. Techniques

  3. Advanced Linear Models for Data Science 1: Least Squares
       1. Background
       2. One and two parameter regression
       3. Linear regression
       4. General least squares
       5. General least squares
       6. Least squares example

  4. Advanced Linear Models for Data Science 2: Statistical Linear Models
       1. Introduction and expected values
       2. The multivariate normal distribution
       3. Distributional results
       4. Residuals


Course Outcomes:


  1. Learn about probability, expectations, conditional probabilities, distributions, confidence intervals, bootstrapping, binomial proportions, and more.
  2. Understand the matrix algebra of linear regression models.
  3. Learn about canonical examples of linear models to relate them to techniques that you may already be using.