Advanced Statistics for Data Science Specialization

Course Title:

Advanced Statistics for Data Science Specialization



Course Description:

Fundamental concepts in probability, statistics, and linear models are primary building blocks for data science work. Learners aspiring to become biostatisticians and data scientists will benefit from the foundational knowledge offered in this specialization. It will enable the learner to understand the behind-the-scenes mechanics of key modeling tools in data science, such as least squares and linear regression. The specialization starts with Mathematical Statistics bootcamps, specifically the concepts and methods used in biostatistics applications. These range from probability, distribution, and likelihood concepts to hypothesis testing and case-control sampling. The specialization also covers linear models for data science, starting with least squares from a linear algebraic and mathematical perspective and moving on to statistical linear models, including multivariate regression using the R programming language. These courses will give learners a firm foundation in the linear algebraic treatment of regression modeling, which will greatly augment applied data scientists' general understanding of regression models.



Course instructional level:

Advanced

Course Duration:

3 Months/6 Months
Hours: 45


Course coordinator:

Brian Caffo, PhD

Prerequisites, if any:

NA

Course coordinator's profile(s):

Brian Caffo, PhD, is a professor in the Department of Biostatistics at the Johns Hopkins University Bloomberg School of Public Health. He graduated from the Department of Statistics at the University of Florida in 2001. He works in the fields of computational statistics and neuroinformatics and co-created the SMART (www.smart-stats.org) working group. He has received the Presidential Early Career Award for Scientists and Engineers (PECASE) as well as the Bloomberg School of Public Health Golden Apple and AMTRA teaching awards.

Course Contents:



Module 1: Mathematical Biostatistics Boot Camp 1
University/organization name: Johns Hopkins University
Course instructor name: Brian Caffo, PhD
Web page link: https://www.coursera.org/learn/biostatistics?specialization=advanced-statistics-data-science
Sub-topics:
  1. Introduction, Probability, Expectation, and Random Vectors
  2. Conditional Probability, Bayes' Rule, Likelihood, Distributions, and Asymptotics
  3. Confidence Intervals, Bootstrapping, and Plotting
  4. Binomial Proportions and Logs

Module 2: Mathematical Biostatistics Boot Camp 2
University/organization name: Johns Hopkins University
Course instructor name: Brian Caffo, PhD
Web page link: https://www.coursera.org/learn/biostatistics-2?specialization=advanced-statistics-data-science
Sub-topics:
  1. Hypothesis testing
  2. Two Binomials
  3. Discrete Data Setting
  4. Techniques

Module 3: Advanced Linear Models for Data Science 1: Least Squares
University/organization name: Johns Hopkins University
Course instructor name: Brian Caffo, PhD
Web page link: https://www.coursera.org/learn/linear-models?specialization=advanced-statistics-data-science#modules
Sub-topics:
  1. Background
  2. One and two parameter regression
  3. Linear regression
  4. General least squares
  5. General least squares
  6. Least squares example

Module 4: Advanced Linear Models for Data Science 2: Statistical Linear Models
University/organization name: Johns Hopkins University
Course instructor name: Brian Caffo, PhD
Web page link: https://www.coursera.org/learn/linear-models-2?specialization=advanced-statistics-data-science
Sub-topics:
  1. Introduction and expected values
  2. The multivariate normal distribution
  3. Distributional results
  4. Residuals


Course Outcomes:

  1. Learn about probability, expectations, conditional probabilities, distributions, confidence intervals, bootstrapping, binomial proportions, and more.
  2. Understand the matrix algebra of linear regression models.
  3. Learn about canonical examples of linear models to relate them to techniques that you may already be using.
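
As a small illustration of the second outcome, the sketch below fits a two-parameter regression (intercept and slope) by solving the normal equations (X^T X) beta = X^T y by hand. It is not taken from the course materials: the course itself uses R, Python is used here only for illustration, the function names fit_line and residuals are hypothetical, and the data values are made up.

```python
# Illustrative sketch only (not from the course materials).
# Fits y = b0 + b1 * x by least squares using the normal equations
# (X^T X) beta = X^T y, written out for the design matrix X = [1, x].


def fit_line(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")

    # Entries of X^T X = [[n, sx], [sx, sxx]] and X^T y = [sy, sxy].
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))

    det = n * sxx - sx * sx  # determinant of X^T X
    if det == 0:
        raise ValueError("all x values are identical; the slope is not identifiable")

    # beta = (X^T X)^{-1} X^T y, using the closed-form 2x2 inverse.
    intercept = (sxx * sy - sx * sxy) / det
    slope = (n * sxy - sx * sy) / det
    return intercept, slope


def residuals(xs, ys, intercept, slope):
    """Return the observed values minus the fitted values."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0]  # made-up predictor values
    ys = [2.0, 1.0, 4.0, 3.0]  # made-up responses
    b0, b1 = fit_line(xs, ys)
    print("intercept:", b0, "slope:", b1)
    print("residuals:", residuals(xs, ys, b0, b1))
```

With an intercept in the model, the residuals sum to zero and are orthogonal to the predictor, up to floating-point error. This is the geometric fact behind the linear algebraic treatment of least squares described above.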