Have you ever wondered why some communities face higher disease burdens than others, how outbreaks are detected before they spiral, or what it truly takes to know whether a treatment works? Behind every major public health decision, from vaccine rollouts to pollution control policies, is a world of numbers, patterns, and hidden signals waiting to be uncovered. This is the power of data in protecting populations.
Data Analytics for Public Health is a live, interactive course that takes you inside this world: where real health questions meet rigorous data-driven investigations. From tracing epidemics to evaluating healthcare interventions, this course shows how modern data science can reveal insights that save lives, guide policies, and shape healthier societies.
Through guided projects using open-access health datasets, students will learn how to transform raw data into meaningful stories. Step by step, you will build the basics of computational analysis and statistical thinking while gaining hands-on experience with Python. Along the way, you will explore predictive models, work with real-world health variables, and use data-visualisation tools such as histograms, boxplots, and more to uncover patterns that shape public health decisions. You won't just learn how to analyse data; you'll learn how to think like a public health data scientist navigating real-world complexity.
The course culminates in a collaborative capstone project, where you will work in small research teams and choose one of three real-world public health challenges. You may predict diabetes risk using lifestyle and health indicators, model the spread of COVID-19 over time using epidemiological frameworks, or investigate how air quality is associated with respiratory ailments. Using the tools you learn in class, including Python, statistical methods, and data-visualisation techniques, you will develop a concise, data-driven analysis that mirrors how modern public health decisions are made.
What's more? Top students may also receive an observership opportunity, where they will connect with PhD researchers developing and applying data-science tools to address pressing public health challenges, gaining first-hand exposure to research in action.
This course is for high schoolers who are curious about healthcare, public health, data, mathematics, or computer science. Students do not need prior coding experience, only the willingness to learn and engage with problem-solving. Familiarity with basic algebra is assumed, and familiarity with vectors or matrices is helpful but not required.
Prerequisite: High proficiency in written & spoken English.
By the end of this programme, you will:
| Week | Lecture Module | Project Module |
|---|---|---|
| Week 1 | What is Health Data Science? | Python Setup + Project Setup |
| Week 2 | Prediction Modeling | Playing with Models |
| Week 3 | Visualisation & Interpretation | Understanding & Communicating Results |
| Week 4 | Big Data in Public Health | Capstone Presentation |

Counselling:
Get a chance to put your questions to the faculty and the mentor, and hear their answers and perspectives. You are encouraged to ask the faculty questions on the following aspects:
Mentoring:
You are encouraged to ask the mentor questions on the following aspects:
Working in small research teams, students will choose one of three high-impact public health challenges and investigate it using the tools and techniques they've learned in class.
Students can choose to:
Harsh Parikh is an Assistant Professor in the Department of Biostatistics and a member of the Data Science and Data Equity Initiative at the Yale School of Public Health. His work develops accurate and trustworthy methods for causal inference, particularly in complex settings where standard assumptions may not hold.
He collaborates broadly across epidemiology, health policy, and critical care medicine, with additional applications in supply chain optimisation and social network analysis.
Before joining Yale, Parikh completed a postdoctoral fellowship at the Johns Hopkins Bloomberg School of Public Health. He holds a PhD in Computer Science from Duke University, where his dissertation received an Outstanding Dissertation Award and an Amazon Graduate Research Fellowship.
Professor Bhramar Mukherjee is the Anna M.R. Lauder Professor of Biostatistics and Professor of Chronic Disease Epidemiology at the Yale School of Public Health, where she serves as the inaugural Senior Associate Dean of Public Health Data Science and Data Equity. She also holds a secondary appointment in the Department of Statistics and Data Science at Yale.
Before joining Yale in 2024, Professor Mukherjee spent over a decade at the University of Michigan, where she was appointed the John D. Kalbfleisch Distinguished University Professor of Biostatistics and became the first woman to serve as Chair of the Department of Biostatistics (2018–2024). She is widely recognised for her contributions to statistical methods for integrating genetic, environmental, and disease data from large-scale healthcare databases.
Professor Mukherjee is the recipient of numerous honours, including the 2023 Karl E. Peace Award for Outstanding Statistical Contributions for the Betterment of Society from the ASA and the 2024 Marvin Zelen Leadership in Statistical Science Award from Harvard Biostatistics. She is a Fellow of the ASA and AAAS, and an elected member of the U.S. National Academy of Medicine.
Grading, Assessments, and Certification
All Ashoka Horizons courses offer a certificate on satisfactory completion of the course.
Class participation will be assessed based on your active engagement in live sessions, contributions to discussion forums, and involvement in Teaching Fellow-led activities. Letters of Academic Achievement will be issued for select students based on exceptional performance in the course.
Achieve More…with Horizons
*For select students, subject to the discretion of the faculty
This programme is administered through an online platform. Students are expected to have a foundational understanding of computer usage, including but not limited to sending emails and conducting Internet searches. Consistent access to the Internet and to a computer that meets the recommended minimum specifications is also required for participation in the programme.
Have a question about the Ashoka Horizons Achievers Programme? Write to us at horizons@ashoka.edu.in
It was a completely different experience from my school studies. Being able to collaborate with my teacher and be able to express my thoughts to them gave me a lot of confidence in my own thinking!
I also developed a deeper appreciation for statistical thinking, critical analysis, and working with real datasets under time pressure.
I was surprised by how approachable and structured the learning was. The focus on foundational ideas like data preprocessing, model types, bias, and interpretability made the complex concepts much clearer.