Principles of Data Science

DSC 10, Winter 2025 at UC San Diego

Janine Tiefenbruck
she/her

jlobue@ucsd.edu

Lecture(s): (A) MWF 10-10:50AM, (B) MWF 11-11:50AM in Center 212

Welcome to DSC 10! Make sure to read this website thoroughly and complete the items in the Getting Started checklist. These are due Wednesday, January 8th at 11:59PM.

Week 1 – Python Basics

Mon Jan 6

LEC 1 Introduction   

CIT 1.0, BPD 1-3

Keywords: course logistics, syllabus, Little Women demo, Jupyter notebooks, expressions

Wed Jan 8

LEC 2 Variables and Data Types   

BPD 3-5

Keywords: variables, assignment, functions, import, methods, int, float, string

DISC 1 Getting Started with Jupyter Notebooks

SUR Welcome Survey

SYL Syllabus Check

PRE Pretest

Fri Jan 10

LEC 3 Lists and Arrays   

BPD 7-8, CIT 14.1

Keywords: mean, median, lists, arrays, array arithmetic, array methods, np.arange

Sat Jan 11

LAB 0 Expressions and Data Types

Week 2 – DataFrames and Visualization

Mon Jan 13

LEC 4 DataFrames

BPD 9

Keywords: read_csv, .get, .assign, .sort_values, .iloc, .loc, .set_index, US states

DISC 2 Practice Problems

Wed Jan 15

LEC 5 Querying and Grouping

BPD 10-11

Keywords: Booleans, querying, .shape, &, |, .take, .groupby, aggregation, .drop

DISC 3 Practice Problems

Fri Jan 17

LEC 6 Data Visualization

CIT 7.0-7.1

Keywords: numerical vs. categorical, scatter plot, line plot, bar chart, exoplanets

Sat Jan 18

LAB 1 Arrays and DataFrames

Week 3 – Histograms and Functions

Mon Jan 20

No Lecture (Martin Luther King Jr. Day)

Tue Jan 21

HW 1 Basic Python, Arrays, and DataFrames

Wed Jan 22

LEC 7 Distributions and Histograms

CIT 7.2-7.3

Keywords: distributions, density histograms, binning, total area, overlaid plots

QUIZ 1 Quiz 1 covers Lectures 1-6

Fri Jan 24

LEC 8 Functions and Applying

BPD 6, 12

Keywords: functions, arguments, print vs. return, .apply, .reset_index

Sat Jan 25

LAB 2 Data Visualizations and Functions

Week 4 – DataFrames, Control Flow, and Probability

Mon Jan 27

LEC 9 Grouping on Multiple Columns, Merging

BPD 11, 13

Keywords: .groupby([col_1, col_2, …]), subgroups, MultiIndex, .merge, number of rows

DISC 4 Practice Problems

Tue Jan 28

HW 2 DataFrames, Data Visualization, and Functions

Wed Jan 29

LEC 10 Conditional Statements and Iteration

CIT 9.0-9.2

Keywords: in, not, and, or, if, else, elif, for-loops, np.append, accumulator pattern

DISC 5 Practice Problems

Fri Jan 31

LEC 11 Probability

CIT 9.5

Keywords: event, conditional prob., multiplication and addition rules, independence

Sat Feb 1

LAB 3 DataFrames, Control Flow, and Probability

Week 5 – Simulations and Sampling

Mon Feb 3

LEC 12 Simulation

CIT 9.3-9.4

Keywords: np.random.choice, replacement, np.count_nonzero, coin flipping, Monty Hall

DISC 6 Practice Problems

Tue Feb 4

HW 3 DataFrames, Control Flow, and Probability

SUR Mid-Quarter Survey

Wed Feb 5

LEC 13 Distributions and Sampling

CIT 10.0-10.4

Keywords: probability vs. empirical distribution, SRS, .sample, parameter, statistic

QUIZ 2 Quiz 2 covers Lectures 7-11

Fri Feb 7

LEC 14 Bootstrapping and Confidence Intervals

CIT 13.0-13.2

Keywords: inference, bootstrapping, resample, np.percentile, confidence interval

Week 6 – Confidence Intervals and the Normal Distribution

Mon Feb 10

EXAM Midterm Exam covers Lectures 1-12

Tue Feb 11

PROJ Midterm Project

Wed Feb 12

LEC 15 Confidence Intervals, Center, and Spread

CIT 13.3-13.4

Keywords: interpreting CIs, robust vs. sensitive, center, standard deviation

DISC 7 Practice Problems

Fri Feb 14

LEC 16 Standardization and the Normal Distribution

CIT 14.2-14.3

Keywords: Chebyshev, standard units, normal distribution, CDF, inflection points

Sat Feb 15

LAB 4 Simulation, Sampling, & Bootstrapping

Week 7 – Central Limit Theorem

Mon Feb 17

No Lecture (Presidents Day)

Tue Feb 18

HW 4 Simulation, Sampling, & Bootstrapping

Wed Feb 19

LEC 17 The Central Limit Theorem

CIT 14.4-14.5

Keywords: distribution of the sample mean, square root law, CLT-based CIs

DISC 8 Practice Problems

Fri Feb 21

LEC 18 Choosing Sample Sizes, Statistical Models

CIT 14.6, 11.1

Keywords: standard deviation of 0s and 1s, np.random.multinomial, Robert Swain jury

Sat Feb 22

LAB 5 Variability and the Normal Distribution

Week 8 – Hypothesis and Permutation Testing

Mon Feb 24

LEC 19 Hypothesis Testing

CIT 11.3

Keywords: null and alternative hypotheses, test statistic, fair or unfair coin

DISC 9 Practice Problems

Tue Feb 25

HW 5 The Normal Distribution and the Central Limit Theorem

Wed Feb 26

LEC 20 Hypothesis Testing and Total Variation Distance

CIT 11.2, 11.4

Keywords: fair or unfair coin, p-value, midterm exam scores, Alameda County jury, TVD

QUIZ 3 Quiz 3 covers Lectures 13-17

Fri Feb 28

LEC 21 TVD, Hypothesis Testing, and Permutation Testing

CIT 12.0-12.1

Keywords: confidence intervals for hypothesis testing, body temperature, smoking/babies

Sat Mar 1

LAB 6 Hypothesis Testing

Week 9 – Prediction

Mon Mar 3

LEC 22 Permutation Testing

CIT 12.3

Keywords: smoking/babies, np.random.permutation, shuffling, Deflategate

DISC 10 Practice Problems

Wed Mar 5

LEC 23 Correlation

CIT 15.0-15.2

Keywords: association, correlation coefficient (r), predicting heights, regression line (standard units)

QUIZ 4 Quiz 4 covers Lectures 18-21

Fri Mar 7

LEC 24 Regression and Least Squares

CIT 15.2-15.4

Keywords: regression line in original units, outliers, errors, RMSE, best fit, least squares

Sat Mar 8

HW 6 Hypothesis Testing and Permutation Testing

Week 10 – Review

Mon Mar 10

LEC 25 Residuals and Inference

CIT 15.5-16.3

Keywords: residuals, residual plots, patterns, datasaurus dozen, prediction intervals

DISC 11 Practice Problems

Tue Mar 11

LAB 7 Regression

Wed Mar 12

LEC 26 Review

QUIZ 5 Quiz 5 covers Lectures 22-25

Thu Mar 13

PROJ Final Project

Fri Mar 14

LEC 27 Review, Conclusion

Sat Mar 15

EXAM Final Exam (7-10PM)

SUR SETs and End-of-Quarter Survey (due 8AM)