Principles of Data Science

DSC 10, Winter 2025 at UC San Diego

Janine Tiefenbruck
she/her

jlobue@ucsd.edu

Lecture(s): (A) MWF 10-10:50AM, (B) MWF 11-11:50AM in Center 212

Welcome to DSC 10! Make sure to read this website thoroughly and complete the items in the Getting Started checklist.

Week 1 – Python Basics

Mon Jan 6

LEC 1 Introduction   

CIT 1.0, BPD 1-3

Keywords: course logistics, syllabus, Little Women demo, Jupyter notebooks, expressions

Wed Jan 8

LEC 2 Variables and Data Types   

BPD 3-5

Keywords: variables, assignment, functions, import, methods, int, float, string

DISC 1 Getting Started with Jupyter Notebooks

SUR Welcome Survey

SYL Syllabus Check

PRE Pretest

Fri Jan 10

LEC 3 Lists and Arrays   

BPD 7-8, CIT 14.1

Keywords: mean, median, lists, arrays, array arithmetic, array methods, np.arange

Sat Jan 11

LAB 0 Expressions and Data Types

Week 2 – DataFrames and Visualization

Mon Jan 13

LEC 4 DataFrames   

BPD 9

Keywords: read_csv, .get, .assign, .sort_values, .iloc, .loc, .set_index, US states

DISC 2 Basic Python and Arrays

Wed Jan 15

LEC 5 Querying and Grouping   

BPD 10-11

Keywords: Booleans, querying, .shape, &, |, .take, .groupby, aggregation, .drop

DISC 3 DataFrames, Querying, and Grouping

Fri Jan 17

LEC 6 Data Visualization   

CIT 7.0-7.1

Keywords: numerical vs. categorical, scatter plot, line plot, bar chart, exoplanets

Sat Jan 18

LAB 1 Arrays and DataFrames

Week 3 – Histograms and Functions

Mon Jan 20

No Lecture (Martin Luther King Jr. Day)

Tue Jan 21

HW 1 Basic Python, Arrays, and DataFrames

Wed Jan 22

LEC 7 Distributions and Histograms

CIT 7.2-7.3

Keywords: distributions, density histograms, binning, total area, overlaid plots

QUIZ 1 Quiz 1 covers Lectures 1-6

Fri Jan 24

LEC 8 Functions and Applying

BPD 6, 12

Keywords: functions, arguments, print vs. return, .apply, .reset_index

Sat Jan 25

LAB 2 Data Visualizations and Functions

Week 4 – DataFrames, Control Flow, and Probability

Mon Jan 27

LEC 9 Grouping on Multiple Columns, Merging

BPD 11, 13

Keywords: .groupby([col_1, col_2, …]), subgroups, MultiIndex, .merge, number of rows

DISC 4 Practice Problems

Tue Jan 28

HW 2 DataFrames, Data Visualization, and Functions

Wed Jan 29

LEC 10 Conditional Statements and Iteration

CIT 9.0-9.2

Keywords: in, not, and, or, if, else, elif, for-loops, np.append, accumulator pattern

DISC 5 Practice Problems

Fri Jan 31

LEC 11 Probability

CIT 9.5

Keywords: event, conditional prob., multiplication and addition rules, independence

Sat Feb 1

LAB 3 DataFrames, Control Flow, and Probability

Week 5 – Simulations and Sampling

Mon Feb 3

LEC 12 Simulation

CIT 9.3-9.4

Keywords: np.random.choice, replacement, np.count_nonzero, coin flipping, Monty Hall

DISC 6 Practice Problems

Tue Feb 4

HW 3 DataFrames, Control Flow, and Probability

SUR Mid-Quarter Survey

Wed Feb 5

LEC 13 Distributions and Sampling

CIT 10.0-10.4

Keywords: probability vs. empirical distribution, SRS, .sample, parameter, statistic

QUIZ 2 Quiz 2 covers Lectures 7-11

Fri Feb 7

LEC 14 Bootstrapping and Confidence Intervals

CIT 13.0-13.2

Keywords: inference, bootstrapping, resample, np.percentile, confidence interval

Week 6 – Confidence Intervals and the Normal Distribution

Mon Feb 10

EXAM Midterm Exam covers Lectures 1-12

Tue Feb 11

PROJ Midterm Project

Wed Feb 12

LEC 15 Confidence Intervals, Center, and Spread

CIT 13.3-13.4

Keywords: interpreting CIs, robust vs. sensitive, center, standard deviation

DISC 7 Practice Problems

Fri Feb 14

LEC 16 Standardization and the Normal Distribution

CIT 14.2-14.3

Keywords: Chebyshev, standard units, normal distribution, CDF, inflection points

Sat Feb 15

LAB 4 Simulation, Sampling, & Bootstrapping

Week 7 – Central Limit Theorem

Mon Feb 17

No Lecture (Presidents Day)

Tue Feb 18

HW 4 Simulation, Sampling, & Bootstrapping

Wed Feb 19

LEC 17 The Central Limit Theorem

CIT 14.4-14.5

Keywords: distribution of the sample mean, square root law, CLT-based CIs

DISC 8 Practice Problems

Fri Feb 21

LEC 18 Choosing Sample Sizes, Statistical Models

CIT 14.6, 11.1

Keywords: standard deviation of 0s and 1s, np.random.multinomial, Robert Swain jury

Sat Feb 22

LAB 5 Variability and the Normal Distribution

Week 8 – Hypothesis and Permutation Testing

Mon Feb 24

LEC 19 Hypothesis Testing

CIT 11.3

Keywords: null and alternative hypotheses, test statistic, fair or unfair coin

DISC 9 Practice Problems

Tue Feb 25

HW 5 The Normal Distribution and the Central Limit Theorem

Wed Feb 26

LEC 20 Hypothesis Testing and Total Variation Distance

CIT 11.2, 11.4

Keywords: fair or unfair coin, p-value, midterm exam scores, Alameda County jury, TVD

QUIZ 3 Quiz 3 covers Lectures 13-17

Fri Feb 28

LEC 21 TVD, Hypothesis Testing, and Permutation Testing

CIT 12.0-12.1

Keywords: confidence intervals for hypothesis testing, body temperature, smoking/babies

Sat Mar 1

LAB 6 Hypothesis Testing

Week 9 – Prediction

Mon Mar 3

LEC 22 Permutation Testing

CIT 12.3

Keywords: smoking/babies, np.random.permutation, shuffling, Deflategate

DISC 10 Practice Problems

Wed Mar 5

LEC 23 Correlation

CIT 15.0-15.2

Keywords: association, correlation coefficient (r), predicting heights, regression line in standard units

QUIZ 4 Quiz 4 covers Lectures 18-21

Fri Mar 7

LEC 24 Regression and Least Squares

CIT 15.2-15.4

Keywords: regression line in original units, outliers, errors, RMSE, best fit, least squares

Sat Mar 8

HW 6 Hypothesis Testing and Permutation Testing

Week 10 – Review

Mon Mar 10

LEC 25 Residuals and Inference

CIT 15.5-16.3

Keywords: residuals, residual plots, patterns, datasaurus dozen, prediction intervals

DISC 11 Practice Problems

Tue Mar 11

LAB 7 Regression

Wed Mar 12

LEC 26 Review

QUIZ 5 Quiz 5 covers Lectures 22-25

Thu Mar 13

PROJ Final Project

Fri Mar 14

LEC 27 Review, Conclusion

Sat Mar 15

EXAM Final Exam (7-10PM)

SUR SETs and End-of-Quarter Survey (due 8AM)