Principles of Data Science
DSC 10, Winter 2025 at UC San Diego
Janine Tiefenbruck (she/her)
Lecture(s): (A) MWF 10-10:50AM, (B) MWF 11-11:50AM in Center 212
Welcome to DSC 10! Make sure to read this website thoroughly and complete the items in the Getting Started checklist.
Week 1 – Python Basics
- Mon Jan 6
- Keywords: course logistics, syllabus, Little Women demo, Jupyter notebooks, expressions
- Wed Jan 8
LEC 2 Variables and Data Types
Keywords: variables, assignment, functions, import, methods, int, float, string
DISC 1 Getting Started with Jupyter Notebooks
SUR Welcome Survey
SYL Syllabus Check
PRE Pretest
- Fri Jan 10
- Keywords: mean, median, lists, arrays, array arithmetic, array methods, np.arange
- Sat Jan 11
Week 2 – DataFrames and Visualization
- Mon Jan 13
- Keywords: read_csv, .get, .assign, .sort_values, .iloc, .loc, .set_index, US states
DISC 2 Basic Python and Arrays
- Wed Jan 15
- Keywords: Booleans, querying, .shape, &, |, .take, .groupby, aggregation, .drop
- Fri Jan 17
- Keywords: numerical vs. categorical, scatter plot, line plot, bar chart, exoplanets
- Sat Jan 18
LAB 1 Arrays and DataFrames
Week 3 – Histograms and Functions
- Mon Jan 20
No Lecture (Martin Luther King Jr. Day)
- Tue Jan 21
- Wed Jan 22
LEC 7 Distributions and Histograms
Keywords: distributions, density histograms, binning, total area, overlaid plots
QUIZ 1 Quiz 1 covers Lectures 1-6
- Fri Jan 24
LEC 8 Functions and Applying
Keywords: functions, arguments, print vs. return, .apply, .reset_index
- Sat Jan 25
LAB 2 Data Visualizations and Functions
Week 4 – DataFrames, Control Flow, and Probability
- Mon Jan 27
LEC 9 Grouping on Multiple Columns, Merging
Keywords: .groupby([col_1, col_2, …]), subgroups, MultiIndex, .merge, number of rows
DISC 4 Practice Problems
- Tue Jan 28
HW 2 DataFrames, Data Visualization, and Functions
- Wed Jan 29
LEC 10 Conditional Statements and Iteration
Keywords: in, not, and, or, if, else, elif, for-loops, np.append, accumulator pattern
DISC 5 Practice Problems
- Fri Jan 31
LEC 11 Probability
Keywords: event, conditional prob., multiplication and addition rules, independence
- Sat Feb 1
LAB 3 DataFrames, Control Flow, and Probability
Week 5 – Simulations and Sampling
- Mon Feb 3
LEC 12 Simulation
Keywords: np.random.choice, replacement, np.count_nonzero, coin flipping, Monty Hall
DISC 6 Practice Problems
- Tue Feb 4
HW 3 DataFrames, Control Flow, and Probability
SUR Mid-Quarter Survey
- Wed Feb 5
LEC 13 Distributions and Sampling
Keywords: probability vs. empirical distribution, SRS, .sample, parameter, statistic
QUIZ 2 Quiz 2 covers Lectures 7-11
- Fri Feb 7
LEC 14 Bootstrapping and Confidence Intervals
Keywords: inference, bootstrapping, resample, np.percentile, confidence interval
Week 6 – Confidence Intervals and the Normal Distribution
- Mon Feb 10
EXAM Midterm Exam covers Lectures 1-12
- Tue Feb 11
PROJ Midterm Project
- Wed Feb 12
LEC 15 Confidence Intervals, Center, and Spread
Keywords: interpreting CIs, robust vs. sensitive, center, standard deviation
DISC 7 Practice Problems
- Fri Feb 14
LEC 16 Standardization and the Normal Distribution
Keywords: Chebyshev, standard units, normal distribution, CDF, inflection points
- Sat Feb 15
LAB 4 Simulation, Sampling, & Bootstrapping
Week 7 – Central Limit Theorem
- Mon Feb 17
No Lecture (Presidents Day)
- Tue Feb 18
HW 4 Simulation, Sampling, & Bootstrapping
- Wed Feb 19
LEC 17 The Central Limit Theorem
Keywords: distribution of the sample mean, square root law, CLT-based CIs
DISC 8 Practice Problems
- Fri Feb 21
LEC 18 Choosing Sample Sizes, Statistical Models
Keywords: standard deviation of 0s and 1s, np.random.multinomial, Robert Swain jury
- Sat Feb 22
LAB 5 Variability and the Normal Distribution
Week 8 – Hypothesis and Permutation Testing
- Mon Feb 24
LEC 19 Hypothesis Testing
Keywords: null and alternative hypotheses, test statistic, fair or unfair coin
DISC 9 Practice Problems
- Tue Feb 25
HW 5 The Normal Distribution and the Central Limit Theorem
- Wed Feb 26
LEC 20 Hypothesis Testing and Total Variation Distance
Keywords: fair or unfair coin, p-value, midterm exam scores, Alameda County jury, TVD
QUIZ 3 Quiz 3 covers Lectures 13-17
- Fri Feb 28
LEC 21 TVD, Hypothesis Testing, and Permutation Testing
Keywords: confidence intervals for hypothesis testing, body temperature, smoking/babies
- Sat Mar 1
LAB 6 Hypothesis Testing
Week 9 – Prediction
- Mon Mar 3
LEC 22 Permutation Testing
Keywords: smoking/babies, np.random.permutation, shuffling, Deflategate
DISC 10 Practice Problems
- Wed Mar 5
LEC 23 Correlation
Keywords: association, correlation coefficient (r), predicting heights, regression line (su)
QUIZ 4 Quiz 4 covers Lectures 18-21
- Fri Mar 7
LEC 24 Regression and Least Squares
Keywords: regression line in original units, outliers, errors, RMSE, best fit, least squares
- Sat Mar 8
HW 6 Hypothesis Testing and Permutation Testing
Week 10 – Review
- Mon Mar 10
LEC 25 Residuals and Inference
Keywords: residuals, residual plots, patterns, datasaurus dozen, prediction intervals
DISC 11 Practice Problems
- Tue Mar 11
LAB 7 Regression
- Wed Mar 12
LEC 26 Review
QUIZ 5 Quiz 5 covers Lectures 22-25
- Thu Mar 13
PROJ Final Project
- Fri Mar 14
LEC 27 Review, Conclusion
- Sat Mar 15
EXAM Final Exam (7-10PM)
SUR SETs and End-of-Quarter Survey (due 8AM)