Principles of Data Science
DSC 10, Winter 2025 at UC San Diego
Janine Tiefenbruck (she/her)
Lecture(s): (A) MWF 10-10:50AM, (B) MWF 11-11:50AM in Center 212
Welcome to DSC 10! Make sure to read this website thoroughly and complete the items in the Getting Started checklist. These are due Wednesday, January 8th at 11:59PM.
Week 1 – Python Basics
- Mon Jan 6
Keywords: course logistics, syllabus, Little Women demo, Jupyter notebooks, expressions
- Wed Jan 8
LEC 2 Variables and Data Types
Keywords: variables, assignment, functions, import, methods, int, float, string
DISC 1 Getting Started with Jupyter Notebooks
SUR Welcome Survey
SYL Syllabus Check
PRE Pretest
- Fri Jan 10
Keywords: mean, median, lists, arrays, array arithmetic, array methods, np.arange
- Sat Jan 11
Week 2 – DataFrames and Visualization
- Mon Jan 13
LEC 4 DataFrames
Keywords: read_csv, .get, .assign, .sort_values, .iloc, .loc, .set_index, US states
DISC 2 Practice Problems
- Wed Jan 15
LEC 5 Querying and Grouping
Keywords: Booleans, querying, .shape, &, |, .take, .groupby, aggregation, .drop
DISC 3 Practice Problems
- Fri Jan 17
LEC 6 Data Visualization
Keywords: numerical vs. categorical, scatter plot, line plot, bar chart, exoplanets
- Sat Jan 18
LAB 1 Arrays and DataFrames
Week 3 – Histograms and Functions
- Mon Jan 20
No Lecture (Martin Luther King Jr. Day)
- Tue Jan 21
HW 1 Basic Python, Arrays, and DataFrames
- Wed Jan 22
LEC 7 Distributions and Histograms
Keywords: distributions, density histograms, binning, total area, overlaid plots
QUIZ 1 Quiz 1 covers Lectures 1-6
- Fri Jan 24
LEC 8 Functions and Applying
Keywords: functions, arguments, print vs. return, .apply, .reset_index
- Sat Jan 25
LAB 2 Data Visualizations and Functions
Week 4 – DataFrames, Control Flow, and Probability
- Mon Jan 27
LEC 9 Grouping on Multiple Columns, Merging
Keywords: .groupby([col_1, col_2, …]), subgroups, MultiIndex, .merge, number of rows
DISC 4 Practice Problems
- Tue Jan 28
HW 2 DataFrames, Data Visualization, and Functions
- Wed Jan 29
LEC 10 Conditional Statements and Iteration
Keywords: in, not, and, or, if, else, elif, for-loops, np.append, accumulator pattern
DISC 5 Practice Problems
- Fri Jan 31
LEC 11 Probability
Keywords: event, conditional prob., multiplication and addition rules, independence
- Sat Feb 1
LAB 3 DataFrames, Control Flow, and Probability
Week 5 – Simulations and Sampling
- Mon Feb 3
LEC 12 Simulation
Keywords: np.random.choice, replacement, np.count_nonzero, coin flipping, Monty Hall
DISC 6 Practice Problems
- Tue Feb 4
HW 3 DataFrames, Control Flow, and Probability
SUR Mid-Quarter Survey
- Wed Feb 5
LEC 13 Distributions and Sampling
Keywords: probability vs. empirical distribution, SRS, .sample, parameter, statistic
QUIZ 2 Quiz 2 covers Lectures 7-11
- Fri Feb 7
LEC 14 Bootstrapping and Confidence Intervals
Keywords: inference, bootstrapping, resample, np.percentile, confidence interval
Week 6 – Confidence Intervals and the Normal Distribution
- Mon Feb 10
EXAM Midterm Exam covers Lectures 1-12
- Tue Feb 11
PROJ Midterm Project
- Wed Feb 12
LEC 15 Confidence Intervals, Center, and Spread
Keywords: interpreting CIs, robust vs. sensitive, center, standard deviation
DISC 7 Practice Problems
- Fri Feb 14
LEC 16 Standardization and the Normal Distribution
Keywords: Chebyshev, standard units, normal distribution, CDF, inflection points
- Sat Feb 15
LAB 4 Simulation, Sampling, & Bootstrapping
Week 7 – Central Limit Theorem
- Mon Feb 17
No Lecture (Presidents Day)
- Tue Feb 18
HW 4 Simulation, Sampling, & Bootstrapping
- Wed Feb 19
LEC 17 The Central Limit Theorem
Keywords: distribution of the sample mean, square root law, CLT-based CIs
DISC 8 Practice Problems
- Fri Feb 21
LEC 18 Choosing Sample Sizes, Statistical Models
Keywords: standard deviation of 0s and 1s, np.random.multinomial, Robert Swain jury
- Sat Feb 22
LAB 5 Variability and the Normal Distribution
Week 8 – Hypothesis and Permutation Testing
- Mon Feb 24
LEC 19 Hypothesis Testing
Keywords: null and alternative hypotheses, test statistic, fair or unfair coin
DISC 9 Practice Problems
- Tue Feb 25
HW 5 The Normal Distribution and the Central Limit Theorem
- Wed Feb 26
LEC 20 Hypothesis Testing and Total Variation Distance
Keywords: fair or unfair coin, p-value, midterm exam scores, Alameda County jury, TVD
QUIZ 3 Quiz 3 covers Lectures 13-17
- Fri Feb 28
LEC 21 TVD, Hypothesis Testing, and Permutation Testing
Keywords: confidence intervals for hypothesis testing, body temperature, smoking/babies
- Sat Mar 1
LAB 6 Hypothesis Testing
Week 9 – Prediction
- Mon Mar 3
LEC 22 Permutation Testing
Keywords: smoking/babies, np.random.permutation, shuffling, Deflategate
DISC 10 Practice Problems
- Wed Mar 5
LEC 23 Correlation
Keywords: association, correlation coefficient (r), predicting heights, regression line (su)
QUIZ 4 Quiz 4 covers Lectures 18-21
- Fri Mar 7
LEC 24 Regression and Least Squares
Keywords: regression line in original units, outliers, errors, RMSE, best fit, least squares
- Sat Mar 8
HW 6 Hypothesis Testing and Permutation Testing
Week 10 – Review
- Mon Mar 10
LEC 25 Residuals and Inference
Keywords: residuals, residual plots, patterns, datasaurus dozen, prediction intervals
DISC 11 Practice Problems
- Tue Mar 11
LAB 7 Regression
- Wed Mar 12
LEC 26 Review
QUIZ 5 Quiz 5 covers Lectures 22-25
- Thu Mar 13
PROJ Final Project
- Fri Mar 14
LEC 27 Review, Conclusion
- Sat Mar 15
EXAM Final Exam (7-10PM)
SUR SETs and End-of-Quarter Survey (due 8AM)