# Imports
import babypandas as bpd
import numpy as np
import plotly.express as px
import matplotlib.pyplot as plt
plt.style.use('ggplot')
Welcome to DSC 10! 👋¶
- A guided tour of data science.
- Developed by UC Berkeley in 2015.
- Adapted by UC San Diego in 2017.
- Learn just enough programming and statistics to do data science.
- Statistics without too much math, mostly simulation.
- Lays the foundation for all other courses in the DSC major.
Agenda¶
- Course staff.
- What is data science?
- How will this course run?
- Jumping into Python basics:
- Doing basic calculations in Python
- Variables and assignment
- Data types: how do we represent numbers?
Course staff¶
Instructor: Samantha Chen (call me Sam)¶
- Originally from Midland, Michigan (〽️)
- BS (’20) in Mathematics and Computer Science from Carleton College (Northfield, Minnesota) and now in the 5th year of my PhD here at UCSD
- Teaching
- I was a teaching assistant in Fall term 2023 for CSE 20: Discrete Mathematics for Computer Science
- This is the first course I'm teaching!
- Outside the classroom: watching movies, reading books, hanging out at cafes, swimming, climbing, (short) hikes etc.

TA: Ashley Ho¶
- Originally from Irvine, CA.
- BS in Probability & Statistics, minor in Data Science from UCSD in 2024.
- Currently a 1st year Data Science Master's student at UCSD.
- Previously tutored DSC 10 four times. Tutored DSC 30 and DSC 80 as well.
- Fourth time TA-ing DSC 10!
- Outside interests: piano, cooking, movies & TV shows, my dog!

What is "data science"? 🤔¶

What is "data science"?¶
Data science is about drawing useful conclusions from data using computation. Throughout the quarter, we'll touch on several aspects of data science:
- First 2 weeks: use Python to explore data.
- Lots of visualization 📈📊 and "data manipulation", using industry-standard tools.
- Next 2 weeks: use data to infer about a population, given just a sample.
- Rely heavily on simulation, rather than formulas.
- Last 1 week: use data from the past to predict what may happen in the future.
- A taste of machine learning 🤖.
Data science is more relevant than ever 🤧¶
We spent all of 2020 looking at graphs that look like this:

As of March 2023, both the New York Times and Johns Hopkins have stopped updating their COVID dashboards.
It can be fun, too!¶
The site The Pudding is home to several interactive data-rich articles.


Course logistics¶
Course website¶
The course website is your one-stop-shop for all things related to the course.
Right now: dsc-courses.github.io/dsc10-2025-su
This is where lectures, homeworks, labs, practice problems, and all other content will be posted. Check it often, and read the syllabus!
Getting set up¶
- Campuswire: Q&A forum. All announcements will be made here. You should have gotten an email invitation; if not, there's a link on the syllabus.
- Gradescope: Where you will submit all assignments for autograding, and where all of your grades will live. You should have been automatically added; contact us if not.
- DataHub: Where you will access and run all code in this class. Access at datahub.ucsd.edu.
- We will not be using Canvas for anything!
In addition, you must also fill out the Welcome Survey.
Lecture¶
- Lectures will be hybrid: there is an in-person and zoom option. All lectures will also be recorded for viewing afterwards.
- In person: 9:30am-10:50am, Zoom: https://ucsd.zoom.us/j/97178497084?pwd=pv6DgQLlRPey6fHcUnYa9SCOobfiCQ.1 (password to zoom room should have been sent to you via email).
- Recordings can be found at podcast.ucsd.edu and on the course website.
- Participation (either in person or on Zoom) counts for 2% of your overall grade. You will get full points if you attend 14 of the 19 lectures. If you filled out the pre-course survey, we'll count that towards your lecture total.
- Slides/code from lecture will be linked on the course website, both in a "runnable" code format and as an HTML file (✏️), which you can save as a PDF and annotate on your tablet.
- We will try to make lectures engaging. Bring your laptop or tablet, if you have one.
Labs¶
- Labs refer to lab assignments, which are a required part of the course and help you develop fluency in Python and working with data.
- While working on labs, you'll be able to run autograder tests which tell you if your answers are correct.
- For labs, if you pass all autograder tests, you will get 100%!
- You must submit labs individually, but you can discuss ideas with others (no sharing code).
- The first lab is due this Thursday 07/03/2025 @11:59PM, PST.
Homeworks and projects¶
- Weekly homework assignments build off of skills you develop in labs.
- A key difference between homeworks and labs is that passing autograder tests does not guarantee a perfect score!
- In homeworks, we have "hidden tests" that are only run after you submit the assignment.
- The tests that are available to you within the assignment itself only verify that your answer is reasonable/on the right track.
- Again, you must work on homeworks yourself, but you can discuss ideas with other students (no sharing code).
- The homework schedule can be found on the course website!
- In the Final Project, you will do a deep dive into a dataset! Projects are longer than homeworks, so we give you more time to work on them.
- This quarter's projects: UCSD Admissions 💯 and Meteorite Landings ☄️.
- You can work on projects with partners, following these project partner guidelines. Both of you should actively contribute to all parts of the project.
Quizzes¶
Instead of using the Monday 1pm-5pm block in the schedule for lab/discussion, we will be having 4 oral quizzes in this class. They make up 8% of your grade and are pass/fail.
- You will meet either me or Ashley for 5 minutes each Monday (starting next Monday). To get through all students, we have blocked out 1pm-3:30pm for this purpose. From 3:30pm-5:00pm, I plan on having office hours online.
- We will ask one question about topics covered in the previous week.
- You can make mistakes while answering and still pass! We just want to hear your thought process for problem solving.
Exams¶
We will have two exams this quarter.
- Midterm Exam: Friday, July 18th, during lecture.
- Final Exam: Saturday, August 2nd, 8-11AM.
- Both exams will be conducted remotely. However, you must turn on your camera during the exams and turn it so your hands and exam sheet are visible.
Readings and resources¶
- We will draw readings from two sources. Readings for each lecture will be posted on the course homepage.
- Computational and Inferential Thinking (CIT), the textbook created for Berkeley's version of this course.
- babypandas notes, written specifically for the first part of DSC 10.
- The Resources tab of the course website contains links to helpful resources that you'll want to use throughout the course (e.g. DSC 10 Reference Sheet, programming tutorials, supplemental videos).
- The Debugging tab of the course website has answers to many common technical issues.
First assignment¶
- Lab 0 is due this Thursday 07/03/2025 @11:59PM, PST
- It should be released on the course website. Let me know ASAP if you have not yet gotten access to DataHub.
- 🚨 Important: Start early and submit often.
Getting help¶
This is a tough, fast-paced course, but we're here to help you – here's how:
- Office Hours (OH).
- Held both remotely and in-person!
- Come with questions, or just to work!
- See the schedule and instructions on the Calendar 📆.
- Campuswire
- Post here with any logistical or conceptual questions (please don't email).
- No code or solutions in public posts. Such posts should be private to course staff.
- Otherwise, post publicly (anonymously, if you'd like).
- 🚨 Important: Use these to your advantage!
Advice from previous students¶
At the end of each quarter, we ask DSC 10 students to give advice to future students in the course. Here are some responses from Winter 2023:
Start the assignments early, every time that I started an assignment the day or even night of, I always struggled and the added pressure of not getting it in on time didn't help me one bit. The times that I started a day or two in advance, even if it was just completing a couple problems in advance, I felt way more relaxed and in turn I learned and retained a lot more.
Pay attention in lectures and to begin both labs and homework early because they will pile up. The lectures are very helpful references to use if you’re stuck during labs and homework’s and office hours are incredibly useful so go!!!
Use TA's and office hours as much as possible, also the reference sheet was crucial.
Collaboration¶
Asking questions is highly encouraged!¶
- Discuss all questions with each other (except exams).
- Submit lab assignments individually, but you can work with others (no sharing code).
- Submit homeworks individually, but you can discuss problem-solving strategies with others (no sharing code).
- Submit projects individually or in pairs.
The limits of collaboration:¶
- Don't share solutions with each other or look at someone’s code.
- Project partners should both contribute to all parts of the project. Don't split up the project.
- Don't use ChatGPT or GitHub Copilot – all work you submit should be written by you.
- Academic integrity violations usually result in failing the course.
We're here for you!¶
Regardless of your background, you can succeed in this course. No prior programming or statistics experience will be assumed!
Watch on YouTube: We’re All Data Scientists | Rebecca Nugent | TEDxCMU.
Campus resources¶
Counseling and Psychological Services (CAPS) is a campus unit that offers “short term counseling for academic, career, and personal issues and also offers psychiatry services for circumstances when medication can help with counseling.” If you or anyone you know is ever in need of mental health care, you should contact CAPS.
caps.ucsd.edu
Demo¶
Little Women (1868)¶
- Little Women, by Louisa May Alcott, is a novel that follows the life of four sisters – Meg, Jo, Beth, and Amy.
- A movie based on the novel was released in 2019, starring Emma Watson (Meg) and Timothée Chalamet (Laurie).
- Using tools from this class, we'll learn (a bit) about the plot of the book, without reading it.
- Do not worry about any of this code – we'll cover the necessary pieces in the weeks to come. Sit back and relax!
# Read in 'lw.txt' to a variable called little_women_text.
little_women_text = open('data/lw.txt').read()
# See the first three thousand characters.
little_women_text[:3000]
'The Project Gutenberg EBook of Little Women, by Louisa May Alcott\n\nThis eBook is for the use of anyone anywhere at no cost and with\nalmost no restrictions whatsoever. You may copy it, give it away or\nre-use it under the terms of the Project Gutenberg License included\nwith this eBook or online at www.gutenberg.net\n\n\nTitle: Little Women\n\nAuthor: Louisa May Alcott\n\nPosting Date: September 13, 2008 [EBook #514]\nRelease Date: May, 1996\n[This file last updated on August 19, 2010]\n\nLanguage: English\n\n\n*** START OF THIS PROJECT GUTENBERG EBOOK LITTLE WOMEN ***\n\n\n\n\nLITTLE WOMEN\n\n\nby\n\nLouisa May Alcott\n\n\n\n\nCONTENTS\n\n\nPART 1\n\n ONE PLAYING PILGRIMS\n TWO A MERRY CHRISTMAS\n THREE THE LAURENCE BOY\n FOUR BURDENS\n FIVE BEING NEIGHBORLY\n SIX BETH FINDS THE PALACE BEAUTIFUL\n SEVEN AMY\'S VALLEY OF HUMILIATION\n EIGHT JO MEETS APOLLYON\n NINE MEG GOES TO VANITY FAIR\n TEN THE P.C. AND P.O.\n ELEVEN EXPERIMENTS\n TWELVE CAMP LAURENCE\n THIRTEEN CASTLES IN THE AIR\n FOURTEEN SECRETS\n FIFTEEN A TELEGRAM\n SIXTEEN LETTERS\n SEVENTEEN LITTLE FAITHFUL\n EIGHTEEN DARK DAYS\n NINETEEN AMY\'S WILL\n TWENTY CONFIDENTIAL\n TWENTY-ONE LAURIE MAKES MISCHIEF, AND JO MAKES PEACE\n TWENTY-TWO PLEASANT MEADOWS\n TWENTY-THREE AUNT MARCH SETTLES THE QUESTION\n\n\nPART 2\n\n TWENTY-FOUR GOSSIP\n TWENTY-FIVE THE FIRST WEDDING\n TWENTY-SIX ARTISTIC ATTEMPTS\n TWENTY-SEVEN LITERARY LESSONS\n TWENTY-EIGHT DOMESTIC EXPERIENCES\n TWENTY-NINE CALLS\n THIRTY CONSEQUENCES\n THIRTY-ONE OUR FOREIGN CORRESPONDENT\n THIRTY-TWO TENDER TROUBLES\n THIRTY-THREE JO\'S JOURNAL\n THIRTY-FOUR FRIEND\n THIRTY-FIVE HEARTACHE\n THIRTY-SIX BETH\'S SECRET\n THIRTY-SEVEN NEW IMPRESSIONS\n THIRTY-EIGHT ON THE SHELF\n THIRTY-NINE LAZY LAURENCE\n FORTY THE VALLEY OF THE SHADOW\n FORTY-ONE LEARNING TO FORGET\n FORTY-TWO ALL ALONE\n FORTY-THREE SURPRISES\n FORTY-FOUR MY LORD AND LADY\n FORTY-FIVE DAISY AND DEMI\n FORTY-SIX UNDER THE UMBRELLA\n FORTY-SEVEN HARVEST TIME\n\n\n\nCHAPTER 
ONE\n\nPLAYING PILGRIMS\n\n"Christmas won\'t be Christmas without any presents," grumbled Jo, lying\non the rug.\n\n"It\'s so dreadful to be poor!" sighed Meg, looking down at her old\ndress.\n\n"I don\'t think it\'s fair for some girls to have plenty of pretty\nthings, and other girls nothing at all," added little Amy, with an\ninjured sniff.\n\n"We\'ve got Father and Mother, and each other," said Beth contentedly\nfrom her corner.\n\nThe four young faces on which the firelight shone brightened at the\ncheerful words, but darkened again as Jo said sadly, "We haven\'t got\nFather, and shall not have him for a long time." She didn\'t say\n"perhaps never," but each silently added it, thinking of Father far\naway, where the fighting was.\n\nNobody spoke for a minute; then Meg said in an altered tone, "You know\nthe reason Mother proposed not having any presents this Christmas was\nbecause it is going to b'
# Print the first three thousand characters.
print(little_women_text[:3000])
The Project Gutenberg EBook of Little Women, by Louisa May Alcott This eBook is for the use of anyone anywhere at no cost and with almost no restrictions whatsoever. You may copy it, give it away or re-use it under the terms of the Project Gutenberg License included with this eBook or online at www.gutenberg.net Title: Little Women Author: Louisa May Alcott Posting Date: September 13, 2008 [EBook #514] Release Date: May, 1996 [This file last updated on August 19, 2010] Language: English *** START OF THIS PROJECT GUTENBERG EBOOK LITTLE WOMEN *** LITTLE WOMEN by Louisa May Alcott CONTENTS PART 1 ONE PLAYING PILGRIMS TWO A MERRY CHRISTMAS THREE THE LAURENCE BOY FOUR BURDENS FIVE BEING NEIGHBORLY SIX BETH FINDS THE PALACE BEAUTIFUL SEVEN AMY'S VALLEY OF HUMILIATION EIGHT JO MEETS APOLLYON NINE MEG GOES TO VANITY FAIR TEN THE P.C. AND P.O. ELEVEN EXPERIMENTS TWELVE CAMP LAURENCE THIRTEEN CASTLES IN THE AIR FOURTEEN SECRETS FIFTEEN A TELEGRAM SIXTEEN LETTERS SEVENTEEN LITTLE FAITHFUL EIGHTEEN DARK DAYS NINETEEN AMY'S WILL TWENTY CONFIDENTIAL TWENTY-ONE LAURIE MAKES MISCHIEF, AND JO MAKES PEACE TWENTY-TWO PLEASANT MEADOWS TWENTY-THREE AUNT MARCH SETTLES THE QUESTION PART 2 TWENTY-FOUR GOSSIP TWENTY-FIVE THE FIRST WEDDING TWENTY-SIX ARTISTIC ATTEMPTS TWENTY-SEVEN LITERARY LESSONS TWENTY-EIGHT DOMESTIC EXPERIENCES TWENTY-NINE CALLS THIRTY CONSEQUENCES THIRTY-ONE OUR FOREIGN CORRESPONDENT THIRTY-TWO TENDER TROUBLES THIRTY-THREE JO'S JOURNAL THIRTY-FOUR FRIEND THIRTY-FIVE HEARTACHE THIRTY-SIX BETH'S SECRET THIRTY-SEVEN NEW IMPRESSIONS THIRTY-EIGHT ON THE SHELF THIRTY-NINE LAZY LAURENCE FORTY THE VALLEY OF THE SHADOW FORTY-ONE LEARNING TO FORGET FORTY-TWO ALL ALONE FORTY-THREE SURPRISES FORTY-FOUR MY LORD AND LADY FORTY-FIVE DAISY AND DEMI FORTY-SIX UNDER THE UMBRELLA FORTY-SEVEN HARVEST TIME CHAPTER ONE PLAYING PILGRIMS "Christmas won't be Christmas without any presents," grumbled Jo, lying on the rug. "It's so dreadful to be poor!" sighed Meg, looking down at her old dress. 
"I don't think it's fair for some girls to have plenty of pretty things, and other girls nothing at all," added little Amy, with an injured sniff. "We've got Father and Mother, and each other," said Beth contentedly from her corner. The four young faces on which the firelight shone brightened at the cheerful words, but darkened again as Jo said sadly, "We haven't got Father, and shall not have him for a long time." She didn't say "perhaps never," but each silently added it, thinking of Father far away, where the fighting was. Nobody spoke for a minute; then Meg said in an altered tone, "You know the reason Mother proposed not having any presents this Christmas was because it is going to b
# Create a variable "chapters" by splitting the text on 'CHAPTER '.
chapters = little_women_text.split('CHAPTER ')
# Create a DataFrame with one column - the text of each chapter.
bpd.DataFrame().assign(chapters=chapters)
chapters | |
---|---|
0 | The Project Gutenberg EBook of Little Women, b... |
1 | ONE\n\nPLAYING PILGRIMS\n\n"Christmas won't be... |
2 | TWO\n\nA MERRY CHRISTMAS\n\nJo was the first t... |
3 | THREE\n\nTHE LAURENCE BOY\n\n"Jo! Jo! Where ... |
4 | FOUR\n\nBURDENS\n\n"Oh, dear, how hard it does... |
... | ... |
43 | FORTY-THREE\n\nSURPRISES\n\nJo was alone in th... |
44 | FORTY-FOUR\n\nMY LORD AND LADY\n\n"Please, Mad... |
45 | FORTY-FIVE\n\nDAISY AND DEMI\n\nI cannot feel ... |
46 | FORTY-SIX\n\nUNDER THE UMBRELLA\n\nWhile Lauri... |
47 | FORTY-SEVEN\n\nHARVEST TIME\n\nFor a year Jo a... |
48 rows × 1 columns
# Number of occurrences of each name in each chapter.
counts = bpd.DataFrame().assign(
Amy=np.char.count(chapters, 'Amy'),
Beth=np.char.count(chapters, 'Beth'),
Jo=np.char.count(chapters, 'Jo'),
Meg=np.char.count(chapters, 'Meg'),
Laurie=np.char.count(chapters, 'Laurie'),
)
counts
Amy | Beth | Jo | Meg | Laurie | |
---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0 | 0 |
1 | 23 | 26 | 44 | 26 | 0 |
2 | 13 | 12 | 21 | 20 | 0 |
3 | 2 | 2 | 62 | 36 | 16 |
4 | 14 | 18 | 34 | 17 | 0 |
... | ... | ... | ... | ... | ... |
43 | 31 | 8 | 61 | 3 | 29 |
44 | 13 | 0 | 9 | 0 | 10 |
45 | 1 | 2 | 6 | 2 | 0 |
46 | 2 | 1 | 56 | 4 | 2 |
47 | 10 | 3 | 37 | 6 | 13 |
48 rows × 5 columns
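To make the counting step less mysterious, here is a minimal sketch on three made-up "chapters" (the strings below are invented for illustration and are not from the book). np.char.count counts how many times a piece of text appears in each string of an array. Keep in mind that it matches pieces of text, not whole words, so 'Jo' would also be counted inside a longer word such as 'John'.

```python
import numpy as np

# Hypothetical mini "chapters", invented for illustration.
toy_chapters = np.array([
    'Jo and Meg went out. Jo laughed.',
    'Beth played the piano for Jo.',
    'Amy drew a picture of Amy.',
])

# One count per chapter: how many times 'Jo' appears in each string.
jo_counts = np.char.count(toy_chapters, 'Jo')
print(jo_counts)
```

The result has one number per chapter, which is why each name becomes a full column in the table above.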
# Cumulative number of times each name appears.
cumulative_counts = bpd.DataFrame().assign(
Amy=np.cumsum(counts.get('Amy')),
Beth=np.cumsum(counts.get('Beth')),
Jo=np.cumsum(counts.get('Jo')),
Meg=np.cumsum(counts.get('Meg')),
Laurie=np.cumsum(counts.get('Laurie')),
Chapter=np.arange(1, 49, 1)
)
cumulative_counts
Amy | Beth | Jo | Meg | Laurie | Chapter | |
---|---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0 | 0 | 1 |
1 | 23 | 26 | 44 | 26 | 0 | 2 |
2 | 36 | 38 | 65 | 46 | 0 | 3 |
3 | 38 | 40 | 127 | 82 | 16 | 4 |
4 | 52 | 58 | 161 | 99 | 16 | 5 |
... | ... | ... | ... | ... | ... | ... |
43 | 619 | 459 | 1435 | 673 | 571 | 44 |
44 | 632 | 459 | 1444 | 673 | 581 | 45 |
45 | 633 | 461 | 1450 | 675 | 581 | 46 |
46 | 635 | 462 | 1506 | 679 | 583 | 47 |
47 | 645 | 465 | 1543 | 685 | 596 | 48 |
48 rows × 6 columns
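As a quick sketch of what np.cumsum does, the example below uses the first five per-chapter counts for Jo from the earlier table. The variable names are just for illustration.

```python
import numpy as np

# Per-chapter counts for Jo, copied from the first five rows of the counts table.
jo_per_chapter = np.array([0, 44, 21, 62, 34])

# Running total: each entry is the sum of all counts up to and including that row.
jo_running_total = np.cumsum(jo_per_chapter)
print(jo_running_total)
```

The running totals (0, 44, 65, 127, 161) match the first five rows of the Jo column in the cumulative table.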
df = cumulative_counts.drop(columns=['Chapter']).to_df().melt().rename(columns={'variable': 'name', 'value': 'Count'})
df.assign(Chapter=list(range(1, 49)) * 5)
name | Count | Chapter | |
---|---|---|---|
0 | Amy | 0 | 1 |
1 | Amy | 23 | 2 |
2 | Amy | 36 | 3 |
3 | Amy | 38 | 4 |
4 | Amy | 52 | 5 |
... | ... | ... | ... |
235 | Laurie | 571 | 44 |
236 | Laurie | 581 | 45 |
237 | Laurie | 581 | 46 |
238 | Laurie | 583 | 47 |
239 | Laurie | 596 | 48 |
240 rows × 3 columns
# Putting it all together, we get a helpful visualization.
cumulative_counts_df = cumulative_counts.drop(columns=['Chapter']).to_df().melt().rename(columns={'variable': 'name', 'value': 'Count'})
cumulative_counts_df = cumulative_counts_df.assign(Chapter=list(range(1, 49)) * 5)
px.line(cumulative_counts_df, x='Chapter', y='Count', color='name', width=900, height=600, title='Cumulative Number of Times Each Name Appears', template='ggplot2').show()
- In Chapter 32, Jo moves to New York alone. Her relationship with which sister suffers the most from this faraway move?
- Laurie is a man who marries one of the sisters at the end. Which one?
What is code? What are Jupyter Notebooks? 💻¶
What is code?¶
- Instructions for computers are written in programming languages, and are referred to as code.
- “Computer programs” are nothing more than recipes: we write programs that tell the computer exactly what to do, and it does exactly that – nothing more, and nothing less.
Why Python?¶
- It's popular!

- It has a variety of use cases. Some examples:
- Web development.
- Data science and machine learning.
- Scripting and automation.
- It's (relatively) easy to dive right in! 🏊
Jupyter Notebooks 📓¶
- Often, but not in this class, code is written in a text editor and then run in a command-line interface (or both steps are done in an IDE).

- Jupyter Notebooks allow us to write and run code within a single document. They also allow us to embed text and code. We will be using Jupyter Notebooks throughout the quarter.
- DataHub is a server that allows you to run Jupyter Notebooks from your web browser without having to install any software locally.
Expressions¶
Python as a calculator¶
- An expression is a combination of values, operators, and functions that evaluates to some value.
- For now, let's think of Python like a calculator – it takes expressions and evaluates them.
- We will enter our expressions in code cells. To run a code cell, either:
- Hit shift+enter (or shift+return) on your keyboard (strongly preferred), or
- Press the "▶ Run" button in the toolbar.
23
23
-15 + 2.718
-12.282
4 ** 3
64
(2 + 3 + 4) / 3
3.0
# Only one value is displayed. Why?
9 + 10
13 / 4
21
21
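A notebook cell displays only the value of its last expression. If we want to see every value, one option (sketched here with print, which we'll cover properly later) is to print each expression ourselves.

```python
# Wrapping each expression in print shows all three values,
# not just the value of the last line.
print(9 + 10)
print(13 / 4)
print(21)
```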
Arithmetic operations¶
Operation | Operator | Example | Value |
---|---|---|---|
Addition | + | 2 + 3 | 5 |
Subtraction | - | 2 - 3 | -1 |
Multiplication | * | 2 * 3 | 6 |
Division | / | 7 / 3 | 2.33333 |
Remainder | % | 7 % 3 | 1 |
Exponentiation | ** | 2 ** 0.5 | 1.41421 |
Python uses the typical order of operations – PEMDAS (BEDMAS? 🛏️)¶
5 * 2 ** 3
40
(5 * 2) ** 3
1000
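A few more cases, as a sketch, showing that exponentiation happens before multiplication and also before a leading minus sign. The variable names are only for illustration.

```python
# Exponentiation is applied before multiplication...
no_parens = 5 * 2 ** 3        # same as 5 * 8
with_parens = (5 * 2) ** 3    # same as 10 ** 3

# ...and before a leading minus sign.
negative_square = -3 ** 2     # same as -(3 ** 2)
parenthesized = (-3) ** 2

print(no_parens, with_parens, negative_square, parenthesized)
```

When in doubt, add parentheses: they cost nothing and make the intended order clear.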
Activity¶
In the cell below, write an expression that's equivalent to
$$(19 + 6 \cdot 3) - 15 \cdot \left(\sqrt{100} \cdot \frac{1}{30}\right) \cdot \frac{3}{5} + \frac{4^2}{2^3} + \left( 6 - \frac{2}{3} \right) \cdot 12 $$
Variables¶
Motivation¶
Below, we compute the number of seconds in a year.
60 * 60 * 24 * 365
31536000
If we want to use the above value later in our notebook to find, say, the number of seconds in 12 years, we'd have to copy-and-paste the expression. This is inconvenient, and prone to introducing errors.
60 * 60 * 24 * 365 * 12
378432000
It would be great if we could store the initial value and refer to it later on!
Variables and assignment statements¶
- A variable is a place to store a value so that it can be referred to later in our code. To define a variable, we use an assignment statement.
$$ \overbrace{\texttt{zebra}}^{\text{name}} = \overbrace{\texttt{23 - 14}}^{\text{any expression}} $$
- An assignment statement changes the meaning of the name to the left of the = symbol.
- The expression on the right-hand side of the = symbol is evaluated before being assigned to the name on the left-hand side.
- e.g. zebra is bound to 9 (value), not 23 - 14 (expression).
Think of variable names as nametags!¶
# Note: This is an assignment statement, not an expression.
# Assignment statements don't output anything!
a = 1
a = 2
b = 2
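Here is a small sketch of the nametag idea. A name is attached to a value, not to another name, so moving one nametag does not move any other.

```python
a = 1
b = a      # b is bound to the value 1, not to the name a
a = 2      # re-assigning a moves the "a" nametag; b is untouched

print(a, b)
```

After this cell runs, a is 2 but b is still 1.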
Example¶
Note that before we use it in an assignment statement, triton has no meaning.
triton
--------------------------------------------------------------------------- NameError Traceback (most recent call last) /var/folders/2k/9mnd960x2j1d9b35wyjwwx200000gp/T/ipykernel_79595/521444765.py in <cell line: 0>() ----> 1 triton NameError: name 'triton' is not defined
After using it in an assignment statement, we can ask Python for its value.
triton = 15 - 5
triton
10
Any time we use triton in an expression, 10 is substituted for it.
triton * -4
-40
Note that the above expression did not change the value of triton, because we did not re-assign triton!
triton
10
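To actually change the value, we need another assignment statement. In this sketch, the right-hand side is evaluated first using the old value, and only then is the name re-bound.

```python
triton = 15 - 5

# The right-hand side (10 * -4) is evaluated first, then bound to triton.
triton = triton * -4

print(triton)
```

Now triton refers to -40, and the old value 10 is no longer attached to that name.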
Naming variables¶
- Give your variables helpful names so that you know what they refer to.
- Variable names can contain uppercase and lowercase characters, the digits 0-9, and underscores.
- They cannot start with a number.
- They are case sensitive!
The following assignment statements are valid, but use poor variable names 😕.
six = 15
i_45love_chocolate_9999 = 60 * 60 * 24 * 365
The following assignment statements are valid, and use good variable names ✅.
seconds_per_hour = 60 * 60
hours_per_year = 24 * 365
seconds_per_year = seconds_per_hour * hours_per_year
The following "assignment statements" are invalid ❌.
7_days = 24 * 7
File "/var/folders/2k/9mnd960x2j1d9b35wyjwwx200000gp/T/ipykernel_79595/3229775372.py", line 1 7_days = 24 * 7 ^ SyntaxError: invalid decimal literal
3 = 2 + 1
File "/var/folders/2k/9mnd960x2j1d9b35wyjwwx200000gp/T/ipykernel_79595/2449763097.py", line 1 3 = 2 + 1 ^ SyntaxError: cannot assign to literal here. Maybe you meant '==' instead of '='?
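If you're ever unsure whether a name follows the rules, one way to check (a sketch that uses strings and methods, which we haven't covered yet) is the built-in string method isidentifier. The example also shows that names are case sensitive.

```python
# str.isidentifier() reports whether a string follows Python's naming rules.
valid_name = 'seconds_per_year'.isidentifier()
starts_with_digit = '7_days'.isidentifier()

# Names are case sensitive: these are two different variables.
total = 10
Total = 25

print(valid_name, starts_with_digit, total, Total)
```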
Python functions¶
- Functions in Python work the same way functions in math do.
- The inputs to functions are called arguments.
- Python comes with a number of built-in functions that we are free to use.
- Calling a function, or using a function, means asking the function to "run its recipe" on the given input.
abs(-23)
23
Some functions can take a variable number of arguments¶
max(4, -8)
4
max(2, -3, -6, 10, -4)
10
max(9)
--------------------------------------------------------------------------- TypeError Traceback (most recent call last) /var/folders/2k/9mnd960x2j1d9b35wyjwwx200000gp/T/ipykernel_79595/60825961.py in <cell line: 0>() ----> 1 max(9) TypeError: 'int' object is not iterable
max(9 + 10, 9 - 10)
19
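The error from max(9) comes up because, when max is given a single argument, it expects a collection of values to look through rather than a lone number. As a sketch, here is the single-argument form using a list (a type we haven't covered yet).

```python
# With one argument, max looks through the collection it is given.
largest_of_one = max([9])
largest_of_many = max([2, -3, -6, 10, -4])

print(largest_of_one, largest_of_many)
```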
Put ? after a function's name to see its documentation 📄¶
Or use the help function, e.g. help(round).
round(1.45678)
1
round?
Signature: round(number, ndigits=None) Docstring: Round a number to a given precision in decimal digits. The return value is an integer if ndigits is omitted or None. Otherwise the return value has the same type as the number. ndigits may be negative. Type: builtin_function_or_method
round(1.45678, 3)
1.457
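A few more calls to round, as a sketch of what the documentation describes. The second argument may be negative, and numbers exactly halfway between two integers are rounded to the nearest even integer, which can be surprising.

```python
three_places = round(1.45678, 3)      # keep three decimal places
to_hundreds = round(1234.5678, -2)    # a negative ndigits rounds to the left of the decimal point

# Ties go to the nearest even integer.
half_down = round(2.5)
half_up = round(3.5)

print(three_places, to_hundreds, half_down, half_up)
```

Here round(2.5) gives 2 while round(3.5) gives 4.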
Nested evaluation¶
We can nest many function calls to evaluate sophisticated expressions.
min(abs(max(-1, -2, -3, min(4, -2))), max(5, 100))
1
...how did that work?
from lec00_imports import *
show_nested_eval()
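Another way to follow the nested call is to evaluate it from the inside out, one step at a time. This sketch stores each intermediate result in a variable; the variable names are made up for illustration.

```python
inner_min = min(4, -2)                    # innermost call: -2
inner_max = max(-1, -2, -3, inner_min)    # max(-1, -2, -3, -2) is -1
left = abs(inner_max)                     # abs(-1) is 1
right = max(5, 100)                       # 100
result = min(left, right)                 # min(1, 100) is 1

print(result)
```

Python always evaluates the arguments of a function call before calling the function itself, which is why the innermost calls run first.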
Import statements¶
- Python doesn't have everything we need built in.
- In order to gain additional functionality, we import modules through import statements.
- Modules are collections of Python functions and values.
- Call these functions using the syntax module.function(), called "dot notation".
Example: import math¶
Some of the many functions built into the math module are sqrt, pow, and log.
import math
math.sqrt(16)
4.0
math.pow(2, 5)
32.0
math also has constants built in!
math.pi
3.141592653589793
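Putting the pieces together, here is a short sketch that uses dot notation for both functions and constants. The radius value and variable names are made up for illustration.

```python
import math

radius = 2
circle_area = math.pi * radius ** 2   # pi times the radius squared

float_power = math.pow(2, 5)   # math.pow gives 32.0
int_power = 2 ** 5             # the ** operator gives 32
log_of_one = math.log(1)       # natural logarithm of 1

print(circle_area, float_power, int_power, log_of_one)
```

Notice that math.pow(2, 5) and 2 ** 5 represent the same number but are displayed differently (32.0 versus 32).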
Concept Check ✅ – Answer at cc.dsc10.com¶
Assume you have run the following statements:
x = 3
y = -2
Which of these examples results in an error? For the ones that don't error, try to determine what they evaluate to!
A. abs(x, y)
B. math.pow(x, abs(y))
C. round(x, max(abs(y ** 2)))
D. math.pow(x, math.pow(y, x))
E. More than one of the above
Data types¶
What's the difference? 🧐¶
4 / 2
2.0
5 - 3
2
To us, 2.0 and 2 are the same number, $2$. But to Python, these appear to be different!
Data types¶
- Every value in Python has a type.
- Use the type function to check a value's type.
- It's important to understand how different types work with different operations, as the results may not always be what we expect.
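As a quick sketch, we can ask Python for the types of the two results from earlier. The two values have different types, yet Python still treats them as equal numbers when compared with == (a comparison operator we'll see later).

```python
quotient = 4 / 2      # division gives 2.0
difference = 5 - 3    # subtraction of ints gives 2

print(type(quotient))
print(type(difference))
print(quotient == difference)
```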
Two numeric data types: int and float¶
- int: An integer of any size.
- float: A number with a decimal point.
int¶
- If you add (+), subtract (-), multiply (*), or exponentiate (**) ints, the result will be another int (the one exception is a negative exponent, e.g. 2 ** -1, which gives a float).
- ints have arbitrary precision in Python, meaning that your calculations will always be exact.
7 - 15
-8
type(7 - 15)
int
2 ** 300
2037035976334486086268445688409378161051468393665936250636140449354381299763336706183397376
2 ** 3000
1230231922161117176931558813276752514640713895736833715766118029160058800614672948775360067838593459582429649254051804908512884180898236823585082482065348331234959350355845017413023320111360666922624728239756880416434478315693675013413090757208690376793296658810662941824493488451726505303712916005346747908623702673480919353936813105736620402352744776903840477883651100322409301983488363802930540482487909763484098253940728685132044408863734754271212592471778643949486688511721051561970432780747454823776808464180697103083861812184348565522740195796682622205511845512080552010310050255801589349645928001133745474220715013683413907542779063759833876101354235184245096670042160720629411581502371248008430447184842098610320580417992206662247328722122088513643683907670360209162653670641130936997002170500675501374723998766005827579300723253474890612250135171889174899079911291512399773872178519018229989376
float¶
- A float is specified using a decimal point.
- A float might be printed using scientific notation.
3.2 + 2.5
5.7
type(3.2 + 2.5)
float
# The result is in scientific notation: e+90 means "times 10^90".
2.0 ** 300
2.037035976334486e+90
The pitfalls of float¶
- floats have limited precision; after arithmetic, the final few decimal places can be wrong in unexpected ways.
- floats have limited size, though the limit is huge.
1 + 0.2
1.2
1 + 0.1 + 0.1
1.2000000000000002
2.0 ** 3000
--------------------------------------------------------------------------- OverflowError Traceback (most recent call last) /var/folders/2k/9mnd960x2j1d9b35wyjwwx200000gp/T/ipykernel_79595/1310821553.py in <cell line: 0>() ----> 1 2.0 ** 3000 OverflowError: (34, 'Result too large')
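Two consequences of limited precision, sketched below. First, checking whether two floats are exactly equal can fail even when the math says they should match; math.isclose is one common way to compare them instead. Second, adding a small number to a huge float can have no effect at all, whereas the same calculation with ints is exact.

```python
import math

total = 1 + 0.1 + 0.1
exactly_equal = (total == 1.2)             # False, because total is 1.2000000000000002
close_enough = math.isclose(total, 1.2)    # True, the two values agree to many decimal places

# ints are exact, so adding 1 and subtracting the original leaves 1.
int_difference = (2 ** 300 + 1) - 2 ** 300

# floats run out of precision, so the added 1 is lost entirely.
float_difference = (2.0 ** 300 + 1) - 2.0 ** 300

print(exactly_equal, close_enough, int_difference, float_difference)
```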
Converting between int and float¶
- If you mix ints and floats in an expression, the result will always be a float.
- Note that when you divide two ints, you get a float back.
- A value can be explicitly coerced (i.e. converted) using the int and float functions.
2.0 + 3
5.0
12 / 2
6.0
int(12 / 2)
6
int(-2.9)
-2
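Note that int does not round: it simply drops everything after the decimal point. The sketch below compares it with round and with math.floor (a function from the math module that rounds down).

```python
import math

truncated = int(-2.9)        # drops the decimal part: -2
rounded = round(-2.9)        # nearest integer: -3
floored = math.floor(-2.9)   # always rounds down: -3
as_float = float(6)          # converts the int 6 to the float 6.0

print(truncated, rounded, floored, as_float)
```

For positive numbers, int and math.floor agree; they differ only for negative numbers.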
Summary¶
- Expressions evaluate to values. Python will display the value of the last expression in a cell by default.
- Python knows about all of the standard mathematical operators and follows PEMDAS.
- Assignment statements allow us to bind values to variables.
- We can call functions in Python similar to how we call functions in math.
- Python knows some functions by default, and import statements allow us to bring additional functionality from modules.
- All values in Python have a data type.
- ints and floats are numbers. ints are integers, while floats contain decimal points.