Lecture 28 – Review, Conclusion

DSC 10, Spring 2023

Announcements

Agenda

More review

From the Winter 2023 Final:

From the Fall 2022 Final:

Personal projects

Using Jupyter Notebooks after DSC 10

Finding data

These sites let you search for datasets (in CSV format) from a variety of domains. Some may require you to sign up for an account; all are generally reputable sources.

Note that all of these links are also available at rampure.org/find-datasets.

Domain-specific sources of data

Tip: if a site only allows you to download a file as an Excel file, not a CSV file, you can download it, open it in a spreadsheet viewer (Excel, Numbers, Google Sheets), and export it to a CSV.
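If you would rather not export by hand, the same conversion can be sketched in code. This is only one possible approach: it uses plain pandas (not the babypandas library used in DSC 10) and assumes the openpyxl package is installed so that pandas can read `.xlsx` files. The function name and the file names are hypothetical.

```python
import pandas as pd


def excel_to_csv(excel_path, csv_path, sheet_name=0):
    """Read one sheet of an Excel workbook and save it as a CSV file.

    excel_to_csv is a hypothetical helper name, not part of any library.
    """
    # sheet_name=0 selects the first sheet; pass a sheet's name to pick another.
    df = pd.read_excel(excel_path, sheet_name=sheet_name)
    # index=False keeps the DataFrame's row labels out of the CSV file.
    df.to_csv(csv_path, index=False)
    return df
```

For example, `excel_to_csv('downloaded.xlsx', 'downloaded.csv')` would write a CSV next to a (hypothetical) downloaded workbook. Workbooks with merged cells or several header rows usually still need manual cleanup.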

Join a DS3 Project Group 🤝

The Data Science Student Society organizes project groups, which are a great way to get experience and build your resume. Keep an eye out for applications!

Demo: Gapminder 🌎

plotly

Gapminder dataset

Gapminder Foundation is a non-profit venture registered in Stockholm, Sweden, that promotes sustainable global development and achievement of the United Nations Millennium Development Goals by increased use and understanding of statistics and other information about social, economic and environmental development at local, national and global levels. - Gapminder Wikipedia

The dataset contains information for each country for several different years.

Let's start by just looking at 2007 data (the most recent year in the dataset).
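One way to select a single year's rows is sketched below. It is written in plain pandas rather than babypandas, and it assumes the table has a `year` column, as the copy of the Gapminder data bundled with plotly (`px.data.gapminder()`) does. The helper names `rows_for_year` and `load_gapminder` are hypothetical.

```python
import pandas as pd


def rows_for_year(df, year, year_col="year"):
    """Return only the rows of df whose year_col equals the given year."""
    # Boolean indexing keeps the matching rows; reset_index renumbers them from 0.
    return df[df[year_col] == year].reset_index(drop=True)


def load_gapminder():
    """Load the Gapminder table that ships with plotly (an assumed data source)."""
    # Imported here so rows_for_year can be used even without plotly installed.
    import plotly.express as px
    return px.data.gapminder()
```

With these, `rows_for_year(load_gapminder(), 2007)` would give one row per country for 2007.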

Scatter plot

We can plot life expectancy vs. GDP per capita. If you hover over a point, you will see the name of the country.
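A minimal sketch of such a plot with plotly express follows. It assumes the column names used in plotly's bundled Gapminder data (`gdpPercap`, `lifeExp`, `country`); the function name and axis labels are hypothetical choices, and the lecture's own code may differ.

```python
import pandas as pd


def life_exp_vs_gdp(df):
    """Scatter plot of life expectancy (y) against GDP per capita (x)."""
    # Imported here so that defining the function does not require plotly.
    import plotly.express as px
    return px.scatter(
        df,
        x="gdpPercap",
        y="lifeExp",
        hover_name="country",  # shown as the bold title when hovering over a point
        labels={"gdpPercap": "GDP per capita", "lifeExp": "Life expectancy"},
    )
```

To try it on the 2007 data, load `gapminder = px.data.gapminder()`, then call `life_exp_vs_gdp(gapminder[gapminder['year'] == 2007])` and use `.show()` on the result to display the figure.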