project01

School

University of Oregon

Course

101

Subject

Statistics

Date

Apr 3, 2024


project01
March 20, 2024

0.1 Project 1: World Progress

In this project, you’ll explore data from Gapminder.org, a website dedicated to providing a fact-based view of the world and how it has changed. That site includes several data visualizations and presentations, but also publishes the raw data that we will use in this project to recreate and extend some of their most famous visualizations.

The Gapminder website collects data from many sources and compiles them into tables that describe many countries around the world. All of the data they aggregate are published in the Systema Globalis. Their goal is “to compile all public statistics; Social, Economic and Environmental; into a comparable total dataset.” All data sets in this project are copied directly from the Systema Globalis without any changes.

This project is dedicated to Hans Rosling (1948-2017), who championed the use of data to understand and prioritize global development challenges.

0.1.1 Logistics

Deadline. This project is due at 11:59pm on the due date in Canvas. Late work will not be accepted as per the course policies. It’s much better to be early than late, so start working now.

Rules. Don’t share your code with anybody. You are welcome to discuss questions with other students, but don’t share the answers. The experience of solving the problems in this project will prepare you for exams (and life). If someone asks you for the answer, resist! Instead, you can demonstrate how you would solve a similar problem.

Support. You are not alone! Come to office hours, post on Slack/Canvas Chat, and talk to your classmates. If you want to ask about the details of your solution to a problem, make a private Slack/Chat post and the staff will respond. Take advantage of the plentiful help hours provided by the Learning Assistants.

Tests. The tests that are given are not comprehensive and passing the tests for a question does not mean that you answered the question correctly. Tests usually only check that your table has the correct column labels. However, more tests will be applied to verify the correctness of your submission in order to assign your final score, so be careful and check your work! You might want to create your own checks along the way to see if your answers make sense. Additionally, before you submit, make sure that none of your cells take a very long time to run (several minutes).

Free Response Questions: Make sure that you put the answers to the written questions in the indicated cell we provide.

Advice. Develop your answers incrementally. To perform a complicated table manipulation, break it up into steps, perform each step on a different line, give a new name to each result, and check that each intermediate result is what you expect. You can add any additional names or functions you want to the provided cells. Make sure that you are using distinct and meaningful variable names throughout the notebook. Along that line, DO NOT reuse the variable names that we use when we grade your answers. For example, in Question 1 of the Global Poverty section, we ask you to assign an answer to latest. Do not reassign the variable name latest to anything else in your notebook, otherwise there is the chance that our tests grade against what latest was reassigned to.

You never have to use just one line in this project or any others. Use intermediate variables and multiple lines as much as you would like!

To get started, load datascience, numpy, plots, and otter.

[2]: from datascience import *
     import numpy as np

     %matplotlib inline
     import matplotlib.pyplot as plots
     plots.style.use('fivethirtyeight')

     import otter
     grader = otter.Notebook()
     'imports complete'

[2]: 'imports complete'

0.2 1. Global Population Growth

The global population of humans reached 1 billion around 1800, 3 billion around 1960, and 7 billion around 2011. The potential impact of exponential population growth has concerned scientists, economists, and politicians alike.

The UN Population Division estimates that the world population will likely continue to grow throughout the 21st century, but at a slower rate, perhaps reaching 11 billion by 2100. However, the UN does not rule out scenarios of more extreme growth.

In this section, we will examine some of the factors that influence population growth and how they are changing around the world.

The first table we will consider is the total population of each country over time. Run the cell below.

[3]: population = Table.read_table('population.csv')
     population.show(3)

<IPython.core.display.HTML object>

Note: The population csv file can also be found here. The data for this project was downloaded in February 2017.
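Before working with the population table, the incremental approach recommended under Logistics can be sketched as follows. This is only an illustration in plain Python rather than the datascience Table API, so it runs without the course files. The rows, the country codes aaa and bbb, and every variable name here are hypothetical and not part of the project.

```python
# Hypothetical rows standing in for a (geo, time, population_total) table.
# The codes and numbers are made up for illustration only.
rows = [
    ("aaa", 1999, 100),
    ("aaa", 2000, 110),
    ("bbb", 1999, 200),
    ("bbb", 2000, 190),
    ("bbb", 2001, 210),
]

# Step 1: keep one country, then check the result before moving on.
bbb_rows = [row for row in rows if row[0] == "bbb"]
assert len(bbb_rows) == 3

# Step 2: keep a range of years, with both endpoints included.
bbb_recent = [row for row in bbb_rows if 2000 <= row[1] <= 2001]
assert [row[1] for row in bbb_recent] == [2000, 2001]

# Step 3: keep only the columns needed for the answer.
bbb_answer = [(time, pop) for (_geo, time, pop) in bbb_recent]
print(bbb_answer)
```

Each step gets its own name, so an unexpected intermediate result can be spotted at the step that produced it, and no earlier name is overwritten.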

0.2.1 Bangladesh

In the population table, the geo column contains three-letter codes established by the International Organization for Standardization (ISO) in the Alpha-3 standard. We will begin by taking a close look at Bangladesh. Inspect the standard to find the 3-letter code for Bangladesh.

Question 1. Create a table called b_pop that has two columns labeled time and population_total. The first column should contain the years from 1970 through 2015 (including both 1970 and 2015) and the second should contain the population of Bangladesh in each of those years.

[4]: b_pop = population.where('geo', 'bgd').select('time', 'population_total').where(
         'time', are.between_or_equal_to(1970, 2015))
     b_pop

[4]: time | population_total
     1970 | 65048701
     1971 | 66417450
     1972 | 67578486
     1973 | 68658472
     1974 | 69837960
     1975 | 71247153
     1976 | 72930206
     1977 | 74848466
     1978 | 76948378
     1979 | 79141947
     … (36 rows omitted)

[5]: grader.check("q1_1")

[5]: q1_1 results: All test cases passed!

Run the following cell to create a table called b_five that has the population of Bangladesh every five years. At a glance, it appears that the population of Bangladesh has been growing quickly indeed!

[6]: b_pop.set_format('population_total', NumberFormatter)

     fives = np.arange(1970, 2016, 5) # 1970, 1975, 1980, ...
     b_five = b_pop.sort('time').where('time', are.contained_in(fives))
     b_five

[6]: time | population_total
     1970 | 65,048,701
     1975 | 71,247,153
     1980 | 81,364,176
     1985 | 93,015,182
     1990 | 105,983,136
     1995 | 118,427,768
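To put a number on "growing quickly", one option is to compute the percent change over each five-year window. The sketch below is an illustration, not part of the project's solution. It uses plain Python instead of the datascience and NumPy calls in the cell, and it includes only the six rows visible in the b_five output. The names shown and percent_change are hypothetical.

```python
# Years picked by the step-5 range. The stop value 2016 is excluded, so 2015
# is the last year kept (the same endpoints as np.arange(1970, 2016, 5)).
fives = list(range(1970, 2016, 5))

# Only the rows visible in the b_five output; later years are not shown there.
shown = {
    1970: 65_048_701,
    1975: 71_247_153,
    1980: 81_364_176,
    1985: 93_015_182,
    1990: 105_983_136,
    1995: 118_427_768,
}


def percent_change(start, end):
    """Hypothetical helper: percent change from start to end."""
    return (end - start) / start * 100


# Print the change for each pair of consecutive five-year rows that is available.
for year in fives:
    if year in shown and year + 5 in shown:
        change = percent_change(shown[year], shown[year + 5])
        print(f"{year}-{year + 5}: {change:.1f}%")
```

The range check at the top is also a cheap way to confirm that both 1970 and 2015 are among the selected years.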
