Bartleby Sitemap - Textbook Solutions

All Textbook Solutions for Introductory Statistics

The Regression Equation. Use the following information to answer the next five exercises. A random sample of ten professional athletes produced the following data where x is the number of endorsements the player has and y is the amount of money made (in millions of dollars). x, y: 0 2; 5 12; 3 8; 4 9; 2 7; 3 9; 1 3; 0 3; 5 13; 4 10 (Table 12.13). What is the slope of the line of best fit? What does it represent?

The Regression Equation. Use the following information to answer the next five exercises. A random sample of ten professional athletes produced the following data where x is the number of endorsements the player has and y is the amount of money made (in millions of dollars). x, y: 0 2; 5 12; 3 8; 4 9; 2 7; 3 9; 1 3; 0 3; 5 13; 4 10 (Table 12.13). What is the y-intercept of the line of best fit? What does it represent?

What does an r value of zero mean?

When n = 2 and r = 1, are the data significant? Explain.

When n = 100 and r = -0.89, is there a significant correlation? Explain.

When testing the significance of the correlation coefficient, what is the null hypothesis?

When testing the significance of the correlation coefficient, what is the alternative hypothesis?

If the level of significance is 0.05 and the p-value is 0.04, what conclusion can you draw?

Use the following information to answer the next two exercises. An electronics retailer used regression to find a simple model to predict sales growth in the first quarter of the new year (January through March). The model is good for 90 days, where x is the day. The model can be written as follows: y = 101.32 + 2.48x, where y is in thousands of dollars. What would you predict the sales to be on day 60?

Use the following information to answer the next two exercises. An electronics retailer used regression to find a simple model to predict sales growth in the first quarter of the new year (January through March). The model is good for 90 days, where x is the day.
The model can be written as follows: y = 101.32 + 2.48x, where y is in thousands of dollars. What would you predict the sales to be on day 90?

Use the following information to answer the next three exercises. A landscaping company is hired to mow the grass for several large properties. The total area of the properties combined is 1,345 acres. The rate at which one person can mow is as follows: y = 1350 - 1.2x, where x is the number of hours and y represents the number of acres left to mow. How many acres will be left to mow after 20 hours of work?

Use the following information to answer the next three exercises. A landscaping company is hired to mow the grass for several large properties. The total area of the properties combined is 1,345 acres. The rate at which one person can mow is as follows: y = 1350 - 1.2x, where x is the number of hours and y represents the number of acres left to mow. How many acres will be left to mow after 100 hours of work?

Use the following information to answer the next three exercises. A landscaping company is hired to mow the grass for several large properties. The total area of the properties combined is 1,345 acres. The rate at which one person can mow is as follows: y = 1350 - 1.2x, where x is the number of hours and y represents the number of acres left to mow. How many hours will it take to mow all of the lawns? (When is y = 0?)

Table 12.14 contains real data for the first two decades of flu cases reporting.

Graph “year” versus “# flu cases diagnosed” (plot the scatter plot). Do not include pre-1981 data.

Perform linear regression. What is the linear equation? Round to the nearest whole number.

Find the correlation coefficient. a. r = ________

Solve. a. When x = 1985, y = _____ b. When x = 1990, y = _____ c. When x = 1970, y = ______ Why doesn’t this answer make sense?

Does the line seem to fit the data?
Why or why not?

What does the correlation imply about the relationship between time (years) and the number of diagnosed flu cases reported in the U.S.?

Plot the two given points on the following graph. Then, connect the two points to form the regression line. Figure 12.29. Obtain the graph on your calculator or computer.

Write the equation: y= ____________

Hand draw a smooth curve on the graph that shows the flow of the data.

Does the line seem to fit the data? Why or why not?

Do you think a linear fit is best? Why or why not?

What does the correlation imply about the relationship between time (years) and the number of diagnosed flu cases reported in the U.S.?

Graph “year” vs. “# flu cases diagnosed.” Do not include pre-1981. Label both axes with words. Scale both axes.

Enter your data into your calculator or computer. The pre-1981 data should not be included. Why is that so? Write the linear equation, rounding to four decimal places:

Find the correlation coefficient. a. correlation = _____

Outliers. Use the following information to answer the next four exercises. The scatter plot shows the relationship between hours spent studying and exam scores. The line shown is the calculated line of best fit. The correlation coefficient is 0.69. Figure 12.30. Do there appear to be any outliers?

Outliers. Use the following information to answer the next four exercises. The scatter plot shows the relationship between hours spent studying and exam scores. The line shown is the calculated line of best fit. The correlation coefficient is 0.69. Figure 12.30. A point is removed, and the line of best fit is recalculated. The new correlation coefficient is 0.98. Does the point appear to have been an outlier? Why?

Outliers. Use the following information to answer the next four exercises. The scatter plot shows the relationship between hours spent studying and exam scores. The line shown is the calculated line of best fit. The correlation coefficient is 0.69.
Figure 12.30. What effect did the potential outlier have on the line of best fit?

Outliers. Use the following information to answer the next four exercises. The scatter plot shows the relationship between hours spent studying and exam scores. The line shown is the calculated line of best fit. The correlation coefficient is 0.69. Figure 12.30. Are you more or less confident in the predictive ability of the new line of best fit?

The Sum of Squared Errors for a data set of 18 numbers is 49. What is the standard deviation?

The Standard Deviation for the Sum of Squared Errors for a data set is 9.8. What is the cutoff for the vertical distance that a point can be from the line of best fit to be considered an outlier?

For each of the following situations, state the independent variable and the dependent variable. a. A study is done to determine if elderly drivers are involved in more motor vehicle fatalities than other drivers. The number of fatalities per 100,000 drivers is compared to the age of drivers. b. A study is done to determine if the weekly grocery bill changes based on the number of family members. c. Insurance companies base life insurance premiums partially on the age of the applicant. d. Utility bills vary according to power consumption. e. A study is done to determine if a higher education reduces the crime rate in a population.

Piece-rate systems are widely debated incentive payment plans. In a recent study of loan officer effectiveness, the following piece-rate system was examined. % of goal reached: Incentive. < 80: n/a; 80: $4,000 with an additional $125 added per percentage point from 81-99%; 100: $6,500 with an additional $125 added per percentage point from 101-119%; 120: $9,500 with an additional $125 added per percentage point starting at 121% (Table 12.15). If a loan officer makes 95% of his or her goal, write the linear function that applies based on the incentive plan table.
In context, explain the y-intercept and slope.

The Gross Domestic Product Purchasing Power Parity is an indication of a country’s currency value compared to another country. Table 12.16 shows the GDP PPP of Cuba as compared to US dollars. Construct a scatter plot of the data. Year, Cuba’s PPP: 1999 1,700; 2000 1,700; 2002 2,300; 2003 2,900; 2004 3,000; 2005 3,500; 2006 4,000; 2007 11,000; 2008 9,500; 2009 9,700; 2010 9,900 (Table 12.16).

The following table shows the poverty rates and cell phone usage in the United States. Construct a scatter plot of the data. Year, Poverty Rate, Cellular Usage per Capita: 2003 12.7 54.67; 2005 12.6 74.19; 2007 12 84.86; 2009 12 90.82 (Table 12.17).

Does the higher cost of tuition translate into higher-paying jobs? The table lists the top ten colleges based on mid-career salary and the associated yearly tuition costs. Construct a scatter plot of the data. School, Mid-Career Salary (in thousands), Yearly Tuition: Princeton 137 28,540; Harvey Mudd 135 40,133; CalTech 127 39,900; US Naval Academy 122 0; West Point 120 0; MIT 118 42,050; Lehigh University 118 43,220; NYU-Poly 117 39,565; Babson College 117 40,400; Stanford 114 54,506 (Table 12.18).

If the level of significance is 0.05 and the p-value is 0.06, what conclusion can you draw?

If there are 15 data points in a set of data, what is the number of degrees of freedom?

What is the process through which we can calculate a line that goes through a scatter plot with a linear pattern?

Explain what it means when a correlation has an r² of 0.72.

Can a coefficient of determination be negative? Why or why not?

Recently, the annual number of driver deaths per 100,000 for the selected age groups was as follows. Age, Number of Driver Deaths per 100,000: 16-19 38; 20-24 36; 25-34 24; 35-54 20; 55-74 18; 75+ 28 (Table 12.19). a. For each age group, pick the midpoint of the interval for the x value. (For the 75+ group, use 80.) b.
Using “ages” as the independent variable and “Number of driver deaths per 100,000” as the dependent variable, make a scatter plot of the data. c. Calculate the least squares (best-fit) line. Put the equation in the form of: y = a + bx d. Find the correlation coefficient. Is it significant? e. Predict the number of deaths for ages 40 and 60. f. Based on the given data, is there a linear relationship between age of a driver and driver fatality rate? g. What is the slope of the least squares (best-fit) line? Interpret the slope.

Table 12.20 shows the life expectancy for an individual born in the United States in certain years. Year of Birth, Life Expectancy: 1930 59.7; 1940 62.9; 1950 70.2; 1965 69.7; 1973 71.4; 1982 74.5; 1987 75; 1992 75.7; 2010 78.7 (Table 12.20). a. Decide which variable should be the independent variable and which should be the dependent variable. b. Draw a scatter plot of the ordered pairs. c. Calculate the least squares line. Put the equation in the form of: y = a + bx d. Find the correlation coefficient. Is it significant? e. Find the estimated life expectancy for an individual born in 1950 and for one born in 1982. f. Why aren’t the answers to part e the same as the values in Table 12.20 that correspond to those years? g. Use the two points in part e to plot the least squares line on your graph from part b. h. Based on the data, is there a linear relationship between the year of birth and life expectancy? i. Are there any outliers in the data? j. Using the least squares line, find the estimated life expectancy for an individual born in 1850. Does the least squares line give an accurate estimate for that year? Explain why or why not. k. What is the slope of the least-squares (best-fit) line? Interpret the slope.

The maximum discount value of the Entertainment® card for the “Fine Dining” section, Edition ten, for various pages is given in Table 12.21. Page number, Maximum value ($): 4 16; 14 19; 25 15; 32 17; 43 19; 57 15; 72 16; 85 15; 90 17 (Table 12.21). a.
Decide which variable should be the independent variable and which should be the dependent variable. b. Draw a scatter plot of the ordered pairs. c. Calculate the least-squares line. Put the equation in the form of: y = a + bx d. Find the correlation coefficient. Is it significant? e. Find the estimated maximum values for the restaurants on page ten and on page 70. f. Does it appear that the restaurants giving the maximum value are placed in the beginning of the “Fine Dining” section? How did you arrive at your answer? g. Suppose that there were 200 pages of restaurants. What do you estimate to be the maximum value for a restaurant listed on page 200? h. Is the least squares line valid for page 200? Why or why not? i. What is the slope of the least-squares (best-fit) line? Interpret the slope.

Table 12.22 gives the gold medal times for every other Summer Olympics for the women’s 100-meter freestyle (swimming). Year, Time (seconds): 1912 82.2; 1924 72.4; 1932 66.8; 1952 66.8; 1960 61.2; 1968 60.0; 1976 55.65; 1984 55.92; 1992 54.64; 2000 53.8; 2008 53.1 (Table 12.22). a. Decide which variable should be the independent variable and which should be the dependent variable. b. Draw a scatter plot of the data. c. Does it appear from inspection that there is a relationship between the variables? Why or why not? d. Calculate the least squares line. Put the equation in the form of: y = a + bx. e. Find the correlation coefficient. Is the decrease in times significant? f. Find the estimated gold medal time for 1932. Find the estimated time for 1984. g. Why are the answers from part f different from the chart values? h. Does it appear that a line is the best way to fit the data? Why or why not? i. Use the least-squares line to estimate the gold medal time for the next Summer Olympics. Do you think that your answer is reasonable?
Why or why not?

State, # letters in name, Year entered the Union, Rank for entering the Union, Area (square miles): Alabama 7 1819 22 52,423; Colorado 8 1876 38 104,100; Hawaii 6 1959 50 10,932; Iowa 4 1846 29 56,276; Maryland 8 1788 7 12,407; Missouri 8 1821 24 69,709; New Jersey 9 1787 3 8,722; Ohio 4 1803 17 44,828; South Carolina 13 1788 8 32,008; Utah 4 1896 45 84,904; Wisconsin 9 1848 30 65,499 (Table 12.23). We are interested in whether or not the number of letters in a state name depends upon the year the state entered the Union. a. Decide which variable should be the independent variable and which should be the dependent variable. b. Draw a scatter plot of the data. c. Does it appear from inspection that there is a relationship between the variables? Why or why not? d. Calculate the least-squares line. Put the equation in the form of: y = a + bx. e. Find the correlation coefficient. What does it imply about the significance of the relationship? f. Find the estimated number of letters (to the nearest integer) a state would have if it entered the Union in 1900. Find the estimated number of letters a state would have if it entered the Union in 1940. g. Does it appear that a line is the best way to fit the data? Why or why not? h. Use the least-squares line to estimate the number of letters a new state that enters the Union this year would have. Can the least squares line be used to predict it? Why or why not?

The height (sidewalk to roof) of notable tall buildings in America is compared to the number of stories of the building (beginning at street level). Height (in feet), Stories: 1,050 57; 428 28; 362 26; 529 40; 790 60; 401 22; 380 38; 1,454 110; 1,127 100; 700 46 (Table 12.24). a. Using “stories” as the independent variable and “height” as the dependent variable, make a scatter plot of the data. b. Does it appear from inspection that there is a relationship between the variables? c. Calculate the least squares line. Put the equation in the form of: y = a + bx d.
Find the correlation coefficient. Is it significant? e. Find the estimated heights for 32 stories and for 94 stories. f. Based on the data in Table 12.24, is there a linear relationship between the number of stories in tall buildings and the height of the buildings? g. Are there any outliers in the data? If so, which point(s)? h. What is the estimated height of a building with six stories? Does the least squares line give an accurate estimate of height? Explain why or why not. i. Based on the least squares line, adding an extra story is predicted to add about how many feet to a building? j. What is the slope of the least squares (best-fit) line? Interpret the slope.

Ornithologists, scientists who study birds, tag sparrow hawks in 13 different colonies to study their population. They gather data for the percent of new sparrow hawks in each colony and the percent of those that have returned from migration. Percent return: 74; 66; 81; 52; 73; 62; 52; 45; 62; 46; 60; 46; 38. Percent new: 5; 6; 8; 11; 12; 15; 16; 17; 18; 18; 19; 20; 20. a. Enter the data into your calculator and make a scatter plot. b. Use your calculator’s regression function to find the equation of the least-squares regression line. Add this to your scatter plot from part a. c. Explain in words what the slope and y-intercept of the regression line tell us. d. How well does the regression line fit the data? Explain your response. e. Which point has the largest residual? Explain what the residual means in context. Is this point an outlier? An influential point? Explain. f. An ecologist wants to predict how many birds will join another colony of sparrow hawks to which 70% of the adults from the previous year have returned. What is the prediction?

The following table shows data on average per capita coffee consumption and heart disease rate in a random sample of 10 countries.
Yearly coffee consumption in liters: 2.5, 3.9, 2.9, 2.4, 2.9, 0.8, 9.1, 2.7, 0.8, 0.7. Death from heart diseases: 221, 167, 131, 191, 220, 297, 71, 172, 211, 300 (Table 12.25). a. Enter the data into your calculator and make a scatter plot. b. Use your calculator’s regression function to find the equation of the least-squares regression line. Add this to your scatter plot from part a. c. Explain in words what the slope and y-intercept of the regression line tell us. d. How well does the regression line fit the data? Explain your response. e. Which point has the largest residual? Explain what the residual means in context. Is this point an outlier? An influential point? Explain. f. Do the data provide convincing evidence that there is a linear relationship between the amount of coffee consumed and the heart disease death rate? Carry out an appropriate test at a significance level of 0.05 to help answer this question.

The following table consists of one student athlete’s time (in minutes) to swim 2000 yards and the student’s heart rate (beats per minute) after swimming on a random sample of 10 days. Swim Time, Heart Rate: 34.12 144; 35.72 152; 34.72 124; 34.05 140; 34.13 152; 35.73 146; 36.17 128; 35.57 136; 35.37 144; 35.57 148 (Table 12.26). a. Enter the data into your calculator and make a scatter plot. b. Use your calculator’s regression function to find the equation of the least-squares regression line. Add this to your scatter plot from part a. c. Explain in words what the slope and y-intercept of the regression line tell us. d. How well does the regression line fit the data? Explain your response. e. Which point has the largest residual? Explain what the residual means in context. Is this point an outlier? An influential point? Explain.

A researcher is investigating whether population impacts homicide rate. He uses demographic data from Detroit, MI to compare homicide rates and the number of the population that are white males.
Population Size, Homicide rate per 100,000 people: 558,724 8.6; 538,584 8.9; 519,171 8.5; 500,457 8.9; 482,418 13.07; 465,029 14.57; 448,267 21.36; 432,109 28.03; 416,533 31.49; 401,518 37.39; 387,046 46.26; 373,095 47.24; 359,647 52.33 (Table 12.27). a. Use your calculator to construct a scatter plot of the data. What should the independent variable be? Why? b. Use your calculator’s regression function to find the equation of the least-squares regression line. Add this to your scatter plot. c. Discuss what the following mean in context. i. The slope of the regression equation ii. The y-intercept of the regression equation iii. The correlation r iv. The coefficient of determination r². d. Do the data provide convincing evidence that there is a linear relationship between population size and homicide rate? Carry out an appropriate test at a significance level of 0.05 to help answer this question.

School, Mid-Career Salary (in thousands), Yearly Tuition: Princeton 137 28,540; Harvey Mudd 135 40,133; CalTech 127 39,900; US Naval Academy 122 0; West Point 120 0; MIT 118 42,050; Lehigh University 118 43,220; NYU-Poly 117 39,565; Babson College 117 40,400; Stanford 114 54,506 (Table 12.28). Use the data to determine the linear-regression line equation with the outliers removed. Is there a linear correlation for the data set with outliers removed? Justify your answer.

The average number of people in a family that attended college for various years is given in Table 12.29. Year, Number of Family Members Attending College: 1969 4.0; 1973 3.6; 1975 3.2; 1979 3.0; 1983 3.0; 1988 3.0; 1991 2.9 (Table 12.29). a. Using “year” as the independent variable and “Number of Family Members Attending College” as the dependent variable, draw a scatter plot of the data. b. Calculate the least-squares line. Put the equation in the form of: y = a + bx c. Find the correlation coefficient. Is it significant? d. Pick two years between 1969 and 1991 and find the estimated number of family members attending college. e.
Based on the data in Table 12.29, is there a linear relationship between the year and the average number of family members attending college? f. Using the least-squares line, estimate the number of family members attending college for 1960 and 1995. Does the least-squares line give an accurate estimate for those years? Explain why or why not. g. Are there any outliers in the data? h. What is the estimated average number of family members attending college for 1986? Does the least squares line give an accurate estimate for that year? Explain why or why not. i. What is the slope of the least squares (best-fit) line? Interpret the slope.

The percent of female wage and salary workers who are paid hourly rates is given in Table 12.30 for the years 1979 to 1992. Year, Percent of workers paid hourly rates: 1979 61.2; 1980 60.7; 1981 61.3; 1982 61.3; 1983 61.8; 1984 61.7; 1985 61.8; 1986 62; 1987 62.7; 1990 62.8; 1992 62.9 (Table 12.30). a. Using “year” as the independent variable and “percent” as the dependent variable, draw a scatter plot of the data. b. Does it appear from inspection that there is a relationship between the variables? Why or why not? c. Calculate the least-squares line. Put the equation in the form of: y = a + bx d. Find the correlation coefficient. Is it significant? e. Find the estimated percents for 1991 and 1988. f. Based on the data, is there a linear relationship between the year and the percent of female wage and salary earners who are paid hourly rates? g. Are there any outliers in the data? h. What is the estimated percent for the year 2050? Does the least-squares line give an accurate estimate for that year? Explain why or why not. i. What is the slope of the least-squares (best-fit) line? Interpret the slope.

Use the following information to answer the next two exercises. The cost of a leading liquid laundry detergent in different sizes is given in Table 12.31. Size (ounces), Cost ($), Cost per ounce (blank): 16 3.99; 32 4.99; 64 5.99; 200 10.99 (Table 12.31). a.
Using “size” as the independent variable and “cost” as the dependent variable, draw a scatter plot. b. Does it appear from inspection that there is a relationship between the variables? Why or why not? c. Calculate the least-squares line. Put the equation in the form of: y = a + bx d. Find the correlation coefficient. Is it significant? e. If the laundry detergent were sold in a 40-ounce size, find the estimated cost. f. If the laundry detergent were sold in a 90-ounce size, find the estimated cost. g. Does it appear that a line is the best way to fit the data? Why or why not? h. Are there any outliers in the given data? i. Is the least-squares line valid for predicting what a 300-ounce size of the laundry detergent would cost? Why or why not? j. What is the slope of the least-squares (best-fit) line? Interpret the slope.

Use the following information to answer the next two exercises. The cost of a leading liquid laundry detergent in different sizes is given in Table 12.31. Size (ounces), Cost ($), Cost per ounce (blank): 16 3.99; 32 4.99; 64 5.99; 200 10.99 (Table 12.31). a. Complete Table 12.31 for the cost per ounce of the different sizes. b. Using “size” as the independent variable and “cost per ounce” as the dependent variable, draw a scatter plot of the data. c. Does it appear from inspection that there is a relationship between the variables? Why or why not? d. Calculate the least-squares line. Put the equation in the form of: y = a + bx e. Find the correlation coefficient. Is it significant? f. If the laundry detergent were sold in a 40-ounce size, find the estimated cost per ounce. g. If the laundry detergent were sold in a 90-ounce size, find the estimated cost per ounce. h. Does it appear that a line is the best way to fit the data? Why or why not? i. Are there any outliers in the data? j. Is the least-squares line valid for predicting what a 300-ounce size of the laundry detergent would cost per ounce? Why or why not? k.
What is the slope of the least-squares (best-fit) line? Interpret the slope.

According to a flyer by a Prudential Insurance Company representative, the costs of approximate probate fees and taxes for selected net taxable estates are as follows. Net Taxable Estate ($), Approximate Probate Fees and Taxes ($): 600,000 30,000; 750,000 92,500; 1,000,000 203,000; 1,500,000 438,000; 2,000,000 688,000; 2,500,000 1,037,000; 3,000,000 1,350,000 (Table 12.32). a. Decide which variable should be the independent variable and which should be the dependent variable. b. Draw a scatter plot of the data. c. Does it appear from inspection that there is a relationship between the variables? Why or why not? d. Calculate the least-squares line. Put the equation in the form of: y = a + bx. e. Find the correlation coefficient. Is it significant? f. Find the estimated total cost for a net taxable estate of $1,000,000. Find the cost for $2,500,000. g. Does it appear that a line is the best way to fit the data? Why or why not? h. Are there any outliers in the data? i. Based on these results, what would be the probate fees and taxes for an estate that does not have any assets? j. What is the slope of the least-squares (best-fit) line? Interpret the slope.

The following are advertised sale prices of color televisions at Anderson’s. Size (inches), Sale Price ($): 9 147; 20 197; 27 297; 31 447; 35 1177; 40 2177; 60 2497 (Table 12.33). a. Decide which variable should be the independent variable and which should be the dependent variable. b. Draw a scatter plot of the data. c. Does it appear from inspection that there is a relationship between the variables? Why or why not? d. Calculate the least-squares line. Put the equation in the form of: y = a + bx e. Find the correlation coefficient. Is it significant? f. Find the estimated sale price for a 32 inch television. Find the cost for a 50 inch television. g. Does it appear that a line is the best way to fit the data? Why or why not? h. Are there any outliers in the data? i.
What is the slope of the least-squares (best-fit) line? Interpret the slope.

Table 12.34 shows the average heights for American boys in 1990. Age (years), Height (cm): birth 50.8; 2 83.8; 3 91.4; 5 106.6; 7 119.3; 10 137.1; 14 157.5 (Table 12.34). a. Decide which variable should be the independent variable and which should be the dependent variable. b. Draw a scatter plot of the data. c. Does it appear from inspection that there is a relationship between the variables? Why or why not? d. Calculate the least-squares line. Put the equation in the form of: y = a + bx e. Find the correlation coefficient. Is it significant? f. Find the estimated average height for a one-year-old. Find the estimated average height for an eleven-year-old. g. Does it appear that a line is the best way to fit the data? Why or why not? h. Are there any outliers in the data? i. Use the least squares line to estimate the average height for a sixty-two-year-old man. Do you think that your answer is reasonable? Why or why not? j. What is the slope of the least-squares (best-fit) line? Interpret the slope.

State, # letters in name, Year entered the Union, Rank for entering the Union, Area (square miles): Alabama 7 1819 22 52,423; Colorado 8 1876 38 104,100; Hawaii 6 1959 50 10,932; Iowa 4 1846 29 56,276; Maryland 8 1788 7 12,407; Missouri 8 1821 24 69,709; New Jersey 9 1787 3 8,722; Ohio 4 1803 17 44,828; South Carolina 13 1788 8 32,008; Utah 4 1896 45 84,904; Wisconsin 9 1848 30 65,499. We are interested in whether there is a relationship between the ranking of a state and the area of the state. a. What are the independent and dependent variables? b. What do you think the scatter plot will look like? Make a scatter plot of the data. c. Does it appear from inspection that there is a relationship between the variables? Why or why not? d. Calculate the least-squares line. Put the equation in the form of: y = a + bx e. Find the correlation coefficient. What does it imply about the significance of the relationship? f.
Find the estimated areas for Alabama and for Colorado. Are they close to the actual areas? g. Use the two points in part f to plot the least-squares line on your graph from part b. h. Does it appear that a line is the best way to fit the data? Why or why not? i. Are there any outliers? j. Use the least squares line to estimate the area of a new state that enters the Union. Can the least-squares line be used to predict it? Why or why not? k. Delete “Hawaii” and substitute “Alaska” for it. Alaska is the forty-ninth state, with an area of 656,424 square miles. l. Calculate the new least-squares line. m. Find the estimated area for Alabama. Is it closer to the actual area with this new least-squares line or with the previous one that included Hawaii? Why do you think that’s the case? n. Do you think that, in general, newer states are larger than the original states?

As part of an experiment to see how different types of soil cover would affect slicing tomato production, Marist College students grew tomato plants under different soil cover conditions. Groups of three plants each had one of the following treatments: bare soil, a commercial ground cover, black plastic, straw, compost. All plants grew under the same conditions and were the same variety. Students recorded the weight (in grams) of tomatoes produced by each of the n = 15 plants. Bare (n1 = 3), Ground Cover (n2 = 3), Plastic (n3 = 3), Straw (n4 = 3), Compost (n5 = 3): 2,625 5,348 65,837 7,285 6,277; 2,997 5,682 8,560 6,897 7,818; 4,915 5,482 3,830 9,230 8,677 (Table 13.4). Create the one-way ANOVA table.

MRSA, or Staphylococcus aureus, can cause serious bacterial infections in hospital patients. Table 13.6 shows various colony counts from different patients who may or may not have MRSA. The data from the table is plotted in Figure 13.5.
Conc = 0.6, Conc = 0.8, Conc = 1.0, Conc = 1.2, Conc = 1.4: 9 16 22 30 27; 66 93 147 199 168; 98 82 120 148 132 (Table 13.6). Plot of the data for the different concentrations: Figure 13.5. Test whether the mean numbers of colonies are the same or are different. Construct the ANOVA table (by hand or by using a TI-83, 83+, or 84+ calculator), find the p-value, and state your conclusion. Use a 5% significance level.

Four sports teams took a random sample of players regarding their GPAs for the last year. The results are shown in Table 13.8. Basketball, Baseball, Hockey, Lacrosse: 3.6 2.1 4.0 2.0; 2.9 2.6 2.0 3.6; 2.5 3.9 2.6 3.9; 3.3 3.1 3.2 2.7 (Table 13.8, GPAs for four sports teams). Use a significance level of 5%, and determine if there is a difference in GPA among the teams.

Another fourth grader also grew bean plants, but this time in a jelly-like mass. The heights were (in inches) 24, 28, 25, 30, and 32. Do a one-way ANOVA test on the four groups. Are the heights of the bean plants different? Use the same method as shown in Example 13.4.

The New York Choral Society divides male singers up into four categories from highest voices to lowest: Tenor1, Tenor2, Bass1, Bass2. In the table are heights of the men in the Tenor1 and Bass2 groups. One suspects that taller men will have lower voices, and that the variance of height may go up with the lower voices as well. Do we have good evidence that the variances of the heights of singers in these two groups (Tenor1 and Bass2) are different? Tenor1 Bass2 Tenor 1 Bass 2 Tenor 1 Bass 2 69 72 67 72 68 67 72 75 70 74 67 70 71 67 65 70 64 70 66 75 72 66 69 76 74 70 68 72 74 72 68 75 71 71 72 64 68 74 66 74 73 70 75 68 72 66 72 (Table 13.11).

Use the following information to answer the next five exercises. There are five basic assumptions that must be fulfilled in order to perform a one-way ANOVA test. What are they? Write one assumption.

Use the following information to answer the next five exercises.
There are five basic assumptions that must be fulfilled in order to perform a one-way ANOVA test. What are they? Write another assumption.
One-Way ANOVA Use the following information to answer the next five exercises. There are five basic assumptions that must be fulfilled in order to perform a one-way ANOVA test. What are they? Write a third assumption.
One-Way ANOVA Use the following information to answer the next five exercises. There are five basic assumptions that must be fulfilled in order to perform a one-way ANOVA test. What are they? Write a fourth assumption.
One-Way ANOVA Use the following information to answer the next five exercises. There are five basic assumptions that must be fulfilled in order to perform a one-way ANOVA test. What are they? Write the final assumption.
State the null hypothesis for a one-way ANOVA test if there are four groups.
State the alternative hypothesis for a one-way ANOVA test if there are three groups.
When do you use an ANOVA test?
The F Distribution and the F-Ratio Use the following information to answer the next eight exercises. Groups of men from three different areas of the country are to be tested for mean weight. The entries in Table 13.13 are the weights for the different groups. Group 1 Group 2 Group 3 216 202 170 198 213 165 240 284 182 187 228 197 176 210 201 Table 13.13 What is the Sum of Squares Factor?
The F Distribution and the F-Ratio Use the following information to answer the next eight exercises. Groups of men from three different areas of the country are to be tested for mean weight. The entries in Table 13.13 are the weights for the different groups. Group 1 Group 2 Group 3 216 202 170 198 213 165 240 284 182 187 228 197 176 210 201 Table 13.13 What is the Sum of Squares Error?
The F Distribution and the F-Ratio Use the following information to answer the next eight exercises. Groups of men from three different areas of the country are to be tested for mean weight. The entries in Table 13.13 are the weights for the different groups.
Group 1 Group 2 Group 3 216 202 170 198 213 165 240 284 182 187 228 197 176 210 201 Table 13.13 What is the df for the numerator?The F Distribution and the F-Ratio Use the following information to answer the next eight exercises. Groups of men from three different areas of the country are to be tested for mean weight. The entries in Table 13.13 are the weights for the different groups. Group 1 Group 2 Group 3 216 202 170 198 213 165 240 284 182 187 228 197 176 210 201 Table 13.13 What is the df for the denominator?The F Distribution and the F-Ratio Use the following information to answer the next eight exercises. Groups of men from three different areas of the country are to be tested for mean weight. The entries in Table 13.13 are the weights for the different groups. Group 1 Group 2 Group 3 216 202 170 198 213 165 240 284 182 187 228 197 176 210 201 Table 13.13 What is the Mean Square Factor?The F Distribution and the F-Ratio Use the following information to answer the next eight exercises. Groups of men from three different areas of the country are to be tested for mean weight. The entries in Table 13.13 are the weights for the different groups. Group 1 Group 2 Group 3 216 202 170 198 213 165 240 284 182 187 228 197 176 210 201 Table 13.13 What is the Mean Square Error?The F Distribution and the F-Ratio Use the following information to answer the next eight exercises. Groups of men from three different areas of the country are to be tested for mean weight. The entries in Table 13.13 are the weights for the different groups. Group 1 Group 2 Group 3 216 202 170 198 213 165 240 284 182 187 228 197 176 210 201 Table 13.13 What is the F statistic?Use the following information to answer the next eight exercises. Girls from four different soccer teams are to be tested for mean goals scored per game. The entries in the table are the goals per game for the different teams. The one-way ANOVA results are shown in Table 13.14. 
Team 1 Team 2 Team 3 Team 4 1 2 0 3 2 3 1 4 0 2 1 4 3 4 0 3 2 4 0 2 Table 13.14 What is SSbetween?Use the following information to answer the next eight exercises. Girls from four different soccer teams are to be tested for mean goals scored per game. The entries in the table are the goals per game for the different teams. The one-way ANOVA results are shown in Table 13.14. Team 1 Team 2 Team 3 Team 4 1 2 0 3 2 3 1 4 0 2 1 4 3 4 0 3 2 4 0 2 Table 13.14 What is the df for the numerator?Use the following information to answer the next eight exercises. Girls from four different soccer teams are to be tested for mean goals scored per game. The entries in the table are the goals per game for the different teams. The one-way ANOVA results are shown in Table 13.14. Team 1 Team 2 Team 3 Team 4 1 2 0 3 2 3 1 4 0 2 1 4 3 4 0 3 2 4 0 2 Table 13.14 What is MSbetween?Use the following information to answer the next eight exercises. Girls from four different soccer teams are to be tested for mean goals scored per game. The entries in the table are the goals per game for the different teams. The one-way ANOVA results are shown in Table 13.14. Team 1 Team 2 Team 3 Team 4 1 2 0 3 2 3 1 4 0 2 1 4 3 4 0 3 2 4 0 2 Table 13.14 What is SSwithin?Use the following information to answer the next eight exercises. Girls from four different soccer teams are to be tested for mean goals scored per game. The entries in the table are the goals per game for the different teams. The one-way ANOVA results are shown in Table 13.14. Team 1 Team 2 Team 3 Team 4 1 2 0 3 2 3 1 4 0 2 1 4 3 4 0 3 2 4 0 2 Table 13.14 What is the df for the denominator?Use the following information to answer the next eight exercises. Girls from four different soccer teams are to be tested for mean goals scored per game. The entries in the table are the goals per game for the different teams. The one-way ANOVA results are shown in Table 13.14. 
Team 1 Team 2 Team 3 Team 4 1 2 0 3 2 3 1 4 0 2 1 4 3 4 0 3 2 4 0 2 Table 13.14 What is MSwithin?
Use the following information to answer the next eight exercises. Girls from four different soccer teams are to be tested for mean goals scored per game. The entries in the table are the goals per game for the different teams. The one-way ANOVA results are shown in Table 13.14. Team 1 Team 2 Team 3 Team 4 1 2 0 3 2 3 1 4 0 2 1 4 3 4 0 3 2 4 0 2 Table 13.14 What is the F statistic?
Use the following information to answer the next eight exercises. Girls from four different soccer teams are to be tested for mean goals scored per game. The entries in the table are the goals per game for the different teams. The one-way ANOVA results are shown in Table 13.14. Team 1 Team 2 Team 3 Team 4 1 2 0 3 2 3 1 4 0 2 1 4 3 4 0 3 2 4 0 2 Table 13.14 Judging by the F statistic, do you think it is likely or unlikely that you will reject the null hypothesis?
Use the following information to answer the next eight exercises. Girls from four different soccer teams are to be tested for mean goals scored per game. The entries in the table are the goals per game for the different teams. The one-way ANOVA results are shown in Table 13.14. Team 1 Team 2 Team 3 Team 4 1 2 0 3 2 3 1 4 0 2 1 4 3 4 0 3 2 4 0 2 Table 13.14 An F statistic can have what values?
What happens to the curves as the degrees of freedom for the numerator and the denominator get larger?
Use the following information to answer the next seven exercises. Four basketball teams took a random sample of players regarding how high each player can jump (in inches). The results are shown in Table 13.15.
Team 1 Team 2 Team 3 Team 4 Team 5 36 32 48 38 41 42 35 50 44 39 51 38 39 46 40 Table 13.15 What is the df(num)?
What is the df(denom)?
What are the Sum of Squares and Mean Squares Factors?
What are the Sum of Squares and Mean Squares Errors?
What is the F statistic?
What is the p-value?
At the 5% significance level, is there a difference in the mean jump heights among the teams?
Use the following information to answer the next seven exercises. A video game developer is testing a new game on three different groups. Each group represents a different target market for the game. The developer collects scores from a random sample from each group. The results are shown in Table 13.16. Group A Group B Group C 101 151 101 108 149 109 98 160 198 107 112 186 111 126 160 Table 13.16 What is the df(num)?
Use the following information to answer the next seven exercises. A video game developer is testing a new game on three different groups. Each group represents a different target market for the game. The developer collects scores from a random sample from each group. The results are shown in Table 13.16. Group A Group B Group C 101 151 101 108 149 109 98 160 198 107 112 186 111 126 160 Table 13.16 What is the df(denom)?
Use the following information to answer the next seven exercises. A video game developer is testing a new game on three different groups. Each group represents a different target market for the game. The developer collects scores from a random sample from each group. The results are shown in Table 13.16. Group A Group B Group C 101 151 101 108 149 109 98 160 198 107 112 186 111 126 160 Table 13.16 What are the SSbetween and MSbetween?
Use the following information to answer the next seven exercises. A video game developer is testing a new game on three different groups. Each group represents a different target market for the game. The developer collects scores from a random sample from each group.
The results are shown in Table 13.16 Group A Group B Group C 101 151 101 108 149 109 98 160 198 107 112 186 111 126 160 Table 13.16 What are the SSwithin and MSwithin?Use the following information to answer the next seven exercises. A video game developer is testing a new game on three different groups. Each group represents a different target market for the game. The developer collects scores from a random sample from each group. The results are shown in Table 13.16 Group A Group B Group C 101 151 101 108 149 109 98 160 198 107 112 186 111 126 160 Table 13.16 What is the F Statistic?Use the following information to answer the next seven exercises. A video game developer is testing a new game on three different groups. Each group represents a different target market for the game. The developer collects scores from a random sample from each group. The results are shown in Table 13.16 Group A Group B Group C 101 151 101 108 149 109 98 160 198 107 112 186 111 126 160 Table 13.16 What is the p-value?Use the following information to answer the next seven exercises. A video game developer is testing a new game on three different groups. Each group represents a different target market for the game. The developer collects scores from a random sample from each group. The results are shown in Table 13.16 Group A Group B Group C 101 151 101 108 149 109 98 160 198 107 112 186 111 126 160 Table 13.16 At the 10% significance level, are the scores among the different groups different?Use the following information to answer the next three exercises. Suppose a group is interested in determining whether teenagers obtain their drivers licenses at approximately the same average age across the country. Suppose that the following data are randomly collected from five teenagers in each region of the country. The numbers represent the age at which teenagers obtained their drivers licenses. 
Northeast South West Central East 16.3 16.9 16.4 16.2 17.1 16.1 16.5 16.5 16.6 17.2 16.4 16.4 16.6 16.5 16.6 16.5 16.2 16.1 16.4 16.8 x̄ = ----- ----- ----- ----- s² = ----- ----- ----- ----- Table 13.17 Enter the data into your calculator or computer. p-value = ______ State the decisions and conclusions (in complete sentences) for the following preconceived levels of α.
Use the following information to answer the next three exercises. Suppose a group is interested in determining whether teenagers obtain their driver's licenses at approximately the same average age across the country. Suppose that the following data are randomly collected from five teenagers in each region of the country. The numbers represent the age at which teenagers obtained their driver's licenses. Northeast South West Central East 16.3 16.9 16.4 16.2 17.1 16.1 16.5 16.5 16.6 17.2 16.4 16.4 16.6 16.5 16.6 16.5 16.2 16.1 16.4 16.8 x̄ = ----- ----- ----- ----- s² = ----- ----- ----- ----- Table 13.17 Enter the data into your calculator or computer. α = 0.05 a. Decision: ____________________________ b. Conclusion: ____________________________
Use the following information to answer the next three exercises. Suppose a group is interested in determining whether teenagers obtain their driver's licenses at approximately the same average age across the country. Suppose that the following data are randomly collected from five teenagers in each region of the country. The numbers represent the age at which teenagers obtained their driver's licenses. Northeast South West Central East 16.3 16.9 16.4 16.2 17.1 16.1 16.5 16.5 16.6 17.2 16.4 16.4 16.6 16.5 16.6 16.5 16.2 16.1 16.4 16.8 x̄ = ----- ----- ----- ----- s² = ----- ----- ----- ----- Table 13.17 Enter the data into your calculator or computer. α = 0.01 a. Decision: ____________________________ b. Conclusion: ____________________________
Test of Two Variances Use the following information to answer the next two exercises.
There are two assumptions that must be true in order to perform an F test of two variances. Name one assumption that must be true.Test of Two Variances Use the following information to answer the next two exercises. There are two assumptions that must be true in order to perform an F test of two variances. What is the other assumption that must be true?Use the following information to answer the next five exercises. Two coworkers commute from the same building. They are interested in whether or not there is any variation in the time it takes them to drive to work. They each record their times for 20 commutes. The first worker’s times have a variance of 12.1. The second worker’s times have a variance of 16.9. The first worker thinks that he is more consistent with his commute times. Test the claim at the 10% level. Assume that commute times are normally distributed. State the null and alternative hypotheses.Use the following information to answer the next five exercises. Two coworkers commute from the same building. They are interested in whether or not there is any variation in the time it takes them to drive to work. They each record their times for 20 commutes. The first worker’s times have a variance of 12.1. The second worker’s times have a variance of 16.9. The first worker thinks that he is more consistent with his commute times. Test the claim at the 10% level. Assume that commute times are normally distributed. What is s1 in this problem?Use the following information to answer the next five exercises. Two coworkers commute from the same building. They are interested in whether or not there is any variation in the time it takes them to drive to work. They each record their times for 20 commutes. The first worker’s times have a variance of 12.1. The second worker’s times have a variance of 16.9. The first worker thinks that he is more consistent with his commute times. Test the claim at the 10% level. Assume that commute times are normally distributed. 
What is s2 in this problem?Use the following information to answer the next five exercises. Two coworkers commute from the same building. They are interested in whether or not there is any variation in the time it takes them to drive to work. They each record their times for 20 commutes. The first worker’s times have a variance of 12.1. The second worker’s times have a variance of 16.9. The first worker thinks that he is more consistent with his commute times. Test the claim at the 10% level. Assume that commute times are normally distributed. What is n?Use the following information to answer the next five exercises. Two coworkers commute from the same building. They are interested in whether or not there is any variation in the time it takes them to drive to work. They each record their times for 20 commutes. The first worker’s times have a variance of 12.1. The second worker’s times have a variance of 16.9. The first worker thinks that he is more consistent with his commute times. Test the claim at the 10% level. Assume that commute times are normally distributed. What is the F statistic?What is the p-value?Is the claim accurate?Use the following information to answer the next four exercises. Two students are interested in whether or not there is variation in their test scores for math class. There are 15 total math tests they have taken so far. The first student’s grades have a standard deviation of 38.1. The second student’s grades have a standard deviation of 22.5. The second student thinks his scores are more consistent. State the null and alternative hypotheses.Use the following information to answer the next four exercises. Two students are interested in whether or not there is variation in their test scores for math class. There are 15 total math tests they have taken so far. The first student’s grades have a standard deviation of 38.1. The second student’s grades have a standard deviation of 22.5. The second student thinks his scores are more consistent. 
What is the F Statistic?Use the following information to answer the next four exercises. Two students are interested in whether or not there is variation in their test scores for math class. There are 15 total math tests they have taken so far. The first student’s grades have a standard deviation of 38.1. The second student’s grades have a standard deviation of 22.5. The second student thinks his scores are more consistent. What is the p-value?Use the following information to answer the next four exercises. Two students are interested in whether or not there is variation in their test scores for math class. There are 15 total math tests they have taken so far. The first student’s grades have a standard deviation of 38.1. The second student’s grades have a standard deviation of 22.5. The second student thinks his scores are more consistent. At the 5% significance level, do we reject the null hypothesis?Use the following information to answer the next three exercises. Two cyclists are comparing the variances of their overall paces going uphill. Each cyclist records his or her speeds going up 35 hills. The first cyclist has a variance of 23.8 and the second cyclist has a variance of 32.1. The cyclists want to see if their variances are the same or different. Assume that commute times are normally distributed. State the null and alternative hypotheses.Use the following information to answer the next three exercises. Two cyclists are comparing the variances of their overall paces going uphill. Each cyclist records his or her speeds going up 35 hills. The first cyclist has a variance of 23.8 and the second cyclist has a variance of 32.1. The cyclists want to see if their variances are the same or different. Assume that commute times are normally distributed. What is the F Statistic?Use the following information to answer the next three exercises. Two cyclists are comparing the variances of their overall paces going uphill. 
Each cyclist records his or her speeds going up 35 hills. The first cyclist has a variance of 23.8 and the second cyclist has a variance of 32.1. The cyclists want to see if their variances are the same or different. Assume that commute times are normally distributed. At the 5% significance level, what can we say about the cyclists’ variances?
Three different traffic routes are tested for mean driving time. The entries in Table 13.18 are the driving times in minutes on the three different routes. Route 1 Route 2 Route 3 30 27 16 32 29 41 27 28 22 35 36 31 Table 13.18 State SSbetween, SSwithin, and the F statistic.
Suppose a group is interested in determining whether teenagers obtain their driver's licenses at approximately the same average age across the country. Suppose that the following data are randomly collected from five teenagers in each region of the country. The numbers represent the age at which teenagers obtained their driver's licenses. Northeast South West Central East 16.3 16.9 16.4 16.2 17.1 16.1 16.5 16.5 16.6 17.2 16.4 16.4 16.6 16.5 16.6 16.5 16.2 16.1 16.4 16.8 x̄ = ----- ----- ----- ----- s² = ----- ----- ----- ----- Table 13.19 State the hypotheses. H0: ____________ Ha: ____________
The F Distribution and the F-Ratio Use the following information to answer the next three exercises. Suppose a group is interested in determining whether teenagers obtain their driver's licenses at approximately the same average age across the country. Suppose that the following data are randomly collected from five teenagers in each region of the country. The numbers represent the age at which teenagers obtained their driver's licenses. Northeast South West Central East 16.3 16.9 16.4 16.2 17.1 16.1 16.5 16.5 16.6 17.2 16.4 16.4 16.6 16.5 16.6 16.5 16.2 16.1 16.4 16.8 x̄ = ----- ----- ----- ----- s² = ----- ----- ----- ----- Table 13.20 H0: µ1 = µ2 = µ3 = µ4 = µ5 Ha: At least any two of the group means µ1, µ2, …, µ5 are not equal.
degrees of freedom numerator: df(num) = _________
The F Distribution and the F-Ratio Use the following information to answer the next three exercises. Suppose a group is interested in determining whether teenagers obtain their driver's licenses at approximately the same average age across the country. Suppose that the following data are randomly collected from five teenagers in each region of the country. The numbers represent the age at which teenagers obtained their driver's licenses. Northeast South West Central East 16.3 16.9 16.4 16.2 17.1 16.1 16.5 16.5 16.6 17.2 16.4 16.4 16.6 16.5 16.6 16.5 16.2 16.1 16.4 16.8 x̄ = ----- ----- ----- ----- s² = ----- ----- ----- ----- Table 13.20 H0: µ1 = µ2 = µ3 = µ4 = µ5 Ha: At least any two of the group means µ1, µ2, …, µ5 are not equal. degrees of freedom denominator: df(denom) = ________
The F Distribution and the F-Ratio Use the following information to answer the next three exercises. Suppose a group is interested in determining whether teenagers obtain their driver's licenses at approximately the same average age across the country. Suppose that the following data are randomly collected from five teenagers in each region of the country. The numbers represent the age at which teenagers obtained their driver's licenses. Northeast South West Central East 16.3 16.9 16.4 16.2 17.1 16.1 16.5 16.5 16.6 17.2 16.4 16.4 16.6 16.5 16.6 16.5 16.2 16.1 16.4 16.8 x̄ = ----- ----- ----- ----- s² = ----- ----- ----- ----- Table 13.20 H0: µ1 = µ2 = µ3 = µ4 = µ5 Ha: At least any two of the group means µ1, µ2, …, µ5 are not equal. F statistic = ________
Three students, Linda, Tuan, and Javier, are given five laboratory rats each for a nutritional experiment. Each rat's weight is recorded in grams. Linda feeds her rats Formula A, Tuan feeds his rats Formula B, and Javier feeds his rats Formula C. At the end of a specified time period, each rat is weighed again, and the net gain in grams is recorded.
Using a significance level of 10%, test the hypothesis that the three formulas produce the same mean weight gain. Linda's rats Tuan's rats Javier's rats 43.5 47.0 51.2 39.4 40.5 40.9 41.3 38.9 37.9 46.0 46.3 45.0 38.2 44.2 48.6 Table 13.21 Weights of Student Lab RatsA grassroots group opposed to a proposed increase in the gas tax claimed that the increase would hurt working-class people the most, since they commute the farthest to work. Suppose that the group randomly surveyed 24 individuals and asked them their daily one-way commuting mileage. The results are in Table 13.22. Using a 5% significance level, test the hypothesis that the three mean commuting mileages are the same. working-class professional (middle incomes) professional (wealthy) 17.8 16.5 8.5 26.7 17.4 6.3 49.4 22.0 4.6 9.4 7.4 12.6 65.4 9.4 11.0 47.1 2.1 28.6 19.5 6.4 15.4 51.2 13.9 9.3 Table 13.22Use the following information to answer the next two exercises. Table 13.23 lists the number of pages in four different types of magazines. home decorating news health computer 172 87 82 104 286 94 153 136 163 123 87 98 205 106 103 207 197 101 96 146 Table 13.23 Using a significance level of 5%, test the hypothesis that the four magazine types have the same mean length.Use the following information to answer the next two exercises. Table 13.23 lists the number of pages in four different types of magazines. home decorating news health computer 172 87 82 104 286 94 153 136 163 123 87 98 205 106 103 207 197 101 96 146 Table 13.23 Eliminate one magazine type that you now feel has a mean length different from the others. Redo the hypothesis test, testing that the remaining three means are statistically the same. Use a new solution sheet. Based on this test, are the mean lengths for the remaining three magazines statistically the same?A researcher wants to know if the mean times (in minutes) that people watch their favorite news station are the same. Suppose that Table 13.24 shows the results of a study. 
CNN FOX Local 45 15 72 12 43 37 18 68 56 38 50 60 23 31 51 Table 13.24 Assume that all distributions are normal, the four population standard deviations are approximately the same, and the data were collected independently and randomly. Use a level of significance of 0.05.Are the means for the final exams the same for all statistics class delivery types? Table 13.25 shows the scores on final exams from several randomly selected classes that used the different delivery types. Online Hybrid Face-to-Face 72 83 80 84 73 78 77 84 84 80 81 81 81 86 79 82 Table 13.25 Assume that all distributions are normal, the four population standard deviations are approximately the same, and the data were collected independently and randomly. Use a level of significance of 0.05.Are the mean number of times a month a person eats out the same for whites, blacks, Hispanics and Asians? Suppose that Table 13.26 shows the results of a study. White Black Hispanic Asian 6 4 7 8 8 1 3 3 2 5 5 5 4 2 4 1 6 6 7 Table 13.26 Assume that all distributions are normal, the four population standard deviations are approximately the same, and the data were collected independently and randomly. Use a level of significance of 0.05.Are the mean numbers of daily visitors to a ski resort the same for the three types of snow conditions? Suppose that Table 13.27 shows the results of a study. Powder Machine Made Hard Packed 1,210 2,107 2,846 1,080 1,149 1,638 1,537 862 2,019 941 1,870 1,178 1,528 2,233 1,382 Table 13.27 Assume that all distributions are normal, the four population standard deviations are approximately the same, and the data were collected independently and randomly. Use a level of significance of 0.05.Sanjay made identical paper airplanes out of three different weights of paper, light, medium and heavy. He made four airplanes from each of the weights, and launched them himself across the room. Here are the distances (in meters) that his planes flew. 
Paper Type/Trial Trial 1 Trial 2 Trial 3 Trial 4 Heavy 5.1 meters 3.1 meters 4.7 meters 5.3 meters Medium 4 meters 3.5 meters 4.5 meters 6.1 meters Light 3.1 meters 3.3 meters 2.1 meters 1.9 meters Table 13.28 Figure 13.8 a. Take a look at the data in the graph. Look at the spread of data for each group (light, medium, heavy). Does it seem reasonable to assume a normal distribution with the same variance for each group? Yes or No. b. Why is this a balanced design? c. Calculate the sample mean and sample standard deviation for each group. d. Does the weight of the paper have an effect on how far the plane will travel? Use a 1% level of significance. Complete the test using the method shown in the bean plant example in Example 13.4. variance of the group means __________ MSbetween= ___________ mean of the three sample variances ___________ MSwithin = _____________ F statistic = ____________ df(num) = __________, df(denom) = ___________ number of groups _______ number of observations _______ p-value = __________ (P(F > _______) = __________) Graph the p-value. decision: _______________________ conclusion: _______________________________________________________________DDT is a pesticide that has been banned from use in the United States and most other areas of the world. It is quite effective, but persisted in the environment and over time became seen as harmful to higher-level organisms. Famously, egg shells of eagles and other raptors were believed to be thinner and prone to breakage in the nest because of ingestion of DDT in the food chain of the birds. An experiment was conducted on the number of eggs (fecundity) laid by female fruit flies. There are three groups of flies. One group was bred to be resistant to DDT (the RS group). Another was bred to be especially susceptible to DDT (SS). Finally there was a control line of non-selected or typical fruitflies (NS). 
Here are the data: RS SS NS RS SS NS 12.8 38.4 35.4 22.4 23.1 22.6 21.6 32.9 27.4 27.5 29.4 40.4 14.8 48.5 19.3 20.3 16 34.4 23.1 20.9 41.8 38.7 20.1 30.4 34.6 11.6 20.3 26.4 23.3 14.9 19.7 22.3 37.6 23.7 22.9 51.8 22.6 30.2 36.9 26.1 22.5 33.8 29.6 33.4 37.3 29.5 15.1 37.9 16.4 26.7 28.2 38.6 31 29.5 20.3 39 23.4 44.4 160.9 42.4 29.3 12.8 33.7 23.2 16.1 36.6 14.9 14.6 29.2 23.8 10.8 47.4 27.3 12.2 41.7 Table 13.29 The values are the average number of eggs laid daily for each of 75 flies (25 in each group) over the first 14 days of their lives. Using a 1% level of significance, are the mean rates of egg selection for the three strains of fruit fly different? If so, in what way? Specifically, the researchers were interested in whether or not the selectively bred strains were different from the nonselected line, and whether the two selected lines were different from each other. Here is a chart of the three groups: Figure
The data shown are the recorded body temperatures of 130 subjects as estimated from available histograms. Traditionally we are taught that the normal human body temperature is 98.6 F. This is not quite correct for everyone. Are the mean temperatures among the four groups different? Calculate 95% confidence intervals for the mean body temperature in each group and comment about the confidence intervals.
FL FH ML MH FL FH ML MH 96.4 96.8 96.3 96.9 98.4 98.6 98.1 98.6 96.7 97.7 96.7 97 98.7 98.6 98.1 98.6 97.2 97.8 97.1 97.1 98.7 98.6 98.2 98.7 97.2 97.9 97.2 97.1 98.7 98.7 98.2 98.8 97.4 98 97.3 97.4 98.7 98.7 98.2 98.8 97.6 98 97.4 97.5 98.8 98.8 98.2 98.8 97.7 98 97.4 97.6 98.8 98.8 98.3 98.9 97.8 98 97.4 97.7 98.8 98.8 98.4 99 97.8 98.1 97.5 97.8 98.8 98.9 98.4 99 97.9 98.3 97.6 97.9 99.2 99 98.5 99 97.9 98.3 97.6 98 993 99 98.5 99.2 98 98.3 97.8 98 99.1 98.6 99.5 98.2 98.4 97.8 98 99.1 98.6 98.2 98.4 97.8 98.3 99.2 98.7 98.2 98.4 97.9 98.4 99.4 99.1 98.2 98.4 98 98.4 99.9 99.3 98.2 98.5 98 98.6 100 99.4 Table 13.30Three students, Linda, Tuan, and Javier, are given five laboratory rats each for a nutritional experiment. Each rat’s weight is recorded in grams. Linda feeds her rats Formula A, Tuan feeds his rats Formula B, and Javier feeds his rats Formula C. At the end of a specified time period, each rat is weighed again and the net gain in grams is recorded. Linda's rats Tuan's rats Javier's rats 43.5 47.0 51.2 39.4 40.5 40.9 41.3 38.9 37.9 46.0 46.3 45.0 38.2 44.2 48.6 Table 13.31 Determine whether or not the variance in weight gain is statistically the same among Javier’s and Linda’s rats. Test at a significance level of 10%.A grassroots group opposed to a proposed increase in the gas tax claimed that the increase would hurt working-class people the most, since they commute the farthest to work. Suppose that the group randomly surveyed 24 individuals and asked them their daily one-way commuting mileage. The results are as follows. working-class professional (middle incomes) professional (wealthy) 17.8 16.5 8.5 26.7 17.4 6.3 49.4 22.0 4.6 9.4 7.4 12.6 65.4 9.4 11.0 47.1 2.1 28.6 19.5 6.4 15.4 51.2 13.9 9.3 Table 13.32 Determine whether or not the variance in mileage driven is statistically the same among the working class and professional (middle income) groups. 
Use a 5% significance level.

Which two magazine types do you think have the same variance in length?

Which two magazine types do you think have different variances in length?

Is the variance for the amount of money, in dollars, that shoppers spend on Saturdays at the mall the same as the variance for the amount of money that shoppers spend on Sundays at the mall? Suppose that Table 13.33 shows the results of a study.

Saturday Sunday Saturday Sunday 75 44 62 137 18 58 0 82 150 61 124 39 94 19 50 127 62 99 31 141 73 60 118 73 89
Table 13.33

Are the variances for incomes on the East Coast and the West Coast the same? Suppose that Table 13.34 shows the results of a study. Income is shown in thousands of dollars. Assume that both distributions are normal. Use a level of significance of 0.05.

East West 38 71 47 126 30 42 82 51 75 44 52 90 115 88 67
Table 13.34

Thirty men in college were taught a method of finger tapping. They were randomly assigned to three groups of ten, with each receiving one of three doses of caffeine: 0 mg, 100 mg, 200 mg. This is approximately the amount in no, one, or two cups of coffee. Two hours after ingesting the caffeine, the men had the rate of finger tapping per minute recorded. The experiment was double blind, so neither the recorders nor the students knew which group they were in. Does caffeine affect the rate of tapping, and if so how? Here are the data:

0 mg   100 mg   200 mg   0 mg   100 mg   200 mg
242    248      246      245    246      248
244    245      250      248    247      252
247    248      248      248    250      250
242    247      246      244    246      248
246    243      245      242    244      250
Table 13.35

King Manuel I, Komnenus ruled the Byzantine Empire from Constantinople (Istanbul) during the years 1145 to 1180 A.D. The empire was very powerful during his reign, but declined significantly afterwards. Coins minted during his era were found in Cyprus, an island in the eastern Mediterranean Sea. Nine coins were from his first coinage, seven from the second, four from the third, and seven from a fourth.
These spanned most of his reign. We have data on the silver content of the coins:

First Coinage   Second Coinage   Third Coinage   Fourth Coinage
5.59            6.9              4.9             5.3
6.8             9                5.5             5.6
6.4             6.6              4.6             5.5
7               8.1              4.5             5.1
6.6             9.3                              6.2
7.7             9.2                              5.8
6.9             8.6                              5.8
6.2
Table 13.36

Did the silver content of the coins change over the course of Manuel’s reign? Here are the means and variances of each coinage. The data are unbalanced.

           First    Second   Third    Fourth
Mean       6.7444   8.2429   4.875    5.6143
Variance   0.2953   1.2095   0.2025   0.1314
Table 13.37

The American League and the National League of Major League Baseball are each divided into three divisions: East, Central, and West. Many years, fans talk about some divisions being stronger (having better teams) than other divisions. This may have consequences for the post season. For instance, in 2012 Tampa Bay won 90 games and did not play in the post season, while Detroit won only 88 and did play in the post season. This may have been an oddity, but is there good evidence that in the 2012 season, the American League divisions were significantly different in overall records? Use the following data to test whether the mean number of wins per team in the three American League divisions were the same or not. Note that the data are not balanced, as two divisions had five teams, while one had only four.

Division   Team          Wins
East       Yankees       95
East       Baltimore     93
East       Tampa Bay     90
East       Toronto       73
East       Boston        69
Table 13.38

Division   Team          Wins
Central    Detroit       88
Central    Chicago Sox   85
Central    Kansas City   72
Central    Cleveland     68
Central    Minnesota     66
Table 13.39

Division   Team          Wins
West       Oakland       94
West       LA Angels     93
West       Texas         89
West       Seattle       75
Table 13.40
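
The comparison asked for in the baseball exercise is a one-way ANOVA with unequal group sizes. The following is a minimal sketch of how the F statistic could be computed from the win totals in Tables 13.38 through 13.40. The function name `one_way_anova` and the `wins` dictionary are illustrative assumptions, not part of the textbook, and the sketch stops at the test statistic rather than looking up a p-value.

```python
from statistics import mean

# Wins per team in each 2012 American League division (Tables 13.38 to 13.40).
wins = {
    "East": [95, 93, 90, 73, 69],
    "Central": [88, 85, 72, 68, 66],
    "West": [94, 93, 89, 75],
}


def one_way_anova(groups):
    """Return (F, df_between, df_within) for samples that may differ in size."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)

    # Between-group sum of squares: each group mean against the grand mean,
    # weighted by the group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

    # Within-group sum of squares: each value against its own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)

    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_between, df_within = one_way_anova(list(wins.values()))
print(f"F = {f_stat:.3f} with ({df_between}, {df_within}) degrees of freedom")
```

The resulting statistic would then be compared with an F distribution with 2 and 11 degrees of freedom at the chosen significance level. Because the sums of squares are weighted by group size, the same function should also handle the other unbalanced data sets in this section, such as the coin data in Table 13.36.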