Casey Deesel is a sports agent negotiating a contract for Titus Johnston, an athlete in the National Football League (NFL). An important aspect of any NFL contract is the amount of guaranteed money over the life of the contract. Casey has gathered data on 506 NFL athletes who have recently signed new contracts. Each observation (NFL athlete) includes values for the percentage of his team's plays that the athlete is on the field (SnapPercent), the number of awards the athlete has received recognizing on-field performance (Awards), the number of games the athlete has missed due to injury (GamesMissed), and the millions of dollars of guaranteed money in the athlete's most recent contract (Money, the dependent variable).
Casey has trained a full regression tree on 304 observations and then used the validation set to prune the tree to obtain a best-pruned tree. The best-pruned tree (as applied to the 202 observations in the validation set) is:
(a) Titus Johnston's variable values are: SnapPercent = 84, Awards = 7, and GamesMissed = 2. How much guaranteed money does the regression tree predict that a player with Titus Johnston's profile should earn in his contract?
If required, round your answer to two decimal places.
The predicted result is $_____ million of guaranteed money.
(b) Casey feels that Titus was denied an additional award in the past season due to some questionable voting by some sports media. If Titus had won this additional award, how much additional guaranteed money would the regression tree predict for Titus versus the prediction in part (a)?
I. An additional award would increase the amount of guaranteed money by $8.91 million.
II. An additional award would increase the amount of guaranteed money by $11.81 million.
III. An additional award would increase the amount of guaranteed money by $13.99 million.
IV. An additional award would increase the amount of guaranteed money by $32.00 million.
V. An additional award would not change the amount of guaranteed money.
(c) As Casey reviews the best-pruned tree, he is confused by the leaf node corresponding to the sequence of decision rules "SnapPercent ≥ 90.28, SnapPercent ≥ 95.37, Awards < 7.25, GamesMissed < 2.5." This sequence of decision rules results in an estimate of $47.83 million of guaranteed money, but the tree states that zero observations occur in the corresponding partition. If zero observations occur in this partition, how can the regression tree provide an estimate of $47.83 million? Explain this part of the regression tree to Casey by referring to how the best-pruned tree is obtained.
The predicted guaranteed money of $47.83 million for observations satisfying "SnapPercent ≥ 90.28, SnapPercent ≥ 95.37, Awards < 7.25, GamesMissed < 2.5" is based on the average guaranteed money of the observations in the [training / validation] set that satisfy this sequence of decision rules. The best-pruned tree is obtained by [removing leaf nodes from / adding leaf nodes to] the initial regression tree to obtain the tree with the [fewest / greatest] leaf nodes while achieving the minimum classification error rate on the [training / validation] set. In this case, the [training / validation] set has zero observations that satisfy "SnapPercent ≥ 90.28, SnapPercent ≥ 95.37, Awards < 7.25, GamesMissed < 2.5", which just means that this leaf node [does not contribute / contributes] to the classification error rate of this tree.
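The mechanics behind part (c) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy observations below are invented (they are not Casey's data, so the leaf mean will not equal $47.83 million), the helper name `in_leaf` is hypothetical, and squared error is assumed as the error measure. It shows a leaf prediction computed from one data set, and the same leaf's error contribution on a second data set that happens to contain no observations satisfying the rules.

```python
# Hypothetical toy data: every value below is invented for illustration only.

def in_leaf(obs):
    """Decision rules for the leaf discussed in part (c)."""
    return (obs["SnapPercent"] >= 90.28
            and obs["SnapPercent"] >= 95.37
            and obs["Awards"] < 7.25
            and obs["GamesMissed"] < 2.5)

# Observations used to grow the tree (assumed, not the real 304 rows).
training = [
    {"SnapPercent": 96.0, "Awards": 5, "GamesMissed": 1, "Money": 40.0},
    {"SnapPercent": 98.5, "Awards": 7, "GamesMissed": 2, "Money": 50.0},
    {"SnapPercent": 85.0, "Awards": 3, "GamesMissed": 0, "Money": 12.0},
]

# Observations used to prune the tree (assumed, not the real 202 rows).
validation = [
    {"SnapPercent": 97.0, "Awards": 8, "GamesMissed": 0, "Money": 55.0},
    {"SnapPercent": 92.0, "Awards": 2, "GamesMissed": 4, "Money": 20.0},
]

# The leaf's prediction is the mean Money of the tree-growing
# observations that satisfy the leaf's decision rules.
train_in_leaf = [o for o in training if in_leaf(o)]
leaf_prediction = sum(o["Money"] for o in train_in_leaf) / len(train_in_leaf)

# The leaf's contribution to the pruning-set error is summed over the
# pruning-set observations that land in the leaf; here there are none.
valid_in_leaf = [o for o in validation if in_leaf(o)]
leaf_sq_error = sum((o["Money"] - leaf_prediction) ** 2 for o in valid_in_leaf)

print(f"leaf prediction: {leaf_prediction:.2f}")
print(f"validation observations in leaf: {len(valid_in_leaf)}")
print(f"leaf contribution to validation squared error: {leaf_sq_error:.2f}")
```

In this sketch the leaf still carries a prediction even though no observation in the second data set reaches it, and its error contribution on that set is an empty sum.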