School: University of Southern California
Course: 310
Subject: Statistics
Date: Jan 9, 2024

Final Exam Prep Sheet

To be prepared for the Final exam, you should know and understand the following topics and concepts. The first 27 items are topics on regression. Items 28 through 71 are other topics we covered during the course:

1. Know the definition of correlation ($r$) of two variables. Understand what values of $r$ between -1 and +1 mean. Be clear on the difference between correlation and causation.
2. Know what simple linear regression is, including the definition of the explanatory (independent) variable and the response (dependent) variable.
3. Understand the definition of residual.
4. Know how the coefficients $b_0$ and $b_1$ of a least-squares regression line are calculated from the $(x, y)$ pairs of points in a given data set, i.e., from $\bar{x}$, $\bar{y}$, $r$, $s_x$, $s_y$. Understand how to interpret the intercept, $b_0$, and the slope, $b_1$. Know that the regression line goes through the point $(\bar{x}, \bar{y})$.
5. Understand the simple linear regression model.
6. Understand how the population values for the intercept and the slope ($\beta_0$ and $\beta_1$) relate to the intercept and slope of the least-squares line ($b_0$ and $b_1$).
7. Understand what residual plots reveal and what they should look like in a proper regression model. Know the meaning of fan-shaped and curved patterns and what assumptions they violate.
8. Understand the output, coefficient table and analysis of variance (ANOVA) table in the regression output.
9. Know how to assess whether or not the coefficient of a variable is statistically significant.
10. Understand the interpretation of the hypothesis test for the $\beta_i$.
11. Know how the sum of squared errors ($SSE = \sum e_i^2$), mean square error ($MSE = s_e^2 = \frac{SSE}{n-2}$) and standard error ($s_e = \sqrt{MSE} = \sqrt{\frac{SSE}{n-2}}$) are calculated in simple regression and can be found in the regression output. Know how these quantities and results can be generalized when there are $k$ independent variables.
12. Understand the three measures of variation of $y$ (SST, SSR and SSE) and how they relate.
13. Know the definition of $R^2$ in terms of SSR, SST and SSE.
14. Know how to use a regression equation to create a prediction value and an approximate prediction interval for new values of the independent variable(s).
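
The regression quantities in items 4 and 11 through 14 can be traced through a small worked example. The sketch below uses a made-up data set and only the Python standard library; the variable names are illustrative, and the "fitted value plus or minus 2 standard errors" rule for the approximate prediction interval is an assumption about the rule of thumb the course intends, not something stated on this sheet.

```python
import math
import statistics as st

# Hypothetical data set, made up purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

# Summary statistics: means and sample standard deviations.
x_bar, y_bar = st.mean(x), st.mean(y)
s_x, s_y = st.stdev(x), st.stdev(y)

# Correlation r computed from its definition.
r = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ((n - 1) * s_x * s_y)

# Least-squares slope and intercept from the summary statistics (item 4).
b1 = r * s_y / s_x
b0 = y_bar - b1 * x_bar

# Fitted values and residuals (item 3).
y_hat = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, y_hat)]

# The three measures of variation (item 12): SST = SSR + SSE.
sse = sum(e ** 2 for e in residuals)
ssr = sum((fi - y_bar) ** 2 for fi in y_hat)
sst = sum((yi - y_bar) ** 2 for yi in y)

# MSE and standard error for simple regression (item 11), and R^2 (item 13).
mse = sse / (n - 2)
s_e = math.sqrt(mse)
r_squared = ssr / sst

# Prediction at a new x with an approximate interval (item 14).
# ASSUMPTION: the rough "prediction +/- 2 * s_e" rule is used here.
x_new = 6.0
y_new = b0 + b1 * x_new
interval = (y_new - 2 * s_e, y_new + 2 * s_e)

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, r = {r:.4f}")
print(f"SSE = {sse:.4f}, SSR = {ssr:.4f}, SST = {sst:.4f}")
print(f"s_e = {s_e:.4f}, R^2 = {r_squared:.4f}")
print(f"prediction at x = {x_new}: {y_new:.4f}, approx. interval = {interval}")
```

Two checks worth doing by hand on any fitted line: the residuals sum to zero, and plugging $\bar{x}$ into the equation returns $\bar{y}$.
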
15. Know how a unit change in the explanatory variable leads to a multiplicative change in the response variable.
16. Know that $R^2$ always increases when a new explanatory variable is added to the model and understand why this is true.
17. Understand what adjusted $R^2$ is and how it is used to compare models of different sizes.
18. Understand what the F significance test for a multiple regression model reveals.
19. Know how categorical data in a regression model is handled with dummy variables. Know how to construct and interpret dummy variables.
20. Understand why a categorical variable that can take on $m$ values requires $m - 1$ dummy variables. Understand the meaning of baseline.
21. Understand how dummy variables affect the intercept of the regression.
22. Know how to build and interpret interaction models, where a dummy variable interacts with another variable in the model. Understand how interaction terms affect the intercept of the regression and the slope of the numerical independent variable.
23. Understand what multicollinearity is, the problems it creates and how to mitigate its effects.
24. Know what a Variance Inflation Factor is, how it relates to multicollinearity and what to do to mitigate its effect.
25. Have an idea of how the automated procedures of backward elimination, forward stepwise regression and best subsets regression are implemented.
26. Know how to perform and interpret a chi-square test and how to use the Excel functions CHISQ.DIST, CHISQ.INV and CHISQ.TEST. Know how a chi-square test can be used to see if two categorical variables are independent, or to test goodness of fit.
27. Understand the meanings of and differences between categorical, ordinal and numerical variables.
28. Understand what a distribution of a variable is.
29. Understand the following concepts that describe distributions: unimodal, bimodal, skewed to the left, skewed to the right, symmetric, mode, mean, median, minimum, maximum, range, variance, standard deviation, quartile, percentile, robust, the five-number summary, box plot, histogram.
30. Understand how correlation is used to show the relationship between two quantitative variables.
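
The chi-square test of independence from item 26 can be sketched by hand for a small table. The counts below are made up, and the p-value shortcut applies only to 1 degree of freedom (a 2x2 table), where the chi-square variable is the square of a standard normal; larger tables would need a chi-square CDF such as Excel's CHISQ.DIST.

```python
import math

# Hypothetical 2x2 contingency table of observed counts (made-up numbers).
observed = [[30, 20],
            [20, 30]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[rt * ct / total for ct in col_totals] for rt in row_totals]

# Test statistic: sum over all cells of (observed - expected)^2 / expected.
chi_sq = sum((o - e) ** 2 / e
             for o_row, e_row in zip(observed, expected)
             for o, e in zip(o_row, e_row))

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Upper-tail p-value. Valid ONLY for df = 1, where chi-square equals Z^2,
# so P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_sq / 2)) if df == 1 else None

print(f"chi-square = {chi_sq:.4f}, df = {df}, p-value = {p_value}")
```

A small p-value is evidence against the null hypothesis that the two categorical variables are independent.
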
31. Know how to interpret a scatterplot (form, direction, strength, positive and negative association).
32. Understand the possible values for correlation, $r$, and what these values mean.
33. Understand the fundamental concepts of probability theory: random phenomena, sample space, events, probability of an event, equally likely events, the possible values for probability, the rules of probability, the intersection of two events (A and B), the union of two events (A or B), the complement of an event, disjoint or mutually exclusive events, the general and special laws of addition and multiplication, conditional probability, independent events.
34. Know how to develop and interpret contingency tables and how to use them to find a joint distribution, marginal distributions and conditional distributions.
35. Understand random variables: the difference between discrete and continuous random variables, probability distribution functions and cumulative distribution functions for discrete and continuous random variables, and the expected value, variance and standard deviation of a random variable.
36. Know how to calculate the expected value and standard deviation of a discrete random variable.
37. Know the characteristics of the Uniform distribution.
38. Recognize when it is appropriate to apply the binomial distribution and how to apply it.
39. For a random variable with a continuous probability distribution, know what its probability density function (PDF) and cumulative distribution function (CDF) are and how the latter is used to find the probability the random variable is less than a given number, greater than a given number, or between two given numbers. Know how to use the CDF to find a given percentile. Be familiar with the uniform and normal density curves.
40. Know how to use the appropriate Excel functions for the above distributions.
41. Know the key definitions in statistical inference: population, sample, survey, bias.
42. Know the mean and standard deviation of the sample mean, $\bar{X}$.
43. Understand the nature of the sampling distribution of $\bar{X}$ if the population distribution is normal.
44. Understand the impact of sample size on the distribution of $\bar{X}$ regardless of the shape of the population distribution (the Central Limit Theorem).
45. Know the definition of population proportion $\pi$ and the sample proportion $p$ used to estimate it.
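
Items 36 and 38 fit together in one short calculation: build the probability distribution of a binomial random variable, then compute its expected value and standard deviation directly from the definition for a discrete random variable. The values of `n` and `p` below are hypothetical, chosen only for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Hypothetical parameters: 10 independent trials, success probability 0.3.
n, p = 10, 0.3

# Probability distribution as a mapping from each value to its probability.
pmf = {k: binomial_pmf(k, n, p) for k in range(n + 1)}

# Expected value: sum of value * probability.
mean = sum(k * pk for k, pk in pmf.items())

# Variance: sum of squared deviation from the mean * probability.
variance = sum((k - mean) ** 2 * pk for k, pk in pmf.items())
sd = math.sqrt(variance)

# Cumulative probability P(X <= 3), the sum of the first four probabilities.
cdf_3 = sum(pmf[k] for k in range(4))

print(f"mean = {mean:.4f} (n*p = {n * p})")
print(f"variance = {variance:.4f} (n*p*(1-p) = {n * p * (1 - p):.4f})")
print(f"sd = {sd:.4f}, P(X <= 3) = {cdf_3:.4f}")
```

The definition-based results agree with the binomial shortcuts $np$ for the mean and $np(1-p)$ for the variance, which is a useful self-check on an exam.
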
46. Know how to construct a confidence interval for the population mean when you know and when you don't know the population standard deviation.
47. Understand the notions of confidence level and margin of error.
48. Understand how the width of a confidence interval is related to the confidence level, the standard error of the mean and the sample size.
49. Understand the t distribution and the role of degrees of freedom and how it is used to construct a confidence interval for the population mean assuming you do not know the population standard deviation.
50. Know how to use the appropriate Excel functions for the t distribution.
51. Know how to construct a confidence interval for the population proportion and how big the sample size has to be.
52. Understand the general form of the confidence interval for a population mean and a population proportion, that is, what makes up the margin of error to be added and subtracted from the sample mean or sample proportion: a t or a z multiplier times the standard error of the sample mean or a z multiplier times the standard error of the sample proportion.
53. Know what a hypothesis test is and how to conduct one. Know the definitions of and how to frame the null hypothesis and the alternative hypothesis. Know when to use one-tailed and two-tailed alternatives.
54. Understand the concept of a test statistic, how to calculate it given the sample data, and how to use it to decide whether or not to reject the null hypothesis, for all kinds of hypothesis tests.
55. Know the definition of the p-value and how to use it to decide whether or not to reject the null hypothesis, for all kinds of hypothesis tests.
56. Be able to do a t-test for a hypothesis test involving the sample mean.
57. Know what it means for the evidence against the null hypothesis provided by the data to be statistically significant at level $\alpha$. Know what it means for the p-value to be greater than or less than $\alpha$.
58. Understand the equivalence among the notions of rejecting the null hypothesis at level $\alpha$, finding a p-value less than $\alpha$, and determining a t-statistic less than or greater than the value of t with $\alpha$ or $\alpha/2$ in the appropriate tail or tails.
59. Know how to do a z-test for a null hypothesis involving the population proportion.
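
The confidence interval in items 51 and 52 and the z-test in item 59 can be sketched for a population proportion as follows. The sample counts, the null value `pi_0` and the level `alpha` are hypothetical; the normal distribution comes from Python's standard-library `statistics.NormalDist`, standing in for Excel's NORM.S.INV and NORM.S.DIST.

```python
import math
from statistics import NormalDist

# Hypothetical sample: 220 successes out of 400 observations.
n = 400
successes = 220
p_hat = successes / n

# 95% confidence interval: p_hat +/- z multiplier * standard error of p_hat.
z_star = NormalDist().inv_cdf(0.975)          # about 1.96
se_hat = math.sqrt(p_hat * (1 - p_hat) / n)
margin = z_star * se_hat
ci = (p_hat - margin, p_hat + margin)

# Two-tailed z-test of H0: pi = pi_0 against Ha: pi != pi_0.
# Under H0 the standard error uses pi_0, not p_hat.
pi_0 = 0.5
se_0 = math.sqrt(pi_0 * (1 - pi_0) / n)
z = (p_hat - pi_0) / se_0
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Decision at significance level alpha.
alpha = 0.05
reject = p_value < alpha

print(f"p_hat = {p_hat:.4f}, 95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
print(f"z = {z:.4f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

Item 58's equivalence shows up here in its z form: the p-value falls below `alpha` exactly when `abs(z)` exceeds the critical value `z_star`.
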
60. Know how to use the following Excel functions in connection with the above: AVERAGE, RAND(), STDEV.S, BINOM.DIST, NORM.DIST, NORM.S.DIST, NORM.INV, NORM.S.INV, T.DIST, T.INV, T.INV.2T, IF, AND, OR, CHISQ.DIST, CHISQ.INV, CHISQ.TEST.
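
For checking Excel answers outside a spreadsheet, several of the functions in item 60 have close counterparts in Python's standard library. The mapping below is a rough sketch with made-up inputs; the Excel formulas appear only as comments. The t and chi-square functions are left out because the standard `statistics` module provides only the normal distribution.

```python
import math
import statistics as st
from statistics import NormalDist

# Hypothetical data, made up for illustration.
data = [4.0, 7.0, 9.0, 10.0, 15.0]

average = st.mean(data)     # Excel: AVERAGE(range)
stdev_s = st.stdev(data)    # Excel: STDEV.S(range), the sample standard deviation

# Standard normal distribution (mean 0, standard deviation 1).
std_normal = NormalDist()
norm_s_dist = std_normal.cdf(1.0)       # Excel: NORM.S.DIST(1, TRUE)
norm_s_inv = std_normal.inv_cdf(0.95)   # Excel: NORM.S.INV(0.95)

# Normal distribution with a hypothetical mean of 100 and sd of 15.
scores = NormalDist(mu=100, sigma=15)
norm_dist = scores.cdf(130)             # Excel: NORM.DIST(130, 100, 15, TRUE)
norm_inv = scores.inv_cdf(0.9)          # Excel: NORM.INV(0.9, 100, 15)

print(f"AVERAGE = {average}, STDEV.S = {stdev_s:.4f}")
print(f"NORM.S.DIST = {norm_s_dist:.4f}, NORM.S.INV = {norm_s_inv:.4f}")
print(f"NORM.DIST = {norm_dist:.4f}, NORM.INV = {norm_inv:.4f}")
```
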