MATH 533 (Applied Managerial Statistics) Project: AJ Davis Department Stores; Part C: Regression and Correlation Analysis

Using MINITAB, perform the regression and correlation analysis for the data on CREDIT BALANCE (Y) and SIZE (X) by answering the following.

1. Generate a scatterplot for CREDIT BALANCE vs. SIZE, including the graph of the "best fit" line. Interpret.

[Figure: Scatterplot of Credit Balance ($) vs Size. Vertical axis: Credit Balance ($), 2000 to 6000; horizontal axis: Size, 1 to 7.]

The scatterplot of Credit Balance ($) versus Size shows that the slope of the "best fit" line is upward (positive); this indicates that Credit Balance varies directly with Size. As Size increases, Credit Balance tends to increase, and as Size decreases, Credit Balance tends to decrease.

Correct
Correct

MINITAB OUTPUT:

Predicted Values for New Observations

New Obs     Fit  SE Fit            95% CI            95% PI
      1  4607.5   119.0  (4368.2, 4846.9)  (3337.9, 5877.2)

Values of Predictors for New Observations

New Obs  Size
      1  5.00

9. Using an interval, predict the credit balance for a customer that has a household size of 5. Interpret this interval.

With 95% confidence, the credit balance of an individual customer with a household size of 5 lies within the interval (3337.9, 5877.2). This is the 95% prediction interval estimate for the credit balance of a customer with a household size of 5.

Correct

10. What can we say about the credit balance for a customer that has a household size of 10? Explain your answer.

We cannot say anything reliable about the credit balance for a customer that has a household size of 10, because the maximum value of the predictor variable (Size) used to fit the given regression model is only 7, which is much less than 10. Predicting at Size = 10 would extrapolate beyond the range of the data, so we cannot use the given regression model to accurately estimate the credit balance for such a customer.

Correct

In an attempt to improve the model, we fit a multiple regression model predicting CREDIT BALANCE based on INCOME, SIZE and YEARS.

11. Using MINITAB run the
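The 95% confidence and prediction intervals reported for a household size of 5 can be related to each other with a little arithmetic. The sketch below assumes the standard simple linear regression formulas, CI = Fit ± t × SE Fit and PI = Fit ± t × sqrt(s² + SE Fit²), and backs out the t multiplier and the residual standard error s that the reported numbers imply. The variable names are illustrative, and the implied s is a derived value, not a figure taken from the MINITAB output.

```python
import math

# Values copied from the MINITAB output for Size = 5.
fit = 4607.5
se_fit = 119.0
ci_low, ci_high = 4368.2, 4846.9   # 95% confidence interval for the mean
pi_low, pi_high = 3337.9, 5877.2   # 95% prediction interval for one customer

# Half-widths of the two intervals.
ci_half = (ci_high - ci_low) / 2
pi_half = (pi_high - pi_low) / 2

# Assumption: CI = fit +/- t * se_fit, so the t multiplier is implied by the CI.
t_mult = ci_half / se_fit

# Assumption: PI = fit +/- t * sqrt(s^2 + se_fit^2), so s is implied by the PI.
se_pred = pi_half / t_mult
s_implied = math.sqrt(se_pred**2 - se_fit**2)

print(f"implied t multiplier: {t_mult:.3f}")
print(f"implied residual standard error s: {s_implied:.1f}")
print(f"PI is {pi_half / ci_half:.1f} times as wide as the CI")
```

The prediction interval is much wider than the confidence interval because it covers the scatter of an individual customer around the regression line, not just the uncertainty in the line itself.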
To compute the 90% prediction interval for all trading days during the study period, the formula ( , ) can be used. Referring to the question, α equals 0.1 and α/2 equals 0.05.
AJ DAVIS is a department store chain that has many credit customers. A sample of 50 credit customers is selected, with data collected on location, income, credit balance, number of people, and years lived in the house.
You provided no discussion about TAXYPR, TGEG, and DMC in Table 4. How can you insert TGEG and DMC into the regression equation as explanatory variables?
Iterations of analysis eliminated data points that were listed as “unusual observations,” that is, any data point with a large standardized residual. After 5 iterations, the analysis showed improved residual plots. Randomness in the versus-fits and versus-order plots indicates that the linear regression model is appropriate for the data and that the residuals are independent; an approximately straight line in the normal probability plot and a bell-shaped curve in the histogram both indicate that the residuals are approximately normally distributed.
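The screening step described above can be sketched for a simple linear regression as follows. This is a minimal illustration under stated assumptions, not the analysis that was actually run: the data are hypothetical, the function name is invented, and the cutoff of 2 in absolute value is a common rule of thumb that the text itself does not state.

```python
import math

def standardized_residuals(x, y):
    """Standardized residuals e_i / (s * sqrt(1 - h_i)) for a simple linear fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual standard error with n - 2 degrees of freedom.
    s = math.sqrt(sum(e * e for e in residuals) / (n - 2))
    out = []
    for xi, e in zip(x, residuals):
        leverage = 1 / n + (xi - x_bar) ** 2 / sxx  # h_i for simple regression
        out.append(e / (s * math.sqrt(1 - leverage)))
    return out

# Hypothetical data: y = 2x everywhere except one inflated point at x = 5.
x_demo = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y_demo = [2, 4, 6, 8, 20, 12, 14, 16, 18]

CUTOFF = 2.0  # assumed rule-of-thumb threshold, not taken from the text
std_res = standardized_residuals(x_demo, y_demo)
flagged = [i for i, r in enumerate(std_res) if abs(r) > CUTOFF]
print("flagged positions:", flagged)
```

Removing flagged points and refitting changes the fitted line and the residual standard error, so each pass can flag different points, which is why such screening is run in iterations.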
6. Why is the black line so much more variable than the red line? What's the difference between the data they show?
5) Graph the equation you wrote in step four superimposed over the original data. Comment on how well or how poorly the equation fits the data.
Lending evaluations by Santander are based on the credit background of the person (or company) who wants to borrow. By developing a credit-approval system, Santander Consumer Finance improved its understanding of these clients through an online database, which allowed real-time analysis to be used in determining interest rates for its business interactions.
The trendline, known as the line of best fit or the least squares regression line, shows the linear equation that best sums up the data’s trend. The formula on the right is the formula of the line of best fit.
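As an illustration of how a least squares line is obtained, the sketch below computes the slope and intercept from the usual closed-form expressions, slope = Sxy / Sxx and intercept = ȳ − slope × x̄. The function name and the five data points are hypothetical and chosen only to keep the arithmetic easy to follow; they are not the data behind the trendline discussed here.

```python
def least_squares_line(x, y):
    """Return (slope, intercept) of the least squares line through (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data for illustration only.
x_demo = [1, 2, 3, 4, 5]
y_demo = [2, 4, 5, 4, 5]

slope, intercept = least_squares_line(x_demo, y_demo)
print(f"y = {intercept:.2f} + {slope:.2f} x")
```

A positive slope corresponds to an upward trendline: the fitted value of y rises by the slope for each one-unit increase in x.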
What is seen in the stem and leaf plot for the money variable (include the shape)? Explain your answer.
Attached is a sample of loan-level information (Exhibit 2). How would you expect an applicant’s debt-to-income ratio to relate to other loan characteristics, such as credit score? (Narrative)
What is the lifetime value of a typical customer in each of the four segments, in current dollar values? Compare these figures to the “Gross margin” figures in the original spreadsheet. What can you learn from this comparison?
Problem 2.6: In fitting a model to classify prospects as purchasers or non-purchasers, a certain company drew the training data from internal data that include demographic and purchase information. Future data to be classified will be lists purchased from other sources, with demographic (but not purchase) data included. It was found that “refund issued” was a useful
The results of the two test statistics differed at times, i.e., they identified two different curves as providing the “best” fit. In such cases, the final decision on the “best” fit was based on a visual assessment of the figures.
It is a process of analyzing the relationships among data from various perspectives and summarizing them into valuable information. It also helps banks look for hidden patterns in a group and discover unknown relationships in the data. These data mining techniques facilitate useful data interpretations that help the banking sector avoid customer attrition. An accurate, fraud-free prediction of credit approval is important to prospective homeowners, developers, investors, appraisers, tax assessors and other real estate market participants. People who are looking to buy a new place or thing tend to be more conservative with their budget and with acquiring loans from financial institutions. The credit function is central to any banking system under uncertain market conditions. The lack of a general credit review system and of precise methods in banks is an important reason why an expert support system is necessary.
Meanwhile, Albrecht and Ziderman (1992) claim that failure to repay loans is associated with an insufficient salary. Besides an insufficient salary, a borrower's family size is also related to loan defaults. Canner and Luckett (1990) found that household size is indirectly related to an increased probability of debt delinquency. This is because an increase in the number of people in the family can increase the family's expenditures.