MAT 240 Project One

School

Southern New Hampshire University

Course

240

Subject

Economics

Date

Jan 9, 2024

Type

docx

Pages

8

Uploaded by ProfDeer3898

Median Housing Price Prediction Model for D. M. Pan National Real Estate Company

Report: Housing Price Prediction Model for D. M. Pan National Real Estate Company
Natalie Ann Merle Donate
Southern New Hampshire University
Introduction

The purpose of this report is to determine whether the square footage of a house is an appropriate predictor of the selling price of houses in the US. We will use a sample drawn from the population for our analysis. A linear regression model helps us understand how strong the relationship between two variables is, if there is one, and whether that relationship is positive or negative, which suits the purpose of this report. From our sample data, we will create a scatterplot, which we expect to show a trend in listing price as it relates to square footage. Square footage is our predictor variable, meaning that the square feet of a house will help us estimate its listing price. Listing price is our response variable, meaning that the value of a house is treated as dependent on its square feet.

Data Collection

A random sample is necessary to reduce selection bias, so we selected a random sample of 50 data values using the Excel function =RAND() in our worksheet. Our predictor variable is square feet, because it will help us predict listing price, and our response variable is listing price, because it responds to a change in square feet. We expect these two variables to be related, which we analyze in more depth later in the report.
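The sampling step described above can be sketched in Python. This is a minimal illustration, not the worksheet procedure itself: it assumes the usual =RAND() approach of giving every row a random key, sorting by that key, and keeping the first 50 rows. The function name `sample_rows`, the `seed` parameter, and the placeholder population are hypothetical.

```python
import random


def sample_rows(rows, k=50, seed=None):
    """Pick k rows at random by sorting on a random key (like an =RAND() column)."""
    rng = random.Random(seed)
    # Attach one uniform random number to each row, as =RAND() would in a helper column.
    keyed = [(rng.random(), row) for row in rows]
    # Sort on the random key only, then keep the first k rows.
    keyed.sort(key=lambda pair: pair[0])
    return [row for _, row in keyed[:k]]


# Hypothetical population of 1,000 placeholder records (not the report's data).
population = [{"id": i} for i in range(1000)]
sample = sample_rows(population, k=50, seed=240)
print(len(sample), "rows selected")
```

Fixing the seed makes the draw repeatable, which a worksheet with =RAND() does not do on its own because the values recalculate.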
Region              State  County          Square Feet  Listing Price
New England         vt     rutland         4,162        523,600
West South Central  la     vernon          2,256        255,900
Mid Atlantic        md     frederick       1,246        263,600
West South Central  tx     brazos          1,835        242,500
East North Central  in     marion          3,408        323,300
Northeast           ny     st. lawrence    1,734        256,900
New England         nh     hillsborough    1,744        331,300
New England         vt     rutland         2,280        344,200
West South Central  tx     grayson         1,837        248,700
Northeast           pa     adams           1,566        247,000
West North Central  mn     crow wing       1,602        263,200
South Atlantic      nc     cleveland       1,681        250,700
West South Central  la     calcasieu       1,871        214,300
Northeast           pa     indiana         1,713        284,600
New England         vt     franklin        3,747        489,900
Mountain            ut     davis           2,351        486,700
New England         nh     grafton         2,491        358,200
West North Central  mn     blue earth      1,518        315,900
Pacific             ca     butte           1,875        406,200
South Atlantic      nc     davidson        2,571        504,500
Pacific             or     lane            1,845        404,600
Pacific             ca     marin           1,856        420,800
Northeast           ny     chautauqua      1,905        336,400
Northeast           pa     franklin        1,363        277,700
Northeast           ny     schenectady     2,031        331,300
East North Central  mi     ionia           1,416        193,600
West South Central  la     st. tammany     1,869        280,300
Northeast           ny     monroe          1,545        318,400
East South Central  ms     lee             2,066        273,100
East North Central  oh     muskingum       1,999        188,300
New England         me     york            1,991        370,000
South Atlantic      fl     miami-dade      4,810        792,900
East South Central  tn     greene          1,950        226,000
East North Central  in     allen           3,525        398,000
Mountain            id     twin falls      1,892        319,000
New England         vt     franklin        1,892        341,300
Pacific             ca     san diego       2,081        421,400
West South Central  la     bossier         2,177        305,500
South Atlantic      sc     berkeley        1,913        351,900
South Atlantic      fl     martin          3,919        592,000
West North Central  ia     cerro gordo     1,869        361,000
Mid Atlantic        nj     hudson          3,168        444,400
Mid Atlantic        va     roanoke         2,054        348,200
Pacific             wa     grays harbor    1,393        395,300
Mid Atlantic        md     wicomico        1,335        353,900
South Atlantic      ga     laurens         1,694        275,900
Northeast           ny     niagara         1,517        261,400
Pacific             hi     hawaii          4,927        856,600
West North Central  mo     cape girardeau  1,798        292,100
         Square Feet  Listing Price
Mean     2,190        $353,929
Median   1,892        $331,300
Std Dev  871.69       $131,971.56
Min      1,246        $188,300
Max      4,927        $856,600
Spread   3,681        $668,300

The histograms above present this information visually, which makes it easier to interpret. The data are not normally distributed: a normal distribution would give bell-shaped histograms, but ours are right-skewed, which means that most houses fall in the lower price range (left side). Because of this skew, the median is a better description of the center than the mean, which is pulled upward by the few high values.

In our random sample, square feet range from a low of 1,246 to a high of 4,927; within this range, or "spread," is where we can most accurately predict house prices. The average listing price is $353,929. The standard deviation tells us how far values typically fall from that mean. When compared to the national population, our numbers are fairly close, which suggests that our sample is representative of the national housing market.
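The summary statistics in the table above can be reproduced along the following lines. This is a sketch under stated assumptions: it uses only the first ten listing prices from the sample table, so its output will not match the reported figures, and it assumes the report's standard deviation is the sample standard deviation (as Excel's STDEV.S computes), which the report does not state. The helper names `summarize` and `looks_right_skewed` are hypothetical.

```python
import statistics


def summarize(values):
    """Return the same summary measures the report tabulates."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std_dev": statistics.stdev(values),  # sample standard deviation (assumed)
        "min": min(values),
        "max": max(values),
        "spread": max(values) - min(values),  # range: max minus min
    }


def looks_right_skewed(values):
    """Rough check only: a mean above the median hints at a right skew."""
    return statistics.mean(values) > statistics.median(values)


# First ten listing prices from the sample table (an illustrative subset).
listing_prices = [523600, 255900, 263600, 242500, 323300,
                  256900, 331300, 344200, 248700, 247000]

for name, value in summarize(listing_prices).items():
    print(f"{name:>8}: {value:,.2f}")
print("mean above median:", looks_right_skewed(listing_prices))
```

The mean-versus-median comparison is the same reasoning applied to the full sample: the reported mean of $353,929 sits above the median of $331,300, which is consistent with a right-skewed distribution.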
Develop Regression Model

Data Analysis

As the scatterplot above shows, a linear model is appropriate because most data points follow the trend line fairly closely. Listing price clearly increases as square feet increase, which gives a positive (upward) relationship between the square feet and listing price of houses. One outlier was identified and removed from our data in order to have a stronger model, since it does not properly represent the national average. Keeping outliers in the model could make our predictions less accurate, because they are the exception to the rule in our data and can weaken the correlation between square feet and pricing. Outliers can be identified visually: in the scatterplot they are the dots furthest from the trend line, and in the histogram they are the bars separated from the rest by a gap. By removing the one identified outlier, our correlation coefficient went from 0.78 to our current 0.817; any correlation above 0.80 is considered strong. We can also see that this correlation is
supported by our scatterplot, since it is evident that as square feet increase, listing price generally increases in a very similar way. Using the scatterplot as confirmation is a good way to validate the correlation. This is a strong model and an appropriate one for our analysis.

Determine the Line of Best Fit

Our regression equation is y_hat = 123.73x + 83024, which we obtained from our Excel worksheet using square feet as the variable "x" and listing price as the variable "y." This is the equation we will use to predict listing price. In this regression model, the number next to "x" is the "slope," and it tells us how much the predicted listing price increases with each one-unit increase in square feet: about $123.73 per additional square foot. The number on the right, the "intercept," is the predicted listing price when square feet is 0, which is not meaningful in this context because no house has zero square feet. R-squared for our model is 0.6679. This means that 66.79% of the variation in the listing price of a house can be explained (or predicted) by its square feet. This makes square feet a useful indicator when determining the listing price of a house.

Prediction using our regression model y_hat = 123.73x + 83024:

Square Feet  Listing Price
1,500        $268,619.00
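The correlation, line of best fit, and prediction discussed above can be sketched in Python. This is an illustration under stated assumptions, not the Excel procedure used in the report: `pearson_r`, `fit_line`, and `predict` are hypothetical helper names, and the fit is run on only the first five rows of the sample table, so it will not reproduce the report's coefficients. The prediction uses the report's own equation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(square_feet, slope=123.73, intercept=83024.0):
    """Predicted listing price; defaults are the report's regression equation."""
    return slope * square_feet + intercept


# First five rows of the sample table (illustrative subset only).
square_feet = [4162, 2256, 1246, 1835, 3408]
listing_price = [523600, 255900, 263600, 242500, 323300]

r = pearson_r(square_feet, listing_price)
slope, intercept = fit_line(square_feet, listing_price)
print(f"subset r = {r:.3f}, r squared = {r * r:.3f}")
print(f"subset fit: y_hat = {slope:.2f}x + {intercept:.0f}")

# The report's equation at 1,500 square feet: 123.73 * 1500 + 83024.
print(f"predicted price at 1,500 sq ft: ${predict(1500):,.2f}")
```

In simple linear regression, R-squared equals the square of the correlation coefficient, which ties the two reported figures together: 0.817 squared is roughly 0.667, in line with the reported 0.6679 once rounding of r is taken into account.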
Conclusions

Based on our analysis, the selected model is an appropriate one to use when predicting the listing price of houses. Our model helped us identify that square feet is strongly correlated with selling price, so it is a useful tool for our predictions. The equation we developed can help us estimate the selling price of a house. The sample we used was representative of the national housing market; a sample that was not representative could have given us different results. Before we analyzed our data, we expected square feet and house pricing to have a positive relationship, and that is what we found in this report, with a few exceptions or outliers. Hypothetically, if we were to flip the variables, a city developer or contractor could use listing prices in a certain area to determine the size of the houses they should build. This is another problem that this type of analysis could help solve, if there were a need. As a follow-up, I would suggest considering which other factors contribute to a higher house price, in case we wanted to list a house at a price outside the range our model predicts.