Case Study #4 assesses your ability to apply the concepts of Chapter 10 by conducting simple and multiple regression analyses to build a prediction model for home prices based on up to four independent variables. You will calculate descriptive statistics, create summary tables and charts, and develop five regression prediction models. Finally, you will write a report summarizing your findings. You will need to use the Data Analysis ToolPak Add-in, as you did for the previous two case studies.
The data file contains data for a random sample of 1,000 houses located in the greater Baltimore, MD area. The data fields included are as follows:
- Home Price
- Living area (square feet)
- Number of bedrooms
- Number of bathrooms
- Age (years)
In developing both your model and the report, address the items below.
- There are numerous variables that are believed to be predictors of housing prices, including the ones in the data set for this project. Using the web, identify the key variables that determine home price, including any not included in this data set.
- Using Data>Data Analysis>Descriptive Statistics in Excel, calculate the mean, median, range and standard deviation of each variable and summarize the results in a table.
- Using Excel, create histograms for price of the home, living area (square feet) and age of the home. Be sure to give each chart a title and label the axes clearly.
- Using Excel, create scatterplots of each variable with each other variable. Be sure to give each chart a title and label the axes clearly.
- Using Data>Data Analysis>Correlation in Excel, calculate the correlation coefficient of each variable with each other variable.
- Using Data>Data Analysis>Regression in Excel, run 4 separate simple regression models to predict the dependent variable (price of the home) with each of the independent variables. Use an alpha level of 0.05 to determine significance.
- Using Data>Data Analysis>Regression in Excel, run a multiple regression model to predict the dependent variable with all 4 independent variables. Use an alpha level of 0.05 to determine significance.
- In Word, write a summary report of the findings that includes the tables, charts and regression analyses from steps 1-7 and includes the following:
- An introductory paragraph that summarizes the purpose of the analysis. Also include the information that you found in your web search about the key variables that determine home price.
- A section (1 or more paragraphs) describing what the tabular data from step 2 indicate about the central tendency, variability and distribution of each variable. For example, do the variables appear to be distributed in a symmetric or skewed pattern?
- A section (1 or more paragraphs) describing how the frequency histograms from step 3 support and clarify the findings of the tabular data. Include in this section any evidence suggesting outliers in the data.
- A section (1 or more paragraphs) describing what the scatterplots from step 4 and correlations from step 5 indicate about the relationship between the various pairs of variables (e.g., Are the variables related? Does the relationship appear to be linear or nonlinear? Is the direction of the relationship positive or negative?).
- A section (1 or more paragraphs) summarizing the findings of the 4 simple regression models from step 6. Which models (if any) show that the independent variable in the model is a significant predictor of price of the home? Which models (if any) show that the independent variable in the model is not a significant predictor of price of the home? Which model is the best fitting? Which model is the poorest fitting?
- A section (1 or more paragraphs) summarizing the findings of the multiple regression model from step 7. Which variables in the model (if any) are significant predictors of price of the home? Which variables in the model (if any) are not significant predictors of price of the home? Does the multiple regression model provide a better fit than the best fitting simple regression model?
- A concluding paragraph summarizing the key findings of the analysis and making a recommendation about which model is the best fitting. Based on your web research, indicate any other variables that are not included in the current best fitting model that might improve the fit if they were included.
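If you would like to sanity-check the Excel output for the descriptive statistics, correlation and simple regression steps, the same quantities can be computed by hand. The sketch below is optional and is not part of the required deliverables. It is written in Python using only the standard library, and the five-house sample in it is hypothetical (it is not taken from the assignment data file). It assumes that the standard deviation you want is the sample standard deviation (n - 1 divisor).

```python
import math
import statistics

# Hypothetical toy sample (NOT the assignment data): living area in square
# feet and home price in dollars for five houses.
living_area = [1000, 1500, 2000, 2500, 3000]
price = [150000, 200000, 260000, 300000, 340000]


def describe(values):
    """Return the four descriptive statistics requested in the assignment."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "std_dev": statistics.stdev(values),  # sample std. dev. (n - 1 divisor)
    }


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x; also returns R^2."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    # With a single predictor, R^2 equals the squared correlation coefficient.
    r_squared = pearson_r(x, y) ** 2
    return slope, intercept, r_squared


summary = describe(price)
slope, intercept, r_squared = simple_regression(living_area, price)
print(summary)
print(slope, intercept, r_squared)
```

With one independent variable, R Square in a regression is the square of the correlation coefficient, so the correlation table from step 5 and the four simple regressions from step 6 should agree with each other. The sketch does not compute p-values; use the Excel regression output to judge significance at the 0.05 level.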
Submit a single Excel workbook showing all work for Steps 2-7 and a Word document containing your summary report. The report must address all parts of Step 8 and interweave all supporting tables and charts from Steps 2-7, so that it tells a story with the data and its visualizations.
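When you compare the multiple regression model with the best fitting simple regression model, it can help to see how the fit measures are calculated. The optional sketch below, again in standard-library Python, fits a least-squares model by solving the normal equations and reports R Square and Adjusted R Square. Everything in it is an illustrative assumption: the function names `solve` and `fit_ols` are made up for this example, and the six-house sample is hypothetical, constructed so that price (in $1,000s) is exactly 50 + 100 × area (in 1,000s of square feet) − 1 × age. It also assumes the predictors are not perfectly collinear.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Fit y on the predictors in rows; returns (coefficients, R^2, adjusted R^2).

    coefficients[0] is the intercept, followed by one slope per predictor.
    """
    X = [[1.0] + list(r) for r in rows]  # prepend a column of 1s for the intercept
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    mean_y = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - sse / sst
    n, p = len(y), k - 1
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return beta, r2, adj_r2


# Hypothetical data (NOT the assignment data), built from a known equation.
area_k = [1.0, 1.5, 2.0, 2.5, 3.0, 2.2]   # living area, 1,000s of sq ft
age = [30, 10, 20, 5, 15, 40]             # age in years
price_k = [50 + 100 * a - 1 * g for a, g in zip(area_k, age)]  # price, $1,000s

multi_beta, multi_r2, multi_adj = fit_ols(list(zip(area_k, age)), price_k)
simple_beta, simple_r2, simple_adj = fit_ols([[a] for a in area_k], price_k)
print(multi_beta, multi_r2, multi_adj)
print(simple_beta, simple_r2, simple_adj)
```

Because the toy prices follow the equation exactly, the two-variable model recovers the coefficients and fits perfectly, while the area-only model does not. Real data will not behave this cleanly. R Square never decreases when a predictor is added, so Adjusted R Square, which penalizes extra predictors, is the fairer number to use when comparing models with different numbers of independent variables.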