Data Visualization and Descriptive Statistics Case Study
This assignment will assess your ability to apply the concepts of chapters 1-3 to conduct an analysis of data on 4 variables collected from the 50 US states (Home Values).
You will calculate various descriptive statistics, create summary tables and create various charts to summarize the data for a company. Finally, you will create a written report for company leadership summarizing your findings. You will need to enable the Data Analysis ToolPak Add-in on your Data Ribbon in Excel to perform the necessary analyses. Refer to the video on how to enable the Data Analysis Add-in. Note: If you use a Mac, the Data Analysis ToolPak is only available in Excel 2016 or higher.
Please refer to the videos on how to create box plots, create histograms, create scatterplots, run descriptive statistics and run a correlation matrix in Excel.
Details of the assignment are contained in the document MBA 6350 Case Study 1.
I have also prepared a MS Word Template that you can use that can help with drafting a quality case study report in MS Word in APA format (it can also be found in the Helpful Resources module).
The case study is due on Sunday by 11:59 pm of Week 2. Please refer to the rubric for grading criteria.
Case Study – Data Visualization and Descriptive Statistics Resources:
Videos:
- How to Install the Data Analysis ToolPak in Excel (PC) (Links to an external site.)
- How to Install the Data Analysis ToolPak in Excel (Mac) (Links to an external site.)
- How to Create Box Plots in Excel (Links to an external site.)
- How to Create a Histogram in Excel (Links to an external site.)
- How to Create a Scatter Plot in Excel (Links to an external site.)
- How to Run Descriptive Statistics in Excel (Links to an external site.)
- How to Run a Correlation Matrix in Excel (Links to an external site.)
Case Study 1 – Data Visualization and Descriptive Statistics
The data file Home_Values.xlsx contains median home values ( Home Value ), median household income ( HH Inc ), median per capita ( Per Cap Inc ) and percent of homes that are owner occupied ( Pct Owner Occ ) for each of the 50 US states. Prior to a more detailed analysis of the data, a company wants to get a good understanding of the 4 variables (e.g. central tendency, variability, shape of the distribution, pattern of relationship between the variables). A company representative contracts with you to help with this process.
To help the company get a better understanding of the data, you are asked to perform the following analysis steps:
- Using Data>Data Analysis>Descriptive Statistics in Excel, calculate the mean, median, range and standard deviation of each variable and summarize the results in table.
- Using Excel, create a frequency histogram for each variable to determine the shape of the distributions. Be sure to give each chart a title and label the axes clearly.
- Using Excel, create boxplots for each variable. Be sure to give each chart a title and label the axes clearly.
- Using Excel, create scatterplots of each variable with each other variable (hint: you should have 6 scatterplots). Be sure to give each chart a title and label the axes clearly.
- Using Data>Data Analysis>Correlation in Excel, calculate the correlation coefficient each variable with each other variable.
- In Word, write a summary report of the findings that includes the tables and charts from steps 1-5 and includes the following:
- An introductory paragraph summarizes the purpose of the analysis.
- A section (1 or more paragraphs) describing what the tabular data from step 1 indicate about the central tendency, variability and distribution of each variable. For example, do the variables appear to be distributed in a symmetric or skewed pattern.
- A section (1 or more paragraphs) describing how the frequency histograms from step 2 and the boxplots from step 3 support and clarify the findings of the tabular data. Include in this section any evidence suggesting outliers in the data.
- A section (1 or more paragraphs) describing what the scatterplots from step 4 and correlations from step 5 indicate about the relationship between the various pairs of variables (e.g., are the variables related?, does the relationship appear to be linear or nonlinear?, is the direction of the relationship positive or negative?).
- A concluding paragraph summarizing the key findings of the analysis and making recommendations for the variable among HH Inc, Per Cap Incand Pct Owner Occ that is most strongly correlated with Home Value .
Note: Submit a single Excel workbook showing all work for steps 1-5 and a Word document of your summary report that addresses all parts step 6. Make sure to clearly label your tables and charts.
Home_Values
State | Home Value | HH Inc | Per Cap Inc | Pct Owner Occ |
Alabama | $117,600 | $42,081 | $22,984 | 71.1 |
Alaska | $229,100 | $66,521 | $30,726 | 64.7 |
Arizona | $215,000 | $50,448 | $25,680 | 67.4 |
Arkansas | $102,300 | $39,267 | $21,274 | 67.7 |
California | $458,500 | $60,883 | $29,188 | 57.4 |
Colorado | $236,600 | $56,456 | $30,151 | 67.6 |
Connecticut | $296,500 | $67,740 | $36,775 | 69.2 |
Delaware | $242,300 | $57,599 | $29,007 | 73.6 |
Florida | $205,600 | $47,661 | $26,551 | 69.7 |
Georgia | $161,400 | $49,347 | $25,134 | 67.2 |
Hawaii | $537,400 | $66,420 | $28,882 | 59.3 |
Idaho | $172,700 | $46,423 | $22,518 | 71.0 |
Illinois | $202,500 | $55,735 | $28,782 | 69.2 |
Indiana | $123,000 | $47,697 | $24,058 | 71.5 |
Iowa | $119,200 | $48,872 | $25,335 | 73.2 |
Kansas | $122,600 | $49,424 | $25,907 | 69.4 |
Kentucky | $116,800 | $41,576 | $22,515 | 69.9 |
Louisiana | $130,000 | $43,445 | $23,094 | 68.2 |
Maine | $176,200 | $46,933 | $25,385 | 73.1 |
Maryland | $329,400 | $70,647 | $34,849 | 69.0 |
Massachusetts | $352,300 | $64,509 | $33,966 | 64.0 |
Michigan | $144,200 | $48,432 | $25,135 | 74.2 |
Minnesota | $206,200 | $57,243 | $29,582 | 74.2 |
Mississippi | $96,500 | $37,881 | $19,977 | 70.8 |
Missouri | $137,700 | $46,262 | $24,724 | 70.0 |
Montana | $173,300 | $43,872 | $23,836 | 69.0 |
Nebraska | $123,900 | $49,342 | $25,229 | 68.6 |
Nevada | $254,200 | $55,726 | $27,589 | 60.1 |
New Hampshire | $253,200 | $63,277 | $31,422 | 72.6 |
New Jersey | $357,000 | $69,811 | $34,858 | 66.9 |
New Mexico | $158,400 | $43,820 | $22,966 | 69.6 |
New York | $303,900 | $55,603 | $30,948 | 55.2 |
North Carolina | $149,100 | $45,570 | $24,745 | 68.1 |
North Dakota | $111,300 | $46,781 | $25,803 | 66.6 |
Ohio | $136,400 | $47,358 | $25,113 | 69.2 |
Oklahoma | $104,300 | $42,979 | $23,094 | 68.2 |
Oregon | $252,600 | $49,260 | $26,171 | 63.8 |
Pennsylvania | $159,300 | $50,398 | $27,049 | 71.0 |
Rhode Island | $279,300 | $54,902 | $28,707 | 62.5 |
South Carolina | $134,100 | $43,939 | $23,443 | 69.9 |
South Dakota | $122,200 | $46,369 | $24,110 | 68.9 |
Tennessee | $134,100 | $43,314 | $23,722 | 69.6 |
Texas | $123,500 | $49,646 | $24,870 | 64.8 |
Utah | $218,100 | $56,330 | $23,139 | 71.2 |
Vermont | $208,400 | $51,841 | $27,478 | 71.4 |
Virginia | $255,100 | $61,406 | $32,145 | 68.9 |
Washington | $285,400 | $57,244 | $29,733 | 64.8 |
West Virginia | $94,500 | $38,380 | $21,232 | 74.6 |
Wisconsin | $169,000 | $51,598 | $26,624 | 69.5 |
Wyoming | $174,000 | $53,802 | $27,860 | 70.2 |
Case Study – Data Visualization and Descriptive Statistics
Introduction Comment by McCloud, Shawn K. (College of Business-Graduate): This document contains headings (like this) and is constructed in APA format. I embedded the instructions and you can use this as a template for the assignment.
This section should clearly summarize the purpose of the analysis you are undertaking (per instruction 6a). One or two paragraphs should suffice.
When completing this assignment please remember the primary objective of scientific reporting is clear communication (American Psychological Association [APA], 2010, p. 65). In this context, clear communication means presenting ideas in an orderly manner and by expressing yourself smoothly and precisely (APA, 2010, p. 65). To this end, your writing should be concise and communicate your ideas clearly and effectively (APA, 2010, p. 65). Comment by McCloud, Shawn K. (College of Business-Graduate): About APA writing – please follow these ‘guidelines’ when writing in this course.
Writing in APA style is concerned with economy of expression which means “say only what needs to be said” (APA, 2010, p.67). Moreover, “short words and short sentences are easier to comprehend than are long ones” (APA, 2010, p. 67). Per Strunk (2014), “a sentence should contain no unnecessary words, a paragraph no unnecessary sentences, for the same reason that a drawing should have no unnecessary lines and a machine no unnecessary parts” (p. 27). Additionally, per Strunk (2014), “prefer the specific to the general, the definite to the vague, the concrete to the abstract” (p. 24).
Overall, your goal should be to examine the statistical data, charts, graphs, etc. and then tell a story about it, per these guidelines (and adhering to the assignment instructions), so that you demonstrate your understanding of the data and can convey critical findings to a reader that may have much less statistical knowledge than you. Think of yourself as a data analyst that needs to put together a meaningful Executive Summary report for your non-statistical boss at work to easily read and understand.
Descriptive Statistics
This section should describe what the tabular data (from step 1 of the assignment) indicates about the central tendency, variability, and distribution of each variable (per instruction 6b). For example, do the variables appear to be distributed in a symmetric or skewed pattern?
Please remember to utilize statistical terms in describing the data and include applicable tables and/or charts (from steps 1-5 of the instructions and from your Excel data). One or two paragraphs should suffice and ensure any graphs or tables have titles and, if applicable, the axes are labeled clearly.
Frequency Histograms and Boxplots
This section should describe how the frequency histograms and the boxplots (from steps 2 and 3 of the assignment) support and clarify the findings of the tabular data (per instruction 6c). Be sure to include any evidence suggesting outliers in the data.
Please remember to utilize statistical terms in describing the data and include applicable tables and/or charts (from steps 1-5 of the instructions and from your Excel data). One or two paragraphs should suffice and ensure any graphs or tables have titles and, if applicable, the axes are labeled clearly.
Scatterplots and Correlations
This section should describe what the scatterplots and correlations (from steps 4 and 5) indicate about the relationship between the various pairs of variables (per instruction 6d). For example, all of the following points for all relationships should be addresses: 1) are the variables related; 2) does the relationship appear to be linear or nonlinear; and, 3) is the direction of the relationship positive or negative?
Please remember to utilize statistical terms in describing the data and include applicable tables and/or charts (from steps 1-5 of the instructions and from your Excel data). One or two paragraphs should suffice and ensure any graphs or tables have titles and, if applicable, the axes are labeled clearly.
Conclusion
This final section contains a concluding paragraph (per instruction 6e) summarizing the key findings of the analysis and making recommendations for the variable among HH Inc, Per Cap Inc and Pct Owner Occ that is most strongly correlated with Home Value.
References
If you use any references (although not required for this assignment) you would list them here… in alphabetical order. If you do not use any references please remove this Reference page before you submit your assignment. Following is how a reference should appear in APA immediately following the “References” heading on the page:
American Psychological Association. (2010). Publication manual of the American Psychological Association (Sixth). American Psychological Association.
Jaggia, S. & Kelly, A. (2019). Business Statistics: Communicating with Numbers 3rd Edition (3rd ed.). New York, NY: McGraw-Hill Education.
Strunk, W. (2014). The elements of style.
Data Visualization and Descriptive Statistics
Data-Driven Decision Making Project