MATU 203 CLASS PROJECT/SIGNATURE ASSIGNMENT INSTRUCTIONS
Parts 1 and 2 DUE: Sunday at 11:59pm in Week 4
Entire Paper, Parts 1 – 4 Due: Saturday 11:59pm in Week 8
An understanding of statistics extends beyond the ability to crunch numbers or use a software program to run statistical analyses. Mastering the concepts in
Introductory Statistics assists in building critical thinking skills, developing businesses and organizations, and solving problems that require data. This Final Class
Project/Signature Assignment is a synthesis of the knowledge obtained throughout the class. You will submit a paper that pulls together the statistics you have learned
and applies these concepts to a final project.
This Signature Assignment/Class Project is due in two submissions: you will submit the first half of the paper in Week 4 and the final paper in Week 8.
PROJECT:
Upon successful completion of this course, you decide that you can earn some extra money as a statistical consultant. You are offered five projects and you must choose
one of the topics/data sets below. You are told that the project will give you full credit for the assignment and a promotion if you have the best statistical research
analysis.
Your objective is to provide a written report that explains the context of the case, an analysis of the sampling methods, graphs that represent the information
provided, an explanation of any outliers, calculations, hypothesis test, and a description of the inferences you make from the results. As the consultant, you will
provide a cover page, research questions, report, hypothesis, substantiation, conclusion, references, appendix with data or supporting information, and recommendations
to finish out the report.
Your goal, and the purpose of this assignment, is to compile all pertinent information into one report. To help you compile the necessary material and to ensure you do
not miss a step, the assignment is broken down into four parts: Primary Situation/Data Analysis with Research, Examination of Descriptive Statistics, Examination of
Inferential Statistics, and Conclusion/Recommendations.
While, in actuality, this may be a final project for this class, these projects can pay thousands and even hundreds of thousands of dollars in an actual work
situation. Why? In order to improve medicine, business, crime rates, or society, we use data. You have the chance to try this in this class. You will explain your
calculations and symbolic operations in order to interpret social and economic trends for this project.
Parts 1 and 2 are due on Sunday at 11:59pm of Week 4.
A cover page, reference page, and appendix are required. The appendix comes at the end of the paper and should include, at a minimum, data from a newspaper/journal
article, research study, university/corporate report, or national data set. Put this in Appendix 1. Your cover page must have your name, title of the paper, type of
assignment project, course name/number, university name, term of enrollment, and optional picture that represents the study you are conducting. Your reference page
must have a minimum of two references outside of the textbook or materials in this class. Other appendices might include graphs or charts from outside sources,
although some of these should be in the body of the paper in order to present your findings so that the reader can see what you are conveying.
Part 1: Primary Data Analysis:
1. Before you can examine the data, you must understand the problem and determine the research questions. What statistics will explain your situation? What graphs, charts, or tables will help readers understand your points?
Your first two paragraphs should discuss the importance of this issue or situation, a clearly defined population, an explicitly defined sample, and the research questions. In
these first two paragraphs, you also need a minimum of two references to outside sources (such as data from a newspaper/journal article, research study,
university/corporate report, or national data set) that present the history, background, or underlying ideas in order to frame the discussion. You must
cite your sources and put a minimum of two references in the reference page, not including the textbook.
Your next two paragraphs should introduce the company/organization and explain why the issue is important to them. Please include:
a. A description of the historical and practical context, variables, units, terminology, and example/application to the real world.
b. Clearly defined dependent and independent variables
c. Levels of measurement of the variables (nominal, ordinal, interval, ratio)
d. Sampling methodology
e. A minimum of one of each, preferably more: confounding, lurking, and missing variables. State the problems you might encounter with missing variables or
information.
f. Experimental or observational study; qualitative or quantitative research
g. One or more graphs, charts, images to give the reader a visual understanding of the background of the topic.
h. A description of the information you collect and how it relates to the big picture of social and economic trends.
Part 2: Examination of Descriptive Statistics
2. Your next step is to collect, organize, and examine the data. This section is designed to present the calculations, graphs, and explanation of what you have found.
Since all inferential tests are based on several assumptions, before you conduct the inferential statistics, you want to make sure that you are not violating any
assumptions. In Part 2, you need to answer the following questions:
a. Are the scores normally distributed?
Construct a histogram, scatterplot, frequency polygon, or other graph to show the nature, shape, or distribution of the data to include in your paper. This graph will
appear in the body of the paper.
Hint #1: A visual inspection of this data can provide information on normality, outliers, spread, and shape. First, ask yourself: Is the graph bell-shaped or skewed?
After your inspection, you must perform calculations or use statistics to confirm normality, outliers, and information about the distribution.
Hint #2: If the distribution is symmetric, can it be considered normal? Read about skewed distributions to determine if the distribution is skewed.
b. Find the mean, median, and mode. These Measures of Central Tendency can help you better understand and describe your data.
Hint: Consider any outliers to see if they are pulling these measures away from the bulk of the data. Describe these measures.
c. Variation: Find the range, variance, and standard deviation. These Measures of Variation show the spread of the data. Describe the dispersion or the amount that
the sample values vary among themselves. You might also want to use the five number summary and provide a boxplot to consider your data or use percentiles and other
measures of spread.
d. Outliers: Identify any sample values that lie very far away from the vast majority of the other sample values.
Hint: You could do this using a visual inspection of the graph, but just looking at the graph is not enough for this course. Since you have learned statistics, you
must use a mathematical method for determining outliers for this report. After all, you are armed with statistics to prove whether or not there is an outlier. You may
use either the standard deviation method to identify extreme scores or the 1.5 x IQR rule. Whichever method you choose, explain what you did and show the calculations
you used to determine if any data lie outside of the boundaries.
e. Corrections: Based on your inspection of the outliers, are there any errors that should be corrected? How would you correct them? Discuss the implications of this
result.
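As a hedged illustration of steps b through d, the following Python sketch computes the measures of central tendency and variation and then applies the 1.5 x IQR rule. The sample values are invented for illustration and do not come from any of the studies, and the helper names quartiles and iqr_outliers are hypothetical. The sketch assumes the quartiles are the medians of the lower and upper halves of the sorted data, which is one common textbook convention; other software may use a different quartile formula and give slightly different boundaries.

```python
import statistics

# Hypothetical sample, invented purely for illustration (not from any study).
sample = [4, 5, 5, 6, 7, 8, 9, 30]


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]      # values below the median position
    upper = ordered[-half:]     # values above it (middle value skipped if n is odd)
    return statistics.median(lower), statistics.median(upper)


def iqr_outliers(values):
    """Return the values lying outside Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]


# Measures of central tendency
mean = statistics.mean(sample)
median = statistics.median(sample)
modes = statistics.multimode(sample)    # list, since data can have several modes

# Measures of variation (sample formulas, dividing by n - 1)
data_range = max(sample) - min(sample)
variance = statistics.variance(sample)
std_dev = statistics.stdev(sample)

outliers = iqr_outliers(sample)

print("mean:", mean, "median:", median, "mode(s):", modes)
print("range:", data_range, "variance:", round(variance, 2), "std dev:", round(std_dev, 2))
print("outliers by the 1.5 x IQR rule:", outliers)
```

In this invented sample the single large value pulls the mean well above the median, which is the kind of effect the hint under step b asks you to describe.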
Part 3: Examination of Inferential Statistics
3. Assuming that all assumptions have been met, it is now time for you to conduct some inferential statistics. While you will need to do a hypothesis test, you will
also either (1) compute a confidence interval, (2) find out if there is a correlation, or (3) use a regression line. You need to describe your hypotheses, assumptions,
and tests used. For the hypothesis test:
a. Present your assumptions
b. State null and alternative hypotheses
c. Select the significance level and determine if it is a one- or two-tailed test
d. Select your test statistic and compute the value (traditional method or p-value method)
e. Presentation of the confidence interval, correlation coefficient, or regression line
f. Make a decision and explain the inferential statistics and how they connect to the research questions and study’s purpose. Determine how the inferences you made
relate to the research you collected in Part 1 and the descriptive statistics you presented in Part 2.
Hint: A final conclusion that says “reject the null hypothesis” by itself without explanation is basically worthless to those who hired you. Similarly, stating only
that the hypothesis is false or rejected is not sufficient.
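The hypothesis-test steps above can be sketched in Python for the simplest case, a two-tailed one-sample t test with a matching confidence interval. This is a minimal sketch under stated assumptions, not a required method: the sample and the hypothesized mean mu_0 are invented for illustration, and the critical value 2.262 is the standard t-table entry for 9 degrees of freedom at a 0.05 significance level (two-tailed). Your own study will need its own hypotheses, degrees of freedom, and critical value.

```python
import math
import statistics

# Hypothetical sample of 10 measurements, invented for illustration only.
sample = [14, 15, 13, 16, 12, 15, 14, 17, 13, 11]

mu_0 = 12          # hypothesized population mean: H0: mu = 12, H1: mu != 12
alpha = 0.05       # significance level, two-tailed test
# Assumption: critical value read from a t table for df = n - 1 = 9, alpha = 0.05 (two-tailed).
T_CRITICAL = 2.262

n = len(sample)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)             # sample standard deviation (n - 1)
std_error = s / math.sqrt(n)

# Test statistic (traditional method): how many standard errors x_bar lies from mu_0.
t_stat = (x_bar - mu_0) / std_error
reject_null = abs(t_stat) > T_CRITICAL

# 95% confidence interval for the population mean.
margin = T_CRITICAL * std_error
ci_low, ci_high = x_bar - margin, x_bar + margin

print("t =", round(t_stat, 3), "| reject H0:", reject_null)
print("95% CI: (", round(ci_low, 2), ",", round(ci_high, 2), ")")
```

Note that the decision and the interval agree: the hypothesized mean falls outside the confidence interval exactly when the two-tailed test rejects the null hypothesis at the same significance level. The written report still has to explain what that decision means for the research question.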
Part 4: Conclusion and Recommendations
Using the results from your hypothesis test, correlation, confidence interval, and/or other measures, explain what the results mean.
a. What can you determine from the descriptive and inferential statistics?
b. What information might lead you to a different conclusion?
c. What variables were needed to present a more thorough analysis?
d. What additional information would be valuable to help draw a more certain conclusion?
e. What qualitative or quantitative data would you want to collect if you were hired to do a follow up study?
f. Using the research you gathered and information you presented, what can you now state about the issue or topic that is further clarified or needs to be further
analyzed?
Final Comments Before you Begin
Computers and calculators are quite good at yielding results, but such results typically require the careful interpretation that breathes life into an otherwise
meaningless set of data and outcomes. We should recognize that a result is not automatically valid simply because it was computer-generated. Computers do not think,
even though they crunch out results.
The key in this study is to make sure your data and results make sense when considered in the context of the real world. We should always apply the most important and indispensable
tool in all of statistics – common sense! So be sure to criticize, analyze, and connect the data to the question to be answered in a clear way.
Plagiarism Note:
Your paper will be submitted through Turnitin.com. You will get a percent that represents the amount your paper is similar to or copied from another source.
(1) You may not copy your paper or use a reference without quotes or giving credit to the original author.
(2) You may not use a paper you wrote from a prior course, even if it is the same class you are taking now.
(3) If your Turnitin similarity percent is at or below 30%, you are fine. If it is above 30% and up to 60%, a warning may be sent either to your advisor or to a school
administrator and you will have to resubmit a new version with a late penalty determined by your instructor. If your percent is over 60%, you will receive an F on the
paper.
FORMAT: Word document, with graphs in the body of the text, a reference section, and appendices containing the original data, outside information, and Excel documents.
Length:
Title page: You should have a cover page. The cover page should have the title of your study, your name, type of assignment, Brandman University, MATU 203, term of
enrollment, and year of attendance. You might find it helpful to include an image of something representative of the study on the cover. This is not necessary, but may
provide a visual context for your report.
APA or MLA Format with name and page number at the top of each page
Write Up/Body: The body should be 5-10 pages with the graphs, images, screen shots, data output, and text. Additional cover page, reference page, and appendix.
APA or MLA Format with name and page number on each page
References: Please include all articles, books, websites, publications, or other information that helped you reach your conclusion. The references come before the
appendix. At least two references are required.
Appendices: You must have an appendix (this comes after the reference section); this might be an additional 1-2 pages. In the appendix, include original data and
supplemental information. You may also want to include computations, but this is not necessary. It may be helpful when you refer to this later or if the reader wants
to review your computations.
Note: The first half of the paper with Part 1 and Part 2 of this assignment is due on Sunday night at 11:59pm of Week 4. The final paper with Parts 1-4 is due on
Saturday night at 11:59pm of Week 8.
You may choose any one of the five studies below for your final class project.
Study #1: Business/Sociology
Mickey and Minnie Home Sales, an Orange County real estate company, wants to know if there is a relationship between the home prices and weeks on the market prior to
the sale. They conjecture that the higher the price of the home, the longer the house will be on the market. They select a simple random sample of 20 recently sold
homes in Anaheim for analysis. In a separate analysis, they also want to know if the weeks on the market for their company is significantly higher or lower than the
average of 12 weeks to sale. You are asked to analyze this data and present your results to the company with a hypothesis test on the average weeks to sale,
correlation between the two variables, and a regression analysis.
X = Asking Price (thousands), Y = Weeks to Sale
  X    Y      X    Y      X    Y
150    6    180    6    400   10
250   17    600   22    300   18
200    8    300    9    200    9
320   12    210    8    240   14
 80    5    300   16     65    7
120    8    460   14    900   13
300   12    670   15
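For Study #1, the correlation coefficient and the least-squares regression line can be computed directly from their definitions. The sketch below is one hedged way to do this in Python using the 20 (X, Y) pairs listed above; the function name pearson_and_line is hypothetical, and the code only produces the numbers. Judging whether the relationship is significant, and what it means for the company, remains part of your written analysis.

```python
import math

# (asking price in $ thousands, weeks to sale) for the 20 homes in Study #1.
homes = [
    (150, 6), (180, 6), (400, 10),
    (250, 17), (600, 22), (300, 18),
    (200, 8), (300, 9), (200, 9),
    (320, 12), (210, 8), (240, 14),
    (80, 5), (300, 16), (65, 7),
    (120, 8), (460, 14), (900, 13),
    (300, 12), (670, 15),
]


def pearson_and_line(pairs):
    """Return (r, slope, intercept) for the least-squares line y = intercept + slope * x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Sums of squared deviations and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


r, slope, intercept = pearson_and_line(homes)
print("r =", round(r, 3))
print("regression line: weeks =", round(intercept, 2), "+", round(slope, 4), "* price")
```

The slope is expressed in weeks per thousand dollars of asking price, so state the units when you interpret it in the report.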
Study #2: Nursing and Health Sciences
Premiere Hospital of Newport Beach (PHNB) wants to determine whether or not to serve caffeinated coffee in the waiting rooms for their patients. At this point, they
are not concerned about the visitors, only people who will be treated at the hospital. Since similar facilities do serve coffee, tea, water, and other beverages, they
want to determine if there is sufficient evidence to show that caffeine increases heart rate. The hospital tested a random sample of 21 patients of varying ages who
drank caffeinated beverages in the hospital’s emergency area waiting room on a single day at three different times of the day. You are asked to analyze this data and
present your results to Premiere with a hypothesis test of whether caffeine significantly increases heart rate, correlation between the two variables, and a regression
analysis. PHNB would also like you to present national research data/results to support the results of their study. Since they also have locations in Huntington Beach
and Irvine, they want to use this data to decide what policy to implement at all of their facilities. Please present a report to the Premiere Hospital administrators
that includes your data and supporting evidence from national studies.
Heart Rate Before Coffee (Before) and Heart Rate After Coffee (After)
Before  After    Before  After    Before  After
    80     95        75     85        72     88
    70    100        88     98        75     79
    80     84        70     95        60     72
    65     65        78     80        72     90
    77     84        70     80        60     65
    60     74        62     78        68     78
    90    100        50     60        65     75
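Because each patient in Study #2 has a heart rate recorded both before and after coffee, one reasonable approach is a paired (dependent-samples) t test on the differences. The Python sketch below assumes that the before and after readings in each pair belong to the same patient and that the test is one-tailed (H1: the mean increase is greater than 0) at a 0.05 significance level; the critical value 1.725 is the t-table entry for 20 degrees of freedom. Treat it as a sketch of the calculation, not as the only acceptable test.

```python
import math
import statistics

# (heart rate before coffee, heart rate after coffee) for the 21 patients in Study #2.
patients = [
    (80, 95), (75, 85), (72, 88),
    (70, 100), (88, 98), (75, 79),
    (80, 84), (70, 95), (60, 72),
    (65, 65), (78, 80), (72, 90),
    (77, 84), (70, 80), (60, 65),
    (60, 74), (62, 78), (68, 78),
    (90, 100), (50, 60), (65, 75),
]

# Assumption: one-tailed critical value from a t table for df = 20, alpha = 0.05.
T_CRITICAL = 1.725

# Work with the per-patient change: after minus before.
differences = [after - before for before, after in patients]
n = len(differences)
mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)      # sample standard deviation of the differences

# H0: mean difference = 0; H1: mean difference > 0 (heart rate increases).
t_stat = mean_diff / (sd_diff / math.sqrt(n))
reject_null = t_stat > T_CRITICAL

print("mean increase:", round(mean_diff, 2), "beats per minute")
print("t =", round(t_stat, 2), "| reject H0:", reject_null)
```

A statistically significant increase in this sample does not by itself settle the hospital's policy question; the report should also weigh the sampling method, possible lurking variables, and the national research PHNB asked for.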
Study #3: Psychology
Some people believe that reading text printed on colored paper is easier than reading text on white paper. For this reason, disability services locations often provide
transparent colored filters to place over the pages of books and printed materials. Irvine University, a public research institution with 30,000 undergraduate students, is
considering the possibility of providing colored filters or materials on colored paper for those people in disability services who would be helped by this tool. Irvine
University randomly tested 21 students in their learning disability program during the first week of classes in Fall 2015. The students, of varying ages, abilities,
majors, and socioeconomic status, were timed as they read a passage printed in black ink on white paper and timed again as they read a comparable passage printed in black
ink on light blue-colored paper. Students were tested on one day with a passage on white paper and then asked to return voluntarily the next day to read a similar, but
different, passage of the same length and difficulty level on light blue-colored paper. Is there sufficient evidence to show that reading text on light blue-colored paper
is more effective? You are asked to analyze this data and present your results to the university with a hypothesis test on whether or not there is significant
improvement, a correlation between the two variables, and a regression analysis.
Also include in your report research from journals, articles, and national studies that show whether or not there is a relationship between colored paper and
improvement in reading.
Light Blue-Colored Paper (Blue) and White Paper (White)
Blue  White    Blue  White    Blue  White
  70     65      79     68      49     44
 113    100      67     60      75     69
  50     40      73     71      60     58
 138    120      67     74      82     75
  77     74      70     68      88     78
 120    100     110     90      80     74
 100     90      95     85     100     85
Study #4: Sociology – Homeless Veterans
You were hired to better understand the plight of homelessness in California. You learn that there are more veterans in California than any other state and you need a
better understanding of whether or not there is a correlation between homelessness and education, homelessness and crime, and homelessness and mental illness. You
survey 100 homeless people, interviewing 20 from each of the following five cities: Los Angeles, San Diego, San Francisco, Sacramento, and Bakersfield. The project
turns out to be a huge endeavor, so you decided to take a small chunk of the project to present for your statistics class.
You decide to study 20 veterans of 100 homeless people (about a third were veterans) in order to better understand homelessness in California. You choose to determine
if education level and length of time on the streets are related. Members of the sample have ages ranging from 25 – 70. You ask each participant how many months
they have lived on the streets and the grade they finished in school (one year of college would equal 13 years of school, 4 years of college would equal 16 years of
school). Some got a GED and had less than 12 years of formal education. You are asked by the State of California to analyze this data and present your results with a
hypothesis test on whether or not there is a significant relationship between education and homelessness in veterans, a correlation between the two variables, and
a regression analysis. You will want to back up your data with U.S. Census data, information from the Department of Housing and Urban Development, and other
sources. From this information do you think there is a problem? What does the data suggest about why people are homeless and how to support the homeless?
Education in years of school (Educ) and Homelessness in months on the street (Months)
Educ  Months    Educ  Months    Educ  Months
  16       5      16      18      12      14
  13      10      12      20      13       9
  12       4      13      17      10      18
  12      12      14       4      12      35
  18       7       9       8      16      28
  12      10      10       9      15      14
  16      20      15      15
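For Study #4, a hypothesis test on the relationship between the two variables can be based on the correlation coefficient, using the test statistic t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom. The Python sketch below applies that formula to the 20 pairs listed above. It assumes a two-tailed test at a 0.05 significance level, for which the t-table critical value with 18 degrees of freedom is 2.101; the function name pearson_r is hypothetical.

```python
import math

# (education in years of school, months on the street) for the 20 veterans in Study #4.
veterans = [
    (16, 5), (16, 18), (12, 14),
    (13, 10), (12, 20), (13, 9),
    (12, 4), (13, 17), (10, 18),
    (12, 12), (14, 4), (12, 35),
    (18, 7), (9, 8), (16, 28),
    (12, 10), (10, 9), (15, 14),
    (16, 20), (15, 15),
]

# Assumption: two-tailed critical value from a t table for df = 18, alpha = 0.05.
T_CRITICAL = 2.101


def pearson_r(pairs):
    """Return the Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxy / math.sqrt(sxx * syy)


n = len(veterans)
r = pearson_r(veterans)

# H0: no linear correlation (rho = 0); H1: rho != 0.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
significant = abs(t_stat) > T_CRITICAL

print("r =", round(r, 3), "| t =", round(t_stat, 3), "| significant:", significant)
```

Whatever the outcome, remember that a sample of 20 veterans describes only a linear relationship in this sample; the supporting Census and HUD data are what let you place the result in context.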
Study #5: Criminal Justice
The FBI collects national data on a variety of areas of criminal justice. The FBI’s Uniform Crime Reporting (UCR) collects and provides data on arrests, citations, and
summonses, in geographic regions. Many public bodies seek to prevent violent, property, and drug crimes and understand the issues surrounding criminal behavior. You
are asked to determine if there is a relationship between poverty and crime and, if so, better understand how this cycle can be broken in order to prevent burglary,
theft, forgery, counterfeiting, fraud, smuggling, drug distribution, vandalism, rape, assault, and murder. In the light of the correlation coefficient you present and
the hypothesis test you perform, do you think that poverty causes crime or is it something else?
Since arrests do not equal convictions and number of convictions is not a good variable with the Three Strike Law in California, you decide to ask 20 random people in
and around the Los Angeles courthouse over the month of July to voluntarily give you their family income and self-reported incidents of criminal activity for your
study. Half of them were awaiting trials of various types and half were bystanders.
Family Income (in thousands) (Income) and # of self-reported incidents of criminal activity in their lifetime (Incidents)
Income  Incidents    Income  Incidents    Income  Incidents
   160          1        25         10        15         14
     5          8        40          5        28         10
    12          0        32         13        30          8
   100          1        14         20       120          0
    18          7        90          1        30         10
   200          0        30          9        40          4
    16          2        40          6
Prepare this analysis for your local law enforcement agency examining the statistics using information provided and supporting evidence from reference/research
sources. Analyze this data and present your results with a hypothesis test on whether or not there is a significant relationship between family income and number of
incidents of criminal activity. Is there a correlation between the two variables? Present a regression line with a graph to look at where the data lie and explain the
data.