Statistics Problems
STAT1070 Assignment 1
Please justify your answer to each question. This justification can involve hand calculation or providing relevant interpretation of output from SPSS and/or statstar.io. If a question requires hand calculation, please show your working. If a question requires output from statstar.io, please provide and refer to this output accordingly. Do not simply restate SPSS or statstar.io output, but provide concise interpretation of this output where appropriate.
Question 1. [16 marks]
A data set contains information about 120 different female dogs who have bred, 40 of which are labradors, 40 are golden retrievers, and 40 are German shepherds.
The variables are:
Breed: breed of the dog
LitterSize: the number of pups in the dog’s most recent litter
Weight: the weight of the dog, in kilograms
Height: the height of the dog, in centimetres
Temperament: behaviour score, with 5 possible scores: very poor, poor, standard, good, and very good
(a) [5 marks] What types of variables are Breed, LitterSize, Weight, Height, and Temperament? Provide a justification for each answer. Your mark for each variable will be based on this justification.
(b) [2 marks] Using the output of Figures 1 and 2, describe the distribution of the weight of the dogs.
(c) [3 marks] Using the output of Figures 3 and 4, describe the relationship (if any) between breed and temperament.
(d) [3 marks] Using the output of Figures 5 and 6, describe the relationship (if any) between weight and height of the dogs.
(e) [3 marks] Using the output of Figures 7 and 8, describe the relationship (if any) between litter size and breed.
Figure 1: A histogram and boxplot of the weights of the 120 dogs.
Figure 2: Descriptive statistics of the weights of the 120 dogs.
Figure 3: A stacked bar chart of the temperament by breed for the 120 dogs.
Figure 4: Counts and proportions of temperament by breed for the 120 dogs.
Figure 5: A scatterplot of weight vs height for the 120 dogs.
Figure 6: Correlation output of weight vs height for the 120 dogs.
Figure 7: Side-by-side boxplots of litter size by breed for the 120 dogs.
Figure 8: Descriptive statistics of litter size by breed for the 120 dogs.
Question 2. [15 marks]
To get full marks for the following questions you need to convert the question from words to a mathematical expression (i.e. use mathematical notation), defining your events where necessary, and using correct probability statements.
Suppose the University of Newcastle (UON) service area consists of the three Main Statistical Areas in which most University of Newcastle students live: the Central Coast (CC), Hunter excluding Newcastle (HEN), and Lake Macquarie and Newcastle (LMN) areas. According to the Australian Bureau of Statistics, 14.6% of residents in the CC area were born overseas, 8.4% of residents in the HEN area were born overseas, and 11.7% of residents in the LMN area were born overseas. Across the UON service area, 34.6% of residents live in the CC area, 27.0% live in the HEN area, and 38.4% live in the LMN area.
Let B be the event that a resident was born overseas, CC be the event that a resident lives in the Central Coast, HEN be the event that a resident lives in the Hunter excluding Newcastle area, and LMN be the event that a resident lives in the Lake Macquarie and Newcastle area.
(a) [3 marks] Construct a tree diagram that summarises the given probability information.
(b) [2 marks] What is the probability that a randomly selected resident in the UON service area is a resident of the Central Coast and was born overseas?
(c) [3 marks] What is the probability that a randomly selected resident in the UON service area was born overseas?
(d) [2 marks] Are the events B and CC independent? Why or why not?
(e) [3 marks] If a randomly selected resident in the UON service area was born overseas, what is the probability that he or she is a resident of the Hunter excluding Newcastle area?
(f) [2 marks] In part (c), you found the probability that a randomly selected resident in the UON service area was born overseas. Can you infer that this probability is the same as the probability that a UON student was born overseas? Why or why not?
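The joint, total and conditional probabilities asked for in this question can be checked numerically. The following is a minimal Python sketch, not a model answer: the dictionary and variable names are illustrative assumptions, and only the percentages are taken from the question text. It applies the multiplication rule, the law of total probability and Bayes' rule to the events B, CC, HEN and LMN defined above.

```python
# Probability of living in each area, P(area), as given in the question.
p_area = {"CC": 0.346, "HEN": 0.270, "LMN": 0.384}

# Probability of being born overseas given the area, P(B | area).
p_b_given_area = {"CC": 0.146, "HEN": 0.084, "LMN": 0.117}

# Multiplication rule: P(CC and B) = P(CC) * P(B | CC)
p_cc_and_b = p_area["CC"] * p_b_given_area["CC"]

# Law of total probability: P(B) = sum over areas of P(area) * P(B | area)
p_b = sum(p_area[a] * p_b_given_area[a] for a in p_area)

# Bayes' rule: P(HEN | B) = P(HEN) * P(B | HEN) / P(B)
p_hen_given_b = p_area["HEN"] * p_b_given_area["HEN"] / p_b

# B and CC are independent only if P(B | CC) equals P(B).
b_and_cc_independent = abs(p_b_given_area["CC"] - p_b) < 1e-9

print(f"P(CC and B) = {p_cc_and_b:.4f}")
print(f"P(B)        = {p_b:.4f}")
print(f"P(HEN | B)  = {p_hen_given_b:.4f}")
print(f"B and CC independent? {b_and_cc_independent}")
```

A script like this only confirms the arithmetic; the written answer still needs the probability statements and the tree diagram.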
Question 3. [16 marks]
To get full marks for the following questions you need to convert the question from words to a mathematical expression (i.e. use mathematical notation), defining your random variables where necessary, and using correct probability statements.
Suppose that the IQ of adults is normally distributed with a mean of 100 and a standard deviation of 15.
(a) [2 marks] What IQ score distinguishes the highest 10%?
(b) [3 marks] What is the probability that a randomly selected person has an IQ score between 91 and 118?
(c) [2 marks] Suppose people with IQ scores above 125 are eligible to join a high-IQ club. Show that approximately 4.78% of people have an IQ score high enough to be admitted to this particular club.
(d) [4 marks] Let X be the number of people in a random sample of 25 who have an IQ score high enough to join the high-IQ club. What probability distribution does X follow? Justify your answer.
(e) [2 marks] Using the probability distribution from part (d), find the probability that at least 2 people in the random sample of 25 have IQ scores high enough to join the high-IQ club.
(f) [3 marks] Let L be the amount of time (in minutes) it takes a randomly selected applicant to complete an IQ test. Suppose L follows a uniform distribution from 30 to 60. What is the probability that the applicant will finish the test in less than 45 minutes?
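The normal, binomial and uniform calculations in this question can be cross-checked with Python's standard library. The sketch below is one possible check, not a model answer. It assumes Python 3.8 or later (for `statistics.NormalDist` and `math.comb`), and it assumes that X is modelled as binomial with n = 25 and success probability P(IQ > 125), which is the model the question appears to point towards. The helper `binom_pmf` and all variable names are hypothetical.

```python
from math import comb
from statistics import NormalDist

# IQ ~ Normal(mean 100, standard deviation 15), as stated in the question.
iq = NormalDist(mu=100, sigma=15)

# Score separating the highest 10%: the 90th percentile of the distribution.
cutoff_top10 = iq.inv_cdf(0.90)

# P(91 < IQ < 118) as a difference of cumulative probabilities.
p_between = iq.cdf(118) - iq.cdf(91)

# P(IQ > 125): proportion eligible for the high-IQ club.
p_club = 1 - iq.cdf(125)


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p). Hypothetical helper for this sketch."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Assumed model: X ~ Binomial(25, p_club), so P(X >= 2) = 1 - P(X = 0) - P(X = 1).
n = 25
p_at_least_2 = 1 - binom_pmf(0, n, p_club) - binom_pmf(1, n, p_club)

# L ~ Uniform(30, 60): P(L < 45) = (45 - 30) / (60 - 30).
p_under_45 = (45 - 30) / (60 - 30)

print(f"Top 10% cutoff   = {cutoff_top10:.2f}")
print(f"P(91 < IQ < 118) = {p_between:.4f}")
print(f"P(IQ > 125)      = {p_club:.4f}")
print(f"P(X >= 2)        = {p_at_least_2:.4f}")
print(f"P(L < 45)        = {p_under_45:.4f}")
```

Values read from printed normal tables may differ slightly from these results because table z-scores are rounded to two decimal places.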