Statistics 2 for business
Individual Coursework Assignment 1
This assignment is worth 50% of the overall module mark and consists of 2 parts (Part I and Part II).
Part I. For this part you are required to perform an independent samples t-test using SPSS. Your solution should be word-processed and submitted electronically. Your solution should include any output produced from the analysis, and an account of the methods you used to obtain that output.
The problem:
Data file. Creditpromo
This is a hypothetical data file that concerns a department store’s efforts to evaluate the effectiveness of a recent credit card promotion (in dollars spent by cardholders). To this end, 500 cardholders were randomly selected. Half received an ad promoting a reduced interest rate on purchases made over the next three months. Half received a standard seasonal ad.
Your task is to analyse the data to find out whether the type of advertisement had an impact on the amount (in dollars) spent by cardholders.
Perform a suitable independent samples t-test on this data.
You should address the following questions, explaining your answers in detail:
(a) What is the population from which this sample was drawn?
(b) What is the purpose of Levene’s Test? – Explain why it is important that Levene’s Test is included in the output for this independent samples t-test
(c) What is the Null hypothesis for Levene’s Test?
(d) Do we accept or reject the null hypothesis for Levene’s Test? – Explain why.
(e) Which of the two t-tests labelled “Equal variances assumed” and “Equal variances not assumed” should we use? – Explain why.
(f) In order to apply the t-test, what assumption do we make about the distribution of the errors?
(g) What is the null hypothesis for the t-test?
(h) Do we accept or reject the null hypothesis for the t-test? – Explain why.
(i) Is it appropriate to use a one-tailed or two-tailed test here? – Explain why.
(j) What overall conclusion can we draw from this output? – Include a reference to the minimum difference between the amount of dollars spent by cardholders who received a standard seasonal ad and those who received a promotional ad that you would expect to find in the population.
Part II. Task Brief
For this coursework part II you are required to solve a regression problem using SPSS. Your solution should be word-processed and submitted electronically. Your solution should include any output produced from the analysis, and a detailed account of the methods you have used, the reasons you have chosen those particular methods, and the conclusions you have drawn.
The problem:
Data file. Car sales.
This data file contains hypothetical sales estimates and list prices for various makes and models of vehicles.
Analysts for a car sales company are aware that sales of cars depend on car prices, and would like to determine the relationship between list prices and sales more precisely. They collect data on list prices (in thousands of monetary units) and sales (thousands of cars) for six weeks.
Produce a suitable graph to investigate the relationship between the two variables, and report your findings.
Perform an appropriate regression analysis in SPSS, to predict sales figures given the price, and write a detailed report of your findings. Your report should address (but not necessarily be confined to) the following questions:
(a) What percentage of the variation in car sales is accounted for by your model?
(b) What is the equation of best fit, and how do you interpret the coefficients in your model?
(c) By how much, on average, can we expect sales to increase if the list price increases by 10 points?
(d) What assumptions are made about the distribution of the data? If you are able to test whether the assumptions appear to be true, report on your results.
(e) On average, what level of sales can we expect when the list price is 10 thousand (of monetary units)?
(f) In a “worst case scenario”, what is the lowest level of sales that we would expect when the list price is 10 thousand (of monetary units)? (Use a confidence level of 95%.)
HERE ARE HYPOTHETICAL ANSWERS:
You should address the following questions, explaining your answers in detail:
(a) What is the population from which this sample was drawn?
All of the department store’s credit cardholders, from whom the 500 cardholders in the sample were randomly selected.
(b) What is the purpose of Levene’s Test? – Explain why it is important that Levene’s Test is included in the output for this independent samples t-test
Levene’s test is used to assess the equality of variances calculated for two or more groups.
Some common statistical procedures assume that the variances of the populations from which different samples are drawn are equal. Levene’s test assesses this assumption.
Equality of variances across samples is called homogeneity of variance. Some statistical tests, for example the analysis of variance and the standard (pooled) independent samples t-test, assume that variances are equal across groups or samples. Levene’s test can be used to check that assumption, which is why it appears in the t-test output: its result determines which row of the t-test table to read.
(c) What is the Null hypothesis for Levene’s Test?
It tests the null hypothesis that the population variances are equal (called homogeneity of variance or homoscedasticity).
(d) Do we accept or reject the null hypothesis for Levene’s Test? – Explain why.
If the resulting p-value of Levene’s test is less than some significance level (typically 0.05), the obtained differences in sample variances are unlikely to have occurred based on random sampling from a population with equal variances. Thus, the null hypothesis of equal variances is rejected and it is concluded that there is a difference between the variances in the population. If the p-value is greater than the significance level, we do not reject the null hypothesis, and the variances can be treated as equal.
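As a rough illustration of what the test measures, the sketch below computes Levene’s W statistic by hand, using absolute deviations from each group mean. The function name and the sample values are hypothetical and are not taken from the Creditpromo file; in practice SPSS reports the statistic and its Sig. value directly. W is compared with an F distribution on (k − 1, N − k) degrees of freedom.

```python
from statistics import mean

def levene_statistic(*groups):
    """Levene's W, using absolute deviations from each group mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every observation from its own group mean
    deviations = [[abs(y - mean(g)) for y in g] for g in groups]
    group_means = [mean(zs) for zs in deviations]
    grand_mean = mean([z for zs in deviations for z in zs])
    # Spread of the group-level mean deviations around the overall mean deviation
    between = sum(len(zs) * (m - grand_mean) ** 2
                  for zs, m in zip(deviations, group_means))
    # Spread of individual deviations within each group
    # (zero if every deviation in a group is identical; not handled here)
    within = sum((z - m) ** 2
                 for zs, m in zip(deviations, group_means) for z in zs)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical spending figures, for illustration only
promo = [120.0, 135.5, 150.0, 110.0, 142.5]
standard = [100.0, 128.0, 95.5, 140.0, 90.0]
print(levene_statistic(promo, standard))
```

A large W (small Sig. value) suggests the group variances differ; a W near zero suggests they are similar.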
(e) Which of the two t-tests labelled “Equal variances assumed” and “Equal variances not assumed” should we use? – Explain why.
If the Sig. value for Levene’s test is greater than .05, the variability in the two groups is about the same: use “Equal variances assumed”.
If the Sig. value is less than or equal to .05, the variability in the two groups is different: use “Equal variances not assumed”.
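The two rows of the SPSS table correspond to two ways of computing the t statistic. The sketch below shows both under the usual textbook formulas: a pooled-variance version for “Equal variances assumed” and the Welch version for “Equal variances not assumed”. The function names and data are hypothetical; p-values are left to SPSS or to t tables.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """t statistic and degrees of freedom assuming equal variances."""
    n1, n2 = len(a), len(b)
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

def welch_t(a, b):
    """t statistic and degrees of freedom without assuming equal variances."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a) / n1, variance(b) / n2
    t = (mean(a) - mean(b)) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical spending figures, for illustration only
promo = [120.0, 135.5, 150.0, 110.0, 142.5]
standard = [100.0, 128.0, 95.5, 140.0, 90.0]
print(pooled_t(promo, standard))
print(welch_t(promo, standard))
```

With equal group sizes, as in this study, the two t statistics coincide and only the degrees of freedom differ.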
(f) In order to apply the t-test, what assumption do we make about the distribution of the errors?
The errors (the differences between each observation and its group mean) are assumed to be normally distributed.
(g) What is the null hypothesis for the t-test?
The null hypothesis is that the mean amount spent is the same in the two populations (μ1 = μ2), that is, the type of ad makes no difference to spending. The alternative hypothesis is that some difference exists between the two population means.
(h) Do we accept or reject the null hypothesis for the t-test? – Explain why.
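The purpose of the independent samples t-test is to determine whether the null hypothesis should be rejected, given the sample data.
A low p-value (below the chosen significance level, typically 0.05) means the observed difference would be unlikely if the null hypothesis were true, so the null hypothesis is rejected. Otherwise, we do not reject it.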
(i) Is it appropriate to use a one-tailed or two-tailed test here? – Explain why.
One-tailed, as we want to find out whether the promotional ad increased credit card spending.
(j) What overall conclusion can we draw from this output? – Include a reference to the minimum difference between the amount of dollars spent by cardholders who received a standard seasonal ad and those who received a promotional ad that you would expect to find in the population.
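This depends on the SPSS output. If the p-value leads us to reject the null hypothesis, we conclude that the type of ad affected spending. The minimum difference we would expect to find in the population is the limit closest to zero of the 95% confidence interval of the difference reported in the output.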
Produce a suitable graph to investigate the relationship between the two variables, and report your findings.
A scatter plot, with sales on the vertical axis and list price on the horizontal axis.
Perform an appropriate regression analysis in SPSS, to predict sales figures given the price, and write a detailed report of your findings. Your report should address (but not necessarily be confined to) the following questions:
Simple linear regression, with sales as the dependent variable and list price as the independent variable. The slope measures the effect of a one-unit increase in the independent variable on the dependent variable.
(a) What percentage of the variation in car sales is accounted for by your model?
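This is given by R Square (the coefficient of determination) in the Model Summary table, not by R itself. Multiplied by 100, it is the percentage of the variation in sales accounted for by the model.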
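For a regression with a single predictor, R Square is simply the square of the correlation between price and sales. The sketch below illustrates this by hand; the function name and the six data points are hypothetical and are not taken from the Car sales file.

```python
from statistics import mean

def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    # Squared correlation coefficient
    return sxy ** 2 / (sxx * syy)

# Hypothetical six weeks of data: price and sales, both in thousands
price = [8, 9, 10, 11, 12, 13]
sales = [30, 28, 27, 24, 22, 21]
print(f"{100 * r_squared(price, sales):.1f}% of the variation explained")
```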
(b) What is the equation of best fit, and how do you interpret the coefficients in your model?
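The line of best fit is the trendline that best represents the data on the scatter plot. Its equation has the form sales = b0 + b1 × price, where b0 and b1 are taken from the B column of the Coefficients table. The constant b0 is the predicted level of sales when the price is zero, and the slope b1 is the average change in sales for a one-unit increase in price.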
(c) By how much, on average, can we expect sales to increase if the list price increases by 10 points?
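By 10 times the price coefficient (the slope). If the coefficient is negative, this is the amount by which sales are expected to fall.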
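(d) What assumptions are made about the distribution of the data? If you are able to test whether the assumptions appear to be true, report on your results.
The assumptions include: a linear relationship exists between price and sales; the error terms are normally distributed, have constant variance and are independent of one another. Where there are several independent variables, they should also not be correlated with each other; with a single predictor, as here, this does not arise. The scatter plot and plots of the residuals (for example, a normal P-P plot) can be used to check these assumptions.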
(e) On average, what level of sales can we expect when the list price is 10 thousand (of monetary units)?
Substitute the list price into the fitted regression equation and calculate the sales. Because price is recorded in thousands, a list price of 10 thousand is entered as 10.
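The calculation can be sketched as follows, using the ordinary least-squares formulas for the intercept and slope. The function name and data are hypothetical; with the real file, the coefficients would be read from the SPSS Coefficients table.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical six weeks of data: price and sales, both in thousands
price = [8, 9, 10, 11, 12, 13]
sales = [30, 28, 27, 24, 22, 21]

b0, b1 = fit_line(price, sales)
print("Predicted sales at a price of 10:", b0 + b1 * 10)
print("Change in sales for a 10-unit price increase:", 10 * b1)
```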
(f) In a “worst case scenario”, what is the lowest level of sales that we would expect when the list price is 10 thousand (of monetary units)? (Use a confidence level of 95%.)
Calculate the 95% prediction interval for sales at that list price and take its lower limit.
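A minimal sketch of that interval, assuming the standard formula for a prediction interval in simple linear regression. The function name and data are hypothetical, and the critical value is passed in by the caller: 2.776 is the two-sided 95% t value for n − 2 = 4 degrees of freedom, which applies only if there are six observations.

```python
from math import sqrt
from statistics import mean

def prediction_interval(x, y, x0, t_crit):
    """Lower limit, point prediction and upper limit for a new y at x0."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    # Residual standard error on n - 2 degrees of freedom
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = sqrt(sse / (n - 2))
    y_hat = intercept + slope * x0
    # The leading "1 +" accounts for the scatter of an individual new observation
    margin = t_crit * s * sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)
    return y_hat - margin, y_hat, y_hat + margin

# Hypothetical six weeks of data: price and sales, both in thousands
price = [8, 9, 10, 11, 12, 13]
sales = [30, 28, 27, 24, 22, 21]

lower, fit, upper = prediction_interval(price, sales, x0=10, t_crit=2.776)
print("Worst-case sales at a price of 10:", lower)
```

In SPSS, the equivalent limits can be saved from the Linear Regression dialog as individual prediction intervals.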