Statistical Inference & Linear Regression Harvard Case Solution & Analysis

Question 1

Q1a). Two line graphs were generated in the Excel sheet, and they show noticeable differences in PC knowledge between customers with and without a PC. PC knowledge varies more among customers without a PC, and it is much higher on average among customers who own a PC. The two graphs are as follows:

Q1b). The calculations were performed in Excel and are as follows:

PC-Knowledge with PC
Mean 3.57
Lower Limit Upper Limit
Confidence Interval 3.26 3.89
PC-Knowledge without PC
Mean 2.55
Lower Limit Upper Limit
Confidence Interval 2.31 2.80

Q1c). The confidence intervals based upon the equal variance test are as follows:

CONFIDENCE INTERVAL
                   Employees with Own PC   Employees with No PC
Mean               3.59                    2.59
Z value at 95%     1.96                    1.96
S.E                0.16                    0.12
Lower Limit        3.277                   2.354
Upper Limit        3.899                   2.820
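The limits above can be reproduced from the means, variances and observation counts reported in the t-test output of this solution. The following is a minimal sketch, assuming the intervals use the normal approximation (z = 1.96) with the standard error taken as the square root of variance over sample size; the function name `z_interval` is hypothetical.

```python
import math

def z_interval(mean, variance, n, z=1.96):
    """Return (lower, upper) limits of a normal-approximation interval."""
    se = math.sqrt(variance / n)  # standard error of the mean
    return mean - z * se, mean + z * se

# Inputs copied from the t-test output reported in this solution.
own_pc = z_interval(3.588235294, 0.855614973, 34)
no_pc = z_interval(2.586956522, 0.647826087, 46)

print("Own PC:", [round(x, 3) for x in own_pc])
print("No PC: ", [round(x, 3) for x in no_pc])
```

With these inputs the two intervals do not overlap, which is worth keeping in mind when reading the t-test result.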

The results of the equal variance test are as follows:

t-Test: Two-Sample Assuming Equal Variances
                               Employees with Own PC   Employees with No PC
Mean                           3.588235294             2.586956522
Variance                       0.855614973             0.647826087
Observations                   34                      46
Pooled Variance                0.73573677
Hypothesized Mean Difference   1.018
df                             78
t Stat                         -0.08619465
P(T<=t) one-tail               47%
t Critical one-tail            1.664624645
P(T<=t) two-tail               93%
t Critical two-tail            1.990847036

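As the two-tail p-value is 93%, which is higher than the 5% level of significance, the null hypothesis cannot be rejected. Note, however, that the test was run with a hypothesized mean difference of 1.018, so the null hypothesis being tested is that the two means differ by 1.018, not that they are the same. The result therefore shows only that the observed difference of about 1.00 is consistent with a difference of 1.018; it does not show that the two means are equal. The non-overlapping confidence intervals in Q1c (3.277 to 3.899 versus 2.354 to 2.820) point instead to a real difference between the two groups.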
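To see how the hypothesized mean difference drives the result, the pooled-variance t statistic can be recomputed from the summary figures in the table. This is a sketch using the textbook pooled two-sample formula, not the Excel routine itself; `pooled_t` is a hypothetical helper, and the critical value is the one reported in the table.

```python
import math

def pooled_t(mean1, var1, n1, mean2, var2, n2, hypothesized_diff=0.0):
    """Pooled-variance two-sample t statistic from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2 - hypothesized_diff) / se
    return pooled_var, df, t

# Summary statistics copied from the t-test table.
stats = (3.588235294, 0.855614973, 34, 2.586956522, 0.647826087, 46)

pooled_var, df, t_reported = pooled_t(*stats, hypothesized_diff=1.018)
_, _, t_equal_means = pooled_t(*stats)  # null hypothesis: equal means

T_CRIT_TWO_TAIL = 1.990847036  # two-tail critical value from the table
print(round(pooled_var, 6), df, round(t_reported, 4))
print(round(t_equal_means, 2), abs(t_equal_means) > T_CRIT_TWO_TAIL)
```

Setting the hypothesized difference to zero gives a t statistic well above the two-tail critical value, so the conclusion depends heavily on which null hypothesis is intended.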

Q1d). The required sample size would be 82.17 for customers with a PC and 62.22 for customers without a PC. Since a sample size must be a whole number, these round up to 83 and 63 respectively:

              With PC   Without PC
Sample Size   82.17     62.22
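The solution does not state the margin of error behind these sample sizes. As a sketch, the standard formula n = z² · variance / E² reproduces both figures when the variances from the t-test table are used with z = 1.96 and a margin of error of 0.2. That margin is an assumption inferred by back-solving from the reported sizes, not a value given in the source, and `required_n` is a hypothetical helper.

```python
import math

def required_n(variance, margin, z=1.96):
    """Sample size needed to estimate a mean to within +/- margin."""
    return (z ** 2) * variance / margin ** 2

MARGIN = 0.2  # ASSUMPTION: inferred from the reported sizes, not stated in the source

n_with_pc = required_n(0.855614973, MARGIN)
n_without_pc = required_n(0.647826087, MARGIN)

# Round up, since a fractional respondent cannot be sampled.
print(round(n_with_pc, 2), math.ceil(n_with_pc))
print(round(n_without_pc, 2), math.ceil(n_without_pc))
```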

Q1e). The confidence interval for the true proportion of the PC-savvy customers is:

One sample t-test
Count 82
Mean 2.988
Standard deviation 1.000
standard error 0.110
Hypothetical mean 4
alpha 0.05
tails 1
df 81
t stat -9.167
p value 0%
sig Yes
Lower Control Limit 2.77
Upper Control Limit 3.20

Question 2

Q2a). The mean and standard deviation are as follows:

MEAN & SD
                   Mean        Standard Deviation
Sony Pictures      63062074    73728582.89
Warner Bros.       73316434    81424660.81
20th Century Fox   74272230    78079986.82
Fox Searchlight    12410194    14759045.01
Universal          59017596    55201941.51

Q2b). The results of the One-Sample t-test are:

One Sample t-test
Count 103
Mean 58836846.7
Standard Deviation 69920396.24
Standard Error 6889461.356
Hypothetical Mean 50000000
alpha 0.05
tails 1
df 102
t stat 1.28266148
p value 10%

The mean total US gross for the five largest movie distributors does not significantly exceed $50 million, as the one-tail p-value of 10% is above the 5% level of significance.
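The standard error and t statistic in the table follow directly from the reported count, mean and standard deviation. The sketch below recomputes them with the usual one-sample formula; `one_sample_t` is a hypothetical helper, and the p-value is not recomputed because the Python standard library has no t-distribution function.

```python
import math

def one_sample_t(mean, sd, n, hypothesized_mean):
    """Return (standard error, t statistic) for a one-sample t-test."""
    se = sd / math.sqrt(n)
    return se, (mean - hypothesized_mean) / se

# Figures copied from the One Sample t-test table.
se, t_stat = one_sample_t(58836846.7, 69920396.24, 103, 50_000_000)

print(round(se, 3), round(t_stat, 5))
```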

Q2c). The results of the One-Way ANOVA are as follows:

ANOVA
Source of Variation   SS          df    MS          F       P-value   F crit
Between Groups        4.738E+16   4     1.185E+16   2.572   0.042     2.465
Within Groups         4.513E+17   98    4.605E+15
Total                 4.987E+17   102

As the p-value (0.042) is less than 5% and the F value (2.572) is higher than the F crit value (2.465), it can be concluded that the mean total US gross differs significantly across the five distributors.
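The F statistic is the ratio of the between-group mean square to the within-group mean square, each obtained by dividing a sum of squares by its degrees of freedom. The sketch below recomputes it from the rounded figures in the ANOVA table, so small rounding differences are expected; `anova_f` is a hypothetical helper.

```python
def anova_f(ss_between, df_between, ss_within, df_within):
    """Return (MS between, MS within, F) from ANOVA sums of squares."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within

# Sums of squares and degrees of freedom copied from the ANOVA table.
ms_between, ms_within, f_stat = anova_f(4.738e16, 4, 4.513e17, 98)

F_CRIT = 2.465  # critical value reported in the table
significant = f_stat > F_CRIT
print(round(f_stat, 3), significant)
```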

Q2d). The confidence intervals for each distributor are shown below. Note that they are computed with the standard z value of 1.96 rather than a Tukey critical value, so they are not adjusted for multiple comparisons:

TUKEY CORRECTION: Total US Gross by Distributor
                      Sony           Warner Bros.   20th Century   Fox Searchlight   Universal
Mean                  63062074.04    73316434.36    74272230       12410193.88       59017596.18
Count                 23             22             24             17                17
Standard Deviation    73728582.89    81424660.81    78079986.82    14759045.01       55201941.51
S.E                   15373472.26    17359796.01    15938010.57    3579594.207       13388437.39
Z value at 95%        1.96           1.96           1.96           1.96              1.96
Lower Control Limit   32930068.41    39291234.18    43033729.28    5394189.237       32776258.9
Upper Control Limit   93194079.68    107341634.5    105510730.7    19426198.53       85258933.46

The overall confidence interval across all the distributors is:

Overall Total US GROSS
Mean 58836846.7
Count 103
Standard Deviation 69920396.24
S.E 6889461.356
Z value at 95% 1.96
Lower Control Limit 45333502.44
Upper Control Limit 72340190.96

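The confidence intervals for the individual distributors are wider than the overall interval because each is based on far fewer observations; width alone says nothing about whether the means differ. What matters is overlap: the interval for Fox Searchlight (about 5.4 million to 19.4 million) lies entirely below the intervals of the other four distributors, whose intervals all overlap one another. This suggests that Fox Searchlight's mean is significantly different from the rest, while the means of the other four are not clearly distinguishable from each other.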
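One informal way to compare the distributors is to rebuild each interval from its mean and standard error and list the pairs whose intervals do not overlap. The sketch below does this with the unadjusted z value of 1.96 used in the table; it is a rough screening device rather than a formal pairwise test, and the helper names are hypothetical.

```python
from itertools import combinations

Z = 1.96  # unadjusted z value used in the table

# (mean, standard error) per distributor, copied from the table.
groups = {
    "Sony": (63062074.04, 15373472.26),
    "Warner Bros.": (73316434.36, 17359796.01),
    "20th Century Fox": (74272230.0, 15938010.57),
    "Fox Searchlight": (12410193.88, 3579594.207),
    "Universal": (59017596.18, 13388437.39),
}

def interval(mean, se, z=Z):
    """Return (lower, upper) limits for mean +/- z * se."""
    return mean - z * se, mean + z * se

def overlaps(a, b):
    """True when two (lower, upper) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

intervals = {name: interval(*vals) for name, vals in groups.items()}
non_overlapping = [
    (a, b)
    for a, b in combinations(intervals, 2)
    if not overlaps(intervals[a], intervals[b])
]
for pair in non_overlapping:
    print(pair)
```

With the figures above, every non-overlapping pair involves Fox Searchlight.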


Question 3

Q3a). The results of the regression model are as follows:

Regression Statistics
Multiple R 0.208203733
R Square 0.043348794
Adjusted R Square 0.034391386
Standard Error 8.354657572
Observations 540

 

ANOVA
             df    SS            MS            F             Significance F
Regression   5     1688.970113   337.7940226   4.839434894   0.000244522
Residual     534   37273.36188   69.80030314
Total        539   38962.33199

            Coefficients   Standard Error   t Stat         P-value       Lower 95%      Upper 95%
Intercept   -2.642159211   3.346530759      -0.789521867   0.430157473   -9.216138678   3.931820255
GRI         -2.110460859   0.738857893      -2.85638264    0.004451872   -3.561885324   -0.659036393
SAT         0.005734797    0.002659567      2.156289466    0.031506883   0.0005103      0.010959295
MBA         -0.180646966   0.756643724      -0.238747723   0.811392803   -1.667010207   1.305716274
AGE         -0.06889255    0.041817798      -1.647445675   0.100054737   -0.151040112   0.013255012
TEN         -0.11872167    0.083502131      -1.421780125   0.155673863   -0.282754614   0.045311274
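The headline regression statistics can be cross-checked against the ANOVA table, since R Square, Adjusted R Square, the F statistic and the standard error all derive from the regression and residual sums of squares. The sketch below applies the standard formulas, with 540 observations and 5 predictors as reported.

```python
import math

# Sums of squares copied from the regression ANOVA table.
ss_regression = 1688.970113
ss_residual = 37273.36188
n_obs, n_predictors = 540, 5

ss_total = ss_regression + ss_residual
df_residual = n_obs - n_predictors - 1

r_squared = ss_regression / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_residual
ms_residual = ss_residual / df_residual
f_stat = (ss_regression / n_predictors) / ms_residual
std_error = math.sqrt(ms_residual)  # standard error of the regression

print(round(r_squared, 6), round(adj_r_squared, 6))
print(round(f_stat, 4), round(std_error, 4))
```

An R Square of about 0.043 means the five predictors together explain roughly 4% of the variation in the dependent variable, even though the overall F test is significant.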

 ................................

This is just a sample partial case solution.