Statistical Inference & Linear Regression Harvard Case Solution & Analysis

Question 1

Q1a). Two line graphs were generated in the Excel sheet, and they show noticeable differences in PC knowledge between customers with and without a PC. The variation in PC knowledge is higher for customers without a PC. Also, PC knowledge is much higher on average for customers who own a PC. The two graphs are as follows:

Q1b). The calculations, performed in Excel, are as follows:

                          Mean   Lower Limit   Upper Limit
PC-Knowledge with PC      3.57   3.26          3.89
PC-Knowledge without PC   2.55   2.31          2.80

Q1c). The confidence intervals based on the equal-variance test are as follows:

CONFIDENCE INTERVAL
                  Employees with Own PC   Employees with No PC
Mean              3.59                    2.59
Z value at 95%    1.96                    1.96
S.E               0.16                    0.12
Lower Limit       3.277                   2.354
Upper Limit       3.899                   2.820
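The limits in this table can be reproduced from the group means, variances and observation counts reported in the Excel t-test output. The following is a minimal Python sketch of that arithmetic, not the original Excel workbook; the function name `z_interval` is illustrative.

```python
import math

def z_interval(mean, variance, n, z=1.96):
    """Return (lower, upper) limits of a z-based confidence interval for a mean."""
    se = math.sqrt(variance / n)  # standard error of the mean
    return mean - z * se, mean + z * se

# Means, variances and counts as reported in the Excel output for the two groups.
own_pc = z_interval(3.588235294, 0.855614973, 34)
no_pc = z_interval(2.586956522, 0.647826087, 46)

print("Own PC:", [round(x, 3) for x in own_pc])
print("No PC: ", [round(x, 3) for x in no_pc])
```

The two intervals do not overlap, which is consistent with a higher average PC knowledge among customers who own a PC.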

The results of the equal variance test are as follows:

t-Test: Two-Sample Assuming Equal Variances
                               Employees with Own PC   Employees with No PC
Mean                           3.588235294             2.586956522
Variance                       0.855614973             0.647826087
Observations                   34                      46
Pooled Variance                0.73573677
Hypothesized Mean Difference   1.018
df                             78
t Stat                         -0.08619465
P(T<=t) one-tail               47%
t Critical one-tail            1.664624645
P(T<=t) two-tail               93%
t Critical two-tail            1.990847036

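The test above was run with a hypothesized mean difference of 1.018, so the p-value of 93% only shows that the observed difference between the two means (about 1.00) is consistent with a difference of 1.018; it does not show that the two means are the same. Against a hypothesized difference of zero, the same figures give a t statistic of about 5.16, well above the two-tail critical value of 1.99, so the null hypothesis that the two means are equal would be rejected.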
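To make the role of the hypothesized mean difference concrete, the sketch below recomputes the pooled-variance t statistic from the summary figures in the table, once with the hypothesized difference of 1.018 reported there and once with a difference of 0. It is an illustrative Python reconstruction of the standard equal-variance formula, not the Excel routine itself; `pooled_t` is a hypothetical helper name.

```python
import math

def pooled_t(mean1, var1, n1, mean2, var2, n2, hypothesized_diff=0.0):
    """Two-sample t statistic assuming equal variances (pooled variance)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))  # SE of the difference in means
    t = ((mean1 - mean2) - hypothesized_diff) / se
    return t, df, pooled_var

# Summary statistics from the Excel output (own PC first, no PC second).
args = (3.588235294, 0.855614973, 34, 2.586956522, 0.647826087, 46)

t_reported, df, pooled_var = pooled_t(*args, hypothesized_diff=1.018)
t_zero, _, _ = pooled_t(*args, hypothesized_diff=0.0)

print("df:", df, "pooled variance:", round(pooled_var, 6))
print("t with hypothesized difference 1.018:", round(t_reported, 4))
print("t with hypothesized difference 0:", round(t_zero, 2))
```

Comparing each statistic with the two-tail critical value of 1.990847036 from the table shows how strongly the conclusion depends on the hypothesized difference entered in Excel.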

Q1d). The sample size needed would be 82.17 for PC knowledge of customers with a PC and 62.22 for customers without a PC; rounded up to whole respondents, that is 83 and 63 respectively:

              With PC   Without PC
Sample Size   82.17     62.22
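The text does not state the margin of error behind these sample sizes. As a hedged illustration, the Python sketch below applies the usual formula n = (z * s / E)^2 with the group variances from the t-test output and an assumed margin of error E = 0.2, a value chosen here only because it reproduces the reported figures. The names `required_sample_size` and `MARGIN` are hypothetical.

```python
import math

def required_sample_size(variance, margin, z=1.96):
    """Return (exact n, n rounded up) for estimating a mean within +/- margin."""
    n_exact = (z ** 2) * variance / margin ** 2
    return n_exact, math.ceil(n_exact)

# Assumption: the margin of error is not given in the text; 0.2 reproduces its numbers.
MARGIN = 0.2

with_pc = required_sample_size(0.855614973, MARGIN)
without_pc = required_sample_size(0.647826087, MARGIN)

print("With PC:   ", round(with_pc[0], 2), "->", with_pc[1])
print("Without PC:", round(without_pc[0], 2), "->", without_pc[1])
```

Because a sample cannot contain a fraction of a respondent, the exact value is rounded up rather than to the nearest integer.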

Q1e). The confidence interval for the true proportion of the PC-savvy customers is:

One sample t-test
Count                 82
Mean                  2.988
Standard deviation    1.000
Standard error        0.110
Hypothetical mean     4
alpha                 0.05
tails                 1
df                    81
t stat                -9.167
p value               0%
sig                   Yes
Lower Control Limit   2.77
Upper Control Limit   3.20

Question 2

Q2a). The means and standard deviations are as follows:

MEAN & SD
                   Mean       Standard Deviation
Sony Pictures      63062074   73728582.89
Warner Bros.       73316434   81424660.81
20th Century Fox   74272230   78079986.82
Fox Searchlight    12410194   14759045.01
Universal          59017596   55201941.51

Q2b). The results of the One-Sample t-test are:

One Sample t-test
Count                103
Mean                 58836846.7
Standard Deviation   69920396.24
Standard Error       6889461.356
Hypothetical Mean    50000000
alpha                0.05
tails                1
df                   102
t stat               1.28266148
p value              10%

The mean total US gross for the five largest movie distributors does not significantly exceed $50 million: the p-value of 10% is above the 5% level of significance.
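The t statistic in the table follows directly from the sample mean, standard deviation and count. The short Python sketch below reproduces it from those reported figures; it illustrates the one-sample formula and is not the original spreadsheet. `one_sample_t` is an illustrative name.

```python
import math

def one_sample_t(mean, sd, n, hypothesized_mean):
    """Return (t statistic, standard error, degrees of freedom) for a one-sample t-test."""
    se = sd / math.sqrt(n)  # standard error of the sample mean
    t = (mean - hypothesized_mean) / se
    return t, se, n - 1

# Figures reported for total US gross across all 103 films.
t_stat, se, df = one_sample_t(58836846.7, 69920396.24, 103, 50_000_000)

print("standard error:", round(se, 3))
print("t stat:", round(t_stat, 4), "with df =", df)
```

A t statistic of about 1.28 falls short of the one-tail 5% critical value, which is roughly 1.66 at this many degrees of freedom, so it agrees with the 10% p-value in the table.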

Q2c). The results of the One-Way ANOVA are as follows:

ANOVA
Source of Variation   SS          df    MS          F       P-value   F crit
Between Groups        4.738E+16   4     1.185E+16   2.572   0.042     2.465
Within Groups         4.513E+17   98    4.605E+15
Total                 4.987E+17   102

As the p-value (0.042) is below 5% and the F value (2.572) exceeds the critical value (2.465), it can be concluded that the mean total US gross differs significantly among the five distributors.
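The F value can be checked from the sums of squares and degrees of freedom in the ANOVA table. The following Python sketch does that division; the inputs are the rounded values printed in the table, so the result only matches to about three decimals. `anova_f` is a hypothetical helper name.

```python
def anova_f(ss_between, df_between, ss_within, df_within):
    """Return (F, MS between, MS within) from one-way ANOVA sums of squares."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, ms_between, ms_within

# Rounded sums of squares and degrees of freedom from the ANOVA table.
f_stat, ms_b, ms_w = anova_f(4.738e16, 4, 4.513e17, 98)

F_CRIT = 2.465  # critical value as reported in the table

print("MS between:", f"{ms_b:.3e}", "MS within:", f"{ms_w:.3e}")
print("F:", round(f_stat, 3), "exceeds F crit:", f_stat > F_CRIT)
```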

Q2d). The confidence intervals based on Tukey correction are:

TUKEY CORRECTION (total US gross by distributor)
                      Sony          Warner Bros.   20th Century   Fox Searchlight   Universal
Mean                  63062074.04   73316434.36    74272230       12410193.88       59017596.18
Count                 23            22             24             17                17
Standard Deviation    73728582.89   81424660.81    78079986.82    14759045.01       55201941.51
S.E                   15373472.26   17359796.01    15938010.57    3579594.207       13388437.39
Z value at 95%        1.96          1.96           1.96           1.96              1.96
Lower Control Limit   32930068.41   39291234.18    43033729.28    5394189.237       32776258.9
Upper Control Limit   93194079.68   107341634.5    105510730.7    19426198.53       85258933.46

The overall confidence intervals for all the distributors are:

Overall Total US GROSS
Mean                  58836846.7
Count                 103
Standard Deviation    69920396.24
S.E                   6889461.356
Z value at 95%        1.96
Lower Control Limit   45333502.44
Upper Control Limit   72340190.96

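The confidence interval for Fox Searchlight (about $5.4 million to $19.4 million) lies entirely below the intervals of the other four distributors, whose intervals all overlap one another. The intervals therefore suggest that Fox Searchlight's mean total US gross differs from the others, but they do not show that all distributors have different means. The individual intervals are wider than the overall interval simply because each is based on a smaller sample.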
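One way to read the per-distributor intervals is to check which pairs overlap. The Python sketch below rebuilds each interval from the means, standard deviations and counts in the table, using the z value of 1.96 shown there (not a Tukey-adjusted critical value), and lists the pairs whose intervals do not overlap. The helper names are illustrative, and an overlap check is only a rough screen, not a formal pairwise test.

```python
import math
from itertools import combinations

# (mean, standard deviation, count) per distributor, as reported in the table.
groups = {
    "Sony Pictures": (63062074.04, 73728582.89, 23),
    "Warner Bros.": (73316434.36, 81424660.81, 22),
    "20th Century Fox": (74272230.0, 78079986.82, 24),
    "Fox Searchlight": (12410193.88, 14759045.01, 17),
    "Universal": (59017596.18, 55201941.51, 17),
}

def z_interval(mean, sd, n, z=1.96):
    """Return (lower, upper) limits of a z-based interval for a group mean."""
    se = sd / math.sqrt(n)
    return mean - z * se, mean + z * se

def overlaps(a, b):
    """True when intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

intervals = {name: z_interval(*stats) for name, stats in groups.items()}

non_overlapping = [
    (a, b)
    for a, b in combinations(intervals, 2)
    if not overlaps(intervals[a], intervals[b])
]

for pair in non_overlapping:
    print("No overlap:", pair)
```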


Question 3

Q3a). The results of the regression model are as follows:

Regression Statistics
Multiple R          0.208203733
R Square            0.043348794
Adjusted R Square   0.034391386
Standard Error      8.354657572
Observations        540

ANOVA
             df    SS            MS            F             Significance F
Regression   5     1688.970113   337.7940226   4.839434894   0.000244522
Residual     534   37273.36188   69.80030314
Total        539   38962.33199

            Coefficients   Standard Error   t Stat         P-value       Lower 95%      Upper 95%
Intercept   -2.642159211   3.346530759      -0.789521867   0.430157473   -9.216138678   3.931820255
GRI         -2.110460859   0.738857893      -2.85638264    0.004451872   -3.561885324   -0.659036393
SAT         0.005734797    0.002659567      2.156289466    0.031506883   0.0005103      0.010959295
MBA         -0.180646966   0.756643724      -0.238747723   0.811392803   -1.667010207   1.305716274
AGE         -0.06889255    0.041817798      -1.647445675   0.100054737   -0.151040112   0.013255012
TEN         -0.11872167    0.083502131      -1.421780125   0.155673863   -0.282754614   0.045311274
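The summary statistics in the regression output are linked by simple identities, which the Python sketch below checks using only the figures printed in the tables: R Square and Adjusted R Square from the sums of squares, F from the mean squares, and each t Stat as coefficient divided by standard error. It is an illustrative consistency check, not a refit of the model, and the variable names are hypothetical.

```python
import math

# Sums of squares and degrees of freedom from the regression ANOVA table.
ss_reg, df_reg = 1688.970113, 5
ss_res, df_res = 37273.36188, 534

r_squared = ss_reg / (ss_reg + ss_res)
adj_r_squared = 1 - (1 - r_squared) * (df_reg + df_res) / df_res
f_stat = (ss_reg / df_reg) / (ss_res / df_res)
std_error = math.sqrt(ss_res / df_res)  # standard error of the regression

# (coefficient, standard error, p-value) for each term, as reported.
terms = {
    "Intercept": (-2.642159211, 3.346530759, 0.430157473),
    "GRI": (-2.110460859, 0.738857893, 0.004451872),
    "SAT": (0.005734797, 0.002659567, 0.031506883),
    "MBA": (-0.180646966, 0.756643724, 0.811392803),
    "AGE": (-0.06889255, 0.041817798, 0.100054737),
    "TEN": (-0.11872167, 0.083502131, 0.155673863),
}

t_stats = {name: coef / se for name, (coef, se, _) in terms.items()}
significant = [name for name, (_, _, p) in terms.items() if p < 0.05]

print("R Square:", round(r_squared, 6), "Adjusted:", round(adj_r_squared, 6))
print("F:", round(f_stat, 4), "Standard Error:", round(std_error, 4))
print("Terms with p < 0.05:", significant)
```

On the reported p-values, only GRI and SAT are significant at the 5% level, and the R Square of about 0.043 indicates that the five predictors together explain only a small share of the variation in the dependent variable.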

 ................................

