Statistical Inference & Linear Regression Case Solution And Analysis, HBR Case Study Solution & Analysis of Harvard Case Studies

Question 1

Q1a). Two line graphs were generated in the Excel sheet, and they show noticeable differences in PC knowledge between customers with and without a PC. PC knowledge varies more among customers without a PC, and it is much higher on average among customers who own one. The two graphs are as follows:

Q1b). The calculations were performed in Excel and are as follows:

PC-Knowledge with PC
Mean	3.57
	Lower Limit	Upper Limit
Confidence Interval	3.26	3.89

PC-Knowledge without PC
Mean	2.55
	Lower Limit	Upper Limit
Confidence Interval	2.31	2.80

Q1c). The confidence intervals based on the equal-variance test are as follows:

CONFIDENCE INTERVAL
	Employees with Own PC	Employees with No PC
Mean	3.59	2.59
Z value at 95%	1.96	1.96
S.E	0.16	0.12
Lower Limit	3.277	2.354
Upper Limit	3.899	2.820
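
The limits in this table follow from mean ± z × standard error, with the standard error taken as the square root of variance divided by sample size. The sketch below is a minimal check, not the author's spreadsheet: it uses the means, variances and observation counts reported in the t-test output of this answer, and the helper name `z_interval` is a hypothetical one introduced for illustration.

```python
import math


def z_interval(mean, variance, n, z=1.96):
    """Return (lower, upper) for mean +/- z * sqrt(variance / n)."""
    standard_error = math.sqrt(variance / n)
    return mean - z * standard_error, mean + z * standard_error


# Summary statistics as reported in the t-test output of this answer.
own_pc = z_interval(mean=3.588235294, variance=0.855614973, n=34)
no_pc = z_interval(mean=2.586956522, variance=0.647826087, n=46)

print(f"Own PC: {own_pc[0]:.3f} to {own_pc[1]:.3f}")
print(f"No PC:  {no_pc[0]:.3f} to {no_pc[1]:.3f}")
```

With 34 and 46 observations, a t critical value would give slightly wider limits than the z value of 1.96 used in the table.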

The results of the equal variance test are as follows:

t-Test: Two-Sample Assuming Equal Variances
	*Employees with Own PC*	*Employees with No PC*
Mean	3.588235294	2.586956522
Variance	0.855614973	0.647826087
Observations	34	46
Pooled Variance	0.73573677
Hypothesized Mean Difference	1.018
df	78
t Stat	-0.08619465
P(T<=t) one-tail	47%
t Critical one-tail	1.664624645
P(T<=t) two-tail	93%
t Critical two-tail	1.990847036

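The test above was run with a hypothesized mean difference of 1.018, which is almost exactly the observed difference between the two sample means (3.588 - 2.587 = 1.001). The two-tail p-value of 93% is therefore far above the 5% level of significance, and the null hypothesis that the difference equals 1.018 cannot be rejected. This result does not show that the two means are the same: with a hypothesized difference of zero, the same figures give a t statistic of about 5.16, well above the two-tail critical value of 1.99, which indicates that the difference in mean PC knowledge between the two groups is significant.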
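
The t statistic in the table above can be reproduced from the reported summary statistics. The sketch below is a minimal illustration using only the Python standard library; `pooled_t` is a hypothetical helper name, not part of the original Excel workbook. It evaluates the statistic twice: once with the hypothesized difference of 1.018 used in the table, and once with a hypothesized difference of zero, which is the usual setting when testing whether two means are equal.

```python
import math


def pooled_t(mean1, var1, n1, mean2, var2, n2, hypothesized_diff=0.0):
    """Two-sample t statistic assuming equal variances.

    Returns (t statistic, degrees of freedom, pooled variance).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean1 - mean2 - hypothesized_diff) / standard_error
    return t_stat, df, pooled_var


# Summary statistics copied from the t-test table.
stats = dict(mean1=3.588235294, var1=0.855614973, n1=34,
             mean2=2.586956522, var2=0.647826087, n2=46)

t_reported, df, pooled_var = pooled_t(**stats, hypothesized_diff=1.018)
t_zero, _, _ = pooled_t(**stats, hypothesized_diff=0.0)

print(f"pooled variance = {pooled_var:.6f}, df = {df}")
print(f"t (H0: difference = 1.018) = {t_reported:.4f}")
print(f"t (H0: difference = 0)     = {t_zero:.4f}")
```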

Q1d). The required sample size is 82.17 for customers with a PC and 62.22 for customers without a PC (in practice rounded up to 83 and 63 respondents):

Sample Size
With PC	82.17
Without PC	62.22

Q1e). The confidence interval for PC-savvy customers, computed here from a one-sample t-test on the mean PC-knowledge score, is:

One sample t-test
Count	82
Mean	2.988
Standard deviation	1.000
standard error	0.110

Hypothetical mean	4
alpha	0.05
tails	1
df	81
t stat	-9.167
p value	0%
sig	Yes

Lower Limit	2.77
Upper Limit	3.20

Question 2

Q2a). The means and standard deviations are as follows:

MEAN & SD

	Mean	Standard Deviation
Sony Pictures	63062074	73728582.89
Warner Bros.	73316434	81424660.81
20th Century Fox	74272230	78079986.82
Fox Searchlight	12410194	14759045.01
Universal	59017596	55201941.51

Q2b). The results of the one-sample t-test are:

One Sample t-test

Count	103
Mean	58836846.7
Standard Deviation	69920396.24
Standard Error	6889461.356

Hypothetical Mean	50000000
alpha	0.05
tails	1
df	102
t stat	1.28266148
p value	10%

The mean total US gross for the five largest movie distributors does not significantly exceed $50 million: the one-tail p-value is 10%, which is above the 5% level of significance.
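
The standard error and t statistic in the table above follow directly from the count, mean and standard deviation. The sketch below is a minimal check using the reported figures; `one_sample_t` is a hypothetical helper name, and the p-value is left to the table because the Python standard library has no t distribution.

```python
import math


def one_sample_t(mean, sd, n, hypothesized_mean):
    """Return (t statistic, standard error) for a one-sample t-test."""
    standard_error = sd / math.sqrt(n)
    t_stat = (mean - hypothesized_mean) / standard_error
    return t_stat, standard_error


# Figures copied from the one-sample t-test table for total US gross.
t_stat, standard_error = one_sample_t(
    mean=58836846.7, sd=69920396.24, n=103, hypothesized_mean=50_000_000
)

print(f"standard error = {standard_error:,.0f}")
print(f"t stat = {t_stat:.4f}")
```

The statistic comes out below the one-tail critical value for 102 degrees of freedom (roughly 1.66), which is consistent with the 10% p-value reported above.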

Q2c). The results of the one-way ANOVA are as follows:

ANOVA
Source of Variation	SS	df	MS	F	P-value	F crit
Between Groups	4.738E+16	4	1.185E+16	2.572	0.042	2.465
Within Groups	4.513E+17	98	4.605E+15

Total	4.987E+17	102

As the p-value (0.042) is less than 5% and the F value (2.572) is higher than the F crit value (2.465), it can be concluded that the mean total US gross differs significantly for at least one of the five distributors.
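
The F value in the ANOVA table is the ratio of the between-groups mean square to the within-groups mean square, each obtained by dividing a sum of squares by its degrees of freedom. The sketch below recomputes it from the rounded sums of squares printed in the table, so small differences in the last digit are expected; `anova_f` is a hypothetical helper name.

```python
def anova_f(ss_between, df_between, ss_within, df_within):
    """Return (MS between, MS within, F) for a one-way ANOVA table."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Rounded values copied from the ANOVA table (5 distributors, 103 films).
ms_between, ms_within, f_value = anova_f(
    ss_between=4.738e16, df_between=4, ss_within=4.513e17, df_within=98
)
f_crit = 2.465  # F crit as reported in the table

print(f"MS between = {ms_between:.3e}")
print(f"MS within  = {ms_within:.3e}")
print(f"F = {f_value:.3f}, exceeds F crit: {f_value > f_crit}")
```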

Q2d). The confidence intervals based on Tukey correction are:

TUKEY CORRECTION
	Total US Gross for Sony Pictures	Total US Gross for Warner Bros.	Total US Gross for 20th Century Fox	Total US Gross for Fox Searchlight	Total US Gross for Universal
Mean	63062074.04	73316434.36	74272230	12410193.88	59017596.18
Count	23	22	24	17	17
Standard Deviation	73728582.89	81424660.81	78079986.82	14759045.01	55201941.51
S.E	15373472.26	17359796.01	15938010.57	3579594.207	13388437.39
Z value at 95%	1.96	1.96	1.96	1.96	1.96
Lower Limit	32930068.41	39291234.18	43033729.28	5394189.237	32776258.9
Upper Limit	93194079.68	107341634.5	105510730.7	19426198.53	85258933.46

The overall confidence interval across all the distributors is:

Overall Total US Gross
Mean	58836846.7
Count	103
Standard Deviation	69920396.24
S.E	6889461.356
Z value at 95%	1.96
Lower Limit	45333502.44
Upper Limit	72340190.96

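Fox Searchlight's interval (about $5.4 million to $19.4 million) lies entirely below the intervals of the other four distributors, whose intervals all overlap one another and the overall interval. The significant ANOVA result therefore appears to be driven mainly by Fox Searchlight's lower mean gross rather than by differences among all five distributors. Note that the intervals above use the z value of 1.96 and so are not adjusted for multiple comparisons as a Tukey procedure would be.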
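
One informal way to read the per-distributor intervals is to check which pairs fail to overlap. The sketch below rebuilds each interval from the reported mean, standard deviation and count using the same z value of 1.96 as the table, then lists the non-overlapping pairs. This is only an illustrative overlap check under those assumptions, not the Tukey HSD procedure, and the helper names `z_interval` and `overlaps` are hypothetical.

```python
import math
from itertools import combinations

# (mean, standard deviation, count) copied from the per-distributor table.
distributors = {
    "Sony Pictures": (63062074.04, 73728582.89, 23),
    "Warner Bros.": (73316434.36, 81424660.81, 22),
    "20th Century Fox": (74272230.0, 78079986.82, 24),
    "Fox Searchlight": (12410193.88, 14759045.01, 17),
    "Universal": (59017596.18, 55201941.51, 17),
}


def z_interval(mean, sd, n, z=1.96):
    """Return (lower, upper) for mean +/- z * sd / sqrt(n)."""
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


def overlaps(a, b):
    """True when two (lower, upper) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


intervals = {name: z_interval(*values) for name, values in distributors.items()}

non_overlapping = [
    (first, second)
    for first, second in combinations(intervals, 2)
    if not overlaps(intervals[first], intervals[second])
]

for first, second in non_overlapping:
    print(f"{first} and {second} do not overlap")
```

A non-overlap between two unadjusted intervals suggests a difference but is not a formal pairwise test; a Tukey comparison would use the studentized range distribution instead of 1.96.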

Question 3

Q3a). The results of the regression model are as follows:

*Regression Statistics*
Multiple R	0.208203733
R Square	0.043348794
Adjusted R Square	0.034391386
Standard Error	8.354657572
Observations	540

ANOVA
	df	SS	MS	F	*Significance F*
Regression	5	1688.970113	337.7940226	4.839434894	0.000244522
Residual	534	37273.36188	69.80030314
Total	539	38962.33199

	*Coefficients*	*Standard Error*	*t Stat*	*P-value*	*Lower 95%*	*Upper 95%*	*Lower 95.0%*	*Upper 95.0%*
Intercept	-2.642159211	3.346530759	-0.789521867	0.430157473	-9.216138678	3.931820255	-9.216138678	3.931820255
GRI	-2.110460859	0.738857893	-2.85638264	0.004451872	-3.561885324	-0.659036393	-3.561885324	-0.659036393
SAT	0.005734797	0.002659567	2.156289466	0.031506883	0.0005103	0.010959295	0.0005103	0.010959295
MBA	-0.180646966	0.756643724	-0.238747723	0.811392803	-1.667010207	1.305716274	-1.667010207	1.305716274
AGE	-0.06889255	0.041817798	-1.647445675	0.100054737	-0.151040112	0.013255012	-0.151040112	0.013255012
TEN	-0.11872167	0.083502131	-1.421780125	0.155673863	-0.282754614	0.045311274	-0.282754614	0.045311274
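
The summary figures in the regression output are tied together by a few identities: R Square is the regression sum of squares over the total, Adjusted R Square corrects for the number of predictors, F is the ratio of the two mean squares, and each t Stat is a coefficient divided by its standard error. The sketch below is a minimal consistency check using values copied from the tables above; the variable names are illustrative, and the 5% cutoff is an assumed significance level, not one stated for this question.

```python
import math

n_obs, n_predictors = 540, 5
ss_regression, ss_residual = 1688.970113, 37273.36188

df_residual = n_obs - n_predictors - 1
r_squared = ss_regression / (ss_regression + ss_residual)
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_residual
f_stat = (ss_regression / n_predictors) / (ss_residual / df_residual)
residual_se = math.sqrt(ss_residual / df_residual)

# name: (coefficient, standard error, reported p-value)
coefficients = {
    "Intercept": (-2.642159211, 3.346530759, 0.430157473),
    "GRI": (-2.110460859, 0.738857893, 0.004451872),
    "SAT": (0.005734797, 0.002659567, 0.031506883),
    "MBA": (-0.180646966, 0.756643724, 0.811392803),
    "AGE": (-0.06889255, 0.041817798, 0.100054737),
    "TEN": (-0.11872167, 0.083502131, 0.155673863),
}

t_stats = {name: coef / se for name, (coef, se, _) in coefficients.items()}
significant = [name for name, (_, _, p) in coefficients.items() if p < 0.05]

print(f"R Square = {r_squared:.6f}, Adjusted R Square = {adj_r_squared:.6f}")
print(f"F = {f_stat:.4f}, Standard Error = {residual_se:.4f}")
print("Significant at the assumed 5% level:", significant)
```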