Statistical Analysis in Business and Research: Key Concepts
1. Sampling Methods and Statistical Notations
a) The University of North Florida (UNF) has 16,719 students, of which 60% are female. A statistician wants to estimate the mean number of credit hours per semester for students at UNF. He plans to randomly select 60 female and 40 male students. What sampling plan has he chosen? Stratified
b) A statistician wants to estimate the mean number of credit hours per semester for students at UNF. He plans to randomly select 5 classrooms and interview all the students there. What sampling plan has he chosen? Cluster
c) A statistician wants to estimate the mean number of credit hours per semester for students at UNF. He randomly selected 150 numbers from 1 to 16,719 and then he selected those students from a student list provided by the admissions department. What sampling plan has he chosen? Simple random
d) The mean of X̄ is μ and the standard deviation of X̄ is σ/√n
e) If the confidence level decreases, then the confidence interval becomes wider. False
f) If the sample size decreases, then the confidence interval becomes shorter. False
g) Notation for confidence level is 1 – α
h) 90% C.I. for the mean is (137, 143). What is the margin of error of this estimate? (143 – 137)/2 = 3
i) 90% C.I. for the mean is (137, 143). What is the point estimate? (137 + 143)/2 = 140
j) 90% C.I. for the mean is (137, 143). What is the confidence level of this estimate? 90%
k) If the confidence level decreases, then the confidence interval becomes wider. False
l) If the sample size decreases, then the confidence interval becomes shorter. False
m) Notation for significance level is α
n) If we reject Ho at a 5% significance level, then we must also reject it at the 1% level. False
o) If we do not reject Ho at a 5% significance level, then we must also not reject it at the 1% level. True
p) If we reject Ho at a 5% significance level, then we must also reject it at the 10% level. True
q) A Type II Error cannot be committed when: Ho is true
r) A Type I Error cannot be committed when: Ho is false
s) A Type II Error can be committed when: Ho is false
t) A Type I Error can be committed when: Ho is true
u) Probability of Type I Error is called Significance level
v) Power of the test is 1 – P(Type II error)
w) Probability of Type II error is denoted by β
x) Probability of Type I error is denoted by α
y) Notation for the power of the test is 1 – β
z) In a hypothesis test, if we increase α, then β also increases. False
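Items e), f), h), and i) can be checked numerically. The following is a minimal sketch, assuming a z-interval for a mean with known σ. The helper name `margin_of_error` and the values σ = 10 and n = 50 are hypothetical and chosen only for illustration; they do not come from the exercise.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sigma, n, confidence):
    """Half-width of a z-interval for a mean: z_(alpha/2) * sigma / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * sigma / sqrt(n)


# Items h) and i): recover both quantities from the interval (137, 143).
lower, upper = 137, 143
point_estimate = (lower + upper) / 2
error = (upper - lower) / 2

# Items e) and f): sigma = 10 and n = 50 are hypothetical illustration values.
baseline = margin_of_error(sigma=10, n=50, confidence=0.95)
lower_confidence = margin_of_error(sigma=10, n=50, confidence=0.90)
smaller_sample = margin_of_error(sigma=10, n=25, confidence=0.95)

print(point_estimate, error)
print(lower_confidence < baseline)  # lower confidence gives a narrower interval
print(smaller_sample > baseline)    # smaller sample gives a wider interval
```

Both comparisons agree with the "False" answers above: lowering the confidence level narrows the interval, and shrinking the sample widens it.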
2. Distribution of Sample Means
The grades of all the exams of one professor have a mean µ = 75 and a standard deviation σ = 4.
a) What is the distribution of sample means of samples of 36 exams? What about its mean and its standard deviation? Shape is approximately normal by the Central Limit Theorem because n = 36 > 30, mean = 75, standard deviation = 4/√36 = 4/6 ≈ 0.667.
b) What is the distribution of sample means of samples of 25 exams? What about its mean and its standard deviation? Shape is unknown because n = 25 < 30 and the population distribution is unknown, mean = 75, standard deviation = 4/√25 = 4/5 = 0.8.
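A short simulation can make part a) concrete. This sketch assumes a uniform population with mean 75 and standard deviation 4. The exercise does not state the population shape, so the uniform choice is purely an illustrative assumption, picked because it is clearly non-normal.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA, N = 75, 4, 36

# Assumption: uniform population. A uniform distribution on (a, b) has
# standard deviation (b - a) / sqrt(12), so the half-width is SIGMA * sqrt(12) / 2.
half_width = SIGMA * sqrt(12) / 2
a, b = MU - half_width, MU + half_width

# Draw 2,000 samples of 36 exams each and record every sample mean.
sample_means = [
    mean(random.uniform(a, b) for _ in range(N))
    for _ in range(2000)
]

print(round(mean(sample_means), 2))   # should land near 75
print(round(stdev(sample_means), 2))  # should land near 4/6, about 0.67
```

The simulated means cluster around 75 with a spread close to σ/√n, which is what the answer to part a) predicts.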
3. Probability Calculations for Normally Distributed Data
The weekly output of a steel mill is normally distributed with a mean of 130 tons and a standard deviation of 20 tons.
a) What is the probability that the mill will produce less than 135 tons next week? P(X < 135) = P(Z < (135 – 130)/20) = P(Z < 0.25) = 0.5987.
b) What is the probability that the mean weekly output of 9 randomly selected weeks is less than 135 tons? P(X̄ < 135) = P(Z < (135 – 130)/(20/√9)) = P(Z < 0.75) = 0.7734.
c) What is the probability that the mean weekly output of 52 randomly selected weeks is less than 135 tons? P(X̄ < 135) = P(Z < (135 – 130)/(20/√52)) = P(Z < 1.80) = 0.9641.
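The three probabilities above can be reproduced with Python's standard library. This is a sketch rather than a definitive implementation; the helper name `prob_mean_below` is hypothetical. Because it uses unrounded z-values, the fourth decimal can differ slightly from the table look-ups.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA = 130, 20  # weekly output in tons


def prob_mean_below(x, n=1):
    """P(mean of n weeks < x); n = 1 gives a single week's output."""
    return NormalDist(MU, SIGMA / sqrt(n)).cdf(x)


p_one_week = prob_mean_below(135)       # part a)
p_nine_weeks = prob_mean_below(135, 9)  # part b)
p_52_weeks = prob_mean_below(135, 52)   # part c)

print(round(p_one_week, 4), round(p_nine_weeks, 4), round(p_52_weeks, 4))
```

The probability rises with n because the standard deviation of the sample mean, σ/√n, shrinks while the threshold stays at 135.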
4. Proportion Calculations
60% of UNF students are female. Suppose you decide to randomly select 15 UNF students.
a) What is the expected value of the proportion of females in this sample? E(p̂) = p = 0.6
b) What is the standard deviation of the proportion of females in this sample? √{(0.6)(1 – 0.6)/15} = 0.1265
c) Calculate the probability that the proportion in the sample is less than 35%. P(p̂ < 0.35) = P(Z < (0.35 – 0.6)/0.1265) = P(Z < -1.98) = 0.0239
d) Calculate the probability that the proportion in the sample is more than 38%. P(p̂ > 0.38) = P(Z > (0.38 – 0.6)/0.1265) = P(Z > -1.74) = 1 – 0.0409 = 0.9591
e) Calculate the probability that the proportion in the sample is between 38% and 41%. P(0.38 < p̂ < 0.41) = P((0.38 – 0.6)/0.1265 < Z < (0.41 – 0.6)/0.1265) = P(-1.74 < Z < -1.5) = 0.0668 – 0.0409 = 0.0259.
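As a cross-check, the same normal approximation for the sample proportion can be sketched in Python. The variable names are illustrative. Results use unrounded z-values, so they may differ from the table-based answers in the fourth decimal.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.6, 15
se = sqrt(p * (1 - p) / n)  # standard deviation of the sample proportion

# Normal approximation to the sampling distribution of the sample proportion.
p_hat = NormalDist(p, se)

prob_below_35 = p_hat.cdf(0.35)                    # part c)
prob_above_38 = 1 - p_hat.cdf(0.38)                # part d)
prob_38_to_41 = p_hat.cdf(0.41) - p_hat.cdf(0.38)  # part e)

print(round(se, 4))
print(round(prob_below_35, 4), round(prob_above_38, 4), round(prob_38_to_41, 4))
```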
5. Confidence Interval for Mean Temperature
The temperature readings for 40 randomly selected winter days in Grand Rapids, MI have a mean of 5.5°F. Assuming temperatures have a population variance of 2.1, determine the 95% confidence interval estimate for the mean winter temperature. Write your answer in a full sentence pertaining to this particular problem. 5.5 ± 1.96(√2.1/√40) = 5.5 ± 0.4491 = 5.0509 to 5.9491.
We are 95% confident that the average winter temperature in Grand Rapids is between 5.05°F and 5.95°F.
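The interval above follows the usual z-interval recipe for a mean with known variance. A minimal sketch, with the hypothetical helper name `z_interval`, is:

```python
from math import sqrt
from statistics import NormalDist


def z_interval(sample_mean, variance, n, confidence):
    """Confidence interval for a mean when the population variance is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(variance / n)
    return sample_mean - margin, sample_mean + margin


low, high = z_interval(sample_mean=5.5, variance=2.1, n=40, confidence=0.95)
print(round(low, 2), round(high, 2))
```

Note that the function takes the variance (2.1), not the standard deviation, which mirrors how the problem states its assumption.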
6. Sample Size for Estimating Average Expenditure
The manager of a newly opened Target store wants to estimate the average expenditure of his customers with a 99% confidence level. How many customers should he select to estimate the average with a margin of error of $10, knowing that an estimated standard deviation is $18? n = ((2.575 * 18)/10)² = 21.48, therefore the manager should select 22 customers.
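The sample-size formula n = (z·σ/E)², rounded up, can be sketched as follows. The function name `sample_size_for_mean` is hypothetical. The code uses the unrounded critical value (about 2.5758) instead of the table value 2.575, which changes the intermediate result slightly but not the final answer.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence):
    """Smallest n whose z-interval half-width is at most `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin) ** 2)


n_customers = sample_size_for_mean(sigma=18, margin=10, confidence=0.99)
print(n_customers)
```

Rounding up rather than to the nearest integer matters here: 21 customers would give a margin of error slightly above $10.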
7. Confidence Interval for Employee Ages
Ages of employees in a company are normally distributed. A sample of ages of 16 employees yielded a mean of 32 and a standard deviation of 3.
a) Can we use the Z-interval, T-interval, or neither to estimate µ? Why? T-interval, because the standard deviation of the population is unknown and the population is normal.
b) What is the point estimate? 32
c) Estimate the average age of all employees with 98% confidence. 32 ± 2.602(3/√16) = 32 ± 1.9515 = 30.05 to 33.95 years, where 2.602 is the t critical value with 16 – 1 = 15 degrees of freedom.
d) Interpret the confidence interval found above. We are 98% confident that the average age of all employees in this company is between 30.05 years and 33.95 years.
e) What is the margin of error of this estimate? E = 1.9515
f) The math test scores in a certain university are normally distributed. A sample of 49 tests yielded a mean of 80 and a standard deviation of 5. Estimate the mean test score at this university with 90% confidence. Interpret. 80 ± 1.677(5/√49) = 80 ± 1.198 = 78.8 to 81.2, where 1.677 is the t critical value with 49 – 1 = 48 degrees of freedom. We are 90% confident that the average math score in this university is between 78.8 and 81.2.
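Both t-intervals in this section share one formula, x̄ ± t·s/√n. The sketch below assumes the critical values are read from a t table, as in the worked answers, because Python's standard library has no t distribution. The helper name `t_interval` is hypothetical.

```python
from math import sqrt


def t_interval(sample_mean, s, n, t_critical):
    """Return (lower, upper, margin) for a t-interval.

    t_critical must be looked up for n - 1 degrees of freedom;
    it is passed in rather than computed.
    """
    margin = t_critical * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin, margin


# Part c): 98% confidence, 15 degrees of freedom, table value 2.602.
age_low, age_high, age_margin = t_interval(32, 3, 16, t_critical=2.602)

# Part f): 90% confidence, 48 degrees of freedom, table value 1.677.
score_low, score_high, score_margin = t_interval(80, 5, 49, t_critical=1.677)

print(round(age_low, 2), round(age_high, 2), round(age_margin, 4))
print(round(score_low, 1), round(score_high, 1), round(score_margin, 3))
```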
8. Sample Size for Estimating Defective Items
Calculate the sample size needed to estimate the percentage of defective items from a production line at Vistakon to within 2% with a 95% confidence level. We do not have any information about p, therefore we use the conservative value 0.5; n = (0.5)(1 – 0.5)(1.96/0.02)² = 2,401 items.
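The calculation n = p(1 – p)(z/E)² can be sketched in Python. The function name `sample_size_for_proportion` and its default `p=0.5` are illustrative choices; the default reflects the conservative value used when nothing is known about p.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(margin, confidence, p=0.5):
    """Smallest n that estimates a proportion to within `margin`.

    p = 0.5 maximizes p * (1 - p), so it gives the largest (safest) n.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(p * (1 - p) * (z / margin) ** 2)


n_items = sample_size_for_proportion(margin=0.02, confidence=0.95)
print(n_items)
```

Any other planning value of p produces a smaller required sample, which is why 0.5 is the safe default.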
9. Confidence Interval for TV Audience Share
Nielsen Media Research uses samples of 5,000 households to rank TV shows. Suppose Nielsen reports that NFL Monday Night Football had 35% of the TV audience. What is the 99% confidence interval for this parameter? 0.35 ± 2.575√{(0.35)(1 – 0.35)/5000} = 0.35 ± 0.0174 = 33.3% to 36.7%.
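A z-interval for a proportion, p̂ ± z·√(p̂(1 – p̂)/n), can be sketched as follows. The helper name `proportion_z_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_interval(p_hat, n, confidence):
    """Large-sample confidence interval for a population proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


share_low, share_high = proportion_z_interval(p_hat=0.35, n=5000, confidence=0.99)
print(round(share_low, 3), round(share_high, 3))
```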
10. Confidence Interval for Defective Items Proportion
The manager of a manufacturer found 11 defective items in a sample of 121 items.
a) What is the point estimate for the proportion of defective items in this manufacturer? p̂ = 11/121 = 0.0909
b) Can we use a Z-interval to estimate the proportion of defective items in this manufacturer? Yes, because 121 * 0.0909 = 11 > 5 and 121(1 – 0.0909) = 110 > 5.
c) Estimate the proportion of defective items in this manufacturer with 95% confidence. 0.0909 ± 1.96√{[0.0909(1 – 0.0909)]/121} = 0.0909 ± 0.0512 = 0.04 to 0.14.
d) Interpret the confidence interval found above. We are 95% confident that the proportion of defective items at this manufacturer is between 4% and 14%.
e) What is the margin of error of this estimate? E = 0.0512
f) The manager is not satisfied with the length of the interval obtained. How many items should the manager select to estimate the proportion to within 1% with the same confidence? We have an estimate of p as 0.0909; n = 0.0909(1 – 0.0909)(1.96/0.01)² ≈ 3174.88 (using the unrounded value 11/121), therefore select 3,175 items.
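Parts a) through f) chain together naturally, so they can be walked through in one short script. This is a sketch with illustrative variable names. It uses the unrounded critical value (about 1.95996) rather than 1.96, so intermediate figures may differ slightly from the hand calculation.

```python
from math import ceil, sqrt
from statistics import NormalDist

defects, n = 11, 121
p_hat = defects / n  # part a)

# Part b): both expected counts should exceed 5 for the z-interval.
z_interval_ok = n * p_hat > 5 and n * (1 - p_hat) > 5

z = NormalDist().inv_cdf(0.975)              # 95% confidence
margin = z * sqrt(p_hat * (1 - p_hat) / n)   # part e)
interval = (p_hat - margin, p_hat + margin)  # part c)

# Part f): items needed for a 1% margin, reusing p_hat as the planning value.
n_needed = ceil(p_hat * (1 - p_hat) * (z / 0.01) ** 2)

print(z_interval_ok, round(margin, 4), n_needed)
```

Cutting the margin of error from about 5% to 1% requires roughly 26 times as many items, because the required sample size grows with the inverse square of the margin.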
11. Hypothesis Testing for Order Delivery Time
Chicken Delight claims that 90% of its orders are delivered within 10 minutes of the order time. A researcher wants to test if less than 90% of the orders are delivered “on time”.
a) Write the Hypotheses of this test. Ho: p ≥ 0.9 vs HA: p < 0.9
b) Is this a right-tail, left-tail, or two-tail test? Left-tail
12. Error Probabilities in Disease Detection
The screening process for detecting a rare disease is not perfect. Researchers have developed a blood test that is considered fairly reliable. It gives a positive reaction in 96.4% of the people who have that disease. However, it erroneously gives a positive reaction in 2.3% of the people who do not have the disease. Consider the null hypothesis “the individual does not have the disease” to answer the following questions.
a) What is the probability of Type I error? Ho: the individual does not have the disease. HA: the individual has the disease. Type I error is to conclude that the individual has the disease (positive reaction) when in fact the individual does not have the disease, therefore its probability is α = 0.023.
b) What is the probability of Type II error? Type II error is to conclude that the individual does not have the disease (negative reaction) when in fact the individual has the disease, therefore its probability is β = 1 – 0.964 = 0.036.
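The mapping from the test's error rates to α, β, and power can be written out explicitly. The variable names below are illustrative; the two rates are the ones quoted in the problem.

```python
# Rates quoted in the problem statement.
p_positive_given_disease = 0.964  # test reacts when the disease is present
p_positive_given_healthy = 0.023  # test reacts when the disease is absent

# Ho: the individual does not have the disease.
alpha = p_positive_given_healthy     # Type I: positive result, no disease
beta = 1 - p_positive_given_disease  # Type II: negative result, disease present
power = 1 - beta                     # chance of detecting a true case

print(alpha, round(beta, 3), round(power, 3))
```

The power equals the 96.4% detection rate, consistent with the definition power = 1 – β from Section 1.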
13. Hypothesis Testing for Average Cost of Tennis Shoes
A sample of 30 pairs of tennis shoes has an average cost of $70.25. Is there enough evidence at a 1% significance level to infer that the average cost is greater than $69.95? Assume the population standard deviation is $1.87.
a) Write the Hypotheses. Ho: µ ≤ 69.95 vs HA: µ > 69.95
b) Is this a right-tail, left-tail, or two-tail test? Right-tail
c) Calculate the Test Statistic (T.S.). TS = (70.25 – 69.95)/[1.87/√30] = 0.8787
d) Calculate the p-value. P(Z > 0.88) = 1 – 0.8106 = 0.1894
e) Do we reject or do we not reject Ho? Why? P-value = 0.1894 > α = 0.01, therefore we do not reject Ho.
f) Write the conclusion pertaining to this problem. There is NOT enough statistical evidence at a 1% significance level to infer that the average cost of tennis shoes is above $69.95.
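Steps c) through e) can be sketched as a small right-tail z-test. The function name `right_tail_z_test` is hypothetical. Because the code uses the unrounded test statistic, the p-value comes out near 0.1898 rather than the table-based 0.1894, and the decision is unchanged.

```python
from math import sqrt
from statistics import NormalDist


def right_tail_z_test(sample_mean, mu0, sigma, n, alpha):
    """One-sample z-test of Ho: mu <= mu0 against HA: mu > mu0."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_value = 1 - NormalDist().cdf(z)  # area to the right of z
    return z, p_value, p_value < alpha


z_stat, p_value, reject = right_tail_z_test(
    sample_mean=70.25, mu0=69.95, sigma=1.87, n=30, alpha=0.01
)
print(round(z_stat, 4), round(p_value, 4), reject)
```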