# Probability and statistics final exam


## Final Exam

This is an open-book take-home exam. Good luck!

1.  Let $X_i$ be the life length of an item. Consider $X_1, X_2, \ldots, X_n$ to be independent and identically distributed, each with normal distribution $N(\mu, \sigma^2)$. Assume that $\sigma^2 = 16$, but that $\mu$ is unknown. Suppose 100 tests yield an average life of $\bar{x} = 501.2$ hours.

a) Construct a 95% confidence interval for the reliability of the item for a service time of t hours given by

$R(t; \mu) = P(X > t)$.

b) Compute numerical values for a) if t = 500 hours.
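One possible numerical sketch for Problem 1, assuming the usual z-interval for the mean with known variance and using the fact that the reliability is an increasing function of the mean, so the endpoints of the interval for the mean map directly to endpoints for the reliability. The function name `reliability_interval` and its parameters are illustrative, not part of the exam.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def reliability_interval(xbar, sigma, n, t, level=0.95):
    """Map a z-interval for the mean into an interval for R(t) = P(X > t)."""
    z = std_normal.inv_cdf(0.5 + level / 2)  # about 1.96 when level = 0.95
    half_width = z * sigma / n ** 0.5
    mean_lo, mean_hi = xbar - half_width, xbar + half_width
    # R(t) = Phi((mean - t) / sigma) increases with the mean,
    # so interval endpoints for the mean map to endpoints for R(t).
    r_lo = std_normal.cdf((mean_lo - t) / sigma)
    r_hi = std_normal.cdf((mean_hi - t) / sigma)
    return (mean_lo, mean_hi), (r_lo, r_hi)


# Values from Problem 1: variance 16 (sigma = 4), 100 tests, average 501.2, t = 500.
mean_interval, r_interval = reliability_interval(xbar=501.2, sigma=4.0, n=100, t=500.0)
print(mean_interval, r_interval)
```

The same function can be reused for other service times by changing `t`.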

2.  For a random sample of size $n$ from $f(x \mid \theta) = \theta L^{\theta} x^{-(1+\theta)}$ for $x > L$, where $L$ is known and $\theta > 0$,

a) Find the maximum likelihood estimator of $\theta$ and express it as a function of $g = \left(\prod_{i=1}^{n} x_i\right)^{1/n}$, the geometric mean of the observations.

b) Find the set of admissible rejection regions in terms of $g$ for a likelihood ratio test of $H_0: \theta = 5$ versus $H_1: \theta = 2$.

3.  For a normal data-generating process with $\mu$ and $\sigma$ not known but the coefficient of variation $c = \sigma/\mu$ known, find the maximum likelihood estimates of $\mu$ and $\sigma^2$ if $c = 0.25$ and the data are: 16, 27, 24, 21, 23, 12, 21, 18, 17, 23. Compare these estimates with the estimates that would be obtained if no information were available concerning $c$.
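A hedged sketch of the arithmetic for Problem 3. It assumes a positive mean, substitutes "standard deviation = c times mean" into the normal log-likelihood, and sets the derivative with respect to the mean to zero. This reduces to the quadratic c² m² + x̄ m − (Σx²/n) = 0 in the mean m, of which the positive root is taken. The variable names are illustrative.

```python
from math import sqrt

data = [16, 27, 24, 21, 23, 12, 21, 18, 17, 23]
c = 0.25  # known coefficient of variation

n = len(data)
xbar = sum(data) / n                 # sample mean
m2 = sum(x * x for x in data) / n    # mean of squares

# Restricted MLE (assumes mean > 0): positive root of c^2 m^2 + xbar m - m2 = 0.
mean_restricted = (-xbar + sqrt(xbar ** 2 + 4 * c ** 2 * m2)) / (2 * c ** 2)
var_restricted = (c * mean_restricted) ** 2

# Unrestricted MLE: sample mean and the variance with divisor n.
mean_free = xbar
var_free = m2 - xbar ** 2

print(mean_restricted, var_restricted)
print(mean_free, var_free)
```

Printing both pairs side by side makes the comparison requested in the problem straightforward.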

4.  In a survey, some of the questions concern sensitive issues (e.g., income, drug use, sexual experiences).  As a result, some respondents do not answer the questions truthfully.  Denote by $p$ the proportion of the members of a particular population that had incomes over \$100,000 last year.  A random sample of $n$ members of this population is taken, and each person in the sample is asked “Was your income over \$100,000 last year?”  If a person really had an income over \$100,000, the probability that she will give a truthful answer to this question is $1-\lambda_1$.  If a person’s income was not over \$100,000, the probability that she will give a truthful answer is $1-\lambda_2$.  From past experience, $\lambda_1$ and $\lambda_2$ are known, with $0 < \lambda_1 < 0.5$, $0 < \lambda_2 < 0.5$.

a) For a sample of size one, find the likelihood function if the answer is “yes” and find the likelihood function if the answer is “no.”

b) For a random sample of size n, find the likelihood function and sufficient statistics.

c) Find the maximum likelihood estimator for p.

d) Assume that $\lambda_1 = 0.1$, $\lambda_2 = 0$, and there is one “yes” answer in a random sample of size 10. What is your best estimate of $p$ and why?

e) Consider the same scenario as in (d), but assume that $\lambda_1$ is unknown ($0 < \lambda_1 < 1$). In this case, what would be your best estimate of $p$ and why?
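A minimal sketch for parts (c) and (d) of Problem 4, assuming respondents answer independently, so the number of “yes” answers is binomial with success probability P(yes) = p(1 − λ1) + (1 − p)λ2. Because this probability is increasing in p when λ1 + λ2 < 1, the estimate is found by inverting the observed “yes” rate and clipping to [0, 1]. The helper name `mle_p` is hypothetical.

```python
def mle_p(yes_count, n, lam1, lam2):
    """MLE of p when P(yes) = p * (1 - lam1) + (1 - p) * lam2 and lam1 + lam2 < 1."""
    yes_rate = yes_count / n
    raw = (yes_rate - lam2) / (1 - lam1 - lam2)  # invert the linear map from p to P(yes)
    return min(1.0, max(0.0, raw))               # keep the estimate inside [0, 1]


# Part (d): lam1 = 0.1, lam2 = 0, one "yes" in a sample of 10.
p_hat = mle_p(yes_count=1, n=10, lam1=0.1, lam2=0.0)
print(p_hat)
```

This sketch does not cover part (e), where the misreporting probability is unknown and the sample alone no longer pins down p.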

5.  Let $X_1, X_2, \ldots, X_n$ be the times in months until failure of $n$ similar pieces of equipment. If the equipment is subject to wear, a model often used is the one where $X_1, X_2, \ldots, X_n$ (i.i.d.) is a sample from a Weibull distribution with density , $x_i > 0$.

Here $c$ is a known positive constant and $\lambda > 0$ is the (scale) parameter of interest.

a) Show that is an optimal test statistic for testing $H_0: 1/\lambda < 1/\lambda_0$ versus $H_1: 1/\lambda > 1/\lambda_0$, i.e., show that for a UMP test, the rejection and acceptance regions are defined in terms of the statistic .

b) If random variable X has a Weibull distribution specified above, find the distribution of the random variable .

6.  A journal editor says: “If we only publish papers with results that are statistically significant at the $\alpha = 0.05$ level, at most 5% of our papers will have erroneous results.” Denote by $p$ the proportion of researchers with true $H_0$ and false $H_1$. Suppose that each researcher performs one test, sends the paper to the journal, and the paper is accepted if the results of the test are significant at the $\alpha = 0.05$ level.

a) If in a given year the journal publishes $n$ papers, find the distribution of the number of papers with erroneous results that are published in this year. Assume that all the tests in all papers have the same probability $\beta$ of a type II error.

b) What is this distribution if $p = 1$, i.e., if all researchers submitting papers this year had true $H_0$ and false $H_1$?

c) Overall, comment on the above statement of a journal editor.
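One way to explore Problem 6 numerically, under the stated assumptions plus two more that are not in the exam: researchers act independently, and “erroneous” means a published paper whose null hypothesis was actually true. A published paper is then erroneous with the conditional probability of a true null given a significant result, and the count among n published papers is binomial. The function names and the illustrative values p = 0.5 and type II error 0.2 are hypothetical.

```python
from math import comb


def prob_published_is_erroneous(p, alpha, beta):
    """P(true H0 | significant result), by Bayes' rule."""
    false_positive = p * alpha              # true H0, rejected anyway
    true_positive = (1 - p) * (1 - beta)    # false H0, correctly rejected
    return false_positive / (false_positive + true_positive)


def erroneous_count_pmf(k, n, p, alpha, beta):
    """Binomial probability of k erroneous papers among n published ones."""
    q = prob_published_is_erroneous(p, alpha, beta)
    return comb(n, k) * q ** k * (1 - q) ** (n - k)


q_example = prob_published_is_erroneous(p=0.5, alpha=0.05, beta=0.2)
all_wrong = erroneous_count_pmf(k=5, n=5, p=1.0, alpha=0.05, beta=0.2)
print(q_example, all_wrong)
```

Varying `p` in `prob_published_is_erroneous` shows how strongly the share of erroneous published papers depends on the mix of submitted hypotheses.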

7.  Suppose that a single observation X is to be drawn from an unknown distribution P, and that the following simple hypotheses are to be tested:

H0: P is a uniform distribution on the interval [0,1],

H1: P is a standard normal distribution.

Determine the most powerful test of size 0.01, and calculate the power of the test when H1 is true.
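A short numerical sketch for Problem 7, following the Neyman-Pearson approach. It assumes the test always rejects outside [0, 1], where the uniform density is zero, and inside [0, 1] rejects where the normal density is largest, i.e. for x at most 0.01, which has probability 0.01 under the uniform distribution. The variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # H1: standard normal
size = 0.01

# Inside [0, 1] the likelihood ratio equals the normal density, which decreases in x,
# so the size-0.01 rejection region inside the interval is 0 <= x <= 0.01.
cutoff = size

# Power under H1 = P(X outside [0, 1]) + P(0 <= X <= cutoff).
prob_outside = 1 - (std_normal.cdf(1) - std_normal.cdf(0))
prob_near_zero = std_normal.cdf(cutoff) - std_normal.cdf(0)
power = prob_outside + prob_near_zero
print(power)
```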

8.  An unethical experimenter desires to test the following hypotheses:

$H_0: \theta = \theta_0$,

$H_1: \theta \neq \theta_0$.

She draws a random sample $X_1, X_2, \ldots, X_n$ from a distribution with the pdf $f(x \mid \theta)$ and carries out a test of size $\alpha$. If this test does not reject $H_0$, she discards the sample, draws a new independent random sample of $n$ observations, and repeats the test based on the new sample. She continues drawing new independent samples in this way until she obtains a sample for which $H_0$ is rejected.

a) What is the overall size of this testing procedure?

b) If $H_0$ is true, what is the distribution of the number of samples that the experimenter will have to draw until she rejects $H_0$? In particular, what is the expected number of samples for $\alpha = 0.05$?
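A small simulation sketch for Problem 8, assuming that under a true null hypothesis each independent sample leads to rejection with probability equal to the test size. The number of samples drawn until the first rejection is then geometric. The function names, the seed, and the number of simulated experimenters are arbitrary choices for illustration.

```python
import random


def geometric_pmf(k, alpha):
    """P(first rejection happens on sample k) when each test rejects with probability alpha."""
    return (1 - alpha) ** (k - 1) * alpha


def samples_until_rejection(alpha, rng):
    """Simulate one experimenter: keep drawing samples until a test rejects."""
    count = 1
    while rng.random() >= alpha:  # no rejection on this sample, so draw another
        count += 1
    return count


alpha = 0.05
expected_draws = 1 / alpha  # mean of the geometric distribution

rng = random.Random(0)  # fixed seed so the run is reproducible
simulated = [samples_until_rejection(alpha, rng) for _ in range(20000)]
simulated_mean = sum(simulated) / len(simulated)
print(expected_draws, simulated_mean)
```

Since every simulated experimenter eventually rejects, the simulation also illustrates what happens to the overall size of the procedure.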

9.  Consider the following situation. There are $N$ job applicants, and, with probability $p_i$, $n_i$ of them ($i = 1, 2, \ldots, M$; $0 < n_i < N$) are invited for an interview. All $p_i$ and $n_i$ are known to all job applicants, and if $n_i$ applicants are invited, then each of the $N$ applicants has the same chance $n_i/N$ of being invited.

a) Given that a job applicant is invited for an interview, what are her expectations about the total number of applicants invited for an interview? 1) Find the corresponding probability distribution – i.e., the posterior distribution (conditional on an applicant being invited for an interview) for the number of applicants invited for an interview. 2) For this distribution, find the expected number of invited applicants.

b) Assume that if $n_i$ applicants are invited, each of them has an equal ($1/n_i$) chance of getting a job. Before the applicant is invited, what are her chances of getting a job? After the applicant is invited, what are her chances of getting a job? What are the chances of being invited? Do these three numbers agree with each other?

c) Repeat questions a) and b) for the special case $M = 2$, $p_1 = p_2 = 0.5$, $n_1 = 1$, $n_2 = 100$, $N = 1000$ – i.e., out of 1000 applicants, either 1 or 100 are invited for an interview. Do the answers make sense?
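A sketch of the Bayes-rule bookkeeping for Problem 9, using exact fractions. It assumes the posterior weight of each scenario is proportional to its prior probability times the chance of being invited in that scenario. The function name `interview_summary` and the dictionary keys are illustrative.

```python
from fractions import Fraction


def interview_summary(probs, sizes, total):
    """Posterior over the number invited, given that this applicant was invited."""
    weights = [p * Fraction(n, total) for p, n in zip(probs, sizes)]  # prior x P(invited | n)
    p_invited = sum(weights)
    posterior = [w / p_invited for w in weights]
    return {
        "p_invited": p_invited,
        "posterior": posterior,
        "expected_invited": sum(q * n for q, n in zip(posterior, sizes)),
        "p_job_given_invited": sum(q * Fraction(1, n) for q, n in zip(posterior, sizes)),
        "p_job_prior": sum(p * Fraction(n, total) * Fraction(1, n) for p, n in zip(probs, sizes)),
    }


# Special case from part c): either 1 or 100 of 1000 applicants are invited, equally likely.
result = interview_summary(
    probs=[Fraction(1, 2), Fraction(1, 2)], sizes=[1, 100], total=1000
)
for name, value in result.items():
    print(name, value)
```

Multiplying `p_invited` by `p_job_given_invited` and comparing with `p_job_prior` is one way to check whether the three numbers in part b) agree.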