# Reading 4: Classical Inference II: Estimating and Comparing Distributions#

*For the class on Wednesday, January 24th*

## Reading assignments#

Read the following sections of [ICVG20] (sections based on the 2020 revised edition)

Sec. 4.5 “Confidence Estimates: The Bootstrap and the Jackknife”

Sec. 4.7.2 “Nonparametric Methods for Comparing Distributions” (read this section up to before “The U Test and the Wilcoxon Test”)

## Questions#

Hint

Submit your answer on Canvas. Due at noon, Wednesday, Jan 24th.

List any concepts that confuse you from your reading. Explain why they confuse you. If nothing confuses you, briefly summarize what you have learned from this reading assignment.

Let’s say you have 100 data points that you know for certain come from an exponential distribution. However, you don’t know the value of the \(\lambda\) parameter in that exponential distribution.

How would you estimate the value of \(\lambda\) from the data points?

How would you estimate the uncertainty (or variance) on your estimate of \(\lambda\)? Is bootstrap the right method to use here? Why or why not?

## Discussion Preview#

Note

We will discuss the following questions in class. They are included here so that you have a chance to think about them before class.
You need *not* submit your answers as part of this assignment.

Consider each of the following cases and discuss how you will carry out the estimation or test.

You have some data points and you want to test if they come from a specific distribution. You know the form of the distribution’s pmf/pdf but do not know the values of the parameters (e.g., you know the distribution is a Poisson distribution but do not know the mean).

You have some data points from an unknown distribution. You want to estimate the uncertainty (or variance) of the sample median.

You have two sets of data points. You know both sets come from the same distribution, but with potentially different parameter values (e.g., each comes from a Poisson distribution with a unique mean). You want to test if the parameter values of the distributions from which the two sets of data points were drawn are consistent.