The IID setting is somewhat analogous to the binomial setting, for the case where the values in question are numeric (scalar) rather than yes/no. Here are its characteristics:

- A fixed number of trials, \(n\).
- Each trial produces a numeric value \(X_i\).
- The trials are independent of each other.
- The values from all trials come from the same (identical) distribution \(X\).
We can summarize this by saying:
In the IID setting we have a fixed number of Independent, Identically Distributed trials
You should see a lot of similarities with the binomial. In the case of the binomial we had the same chance of success, \(p\). The analog of this here is the claim that the distributions of all trials are identical.
In the IID setting, the quantity of interest is the sample mean:
\[\bar x = \frac{X_1+X_2+\cdots+X_n}{n}\]
Notice that \(\bar x\) is itself a random variable, being the sum of all the \(X_i\) divided by \(n\). Its value depends on the sample we end up with, just like the value of \(\hat p\) depended on the sample in the binomial case. So different samples will give us different values for the sample mean.
Just as in the binomial case we were interested in the kinds of values that \(\hat p\) can take, and how likely each one is, we can ask the same question here about \(\bar x\).
The sampling distribution of \(\bar x\) is the distribution of the values that \(\bar x\) takes across all possible samples of size \(n\).
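To make this concrete, here is a minimal simulation sketch in Python (using NumPy; the exponential population and the specific numbers are arbitrary choices for illustration, not part of the course example). It draws several samples of the same size and computes \(\bar x\) for each; every sample gives a different value.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40  # sample size (an illustrative choice)

# Draw a few samples of size n from an example population -- here an
# exponential distribution, chosen arbitrarily for illustration.
for i in range(5):
    sample = rng.exponential(scale=2.0, size=n)
    print(f"sample {i + 1}: xbar = {sample.mean():.3f}")
```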
The remarkable fact is that we can describe what this distribution is, even if we know very little about the values that the \(X_i\) can take. Let us set the stage.

In the IID setting we draw \(n\) samples/trials from a distribution \(X\). We denote the mean of that distribution by \(\mu\) and its standard deviation by \(\sigma\).
Then we can compute the mean and standard deviation of the sampling distribution of \(\bar x\):
\[\mu_{\bar x} = \mu\] \[\sigma_{\bar x} = \frac{\sigma}{\sqrt{n}}\]
The Greek letters on the left-hand side denote the mean and standard deviation of the random variable \(\bar x\), in other words the mean and standard deviation of the sampling distribution.
Let us rephrase this:
Suppose we draw samples of size \(n\) by drawing independent values from a population \(X\) with mean \(\mu\) and standard deviation \(\sigma\).
If we then compute the sample mean values \(\bar x\), one for each possible sample of \(n\) values, then the mean of these values is \(\mu\) and the standard deviation is \(\frac{\sigma}{\sqrt{n}}\).
Sample averages vary less than the original values, by a factor of \(\sqrt{n}\).
Sample averages are on average the same as the original values.
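As a sanity check on these two facts, here is a rough simulation sketch (NumPy again, with an arbitrary exponential population whose mean and standard deviation both equal \(2\)). It approximates the sampling distribution of \(\bar x\) with many simulated samples and compares its mean and standard deviation to \(\mu\) and \(\sigma/\sqrt{n}\).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40                 # sample size
num_samples = 100_000  # number of simulated samples

# Example population: exponential with scale 2, so mu = 2 and sigma = 2.
mu, sigma = 2.0, 2.0
xbars = rng.exponential(scale=2.0, size=(num_samples, n)).mean(axis=1)

print("mean of the sample means:", round(xbars.mean(), 4))       # close to mu = 2
print("sd of the sample means:  ", round(xbars.std(), 4))        # close to sigma/sqrt(n)
print("sigma / sqrt(n):         ", round(sigma / np.sqrt(n), 4)) # about 0.3162
```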
This tells us at least the mean and standard deviation of the sampling distribution of \(\bar x\). Amazingly, we can say more about it. This is the famous Central Limit Theorem.
Central Limit Theorem
When the sample size \(n\) is “sufficiently large”, then the sampling distribution of \(\bar x\) will be approximately normal.
So we can treat \(\bar x\) as approximately following the distribution:
\[N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)\]
This is a remarkable theorem. It tells us that no matter what kind of distribution our original values had (heavily skewed, with outliers, multiple modes, and so on), once we take large enough samples, the possible values of \(\bar x\) behave like a normal distribution. No matter what we started with.
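To get a feel for this, one can simulate sample means from a heavily skewed population and compare their distribution to the normal curve the theorem predicts. The sketch below is only an illustration with an arbitrary exponential population (not part of the course example); it compares a few percentiles of the simulated \(\bar x\) values to those of \(N(\mu, \sigma/\sqrt{n})\).

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
n = 100               # "sufficiently large" for a heavily skewed population
mu, sigma = 2.0, 2.0  # an exponential with scale 2 has mean 2 and sd 2

# Simulate the sampling distribution of the sample mean.
xbars = rng.exponential(scale=2.0, size=(50_000, n)).mean(axis=1)

# Compare a few simulated percentiles with those of N(mu, sigma/sqrt(n)).
for q in (0.05, 0.25, 0.50, 0.75, 0.95):
    simulated = np.quantile(xbars, q)
    theoretical = norm.ppf(q, loc=mu, scale=sigma / np.sqrt(n))
    print(f"{q:.2f} quantile: simulated {simulated:.3f}, normal {theoretical:.3f}")
```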
This is part of why standardized test scores behave so predictably: your score on a standardized test is an average of your scores on many questions, and averages tend to behave in a more normal way than the original values.
- Sample averages are on average the same as the original values.
- Sample averages vary less than the original values.
- Sample averages are more normal than the original values.
The only thing left is to answer the question of what counts as a “sufficiently large” sample size. All we have is a general rule of thumb, but the bottom line is: the more non-normal the original population, the larger the sample size you will need.
Rule of Thumb for sufficient sample size for the Central Limit Theorem to apply.
- If the original population is heavily skewed, has outliers, etc., we would need a sample size near \(n=100\) or more.
- If the original population is only slightly skewed, without many outliers, a sample size around \(n=40\) or more would suffice.
- If the original population is close to symmetric, a sample size of 10-20 is enough.
- If the original population is normal, then even a sample size of \(n=1\) is sufficient.
The larger the sample size, the better; these numbers are just starting points that depend on how non-normal the population is.
One important observation is that these sample sizes are just the minimums required to be able to claim that \(\bar x\) is approximately normally distributed. We typically still need even bigger sample sizes in order to keep \(\sigma_{\bar x} = \frac{\sigma}{\sqrt{n}}\) small.
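For instance, with a hypothetical population standard deviation of \(\sigma = 0.5\) (a made-up value, just to illustrate the scaling):

\[\sigma_{\bar x} = \frac{0.5}{\sqrt{40}} \approx 0.079, \qquad \frac{0.5}{\sqrt{160}} \approx 0.040, \qquad \frac{0.5}{\sqrt{640}} \approx 0.020\]

So quadrupling the sample size only cuts \(\sigma_{\bar x}\) in half.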
Suppose we draw at random a sample of size \(40\) from the Hanover student body and consider their GPAs. Suppose also that across the whole student body the GPAs have mean \(\mu = 2.98\) and standard deviation \(\sigma \approx 0.55\). Assuming GPAs are not too heavily skewed, by our rule of thumb a sample of size \(40\) should be large enough for the Central Limit Theorem to apply, and our formulas give \(\mu_{\bar x} = 2.98\) and \(\sigma_{\bar x} = 0.55/\sqrt{40} \approx 0.087\).

All this leads us to being able to say that \(\bar x\) behaves according to the normal distribution \(N(2.98, 0.087)\). We can use our knowledge of that distribution to answer various questions about \(\bar x\).
For example, say I want to find a range that captures \(90\%\) of the possible \(\bar x\) values.
The middle \(90\%\) of a normal distribution corresponds to \(z\)-scores between \(-1.645\) and \(1.645\). We scale those back to \(\bar x\) values using the formulas

\[z = \frac{\bar x - \mu_{\bar x}}{\sigma_{\bar x}}\] \[\bar x = \sigma_{\bar x} \cdot z + \mu_{\bar x}\]

With \(\mu_{\bar x} = 2.98\) and \(\sigma_{\bar x} = 0.087\), this gives an interval from roughly \(2.83\) to \(3.13\). So what this means is that \(90\%\) of the possible samples out there have a sample mean value \(\bar x\) somewhere between \(2.83\) and \(3.13\). In other words, \(90\%\) of the time we choose a sample, \(\bar x\) is no more than \(0.15\) away from the actual mean.
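For a quick check of this computation, here is a small sketch using SciPy's normal distribution (the values \(2.98\) and \(0.087\) come from the example above). It asks \(N(2.98, 0.087)\) directly for its 5th and 95th percentiles.

```python
from scipy.stats import norm

mu_xbar, sigma_xbar = 2.98, 0.087  # from the GPA example above

# The middle 90% of N(2.98, 0.087): the 5th and 95th percentiles.
low = norm.ppf(0.05, loc=mu_xbar, scale=sigma_xbar)
high = norm.ppf(0.95, loc=mu_xbar, scale=sigma_xbar)
print(f"90% of sample means fall between {low:.3f} and {high:.3f}")
# Prints roughly 2.837 and 3.123, consistent with the rounded 2.83 to 3.13 above.
```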
Concept | Binomial | IID |
---|---|---|
Sample Size | \(n\) (fixed) | \(n\) (fixed) |
Sample Values | Yes/No | Numeric |
Distribution | Same probability of success \(p\) | Identical distribution \(X\) for each trial |
Independent | Population at least \(20\) times larger than the sample, among other considerations | Population at least \(20\) times larger than the sample, among other considerations |
Parameter (from population) | Population percent of success (\(p\)) | Population mean \(\mu\) (mean of \(X\)) |
Statistic (from sample) | Sample percent of success (\(\hat p\)) | Sample mean \(\bar x\) |
Sampling Distribution mean | \(\mu_{\hat p} = p\) | \(\mu_{\bar x} = \mu\) |
Sampling Distribution std. dev | \(\sigma_{\hat p}=\frac{\sqrt{p(1-p)}}{\sqrt{n}}\) | \(\sigma_{\bar x} =\frac{\sigma}{\sqrt{n}}\) |
Sampling Distribution is Normal | \(np\geq 10\) and \(n(1-p)\geq 10\) | Central Limit Theorem (rule of thumb) |
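As a side observation (not part of the table itself), the binomial column is really a special case of the IID column: if we record each trial as \(1\) for a success and \(0\) for a failure, that \(0\)/\(1\) variable has mean \(\mu = p\) and standard deviation \(\sigma = \sqrt{p(1-p)}\), and \(\hat p\) is exactly the sample mean \(\bar x\) of those values. The two standard deviation formulas then agree:

\[\sigma_{\hat p} = \frac{\sqrt{p(1-p)}}{\sqrt{n}} = \frac{\sigma}{\sqrt{n}} = \sigma_{\bar x}\]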