Why Small Samples Do Not Represent Populations Well

Why Small Samples Cannot Easily Represent a Population

An example might help to illustrate why small samples often fail to represent the population adequately. Suppose that we are sampling from a population of 1000 marbles in a large box. Half the marbles are red, 25% are blue, and 25% are green. In this case, we know exactly what the population looks like; rarely will that be the case in real life.

How big a sample do we need to represent the population adequately? Clearly, no sample of size N = 1 could represent the population. If we select just one marble, it will be red, blue, or green. The only generalization we could make from that sample of one marble is that all of the marbles in the population are the same color as the one we sampled--a clearly erroneous conclusion.

A similar argument could be made for a sample of two marbles, which at best could represent only two of the three colors in the population. A sample of three marbles might represent the three colors, but no sample of three marbles could represent both the three colors and the fact that red is twice as common as either blue or green. Furthermore, if you compute the probability of every possible sample of three marbles, you will find that less than 20% of the possible samples contain one red, one blue, and one green marble.
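The "less than 20%" figure can be checked directly. The sketch below is one way to do it, assuming the marbles are drawn without replacement from the box described above; the function name prob_of_sample is made up for this illustration.

```python
from math import comb

# Population from the text: 1000 marbles, half red, a quarter blue, a quarter green.
POPULATION = {"red": 500, "blue": 250, "green": 250}


def prob_of_sample(counts, population=POPULATION):
    """Probability of drawing exactly `counts` marbles of each color.

    Assumes sampling without replacement (a hypothetical helper, not from the text).
    """
    total = sum(population.values())
    sample_size = sum(counts.values())
    ways = 1
    for color, available in population.items():
        # Number of ways to pick the requested marbles of this color.
        ways *= comb(available, counts.get(color, 0))
    # Divide by the number of equally likely samples of this size.
    return ways / comb(total, sample_size)


p_one_each = prob_of_sample({"red": 1, "blue": 1, "green": 1})
print(f"{p_one_each:.4f}")  # roughly 0.188, i.e. under 20%
```

The same function can be called with any other composition, such as two red, one blue, and one green.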

The minimum sample that could accurately represent the population is four marbles (two red, one blue, and one green), but if you set up a box of marbles as described and try taking samples of four marbles, you will be shocked at how rarely that particular combination comes up: fewer than one sample in five. Even in this simple example (a population varying on a single characteristic with only three different values), you would need samples of 30 or more marbles before the sample gave a reasonably accurate representation of the composition of the population most of the time. (The mathematics of this computation is beyond the scope of this text. The interested student is referred to any introductory text on mathematical probability.) In the far more complicated real world, in which participants vary on hundreds of variables, small samples almost never adequately represent the population.
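Readers without a box of marbles can try the experiment by simulation. The sketch below is only an illustration: the text does not define "reasonably accurate," so the code assumes a sample counts as accurate when every color's share is within 10 percentage points of its true share. The tolerance, the number of trials, and the function names are all assumptions, and the results change if the tolerance changes.

```python
import random
from collections import Counter

# The box from the text: 500 red, 250 blue, 250 green marbles.
POPULATION = ["red"] * 500 + ["blue"] * 250 + ["green"] * 250
TRUE_SHARE = {"red": 0.50, "blue": 0.25, "green": 0.25}


def is_close(sample, tolerance):
    """True if every color's share in the sample is within `tolerance` of its true share."""
    counts = Counter(sample)
    n = len(sample)
    # The tiny slack guards against floating-point rounding at the boundary.
    return all(
        abs(counts[color] / n - share) <= tolerance + 1e-9
        for color, share in TRUE_SHARE.items()
    )


def hit_rate(sample_size, tolerance=0.10, trials=5000, seed=0):
    """Fraction of simulated samples (drawn without replacement) that are 'close'."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same numbers
    hits = sum(
        is_close(rng.sample(POPULATION, sample_size), tolerance)
        for _ in range(trials)
    )
    return hits / trials


for n in (4, 10, 30, 100):
    print(f"sample size {n:3d}: accurate in {hit_rate(n):.1%} of samples")
```

With a sample of four, the only composition that passes this test is two red, one blue, and one green, so the first line of output estimates how often that combination comes up. Larger samples pass more often, which is the pattern the passage describes.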
