33.3 Probability distributions by Benjamin Crowell, Light and Matter  licensed under the Creative Commons Attribution-ShareAlike license.

33.3 Probability distributions

So far we've discussed random processes having only two possible outcomes: yes or no, win or lose, on or off. More generally, a random process could have a result that is a number. Some processes yield integers, as when you roll a die and get a result from one to six, but some are not restricted to whole numbers, for example the number of seconds that a uranium-238 atom will exist before undergoing radioactive decay.

Consider a throw of a die. If the die is “honest,” then we expect all six values to be equally likely. Since all six probabilities must add up to 1, the probability of any particular value coming up must be $1/6$. We can summarize this in a graph, d. Areas under the curve can be interpreted as total probabilities. For instance, the area under the curve from 1 to 3 is $1/6+1/6+1/6=1/2$, so the probability of getting a result from 1 to 3 is $1/2$. The function shown on the graph is called the probability distribution.
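As a minimal sketch of the same bookkeeping, the fair die can be represented as a table of probabilities. The names `distribution` and `p_one_to_three` are chosen here for illustration only; exact fractions are used so that the sums come out without rounding.

```python
from fractions import Fraction

# A fair six-sided die: each face has probability 1/6.
faces = range(1, 7)
distribution = {face: Fraction(1, 6) for face in faces}

# The probabilities of all possible outcomes must add up to 1.
total = sum(distribution.values())

# "Area under the curve" from 1 to 3: add the bars for 1, 2, and 3.
p_one_to_three = sum(distribution[face] for face in (1, 2, 3))

print(total)           # 1
print(p_one_to_three)  # 1/2
```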

Figure e shows the probabilities of various results obtained by rolling two dice and adding them together, as in the game of craps. The probabilities are not all the same. There is a small probability of getting a two, for example, because there is only one way to do it, by rolling a one and then another one. The probability of rolling a seven is high because there are six different ways to do it: $1+6$, $2+5$, etc.
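The counting argument for two dice can be checked by brute force. The sketch below lists all 36 equally likely ordered pairs and tallies how many ways each sum can occur; the variable names are illustrative, not part of the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely (first die, second die) pairs
# and count how many of them give each possible sum.
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Probability of each sum = (number of ways to roll it) / 36.
two_dice = {s: Fraction(count, 36) for s, count in sorted(ways.items())}

for s, prob in two_dice.items():
    print(s, ways[s], prob)
```

The table that results has a single way to roll a two and six ways to roll a seven, which is why the graph rises to a peak in the middle.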

If the number of possible outcomes is large but finite, for example the number of hairs on a dog, the graph would start to look like a smooth curve rather than a ziggurat.

What about probability distributions for random numbers that are not integers? We can no longer make a graph with probability on the $y$ axis, because the probability of getting a given exact number is typically zero. For instance, there is zero probability that a radioactive atom will last for exactly 3 seconds, since there are infinitely many possible results that are close to 3 but not exactly 3, for example 2.999999999999999996876876587658465436. It doesn't usually make sense, therefore, to talk about the probability of a single numerical result, but it does make sense to talk about the probability of a certain range of results. For instance, the probability that an atom will last more than 3 and less than 4 seconds is a perfectly reasonable thing to discuss. We can still summarize the probability information on a graph, and we can still interpret areas under the curve as probabilities.

But the $y$ axis can no longer be a unitless probability scale. In radioactive decay, for example, we want the $x$ axis to have units of time, and we want areas under the curve to be unitless probabilities. The area of a single square on the graph paper is then

(unitless area of a square) = (width of square with time units) × (height of square).

If the units are to cancel out, then the height of the square must evidently be a quantity with units of inverse time. In other words, the $y$ axis of the graph is to be interpreted as probability per unit time, not probability.
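To make the units argument concrete, here is a rough numerical sketch. It assumes an exponential decay curve with a hypothetical mean lifetime `TAU` of 5 seconds; neither the curve's formula nor that value comes from the text, and they are chosen only for illustration. The area is built up from thin rectangles, each of which is a height in inverse seconds times a width in seconds.

```python
import math

TAU = 5.0  # hypothetical mean lifetime in seconds (an assumption for illustration)

def density(t):
    """Probability per unit time (units of 1/s) of decaying at time t."""
    return math.exp(-t / TAU) / TAU

def probability(t_start, t_end, strips=10_000):
    """Approximate the area under the density curve with thin rectangles."""
    width = (t_end - t_start) / strips          # seconds
    area = 0.0
    for i in range(strips):
        midpoint = t_start + (i + 0.5) * width
        area += density(midpoint) * width       # (1/s) * s = unitless
    return area

# Probability of lasting more than 3 s and less than 4 s.
p_3_to_4 = probability(3.0, 4.0)

# For this assumed curve the area can also be written in closed form.
exact = math.exp(-3.0 / TAU) - math.exp(-4.0 / TAU)
print(p_3_to_4, exact)

# A range of zero width has zero area, so an exact value has zero probability.
print(probability(3.0, 3.0))  # 0.0
```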

Figure f shows another example, a probability distribution for people's height. This kind of bell-shaped curve is quite common.

self-check:

Compare the number of people with heights in the range of 130-135 cm to the number in the range 135-140 cm.

(answer in the back of the PDF version of the book)

Example 1: Looking for tall basketball players

  A certain country with a large population wants to find very tall people to be on its Olympic basketball team and strike a blow against western imperialism. Out of a pool of $10^8$ people who are the right age and gender, how many are they likely to find who are over 225 cm (7 feet 4 inches) in height? Figure g gives a close-up of the “tail” of the distribution shown previously in figure f.

  The shaded area under the curve represents the probability that a given person is tall enough. Each rectangle represents a probability of $0.2\times10^{-7}\ \text{cm}^{-1}\times 1\ \text{cm}=2\times10^{-8}$. There are about 35 rectangles covered by the shaded area, so the probability of having a height greater than 225 cm is $35\times(2\times10^{-8})=7\times10^{-7}$, or just under one in a million. Using the rule for calculating averages, the average, or expected, number of people this tall is $(10^8)\times(7\times10^{-7})=70$.
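The arithmetic of this example can be laid out as a short sketch. The per-rectangle probability, rectangle count, and pool size are the values quoted in the example; the variable names are illustrative, and the count of 35 is an estimate read off the graph by eye.

```python
# Values quoted in the example above.
p_rectangle = 2e-8     # probability represented by one rectangle (unitless)
n_rectangles = 35      # approximate number of shaded rectangles
pool = 10**8           # people of the right age and gender

# Probability that one person is taller than 225 cm.
p_tall = n_rectangles * p_rectangle

# Expected number of such people in the pool.
expected_tall = pool * p_tall

print(p_tall, round(expected_tall))
```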

Average and width of a probability distribution

If the next Martian you meet asks you, “How tall is an adult human?,” you will probably reply with a statement about the average human height, such as “Oh, about 5 feet 6 inches.” If you wanted to explain a little more, you could say, “But that's only an average. Most people are somewhere between 5 feet and 6 feet tall.” Without bothering to draw the relevant bell curve for your new extraterrestrial acquaintance, you've summarized the relevant information by giving an average and a typical range of variation.

The average of a probability distribution can be defined geometrically as the horizontal position at which it could be balanced if it was constructed out of cardboard, h. A convenient numerical measure of the amount of variation about the average, or amount of uncertainty, is the full width at half maximum, or FWHM, defined in figure g. (The FWHM was introduced on p. 469.)
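Both quantities can be estimated numerically from a sampled curve. The sketch below assumes a bell-shaped (Gaussian) curve with a hypothetical center of 170 cm and standard deviation of 10 cm; these numbers are not taken from the figures and serve only as an illustration. The balance point is computed as a weighted average of position, and the FWHM as the distance between the outermost sample points at which the curve is at least half its maximum height.

```python
import math

MEAN_CM = 170.0   # hypothetical center of the bell curve (assumption)
SIGMA_CM = 10.0   # hypothetical standard deviation (assumption)

def bell(x):
    """Unnormalized bell-shaped curve; only its shape matters here."""
    return math.exp(-((x - MEAN_CM) ** 2) / (2 * SIGMA_CM ** 2))

# Sample the curve on a fine grid of heights from 100 cm to 240 cm.
step = 0.01
xs = [100.0 + i * step for i in range(14001)]
ys = [bell(x) for x in xs]

# Balance point: the average position, weighted by the height of the curve.
balance = sum(x * y for x, y in zip(xs, ys)) / sum(ys)

# FWHM: width of the region where the curve is at least half its maximum.
half_max = max(ys) / 2
above = [x for x, y in zip(xs, ys) if y >= half_max]
fwhm = above[-1] - above[0]

print(round(balance, 3), round(fwhm, 2))
```

For a Gaussian curve the FWHM is about 2.35 times the standard deviation, so the two measures of width carry the same information for this particular shape.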

A great deal more could be said about this topic, and indeed an introductory statistics course could spend months on ways of defining the center and width of a distribution. Rather than force-feeding you on mathematical detail or techniques for calculating these things, it is perhaps more relevant to point out simply that there are various ways of defining them, and to inoculate you against the misuse of certain definitions.

The average is not the only possible way to say what is a typical value for a quantity that can vary randomly; another possible definition is the median, defined as the value that is exceeded with 50% probability. When discussing incomes of people living in a certain town, the average could be very misleading, since it can be affected massively if a single resident of the town is Bill Gates. Nor is the FWHM the only possible way of stating the amount of random variation; another possible way of measuring it is the standard deviation (defined as the square root of the average squared deviation from the average value).
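A small sketch shows how differently the average and the median respond to one extreme value. The income figures are made up for illustration and are not real data. The standard deviation is computed twice, once with Python's `statistics.pstdev` and once directly from the definition given above.

```python
import math
import statistics

# Hypothetical annual incomes (in dollars) for a small town; not real data.
incomes = [30_000, 35_000, 40_000, 45_000, 50_000]

# The same town after one extremely wealthy resident moves in.
with_billionaire = incomes + [10_000_000_000]

print(statistics.mean(incomes), statistics.median(incomes))
print(statistics.mean(with_billionaire), statistics.median(with_billionaire))

# Standard deviation: square root of the average squared deviation
# from the average value.
avg = statistics.mean(incomes)
squared_deviations = [(x - avg) ** 2 for x in incomes]
manual_std = math.sqrt(sum(squared_deviations) / len(incomes))

# Library equivalent (population standard deviation).
std = statistics.pstdev(incomes)
print(manual_std, std)
```

The median barely moves when the outlier is added, while the average jumps by several orders of magnitude.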
