Probability mass function: plot of the Yule–Simon PMF on a log-log scale. (The function is defined only at integer values of k; the connecting lines do not indicate continuity.)
Cumulative distribution function: plot of the Yule–Simon CMF. (The function is defined only at integer values of k; the connecting lines do not indicate continuity.)
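The probability mass function of the Yule–Simon (ρ) distribution is
$$f(k;\rho) = \rho\,\mathrm{B}(k, \rho+1) = \frac{\rho\,\Gamma(k)\,\Gamma(\rho+1)}{\Gamma(k+\rho+1)}$$
for integer k ≥ 1 and real ρ > 0, where B is the beta function and Γ is the gamma function. Thus, if ρ is an integer,
$$f(k;\rho) = \frac{\rho\,\rho!\,(k-1)!}{(k+\rho)!}.$$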
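As a minimal sketch, the PMF can be evaluated numerically through log-gamma values, assuming the form f(k; ρ) = ρ B(k, ρ + 1) with B(a, b) = Γ(a)Γ(b)/Γ(a + b). The function names below are illustrative and not part of any library.

```python
import math

def yule_simon_pmf(k: int, rho: float) -> float:
    """Evaluate f(k; rho) = rho * B(k, rho + 1) for integer k >= 1 and rho > 0."""
    if k < 1 or rho <= 0:
        raise ValueError("need integer k >= 1 and rho > 0")
    # log B(k, rho + 1) = lgamma(k) + lgamma(rho + 1) - lgamma(k + rho + 1)
    log_beta = math.lgamma(k) + math.lgamma(rho + 1.0) - math.lgamma(k + rho + 1.0)
    return rho * math.exp(log_beta)

def yule_simon_pmf_integer_rho(k: int, rho: int) -> float:
    """Factorial form rho * rho! * (k - 1)! / (k + rho)!, valid when rho is an integer."""
    return rho * math.factorial(rho) * math.factorial(k - 1) / math.factorial(k + rho)
```

Working in log space avoids overflow of the gamma function for large k. For integer ρ the two functions should agree up to floating-point rounding; for example, both give 2/3 at k = 1, ρ = 2.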
The parameter ρ can be estimated using a fixed point algorithm.
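One possible fixed-point iteration, sketched here under assumptions, comes from setting the derivative of the log-likelihood of f(k; ρ) = ρ B(k, ρ + 1) to zero, which gives ρ = N / Σᵢ Σⱼ₌₁..kᵢ 1/(ρ + j) for N observations k₁, …, k_N. This is not necessarily the exact scheme of the cited paper; the function name, starting value, and tolerance are illustrative.

```python
def fit_rho_fixed_point(data, rho0=1.0, tol=1e-12, max_iter=10_000):
    """Iterate rho <- N / sum_i sum_{j=1}^{k_i} 1 / (rho + j) until it stabilises.

    `data` is a sequence of positive integers assumed to be Yule-Simon draws.
    If every observation equals 1 the likelihood has no finite maximiser and
    the iteration simply stops after `max_iter` steps.
    """
    n = len(data)
    rho = rho0
    for _ in range(max_iter):
        # Inner sum equals digamma(k + rho + 1) - digamma(rho + 1) for each k.
        s = sum(sum(1.0 / (rho + j) for j in range(1, k + 1)) for k in data)
        rho_new = n / s
        if abs(rho_new - rho) < tol:
            return rho_new
        rho = rho_new
    return rho

# Hypothetical sample, for illustration only.
sample = [1, 1, 2, 1, 3, 1, 5, 2, 1, 10]
rho_hat = fit_rho_fixed_point(sample)
```

At the returned value the score equation N/ρ = Σᵢ Σⱼ 1/(ρ + j) holds to within the tolerance, which is a convenient way to check convergence.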
The probability mass function f has the property that for sufficiently large k we have
$$f(k;\rho) \approx \frac{\rho\,\Gamma(\rho+1)}{k^{\rho+1}} \propto \frac{1}{k^{\rho+1}}.$$
This means that the tail of the Yule–Simon distribution is a realization of Zipf's law: f(k; ρ) can be used to model, for example, the relative frequency of the kth most frequent word in a large collection of text, which according to Zipf's law is inversely proportional to a (typically small) power of k.
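The power-law tail can be checked numerically. The sketch below assumes the PMF f(k; ρ) = ρ Γ(k) Γ(ρ + 1) / Γ(k + ρ + 1) and compares it with the approximation ρ Γ(ρ + 1) / k^(ρ+1); the helper names are illustrative.

```python
import math

def log_pmf(k: int, rho: float) -> float:
    """log f(k; rho) for the Yule-Simon PMF, computed via log-gamma."""
    return (math.log(rho) + math.lgamma(k) + math.lgamma(rho + 1.0)
            - math.lgamma(k + rho + 1.0))

def log_power_law(k: int, rho: float) -> float:
    """log of the tail approximation rho * Gamma(rho + 1) / k^(rho + 1)."""
    return math.log(rho) + math.lgamma(rho + 1.0) - (rho + 1.0) * math.log(k)

def tail_ratio(k: int, rho: float) -> float:
    """Exact PMF divided by the power-law approximation."""
    return math.exp(log_pmf(k, rho) - log_power_law(k, rho))
```

As k grows, `tail_ratio(k, rho)` should approach 1, which is the sense in which the tail follows a power law with exponent ρ + 1.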
The Yule–Simon distribution arose originally as the limiting distribution of a particular stochastic process studied by Yule as a model for the distribution of biological taxa and subtaxa. Simon dubbed this process the "Yule process" but it is more commonly known today as a preferential attachment process. The preferential attachment process is an urn process in which balls are added to a growing number of urns, each ball being allocated to an urn with probability linear in the number the urn already contains.
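A rough simulation of such an urn process is sketched below. It assumes a Simon-style variant that the text does not spell out: at each step a new urn holding one ball is opened with probability `alpha` (a hypothetical parameter name), and otherwise a ball is added to an existing urn chosen with probability proportional to its current size. In that variant the urn sizes are expected to approach a Yule–Simon distribution with ρ = 1/(1 − alpha).

```python
import random

def simulate_urns(n_steps: int, alpha: float, seed: int = 0) -> list:
    """Return the list of urn sizes after n_steps of preferential attachment."""
    rng = random.Random(seed)
    urn_sizes = [1]    # start with a single urn holding one ball
    ball_owner = [0]   # ball_owner[b] is the index of the urn that holds ball b
    for _ in range(n_steps):
        if rng.random() < alpha:
            # Open a new urn containing one ball.
            urn_sizes.append(1)
            ball_owner.append(len(urn_sizes) - 1)
        else:
            # A uniformly random existing ball belongs to urn u with
            # probability proportional to the size of u.
            u = ball_owner[rng.randrange(len(ball_owner))]
            urn_sizes[u] += 1
            ball_owner.append(u)
    return urn_sizes
```

Tracking the owner of every ball makes the size-proportional choice a constant-time operation. With alpha = 0.5 (so ρ = 2), the fraction of urns holding exactly one ball should settle near f(1; 2) = 2/3 for long runs.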
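The Yule–Simon distribution also arises as a compound distribution. Suppose W follows an exponential distribution with rate ρ, with density
$$h(w;\rho) = \rho\,e^{-\rho w}.$$
Then a Yule–Simon distributed variable K has the following geometric distribution conditional on W:
$$K \mid W \sim \operatorname{Geometric}\left(e^{-W}\right).$$
The pmf of a geometric distribution is
$$g(k;p) = p\,(1-p)^{k-1}$$
for k ∈ {1, 2, …}. The Yule–Simon pmf is then the following exponential-geometric mixture distribution:
$$f(k;\rho) = \int_0^{\infty} g\left(k; e^{-w}\right)\,h(w;\rho)\,dw.$$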
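The mixture representation suggests a simple sampler, sketched here under the assumption that W is exponential with rate ρ and that, given W, the variable is geometric on {1, 2, …} with success probability exp(−W). The function names are illustrative.

```python
import math
import random

def log1mexp(w: float) -> float:
    """log(1 - exp(-w)) for w > 0, computed in a numerically stable way."""
    if w <= math.log(2.0):
        return math.log(-math.expm1(-w))
    return math.log1p(-math.exp(-w))

def sample_yule_simon(rho: float, rng: random.Random) -> int:
    """Draw one value by mixing a geometric over an exponential."""
    w = rng.expovariate(rho)   # W ~ Exponential(rate = rho)
    if w <= 0.0:
        return 1               # success probability exp(-0) = 1
    log_q = log1mexp(w)        # log of the failure probability 1 - exp(-W)
    # Extremely large w (likely only for very small rho) makes log_q round
    # to zero; that case is not handled in this sketch.
    u = 1.0 - rng.random()     # uniform on (0, 1], so log(u) is finite
    # Inverse-CDF draw: P(K > k) = q^k for a geometric on {1, 2, ...}.
    return int(math.log(u) / log_q) + 1
```

Under these assumptions P(K = 1) = E[exp(−W)] = ρ/(ρ + 1), so the empirical frequency of 1 among many draws gives a quick sanity check against the PMF.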
The two-parameter generalization of the original Yule distribution replaces the beta function with an incomplete beta function. The probability mass function of the generalized Yule–Simon(ρ, α) distribution is defined as
$$f(k;\rho,\alpha) = \frac{\rho}{1-\alpha^{\rho}}\,\mathrm{B}_{1-\alpha}(k, \rho+1)$$
with 0 ≤ α < 1. For α = 0 the ordinary Yule–Simon(ρ) distribution is obtained as a special case. The use of the incomplete beta function has the effect of introducing an exponential cutoff in the upper tail.
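A small numerical sketch of the generalized PMF follows. It assumes the form f(k; ρ, α) = ρ/(1 − α^ρ) · B₁₋α(k, ρ + 1), where B_x(a, b) is the integral of t^(a−1)(1 − t)^(b−1) from 0 to x, and evaluates that integral with composite Simpson's rule rather than a library routine. The names and the panel count are illustrative choices.

```python
def incomplete_beta(x: float, a: float, b: float, n: int = 2000) -> float:
    """B_x(a, b) = integral_0^x t^(a-1) (1-t)^(b-1) dt by Simpson's rule (n even).

    Intended for a >= 1 and b >= 1, where the integrand is bounded on [0, x].
    """
    if x <= 0.0:
        return 0.0
    h = x / n

    def f(t: float) -> float:
        return t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)

    total = f(0.0) + f(x)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return total * h / 3.0

def generalized_yule_simon_pmf(k: int, rho: float, alpha: float) -> float:
    """rho / (1 - alpha^rho) * B_{1-alpha}(k, rho + 1) for 0 <= alpha < 1."""
    if not (0.0 <= alpha < 1.0) or k < 1 or rho <= 0:
        raise ValueError("need k >= 1, rho > 0 and 0 <= alpha < 1")
    norm = rho / (1.0 - alpha ** rho)
    return norm * incomplete_beta(1.0 - alpha, k, rho + 1.0)
```

With alpha = 0 the upper limit of the integral is 1 and the ordinary PMF ρ B(k, ρ + 1) is recovered. For alpha > 0 the values at large k fall off much faster than in the ordinary case, which illustrates the cutoff in the upper tail.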
- Rose, Colin; Smith, Murray D. (2002). Mathematical Statistics with Mathematica. New York: Springer. ISBN 0-387-95234-9. (See page 107, where it is called the "Yule distribution".)
- Simon, H. A. (1955). On a class of skew distribution functions. Biometrika 42 (3–4): 425–440.
- Garcia Garcia, Juan Manuel (2011). A fixed-point algorithm to estimate the Yule-Simon distribution parameter. Applied Mathematics and Computation 217 (21): 8560–8566.
- Yule, G. U. (1925). A Mathematical Theory of Evolution, based on the Conclusions of Dr. J. C. Willis, F.R.S. Philosophical Transactions of the Royal Society of London, Ser. B 213 (402–410): 21–87.
This page uses Creative Commons licensed content from Wikipedia.