## Statistical Science

### Strong, Weak and False Inverse Power Laws

Richard Perline

#### Abstract

Pareto, Zipf and numerous subsequent investigators of inverse power distributions have often represented their findings as though their data conformed to a power law form for all ranges of the variable of interest. I refer to this ideal case as a strong inverse power law (SIPL). However, many of the examples used by Pareto and Zipf, as well as others who have followed them, have been truncated data sets, and if one looks more carefully in the lower range of values that was originally excluded, the power law behavior usually breaks down at some point. This breakdown seems to fall into two broad cases, called here (1) weak and (2) false inverse power laws (WIPL and FIPL, respectively). Case 1 refers to the situation where the sample data fit a distribution that has an approximate inverse power form only in some upper range of values. Case 2 refers to the situation where a highly truncated sample from certain exponential-type (and in particular, “lognormal-like”) distributions can convincingly mimic a power law. The main objectives of this paper are (a) to show how the discovery of Pareto–Zipf-type laws is closely associated with truncated data sets; (b) to elaborate on the categories of strong, weak and false inverse power laws; and (c) to analyze FIPLs in some detail. I conclude that many, but not all, Pareto–Zipf examples are likely to be FIPL finite mixture distributions and that there are few genuine instances of SIPLs.
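The Case 2 (FIPL) effect described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the paper's own analysis: it assumes a lognormal population with `sigma=2.0`, a sample of 100,000 draws, truncation to the 500 largest values, and an ordinary least-squares line on the log-log rank-size plot as the (informal) measure of power law fit. The function name `loglog_rank_fit` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def loglog_rank_fit(values):
    """Fit log(rank) = intercept + slope * log(size) by least squares.

    Sizes are sorted in descending order so the largest value has rank 1.
    Returns (slope, r_squared).
    """
    sizes = np.sort(np.asarray(values, dtype=float))[::-1]
    ranks = np.arange(1, sizes.size + 1)
    x, y = np.log(sizes), np.log(ranks)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    r_squared = 1.0 - residuals.var() / y.var()
    return slope, r_squared

# Assumed illustration parameters (not taken from the paper).
rng = np.random.default_rng(0)
sample = rng.lognormal(mean=0.0, sigma=2.0, size=100_000)

# "Highly truncated" data set: keep only the 500 largest observations.
top = np.sort(sample)[-500:]

slope_top, r2_top = loglog_rank_fit(top)
slope_all, r2_all = loglog_rank_fit(sample)

print(f"top 500:     slope={slope_top:.2f}, R^2={r2_top:.3f}")
print(f"full sample: slope={slope_all:.2f}, R^2={r2_all:.3f}")
```

Under these assumptions the truncated upper tail plots as a nearly straight line on log-log axes, while the full sample shows the curvature of the underlying lognormal. A high R² on a truncated rank-size plot is therefore weak evidence for a genuine power law, which is the caution the abstract raises.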

#### Article information

Source
Statist. Sci., Volume 20, Number 1 (2005), 68-88.

Dates
First available in Project Euclid: 6 June 2005

Permanent link to this document
https://projecteuclid.org/euclid.ss/1118065043

Digital Object Identifier
doi:10.1214/088342304000000215

Mathematical Reviews number (MathSciNet)
MR2182988

Zentralblatt MATH identifier
1100.62013

#### Citation

Perline, Richard. Strong, Weak and False Inverse Power Laws. Statist. Sci. 20 (2005), no. 1, 68--88. doi:10.1214/088342304000000215. https://projecteuclid.org/euclid.ss/1118065043

#### References

• Aitchison, J. and Brown, J. A. C. (1957). The Lognormal Distribution. Cambridge Univ. Press.
• Albert, R., Jeong, H. and Barabási, A.-L. (1999). Diameter of the World-Wide Web. Nature 401 130.
• Amaral, L. A. N., Scala, A., Barthelemy, M. and Stanley, H. E. (2000). Classes of small-world networks. Proc. Natl. Acad. Sci. U.S.A. 97 11,149--11,152.
• American Iron and Steel Institute (1957). Directory of Iron and Steel Works of the United States and Canada, 28th ed. American Iron and Steel Institute, New York.
• Arnold, B. C. (1983). Pareto Distributions. International Co-operative Publishing House, Burtonsville, MD.
• Asmussen, S., Klüppelberg, C. and Sigman, K. (1999). Sampling at subexponential times, with queueing applications. Stochastic Process. Appl. 79 265--286.
• Auerbach, F. (1913). Das Gesetz der Bevölkerungskonzentration. Petermanns Geographische Mitteilungen 59 74--76.
• Bak, P. (1996). How Nature Works. Copernicus, New York.
• Barabási, A.-L. (2002). Linked: The New Science of Networks. Perseus, Cambridge, MA.
• Barabási, A.-L. and Albert, R. (1999). Emergence of scaling in random networks. Science 286 509--512.
• Barabási, A.-L. and Bonabeau, E. (2003). Scale-free networks. Scientific American 288 60--69.
• Berg, L. (1958). Asymptotische Darstellungen für Integrale und Reihen mit Anwendungen. Math. Nachr. 17 101--135.
• Bianconi, G. and Barabási, A.-L. (2001). Competition and multiscaling in evolving networks. Europhys. Lett. 54 436--442.
• Bookstein, A. (1997). Informetric distributions. III. Ambiguity and randomness. J. American Society for Information Science 48 2--10.
• Bowley, A. L. (1899). The statistics of wages in the United Kingdom during the last hundred years. Part IV. Agricultural wages. J. Roy. Statist. Soc. 62 555--570.
• Box, G. E. P. and Muller, M. E. (1958). A note on the generation of random normal deviates. Ann. Math. Statist. 29 610--611.
• Bulmer, M. G. (1974). On fitting the Poisson lognormal distribution to species-abundance data. Biometrics 30 101--110.
• David, H. A. (1970). Order Statistics. Wiley, New York.
• Downey, A. B. (2003). Lognormal and Pareto distributions in the internet. Available at http://allendowney.com/research/longtail.
• Edwards, A. M. (1943). Sixteenth Census of the United States, 1940. Population. Comparative Occupation Statistics for the United States, 1870 to 1940. U.S. Government Printing Office, Washington.
• Embrechts, P., Klüppelberg, C. and Mikosch, T. (1997). Modelling Extremal Events. Springer, Berlin.
• Fishman, G. S. and Moore, L. R. (1982). A statistical evaluation of multiplicative congruential random number generators with modulus $2^{31}-1$. J. Amer. Statist. Assoc. 77 129--136.
• Gibrat, R. (1931). Les Inégalités Économiques. Librairie du Recueil Sirey, Paris.
• Gong, W., Liu, Y., Misra, V. and Towsley, D. (2001). On the tails of Web file size distributions. In Proc. 39th Annual Allerton Conference on Communication, Control and Computing. Univ. Illinois Press, Champaign. Available at http://www1.cs.columbia.edu/~misra/pubs/allerton.pdf.
• Graham, R. L., Knuth, D. E. and Patashnik, O. (1994). Concrete Mathematics: A Foundation for Computer Science, 2nd ed. Addison--Wesley, Reading, MA.
• Grandell, J. (1997). Mixed Poisson Processes. Chapman and Hall, New York.
• Hall, P. (1979). On the rate of convergence of normal extremes. J. Appl. Probab. 16 433--439.
• Hanley, M. L. (1937). Word Index to James Joyce's Ulysses. Univ. Wisconsin Press, Madison.
• Ijiri, Y. and Simon, H. A. (1977). Skew Distributions and the Sizes of Business Firms. North-Holland, Amsterdam.
• Johnson, N. L., Kotz, S. and Balakrishnan, N. (1994). Distributions in Statistics: Continuous Univariate Distributions 1, 2nd ed. Wiley, New York.
• Kendall, M. G. (1961). Natural law in the social sciences. J. Roy. Statist. Soc. Ser. A 124 1--16.
• Klein, L. R. (1962). An Introduction to Econometrics. Prentice--Hall, Englewood Cliffs, NJ.
• Korčák, J. (1938). Deux types fondamentaux de distribution statistique. Bull. Inst. Internat. Statist. 30(3) 295--298.
• Krugman, P. (1996). The Self-Organizing Economy. Blackwell, Cambridge, MA.
• Lebergott, S. (1959). The shape of the income distribution. American Economic Review 49 328--347.
• Lotka, A. J. (1926). The frequency distribution of scientific productivity. J. Washington Academy of Sciences 16 317--323.
• Macaulay, F. (1922). Pareto's law and the general problem of mathematically describing the frequency distribution of income. In Income of the United States. Its Amount and Distribution 1909--1919 2 Chap. 23. National Bureau of Economic Research, New York.
• Mandelbrot, B. (1960). The Pareto--Lévy law and the distribution of income. Internat. Econom. Rev. 1 79--106.
• Mandelbrot, B. (1982). The Fractal Geometry of Nature. W. H. Freeman, San Francisco.
• Mandelbrot, B. (1997). Fractals and Scaling in Finance: Discontinuity, Concentration, Risk. Springer, New York.
• McLachlan, G. J. and Krishnan, T. (1997). The EM Algorithm and Extensions. Wiley, New York.
• Mitzenmacher, M. (2001). A brief history of generative models for power law and lognormal distributions. In Proc. 39th Annual Allerton Conference on Communication, Control and Computing 182--191. Univ. Illinois Press, Champaign.
• Montroll, E. and Shlesinger, M. F. (1982). On $1/f$ noise and other distributions with long tails. Proc. Natl. Acad. Sci. U.S.A. 79 3380--3383.
• Montroll, E. and Shlesinger, M. F. (1983). Maximum entropy formalism, fractals, scaling phenomena, and $1/f$ noise: A tale of tails. J. Statist. Phys. 32 209--230.
• National Resources Committee (1938). Consumer Incomes in the United States: Their Distribution in 1935--36. U.S. Government Printing Office, Washington, DC.
• Paddock, R. H. and Rodgers, R. P. (1939). Preliminary results of road-use studies. Public Roads 20 45--63.
• Pareto, V. (1895). La legge della domanda. Giornale degli Economisti 45--63.
• Pareto, V. (1897). Cours d'Économie Politique 2. F. Rouge, Lausanne.
• Parr, J. B. and Suzuki, K. (1973). Settlement populations and the lognormal distribution. Urban Studies 10 335--352.
• Perline, R. (1982). An extreme value model of weakly harmonic (Pareto--Zipf type) laws. Ph.D. dissertation, Univ. Chicago.
• Perline, R. (1996). Zipf's law, the central limit theorem, and the random division of the unit interval. Phys. Rev. E 54 220--223.
• Perline, R. (1998). Mixed Poisson distributions tail equivalent to their mixing distributions. Statist. Probab. Lett. 38 229--233.
• Polfeldt, T. (1970). Asymptotic results in non-regular estimation. Skand. Aktuarietidskr. 1970 suppl. 1--78.
• Price, D. J. de S. (1963). Little Science, Big Science. Columbia Univ. Press, New York.
• Reed, W. J. (2001). The Pareto, Zipf and other power laws. Econom. Lett. 74 15--19.
• Reed, W. J. and Hughes, B. D. (2002). From gene families and genera to incomes and internet file sizes: Why power laws are so common in nature. Phys. Rev. E 66 067103.
• Sichel, H. S. (1975). On a distribution law for word frequencies. J. Amer. Statist. Assoc. 70 542--547.
• Simon, H. A. (1955). On a class of skew distribution functions. Biometrika 42 425--440. Also in Ijiri and Simon (1977).
• Simon, H. A. and Bonini, C. P. (1958). The size distribution of business firms. American Economic Review 48 607--617. Also in Ijiri and Simon (1977).
• Stamp, J. (1914). A new illustration of Pareto's law. J. Roy. Statist. Soc. 77 200--204.
• Stewart, J. (1994). The Poisson--lognormal model for bibliometric/scientometric distributions. Information Processing and Management 30 239--251.
• Thatcher, A. R. (1976). The new earnings survey and the distribution of earnings. In The Personal Income Distribution (A. B. Atkinson, ed.) 227--268. Westview Press, Boulder, CO.
• Watts, D. J. (2003). Six Degrees: The Science of a Connected Age. Norton, New York.
• Zipf, G. K. (1947). The frequency and diversity of business establishments and personal occupations: A study of social stereotypes and cultural roles. J. Psychology 24 139--148.
• Zipf, G. K. (1949). Human Behavior and the Principle of Least Effort. Addison--Wesley, Cambridge, MA.