The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 4, Number 1 (2010), 26-52.
Analysis of dependence among size, rate and duration in internet flows
In this paper we rigorously examine the evidence for dependence among data size, transfer rate and duration in Internet flows. We emphasize two statistical approaches for studying dependence: Pearson’s correlation coefficient and the extremal dependence analysis method. We apply these methods to large data sets of packet traces from three networks. Our major results show that Pearson’s correlation coefficients between size and duration are much smaller than one might expect. We also find that correlation coefficients between size and rate are generally small and can be strongly affected by applying thresholds to size or duration. Based on Transmission Control Protocol connection startup mechanisms, we argue that thresholds on size should be more useful than thresholds on duration in the analysis of correlations. Using extremal dependence analysis, we reach a similar conclusion, finding remarkable independence for extremal values of size and rate.
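As a rough illustration of how a size threshold can change a correlation estimate, the following sketch computes Pearson's correlation coefficient between size and rate on purely synthetic flows. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `pearson` and `thresholded_correlation`, the lognormal model with independent size and duration, the log transform, and the threshold values. Rate is taken to be size divided by duration.

```python
import math
import random


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def thresholded_correlation(flows, min_size=0.0):
    """Correlation of log(size) and log(rate) over flows with size >= min_size.

    flows: iterable of (size_bytes, duration_seconds) pairs.
    The log transform is an illustrative choice, not the paper's procedure.
    """
    kept = [(s, d) for s, d in flows if s >= min_size and s > 0 and d > 0]
    log_size = [math.log(s) for s, _ in kept]
    log_rate = [math.log(s / d) for s, d in kept]  # rate = size / duration
    return pearson(log_size, log_rate)


# Synthetic flows (NOT real trace data): size and duration drawn independently.
rng = random.Random(0)
flows = [(rng.lognormvariate(8, 2), rng.lognormvariate(0, 1.5))
         for _ in range(10000)]

for threshold in (0.0, 1e3, 1e5):
    r = thresholded_correlation(flows, min_size=threshold)
    print(f"size >= {threshold:g}: r = {r:.3f}")
```

In this toy model the estimate falls as the size threshold rises, because restricting size shrinks its variance relative to that of duration. This only shows that thresholding alone can move the coefficient; it says nothing about the values observed in the packet traces.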
First available in Project Euclid: 11 May 2010
Park, Cheolwoo; Hernández-Campos, Felix; Marron, J. S.; Jeffay, Kevin; Smith, F. Donelson. Analysis of dependence among size, rate and duration in internet flows. Ann. Appl. Stat. 4 (2010), no. 1, 26--52. doi:10.1214/09-AOAS268. https://projecteuclid.org/euclid.aoas/1273584446