Electronic Journal of Statistics

Attributing hacks with survival trend filtering

Ziqi Liu, Alexander Smola, Kyle Soska, Yu-Xiang Wang, Qinghua Zheng, and Jun Zhou


Abstract

In this paper we describe an algorithm for estimating the provenance of hacks on websites. That is, given properties of sites and the temporal occurrence of attacks, we are able to attribute individual attacks to joint causes and vulnerabilities, as well as estimate the evolution of these vulnerabilities over time. Specifically, we use hazard regression with a time-varying additive hazard function parameterized in a generalized linear form. The activation coefficients on each feature are functions of continuous time. We formulate the problem of learning these functions as a constrained variational maximum likelihood estimation problem with a total variation penalty and show that the optimal solution is a $0$th order spline (a piecewise constant function) with a finite number of adaptively chosen knots. This allows the inference problem to be solved efficiently and at scale by solving a finite-dimensional optimization problem. Extensive experiments on real data sets show that our method significantly outperforms Cox’s proportional hazards model. We also conduct case studies and verify that the fitted functions of the features respond to real-life campaigns.
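
The finite-dimensional problem mentioned in the abstract can be illustrated with a small sketch. The code below is not the paper's implementation. It assumes a single nonnegative feature `x`, a hazard of the form `x * w(t)` with `w` piecewise constant on fixed knots, and right-censored observations. All function names (`cumulative_hazard`, `hazard_at`, `total_variation`, `penalized_nll`) are hypothetical and chosen for illustration only. It evaluates a survival negative log-likelihood plus a total variation penalty on the coefficient values.

```python
from bisect import bisect_right
from math import log


def cumulative_hazard(knots, values, t):
    """Integrate a piecewise-constant function from 0 to t (t >= 0).

    values[k] applies on [knots[k], knots[k+1]); the last value
    extends to infinity. Assumes knots[0] == 0 and sorted knots.
    """
    total = 0.0
    for k, v in enumerate(values):
        left = knots[k]
        right = knots[k + 1] if k + 1 < len(knots) else float("inf")
        if t <= left:
            break
        total += v * (min(t, right) - left)
    return total


def hazard_at(knots, values, t):
    """Value of the piecewise-constant function at time t >= 0."""
    k = max(bisect_right(knots, t) - 1, 0)
    return values[k]


def total_variation(values):
    """Sum of absolute jumps between consecutive pieces."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))


def penalized_nll(knots, values, data, lam):
    """Negative log-likelihood plus lam * total variation.

    data is a list of (x, time, observed) tuples; observed is False
    for right-censored records. Illustrative assumption: the hazard
    for a record with feature x is x * w(t).
    """
    nll = 0.0
    for x, t, observed in data:
        if observed:
            nll -= log(x * hazard_at(knots, values, t))
        nll += x * cumulative_hazard(knots, values, t)
    return nll + lam * total_variation(values)


knots = [0.0, 1.0, 3.0]
values = [0.5, 2.0, 1.0]
data = [(1.0, 2.0, True), (2.0, 0.5, False)]
objective = penalized_nll(knots, values, data, lam=1.0)
```

A larger `lam` penalizes jumps between consecutive pieces more heavily, so a minimizer of this objective would tend to have fewer distinct pieces. Here the knots are fixed in advance for simplicity, whereas the abstract states that the method chooses its knots adaptively.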

Article information

Source
Electron. J. Statist., Volume 11, Number 2 (2017), 5311-5341.

Dates
Received: June 2017
First available in Project Euclid: 15 December 2017

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1513306875

Digital Object Identifier
doi:10.1214/17-EJS1380SI

Mathematical Reviews number (MathSciNet)
MR3738213

Zentralblatt MATH identifier
06825048

Keywords
Hazard regression; nonparametrics; trend filtering; survival analysis

Rights
Creative Commons Attribution 4.0 International License.

Citation

Liu, Ziqi; Smola, Alexander; Soska, Kyle; Wang, Yu-Xiang; Zheng, Qinghua; Zhou, Jun. Attributing hacks with survival trend filtering. Electron. J. Statist. 11 (2017), no. 2, 5311–5341. doi:10.1214/17-EJS1380SI. https://projecteuclid.org/euclid.ejs/1513306875


