## Electronic Journal of Statistics

### Attributing hacks with survival trend filtering

#### Abstract

In this paper we describe an algorithm for estimating the provenance of hacks on websites. That is, given properties of sites and the temporal occurrence of attacks, we are able to attribute individual attacks to joint causes and vulnerabilities, as well as estimate the evolution of these vulnerabilities over time. Specifically, we use hazard regression with a time-varying additive hazard function parameterized in a generalized linear form. The activation coefficients on each feature are continuous-time functions. We formulate the problem of learning these functions as a constrained variational maximum likelihood estimation problem with a total variation penalty and show that the optimal solution is a $0$th order spline (a piecewise constant function) with a finite number of adaptively chosen knots. This allows the inference problem to be solved efficiently and at scale by solving a finite-dimensional optimization problem. Extensive experiments on real data sets show that our method significantly outperforms Cox’s proportional hazards model. We also conduct case studies and verify that the fitted functions of the features respond to real-life campaigns.
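As a rough illustration of the objective the abstract describes, the sketch below evaluates a total-variation-penalized negative log-likelihood for an additive hazard whose per-feature coefficients are piecewise constant in time. It is not the paper's implementation. The function name `penalized_nll`, the identity link, the fixed knot grid (the paper's knots are chosen adaptively), and the small floor inside the logarithm are all assumptions made for this example.

```python
import numpy as np

def penalized_nll(W, knots, X, times, events, lam):
    """Penalized negative log-likelihood for a piecewise-constant additive hazard.

    Illustrative assumptions (not taken from the paper):
      hazard of subject i in time bin k is rates[i, k] = X[i] @ W[:, k],
      assumed non-negative; the knots are a fixed, increasing grid of bin edges.

    W      : (n_features, n_bins) coefficient of each feature in each bin
    knots  : (n_bins + 1,) bin edges starting at 0
    X      : (n_subjects, n_features) site features
    times  : (n_subjects,) event or censoring times
    events : (n_subjects,) 1 if an attack was observed, 0 if censored
    lam    : weight of the total variation penalty
    """
    widths = np.diff(knots)
    # Time each subject spends in each bin: overlap of [0, t_i] with the bin.
    exposure = np.clip(times[:, None] - knots[None, :-1], 0.0, widths[None, :])
    rates = X @ W
    cum_hazard = (rates * exposure).sum(axis=1)
    # Index of the bin that contains each subject's time.
    bins = np.clip(np.searchsorted(knots, times, side="right") - 1,
                   0, len(widths) - 1)
    hazard_at_t = rates[np.arange(len(times)), bins]
    # Floor avoids log(0); censored subjects contribute no log-hazard term.
    log_lik = (events * np.log(np.maximum(hazard_at_t, 1e-12))).sum() - cum_hazard.sum()
    # Total variation of each coefficient function across adjacent bins.
    tv = np.abs(np.diff(W, axis=1)).sum()
    return -log_lik + lam * tv
```

With a constant coefficient the penalty term vanishes, so only jumps between adjacent bins are charged. This is what encourages solutions with few knots in a formulation of this kind.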

#### Article information

Source
Electron. J. Statist., Volume 11, Number 2 (2017), 5311-5341.

Dates
First available in Project Euclid: 15 December 2017

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1513306875

Digital Object Identifier
doi:10.1214/17-EJS1380SI

Mathematical Reviews number (MathSciNet)
MR3738213

Zentralblatt MATH identifier
06825048

#### Citation

Liu, Ziqi; Smola, Alexander; Soska, Kyle; Wang, Yu-Xiang; Zheng, Qinghua; Zhou, Jun. Attributing hacks with survival trend filtering. Electron. J. Statist. 11 (2017), no. 2, 5311-5341. doi:10.1214/17-EJS1380SI. https://projecteuclid.org/euclid.ejs/1513306875
