The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 12, Number 4 (2018), 2279-2311.
Single stage prediction with embedded topic modeling of online reviews for mobile app management
Mobile apps are one of the building blocks of the mobile digital economy. A feature that differentiates mobile apps from traditional enterprise software is online reviews, which are available on app marketplaces and represent a valuable source of consumer feedback on the app. We develop a supervised topic modeling approach that lets app developers use mobile reviews as a source of quality and customer feedback, thereby complementing traditional software testing. The approach is based on a constrained matrix factorization that leverages the relationship between term frequency and a given response variable, in addition to co-occurrences between terms, to recover topics that are both predictive of consumer sentiment and useful for understanding the underlying textual themes. The factorization is combined with ordinal regression to provide guidance from online reviews on a single app's performance, as well as to systematically compare different apps over time for benchmarking of features and consumer sentiment. We apply our approach to a dataset of more than 100,000 mobile reviews, collected over several years, for three of the most popular online travel agent apps from the iTunes and Google Play marketplaces.
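As a rough illustration of the matrix factorization idea described in the abstract, the sketch below runs plain, unsupervised nonnegative matrix factorization (Lee-Seung multiplicative updates) on a toy document-term matrix. It is not the authors' constrained, supervised factorization: the response variable does not enter the objective here, and no ordinal regression is fitted. The function name `nmf`, the term list, and the toy counts are all hypothetical and chosen only for illustration.

```python
import numpy as np


def nmf(X, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Plain NMF sketch: find nonnegative W, H with X approximately W @ H.

    Uses Lee-Seung multiplicative updates for the squared-error objective.
    This is a generic stand-in, not the constrained factorization of the paper.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_terms = X.shape
    W = rng.random((n_docs, n_topics)) + eps   # document-topic weights
    H = rng.random((n_topics, n_terms)) + eps  # topic-term weights
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Hypothetical toy data: 4 reviews (rows) by 6 terms (columns), raw counts.
terms = ["crash", "login", "bug", "price", "deal", "cheap"]
X = np.array(
    [
        [3, 2, 1, 0, 0, 0],
        [2, 3, 2, 0, 0, 0],
        [0, 0, 0, 3, 2, 2],
        [0, 0, 0, 2, 3, 1],
    ],
    dtype=float,
)

W, H = nmf(X, n_topics=2)

# Relative reconstruction error of the rank-2 nonnegative approximation.
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)

# Three highest-weighted terms for each recovered topic.
top_terms = [[terms[j] for j in np.argsort(H[k])[::-1][:3]] for k in range(2)]
print(top_terms, round(float(rel_error), 3))
```

In a supervised, single-stage variant such as the one the abstract describes, the factorization would additionally be tied to the review ratings, so that the recovered topics are predictive of sentiment rather than only descriptive of term co-occurrence.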
Received: July 2016
Revised: February 2018
First available in Project Euclid: 13 November 2018
Mankad, Shawn; Hu, Shengli; Gopal, Anandasivam. Single stage prediction with embedded topic modeling of online reviews for mobile app management. Ann. Appl. Stat. 12 (2018), no. 4, 2279-2311. doi:10.1214/18-AOAS1152. https://projecteuclid.org/euclid.aoas/1542078045
- Raw data and R code. The zip file contains the raw online reviews data for the three apps on both platforms in addition to implementations in R of the proposed matrix factorization.