The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 12, Number 4 (2018), 2054-2074.
Multi-rubric models for ordinal spatial data with application to online ratings data
Interest in online rating data, in which users of a website such as Yelp! or Amazon provide ordinal ratings of products or local businesses, has increased in recent years. One source of heterogeneity in ratings is that users apply different standards when supplying their ratings; even if two users benefit from a product by the same amount, they may translate that benefit into ratings in different ways. In this article we propose an ordinal data model, which we refer to as a multi-rubric model, that treats the criteria used to convert a latent utility into a rating as user-specific random effects, with the distribution of these random effects modeled nonparametrically. We demonstrate that this approach can account for this type of variability in addition to the usual sources of heterogeneity: item quality, user biases, interactions between items and users, and the spatial structure of the users and items. We apply the model to publicly available data from the website Yelp! and demonstrate that it produces interpretable clusterings of users according to their rating behavior, in addition to providing better predictions of ratings and better summaries of overall item quality.
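The central idea of the abstract, that two users with the same latent utility can report different ratings because they apply different rubrics, can be illustrated with a small simulation. This is a minimal sketch and not the authors' model or code: the three named rubrics, their cutpoint values, the Gaussian utilities, and the functions `rate` and `simulate` are all hypothetical, and a fixed finite set of rubrics stands in for the nonparametric random-effects distribution described above. Spatial structure and user-item interactions are omitted.

```python
import bisect
import random

# Hypothetical rubrics (assumed for illustration): each is an increasing list
# of 4 cutpoints that splits the latent utility scale into 5 ordinal ratings.
RUBRICS = {
    "lenient": [-2.0, -1.5, -1.0, 0.0],
    "neutral": [-1.5, -0.5, 0.5, 1.5],
    "harsh": [0.0, 1.0, 1.5, 2.0],
}


def rate(utility, cutpoints):
    """Map a latent utility to an ordinal rating in 1..len(cutpoints) + 1."""
    # The rating is one plus the number of cutpoints at or below the utility.
    return bisect.bisect_right(cutpoints, utility) + 1


def simulate(n_users=6, n_items=3, seed=0):
    """Simulate (user, item, rubric, rating) rows under the assumptions above."""
    rng = random.Random(seed)
    # Item quality shifts the latent utility of every user who rates the item.
    item_quality = [rng.gauss(0.0, 1.0) for _ in range(n_items)]
    rows = []
    for user in range(n_users):
        # Each user is assigned a single rubric, used for all of their ratings.
        rubric = rng.choice(sorted(RUBRICS))
        for item, quality in enumerate(item_quality):
            utility = quality + rng.gauss(0.0, 0.5)
            rows.append((user, item, rubric, rate(utility, RUBRICS[rubric])))
    return rows


if __name__ == "__main__":
    # The same latent utility is reported differently under each rubric.
    for name in sorted(RUBRICS):
        print(name, rate(0.25, RUBRICS[name]))
    for row in simulate():
        print(row)
```

Under these assumed cutpoints, a single utility value maps to a high rating for a lenient user and a low one for a harsh user, which is the kind of heterogeneity a single shared set of cutpoints cannot represent.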
Received: September 2017
Revised: December 2017
First available in Project Euclid: 13 November 2018
Linero, Antonio R.; Bradley, Jonathan R.; Desai, Apurva. Multi-rubric models for ordinal spatial data with application to online ratings data. Ann. Appl. Stat. 12 (2018), no. 4, 2054-2074. doi:10.1214/18-AOAS1143. https://projecteuclid.org/euclid.aoas/1542078036
- Identifiability of model parameters. In this supplementary material we discuss identifiability of the model parameters; we give empirical evidence that the latent variables are identified up to orthogonal transformations.