PTEM: A popularity-based topical expertise model for community question answering

Hohyun Jung; Jae-Gil Lee; Namgil Lee; Sung-Ho Kim

doi:10.1214/20-AOAS1346

Ann. Appl. Stat. 14(3): 1304-1325 (September 2020). DOI: 10.1214/20-AOAS1346

Abstract

Community Question Answering (CQA) websites are widely used for sharing knowledge: users can ask questions, post answers and evaluate answers. So far, the evaluation of answers has been explained by the content of the answers, through investigation of users’ topics of interest and expertise levels. In this paper we focus on modeling users’ evaluation behavior, given that users can see the answerer’s profile as well as the answer content before evaluating the quality of an answer. We propose the Popularity-based Topical Expertise Model (PTEM), a generative model for analyzing the rich-get-richer phenomenon in which popular users’ answers receive more recommendations. The topical expertise of each user and the strength of the rich-get-richer effect are estimated simultaneously through the EM algorithm combined with collapsed Gibbs sampling. Experiments on StackExchange data demonstrate a rich-get-richer phenomenon in the community. We further discuss the superiority and usefulness of the proposed model through an analysis in the discipline of philosophy.

Citation

Hohyun Jung, Jae-Gil Lee, Namgil Lee, and Sung-Ho Kim "PTEM: A popularity-based topical expertise model for community question answering," The Annals of Applied Statistics 14(3), 1304-1325, (September 2020). https://doi.org/10.1214/20-AOAS1346

Received: 1 August 2019; Published: September 2020
