New Paper: Exploring the social influence of Kaggle virtual community on the M5 competition

Authors: Xixi Li, Yun Bai, Yanfei Kang

Abstract: One of the most significant differences between M5 and previous forecasting competitions is that it was held on Kaggle, an online community of data scientists and machine learning practitioners. On the Kaggle platform, people can form virtual communities through online notebooks and discussions, where they discuss their models, choice of features, loss functions, etc. This paper aims to study the social influence of these virtual communities on the competition. We first study the content of the M5 virtual community using topic modeling and trend analysis. We then perform social media analysis to identify the potential relationship network of the virtual community. We find some key roles in the network and study how they spread LightGBM-related information within the network. Overall, this study provides in-depth insights into the dynamic mechanism of the virtual community’s influence on the participants and has potential implications for future online competitions.

Links: Working paper

Published by

Yanfei Kang

Dr. Yanfei Kang is an Associate Professor of Statistics at Beihang University in China. Prior to that, she was a Senior R&D Engineer in the Big Data Group of Baidu Inc. Yanfei obtained her Ph.D. degree from Monash University in 2014 and worked there as a postdoctoral research fellow from 2014 to 2015. Her research interests include time series forecasting, time series visualization, text mining, and statistical computing.

