Designing Informative Rating Systems: Evidence from an Online Labor Market

Nikhil Garg and Ramesh Johari

Platforms critically rely on rating systems to learn the quality of market participants. In practice, however, these ratings are often highly inflated, drastically reducing the signal available to distinguish quality. We consider two questions: First, can rating systems better discriminate quality by altering the meaning and relative importance of the levels in the rating system? And second, if so, how should the platform optimize these choices in the design of its rating system? We first analyze the results of a randomized controlled trial on an online labor market in which an additional question was added to the feedback form. Across treatment conditions, we vary the question phrasing and answer choices. We further run an experiment of similar structure on Amazon Mechanical Turk to confirm the labor market findings. Our tests reveal that current inflationary norms can in fact be countered by re-anchoring the meaning of the levels of the rating system. In particular, scales that are positive-skewed and provide specific interpretations of what each label means yield rating distributions that are much more informative about quality. Second, we develop a theoretical framework for optimizing the design of a rating system by choosing answer labels and their numeric interpretations so as to maximize the rate of convergence to the true underlying quality distribution. Finally, we run simulations with an empirically calibrated model and use them to study the implications for optimal rating system design. These simulations demonstrate that our modeling and optimization approach can substantially improve the quality of information obtained over baseline designs. Overall, our study illustrates that rating systems that are informative in practice can indeed be designed, and demonstrates how to design them in a principled manner.
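To make the abstract's central intuition concrete, the following is a minimal illustrative sketch (not the paper's model or data): it assumes hypothetical rating-level probabilities for a baseline, inflated scale and for a re-anchored, positive-skewed scale, together with assumed numeric weights for the levels, and estimates how quickly the weighted average rating correctly ranks a high-quality seller above a low-quality one as ratings accumulate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: two seller types ("high" and "low" quality) and two
# scale designs. Under the baseline scale, ratings are inflated and nearly
# everyone receives the top level; under the re-anchored, positive-skewed
# scale, ratings spread out and better reflect quality. All probabilities
# below are illustrative assumptions, not estimates from the paper.
BASELINE = {          # P(rating level | quality type), levels 1..4
    "high": [0.01, 0.02, 0.07, 0.90],
    "low":  [0.02, 0.05, 0.13, 0.80],
}
REANCHORED = {        # more dispersed ratings under the re-anchored scale
    "high": [0.05, 0.15, 0.40, 0.40],
    "low":  [0.25, 0.40, 0.25, 0.10],
}
# Assumed numeric interpretation (weight) the platform assigns to each level.
WEIGHTS = np.array([1.0, 2.0, 3.0, 4.0])


def ranking_accuracy(design, n_ratings, n_trials=2000):
    """Fraction of trials in which a high-quality seller's average weighted
    rating exceeds a low-quality seller's after n_ratings each."""
    wins = 0
    for _ in range(n_trials):
        high = rng.choice(4, size=n_ratings, p=design["high"])
        low = rng.choice(4, size=n_ratings, p=design["low"])
        wins += WEIGHTS[high].mean() > WEIGHTS[low].mean()
    return wins / n_trials


for n in (5, 20, 50):
    print(f"n={n:3d}  baseline: {ranking_accuracy(BASELINE, n):.2f}  "
          f"re-anchored: {ranking_accuracy(REANCHORED, n):.2f}")
```

Under these assumed distributions, the re-anchored scale separates the two quality types with far fewer ratings, which is the sense in which a well-chosen design speeds convergence to the underlying quality ordering.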