Online platform giants such as Facebook, Reddit, and Wikipedia may attract millions of users each day who generate and share content, but that doesn’t mean these sites can run themselves. Behind the scenes, companies must rely on both professional and volunteer content moderators who keep order by monitoring and removing content, issuing warnings, and creating community guidelines and rules.
A recent study by Qinglai He, assistant professor of operations and information management at the Wisconsin School of Business, looks at what happens when “bot moderators” are used alongside their human counterparts. Bot moderators are machine-powered tools, driven by algorithms, that can perform many of the tasks online platforms require.
In her study, He analyzed data from 156 Reddit community forums (known as “subreddits”) that used both bots and human volunteer moderators. She found that introducing bots freed up enough of the volunteer moderators’ time that they performed 20.2% more corrective moderation actions enforcing community guidelines and offered 14.9% more supportive explanations to platform users. In effect, the volunteers took on more of a managerial role, delegating routine tasks to the bot moderators.
WSB sat down with He to talk about her work:
WSB: Tell us more about your study. Machine learning is an emerging area within the larger field of artificial intelligence (AI), is that right?
He: Yes. I believe what I’m studying now in the online platform space is a very important question: How can we create a healthy online environment, and how can we stimulate more effective user interaction on these platforms? Machine learning and the issues surrounding it are still pretty new. Few studies have looked at the role of algorithms in community governance, so this idea got me very excited.
I take a mixed approach, both topic-wise and methods-wise, in my work. On the methods side, for example, I’m doing quantitative empirical research, but I’m also applying state-of-the-art machine learning techniques. In my field, people talk about how mixed methods are becoming more and more popular because they allow us to go beyond traditional social science research.
I use natural language processing (NLP) techniques a lot. NLP is a branch of computer science that enables computers to understand human language and text. Since most of my study data is text and I often work with over a million records, I use these tools to understand and process the data at scale.
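To make this concrete, here is a minimal, purely illustrative sketch of one common NLP workflow for surfacing themes in a large body of moderation text. It is not He’s actual pipeline; the sample comments, model settings, and library choice (scikit-learn) are assumptions made for illustration.

```python
# Illustrative only: a toy topic-modeling pass over moderation comments,
# standing in for the kind of large-scale text processing He describes.
# The sample data and parameters are hypothetical, not from the study.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Stand-in for what would be millions of moderation records
comments = [
    "Your post was removed because it violates rule 3.",
    "Please add a source link when posting visualizations.",
    "This submission is off-topic for the community.",
    "Reposts of recent content are not allowed here.",
]

# Turn the raw text into a document-term matrix
vectorizer = CountVectorizer(stop_words="english")
doc_term = vectorizer.fit_transform(comments)

# Fit a small topic model to surface recurring moderation themes
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(doc_term)

# Print the top words in each discovered topic
terms = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top_words = [terms[j] for j in topic.argsort()[-5:][::-1]]
    print(f"Topic {i}: {', '.join(top_words)}")
```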
WSB: Your paper mentions that one of the drivers behind introducing bot moderators to a platform is that the moderators, often volunteers, are overwhelmed by the workload. It’s too much.
He: There are several issues we are looking at with the volunteer moderators and the bots, and they are actually closely related. First, the volunteers can get burned out very quickly, because this is an unpaid job and they also have to handle conflict with and among users. We’re looking at whether we can make this volunteer-based model more sustainable.
Second, scalability is a big issue. Can the volunteer base, combined with the bot moderators, scale to handle large volumes of moderation, and can it generate more desired content? When I say “desired,” what I mean is: If I’m told as a user that my content is not approved, I want more clarity from the moderator. Maybe they can give me a good explanation of why that happened, guide me through the process, or tell me why my content doesn’t fit that particular community. But because most moderators are volunteers dealing with large amounts of data, it’s challenging for them to provide that more careful, more desired moderation.
What I found in the study is that after volunteers received help from bot moderators, they completed more voluntary moderation because they no longer had to handle repetitive tasks; they could now offer more suggestions and explanations to users. At the same time, volunteer retention in that community increased. So the findings suggest that bot moderators can help the volunteers achieve better outcomes.
WSB: Tell us more about Reddit and why you chose that site for your study. You mention, for example, how the transparency of the data—its public availability—was a factor in that decision.
He: There are quite a few reasons, the first of which is that Reddit is very large. At the time I wrote the paper, it was the fifth largest social media platform in the world. That alone is very important, as it reaches so many people, but so is the fact that the data set is largely intact, available, and transparent. Additionally, the way the site uses moderators is more approachable than on many other platforms.
During this time, I was also a moderator for a data visualization community on Reddit called r/dataisbeautiful. Since I was working on this topic, I wanted to get a deeper experience of that role; it did help me understand the context much more.
For example, the bot moderator can only take care of simple, very objective tasks. If a task is more complex or a community rule requires subjective judgment, it still takes human effort to complete. In my community at the time, the number one rule was that a visualization must be of high quality. But how do you define high quality? Most of the time, we did not use bots for that because concepts such as high quality are really hard to quantify. So even after a site implements a bot, it doesn’t completely substitute for human judgment.
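As a rough illustration of that divide, the sketch below shows how an objective rule (a banned phrase, too many links) can be checked mechanically, while a subjective rule like “high quality” cannot. This is a hypothetical example, not Reddit’s AutoModerator and not the bots analyzed in the study.

```python
# Hypothetical sketch of the objective/subjective divide in bot moderation;
# the rules and thresholds here are invented for illustration.
import re

def violates_objective_rules(post_text: str) -> bool:
    """Checks a bot can apply mechanically, with no judgment involved."""
    if re.search(r"\bbuy now\b", post_text, re.IGNORECASE):  # spam phrase
        return True
    if len(re.findall(r"https?://", post_text)) > 3:  # excessive links
        return True
    return False

def route_post(post_text: str) -> str:
    # A subjective rule such as "the visualization must be high quality"
    # has no mechanical test here, so anything the bot cannot decide
    # is routed to the human moderation queue.
    return "remove" if violates_objective_rules(post_text) else "human_review"

print(route_post("BUY NOW!! Limited offer!"))          # -> remove
print(route_post("My chart of global CO2 emissions"))  # -> human_review
```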
“The Effects of Machine-Powered Platform Governance: An Empirical Study of Content Moderation” was selected as a 2021 runner-up for the prestigious ACM SIGMIS Doctoral Dissertation Award.
Read the paper: “The Effects of Machine-Powered Platform Governance: An Empirical Study of Content Moderation”
Qinglai He is an assistant professor in the Department of Operations and Information Management at the Wisconsin School of Business.