May the Codeforces Be With You | Wisconsin School of Business

For coders, quality control means reviewing the work of others. For beginners and professional programmers alike, code review is an integral part of the programming process.

With the advent of crowd-based platforms like CodeChef, Topcoder, and Codeforces, coders have the chance to test both their programming and their code-review skills, and potentially earn a name for themselves, through contests that draw thousands of participants. Contestants are given a specific problem to solve, but they also compete by evaluating and “hacking” other contestants’ work. Should an error be found, the hacked submission is removed from the competition and the victor moves forward with extra points. This benefits the platform, too, which gets its own free army of evaluators for each arcane problem without having to write out too many specific test cases. While there’s no direct financial gain in participating on these free platforms, top players benefit both by honing their skills and by earning a reputation among peers and potential employers.

But a new study from the Wisconsin School of Business suggests that visibly identifying contestants by accrued status may actually hurt the integrity of the platform.

Using Codeforces.com as the study’s setting, Yash Babar, an assistant professor of operations and information management, explored whether a contestant’s status drove hacks by other contest participants. He examined contestant-contest data panels from four contests and across the platform’s 89,590 contestants at a time when the site made an unexpected change to its color ranking system. After Codeforces introduced a new color grouping (cyan), which arbitrarily shifted the status of some existing players to lower levels and others to higher ones, Babar and his coauthors found that individuals who lost status through no fault of their own suddenly received more scrutiny from peers simply because their status indicators had changed.
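To make the mechanics of such a shift concrete, here is a minimal Python sketch of how inserting a new color tier can relabel contestants whose ratings have not changed. The tier boundaries, the color ordering, and the function names are all invented for illustration; they are not Codeforces’ actual cutoffs or the study’s method.

```python
from bisect import bisect_right

# Hypothetical tier boundaries (lower bound, color), invented for this sketch.
# They are NOT Codeforces' real rating cutoffs.
OLD_TIERS = [(0, "gray"), (1200, "green"), (1500, "blue"), (1800, "violet")]
NEW_TIERS = [(0, "gray"), (1200, "green"), (1400, "cyan"),
             (1600, "blue"), (1900, "violet")]

# Assumed prestige order of the colors, lowest to highest.
COLOR_ORDER = ["gray", "green", "cyan", "blue", "violet"]


def color_for(rating, tiers):
    """Return the color of the highest tier whose lower bound is <= rating."""
    bounds = [lower for lower, _ in tiers]
    index = max(bisect_right(bounds, rating) - 1, 0)  # clamp ratings below 0
    return tiers[index][1]


def status_shift(rating):
    """Compare a contestant's color before and after the tier change."""
    before = color_for(rating, OLD_TIERS)
    after = color_for(rating, NEW_TIERS)
    delta = COLOR_ORDER.index(after) - COLOR_ORDER.index(before)
    label = "gained" if delta > 0 else "lost" if delta < 0 else "unchanged"
    return before, after, label


if __name__ == "__main__":
    for rating in (1300, 1450, 1550, 1850):
        print(rating, status_shift(rating))
```

Under these made-up boundaries, two contestants rated 1450 and 1550 both end up cyan, yet one has moved up from green and the other down from blue, even though neither rating changed.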

“This finding is harmful for a platform because if a judgement is made based on who that person is instead of what that content is, those evaluations might be faulty,” says Babar. “It’s essentially ‘because once I know who you are, I’m changing how I’m looking at the code.’”

Disclosing identity, however, is embedded in the fabric of such contests, partly because many coders are students intent on gaining some professional experience, but also because the platforms themselves benefit from associations with top performers. “These people are so internationally famous that Google or Microsoft would give them a job without an interview,” Babar says.

A possible solution to snap judgements based on player status, which could affect code quality, may be found in “blinding,” a term used in academic research to describe how author names are hidden from submitted articles during a journal’s review process. Periodically blinding contestants during contests may make sense, or blinding could be a built-in design decision for the platform, Babar says.
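One way a platform might approach blinding is sketched below in Python. This is an assumption-laden illustration, not a description of how Codeforces or any other site works: the field names (`handle`, `rating`, `color`, `code`), the function names, and the per-contest secret are all hypothetical.

```python
import hashlib
import hmac


def blind_handle(handle: str, contest_id: int, secret: bytes) -> str:
    """Derive a stable per-contest pseudonym that does not reveal the handle."""
    message = f"{contest_id}:{handle}".encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return f"contestant-{digest[:10]}"


def blind_submission(submission: dict, contest_id: int, secret: bytes) -> dict:
    """Return a copy for peer reviewers with identity and status fields removed."""
    hidden_fields = {"handle", "rating", "color"}  # hypothetical field names
    blinded = {k: v for k, v in submission.items() if k not in hidden_fields}
    blinded["author"] = blind_handle(submission["handle"], contest_id, secret)
    return blinded


if __name__ == "__main__":
    secret = b"server-side secret, never shown to contestants"
    submission = {"handle": "alice", "rating": 1550, "color": "cyan",
                  "code": "print(42)"}
    print(blind_submission(submission, contest_id=7, secret=secret))
```

Using a keyed hash rather than a plain one means reviewers cannot recover identities by hashing known handles themselves, and including the contest ID gives each contestant a different pseudonym in every contest, so status cannot build up behind a pseudonym.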

“Whenever you’re engaging in this kind of peer evaluation, there are two things at play,” Babar says. “One, you’re just being altruistic, and you want good for the community. You want to make sure that good contributions come in. On the other hand, you’re also competitive because everybody’s playing for a ranking. So you might want to pull people down. We found that once people lose status, they are more susceptible to scrutiny, especially from others who were lower ranked than they were and might have been hesitant to hack before. This could have been good, but we see that most of these new hacks were less successful and people might be focusing their energies on status losers just because they seem easier. This could take away scrutiny from other bad submissions which actually need weeding out.”

Babar says the study falls in line with larger avenues of operations and information management research on platforms and behavior, particularly in relation to motivation and eliciting quality contributions when no pay is involved. The paper lists other crowd-based platforms (Duolingo, TripAdvisor, Google Local Guide) that share the coding sites’ peer-evaluation aspect, using crowds to generate information on quality, though not their direct competition. “If peer evaluation is heavily influenced not just by what is being evaluated but by who is being evaluated, then its purpose might be defeated.”

One of the challenges for Codeforces and other platforms like it is staying competitive with an ever-growing user base. If status is too easily earned, then stronger contestants will seek more challenging virtual arenas, but if status is too hard to earn, then newcomers might give up and abandon the platform. A future direction of research based on this study, Babar says, would be to look at recovering status: When participants lose status due to platform changes, how can they gain that status back? If enough participants lose a significant amount of status, they may work harder to recover what was lost. However, if too many people lose status too often, they might become demotivated.

“Even though this was an almost autocratic, sudden status change in our case, contestants wanted to get back to where they were before the shift in the color ranking system: they worked harder and sent in more contributions,” Babar says. Understanding how often such status redistribution should be done, and at what point in the status hierarchy and stage of platform evolution, is an interesting direction for future research.

“The value of the platform is who is on it. It’s what are called ‘network effects’: the more competent people with higher quality contributions are on it, the better competition I get, the better set of problems and hacks I get, and ultimately, the more likely people are to value the status that they earn on the platform.” Platform changes such as altering a status ranking system can have far-reaching consequences for a platform’s success and perhaps need more thought than they often get in an ever-evolving online environment.

Read the paper: Deodhar, S., Babar, Y., Burtch, G. “The Influence of Status on Evaluations: Evidence from Online Coding Contests,” soon to be published in MIS Quarterly

Yash Babar is an assistant professor in the Department of Operations and Information Management at the Wisconsin School of Business.
