P/NP and Game Theory
In light of UC Berkeley's recent change of its grading scheme to P/NP, it's quite interesting to examine the problem of grading from a game-theoretic perspective. Many students have suggested that the move to an online format encourages cheating, and that the curves for this semester will change drastically as a result. In this essay, I hope to explore this claim and test its validity.
N.B.: This post was originally the essay for one of my Stat 155 assignments. I thought it interesting enough to share on my blog – I hope you enjoy it!
UC Zoom
On Friday, March 13th, classes were officially moved online by the UC Berkeley administration. At this point, the change to the grading scheme had not yet been made. Let us model a typical Computer Science class, where grades are curved to a B+, as an \(n\)-player zero-sum game^{1} (with zero being the mean, and each integer difference being one standard deviation). To simplify this, we examine the two-player zero-sum game, which is well-studied.
The game is modeled as follows, with each student having two possible strategies, cheat and honest. We assume that the students are of equal skill, and that cheating gives a \(+1\) standard deviation edge. The risk of being caught contributes \(-0.12\) in expectation^{2}, for a net expected edge of \(0.88\). Payoffs are written for the row player, who is \(A\) in this case.
We see that cheating is the dominant strategy for both players, so \(\mathrm{(cheat, cheat)}\) is the outcome. In the case of letter grading, then, our model tells us that students do in fact have an incentive to cheat.^{3}
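To make the dominance argument concrete, here is a minimal sketch of the two-player game. It assumes that the payoff to the row player is their expected edge minus the opponent's, which is one simple way to encode the zero-sum curve and is not necessarily the exact matrix used in the essay. The 2% detection rate and the \(+1\) edge come from the text and footnote 2; the penalty is read as \(-5\), the sign inferred from the stated expected value of \(0.88\). All function and constant names are hypothetical.

```python
# Hypothetical sketch of the two-player zero-sum cheating game.
# Units are standard deviations relative to the class mean.

P_CAUGHT = 0.02   # footnote 2: 1 in 50 cheaters is caught
PENALTY = -5.0    # footnote 2: grade change when caught (sign inferred)
EDGE = 1.0        # edge gained by cheating without being caught

STRATEGIES = ("cheat", "honest")


def expected_edge(strategy):
    """Expected edge of one student, before comparing against the opponent."""
    if strategy == "cheat":
        return P_CAUGHT * PENALTY + (1 - P_CAUGHT) * EDGE
    return 0.0


def payoff_to_row(row, col):
    """Assumed zero-sum payoff: the row player's edge minus the column player's."""
    return expected_edge(row) - expected_edge(col)


def strictly_dominates(a, b):
    """True if strategy a beats strategy b against every opponent strategy."""
    return all(payoff_to_row(a, c) > payoff_to_row(b, c) for c in STRATEGIES)


if __name__ == "__main__":
    for r in STRATEGIES:
        for c in STRATEGIES:
            print(f"{r:>6} vs {c:<6}: {payoff_to_row(r, c):+.2f}")
    print("cheat dominates honest:", strictly_dominates("cheat", "honest"))
```

Under these assumptions the diagonal entries are zero, cheating against an honest opponent yields about \(+0.88\), and staying honest against a cheater yields about \(-0.88\), so cheating strictly dominates for either player.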
Campuswide P/NP Policy
With the campuswide P/NP policy in place, the incentive to cheat changes abruptly. In the CS department, it is not yet clear how grades will be assigned, but assuming that a minimum of a C is required to pass, one can score several standard deviations below the mean and still pass the class. Looking at the grade distributions of a large CS class (\(N > 19000\)), historical grades tell us that C and up covers 95% of the population^{4}. Therefore, in a game where two students are both taking the course for P/NP credit, with grade distributions modeled on a large CS class, there seems to be very little incentive for anyone to cheat. Regardless of whether or not they cheat, they will more than likely make the cutoff if they put enough effort into the class, and cheating only adds the possibility of getting caught. Therefore, with P/NP, it seems that the incentive to cheat decreases.
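As a rough check on how far below the mean the passing cutoff sits, the sketch below converts the 95% pass rate into a z-score. It assumes grades are normally distributed, which the essay does not claim; real grade distributions are often skewed, so the true cutoff may sit further from the mean. The function names are hypothetical.

```python
from statistics import NormalDist

PASS_FRACTION = 0.95  # from the text: C and up covers 95% of students


def pass_cutoff_in_sd(pass_fraction=PASS_FRACTION):
    """z-score of the passing cutoff, assuming a standard normal grade curve."""
    return NormalDist().inv_cdf(1 - pass_fraction)


def still_passes(z_score, pass_fraction=PASS_FRACTION):
    """True if a student at this z-score lands at or above the cutoff."""
    return z_score >= pass_cutoff_in_sd(pass_fraction)


if __name__ == "__main__":
    print(f"cutoff: {pass_cutoff_in_sd():.2f} standard deviations")
    print("one SD below the mean passes:", still_passes(-1.0))
```

Under the normality assumption the cutoff lands roughly 1.64 standard deviations below the mean, so a student of average skill has a wide margin before a \(+1\) edge from cheating would change the outcome.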
This time, the game is modeled as a general-sum game, since one person doing well has a negligible impact on another. We only look at the payoff of one person, since in the \(n\)-person general-sum game, everyone has a symmetric payoff.
\[\begin{bmatrix} \mathrm{cheat} & \mathrm{honest} \\ 0 + \epsilon & 0 \\ \end{bmatrix}\]
Banning Proctoring Over Zoom
With proctoring banned, the risk of getting caught cheating is all but zero. Does this change the incentive to cheat?
With the ban on Zoom proctoring, many CS classes have shifted to an open-book policy. This means that, effectively, everything on the test can be looked up anyway. As a result, while cheating is easy to do, other mitigations such as test randomization mean there seems to be little impact on the rate of cheating. We get the same chart, after adding the ease of cheating and subtracting the hoops one must jump through to cheat.
\[\begin{bmatrix} \mathrm{cheat} & \mathrm{honest} \\ 0 + \epsilon & 0 \\ \end{bmatrix}\]
Footnotes

^{1} A brief note about the choice of zero-sum versus general-sum: since classes in the CS department are all curved, if we take two students \(A\) and \(B\) and fix \(A\)'s performance, then \(B\) doing better causes \(A\) to get a worse grade. In the two-player scenario, since their grades must average to the same grade, \(B\) getting \(+d\) standard deviations results in \(A\) getting \(-d\). This is a zero-sum game.

^{2} Suppose 1 in every 50 students is caught cheating. (This amounts to 2%.) Empirically, we have no data on cheating, since it would require all students to admit to their cheating, which is somewhat of a catch-22. Either way, suppose that getting caught cheating results in \(-5\) on their grade, whereas not getting caught results in \(+1\). This is a Bernoulli random variable with \(p = 0.02\), and the expected value is \(0.02 \cdot (-5) + 0.98 \cdot 1 = 0.88\).

^{3} Note that these incentives are still in place even without online testing; it's just that cheating with proctors in person is probably much harder.

^{4} Modelled off of Berkeleytime,