P/NP and Game Theory

In light of UC Berkeley’s recent change of its grading scheme to P/NP, it’s interesting to examine the problem of grading from a game-theoretic perspective. Many students have suggested that the move to an online format encourages cheating, and that the curves for this semester will change drastically as a result. In this essay, I explore this claim and test its validity.

N.B.: This post was originally the essay for one of my Stat 155 assignments. I thought it interesting enough to share on my blog – I hope you enjoy it!

UC Zoom

On Friday, March 13th, classes were officially moved online by the UC Berkeley Administration. At this point, the change to the grading rubric had not been made yet. Let us model a typical Computer Science class, where grades are curved to a B+, as an n-player zero-sum game¹ (with zero being the mean, and each integer difference being one standard deviation). To simplify this, we examine the two-player zero-sum game, which is well-studied.

The game is modeled as follows, with each student having two strategies, cheat and honest. We assume that the students are of equal skill, and that cheating gives a \(+1\) standard deviation edge. The risk of being caught lowers that edge by \(0.12\) in expectation², leaving a net edge of \(0.88\). Payoffs are written for the row player, who is \(A\) in this case:

\[\begin{bmatrix} A \backslash B & \mathrm{cheat} & \mathrm{honest} \\ \mathrm{cheat} & 0 & 0.88 \\ \mathrm{honest} & -0.88 & 0 \end{bmatrix}\]

We see that \(\mathrm{cheat}\) is a dominant strategy for each player, so \(\mathrm{(cheat, cheat)}\) is the equilibrium. In the case of letter grading, then, our model tells us that students do in fact have an incentive to cheat.³
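As a quick numerical check of the matrix above, here is a minimal Python sketch. It assumes the numbers from footnote 2 (a 2% chance of being caught, a \(-5\) penalty, a \(+1\) edge). The names `cheating_edge`, `payoff_matrix`, and `dominant_row` are hypothetical helpers written for this illustration, not part of any library.

```python
# Assumed parameters, taken from footnote 2 of this post.
P_CAUGHT = 0.02   # probability that a cheater is caught
PENALTY = -5.0    # grade change (in standard deviations) if caught
EDGE = 1.0        # grade change if the cheating goes undetected

STRATEGIES = ("cheat", "honest")


def cheating_edge(p_caught=P_CAUGHT, penalty=PENALTY, edge=EDGE):
    """Expected grade change from cheating, in standard deviations."""
    return p_caught * penalty + (1 - p_caught) * edge


def payoff_matrix(e):
    """Payoffs to the row player A; B receives the negation (zero-sum)."""
    return [
        [0.0, e],    # A cheats:  B cheats, B honest
        [-e, 0.0],   # A honest:  B cheats, B honest
    ]


def dominant_row(matrix):
    """Index of a row that strictly dominates every other row, or None."""
    for i, row in enumerate(matrix):
        others = [r for j, r in enumerate(matrix) if j != i]
        if all(all(a > b for a, b in zip(row, other)) for other in others):
            return i
    return None


e = cheating_edge()
A = payoff_matrix(e)
best = dominant_row(A)
print(f"expected edge from cheating: {e:.2f}")
print("dominant strategy for A:", STRATEGIES[best] if best is not None else "none")
```

Because the game is symmetric, the same check applies to \(B\), which is why both players end up at \(\mathrm{(cheat, cheat)}\) in this model.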

Campuswide P/NP Policy

With the campuswide P/NP policy in place, the incentive to cheat changes abruptly. In the CS department, it is not yet clear how grades will be assigned, but assuming a C- is the minimum passing grade, a student can land well below the mean and still pass the class. Looking at the grade distribution of a large CS class (\(N > 19000\)), historically a C- and up covers 95% of the population⁴. Therefore, in a game where two students are both taking the course for P/NP credit, with grades distributed as in that large CS class, there seems to be very little incentive for anyone to cheat. Whether or not they cheat, they will more than likely make the cutoff if they put enough effort into the class, and cheating mostly adds the possibility of getting caught. So with P/NP, it seems that the incentive for cheating decreases.

This time, the game is modeled as a general-sum game, since one student doing well has a negligible impact on whether another passes. We only look at the payoff of one person, since in the n-person general-sum game, everyone’s payoffs are symmetric.

\[\begin{bmatrix} \mathrm{cheat} & \mathrm{honest} \\ 0 + \epsilon & 0 \\ \end{bmatrix}\]
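One rough way to put a number on \(\epsilon\) is sketched below in Python. Everything beyond the 95% pass rate is an assumption made for illustration: scores are treated as normally distributed, the payoff is taken to be the probability of receiving a P, cheating shifts a score up by one standard deviation, and a caught cheater (2% chance, as in footnote 2) receives an NP. The function `pass_probability` is a hypothetical helper, not the model from the original assignment.

```python
from statistics import NormalDist

PASS_RATE = 0.95   # share of students at C- or above (from the post)
P_CAUGHT = 0.02    # assumed catch probability (footnote 2)
EDGE = 1.0         # assumed boost from cheating, in standard deviations

std_normal = NormalDist()  # assumption: scores are roughly normal


def pass_probability(cheat, p_caught=P_CAUGHT, edge=EDGE, pass_rate=PASS_RATE):
    """Probability that a randomly drawn student ends the term with a P."""
    cutoff = std_normal.inv_cdf(1 - pass_rate)   # z-score of the C- line
    if not cheat:
        return 1 - std_normal.cdf(cutoff)
    # A cheater's score is shifted up by `edge`; being caught counts as an NP.
    clears_cutoff = 1 - std_normal.cdf(cutoff - edge)
    return (1 - p_caught) * clears_cutoff


honest = pass_probability(cheat=False)
cheating = pass_probability(cheat=True)
epsilon = cheating - honest
print(f"P(pass | honest) = {honest:.3f}")
print(f"P(pass | cheat)  = {cheating:.3f}")
print(f"epsilon          = {epsilon:.3f}")
```

Under these assumptions \(\epsilon\) comes out small and positive, and it turns negative if the catch probability is a few percentage points higher, which matches the intuition that cheating buys very little when almost everyone already passes.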

Banning Proctoring Over Zoom

With proctoring banned, the risk of getting caught cheating is all but zero. Does this change the incentive to cheat?

With the Zoom ban, many CS classes have shifted to an open-book policy. This means that, effectively, everything on the test can be looked up anyway. As a result, while cheating is easy to do, other mitigations such as test randomization mean there seems to be little impact on the rate of cheating. We get the same payoff table, after adding the ease of cheating and subtracting the hoops one must jump through to cheat.

\[\begin{bmatrix} \mathrm{cheat} & \mathrm{honest} \\ 0 + \epsilon & 0 \\ \end{bmatrix}\]

Footnotes

¹ A brief note about the choice of zero-sum versus general-sum: since classes in the CS department are all curved, if we take two students \(A\) and \(B\) and fix \(A\)’s performance, then \(B\) doing better causes \(A\) to get a worse grade. In the two-player scenario, since their grades must average to the same grade, \(B\) getting \(+d\) standard deviations results in \(A\) getting \(-d\). This is a zero-sum game.
² Suppose 1 in every 50 students is caught cheating. (This amounts to 2%.) Empirically, we have no data on cheating, since collecting it would require all students to admit their cheating, which is somewhat of a catch-22. Either way, suppose that getting caught cheating results in \(-5\) on their grade, whereas not getting caught results in \(+1\). Whether a cheater is caught is a Bernoulli random variable with \(p=0.02\), and the expected value of cheating is \(0.02 \cdot (-5) + 0.98 \cdot 1 = 0.88\).
³ Note that these incentives are still in place even without online testing; it’s just that cheating with proctors in person is probably much harder.
⁴ Modelled off of Berkeleytime.