Following the wide public release of the GPT-3 language generator, the internet has been awash in panic and awe—but mostly panic. Headlines like “The College Essay is Dead” (Atlantic) and “Will Chat GPT rot our brains?” (RNS) tell the story of our academic anxiety in the face of this challenge. Others have offered more sanguine takes: “A.I. Could Be Great for College Essays” (Slate); a team of Washington Post authors show how the technology could help dyslexic email writers fix their communication; and a Christian university president has told us not to worry.
For those who haven’t heard about it yet: “GPT-3” stands for “generative pre-trained transformer,” version 3. The GPT system is an artificial intelligence system, hosted online—for free, for anyone to use—that can write original and sometimes creative and compelling prose about (almost) anything one asks it to write. The process is different from, say, merely copying paragraphs from Wikipedia; GPT actually “composes” new and strangely intuitive sentences based on your prompt. It can write in a variety of tones and styles, such as in the style of a bratty teenager, or in King James Bible English. As Aaron Levie, CEO of Box, memorably tweeted, “ChatGPT is one of those rare moments in technology where you see a glimmer of how everything is going to be different going forward.” Imagine students submitting their essay prompts to the machine. Imagine entire newspapers or novels written by AI.
Figure: Screenshot of the information GPT-3 gives about itself on its home screen.
And if you think GPT-3 is the pinnacle of this emerging technology…you don’t, because you’re savvy enough to know that advancements in this field are exponential, and the next new thing is right around the corner. The New York Times tech reporters who host the “Hard Fork” podcast reported that the more disturbing and/or exciting development here is the next iteration, GPT-4; those who have seen this new version, the hosts said in a recent episode, have spoken about it “as though they have seen the face of God.”
As is often the case with startling new technological advancements, it’s likely that the breathlessly enthusiastic techno-utopian cheerleaders and the despairingly woebegone dystopian critics have gotten things wrong… and right. What, exactly, each is right and wrong about is yet to be seen. What seems clear, however, is that the one wrong thing professors and administrators could do in response to GPT is to ignore it and pretend like it doesn’t exist.
The most immediate question that needs to be addressed is pedagogical: how can we continue to teach in the GPT Age? The challenge here is that, given the pace of technological change, whatever solutions we offer will, at best, be stop-gap measures. Some of us may be tempted to give up altogether. (One professor humorously had the bot create an assignment, a grading rubric for it, and a student response; use the rubric to grade the response, which received a B; write the student’s email complaining about the grade; write the professor’s response; and, finally, write a course evaluation blasting the professor for being unfair.) But for those of us who feel called to continue participating in the flesh-and-blood practice that is teaching and learning, we need to think about how best to do so. Stop-gap measures are better than no measures at all.
Beyond the questions of pedagogical best practices, GPT-3 raises deeper philosophical and pragmatic questions about the nature and purpose of higher education.
For example, why is it important for students to know how to write well? The same technology that will allow them to bypass detection when completing college writing prompts will also allow them to quickly and (mostly) competently “write” anything from computer code and business plans to legal briefs and sermons. (If this last example seems far-fetched, a pastor friend assures me that busy and burned-out ministers will be tempted to rely on anything that eases the weekly pressure of the sermon.) So, if it can—and likely will—be used in certain industries, what is the purpose of eschewing it in college? There are, of course, good answers to the question. One might point to the way writing not only helps us communicate, but also can be a means to extend and deepen our thinking. As St. Augustine once wrote, “I freely confess, accordingly, that I endeavor to be one of those who write because they have made some progress, and who, by means of writing, make further progress” (Letter 143.2).
We might also point to the character virtues that come through the writing process: patience, attention, diligence, and intellectual humility. One who consistently avoids activities in which these virtues are required for excellence will never have the opportunity to develop them, and may well find themselves exhibiting their opposites: impatience, distraction, laziness, and pride (with very real implications for their ability to thrive both in and beyond school and work). But, admittedly, that’s a lot to expect a harried undergraduate to keep in mind when she’s under a deadline and afraid of losing her spot in the nursing program. This is only one of the many philosophical questions raised by GPT-3. The point is: when and between whom are these conversations going to occur?
A related, but more pragmatic type of question is this: how are administrators going to respond to the issues raised in these conversations? Administrators, at least in universities like our own, have the unenviable job of ensuring the survival of their institutions within an increasingly competitive economic environment, where the “value proposition” of the university must be demonstrated, quantified, and packaged for ready consumption by potential customers. Needless to say, unless one is very careful, this works against arrangements that are conducive to the type of personal relationships with students that might discourage GPT-3 use. As we’ll suggest below, incorporating personal reflections into writing assignments may make it possible to discern genuine engagement, but only if we know our students. Furthermore, “avoiding breaches of academic integrity,” in the abstract, is a less compelling moral ideal than “don’t try to bullshit someone whom you are trying to respect.” But in large classes where students feel anonymous, it may be difficult to avoid the impression that they are already involved in a B.S. enterprise. In those cases, “getting through it” seems more important than maintaining trust (if one has to choose between the two). The difficult question we are facing thus becomes clear: is it possible, in this day and age, to create financially sustainable conditions through which students can be known by professors (and vice versa)?
As Christian scholars, we want to encourage open-minded but clear-sighted conversations about the implications of GPT-3 for teaching and learning in the university setting. The material cited here indicates that such a conversation is beginning to occur on a broad level, but we also think it behooves us all to talk about what this technology means within our own particular, local contexts. This likely means setting aside time to have focused conversations with other faculty members and with administrators about how to respond wisely. It may also mean raising the issue directly with students, though this is admittedly a fraught notion. It feels a bit like introducing a young teen to crack. If students are not already aware of GPT-3, then why would I want to open Pandora’s box? And yet, to riff on the WaPo slogan, academic integrity dies in darkness. Perhaps it’s best to shine a light?
For instructors heading back into the classroom and fearing the worst for the fate of their essay assignments, we offer these five ideas:
(1) Take a deep breath, find the GPT-3 system online, sign up, and run some of your essay assignments through the system. You’ll learn a lot about what the system can and cannot do, and you’ll begin the journey to find out how “bot-proof” your assignments currently are. The exercise may even calm your nerves considerably, as you’ll find that on some topics the bot is completely terrible. On others…you may start to worry. One of the most obvious takeaways is that very generic essay questions are extremely susceptible to cheating through GPT. This revelation need not be a bad thing, however. Many of us know this already, and all of us can use a little kick in the pants to consider whether we are asking students merely to report basic information back to us at the lowest level of Bloom’s taxonomy (recalling facts and concepts) or whether we can move our students toward better projects that connect, analyze, judge, design, and present new ideas. In fact, overly generic and common essay topics were already susceptible to various forms of plagiarism and cheating. As Andy Crouch has observed, GPT is quite adept at delivering writing that is basically correct but also quite often clichéd. The problem is that this is exactly the sort of writing we too often expect from undergraduates in, say, an introductory humanities course.
(2) As of the time of this article, GPT has only “limited knowledge of world and events” after 2021 (see screen-shot from the program’s home page above). For now, this opens up an opportunity for instructors to require, as part of their writing assignments, students to integrate citation and discussion of current events or written sources published in 2022. For example, instead of merely asking students to “Explain Kant’s categorical imperative,” we could ask students to “Briefly explain Kant’s categorical imperative, and evaluate in light of the case study in Smith 2022”—where “Smith 2022” refers to a recently published think-piece on the topic you have asked students to read as part of the class or specifically for the assignment at hand. Moreover, GPT has problems with citations and using a particular citation style, so specifying things along these lines will discourage the most egregious copying/pasting from the bots.
(3) Move toward writing assignments that not only ask students to show mastery of objective facts and cite specific and recent things (see 2 above), but also to integrate their own detailed personal experiences in light of the topic. To expand upon the example given above: “Briefly explain Kant’s categorical imperative, and evaluate in light of the case study in Smith 2022. As part of this explanation and evaluation, discuss a situation in your own life where you have either failed or succeeded to act according to the imperative, with some reflection on your success or failure in this respect.” Yes, GPT-3 can write fake first-hand experiences. But knowing our students (when that is possible) will help us discern genuine engagement, and requiring the experiential piece will push some students away from trotting out the bot’s experiences as their own.
(4) For many years expert writing instructors have been touting a “scaffolding” approach to writing: teaching students to write through a series of drafts, steps, and an editing process that mirrors the way real knowledge is created. Yes, the bots can write drafts of assignments. But requiring multiple writing steps, especially combined with face-to-face meetings with peers or the instructor en route to the final draft, will help create a better barrier against the most routine forms of cheating. This was good writing-teaching advice even before GPT-3!
(5) A probably endless cat-and-mouse game will emerge to create user-friendly technologies that can detect AI-generated text. Instructors may not be able to rely on them fully at this point, but when they are reliable, they could provide an obvious deterrent, much like existing plagiarism checkers (e.g., Turnitin and Grammarly). At this point, it’s probably best to experiment a bit with available options (see, e.g., “Streamlit”), and share what works with your colleagues. Again, as GPT technology improves, detection tools will likely be playing catch-up. But it is also possible that we may one day reach an equilibrium where those who create tools for AI-generated text tolerate a certain level of detectability so long as that text still appears natural to human readers.