I am working on a large translation project this year. I have been surprised to find several conversation partners voicing the assumption that I am getting AI to do the translating for me. I’ve been wondering how to respond.

A short but, in the end, inadequate answer is that, impressive as the current variations on machine translation are, they still get things wrong. Neural machine translation services such as Google Translate and DeepL still produce oddities fairly regularly. I have been working lately with seventeenth-century Czech texts, an area in which I would expect machine translation to struggle a little more than usual. One mention of “God’s law and authority” came back as “God’s law and haircuts.” An allegation that the nobles who were meant to nourish the church were instead bleeding it dry came out as “You were supposed to be the foster parents of the church, but you were the vacuum cleaners of the church.” An odd obsession with livestock became mingled with the hope “that ye may be the dry bones of Ezekiel, whom the Lord hath covered with flesh, and hath put sinews upon them, and hath covered them with goats.” Results like these are good for a chuckle. A little more alarming are translations that voice the opposite of what the passage means, as when the statement that Job feared to look upon the nakedness of the poor, oppress the fatherless, and cause the eyes of widows to fail (an allusion to Job 31:16-23) came back as “Job would have fun if he saw the poor naked, he would mock the orphan and grieve the eyes of widows…”

The weird mistranslations offer a quick and amusing way to make the point that translation still demands a bit more than copy, paste, and wait, at least if you have cause to care about the precision of the results. But my concern here is not actually with accuracy rates; I make mistakes too, and I am not always the one doing the correcting.

Generative AI tools offer something a little different, including more attention to context and a focus on style. Many of the translations they offer are remarkably good. The days when the gibberish offered by machine translation visibly invited reflexive caution are over. GenAI interfaces now offer me multiple stylistic variants on an elegant English equivalent. The sentences do not look problematic. By design, they represent what the original sentence is statistically likely to be saying, with context taken into account, so they often have the ring of truth. They are fluent and glib. They seem to have beautifully solved my problem. And they are still sometimes quite wrong. I have been reflecting on how this affects me as I work. The goats and vacuum cleaners are easy to spot and compensate for. It is the more impressive results that are putting me to the test.

For one thing, GenAI translation is one more technology among many that trains me toward sensing speed as an inherent good. The experience of seeing a translation appear in seconds rather than after ten minutes of dictionary work, a detour into a reference grammar, and a side trip to a parallel passage tickles the need for speed and feeds the desire to keep moving, to get things done. Checking the results feels increasingly like stopping, like waiting in line, like sitting in traffic. Being human, I tire over time, and the temptation to accept what came back and move on grows as the work session progresses. (When did you last check your calculator’s arithmetic?)

I am also being trained to imagine AI as a reference source. This happens every time I run a quick internet search on some random topic and settle for the AI summary, even knowing from experience that those summaries are sometimes pure fiction. A treacherous part of my mind is starting to feel as if running a sentence through AI means I have already checked what it means. Here again, the careful cross-checking of mid-morning can become a lazy impulse to just roll with the flow by late afternoon.

These familiar pressures (Go faster! Trust your tools!) reinforce and are reinforced by an invitation, unsettling in the relentlessness of its repetition, to trust appearances. This is the core of my concern. The temptation to premature closure is not in itself new. I have, in the past, graded many a student assignment in a version of German or French shaped by the temptation to seize upon the first equivalent word listed in the bilingual dictionary. Taking care with words has always been hard.

Yet unlike traditional reference materials, GenAI interfaces offer me more than a list of raw materials. They offer a pre-assembled display of confident prowess and an assertion of its validity. When obscure words are encountered, a probabilistic guess is woven into a sentence that sounds likely to be right, even when it is not. In Copilot, for instance, results are introduced with the words: “Here is a clear, faithful translation of your new sentence.” The textual response includes variants and an offer to nuance them further, to make them a little more literal, theological, or formal. When I type back that a phrase is wrong, I am quickly told that I am “absolutely right.” A different translation ensues, delivered with the same rhetorical performance of absolute competence and confidence that framed the mistaken first try. My dictionary was not trying to sell me on its findings; the AI interface performs persuasive authority, simulating someone who definitely knows.

The plausibility is both the advance and the problem here. As snake oil salesmen have always known, I am conditioned to associate linguistic fluency and precision of expression with competence, and I am predisposed to accept solutions that seem designed to meet my needs. Shortcuts are always tempting for busy souls. Plausible, articulate, apparently precise shortcuts that might still be shortcuts to an entirely wrong road increase the pressure. Dealing with them requires more self-discipline than the goats and vacuum cleaners. I have a lot of text to get through, and it feels good to make progress and move to the next sentence. When GenAI offers me an elegant version of just the kind of thing that the original sentence might perhaps have said, it sometimes requires a determined act of will to check it word by word anyway. I experience my share of tiredness, impatience, laziness, and overcommitment. Pair me with a source of instant, fluent answers in sentences that sing, and it takes conscious and doggedly repeated intentionality to stop, check everything, and find out which sentences are lying.

What I am being tempted to do here, frequently and repeatedly, is to prioritize appearance over truth, ease over integrity, results over accuracy. It is hard not to judge by appearances when the appearances are so slick and meet my needs so well. Perhaps repeated resistance is teaching me virtue; I hope so. I find myself fluctuating between joy and slight awe that I am working at a time when such tools are available (my present project would take much, much longer without them), and grim awareness of how relentlessly they are putting my self-control to the test. As I look ahead to teaching a course next fall that I last taught before AI, I am already wondering how I might help my students develop the virtues that are being challenged in my own interactions with current tools.

A question that might be worth pondering: if “people look at the outward appearance, but the Lord looks at the heart” (1 Samuel 16:7, NIV), where does God look when AI offers me the false appearance of confident knowledge? Perhaps it is my own heart that is in the scales as I decide when to trust.

David I. Smith

Calvin University
David I. Smith is Professor of Education and Director of the Kuyers Institute for Christian Teaching and Learning at Calvin University. He writes on teaching and learning at https://onchristianteaching.com.
