The Good, the Bot, and the Ugly

How good can A.I.-generated writing be when it’s designed to sound like everything else out there?

Several months ago, I got my hands on the short quiz The Atlantic uses to prescreen copyediting candidates and ran it through ChatGPT 3.5. I was impressed with the results: it was able to catch about half of the 20 or so errors deliberately seeded throughout the text. Half isn’t good enough to make it as a professional copy editor, of course—not even close—but it caught some errors most humans would miss, and I can imagine that, in a few years, it will be able to correct most or all typos in a given text.

If catching typos were all it took to make average writing great, that might be sufficient—but it isn’t, and it’s not.

There are already a few major platforms that have A.I.-powered copyediting tools built right into them. Axios suggests ways to “improve” your writing as you type, as do Gmail and Outlook. Perhaps you too have had your writerly pride injured by a robotic suggestion to change your use of “quite acceptable” to “acceptable”—as if they meant the same thing!

But A.I. no longer simply offers unsolicited advice of dubious value—it also invents entire passages out of whole cloth. I’ve been working with A.I.-generated text for several weeks now, and the more I see of it, the more dismal it looks. It’s uncanny how closely ChatGPT can simulate human writing, but it’s a very indistinct and superficial kind of writing. Wade through enough A.I.-generated writing and it begins to feel flat and samey and, well, robotic.

And while A.I. is constantly improving, I’m not sure that it will ever reach the level of skill that a human writer or editor can bring to the job.

Large language models are powered by statistics, after all; their whole job is to predict the likeliest next word, which makes it unlikely that they’ll ever be able to conjure up something truly original and surprising.

Worse, the larger the corpus an LLM is trained on, the more closely its output will hew to average-quality writing. As Sturgeon’s Law famously puts it, “Ninety percent of everything is crud.” So if you’re training an LLM on what’s already out there, you can expect it to be ninety percent awful. (One can only imagine what kind of writing will emerge when LLMs are trained on texts produced by other LLMs and the snake begins to eat its own tail.)

But we’re not aiming for average quality, are we? The goal is always to be the best out there, not to sound indistinguishable from the average writer, making average mistakes and producing average results. As readers abandon search engines and A.I. makes it possible to produce a virtually infinite amount of content on demand, publications will no longer be able to distinguish themselves or compete based on the quantity of content they produce. Only consistently high quality builds loyal readerships.

And the only way to produce consistently high-quality content is to hire the writers and editors capable of creating it. Running it through an LLM specifically designed to make it sound more like everything else out there can only produce mediocre results.

But I don’t expect you to take my word for it. Let’s look at some examples!

The President’s surgery went smoothly and she is expected to make a full recovery, as per White House sources.

I can only imagine that ChatGPT produces sentences like this because millions of writers have already made the same mistake, but the fact is that “as per” ≠ “per.”

Roughly speaking, “per” means “according to” (at least when it’s used in an attribution); “per White House sources” is correct.

But “as per” suggests something more like “in accordance with”—as in “I always shovel my sidewalk within 24 hours of snowfall, as per local regulations.”

Using the wrong form just sounds a bit off, and when enough of these kinds of errors pile up—even if readers don’t consciously register them—it erodes a publication’s readability and credibility.

Vitamin supplements are growing in popularity, with vitamin C being one of the most popular.

I’ve written about this before—you can’t just stick “with,” a noun, and a gerund on the end of a sentence and call it a modifying clause. A well-formed participial phrase begins with a participle (and even well-formed ones should be used sparingly). ChatGPT uses this construction all the time, no doubt taking its cues from writing that was poorly copyedited or not copyedited at all.

At the risk of giving away one of my trade secrets: everything I copyedit, I read out loud to myself so I can hear how it sounds. And every time I reach a “with” + noun + gerund construction like this…well, if you spoke like that in real life, people would look at you funny, too.

When labor markets soften, the importance of strong technical skills becomes increasingly critical.

Sounds reasonable. Unfortunately, this sentence contains an error that actually renders it false—because the skills themselves become more critical, not their importance. The importance per se is just a measure of how critical those skills are. While this sentence is still comprehensible, the fact that its subject noun is completely superfluous to its meaning sucks all the life out of it—it’s like gesturing to empty space, rather than to the thing you want the reader to focus on.

(This is as good a place as any to point out that I’ve never seen ChatGPT use a dash correctly. I used two ems in the paragraph above, both for emphasis, to mimic the cadence of human speech. If ChatGPT is capable of doing this, I’ve yet to observe it.)

The persistent conflict underscores the necessity for leaders to adjust to unstable situations.

Again, this statement might look reasonable at first glance. But by the hundredth time you’ve witnessed ChatGPT use the word “underscores” to describe the relationship between two things, it becomes all you can see. And if things aren’t “underscoring” things, they’re “highlighting” them; ChatGPT is obsessed with things “underscoring” and “highlighting” each other to the exclusion of all other words, as if the universe were powered by emphasis alone.

Both of these words have specific meanings, but ChatGPT throws them around like conjunctions—likely thanks to their ability to imply some kind of vague correlation that almost makes sense, if you don’t think about it too hard.

What does it mean to “underscore the necessity” of something? A better way of phrasing this might be “The persistent conflict makes it necessary for leaders to adjust to unstable situations.” That’s still not a very good sentence, but at least causality is now stated outright, rather than implied.

Inflation is not only impacting the cost of consumer goods but also the labor used to manufacture and distribute them.

Human writers produce sentences like this quite often—too often!

Its problem is a common one: there’s a breakdown in parallel construction. When you see phrases like “both X and Y” or “not only X but also Y,” these X–Y pairs (or lists) of items need to be set up so that each item takes the same grammatical form and reads naturally off the same lead-in.

In this example (simplified as “not only impacting X but also Y”), “not only” precedes the present participle “impacting.” This sets us up for another participle after “but also”—but there is no participle there, only a plain old noun phrase. The human brain can compensate for this imbalance—but it takes effort, and the more work you make your reader do to interpret a faulty text, the more wearying and unpleasant the experience is.

Sad to say, ChatGPT makes errors like this all the time, and as long as it’s trained on writing that includes them, it will probably continue to do so.

But even if its writing were grammatically correct, it would still lack a distinctive voice. ChatGPT can only mimic what it has already read, and so much of what’s out there is not worth imitating.
