The Future of Poetry
“Write a poem about a sunrise.” I gave this prompt to three AI chatbots (OpenAI’s ChatGPT-4, Google’s Bard, and Anthropic’s Claude) and to myself, an 8th grade human. I then asked a panel of 38 AI experts and 39 English experts to judge the results. Is AI smarter than an 8th grader? And the survey says… AI is not smarter than an 8th grader, at least not yet. The 8th grader won 1st place, and by a wider margin when judged by English experts. Bard, ChatGPT-4, and Claude came in 2nd, 3rd, and 4th places, respectively, both in writing quality and in their ability to fool the judges into believing their poems were authored by a human. Most strikingly, English experts were far better at discerning which poems were written by AI: 11 English experts vs. only 3 AI experts correctly guessed the author (human vs. AI) of all four poems. This points to a need for English experts to play a greater role in helping shape future versions of AI technology.
With the explosive popularity of large language models (LLMs), much has been written about AI claiming the roles of human writers and, with that, the loss of authentic human creativity. Personally, I’ve been working on a creative writing project: a collection of short fiction pieces and poetry, a few of which I have submitted for publication.
Recently, in response to one of my submissions, an editor wrote, “The meter is exceptionally sharp on this poem, which is unusual for high school students, let alone someone in eighth grade. Please sign this statement attesting you did not use AI in any way to write this poem.” I felt both flattered and slighted, a strange combination, but most of all, startled.
I then decided to add an offshoot to my ongoing creative writing project: a closer look at how well AI can create authentic writing. For my study, I chose to focus on poetry. Compared with other kinds of writing, poetry is significantly more challenging for AI to generate authentically. For example, Harvard student Maya Bodnick found that AI-generated essays easily passed all her freshman year classes. But unlike essays, poetry has human emotion as a major component, which AI intrinsically lacks. Keith Holyoak in the MIT Press Reader writes that “poetry may serve as a kind of canary in the coal mine — an early indicator of the extent to which AI promises (threatens?) to challenge humans as artistic creators.”
The experiment
How well can AI write poetry? In February 2023, Walt Hunter in The Atlantic examined AI poetry, concluding that AI poems were clichéd and full of wince-worthy rhymes. I wanted to see how AI capabilities had changed roughly a year later. Above all, I wanted to learn more about the implications for the future of poetry, and of creativity in general. I was interested in three questions:
- Turing Test: Can people correctly detect when poems are generated by AI?
- Are poems generated by AI actually high-quality poems?
- Is there a difference in judgment between English experts and AI experts?