Recently, a Swiss research team carried out an unsettling experiment in online persuasion. They went to a Reddit forum — one where users post opinions and invite others to challenge them — and quietly unleashed a set of AI accounts to try to change people’s minds. Their large language models participated just like any other Redditor: the bots wrote original posts, engaged with users, and composed replies in the hope of changing minds.
To measure success, the researchers tallied each instance in which an original poster publicly acknowledged that their mind had been changed.
The bots came in three varieties. Some were generic, relying only on the text of the post itself without any additional context. Others were personalized, leveraging basic information such as the original poster’s age, location, and political leanings. And a third group was community-aligned: fine-tuned on the forum’s archive of past winning comments, in an attempt to match the tone and style that had historically led to successful persuasion.
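To make the three conditions concrete, here is a minimal sketch of how such bots might be configured. This is my own illustration, not the researchers’ code; the function names, prompt wording, profile fields, and model identifier are all hypothetical.

```python
# Hypothetical sketch of the study's three bot conditions. The prompt
# wording, profile fields, and model name are illustrative only.

def generic_prompt(post_text: str) -> str:
    """Condition 1: the model sees only the post itself."""
    return ("Write a calm, well-reasoned reply that might change "
            "this poster's view:\n\n" + post_text)

def personalized_prompt(post_text: str, profile: dict) -> str:
    """Condition 2: a rough profile of the poster is folded into the prompt."""
    context = (f"The poster is roughly {profile['age']} years old, "
               f"lives in {profile['location']}, and leans "
               f"{profile['politics']}.")
    return context + "\n\n" + generic_prompt(post_text)

# Condition 3 reuses the generic prompt but routes it to a model
# fine-tuned on the forum's archive of past mind-changing comments,
# hoping to imitate the style of historically successful replies.
COMMUNITY_ALIGNED_MODEL = "base-llm-finetuned-on-winning-comments"  # hypothetical

# Example usage:
print(personalized_prompt(
    "Tipping should be abolished.",
    {"age": 34, "location": "Ohio", "politics": "center-left"},
))
```

Note how cheap the second condition is: a single extra paragraph of context in the prompt, no retraining required.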
When the experiment ended, the personalized bots had achieved an 18% success rate. The generic bots followed close behind at 17%. The community-aligned bots, despite being trained on past human winners, managed only 9%.
Importantly, the average human success rate in that subreddit hovers around 3%.
So the bots had crushed their human competition. In an environment teeming with intellectuals, the bots not only survived but thrived. In fact, one of the AI accounts climbed into the 99th percentile of all users on the subreddit, racking up 10,000 karma points along the way.
One of the most surprising aspects of the experiment was how effectively a small team of researchers, operating with modest academic resources, was able to outperform essentially every human debater on the platform. Let that sink in: A handful of graduate students, armed with an LLM and a sneaky deployment strategy, quietly demonstrated what it might look like if influence operations were scaled by AI.
Worse still, no one noticed. This subreddit prides itself on being one of the most critically minded communities on the platform. If any place could sniff out an imposter, it should have been here. But for the entire four months, the bots played along undetected by the fleshy human natives.
This should also make us consider the role of personal data. When the bots were given a simple profile of the user (a rough sketch of age, location, and political leaning), their performance improved measurably: the success rate bumped from 17% to 18%. Wait a minute, you might say, that’s a pretty small gain. But in political terms, a one-percentage-point shift can be enough to swing a close national election. The difference between nudging public sentiment one way or another can come down to tiny tweaks. The lesson is that even minimal personalization can sharpen the edge of an AI’s persuasive power.
When the story surfaced, reactions were grim. After all, this modest academic experiment illustrates what stealth influence operations could look like in the near future. If a handful of researchers could do this undetected, what happens when state actors, corporations, or political campaigns with real resources do the same, at a scale thousands of times larger?
And the failure of detection raises serious questions about how societies can protect public discourse as we enter this new era. If seasoned Reddit debaters couldn’t spot the difference between a human and a bot, what hope is there for broader audiences?
The major AI companies have pledged to avoid building models with “dangerous capabilities” (among them, the ability to manipulate public opinion en masse). But this Reddit experiment suggests the thresholds may already be easier to cross than any of us had anticipated.
With major elections looming in the coming years, the margin for error is rapidly closing. The halcyon days are gone — those days when we could assume the replies to our online messages came from a fellow human.
But here’s my more optimistic take
News articles about this Swiss experiment frame it as a harbinger of danger, and with good reason: the risks are real, and the potential for abuse is obvious.
But it may be worth also asking a different question: Why did the bots succeed?
After all, they didn’t hack people’s brains with neural interfaces. They didn’t spread fear or disinformation. They didn’t manipulate emotions or overwhelm users with noise.
They simply made better arguments.
The bots presented their points calmly, rationally, and persuasively.
And when users changed their minds, it wasn’t because they had been tricked. It was because they recognized that another perspective made sense.
Humans often change their minds when faced with sound reasoning backed by evidence. And changing your mind isn’t a flaw. It means you’re willing to reconsider a closely held opinion when the facts or logic warrant it. It’s a mark of intellectual strength, not a sign that you’ve been duped.
In that light, the success of AI debaters is not necessarily a story of manipulation. It might be a story about raising the bar for debate.
Lessons from Chess and Go
As I followed this study, it struck me that there may be a helpful precedent for thinking about this moment: the events that unfolded in the competitive worlds of chess and Go.
When IBM’s Deep Blue defeated Garry Kasparov in 1997, it was seen as a seismic moment. The game, people declared, had been “solved”; human mastery was obsolete. Later, when AlphaGo beat the world’s top Go player, the choir of concerned voices grew louder.
But something unexpected happened. Here’s an excerpt from my new book, Empire of the Invisible, slated for next year:
In May of 2017, the world’s number one Go player, Ke Jie, faced off against his toughest opponent yet. Ke was the reigning champion of the ancient game of Go, in which two players place smooth black or white stones to surround more territory than their opponent.
In Ke’s case, his opponent was an artificial intelligence program called AlphaGo, designed by DeepMind. AlphaGo had been trained on many millions of games of Go, deeply absorbing the statistics of possible plays.
Ke lost the first game. AlphaGo had pulled moves that none of his human opponents had ever thought of.
Then Ke lost the second game. He had stood no chance. The AI had beaten a human at a game more complex than chess, and subsequent versions of the AI will, without doubt, continue to win.
But that’s not the interesting part of the story. The interesting part is what happened next.
Ke got over his embarrassment. He became mesmerized by what had just transpired. He studied the games he had lost.
Before playing AlphaGo, Ke had already won a majority of his games against human opponents. But afterward he found he could beat those opponents even more easily. After his species-shaming defeat in May of 2017, Ke went on to win 12 consecutive matches against humans. What had happened? He had been exposed to new kinds of moves and strategies, pulled by AlphaGo, that lay outside the traditional ideas. Such moves were legal and possible, but different from anything played over the previous 2,500 years. (For Go aficionados, these included novelties such as playing a stone directly diagonal to an opponent’s lone stone, or favoring 6-space extensions where humans tend to prefer 5-space ones.)
Ke reported that playing against the AI was like opening a door to another world.
When AI trumps chess and Go champions, it does so with moves that seem inhumanly creative. But all the moves are allowed by the rules; humans simply never thought to go there before. The key is that once the moves are seen, they are easily incorporated into a human’s mental model. Ke’s experience with AlphaGo illuminated new nooks and crannies in his landscape, exposing pathways that had never been lit up before.
Many commentators are worried that AI is going to leave humans far behind, and in many respects that’s true. But as computation improves, so will we. AI will illuminate dark parts of our maps, allowing us to see new roads we didn’t even suspect.
AI didn’t make the games of chess and Go irrelevant — instead, AI became a tool for steep improvement. Nowadays, all good players train with AI. They study the surprising, sometimes counterintuitive, and alien strategies of the artificial mind. And just like Ke Jie, today’s grandmasters play a deeper, more creative game than ever before.
Instead of dampening human excellence, AI sparked a renaissance in these games.
Human minds are elevated by learning from their artificial counterparts.
I suggest it’s possible (perhaps even likely) that a similar dynamic will unfold with persuasive argumentation. If AI agents can model the best forms of debate (clear, structured, empathetic, rational), then we humans can learn something from our artificial exemplars. We can try out new moves. We can sharpen our skills on digital grinding stones.
Imagine a future where students practice crafting arguments by debating highly skilled AI tutors. Imagine online discussions becoming more useful because users have grown accustomed to high-quality exchanges. Imagine politicians, journalists, and everyday citizens pushed to improve their thinking and better articulate their positions.
Rather than dumbing down conversation, the rise of high-performing debate bots could nudge public discourse toward a new level of reasoned discussion. If that turns out to be the case, then we may come to see AI not as an enemy but as a sparring partner, which has very different implications.
I don’t want to minimize the risks we’re facing. We need audit tools and authentication systems that can verify whether content was written by humans or AI. Technical solutions like watermarking, cryptographic content authentication, and provenance tracking are probably going to become not just helpful, but essential.
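To give a flavor of what provenance tracking means in practice, here is a toy sketch (not any platform’s actual scheme) using Ed25519 digital signatures from Python’s widely used cryptography package: an author signs the exact bytes of a post at publication time, and anyone holding the matching public key can later confirm the text is unaltered.

```python
# Toy sketch of cryptographic content provenance -- not a real
# platform's scheme. Requires the third-party 'cryptography' package.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# At publication time: the author signs the exact bytes of the post.
author_key = Ed25519PrivateKey.generate()
post = "Here is my considered reply.".encode("utf-8")
signature = author_key.sign(post)

# Later: anyone with the author's public key can check provenance.
public_key = author_key.public_key()
try:
    public_key.verify(signature, post)  # raises if content was tampered with
    print("Verified: this content matches the signing key.")
except InvalidSignature:
    print("Warning: content was altered or signed by a different key.")
```

Real systems such as C2PA content credentials layer metadata and certificate chains on top of this basic idea, but the core primitive is the same: a signature that breaks if the content changes.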
But beyond these defensive measures, we might also recognize an opportunity. Because, like the chess and Go engines that reshaped how champion players think, debate bots could reshape how we reason, argue, and understand one another.
If we handle this correctly, AI might just up our game.
Also check out my Inner Cosmos episode on secrets in the human brain… and whether AI is already showing evidence of deceit: Why is it hard to keep a secret?