AI backwash

If you have not yet listened to Shell Game by Evan Ratliff, go download it and treat yourself. It is funny, profound and scary in equal measure. In Shell Game, tech journo Evan Ratliff clones his voice using ElevenLabs, uses Vapi to build an AI agent that connects to telephone infrastructure, and powers its conversation with GPT-4, all the way back in 2024. The agent sounds like him and represents him, so from this point on I will call it "Agent Evan".

Real Evan sends Agent Evan out into the world to interact conversationally on his behalf with his long-suffering wife, banking customer-service reps and telephone scammers. The podcast presents unredacted, hilarious and bizarre interactions between humans and Agent Evan.

Then, Real Evan steps into the surreal by sending multiple Agent Evans into a room to chat together, in what can only be described as a three-way hall of mirrors producing some of the weirdest conversations I have ever heard. For me, this episode had the most lolz and raised the most questions. I highly recommend it, not just for the laughs and the mega uncanny valley, but also for the fascinating observations Real Evan makes about the limitations of AI and its capacity to affect our society, our very being, even our wiring.

Halfway through the season, Real Evan sounds different. Irritating even. In fact, Real Evan comments on his own real-world voice. He and those around him noticed he had begun to adopt a weird flat affect. I heard it just before he commented on it himself: he had begun to sound like Agent Evan, and it sounded soooo off.

I analysed what was so creepy about it and narrowed it down to two specific features of Real Evan's new voice:

Weird pauses and pacing: his pauses became unnatural, falling in weird places or running too long or too short. His pacing was too even, without the natural speeding up and slowing down humans do; this was possibly the most uncanny aspect.

Flat affect: Real Evan developed an emotionally flat delivery with an undefinable, hollow quality.

Thankfully, Real Evan never got as far as displaying latency; he was sounding weird enough already.

It is only one example, but it was so powerful to listen to this happening to someone. For months I had observed myself and my colleagues using ChatGPT and was aware that the way we write was starting to sound a little like ChatGPT. For months people had been "deep-diving" and "navigating", and things had been "quietly" "seamless".

As humans, we adapt to norms and model them. It is no surprise that we begin to mirror our environment, including how it sounds. We talk about Generative AI and forget that while we are prompters, we are also consumers of what we generate. We are also consuming everyone else's GenAI content, and adjusting to that norm. We adopt the style and voice we are exposed to. That may have consequences for who we are, and for how we think and relate to one another, especially in a workplace setting. These consequences are the "AI backwash".

Before I go into these "backwash" consequences, let me just provide a super simplistic explanation of how AI works. GenAI, like ChatGPT, is trained on very large amounts of language data and learns statistical patterns in how words tend to follow one another. We give it a prompt, and it generates a response by estimating what is most likely to come next based on those learned associations: it makes semantic connections, matches patterns, and produces a result. It doesn't replicate nuanced contextual thinking... yet. When most of us are not using particularly sophisticated or layered prompts, what we are often getting back, and then consuming, is what is most statistically typical. So what we are beginning to mirror is statistically typical. This has been noticed and named as a homogenising effect, and academics disagree about whether it is reducing linguistic diversity¹ or simply spreading the adoption of a style.²
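The "statistically typical" point can be made concrete with a toy sketch. This is not how a real LLM works (real models use neural networks over vast corpora, not word-pair counts), but a minimal bigram model shows the underlying idea: given a word, pick the continuation that appeared most often in the training text. The corpus below is invented for illustration.

```python
from collections import Counter, defaultdict

# Toy illustration only: count which word tends to follow which
# in a tiny made-up "corporate" training text.
corpus = (
    "we deep dive into the data . "
    "we deep dive into the details . "
    "we quietly deliver seamless results . "
    "we quietly deliver seamless value ."
).split()

next_counts = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    next_counts[a][b] += 1

def most_typical_next(word):
    """Greedy pick: the single most common continuation seen in training."""
    return next_counts[word].most_common(1)[0][0]

print(most_typical_next("deep"))     # 'dive' -- the typical pattern wins
print(most_typical_next("quietly"))  # 'deliver'
```

Whatever phrasing dominates the training data dominates the output; feed it enough "deep dives" and "seamless" delivery, and that is what comes back out. Scale that up and you get the homogenising effect the papers describe.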

For me, the backwash isn't necessarily that it is making us thick, although concerns about brain rot and passivity are not unfounded; it is that we may be losing some originality of voice, some personality on the page and, worse, some confidence in our ability to express ourselves in our own voice, just in case it isn't good enough. In other words, we may be starting to standardise ourselves to fit a corporate average, and losing our confidence in who we are in the process. Corporate norming and editing the self existed long before LLMs, of course; but what's changed is the speed and scale of the norming. We are "Starbucksing" our workplace selves and losing confidence along the way.

If you are using it for work correspondence (and I often do) it may make you more agreeable, less direct, and more conforming to a corporate environment. And while that is helpful in some contexts, it can also result in self-repression. That in turn means that in face-to-face interactions there could be a surprising distance between who you are and how you have been appearing in correspondence. If you persistently use AI to rewrite your emails and send the output unedited, your uncanny-valley AI prose may in fact leave you perceived as worse than if you had never used AI at all, your tone seen as flat and insincere.

If we are collectively using AI for our difficult conversations or standard correspondence we may establish a new norm, and it becomes at least possible that certain workplaces begin to expect atypical communicators and bigger personalities to AI-modify their tone and expression as standard.

And if AI is shaping how we express ourselves, it's a short step to letting it shape what we think and do.

Let's use fictional Kevin and Paula: they are in a relationship and communicate via WhatsApp. Kevin is slightly anxious, so he exports his and Paula's chat and uploads it into ChatGPT. He prompts: "You are a Jungian psychoanalyst with a specialist interest in attachment theory. Analyse the information provided and comment on the following: Who is more emotionally invested in this relationship? Who initiates contact the most? Who is doing the emotional heavy lifting? What does this say about the relationship?"

So far, so ethically questionable, but plenty of people are doing this, privately. After the answer is received, Kevin feels more anxious; his fears appear confirmed. Kevin feels he is putting in the hard work and Paula is coasting in the relationship. Yet ChatGPT has only generated a response based on patterns learned from many similar analyses.

Now comes the dangerous moment.

Kev is now fully engaged and wants to know more: "Does Paula even care? Should I stay with her? What's the probable outcome?" I needn't say what the typical ChatGPT reply looks like, but it seems to offer Kev a coherent and persuasive narrative. Kev has fully farmed out his judgement.

And although a break-up may well have been likely without GPT, AI does not have full context for the chats beyond what is shared, does not know the Kev-Paula dyad, and has very limited information on Paula. Yet the story it produces can still engage the user's emotions, skew judgement and influence how the next few weeks unfold for Paula and Kevin.

Fictional Kevin has changed his course of action, his mood, and handed over his judgement to a system, affecting two fictional lives. Yet similar scenarios are happening with increasing frequency IRL. Not only does it raise questions about privacy and mutual trust, but AI third-party interference in real life is largely invisible.

Whether it is "how do I please my boss" or "what should I text back to this bully/boyfriend/co-worker", grown adults who previously navigated their lives and difficulties themselves are increasingly relying on AI to help them navigate moments of uncertainty. Skilled, intentional users can resist this, but most people are busy or too emotionally engaged to notice. Real Evan noticed, eventually. Most of us won't have a podcast to play it back to us.

So what happens when children do this, people who have not yet had much time to learn about life? How will they develop their own judgement and conflict-negotiation skills? How do they build confidence in their own identity or self-reflection if a bot is continuously doing some of that work for them? These questions are still open, but they are not trivial.

Research suggests that children are even more susceptible than adults to treating AI as a trustworthy, human-like confidante. One study found that children described an educational chatbot as "knowledgeable" and "smart", displaying high levels of trust in AI as a source of information.³ We as teachers also know that teens are asking ChatGPT and other apps what to do about friends, they are asking bots to reply to texts or check the tone of texts for responses on social media, in addition to doing homework.

If judgement is something you acquire through experience, through getting it wrong, repairing, recalibrating, and trying again, then outsourcing it too early matters. You don't develop judgement by being told what to think; you develop it by getting used to uncertainty, ambiguity, misreading situations, and learning the consequences.

So what might help?

School is a place where we can learn, not just about soft skills, but about judgement itself. Disagreement, misinterpretation, emotional regulation, and repair are all necessary to develop judgement and they shouldn't be smoothed over by AI as a third party in human affairs. I don't think we should be banning tools, but instead we have to be explicit about what they are not for. A chatbot can help you write, summarise, or practise, but it should not be positioned as an authority on who you are, what someone else feels, or what you should do next. We need classes to teach that, and to focus more on human relationships.

Children will need to be taught to tolerate not knowing, to be OK with ambiguity, and to understand that we can't immediately resolve everything into a neat narrative. Speaking, writing and responding in your own voice is imperfect, uneven, maybe even risky, but this is how judgement, confidence and authenticity form.

Once a generation grows up fluent in outsourcing uncertainty, it will not just be their writing that sounds the same. It will be their decisions.


References

  1. Moon, K., Green, A.E. and Kushlev, K. (2025) 'Homogenizing effect of large language models (LLMs) on creative diversity: An empirical comparison of human and ChatGPT writing', Computers in Human Behavior: Artificial Humans, 6, 100207. Available at: https://doi.org/10.1016/j.chbah.2025.100207
  2. Fitterer, S., Gangl, D. and Ulbrich, J. (2025) 'Testing English News Articles for Lexical Homogenization Due to Widespread Use of Large Language Models', in Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), Vienna, Austria, pp. 1239–1245. Available at: https://aclanthology.org/2025.acl-srw.95/
  3. Vahedian Movahed, S. and Martin, F.G. (2025) 'Ask Me Anything: Exploring Children's Attitudes Toward an Age-tailored AI-powered Chatbot', International Journal of Artificial Intelligence in Education, 35, pp. 3979–4001. Available at: https://doi.org/10.1007/s40593-025-00523-4