What can we do about the increasingly sophisticated AI-generated content in our lives?
In my prior column, I established how AI-generated content is expanding online, and described scenarios to illustrate why it’s occurring. (Please read that before you go on here!) Let’s turn now to what the impact is, and what possibilities the future might hold.
Human beings are social creatures, and visual ones as well. We learn about our world through images and language, and we use visual inputs to shape how we think and understand concepts. We are shaped by our surroundings, whether we want to be or not.
Accordingly, no matter how consciously aware we are of AI-generated content in our own ecosystems of media consumption, our subconscious response and reaction to that content will not be fully within our control. As the truism goes, everyone thinks they’re immune to advertising: they’re too smart to be led by the nose by some ad executive. But advertising continues! Why? Because it works. It inclines people to make purchasing choices that they otherwise wouldn’t have, whether by increasing brand visibility, appealing to emotion, or using any other advertising technique.
AI-generated content may end up being similar, albeit in a less controlled way. We’re all inclined to believe we’re not being fooled by some bot with an LLM generating text in a chat box, but in subtle or overt ways, we’re being affected by the continued exposure. As alarming as it may be that advertising really does work on us, consider that with advertising the subconscious or subtle effects are designed and intentionally driven by ad creators. In the case of generative AI, a great deal of what goes into creating the content, no matter what its purpose, comes from an algorithm that draws on the historical information in its training data to choose the features most likely to appeal, and human actors are less in control of what that model generates.
I mean to say that the results of generative AI routinely surprise us, because we’re not that well attuned to what our history really says, and we often don’t think of edge cases or interpretations of prompts we write. The patterns that AI is uncovering in the data are sometimes completely invisible to human beings, and we can’t control how these patterns influence the output. As a result, our thinking and understanding are being influenced by models that we don’t completely understand and can’t always control.
Beyond that, as I’ve mentioned, public critical thinking and critical media consumption skills are struggling to keep pace with AI-generated content, and so are not giving us the ability to be as discerning and thoughtful as the situation demands. As we did after the arrival of Photoshop, we need to adapt, but it’s unclear whether we have the ability to do so.
We are all learning telltale signs of AI-generated content, such as certain visual clues in images, or phrasing choices in text. The average internet user today has learned a huge amount in just a few years about what AI-generated content is and what it looks like. However, suppliers of the models used to create this content are trying to improve their performance to make such clues subtler, attempting to close the gap between obviously AI-generated and obviously human-produced media. We’re in a race with AI companies, to see whether they can make more sophisticated models faster than we can learn to spot their output.
In this race, it’s unclear if we will catch up, as people’s perceptions of patterns and aesthetic data have limitations. (If you’re skeptical, try your hand at detecting AI-generated text: https://roft.io/) We can’t examine images down to the pixel level the way a model can. We can’t independently analyze word choices and frequencies throughout a document at a glance. We can and should build tools that help do this work for us, and there are some promising approaches for this, but when it’s just us facing an image, a video, or a paragraph, it’s just our eyes and brains versus the content. Can we win? Right now, we often don’t. People are fooled every day by AI-generated content, and for every piece that gets debunked or revealed, there must be many that slip past us unnoticed.
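As a small illustration of the kind of help a tool can offer, here is a minimal Python sketch that counts word choices and frequencies across a text, something we can’t do at a glance. It is not an AI detector, and it is not drawn from any of the approaches mentioned above; the function name `word_stats`, the sample sentence, and the simple tokenizing rule are all assumptions made for illustration.

```python
import re
from collections import Counter


def word_stats(text: str, top_n: int = 5) -> dict:
    """Summarize word usage in a text (illustrative helper, not a detector)."""
    # Assumed tokenizing rule: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words)
    return {
        "total_words": total,
        "unique_words": len(counts),
        # Share of distinct words; 0.0 for empty input to avoid dividing by zero.
        "type_token_ratio": len(counts) / total if total else 0.0,
        "most_common": counts.most_common(top_n),
    }


# Hypothetical sample text, invented for this example.
sample = "The race is on. The models improve, and the race continues."
stats = word_stats(sample)
print(stats)
```

Numbers like these say nothing on their own about whether a person or a model wrote the text. They only show the sort of whole-document measurement that software can do instantly and a human reader cannot.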
One takeaway to keep in mind is that it’s not just a matter of “people need to be more discerning”. It’s not as simple as that, and if you don’t catch every piece of AI-generated material or every deepfake that crosses your path, it’s not all your fault. This is being made increasingly difficult on purpose.
So, living in this reality, we have to cope with a disturbing fact. We can’t trust what we see, at least not in the way we have become accustomed to. In a lot of ways, however, this isn’t that new. As I described in the first part of this series, we kind of know, deep down, that photographs may be manipulated to change how we interpret them and how we perceive events. Hoaxes have been perpetrated through newspapers and radio since their invention as well. But it’s a little different because of the race: the hoaxes are coming fast and furious, always getting a little more sophisticated and a little harder to spot.
There’s also an additional layer of complexity in the fact that a large amount of the AI-generated content we see, particularly on social media, is being created and posted by bots (or agents, in the new generative AI parlance), for engagement farming, clickbait, scams, and other purposes, as I discussed in part 1 of this series. Frequently we are quite a few steps removed from the person responsible for the content we’re seeing, who used models and automation as tools to produce it. This obfuscates the origins of the content, and can make it harder to infer the artificiality of the content from context clues. If, for example, a post or image seems too good (or weird) to be true, I might investigate the motives of the poster to help me figure out if I should be skeptical. Does the user have a credible history, or institutional affiliations that inspire trust? But what if the poster is a fake account, with an AI-generated profile picture and fake name? It only adds to the challenge for a regular person trying to spot the artificiality and avoid a scam, deepfake, or fraud.
As an aside, I also think there’s general harm from our continued exposure to unlabeled bot content. When we get more and more social media in front of us that is fake and the “users” are plausibly convincing bots, we can end up dehumanizing all social media engagement outside of people we know in analog life. People already struggle to humanize and empathize through computer screens, hence the longstanding problems with abuse and mistreatment online in comments sections, on social media threads, and so on. Is there a risk that people’s numbness to humanity online worsens, and degrades the way they respond to people and models/bots/computers?
How do we as a society respond, to try and prevent being taken in by AI-generated fictions? There’s no amount of individual effort or “do your homework” that can necessarily get us out of this. The patterns and clues in AI-generated content may be undetectable to the human eye, and even undetectable to the person who built the model. Where you might normally do online searches to validate what you see or read, those searches are heavily populated with AI-generated content themselves, so they are increasingly no more trustworthy than anything else. We absolutely need photographs, videos, text, and music to learn about the world around us, as well as to connect with each other and understand the broader human experience. Even though this pool of material is becoming poisoned, we can’t quit using it.
There are a number of possibilities for what I think might come next that could help with this dilemma.
- AI declines in popularity or fails due to resource issues. There are a lot of factors that threaten the growth and expansion of generative AI commercially, and these are mostly not mutually exclusive. Generative AI very possibly could suffer some degree of collapse due to AI-generated content infiltrating the training datasets. Economic and/or environmental challenges (insufficient power, natural resources, or capital for investment) could all slow down or hinder the expansion of AI generation systems. Even if these issues don’t affect the commercialization of generative AI, they could create barriers to the technology progressing further past the point of easy human detection.
- Organic content becomes premium and gains new market appeal. If we are swarmed with AI-generated content, it becomes cheap and low quality, while the scarcity of organic, human-produced content may drive demand for it. In addition, there is already significant growth in backlash against AI. When customers and consumers find AI-generated material off-putting, companies will move to adapt. This aligns with some arguments that AI is in a bubble, and that the excessive hype will die down in time.
- Technological work challenges the negative effects of AI. Detector models and algorithms will be necessary to differentiate organic and generated content where we can’t do it ourselves, and work is already going on in this direction. As generative AI grows in sophistication, making this necessary, a commercial and social market for these detector models may develop. These models need to become a lot more accurate than they are today for this to be possible. We don’t want to rely upon notably bad models like those being used today to identify generative AI content in student essays in educational institutions. But a lot of work is being done in this space, so there’s reason for hope. (I have included a few research papers on these topics in the notes at the end of this article.)
- Regulatory efforts expand and gain sophistication. Regulatory frameworks may develop sufficiently to be helpful in reining in the excesses and abuses generative AI enables. Establishing accountability and provenance for AI agents and bots would be a massively positive step. However, all this relies on the effectiveness of governments around the world, which is always uncertain. We know big tech companies are intent on fighting against regulatory obligations and have immense resources to do so.
I think it very unlikely that generative AI will continue to gain sophistication at the rate seen in 2022–2023, unless a significantly different training methodology is developed. We are running short of organic training data, and throwing more data at the problem is showing diminishing returns, for exorbitant costs. I am concerned about the ubiquity of AI-generated content, but I (optimistically) don’t think these technologies are going to advance at more than a slow incremental rate going forward, for reasons I have written about before.
This means our efforts to moderate the negative externalities of generative AI have a pretty clear target. While we continue to struggle to detect AI-generated content, we have a chance to catch up if technologists and regulators put the effort in. I also think it is vital that we work to counteract the cynicism this AI “slop” inspires. I love machine learning, and I’m very glad to be a part of this field, but I’m also a sociologist and a citizen, and we need to take care of our communities and our world as well as pursuing technical progress.