April 2024 | This Month in Generative AI: Deepfakes, Real Consequences
News and trends shaping our understanding of generative AI technology and its applications.
For some time we have been talking about the challenges of a world in which generative AI makes it difficult to distinguish the real from the fake. In this month’s post, I will discuss a disturbing case that in my view didn't get enough attention or coverage and should serve as a cautionary tale.
Mayor Sadiq Khan
In November 2023, an AI-generated audio clip of London Mayor Sadiq Khan was released. In it, Khan was heard calling for postponement of Remembrance Week to make room for pro-Palestinian marches. (Remembrance Day in the UK honors members of the armed forces who have died in the line of duty.)
On BBC Radio 4, Khan said, "You know, we did get concerned very quickly about what impression it may create. I've got to be honest, it did sound a lot like me." Khan was right to be concerned. The audio clip quickly spread online, particularly among far-right groups. It triggered a wave of hateful and racist comments directed at Khan, the first Muslim mayor of London. The clip eventually led to over 1,000 protesters clashing with the police in the streets of London.
The Analysis
In the post-mortem, I analyzed the audio using two different techniques designed to distinguish real from AI-generated voices. Both of these techniques confidently flagged the audio as AI-generated. No amount of fact-checking, however, was enough to convince the person who first posted the clip. After being told by the BBC that the audio was fake, the person responded by saying, "It's what we all know Sadiq thinks."
Beyond Deepfakes
Although generative AI was at the core of this event, it was not the sole contributor. First, we have to take a hard look at the online distribution channels over which lies and conspiracies spread and are amplified. The mayor said it was "really worrying" that social media companies including Instagram, TikTok, and X did not contact him or the authorities about the faked audio at the time it went viral. Although TikTok "does not allow synthetic media that contains the likeness of any real private figure," it and other social media platforms appear to be ill-equipped to rapidly find this type of violating material and remove it or limit its spread.
Second, if the service used to create this fake audio had deployed Content Credentials, then it would have been easier to rapidly expose this audio as fake, for both consumers and the social media platforms themselves.
Content Credentials is a technology developed by the C2PA, a standards organization composed of ~100 companies working to build and implement a global standard for digital content provenance. The three pillars of durable Content Credentials implementation are cryptographically secure metadata, invisible watermarking, and cloud-based fingerprinting. Content Credentials allow consumers to see where and how digital files were created and edited.
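To make the first of those pillars concrete, here is a minimal sketch of what "cryptographically secure metadata" means in principle: a provenance record is bound to the file by a hash of its bytes, and the record is signed so that tampering with either one is detectable. This is an illustration only and not the C2PA format. The field names, the service name, and the shared key are all hypothetical, and the HMAC below is a standard-library stand-in for the certificate-based digital signatures that real Content Credentials rely on. Watermarking and fingerprinting are not shown.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret, for illustration only. A real system would
# sign with a private key tied to a certificate, not a shared key.
SIGNING_KEY = b"demo-key-not-for-production"


def make_manifest(content: bytes, generator: str) -> dict:
    """Build a toy provenance record bound to the content by its SHA-256 hash."""
    claim = {
        "generator": generator,  # hypothetical field names
        "ai_generated": True,
        "content_sha256": hashlib.sha256(content).hexdigest(),
    }
    # Serialize deterministically so signer and verifier hash the same bytes.
    payload = json.dumps(claim, sort_keys=True).encode("utf-8")
    signature = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return {"claim": claim, "signature": signature}


def verify_manifest(content: bytes, manifest: dict) -> bool:
    """Return True only if the record is untampered and matches the content."""
    payload = json.dumps(manifest["claim"], sort_keys=True).encode("utf-8")
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, manifest["signature"]):
        return False  # the record itself was altered
    actual_hash = hashlib.sha256(content).hexdigest()
    return manifest["claim"]["content_sha256"] == actual_hash


audio = b"placeholder bytes standing in for an audio file"
manifest = make_manifest(audio, "hypothetical-voice-service")
print(verify_manifest(audio, manifest))            # unmodified file
print(verify_manifest(audio + b"edit", manifest))  # file changed after signing
```

In this toy version, the first check succeeds and the second fails because the edited bytes no longer match the signed hash. The point of the sketch is the binding: a label that says "AI-generated" is only useful if it cannot be silently altered or attached to a different file.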
Third, our current regulatory system is not prepared or equipped to handle this new era of generative AI. There is, for example, no criminal law in the UK which specifically covers the scenario involving Mayor Khan. Sooner rather than later, we will need to put in place regulatory guardrails that prevent this type of abuse while still preserving individual rights of expression. This will surely involve juggling ideals that at times conflict.
A governing principle that I have been advocating around the use of generative AI to mimic a person's voice or likeness is "consent and disclosure": a person depicted should consent to their likeness being used, and any AI-generated content should be clearly labeled as such.
Fourth, fake content often gains traction when it conforms to preconceived biases and notions. In this case, the fake audio of Mayor Khan played into the intolerance and distrust of others that has become all too common. We have to learn how to better disagree with each other without demonizing those with whom we don't agree. As a technologist, I am wholly disqualified to opine about how to accomplish this, but I believe that the technological and regulatory guardrails enumerated above are necessary but not entirely sufficient to achieve this goal.
Moving Forward
The paradox of our digital world is that the internet was designed to democratize access to information and knowledge. It did. But it democratized access both to good and entertaining information and to bad, deceptive, and hateful malinformation. And because of the nature of the attention economy that drives social media, the latter seems to generally gain more traction and visibility than the former.
It is said that generative AI will democratize access to creating content. But in the same way, this democratization does not distinguish between the good, the bad, and the ugly. Perhaps we were naive in the early days of the internet, but we should, at a minimum, learn from our past mistakes and do better in this next stage of the technology revolution, working hard to create a landscape in which technology works for our betterment, not our downfall.
Author bio: Professor Hany Farid is a world-renowned expert in the field of misinformation, disinformation, and digital forensics. He joined the Content Authenticity Initiative (CAI) as an advisor in June 2023. The CAI is an Adobe-led community of media and tech companies, NGOs, academics, and others working to promote adoption of the open industry standard for content authenticity and provenance.
Professor Farid teaches at the University of California, Berkeley, with a joint appointment in electrical engineering and computer sciences and the School of Information. He’s also a member of the Berkeley Artificial Intelligence Lab, Berkeley Institute for Data Science, Center for Innovation in Vision and Optics, Development Engineering Program, and Vision Science Program, and he’s a senior faculty advisor for the Center for Long-Term Cybersecurity. His research focuses on digital forensics, forensic science, misinformation, image analysis, and human perception.
He received his undergraduate degree in computer science and applied mathematics from the University of Rochester in 1989, his M.S. in computer science from SUNY Albany, and his Ph.D. in computer science from the University of Pennsylvania in 1997. Following a two-year post-doctoral fellowship in brain and cognitive sciences at MIT, he joined the faculty at Dartmouth College in 1999 where he remained until 2019.
Professor Farid is the recipient of an Alfred P. Sloan Fellowship and a John Simon Guggenheim Fellowship, and he’s a fellow of the National Academy of Inventors.