February 2025 | This Month in Generative AI: Je Suis à Nouveau Multilingue (I Am Multilingual Again)


Hany Farid speaks French in an AI-generated video. Image credit: Hany Farid

When my brother and I were born in Germany (where my father was in graduate school), my parents wanted us to learn both French and Arabic for our planned return to Egypt. Geopolitics had other plans and we eventually immigrated to the United States. My parents tried to keep us speaking French and Arabic while also integrating us into American schools. We had our own plans and refused to speak anything other than English. It didn't take long for both languages to become a distant memory.

To this day, I regret not keeping up with my multilingual roots. Thanks to generative AI, however, I've become multilingual again.

Below are links to an original video of me reading the opening paragraph of this post alongside two AI-generated versions of me speaking French and German (in my voice). These videos were created automatically by combining several generative AI techniques: (1) the original audio track is first transcribed; (2) this transcription is then translated into the target language (French or German); (3) a new voice track is synthesized in my voice reading the translated transcription; and (4) a new video is generated in which my mouth in the original video is replaced so that it moves consistently with the new voice track.

[English] [French] [German]
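
For readers who think in code, the four steps can be sketched as a simple chain in which each stage feeds the next. This is only an illustrative outline under stated assumptions: the class name DubbingPipeline and the four fake_* stand-in functions are hypothetical, they do no real audio or video processing, and they do not represent the tools actually used to make these videos. In practice each stand-in would be replaced by a real speech-recognition, translation, voice-cloning, or lip-sync model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DubbingPipeline:
    """Hypothetical wiring of the four stages described in the post."""
    transcribe: Callable[[str], str]       # audio path -> source-language text
    translate: Callable[[str, str], str]   # (text, target language) -> translated text
    synthesize: Callable[[str, str], str]  # (text, reference voice audio) -> new audio
    lip_sync: Callable[[str, str], str]    # (original video, new audio) -> new video

    def run(self, video_path: str, audio_path: str, target_lang: str) -> str:
        transcript = self.transcribe(audio_path)               # step 1
        translated = self.translate(transcript, target_lang)   # step 2
        # Step 3 reuses the original audio as the voice reference,
        # so the new track sounds like the original speaker.
        new_audio = self.synthesize(translated, audio_path)
        return self.lip_sync(video_path, new_audio)            # step 4


# Stand-in stages (assumptions for illustration): they only build labels
# that record what each stage received.
def fake_transcribe(audio_path: str) -> str:
    return f"transcript({audio_path})"


def fake_translate(text: str, target_lang: str) -> str:
    return f"{target_lang}:{text}"


def fake_synthesize(text: str, voice_reference: str) -> str:
    return f"speech[{text}|voice={voice_reference}]"


def fake_lip_sync(video_path: str, audio: str) -> str:
    return f"video[{video_path}+{audio}]"


if __name__ == "__main__":
    pipeline = DubbingPipeline(
        fake_transcribe, fake_translate, fake_synthesize, fake_lip_sync
    )
    # Same source video, two target languages.
    for lang in ("fr", "de"):
        print(pipeline.run("talk.mp4", "talk.wav", lang))
```

Passing the stages in as plain functions keeps the sketch independent of any particular product: only the order of the steps, and the fact that the original audio doubles as the voice reference, comes from the description above.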

The creation of this video is impressive on many levels, but mainly because of the voice synthesis in what sounds like my voice and the nearly flawless lip-sync video generation. My mother – a former French school teacher – had this to say: "It is even Parisian French, Just amazing." On the other hand, my younger French colleague Tina Nikoukhah says that my accent seems dated.

By comparison, just about a year ago, Maty Bohacek and I created a lip-sync deepfake of CNN's Anderson Cooper that was good enough to air on national television, but also took nearly three months to create and (sorry, Maty) was not nearly as convincing as this video, which took all of three minutes to create.

The ability to mimic a person's voice and likeness like this should be considered a major success for AI technologists. And, you don't have to work hard to see many exciting applications of this type of generative AI. For me, I can translate my online lectures into any language, eliminating the need for captions and/or distracting dubbing.

But, you also don't have to work hard to see many concerning applications. This recent article, for example, cites some troubling trends:

According to the FBI, nearly 40% of online scam victims in 2023 were targeted with deepfake content.
A 2024 Medius study found that over half of finance professionals in the U.S. and U.K. have been targets of a deepfake-powered financial scam, with 43% falling victim to such attacks.
The cryptocurrency sector has been particularly affected, with deepfake-related incidents increasing by 654% from 2023 to 2024.

This article closes with some sage advice: "Proving the authenticity of individuals online in a way that is both privacy-preserving and accessible will be crucial for safeguarding all internet users. Without this ability, the digital landscape will remain increasingly vulnerable to manipulation and deceit."

To this end, while the threat vectors from deepfakes have expanded, the solutions have remained largely the same. We need broader deployment of Content Credentials, we need more investment in forensic identification, we need the AI industry to more broadly adopt a mindset of safety-by-design, and we need continued public education and outreach to warn users of deepfakes and their dangers.

Author bio: Professor Hany Farid is a world-renowned expert in the field of misinformation, disinformation, and digital forensics. He joined the Content Authenticity Initiative (CAI) as an advisor in June 2023. The CAI is an Adobe-led community of media and tech companies, NGOs, academics, and others working to promote adoption of the open industry standard for content authenticity and provenance. Professor Farid teaches at the University of California, Berkeley, with a joint appointment in electrical engineering and computer sciences and the School of Information. He’s also a member of the Berkeley Artificial Intelligence Lab, Berkeley Institute for Data Science, Center for Innovation in Vision and Optics, Development Engineering Program, and Vision Science Program, and he’s a senior faculty advisor for the Center for Long-Term Cybersecurity. He is also a co-founder and Chief Science Officer at GetReal Labs, where he works to protect organizations worldwide from the threats posed by the malicious use of manipulated and synthetic information.

He received his undergraduate degree in computer science and applied mathematics from the University of Rochester in 1989, his M.S. in computer science from SUNY Albany, and his Ph.D. in computer science from the University of Pennsylvania in 1997. Following a two-year post-doctoral fellowship in brain and cognitive sciences at MIT, he joined the faculty at Dartmouth College in 1999 where he remained until 2019. Professor Farid is the recipient of an Alfred P. Sloan Fellowship and a John Simon Guggenheim Fellowship, and he’s a fellow of the National Academy of Inventors.