This dramatization demonstrates how cybercriminals can use free, easily accessible AI to generate convincing deepfake audio of anyone from only a few seconds of speech.
The technology behind this example is already more than a year old, and even more convincing voice models are now available. Generative AI holds immense potential for nearly every industry, but the speed and scale at which realistic content can now be generated have supercharged problems that society already struggles to address.
Disinformation, misinformation, fraud, and image-based abuse are not new social problems, but AI now lets even non-technical people produce such content more quickly and easily than ever before. Users, moderators, and policymakers all struggle to keep pace with the technology's rapid advances.
Experts at Duke University are working to advance interpretable and ethical AI. Provenance standards such as Content Credentials (developed by Adobe, Microsoft, and other members of the C2PA coalition) embed metadata into images and videos to flag manipulated content. Tools like Deepware Scanner and Google DeepMind's SynthID can already detect AI-generated audio and images. Laws like Illinois's Biometric Information Privacy Act (BIPA) are starting to treat voiceprints and facial data as sensitive personal information. But detection alone isn't enough to enforce these laws: we also need traceability, a digital chain of custody, before trust in digital spaces is lost.
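To make the "digital chain of custody" idea concrete, here is a minimal sketch in Python of how provenance records can be cryptographically linked so that any later alteration is detectable. This is an illustration of the general hash-chain technique only, not the actual mechanism used by Content Credentials or SynthID; the record fields (`actor`, `action`) are hypothetical.

```python
import hashlib
import json

def add_custody_record(chain, actor, action):
    """Append a record that links back to the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    record = {"actor": actor, "action": action, "prev_hash": prev_hash}
    payload = json.dumps(record, sort_keys=True).encode()
    record["hash"] = hashlib.sha256(payload).hexdigest()
    chain.append(record)
    return chain

def verify_chain(chain):
    """Recompute every hash; any tampered record breaks the links."""
    prev_hash = "0" * 64
    for record in chain:
        body = {k: v for k, v in record.items() if k != "hash"}
        if body["prev_hash"] != prev_hash:
            return False
        payload = json.dumps(body, sort_keys=True).encode()
        if hashlib.sha256(payload).hexdigest() != record["hash"]:
            return False
        prev_hash = record["hash"]
    return True

chain = []
add_custody_record(chain, "studio", "recorded original audio")
add_custody_record(chain, "editor", "trimmed clip")
print(verify_chain(chain))   # True for an intact chain
chain[0]["action"] = "generated synthetic audio"
print(verify_chain(chain))   # False once any record is altered
```

Because each record commits to the hash of the one before it, rewriting history requires recomputing every subsequent record, which is exactly the property a chain of custody for media provenance needs.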
The Duke Initiative for Science & Society - through programs like its Master of Arts in Applied Ethics & Policy and its undergraduate Digital Intelligence Certificate - teaches future technologists to ask more probing questions: Should we develop this? Who may be affected, and how? How do we mitigate the harms and promote human flourishing? It then challenges those students to create actionable frameworks, policies, and communications plans that inform public discourse and guide public and private interests.
Disclaimer: Audio in this project was generated using commercial third-party text-to-speech software. This synthetic voice content is intended to illustrate the capabilities and potential risks of generative AI technologies. We share this example to raise public awareness and to demonstrate the need for thoughtful policies that maximize technological benefits while minimizing potential harms.