Meta Plans to Strengthen Detection of AI-Generated Fakes, but Some Will Remain Undetected

Meta, like other leading tech companies, has spent the past year promising to speed up deployment of generative artificial intelligence. Today it acknowledged it must also respond to the technology’s hazards, announcing an expanded policy of tagging AI-generated images posted to Facebook, Instagram, and Threads with warning labels to inform people of their artificial origins.

Yet much of the synthetic media likely to appear on Meta’s platforms is unlikely to be covered by the new policy, leaving many gaps through which malicious actors could slip. “It’s a step in the right direction, but with challenges,” says Sam Gregory, program director of the nonprofit Witness, which helps people use technology to support human rights.

Meta already labels AI-generated images made using its own generative AI tools with the tag “Imagined with AI,” in part by looking for the digital “watermark” its algorithms embed into their output. Now Meta says that in coming months it will also label AI images made with tools offered by other companies that embed watermarks into their technology.

The policy is supposed to reduce the risk of mis- or disinformation being spread by AI-generated images passed off as photos. But although Meta said it is working to support disclosure technology in development at Google, OpenAI, Microsoft, Adobe, Midjourney, and Shutterstock, the technology is not yet widely deployed. And many AI image generation tools are available that do not watermark their output, with the technology becoming increasingly easy to access and modify. “The only way a system like that will be effective is if a broad range of generative tools and platforms participated,” says Gregory.

Even if there is wide support for watermarking, it is unclear how robust any protection it offers will be. No universally deployed standard is in place, though the Coalition for Content Provenance and Authenticity (C2PA), an initiative founded by Adobe, has helped companies start to align their work on the concept. The technology developed so far is not foolproof, however. In a study released last year, researchers found they could easily break watermarks, or add them to images that hadn’t been generated by AI to make it appear that they had.
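To make the fragility concrete, here is a minimal toy sketch of why a naive watermark can be both removed and forged. It is not the method used in the study mentioned above, nor any vendor's actual scheme: it assumes a simple least-significant-bit mark over a flat list of 8-bit pixel values, and the names MARK, embed, detect, and strip are hypothetical.

```python
# Toy least-significant-bit (LSB) watermark. Illustrative assumption only:
# real watermarking schemes are far more sophisticated than this.
MARK = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical 8-bit signature


def embed(pixels, mark=MARK):
    """Write the signature into the low bit of the first len(mark) pixels."""
    out = list(pixels)
    for i, bit in enumerate(mark):
        out[i] = (out[i] & ~1) | bit
    return out


def detect(pixels, mark=MARK):
    """Report whether the low bits of the leading pixels match the signature."""
    return [p & 1 for p in pixels[:len(mark)]] == mark


def strip(pixels):
    """Clear every low bit, which erases the mark with no visible change."""
    return [p & ~1 for p in pixels]


generated = [200, 13, 77, 54, 91, 128, 33, 240, 17]  # stand-in for AI output
camera_photo = [10, 20, 30, 40, 50, 60, 70, 80]      # stand-in for a real photo

marked = embed(generated)
removed = strip(marked)        # watermark broken
forged = embed(camera_photo)   # real photo now "looks" AI-generated
```

In this toy setting, detect(marked) and detect(forged) are both true while detect(removed) is false, mirroring the two failure modes the researchers described: breaking a watermark and adding one to an image that was never generated by AI.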

Malicious Loophole

Hany Farid, a professor at the UC Berkeley School of Information who has advised the C2PA initiative, says that anyone interested in using generative AI maliciously will likely turn to tools that don’t watermark their output or betray its nature. For example, the creators of the fake robocall using President Joe Biden’s voice targeted at some New Hampshire voters last month didn’t add any disclosure of its origins.

And he thinks companies should be prepared for bad actors to target whatever method they try to use to identify content provenance. Farid suspects that multiple forms of identification might need to be used in concert to robustly identify AI-generated images, for example by combining watermarking with hash-based technology used to create watch lists for child sex abuse material. And watermarking is a less developed concept for AI-generated media other than images, such as audio and video.
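The idea of using several identification signals in concert can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Meta's or the C2PA's implementation: WATERMARK_TAG is a hypothetical byte marker standing in for a real watermark or provenance check, and SHA-256 is swapped in for the perceptual hashing that production watch lists typically rely on, so this version matches only byte-identical copies.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical marker standing in for a real watermark or provenance manifest.
WATERMARK_TAG = b"AI-GENERATED"


def has_watermark(data: bytes) -> bool:
    """Signal 1: does the file carry the (hypothetical) disclosure marker?"""
    return WATERMARK_TAG in data


def content_hash(data: bytes) -> str:
    """Exact-match hash; an assumption in place of a perceptual hash."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class Verdict:
    watermark: bool
    on_watch_list: bool

    @property
    def label_as_ai(self) -> bool:
        # Either signal alone is enough to apply a label.
        return self.watermark or self.on_watch_list


def classify(data: bytes, watch_list: set[str]) -> Verdict:
    """Signal 2 checks the hash against a list of previously identified fakes."""
    return Verdict(has_watermark(data), content_hash(data) in watch_list)


# A known fake whose watermark was never added or has been stripped.
known_fake = b"pixels-of-a-known-fake"
watch_list = {content_hash(known_fake)}

stripped = classify(known_fake, watch_list)
watermarked = classify(b"pixels" + WATERMARK_TAG, set())
ordinary = classify(b"ordinary photo", watch_list)
```

Here the stripped fake is still labeled because its hash is on the watch list, and the watermarked file is labeled even though it is new to the list. Neither signal covers both cases alone, which is the point of combining them.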

“While companies are starting to include signals in their image generators, they haven’t started including them in AI tools that generate audio and video at the same scale, so we can’t yet detect those signals and label this content from other companies,” Meta spokesperson Kevin McAlister acknowledges. “While the industry works towards this capability, we’re adding a feature for people to disclose when they share AI-generated video or audio so we can add a label to it.”

Meta’s new policies may help it catch more fake content, but not all manipulated media is AI-generated. A ruling released on Monday by Meta’s Oversight Board of independent experts, which reviews some moderation calls, upheld the company’s decision to leave up a video of President Joe Biden that had been edited to make it appear that he is inappropriately touching his granddaughter’s chest. But the board said that while the video, which was not AI-generated, didn’t violate Meta’s current policies, the company should revise and expand its rules for “manipulated media” to cover more than just AI-generated content.

McAlister, the Meta spokesperson, says the company is “reviewing the Oversight Board’s guidance and will respond publicly to their recommendations within 60 days in accordance with the bylaws.” Farid says that the hole in Meta’s policies and the technical focus on only watermarked AI-generated images together suggest the company’s plan for the gen AI era is incomplete.

In recent years, the rise of AI-generated fake content has become a growing concern for both individuals and organizations. The ability of artificial intelligence to create highly convincing and realistic fake images, videos, and text has raised questions about the authenticity and reliability of online information. In response to this challenge, Meta, the parent company of Facebook, has announced a plan to strengthen its detection of AI-generated fakes. However, despite these efforts, some fake content still manages to go undetected.

Meta’s plan to combat AI-generated fake content involves a multi-faceted approach that combines advanced technology, human intervention, and collaboration with external experts. One of the key components of this strategy is the development and improvement of AI algorithms specifically designed to detect and flag fake content. These algorithms are trained on large datasets of both real and fake content, allowing them to learn and identify patterns that distinguish genuine information from AI-generated fakes.

To enhance the effectiveness of its algorithms, Meta is investing in research and development to stay ahead of evolving techniques used by creators of fake content. By constantly updating its detection systems, Meta aims to minimize the spread of misleading information and protect users from falling victim to misinformation campaigns.

In addition to relying on AI algorithms, Meta recognizes the importance of human intervention in the fight against AI-generated fakes. The company employs a team of content reviewers who manually review flagged content to make final determinations. These reviewers undergo extensive training to develop a deep understanding of the nuances and subtleties that distinguish real content from fakes. Their expertise is crucial in identifying sophisticated fake content that may fool AI algorithms.

Furthermore, Meta acknowledges that no single company can tackle this issue alone. The company actively collaborates with external experts, researchers, and industry partners to share knowledge, insights, and best practices. By fostering a collaborative environment, Meta aims to strengthen detection capabilities across platforms and ensure a safer online environment for users.

Despite Meta’s efforts, some AI-generated fake content still manages to slip through the detection systems. The creators of fake content are constantly evolving their techniques to make their creations more convincing and harder to detect. This cat-and-mouse game between technology companies and those who seek to deceive poses an ongoing challenge.

One reason why some fake content remains undetected is the speed at which it can spread across social media platforms. Fake news, manipulated images, and deepfake videos can go viral within minutes, reaching millions of users before they are flagged and removed. This highlights the need for real-time detection systems that can quickly identify and mitigate the impact of fake content.

Another challenge lies in the ethical considerations surrounding the detection and removal of fake content. Striking a balance between protecting users from misinformation and preserving freedom of speech is a complex task. The risk of false positives, where genuine content is mistakenly flagged as fake, is a concern that technology companies must address to avoid unintended consequences.

In conclusion, Meta’s plan to strengthen its detection of AI-generated fakes demonstrates the company’s commitment to combating the spread of misleading information. By combining advanced AI algorithms, human expertise, and collaboration with external partners, Meta aims to stay ahead of those who seek to deceive. However, the ever-evolving nature of AI-generated fakes presents an ongoing challenge, and some fake content may still go undetected. As technology continues to advance, it is crucial for companies like Meta to adapt and refine their strategies to ensure a safer online environment for all users.