White House challenges hackers to break top AI models at DEF CON 31

An AI-generated image of the White House in front of a cybernetic background.
Enlarge / An AI-generated image of the White House in front of a cybernetic background.

reader comments
36 with

On Thursday, the White House announced a surprising collaboration between top AI developers, including OpenAI, Google, Antrhopic, Hugging Face, Microsoft, Nvidia, and Stability AI, to participate in a public evaluation of their generative AI systems at DEF CON 31, a hacker convention taking place in Las Vegas in August. The event will be hosted by AI Village, a community of AI hackers.

Since last year, large language models (LLMs) such as ChatGPT have become a popular way to accelerate writing and communications tasks, but officials recognize that they also come with inherent risks. Issues such as confabulations, jailbreaks, and biases pose challenges for security professionals and the public. That’s why the White House Office of Science, Technology, and Policy endorses pushing these new generative AI models to their limits.

“This independent exercise will provide critical information to researchers and the public about the impacts of these models and will enable AI companies and developers to take steps to fix issues found in those models,” says a statement from the White House, which says the event aligns with the Biden administration’s AI Bill of Rights and the National Institute of Standards and Technology’s AI Risk Management Framework.

In a parallel announcement written by AI Village, organizers Sven Cattell, Rumman Chowdhury, and Austin Carson call the upcoming event “the largest red teaming exercise ever for any group of AI models.” Thousands of people will take part in the public AI model assessment, which will utilize an evaluation platform developed by Scale AI.

prompt injection,” which we broke a story about in September. AI researcher Simon Willison has written in detail about the dangers of prompt injection, a technique that can derail a language model into performing actions not intended by its creator.

During the DEF CON event, participants will have timed access to multiple LLMs through laptops provided by the organizers. A capture-the-flag-style point system will encourage testing a wide range of potential harms. At the end, the person with the most points will win a high-end Nvidia GPU.

“We’ll publish what we learn from this event to help others who want to try the same thing,” writes AI Village. “The more people who know how to best work with these models, and their limitations, the better.”

DEF CON 31 will take place on August 10–13, 2023, at Caesar’s Forum in Las Vegas.

Article Tags:
Article Categories: