This week, the White House announced it had secured “voluntary commitments” from seven leading AI companies to manage the risks posed by AI.
Getting companies like Amazon, Anthropic, Google, Inflection, Meta, Microsoft, and OpenAI to agree on anything is a step forward. They include fierce competitors with subtle but important differences in how they approach AI research and development.
Meta, for example, is so eager to get its AI models into the hands of developers that it has open-sourced many of them, making their code available for anyone to use. Other labs, such as Anthropic, have taken a more cautious approach, releasing their technology in more limited ways.
But what do these promises actually mean? And given that they carry no legal force, are they likely to change much about how AI companies operate?
When it comes to AI regulation, the details matter. So let’s take a closer look at the agreements reached here and assess their potential impact.
Commitment 1: The companies commit to internal and external safety testing of their AI systems before releasing them.
All of these companies already conduct safety testing of their models (often referred to as “red teaming”) before releasing them. So in one sense, this isn’t really a new commitment. And it’s a vague promise: the announcement didn’t spell out what kind of testing would be required, or who would carry it out.
In a statement accompanying the announcement, the White House said only that testing of the AI models “will be conducted in part by independent experts” and will focus on “AI risks such as biosecurity and cybersecurity, and their wider societal implications.”
It’s a good idea to have AI companies publicly commit to continuing such testing, and to encourage more transparency in the testing process. There are also types of AI risks—such as the danger that AI models could be used to develop biological weapons—that government and military officials may be better placed to assess than companies.
I’d be glad to see the AI industry agree on a standard battery of safety tests, such as the “autonomous replication” tests that the Alignment Research Center conducted on pre-release models from OpenAI and Anthropic. I’d also like to see the federal government fund these kinds of tests, which can be expensive and require engineers with serious technical expertise. Right now, much safety testing is funded and overseen by the companies themselves, which raises obvious conflict-of-interest questions.
Commitment 2: The companies commit to sharing information on managing AI risks across the industry and with governments, civil society, and academia.
This promise is also a bit vague. Several of these companies already publish information about their AI models, typically in academic papers or corporate blog posts. A few of them, including OpenAI and Anthropic, also publish documents called “system cards,” which describe the steps they’ve taken to make their models safer.
But they have also sometimes withheld information, citing safety concerns. When OpenAI released its latest AI model, GPT-4, this year, it broke with industry convention and chose not to disclose how much data the model was trained on, or how big it was (a figure known as its “parameter” count). The company said it declined to disclose this information because of competition and safety concerns. It also happens to be exactly the kind of data that tech companies like to keep away from competitors.
Under these new commitments, will AI companies be forced to disclose such information? What if doing so risks accelerating an AI arms race?
I suspect that the White House’s goal is not to compel companies to disclose their parameter counts, but to encourage them to trade information with one another about the risks their models do (or don’t) pose.
But even this kind of information sharing can be risky. If Google’s AI team prevented a new model from being used to design a deadly biological weapon during pre-release testing, should it share that information outside of Google? Would that risk giving bad actors ideas about how to get a less protected model to perform the same task?
Commitment 3: The companies commit to investing in cybersecurity and insider-threat safeguards to protect proprietary and unreleased model weights.
This one is fairly straightforward, and uncontroversial among the AI insiders I’ve spoken to. “Model weights” is a technical term for the mathematical instructions that give an AI model its ability to function. If you were an agent of a foreign government (or a rival company) looking to build your own version of ChatGPT or another AI product, the weights are what you’d want to steal. AI companies have a vested interest in keeping them tightly controlled.
And the problem of leaked model weights is well known. The weights for Meta’s original LLaMA language model, for example, were leaked on 4chan and other sites just days after the model was publicly released. Given the risk of more leaks, and the interest other nations may have in stealing this technology from U.S. companies, asking AI companies to invest more in their own security seems like a no-brainer.
Commitment 4: The companies commit to facilitating third-party discovery and reporting of vulnerabilities in their AI systems.
I’m not quite sure what this means. Every AI company discovers holes in its models after releasing them, usually because users try to do bad things with them or circumvent their guardrails in ways the company didn’t foresee (a practice known as “jailbreaking”).
The White House’s commitment calls for companies to establish “robust reporting mechanisms” for these vulnerabilities, but it’s unclear what that might mean. An in-app feedback button, like the ones that let Facebook and Twitter users report rule-breaking posts? A bug bounty program, like the one OpenAI launched this year to reward users who find flaws in its systems? Something else? We’ll have to wait for more details.
Commitment 5: The companies commit to developing robust technical mechanisms, such as watermarking systems, to ensure that users know when content is generated by AI.
This is an interesting idea, but it leaves a lot of room for interpretation. So far, AI companies have struggled to build tools that reliably tell people whether they’re looking at AI-generated content. There are good technical reasons for this, but it’s a real problem when people can pass off AI-generated work as their own. (Ask any high school teacher.) And many of the tools currently advertised as being able to detect AI output can’t actually do so with any real accuracy.
I’m not optimistic that this problem will be fully resolved. But I’m glad companies are committing to work on it.
Commitment 6: The companies commit to publicly reporting their AI systems’ capabilities, limitations, and areas of appropriate and inappropriate use.
This is another sensible-sounding commitment with plenty of wiggle room. How often will companies need to report on their systems’ capabilities and limitations? How detailed will that information have to be? And given that many companies building AI systems have been surprised by their own systems’ capabilities after the fact, how well can they really be expected to describe them in advance?
Commitment 7: The companies commit to prioritizing research on the societal risks that AI systems may pose, including avoiding harmful bias and discrimination and protecting privacy.
Committing to “prioritizing research” is about as vague as commitments get. Still, I suspect this one will be welcomed by many in the AI ethics community, who want AI companies to make preventing near-term harms like bias and discrimination a priority, rather than worrying about doomsday scenarios, as the AI safety crowd does.
If you’re confused about the difference between “AI ethics” and “AI safety,” just know that there are two rival factions within the AI research community, each of which thinks the other is focused on preventing the wrong kinds of harms.
Commitment 8: The companies commit to developing and deploying advanced AI systems to help address society’s greatest challenges.
I don’t think many people would argue that advanced AI should not be used to help solve society’s biggest challenges. And I can’t disagree with the White House citing cancer prevention and climate change mitigation as two areas where it would like AI companies to focus.
What complicates this goal somewhat, though, is that in AI research, work that seems trivial at first often turns out to have more serious implications. Some of the techniques used in DeepMind’s AlphaGo, an AI system trained to play the board game Go, later proved extremely useful in predicting the three-dimensional structures of proteins, a major discovery that has advanced basic scientific research.
Overall, the White House’s deal with AI companies seems more symbolic than substantive. There is no enforcement mechanism to make sure companies abide by these commitments, many of which reflect precautions AI companies are already taking.
Still, it’s a reasonable first step. And agreeing to follow these rules shows that AI companies have learned from the failures of earlier tech companies, which waited to engage with government until they got into trouble. In Washington, at least where tech regulation is concerned, it pays to show up early.