This week, the White House announced that it had obtained “voluntary commitments” from seven leading AI companies to manage the risks posed by AI.
Getting companies (Amazon, Anthropic, Google, Inflection, Meta, Microsoft, and OpenAI) to agree to anything is a step forward. They include bitter rivals with subtle but important differences in the way they approach AI research and development.
Meta, for example, is so eager to get its AI models into the hands of developers that it has opened up many of them, putting their code out in the open for anyone to use. Other labs, like Anthropic, have taken a more cautious approach, releasing their technology on a more limited basis.
But what do these commitments really mean? And are they likely to change a lot about how AI companies operate, given that they are not backed by the force of law?
Given the potential risks posed by AI, the details matter. So let’s take a closer look at what was agreed to here and assess its likely impact.
Commitment 1: Companies commit to conduct internal and external security testing of their AI systems prior to release.
Each of these AI companies already conducts security testing, often referred to as “red-teaming,” of its models before release. On one level, this isn’t really a new commitment, and the promise is vague: it doesn’t come with much detail about what kind of testing is required or who will conduct it.
In a statement accompanying the commitments, the White House said only that the tests of the AI models “will be conducted in part by independent experts” and will focus on AI risks “such as biosecurity and cybersecurity, as well as their broader societal effects.”
It’s a good idea to make AI companies publicly commit to continue doing this kind of testing and to encourage more transparency in the testing process. And there are some kinds of AI risk, like the danger that AI models could be used to develop biological weapons, that government and military officials are probably better prepared than companies to assess.
I’d love to see the AI industry agree to a standard battery of security tests, like the “autonomous replication” tests that the Alignment Research Center performed on models pre-released by OpenAI and Anthropic. I’d also like to see the federal government fund these types of tests, which can be expensive and require engineers with deep technical expertise. Right now, many security tests are funded and overseen by companies, which raises obvious questions about conflicts of interest.
Commitment 2: Companies commit to share information within industry and with governments, civil society and academia on AI risk management.
This commitment is also a bit vague. Several of these companies already publish information about their AI models, usually in academic articles or corporate blog posts. Some of them, including OpenAI and Anthropic, also publish documents called “system cards,” which outline the steps they’ve taken to make those models more secure.
But they have also withheld information at times, citing security concerns. When OpenAI released its latest AI model, GPT-4, this year, it broke with industry custom and chose not to disclose how much data the model was trained on or how big the model was (a metric known as “parameters”). The company said it declined to release this information because of competitive and security concerns. It also happens to be the kind of data that tech companies like to keep away from the competition.
Under these new commitments, will AI companies be required to make that kind of information public? What if doing so risks accelerating the AI arms race?
I suspect the White House goal is less about forcing companies to disclose their parameter counts and more about encouraging them to trade information with each other about the risks their models pose (or don’t pose).
But even that kind of information sharing can be risky. If Google’s AI team prevented a new model from being used to design a deadly bioweapon during preliminary testing, should it share that information outside Google? Would that risk giving bad actors ideas about how they could get a less protected model to perform the same task?
Commitment 3: Companies commit to investing in cybersecurity and insider threat safeguards to protect the weights of proprietary and unreleased models.
This one is fairly straightforward and not controversial among the AI experts I’ve spoken to. “Model weights” is a technical term for the mathematical instructions that give AI models the ability to function. The weights are what you would want to steal if you were an agent of a foreign government (or a rival corporation) who wanted to build your own version of ChatGPT or another AI product. And they are something AI companies have a vested interest in keeping tightly controlled.
There have already been high-profile cases of model weights leaking. The weights for Meta’s original LLaMA language model, for example, were leaked on 4chan and other websites just days after the model was released publicly. Given the risks of more leaks, and the interest other nations may have in stealing this technology from US companies, asking AI companies to invest more in their own security seems like a no-brainer.
Commitment 4: Companies commit to facilitating the discovery and reporting of vulnerabilities in their artificial intelligence systems by third parties.
I’m not quite sure what this means. All AI companies have discovered vulnerabilities in their models after they were released, usually because users try to do bad things with the models or bypass their security barriers (a practice known as “jailbreaking”) in ways the companies didn’t anticipate.
The White House pledge calls for companies to establish a “robust notification mechanism” for these vulnerabilities, but it’s unclear what that might mean. An in-app feedback button, similar to the ones that allow Facebook and Twitter users to report posts that violate the rules? A bug bounty program, like the one OpenAI started this year to reward users who find flaws in its systems? Something else? We’ll have to wait for more details.
Commitment 5: Companies commit to developing robust technical mechanisms to ensure users know when AI-powered content is being generated, such as a watermarking system.
This is an interesting idea but it leaves a lot of room for interpretation. Until now, AI companies have struggled to design tools that let people know whether or not they are viewing AI-generated content. There are good technical reasons for this, but it’s a real problem when people can pass off AI-generated work as their own. (Ask any high school teacher.) And many of the tools currently touted as being able to detect AI outputs can’t really do it with any degree of accuracy.
I am not optimistic that this problem is fully fixable. But I’m glad companies are committing to work on it.
Commitment 6: Companies commit to publicly report the capabilities, limitations, and areas of appropriate and inappropriate use of their AI systems.
Another sensible-sounding promise with plenty of leeway. How often will companies be required to report on the capabilities and limitations of their systems? How detailed will that information need to be? And since many of the companies that build AI systems have been surprised by the capabilities of their own systems after the fact, how well can they be expected to describe them ahead of time?
Commitment 7: Companies commit to prioritizing research on the social risks that AI systems may pose, including avoiding harmful bias and discrimination and protecting privacy.
Committing to “prioritize research” is about as vague as a commitment gets. Still, I’m sure this commitment will be welcomed by many in the AI ethics community, who want AI companies to make preventing near-term harms like bias and discrimination a priority over worrying about doomsday scenarios, as AI safety folks do.
If you’re confused by the difference between “AI ethics” and “AI safety,” just know that there are two warring factions within the AI research community, each of which thinks the other is focused on preventing the wrong kinds of harms.
Commitment 8: Companies commit to developing and deploying advanced AI systems to help address society’s biggest challenges.
I don’t think many people would argue that advanced AI should not be used to help address society’s biggest challenges. The White House lists “cancer prevention” and “climate change mitigation” as two of the areas it would like AI companies to focus their efforts on, and you won’t get any disagreement from me on that.
What makes this goal somewhat tricky, however, is that in AI research, what starts out looking frivolous often turns out to have more serious implications. Some of the technology used in DeepMind’s AlphaGo, an artificial intelligence system that was trained to play the board game Go, turned out to be useful in predicting the three-dimensional structures of proteins, an important discovery that boosted basic science research.
Overall, the White House deal with AI companies seems more symbolic than substantive. There is no compliance mechanism in place to make sure companies meet these commitments, and many of them reflect precautions AI companies are already taking.
Still, it’s a reasonable first step. And agreeing to follow these rules shows that AI companies have learned from the failures of previous tech companies, which waited to engage with the government until they got into trouble. In Washington, at least when it comes to tech regulation, it pays to show up early.