For a long time, the public was often the first real tester of powerful AI systems. That is starting to change. On May 5, 2026, the U.S. National Institute of Standards and Technology announced that its Center for AI Standards and Innovation, or CAISI, had signed new agreements with Google DeepMind, Microsoft, and xAI. These agreements let the government study frontier AI models before public release, not only after launch. In other words, some of the world’s most advanced chatbots and reasoning systems may now face government evaluation while they are still private products. (nist.gov)
This is not a small experiment. NIST says CAISI has already completed more than 40 evaluations, including tests on state-of-the-art models that were still unreleased at the time. The agency also says developers may provide versions with reduced or removed safeguards so evaluators can better measure national-security-related capabilities and risks. That means the government is not just asking, “Is this model helpful?” It is also asking harder questions: Could it help with cyberattacks? Could it behave dangerously if protections fail? Could the U.S. government be surprised by what a new model can do? (nist.gov)
The May 2026 deals build on an earlier shift. In August 2024, NIST’s U.S. AI Safety Institute, as CAISI was then called, signed first-of-their-kind agreements with OpenAI and Anthropic to gain access to major new models before and after release. A few months later, U.S. and U.K. government teams jointly carried out a pre-deployment evaluation of Anthropic’s upgraded Claude 3.5 Sonnet. Step by step, pre-release testing has moved from an unusual idea to a normal part of frontier-AI governance. (nist.gov)
For learners of English, this story shows a useful modern pattern: technology now moves so fast that oversight is trying to move earlier too. Instead of waiting for problems in the real world, governments want access at the prototype stage. At the same time, companies still worry about secrecy, intellectual property, and data protection, which is why CAISI also launched work in 2026 on privacy-preserving evaluation methods with OpenMined. The result is a new era: before the public meets a powerful AI model, the government may already have tested its limits. (nist.gov)