OpenAI, the artificial intelligence company behind ChatGPT, has created a new team that will assess, evaluate, and probe AI models against the potential threats posed by advanced AI capabilities, which it describes as “catastrophic risks.”
In a statement, OpenAI said, “We believe that frontier AI models, which will exceed the capabilities currently present in the most advanced existing models, have the potential to benefit all of humanity. But they also pose increasingly severe risks.”
The team, called Preparedness, will be led by Aleksander Madry, director of MIT’s Center for Deployable Machine Learning, who will be responsible for tracking, forecasting, and protecting against the threats posed by upcoming AI systems.
These risks include AI models’ capacity to deceive and manipulate people, as seen in phishing attacks, as well as their ability to generate harmful computer code, the company said.
As part of the team, OpenAI is hiring a national security threat researcher and a research engineer. Each could earn an annual salary of between $200,000 and $370,000, according to the job listings.
For months, tech leaders at top AI companies have raised alarms around AI safety.
Elon Musk, who helped co-found OpenAI before leaving the company, said in February that AI is “one of the biggest risks to the future of civilization.”
In May, Geoffrey Hinton, known as the “Godfather” of artificial intelligence (AI), resigned from Google and warned about the technology’s danger to humanity.
He said that some of the dangers of AI chatbots were “quite scary”.
“Right now, they’re not more intelligent than us, as far as I can tell. But I think they soon may be,” Hinton said.
Similarly, in March, OpenAI CEO Sam Altman said on an episode of Lex Fridman’s podcast that he empathises with people who are afraid of AI, noting that advancements in the technology come with risks related to “disinformation problems,” “economic shocks” like job replacement, and threats “far beyond anything we’re prepared for,” Business Insider reported.
Earlier this month, Anthropic, an OpenAI rival behind the AI chatbot Claude, revamped its constitution with input from users to level up its guardrails and prevent toxic and racist responses.