OpenAI says its latest GPT-4o model is ‘medium’ risk

Aug 09, 2024

OpenAI has released its GPT-4o System Card, a research document that outlines the safety measures and risk evaluations the startup conducted before releasing its latest model.

GPT-4o was launched publicly in May of this year. Before its debut, OpenAI used an external group of red teamers, or security experts who try to find weaknesses in a system, to identify key risks in the model (which is a fairly standard practice). They examined risks like the possibility that GPT-4o would create unauthorized clones of someone's voice, erotic and violent content, or chunks of reproduced copyrighted audio. Now, the results are being released.

According to OpenAI's own framework, the researchers found GPT-4o to be of "medium" risk. The overall risk level was taken from the highest risk rating across four broad categories: cybersecurity, biological threats, persuasion, and model autonomy. All of these were deemed low risk except for persuasion, where the researchers found some writing samples from GPT-4o could be better at swaying readers' opinions than human-written text — though the model's samples weren't more persuasive overall.

An OpenAI spokesperson, Lindsay McCallum Rémy, told The Verge that the system card includes preparedness evaluations created by an internal team, alongside external testers listed on OpenAI's website as Model Evaluation and Threat Research (METR) and Apollo Research, both of which build evaluations for AI systems.

This isn't the first system card OpenAI has released; GPT-4, GPT-4 with vision, and DALL-E 3 were similarly tested, and that research was also released. But OpenAI is releasing this system card at a pivotal time. The company has been fielding nonstop criticism of its safety standards, from its own employees to state senators. Only minutes before the release of GPT-4o's system card, The Verge exclusively reported on an open letter from Sen. Elizabeth Warren (D-MA) and Rep. Lori Trahan (D-MA) that called for answers about how OpenAI handles whistleblowers and safety reviews. That letter outlines the many safety issues that have been called out publicly, including CEO Sam Altman's brief ousting from the company in 2023 as a result of the board's concerns, and the departure of a safety executive who claimed that "safety culture and processes have taken a backseat to shiny products."

Moreover, the company is releasing a highly capable multimodal model just ahead of a US presidential election. There's a clear potential risk of the model accidentally spreading misinformation or getting hijacked by malicious actors — even if OpenAI is hoping to highlight that the company is testing real-world scenarios to prevent misuse.

There have been plenty of calls for OpenAI to be more transparent, not just with the model's training data (is it trained on YouTube?), but with its safety testing. In California, where OpenAI and many other leading AI labs are based, state Sen. Scott Wiener is working to pass a bill to regulate large language models, including restrictions that would hold companies legally accountable if their AI is used in harmful ways. If that bill is passed, OpenAI's frontier models would have to comply with state-mandated risk assessments before the models are made available for public use. But the biggest takeaway from the GPT-4o System Card is that, despite the group of external red teamers and testers, a lot of this relies on OpenAI to evaluate itself.