OpenAI has officially released “o1,” the production version of its highly anticipated Project Strawberry, alongside a faster “mini” version, o1-mini. The new models bring significant advances in reasoning and are designed to tackle complex questions in science, coding, and math faster and more accurately than previous models.
Enhanced Reasoning and Problem-Solving
o1 demonstrates a substantial leap in problem-solving compared to its predecessor, GPT-4o. On a qualifying exam for the International Mathematical Olympiad, o1 solved 83% of the problems, a dramatic improvement over GPT-4o’s 13%. o1 also reached the 89th percentile in online Codeforces competitions, showcasing its coding prowess. In addition, the model handles queries that stumped older models.
New Training and Optimization
OpenAI’s research lead, Jerry Tworek, explained to The Verge that o1 utilizes a novel optimization algorithm and a specialized training dataset. This combination, along with reinforcement learning and “chain of thought” reasoning, contributes to o1’s improved accuracy and reduced hallucinations, although the company acknowledges that hallucinations haven’t been entirely eliminated.
Availability and Pricing
ChatGPT Plus and Team subscribers can access both o1 and o1-mini immediately, with Enterprise and Edu subscribers gaining access next week. OpenAI plans to eventually offer o1-mini to free-tier users. Developers, however, should note a significant price increase for o1 API access compared to GPT-4o. o1 costs $15 per million input tokens and $60 per million output tokens, against GPT-4o’s $5 and $15, respectively. That is three times the input price and four times the output price.
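To make the price gap concrete, here is a minimal sketch that estimates the cost of a single API request from the per-million-token prices quoted above. The function name `estimate_cost`, the dictionary keys, and the example token counts are illustrative assumptions, not part of any OpenAI SDK, and actual billing may differ from this simple arithmetic.

```python
# Prices in USD per million tokens, taken from the figures quoted in this article.
# The keys are illustrative labels, not necessarily the exact API model identifiers.
PRICES_PER_MILLION = {
    "o1": {"input": 15.0, "output": 60.0},
    "gpt-4o": {"input": 5.0, "output": 15.0},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request (hypothetical helper)."""
    rates = PRICES_PER_MILLION[model]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Example request size (an assumption): 2,000 input and 1,000 output tokens.
    for model in PRICES_PER_MILLION:
        cost = estimate_cost(model, input_tokens=2_000, output_tokens=1_000)
        print(f"{model}: ${cost:.3f}")
```

For this example request the estimate comes to about $0.09 on o1 versus about $0.025 on GPT-4o, roughly 3.6 times more. The multiple depends on the mix of input and output tokens, since output is marked up more steeply than input.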
A Glimpse into the Future
While this release is only an initial preview of o1’s capabilities, it marks a significant step forward in AI reasoning. The gains in accuracy and problem-solving, together with fewer hallucinations, suggest a promising future for the model. One lingering question, however, is whether o1 can accurately count the number of “R’s” in the word “strawberry”.
This early access for subscribers allows OpenAI to gather valuable feedback and further refine the model, paving the way for even more powerful and reliable AI reasoning tools.