On Thursday, OpenAI launched what’s effectively a $200-a-month chatbot, and the AI community didn’t quite know what to make of it.
The company’s new ChatGPT Pro plan grants access to “o1 pro mode,” which OpenAI says “uses more compute for the best answers to the hardest questions.” A souped-up version of OpenAI’s o1 reasoning model, o1 pro mode should answer questions related to science, math, and coding more “reliably” and “comprehensively,” OpenAI says.
Almost immediately, people started asking it to draw unicorns:
I asked ChatGPT o1 Pro Mode to create an SVG of a unicorn.
(This is the model you get access to for $200 monthly) pic.twitter.com/h9HwY3aYwU
— Rammy (@rammydev) December 5, 2024
And design a “crab-based” computer:
Finally putting o1-pro to its ultimate use case. pic.twitter.com/nX4JAjx71m
— Ethan Mollick (@emollick) December 6, 2024
And wax poetic on the meaning of life:
I just subscribed to OpenAI’s $200/month subscription.
Reply with questions to ask it and I’ll repost them in this thread. pic.twitter.com/oTQxbPxnoP
— Garrett Scott 🕳 (@thegarrettscott) December 5, 2024
But many people on X didn’t seem convinced that o1 pro mode’s answers were, well, $200-level.
“Have OpenAI shared any concrete examples of prompts that fail in regular o1 but succeed in o1-pro?” asked British computer scientist Simon Willison. “I want to see a single concrete example that shows its advantage.”
It’s a reasonable question; after all, this is the world’s most expensive chatbot subscription. The service comes with other benefits, like the removal of rate limits and unlimited access to OpenAI’s other models. But $2,400 per year isn’t chump change, and the value proposition of o1 pro mode in particular remains murky.
It didn’t take long to find failure cases. O1 pro mode struggles with Sudoku, and it’s tripped up by an optical illusion joke that’s obvious to any human.
o1 and o1-pro both failed here, probably still because of the vision limitations (the same with Sudoku puzzles) https://t.co/mAVK7WxBrq pic.twitter.com/O9boSv7ZGt
— Tibor Blaho (@btibor91) December 5, 2024
OpenAI’s internal benchmarks show that o1 pro mode performs only slightly better than the standard o1 on coding and math problems:
OpenAI ran a “stricter” evaluation on the same benchmarks to demonstrate o1 pro mode’s consistency: the model was only considered to have solved a question if it got the answer right four out of four times. But even in these tests, the improvements weren’t dramatic:

OpenAI CEO Sam Altman, who once wrote that OpenAI was on a path “towards intelligence too cheap to meter,” was forced to clarify multiple times on Thursday that ChatGPT Pro isn’t for most people.
“Most users will be very happy with the o1 in the [ChatGPT] Plus tier!” he said on X. “Almost everyone will be best-served by our free tier or the Plus tier.”
So who is it for? Are there really people out there willing to pay $200 a month to ask toy questions like “Write a 3-paragraph essay on strawberries without using the letter ‘e’” or “solve this Math Olympiad problem”? Will they happily part ways with their hard-earned cash without much guarantee that the standard o1 can’t satisfactorily answer the same questions?
I asked Ameet Talwalkar, an associate professor of machine learning at Carnegie Mellon and a venture partner at Amplify Partners, for his opinion. “It seems like a big risk to me to raise the price tenfold,” he told TechCrunch via email. “I think we’ll have a much better sense in just a few weeks as to the appetite for this functionality.”
UCLA computer scientist Guy Van den Broeck was more candid in his assessment. “I don’t know if the price point makes sense,” he told TechCrunch, “and if pricey reasoning models will be the norm.”
o1 is “better than most humans at most tasks” because, yes, humans exist only in amnesic disembodied multi-turn chat interfaces https://t.co/zbLY2BG5pQ
— Aidan McLau (@aidan_mclau) December 6, 2024
A generous take is that it’s a marketing blunder. Describing o1 pro mode as best at solving “the hardest problems” doesn’t tell prospective customers much. Nor do vague statements about how the model can “think longer” and demonstrate “intelligence.” As Willison points out, without specific examples of this supposedly improved capability, it’s hard to justify paying more at all, let alone ten times the price.
this is such a funny recommended prompt for an ai model that costs $2400/yr
I hope openai keeps these boilerplate sample prompts all the way to asi pic.twitter.com/JQ5vLKxWWR
— Dean W. Ball (@deanwball) December 6, 2024
As far as I can tell, experts in specialized fields are the intended audience. OpenAI says it plans to grant a handful of medical researchers at “leading institutions” free access to ChatGPT Pro, which will include o1 pro mode. Errors matter a lot in healthcare, and, as Bob McGrew, OpenAI’s former chief research officer, noted on X, better reliability is probably o1 pro mode’s main unlock.
Been playing with o1 and o1-pro for bit.
They’re excellent & a little weird. They’re also not for most people most of the time. You really need to have particular hard problems to solve in order to get value out of it. But if you have those problems, this is a very big deal.
— Ethan Mollick (@emollick) December 5, 2024
McGrew also mused that o1 pro mode is an example of what he calls “intelligence overhang”: users (and perhaps the model’s creators) not knowing how to get value from any “extra intelligence” because of fundamental limits of a simple, text-based interface. As with OpenAI’s other models, the only way to interact with o1 pro mode is through ChatGPT, and, to McGrew’s point, ChatGPT isn’t perfect.
It’s also true, though, that $200 sets expectations high. And judging by the early reception on social media, ChatGPT Pro isn’t a slam dunk.