Update (Feb. 19, 8:34 am UTC): This article has been updated to clarify that LMArena has not independently confirmed whether Grok-3’s ranking represents a significant breakthrough over competitors.
An early version of the newly launched Grok-3, an AI large language model (LLM), has beaten rival AI systems from Google, OpenAI and DeepSeek in a community-driven blind evaluation.
On Feb. 18, Elon Musk announced xAI’s latest AI model release, Grok-3, during a livestream on X. In the discussion, the xAI team revealed it had released an early Grok-3 version on LMArena under the alias “chocolate” for community testing.
Source: LMArena
Grok-3 tops multiple AI performance metrics
Chatbot Arena, a community-driven AI evaluation platform, allows users to compare AI models in a blind test by ranking responses from two anonymous chatbots. According to its website, the platform has recorded over a million votes from users.
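To illustrate how blind head-to-head votes can be turned into a leaderboard score, the following is a minimal sketch using a standard Elo-style update. It is an assumption for illustration only: the platform's actual scoring method is not described here, and the model names, starting ratings, K-factor and function names are all hypothetical.

```python
# Illustrative Elo-style rating update from pairwise votes.
# Assumption: this is NOT the platform's documented method; all names and
# constants below are hypothetical and chosen only for the example.

def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one blind comparison."""
    delta = k * ((1.0 if a_won else 0.0) - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so the total rating pool stays constant.
    return rating_a + delta, rating_b - delta


# Two anonymous chatbots start at the same hypothetical rating.
ratings = {"model_a": 1000.0, "model_b": 1000.0}

# Each vote is (first model, second model, did the first model win?).
votes = [
    ("model_a", "model_b", True),
    ("model_a", "model_b", True),
    ("model_b", "model_a", True),
]

for first, second, first_won in votes:
    ratings[first], ratings[second] = update_ratings(
        ratings[first], ratings[second], first_won
    )

print(ratings)
```

In a scheme like this, a model's score rises fastest when it beats opponents rated above it, which is why a leaderboard figure can keep moving as more votes arrive.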
According to xAI’s internal comparison of AI models, Grok-3 scored at least 10 points higher than its biggest competitors (ChatGPT o3-mini, o1, DeepSeek-R1 and Gemini-2 Flash Thinking) in math, science and coding.
Comparison between Grok-3 and other AI models. Source: xAI
Grok-3 dominates AI chatbots across all categories
LMArena also noted that the early Grok-3 model currently ranks first in all categories, including overall with style control, hard prompts and hard prompts with style control, coding, math, creative writing, instruction following, longer query and multi-turn.
Grok-3’s performance across all the top categories. Source: LMArena
Musk and the xAI team reiterated LMArena’s finding that the early Grok-3 model, codenamed chocolate, achieved a record score of 1400. “And it’s still climbing. So we have to keep updating it. It’s 1400 and climbing,” Musk said.
LMArena has not independently confirmed whether Grok-3’s ranking represents a significant breakthrough over competitors or if external factors, such as audience demographics, may have contributed to the model’s ranking.
Elon Musk prepares Grok-powered Tesla Bots for space exploration
Further into the announcement, Musk revealed plans to send a Tesla Bot, powered by xAI’s artificial intelligence model Grok, on SpaceX’s next Mars mission by the end of 2026.
He said most of SpaceX’s Mars exploration projects are slated for around Q4 2026.
He explained that the Earth-Mars transit window occurs every 26 months, making November 2026 the next ideal opportunity for rocket launches to the Red Planet.
Source: xAI
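The roughly 26-month transit window cited above follows from the orbital periods of the two planets. The short sketch below computes the Earth-Mars synodic period, the time between successive favorable alignments; the orbital periods are standard published values rather than figures from the announcement, and the function name is illustrative.

```python
# Synodic period: how often Earth "laps" Mars and the launch geometry repeats.
# Assumption: sidereal orbital periods are standard reference values,
# not numbers quoted in the announcement.

EARTH_YEAR_DAYS = 365.256  # Earth's sidereal orbital period
MARS_YEAR_DAYS = 686.980   # Mars' sidereal orbital period


def synodic_period_days(inner_period: float, outer_period: float) -> float:
    """Days between successive alignments of an inner and an outer planet."""
    return 1.0 / (1.0 / inner_period - 1.0 / outer_period)


period_days = synodic_period_days(EARTH_YEAR_DAYS, MARS_YEAR_DAYS)
period_months = period_days / (EARTH_YEAR_DAYS / 12)  # average month length

print(f"{period_days:.0f} days, about {period_months:.1f} months")
```

The result is close to 780 days, which is commonly rounded to 26 months.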
Musk also said he may be sending a Tesla Bot and Grok on the Mars mission:
“If all goes well, SpaceX will send Starship rockets to Mars with Optimus robots and Grok.”
Grok-3 engineer exits upon ultimatum
On Feb. 12, an xAI engineer quit over an X post in which he had ranked Grok-3 lower than ChatGPT, sharing his personal opinion prior to the model’s release.
Source: Benjamin DeKraker
“I either had to delete the post quoted below or face being fired,” DeKraker wrote, adding:
“After reviewing everything and thinking a lot, I’ve decided that I’m not going to delete the post -- which is very clearly a harmless personal opinion.”