Update (Feb. 19, 8:34 am UTC): This article has been updated to clarify that LMArena has not independently confirmed whether Grok-3’s ranking represents a significant breakthrough over competitors.

An early version of the newly launched Grok-3, a large language model (LLM) from xAI, has beaten rival AI systems from Google, OpenAI and DeepSeek in a community-driven blind evaluation.

On Feb. 18, Elon Musk announced xAI’s latest AI model release, Grok-3, during a livestream on X. During the livestream, the xAI team revealed it had released an early Grok-3 version on LMArena under the alias “chocolate” for community testing.

Source: LMArena

Grok-3 tops multiple AI performance metrics

Chatbot Arena, the community-driven AI evaluation platform run by LMArena, lets users compare AI models in a blind test by voting on responses from two anonymous chatbots. According to its website, the platform has recorded over a million votes from users.
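To illustrate how blind head-to-head votes can be turned into a leaderboard score, here is a minimal Python sketch that assumes a simple Elo-style update. This is an illustrative assumption, not LMArena's actual scoring method, which this article does not describe. The model names, the starting rating of 1000 and the K-factor of 32 are all hypothetical.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(ratings, winner, loser, k=32.0):
    """Shift rating points from loser to winner after one blind vote.

    k is a hypothetical step size; an upset win moves more points.
    """
    exp_win = expected_score(ratings[winner], ratings[loser])
    delta = k * (1.0 - exp_win)
    ratings[winner] += delta
    ratings[loser] -= delta
    return ratings


# Hypothetical models and votes, each recorded as (winner, loser).
ratings = {"model_a": 1000.0, "model_b": 1000.0}
votes = [("model_a", "model_b"), ("model_a", "model_b"), ("model_b", "model_a")]

for winner, loser in votes:
    update_ratings(ratings, winner, loser)

print(ratings)
```

In this sketch, a model that wins more of its matchups ends with the higher rating, and the total number of rating points across both models stays constant.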

According to xAI’s internal comparison of AI models, Grok-3 scored at least 10 points higher than its biggest competitors (OpenAI’s o3-mini and o1, DeepSeek-R1 and Gemini 2.0 Flash Thinking) in math, science and coding.


Comparison between Grok-3 and other AI models. Source: xAI

Grok-3 dominates AI chatbots across all categories

LMArena also noted that the early Grok-3 model currently ranks first in all categories, including overall with style control, hard prompts and hard prompts with style control, coding, math, creative writing, instruction following, longer query and multi-turn.

Grok-3’s performance across all the top categories. Source: LMArena

Musk and the xAI team reiterated LMArena’s finding that the early Grok-3 model, codenamed chocolate, reached a record score of 1400. “And it’s still climbing. So we have to keep updating it. It’s 1400 and climbing,” Musk said.

LMArena has not independently confirmed whether Grok-3’s ranking represents a significant breakthrough over competitors or if external factors, such as audience demographics, may have contributed to the model’s ranking.

Elon Musk prepares Grok-powered Tesla Bots for space exploration

Further into the announcement, Musk revealed plans to send a Tesla Bot, powered by xAI’s artificial intelligence model Grok, on SpaceX’s next Mars mission by the end of 2026.

During the discussion, he said that most of SpaceX’s projects for Mars exploration are slated for around Q4 2026.

He explained that the Earth-Mars transit window occurs every 26 months, making November 2026 the next ideal opportunity for rocket launches to the Red Planet.

Source: xAI
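The 26-month figure for the Earth-Mars transit window can be checked with a short calculation. The Python sketch below computes the synodic period, the time it takes Earth to lap Mars and return to the same relative alignment. It assumes approximate orbital periods of 365.25 days for Earth and 686.98 days for Mars; these values are standard approximations and do not come from the livestream.

```python
# Approximate orbital periods in days (assumed values, not from the announcement).
EARTH_YEAR_DAYS = 365.25
MARS_YEAR_DAYS = 686.98

# Synodic period: 1/S = 1/T_earth - 1/T_mars
synodic_days = 1.0 / (1.0 / EARTH_YEAR_DAYS - 1.0 / MARS_YEAR_DAYS)

# Convert to average calendar months (one twelfth of an Earth year).
synodic_months = synodic_days / (EARTH_YEAR_DAYS / 12)

print(f"{synodic_days:.0f} days, about {synodic_months:.1f} months")
```

The result is roughly 780 days, which rounds to the 26 months cited above.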

Musk also said he may be sending a Tesla Bot and Grok on the Mars mission:

“If all goes well, SpaceX will send Starship rockets to Mars with Optimus robots and Grok.”

Grok-3 engineer exits upon ultimatum

On Feb. 12, an xAI engineer quit over an X post in which he had ranked Grok-3 lower than ChatGPT, sharing his personal opinion prior to the model’s release.

Source: Benjamin DeKraker

“I either had to delete the post quoted below or face being fired,” DeKraker wrote, adding:

“After reviewing everything and thinking a lot, I’ve decided that I’m not going to delete the post -- which is very clearly a harmless personal opinion.”
