Numerai: A beginner’s guide to the AI-run, crowd-sourced hedge fund

What is Numerai?

The stock market is one of the hardest prediction targets in quantitative finance, even in the age of artificial intelligence (AI). The system is intricate because unpredictable events can wield significant influence over it. It is also remarkably efficient: even the smallest piece of new information can trigger a swift reaction in stock prices.

Imagine a solution in which all the available data that could move the market is gathered in one place, and anyone can try to make sense of it with the help of AI and blockchain. This is exactly what Numerai is doing.

Numerai is an innovative AI-enabled hedge fund that combines AI, blockchain technology and crowd-sourced stock market prediction models to revolutionize quantitative finance. The company functions as a platform disseminating encrypted financial data sets to a global network of data scientists who utilize machine learning (ML) to develop predictive models for signaling stock market trades. This collective intelligence from data scientists fuels the platform’s investing approach.

The unique economic model of Numerai is underpinned by Numeraire (NMR), its native Ethereum-based cryptocurrency. The network of data scientists includes more than 5,500 participants, who stake their NMR cryptocurrency on their predictive models in weekly Numerai tournaments. The system mimics the way traders are incentivized. Data scientists can earn tokens by building models that perform well on the data sets provided by Numerai, or lose tokens if their stock predictions underperform.

Besides being a technological pioneer, Numerai seems to outperform traditional hedge funds. Information about the company’s performance is not publicly available, but it made the news in February when Bloomberg reported a 20% return for investors in the middle of a financial downturn when most of the sector’s representatives struggled. 

The history of Numerai

Numerai was founded in 2015 by Richard Craib, a San Francisco-based entrepreneur with a background in quantitative finance and ML. Craib has had a passion for finance from a young age and studied mathematics at Cornell University. 

He founded Numerai with the aim of building a decentralized hedge fund that would leverage the power of AI and crowd-sourced intelligence. His vision and master plan for the company is nothing less than becoming the last hedge fund, in the form of a decentralized monopoly.

The company made headlines when it introduced Bitcoin (BTC) as a means of payment, reacting to feedback from contributors who lacked access to PayPal, the company's original payment solution. However, the key milestone in the history of Numerai was the introduction of its native cryptocurrency, NMR, in 2017, which made it the first hedge fund to have its own token.

After successful fundraising supported by Placeholder and Paradigm, Numerai decided to send over 1.2 million tokens to 19,000 data scientists around the world as an invitation to participate in the redesigned tournaments.

In 2018, Numerai announced the introduction of Erasure, a decentralized finance (DeFi) marketplace for any finance-related data. The Erasure protocol enables decentralized application building for users to buy and sell predictions and data using the Ethereum blockchain. 

This expansion into DeFi further solidified Numerai’s position as a pioneer in the integration of AI and blockchain technologies. In the same year, additional features were also added to the platform. The reputation board served the purpose of rewarding users with high-performing models and providing a measure of reputation within the community.

A new announcement followed in 2020. Numerai introduced Numerai Signals, a new product that aimed to democratize access to financial data. Numerai Signals offers the opportunity for the company to collect entirely new complementary signals based on any new data sources provided by data scientists. 

In 2022, Numerai opened a new pillar of the hedge fund, Numerai Supreme. Numerai Supreme is based on the same investment framework as the flagship fund (now called Numerai One) but with a different portfolio, which promises higher returns with higher volatility. 

How does Numerai work?

Numerai’s concept was inspired by Kaggle’s data science competitions, where data scientists design ML models to solve different AI problems. In the case of Numerai, the aim of each competition is to give predictions about the stock market. Due to the way the platform and the competition are designed, participation does not require prior knowledge about finance itself but rather a deep understanding of ML.

One of the biggest obstacles to building effective ML models for financial prediction is limited access to data. Financial data is sensitive and highly valuable, so traditional hedge funds keep their data to themselves. Numerai solved this problem with obfuscated data, which means that the features do not show the original values of the data but a modified version of them.

In the case of Numerai, the method of obfuscation is structure-preserving encryption. After the encryption, the raw data remains hidden, while its structure is preserved; therefore, ML models can learn the structure of the data sets and build predictions on top of them.
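To make the idea of hidden values with preserved structure concrete, here is a minimal sketch. It is not Numerai's actual scheme, which the article does not detail. Rank-based binning is swapped in as a simple stand-in, and the function name, the bin count and the sample values are all made up for illustration.

```python
def obfuscate(values, n_bins=5):
    """Replace each raw value with an evenly spaced bin label based on its rank.

    Illustrative stand-in only: the raw magnitudes disappear, but the
    relative ordering of the rows (the "structure") survives.
    """
    # Indices of the rows, sorted from smallest raw value to largest.
    order = sorted(range(len(values)), key=lambda i: values[i])
    binned = [0.0] * len(values)
    for rank, idx in enumerate(order):
        bin_index = rank * n_bins // len(values)  # 0 .. n_bins - 1
        binned[idx] = bin_index / (n_bins - 1)    # label in [0, 1]
    return binned


# Hypothetical raw feature, e.g. some valuation ratio for ten stocks.
raw_feature = [12.4, 3.1, 45.0, 8.7, 20.2, 5.5, 31.9, 15.0, 2.2, 27.3]
hidden_feature = obfuscate(raw_feature)
print(hidden_feature)
```

A model trained on the hidden feature never sees a value such as 45.0, yet it can still learn that the third stock sits at the top of the range.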

Numerai offers open access to its data sets to a global network of data scientists, who build ML models to provide predictions about the stock market. These models solve generic ML problems rather than focusing on specific financial scenarios. 

Numerai tournaments are divided into rounds. A new round opens every week and each round lasts one month, so several rounds are in progress at any given time. The company releases a new set of data each week. It consists of a set of training data and a set of test data: participants train their models on the former and run them against the latter to generate their predictions.
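The train-then-predict workflow can be sketched in a few lines. This is a toy illustration under stated assumptions, not Numerai's example code: it uses a single made-up feature, a plain least-squares line instead of a real ML model, and hypothetical stock identifiers.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    variance = sum((x - x_bar) ** 2 for x in xs)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


# Training data: obfuscated feature values with known targets (made up).
train_feature = [0.0, 0.25, 0.5, 0.75, 1.0]
train_target = [0.1, 0.3, 0.5, 0.6, 0.9]
slope, intercept = fit_line(train_feature, train_target)

# Test data: feature values only; the targets are not known yet.
test_rows = {"stock_a": 0.25, "stock_b": 1.0, "stock_c": 0.5}
predictions = {name: slope * x + intercept for name, x in test_rows.items()}
print(predictions)
```

The dictionary of predictions is what a participant would submit for the round. A real entry would use many features and a far more capable model.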

[Figure: how Numerai turns signals into optimized portfolios]


The platform even provides examples of basic and more advanced code for building ML predictions. However, contributors are incentivized by the staking program to build their unique and original solutions. Submitted predictions are scored, and scores depend on the accuracy and originality of the models. 

Payouts are calculated from the information coefficient, that is, the correlation between the predictions and the returns after optimization. Contributors do not compete against one another in the classical sense; their performance is measured by how well their predictions hold up on new data.
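An information coefficient of this kind can be sketched as a rank correlation between predictions and realized returns. The choice of Spearman rank correlation is an assumption made for illustration, as are the function names and the sample numbers. The article does not specify the exact formula Numerai uses.

```python
from math import sqrt


def ranks(values):
    """Rank positions (0 = smallest). Assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def information_coefficient(predictions, returns):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(predictions), ranks(returns))


# Made-up predictions for five stocks and their later realized returns.
predictions = [0.9, 0.1, 0.5, 0.7, 0.3]
realized = [0.04, -0.02, 0.03, 0.01, -0.01]
ic = information_coefficient(predictions, realized)
print(round(ic, 3))
```

A score near 1 means the predicted ordering of stocks matched the realized ordering, a score near 0 means no relationship, and a negative score means the ordering was largely reversed.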

The submission of the predictions is followed by the computation of the meta model. Numerai creates a stake-weighted meta model by combining the latest predictions from the Numerai tournaments and predicts a signal for each stock. 

Calculating the stake-weighted averages means that the importance of each participant’s prediction in the meta model is weighted by the person’s staked amount. By imposing constraints on risk factors, such as country, sector and market risk, convex optimization transforms the meta model signal into a portfolio.
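The stake-weighted average can be sketched as follows. The function name, the data layout and the numbers are hypothetical, and the sketch assumes every participant predicts every stock. The subsequent convex optimization step is not shown.

```python
def stake_weighted_meta_model(submissions):
    """Combine predictions into one signal per stock, weighted by stake.

    `submissions` is a list of (stake, {stock: prediction}) pairs.
    Assumes every participant submits a prediction for every stock.
    """
    total_stake = sum(stake for stake, _ in submissions)
    meta = {}
    for stake, predictions in submissions:
        weight = stake / total_stake  # share of total stake
        for stock, prediction in predictions.items():
            meta[stock] = meta.get(stock, 0.0) + weight * prediction
    return meta


# Three made-up participants staking 10, 30 and 60 NMR.
submissions = [
    (10, {"stock_a": 0.8, "stock_b": 0.1}),
    (30, {"stock_a": 0.6, "stock_b": 0.5}),
    (60, {"stock_a": 0.2, "stock_b": 0.9}),
]
meta_model = stake_weighted_meta_model(submissions)
print(meta_model)
```

Here the participant with 60 NMR at stake contributes 60% of each combined signal, so the meta model leans toward that participant's view of both stocks.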

Besides the “classical” Numerai tournaments, a new product called Numerai Signals came to life in 2020. Numerai Signals allows participants to submit market signals based on their own data. In tournaments, participants are expected to predict and rank targets based on a given dataset, whereas in Numerai Signals, data scientists provide the list of stocks that they are willing to signal.

How does the NMR token work?

Blockchain technology is used to incentivize participants. NMR serves as the native utility token of the Numerai platform. The NMR cryptocurrency follows the ERC-20 standard, and the smart contracts are controlled by the Erasure protocol. The tokenomics of NMR are designed to create a self-sustaining ecosystem where data scientists are incentivized to contribute high-quality models, stake NMR tokens, and actively engage in the platform’s governance.

Data scientists who want to participate in the Numerai tournament are required to stake a certain amount of NMR cryptocurrency. New participants can acquire NMR tokens by purchasing them, but Numerai also offers NMR as bug bounties. 

The staked NMR acts as a commitment to the quality of their models and ensures that participants have “skin in the game.” The payout of the participants is primarily a function of their scores. A positive score brings a payout, while a negative score is penalized by burning a proportion of their staked tokens.
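The payout-or-burn mechanic can be sketched as below. The article only says that payouts are primarily a function of scores, so the simple proportional rule, the function name and the numbers are all assumptions for illustration rather than Numerai's actual formula.

```python
def settle_round(stake, score):
    """Return (new_stake, change) after a round is scored.

    Hypothetical rule: the stake changes in proportion to the score.
    A positive change is a payout; a negative change is burned.
    A participant can never lose more than the amount staked.
    """
    change = max(-stake, stake * score)
    return stake + change, change


# Made-up example: 100 NMR staked in two different rounds.
print(settle_round(100.0, 0.05))   # good round: stake grows
print(settle_round(100.0, -0.02))  # bad round: part of the stake is burned
```

Under this rule, a participant who stakes more earns more when the model performs well and loses more when it does not, which is what gives the stake its "skin in the game" effect.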

Burning the stake behind badly performing predictions keeps the incentives balanced and also helps to partly regulate the supply of NMR. With the same motivation, Numerai also burned part of the NMR supply when it introduced Erasure.

NMR token holders also have governance rights within the Numerai platform. They can participate in voting on important decisions, such as protocol upgrades, parameter changes and platform improvements. 


NMR has external value beyond the Numerai ecosystem; it can be traded on various cryptocurrency exchanges. As staking NMR enables data scientists to amplify their earnings, it makes sense to consider its economic value. 

For instance, for a data scientist with an infallible model, the value of all NMR would be equivalent to the net present value of all future stake payouts by Numerai.
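Net present value discounts each future payout back to today. The sketch below uses the standard formula with an invented payout stream and discount rate. Neither figure comes from Numerai.

```python
def net_present_value(payouts, discount_rate):
    """Sum of future payouts, each discounted by (1 + rate) ** period.

    `payouts[0]` is assumed to arrive one period from now.
    """
    return sum(
        payout / (1 + discount_rate) ** period
        for period, payout in enumerate(payouts, start=1)
    )


# Hypothetical: three yearly payouts of 100 units, discounted at 10%.
npv = net_present_value([100, 100, 100], 0.10)
print(round(npv, 2))
```

The later a payout arrives, the less it adds to the total, which is why the valuation depends on both the size and the timing of the expected payouts.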

Risks associated with AI-powered, crowd-sourced hedge funds

While promising, crowdsourced hedge funds driven by AI include some inherent risks. Dependence on machine learning models exposes these funds to biases and mistakes in the algorithms, which may result in poor investment choices. Furthermore, due to their complexity, AI models are susceptible to changes in market conditions, which may result in huge losses.

Concerns surrounding data security and integrity can arise from putting users' trust in a decentralized network of contributors. The coordination of various strategies by different participants may lead to conflicting activities that have an impact on the performance of the fund as a whole. 

Moreover, such risks can be amplified by regulatory obstacles and a lack of human supervision, thereby jeopardizing investor interests. Addressing these concerns is essential to ensuring the long-term viability and success of crowdsourced, AI-driven hedge funds as the sector develops.

Written by Eleonóra Bassi