Predicting Token Sale Probabilities with Lock-up x ROI Using Random Forest | by Yann MASTIN | Mar, 2025

The biggest challenge in this analysis is collecting data from addresses that sold their tokens. To achieve this, we rely on staking protocols that agree to provide anonymized entry and exit data, as well as data on transfers to centralized exchanges (CEX). Another challenge is estimating the size of each sale relative to the initially locked balance (the holder's "bag").

This model is still in draft form, and all contributions or suggestions for improvement are greatly appreciated.

Predicting token sale probabilities is critical in tokenomics design, especially when managing liquidity, staking rewards, and investor expectations. Most existing models focus on either the Lock-up Period or ROI independently. We decided to combine both to develop a more comprehensive prediction model. This strategy uses a Random Forest model to assess the likelihood of token sales based on two key factors: Dormancy Period (lock-up duration) and Return on Investment (ROI) conditions. By analyzing these factors, the model provides insights into potential selling behaviors and allows better calibration of vesting and staking mechanisms.

Why Use a Random Forest Model?

The Random Forest algorithm is particularly well-suited for this use case due to its strength in:

Handling Non-Linear Relationships: Token sale behavior often has complex patterns that Random Forest effectively captures.
Robustness to Outliers: Dormancy periods and ROI spikes are often volatile, and Random Forest mitigates these effects by averaging multiple decision trees.
Flexibility in Feature Importance: The model reports how much each factor, such as staking, ROI levels, and vesting conditions, contributes to its predictions, so the weighting of factors is learned from the data rather than fixed in advance.

Key Model Features

The Random Forest model leverages the following features to predict the probability of a token sale:

Dormancy Period: Number of days since vesting or staking began, representing the lock-up period.
ROI (Return on Investment): The financial return observed relative to the initial token price, influencing selling decisions.
Staking and Vesting Conditions: Historical data on staking behavior, with factors such as lock-in periods, reward schedules, and distribution methods.
Market Conditions: Price volatility and trend indicators are also included for better forecasting.

Model Workflow

1. Data Preparation

Extract historical data on token vesting, staking lock-ups, and price-based ROI.
Clean and normalize the data to ensure uniform scale and format.

2. Feature Engineering

The following engineered features are created to improve model performance:

Days Since Vesting: Captures the dormancy period.
ROI Thresholds: Identifies key ROI points that trigger selling.
Token Sale Volume: Historical volume data to calibrate market behavior.
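
The engineered features above can be sketched in a few lines of Python. This is a minimal illustration, not the author's pipeline: the Position schema, the ROI threshold levels, and the sold_fraction field (a stand-in for historical sale volume, expressed relative to the initially locked amount) are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical ROI levels used as threshold flags; the article does not
# say which levels matter, so these values are placeholders.
ROI_THRESHOLDS = (0.5, 1.0, 2.0)


@dataclass
class Position:
    """One anonymized staking or vesting position (assumed schema)."""
    start: date            # day vesting or staking began
    entry_price: float     # token price at entry
    locked_amount: float   # tokens initially locked


def engineer_features(pos, today, price, sold_amount):
    """Build the engineered features for a single position."""
    roi = (price - pos.entry_price) / pos.entry_price
    features = {
        # Dormancy period in days.
        "days_since_vesting": (today - pos.start).days,
        "roi": roi,
        # Share of the initially locked amount sold so far.
        "sold_fraction": sold_amount / pos.locked_amount,
    }
    # One binary flag per threshold: has ROI reached this level?
    for level in ROI_THRESHOLDS:
        features[f"roi_ge_{level}"] = int(roi >= level)
    return features


pos = Position(start=date(2024, 1, 1), entry_price=0.25, locked_amount=10_000)
row = engineer_features(pos, today=date(2024, 4, 10), price=0.625,
                        sold_amount=2_500)
print(row)
```

Encoding thresholds as binary flags is one option among several; a tree-based model can also find split points on the raw ROI value by itself.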

3. Training Process

The model is trained using Random Forest’s ensemble method, where each decision tree predicts token sale probabilities.
Predictions are averaged to generate a final probability score, improving stability and reducing variance.
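
As a rough sketch of this training step, the snippet below fits scikit-learn's RandomForestClassifier on synthetic data and checks that the forest's probability is the mean of the per-tree probabilities. The data, the labelling rule, and the hyperparameters are invented for illustration; real inputs would come from the staking and CEX transfer data described earlier.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): dormancy in days and ROI per holder.
n = 500
dormancy_days = rng.integers(0, 365, size=n)
roi = rng.normal(0.5, 1.0, size=n)

# Assumed labelling rule for the toy data only: selling becomes more likely
# with higher ROI and less likely with longer dormancy.
logit = 1.5 * roi - 0.01 * dormancy_days
sold = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

X = np.column_stack([dormancy_days, roi])
model = RandomForestClassifier(n_estimators=100, min_samples_leaf=5,
                               random_state=0)
model.fit(X, sold)

# Two hypothetical holders: (dormancy days, ROI).
holders = np.array([[30, 2.0], [300, 0.1]])
forest_proba = model.predict_proba(holders)[:, 1]

# Average the probability of class 1 ("sold") across the individual trees.
tree_proba = np.mean(
    [tree.predict_proba(holders)[:, 1] for tree in model.estimators_],
    axis=0,
)

print(forest_proba)
print(np.allclose(forest_proba, tree_proba))
```

After fitting, model.feature_importances_ gives a first look at how much each input contributes to the splits.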

4. Prediction Output

The final output is the probability of a token sale given the lock-up period and ROI conditions, together with a stochastic price input.
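
The article does not say how the stochastic price enters the prediction. One possible reading, sketched below, is to simulate the price at the end of the lock-up, convert each outcome to an ROI, and average the model's sale probability over the simulated outcomes. Geometric Brownian motion, the drift and volatility values, and toy_predict (a placeholder for a trained model) are all assumptions.

```python
import math
import random


def simulate_roi_at_unlock(entry_price, days, mu=0.0, sigma=0.8,
                           n_paths=1000, seed=42):
    """Simulate ROI at the end of a lock-up under geometric Brownian motion.

    mu and sigma are annualized drift and volatility (assumed values).
    """
    rng = random.Random(seed)
    t = days / 365.0
    rois = []
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        # Closed-form GBM terminal price after time t.
        price = entry_price * math.exp(
            (mu - 0.5 * sigma ** 2) * t + sigma * math.sqrt(t) * z
        )
        rois.append((price - entry_price) / entry_price)
    return rois


def expected_sale_probability(predict, days, rois):
    """Average a model's sale probability over simulated ROI outcomes."""
    return sum(predict(days, r) for r in rois) / len(rois)


def toy_predict(days, roi):
    """Placeholder for a trained model's output; coefficients are invented."""
    return 1.0 / (1.0 + math.exp(-(1.5 * roi - 0.01 * days)))


rois = simulate_roi_at_unlock(entry_price=1.0, days=180)
p_sale = expected_sale_probability(toy_predict, 180, rois)
print(round(p_sale, 3))
```

In practice toy_predict would be replaced by a call to the fitted model, and the price process by whatever dynamics the project considers realistic.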

The model’s predictive power is best illustrated through sensitivity analysis. Key insights include:

Impact of Dormancy Period: As dormancy increases, token sale probability typically drops unless ROI spikes significantly.
Influence of ROI Thresholds: Higher ROI values sharply increase the likelihood of sales, even for long dormancy periods.
Staking Duration Impact: Extended staking periods tend to reduce short-term sale probability but may lead to a large sell-off once vesting ends.
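
A sensitivity analysis of this kind can be run by evaluating the model on a grid of dormancy and ROI values. The sketch below works with any callable that returns a probability; toy_predict is a stand-in with invented coefficients, so the pattern it prints is built into the stand-in and is not evidence about real holders.

```python
import math


def sensitivity_grid(predict, dormancy_values, roi_values):
    """Evaluate predict(dormancy, roi) on every combination of the inputs."""
    return [[predict(d, r) for r in roi_values] for d in dormancy_values]


def toy_predict(dormancy_days, roi):
    """Stand-in for a trained model; the coefficients are invented."""
    return 1.0 / (1.0 + math.exp(-(1.5 * roi - 0.01 * dormancy_days)))


dormancy_values = [30, 90, 180, 365]
roi_values = [0.0, 0.5, 1.0, 2.0]
grid = sensitivity_grid(toy_predict, dormancy_values, roi_values)

# Rows are dormancy periods in days, columns are ROI levels.
print("days/roi " + " ".join(f"{r:>6.1f}" for r in roi_values))
for d, row in zip(dormancy_values, grid):
    print(f"{d:>8} " + " ".join(f"{p:>6.2f}" for p in row))
```

Reading the table along a row shows the effect of ROI at a fixed dormancy, and reading down a column shows the effect of dormancy at a fixed ROI.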

The Dormancy x ROI model has valuable applications across multiple Web3 scenarios:

Liquidity Management: Predict when large token sales might occur to manage liquidity pools effectively.
Incentive Calibration: Design staking and vesting mechanisms to align investor behavior with project goals.
Investor Strategy: Identify high-risk ROI points where sell pressure is likely to surge.

The Random Forest model offers a powerful framework to predict token sale behavior by combining Dormancy Period and ROI Analysis. This data-driven approach allows crypto projects to make informed decisions regarding token distribution, liquidity planning, and staking strategies. By understanding the key drivers behind token sales, teams can better anticipate and mitigate risks in volatile markets.
