aarnâ AI
> deep learning for DeFi
AI has long augmented financial market analysis, mirroring the more recent transformative impact seen in natural language processing and other fields. Drawing parallels with the seminal work "Attention Is All You Need," which introduced the Transformer architecture and reshaped AI paradigms, the aarnâ AI alpha 30/7 model leverages state-of-the-art deep learning to redefine investment strategies in decentralized finance (DeFi).
Traditional finance giants like Renaissance Technologies have demonstrated the outlier potential of advanced quantitative strategies in equity and derivative markets. However, the DeFi landscape requires redefined approaches to feature extraction and market analysis. The aarnâ AI alpha 30/7 deep learning model is designed to navigate the volatile landscape of cryptocurrency markets, bringing advanced quant techniques to digital assets.
Building on a comprehensive dataset that includes blockchain transactional data, social sentiment, technical indicators, and high-value user behavior, the alpha 30/7 model employs an advanced AI architecture that integrates Variational Autoencoders (VAE), Long Short-Term Memory networks (LSTM), and attention mechanisms. This framework enables the model to capture and learn non-linear patterns, aiming for absolute returns with downside risk cover. By drawing on the foundational principles of deep learning and attention mechanisms, the alpha 30/7 model sets a new standard for AI-driven crypto structured products in DeFi.
Extensive backtesting highlights the model's efficacy, with returns exceeding 300% over 12 months, not only outperforming benchmarks such as the CCI30 and Bitcoin but also remaining positive every quarter, even in negative markets. Pioneering the convergence of AI and DeFi, at the heart of the aarnâ protocol's approach is the âtv802 vault, a smart contract that applies the predictive analytics of alpha 30/7 and tokenizes the strategy into a crypto structured product. In essence, "Alpha Is All We Desire" encapsulates the use of cutting-edge AI to unlock participation in DeFi at scale, making it a valuable tool for high-value investors seeking consistent, superior performance in digital assets.
In aarnâ's alpha 30/7 neural network architecture, processing begins with a Variational Autoencoder (VAE), which transforms the input dataset of 93 features into a 32-dimensional latent space. These latent features are then passed into LSTM layers that capture and analyze temporal dependencies within the sequence. Enhancing this analysis, an attention mechanism focuses selectively on the most pertinent aspects of the LSTM outputs, ensuring that critical information is emphasized for subsequent layers. The processed data is then integrated and further interpreted in the Dense layers (ANN), which apply non-linear transformations to consolidate the insights derived from earlier stages. The architecture is fortified with a risk management framework, incorporating a probability filter designed to avoid predictions when the model lacks confidence and a dynamic stop-loss mechanism adjusted based on market movements to mitigate downside risks. This flow ensures that the network not only predicts effectively but also guards against potential financial volatility.
VAEs are useful for generating new data samples similar to the original data, making them ideal for understanding complex datasets and discovering latent structures. The alpha 30/7 architecture utilizes VAEs to reduce noise in the high-dimensional data (93 features) by encoding it into a lower-dimensional latent space. This probabilistic approach captures underlying factors, allowing for a more compact and meaningful representation.
The alpha 30/7’s VAE architecture consists of two main components: the encoder and the decoder. The encoder maps the input data to a latent space characterized by a mean and log-variance, defining a distribution rather than a single point. This is achieved using a series of dense layers with ReLU activation. The decoder reconstructs the original data from the latent space, employing a similar network structure but with a sigmoid activation in the final layer to output values in the range [0, 1].
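A minimal Keras sketch of this encoder/decoder structure, assuming the 93-feature input and 32-dimensional latent space described above; hidden-layer widths and names are illustrative, not the production configuration:

```python
import tensorflow as tf
from tensorflow.keras import layers

INPUT_DIM = 93   # handcrafted features (from the text)
LATENT_DIM = 32  # latent dimensions (from the text)

# Encoder: dense/ReLU stack mapping inputs to a mean and log-variance.
encoder_inputs = tf.keras.Input(shape=(INPUT_DIM,))
x = layers.Dense(64, activation="relu")(encoder_inputs)
z_mean = layers.Dense(LATENT_DIM, name="z_mean")(x)
z_log_var = layers.Dense(LATENT_DIM, name="z_log_var")(x)

# Reparameterization trick: sample z from the learned distribution.
def sample_z(args):
    mean, log_var = args
    eps = tf.random.normal(tf.shape(mean))
    return mean + tf.exp(0.5 * log_var) * eps

z = layers.Lambda(sample_z, name="z")([z_mean, z_log_var])
encoder = tf.keras.Model(encoder_inputs, [z_mean, z_log_var, z], name="encoder")

# Decoder: mirrors the encoder, sigmoid output to reconstruct values in [0, 1].
latent_inputs = tf.keras.Input(shape=(LATENT_DIM,))
x = layers.Dense(64, activation="relu")(latent_inputs)
decoder_outputs = layers.Dense(INPUT_DIM, activation="sigmoid")(x)
decoder = tf.keras.Model(latent_inputs, decoder_outputs, name="decoder")
```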
LSTMs are advanced recurrent neural network architectures capable of learning long-term dependencies in sequential data. The alpha 30/7 architecture uses a Bidirectional LSTM that processes input sequences through a 64-unit layer with return sequences enabled, enhancing the model's ability to capture temporal relationships at each time step. Regularization techniques such as L2 regularization, with a regularization factor of 0.1, are used to prevent overfitting by penalizing large weights, helping to maintain a balance between model complexity and generalization. The learning rate is set at 1e-3, ensuring that the optimizer converges efficiently while avoiding the pitfalls of overshooting or slow convergence. The model employs the Adam optimizer, known for its adaptive learning rate capabilities, and binary cross-entropy loss, optimizing it for binary classification tasks.
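A hedged sketch of how that layer could be configured in Keras with the stated unit count, L2 factor, learning rate, optimizer, and loss; the sequence length and the pooling/output head are assumptions added only to make the snippet compile end to end:

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

SEQ_LEN = 30      # assumed lookback window (the 30/7 naming suggests ~30 days)
LATENT_DIM = 32   # VAE latent features per time step (from the text)

inputs = tf.keras.Input(shape=(SEQ_LEN, LATENT_DIM))
# 64-unit Bidirectional LSTM, return_sequences=True, L2 factor 0.1 (values from the text).
lstm_out = layers.Bidirectional(
    layers.LSTM(64, return_sequences=True, kernel_regularizer=regularizers.l2(0.1))
)(inputs)
# Placeholder head so the model can be compiled for a binary target.
pooled = layers.GlobalAveragePooling1D()(lstm_out)
outputs = layers.Dense(1, activation="sigmoid")(pooled)

model = tf.keras.Model(inputs, outputs)
# Adam at 1e-3 with binary cross-entropy, as described above.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss="binary_crossentropy",
    metrics=[tf.keras.metrics.Precision(), tf.keras.metrics.Recall()],
)
```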
The attention mechanism plays an important role in the LSTM architecture that processes 32 features derived from a Variational Autoencoder (VAE). This setup allows the LSTM to selectively concentrate on specific features within the sequence that are most predictive for the task. The attention layer utilizes a query-value attention scheme, where each of the 32 features can be weighted differently based on their relevance to the output prediction. This method ensures that the network doesn’t treat all features equally but focuses more on those that enhance its predictive accuracy. By integrating this attention mechanism, the LSTM can efficiently handle the complexity and variability of the features generated by the VAE, leading to more nuanced and effective model performance.
The fully connected layers are fundamental building blocks where every input neuron is connected to every output neuron. The Dense layers in our model are particularly important for integrating and interpreting the features extracted by previous layers, such as LSTM outputs and attention-filtered sequences. By applying activation functions like ReLU, these layers introduce non-linearities into the model, enabling it to learn complex patterns and dependencies in the data. The final Dense layer, often followed by a sigmoid activation function in binary classification tasks, consolidates the learned features into a final output, providing the predictive probabilities.
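For illustration, a minimal sketch of how the query-value attention layer and the Dense head could sit on top of the LSTM outputs. Using the LSTM sequence as both query and value, the pooling step, and the layer widths are assumptions, not the confirmed production design:

```python
import tensorflow as tf
from tensorflow.keras import layers

SEQ_LEN = 30   # assumed lookback window
UNITS = 128    # width of the bidirectional LSTM output (2 x 64 units)

lstm_seq = tf.keras.Input(shape=(SEQ_LEN, UNITS))  # stand-in for the BiLSTM output

# Query-value attention: each time step is re-weighted by its relevance to the prediction.
attended = layers.Attention()([lstm_seq, lstm_seq])
context = layers.GlobalAveragePooling1D()(attended)

# Dense (ANN) layers apply non-linear transformations to the attended features.
x = layers.Dense(64, activation="relu")(context)
x = layers.Dense(32, activation="relu")(x)
# Final sigmoid output gives the predicted probability for the binary target.
prob = layers.Dense(1, activation="sigmoid")(x)

head = tf.keras.Model(lstm_seq, prob, name="attention_dense_head")
```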
The risk management system incorporates two advanced filters that effectively mitigate potential downside risks:
Probability Filter: This filter is designed to minimize losses by intervening when the model's confidence is low.
Dynamic Stop-Loss Filter: Adjusts its parameters based on Bitcoin (BTC) performance, setting a maximum allowable loss limit at -6%.
Together, these mechanisms provide a robust risk management framework. They proactively manage potential declines and adapt to evolving market conditions, thereby protecting investments from severe losses.
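A simplified sketch of how the two filters could gate a weekly position; the confidence threshold is illustrative, while the -6% stop level comes from the description above:

```python
def apply_risk_filters(pred_prob, btc_weekly_return,
                       confidence_threshold=0.6, stop_loss=-0.06):
    """Return True if a trade is allowed this week, False to stay in stablecoins.

    confidence_threshold is an assumed value; the -6% stop level is from the text.
    """
    # Probability filter: skip the trade when the model is not confident enough.
    if pred_prob < confidence_threshold:
        return False
    # Dynamic stop-loss: avoid exposure once BTC has breached the loss limit.
    if btc_weekly_return <= stop_loss:
        return False
    return True


# Example: a 0.72 probability with BTC down 3% on the week would allow the trade.
print(apply_risk_filters(0.72, -0.03))  # True
```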
DeFi liquidity is a constraining factor for autonomous execution onchain. aarnâ focuses on a dataset based on tokens with a minimum liquidity of $500k on Uniswap v3 on Ethereum. This approach ensures the model's training is based on assets that are economically significant and active, enhancing the relevance and accuracy of the predictions. It also ensures that long-tail assets are filtered out of both the training and inference datasets. The data indicates a concentration of tokens in the $2.5M to $5M liquidity range. Despite liquidity challenges in DeFi, the liquidity levels of the top 100 market cap tokens are improving, supporting the effective execution of predictions without significantly affecting the Uniswap liquidity pools. Additionally, many tokens in the study have market caps exceeding $100M. The distribution graphs below illustrate the token universe considered for alpha 30/7.
Figure 4: (a) distribution of tokens across Total Value Locked (TVL) USD categories; (b) number of tokens by market capitalization threshold.
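For illustration, a minimal pandas sketch of the $500k liquidity screen described above, assuming a hypothetical snapshot of per-token Uniswap v3 liquidity (token symbols and column names are placeholders):

```python
import pandas as pd

# Hypothetical snapshot of token liquidity on Uniswap v3 (Ethereum), in USD.
pools = pd.DataFrame({
    "token": ["AAA", "BBB", "CCC", "DDD"],
    "uniswap_v3_tvl_usd": [350_000, 2_700_000, 4_800_000, 120_000_000],
})

MIN_LIQUIDITY_USD = 500_000  # threshold from the text

# Keep only tokens with sufficient on-chain liquidity; long-tail assets drop out.
universe = pools[pools["uniswap_v3_tvl_usd"] >= MIN_LIQUIDITY_USD]
print(universe["token"].tolist())  # ['BBB', 'CCC', 'DDD']
```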
OHLCV: The Open-High-Low-Close-Volume data for all tokens in the universe is obtained from CoinGecko. The prices are updated daily to ensure the most recent market information is available for analysis.
User Transaction Data: Detailed transaction records from users involving the specified tokens are incorporated, providing insights into market dynamics such as buying, selling, and liquidity provision. These users are the top 100 holders of each token in the token universe on the Ethereum blockchain. This dataset is updated monthly to ensure accuracy and relevance.
Social Data: Tweets from key influencers and relevant blog posts are analyzed to gauge sentiment and public perception of specific tokens and the broader DeFi market. This data is obtained daily from Twitter using the Twitter API, and blogs are scraped from open websites using Beautiful Soup.
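As a hedged sketch of the blog-scraping step mentioned above, using requests and Beautiful Soup; the URL is a placeholder and the extraction logic is illustrative, not the production scraper:

```python
import requests
from bs4 import BeautifulSoup

# Placeholder URL; actual blog sources are selected by the data pipeline.
url = "https://example.com/defi-blog-post"
html = requests.get(url, timeout=30).text
soup = BeautifulSoup(html, "html.parser")

# Collect paragraph text for downstream sentiment scoring.
paragraphs = [p.get_text(strip=True) for p in soup.find_all("p")]
print(len(paragraphs), "paragraphs extracted")
```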
All collected data, including OHLCV, Twitter, blogs, and user transactions, are stored daily on AWS Cloud, in Amazon S3 as raw data. To transform this raw data into a structured format suitable for analysis, AWS Glue jobs are utilized, which automate the extraction, transformation, and loading processes. Additionally, AWS Lambda functions are employed to handle event-driven data processing, ensuring that data flows smoothly and remains up-to-date. The transformed data is then queried and analyzed using Amazon Athena.
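For illustration, a minimal boto3 sketch of querying the transformed data with Amazon Athena; the database, table, and S3 bucket names are assumptions, not the actual pipeline configuration:

```python
import boto3

athena = boto3.client("athena", region_name="us-east-1")

# Hypothetical database/table/bucket names for illustration only.
response = athena.start_query_execution(
    QueryString="""
        SELECT token, date, close, volume
        FROM ohlcv_daily
        WHERE date >= date_add('day', -30, current_date)
    """,
    QueryExecutionContext={"Database": "defi_features"},
    ResultConfiguration={"OutputLocation": "s3://example-bucket/athena-results/"},
)
print(response["QueryExecutionId"])
```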
The feature set for alpha 30/7 is extensive, with over 93 handcrafted features based on multiple data sources outlined in the data groups in the table below. It includes 17 sentiment-based features from Twitter and blog content, 18 transactional features on whale users filtered from a universe of 22k users to capture market impact, and various price-related metrics to capture market dynamics. For sentiment analysis, aarnâ utilizes the LLAMA and CryptoBERT models, specifically designed to capture trends from social media and blog data.
| Data source | Group | Description of Features Created |
| --- | --- | --- |
| OHLCV | Price and Market Dynamics Features | Includes day-specific price data, market volatility, moving averages, and trend and momentum indicators to analyze market behavior. |
| USER TRANSACTION | Transaction and Volume | Features capturing whale user activities and transaction values to reflect trading dynamics and volume flow. |
| SOCIAL MEDIA | Sentiment and Social Media | Data from Twitter and news articles to analyze sentiment trends, consistency, and dynamics through various sentiment scores and changes. |

Table 1: Feature Overview for DL Algorithm
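For the sentiment features described above, a hedged sketch of scoring tweets with a CryptoBERT-style checkpoint via the Hugging Face pipeline API; the model identifier is an assumption and should be replaced with the checkpoint actually used in production:

```python
from transformers import pipeline

# Model identifier is an assumption for illustration purposes only.
sentiment = pipeline("text-classification", model="ElKulako/cryptobert")

tweets = [
    "ETH looking strong after the upgrade, accumulating more",
    "Rug pull fears around this token, liquidity draining fast",
]
for tweet, result in zip(tweets, sentiment(tweets)):
    # Each result carries a sentiment label and a confidence score.
    print(result["label"], round(result["score"], 3), "-", tweet)
```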
To finalize aarnâ's alpha 30/7 architecture, extensive empirical testing was conducted:
Feature Integration: OHLCV data was used initially, with Twitter sentiment, blog insights, and user transaction patterns incorporated incrementally to enrich the dataset.

Feature Selection: XGBoost, Random Forest, PCA, and Recursive Feature Elimination were used to identify and retain the most impactful features.

Model Evaluation: Various models, including XGBoost, Random Forest, AutoML, and MLP, were tested to select the most effective one based on performance metrics.

Sliding Window Testing: Different time frames (30/7, 60/7, 90/7) were tested to optimize the balance between historical data depth and the prediction horizon (see the windowing sketch after this list).

Probability Filter Optimization: The probability filter was calibrated by testing various thresholds against historical data to effectively minimize downside risk.

Dynamic Stop-Loss Adjustment: Stop-loss techniques, ranging from constant to variable, were tested to determine the strategy that best protects investments during bearish markets while avoiding unnecessary exits.

Model Tuning: Extensive tuning involved experimenting with multiple layers, bidirectional LSTMs, custom loss functions (e.g., Tversky loss, focal loss, and weighted binary cross-entropy), dynamic parameter adjustments, regularization techniques (L2, L1), learning rates from 1e-2 to 1e-6, and varied activation functions (ReLU, tanh) and optimizers (Adam, RMSprop).
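A minimal sketch of the 30/7 sliding-window construction referenced above, assuming daily feature rows and prices per token; the lookback and horizon follow the 30/7 naming and the 5% labeling threshold comes from the results section, but the helper itself is illustrative:

```python
import numpy as np

def make_windows(features, prices, lookback=30, horizon=7, threshold=0.05):
    """Build (X, y) pairs: 30 days of features -> did the next 7-day return exceed 5%?"""
    X, y = [], []
    for t in range(lookback, len(features) - horizon):
        X.append(features[t - lookback:t])
        forward_return = prices[t + horizon] / prices[t] - 1.0
        y.append(1 if forward_return > threshold else 0)
    return np.asarray(X), np.asarray(y)


# Example with random data: 200 days x 32 latent features.
rng = np.random.default_rng(0)
X, y = make_windows(rng.normal(size=(200, 32)), rng.uniform(1, 2, size=200))
print(X.shape, y.shape)  # (163, 30, 32) (163,)
```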
To validate the VAE model, a custom VAE class was defined in TensorFlow/Keras, incorporating the essential components and training process. The training involves minimizing both the reconstruction loss and the KL divergence loss, balancing the fidelity of the reconstruction with the regularization of the latent space. These losses are monitored during training to ensure the model is learning appropriately. The trained VAE can then be used to analyze the contribution of different feature sets.
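A hedged sketch of that combined objective, assuming the standard VAE formulation of a reconstruction term plus a KL divergence term; the exact weighting used in the production class is not specified here:

```python
import tensorflow as tf

def vae_loss(x, x_reconstructed, z_mean, z_log_var):
    """Reconstruction loss plus KL divergence, the two terms monitored during training."""
    # Reconstruction term: how faithfully the decoder rebuilds the input features.
    reconstruction = tf.reduce_mean(
        tf.keras.losses.binary_crossentropy(x, x_reconstructed)
    )
    # KL term: regularizes the latent distribution toward a standard normal.
    kl = -0.5 * tf.reduce_mean(
        tf.reduce_sum(1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var), axis=-1)
    )
    return reconstruction + kl
```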
Figure 2 shows the distribution of features from OHLCV data (0-54), transaction data (55-75), and social media data (76-93) across the latent dimensions. This indicates that each feature set is integrated into the latent space, contributing to the final representation. The VAE captures the various types of data, providing a unified and comprehensive view of the dataset. This approach allows the model to utilize the strengths of each feature set, leading to a robust representation of the data. Note that these contributions are dynamic and can change weekly based on the data.
| Dataset | Precision | Recall |
| --- | --- | --- |
| Training 1Y Avg | 0.9 | 0.91 |
| Validation 1Y Avg | 0.62 | 0.44 |

Table 2: Precision and Recall for Target Class
The objective of the model is to identify instances where returns exceed 5% in a week. The model shows good performance in this regard: on the training set it achieves a precision of 0.9 and a recall of 0.91, correctly identifying most positive instances, while on the validation set precision is 0.62 and recall is 0.44, meaning a substantial proportion of the predicted positive instances are true positives. The loss curves for both training and validation datasets decrease consistently with each epoch, demonstrating the model's learning capability. While the classifier is not perfect, it effectively identifies tokens likely to return more than 5% in a week, aligning with the objective.
The model has shown impressive performance, surpassing both Bitcoin and the CCI30 index with returns over 300% in the last 12 months. The base model's risk filter has successfully reduced downside risks but tends to miss recovery opportunities due to its conservative settings. Although Bitcoin showed marginal positive returns in the last quarter, the model missed opportunities over a two-week period, as shown in Figure 7, due to the conservative nature of the risk filters.
Figure 6 shows the risk filter's effectiveness: "Flag 1" (teal) indicates safe trading periods, while "Flag 0" (magenta) marks high-risk periods in which trades are prevented and assets are held in stablecoins for that week.
While the alpha 30/7 DL model has shown impressive performance, surpassing both Bitcoin and the CCI30 index, there is still room for improvement. The risk filter's conservative calibration tends to miss opportunities during market recoveries, indicating the need for a design that better captures gains in the rebound phase.
The feature set will also be expanded to include blockchain transaction and consensus-level data and other metrics that might enhance the model's learning. Additionally, aarnâ will explore the use of advanced models such as transformers, which excel at handling complex relationships and temporal sequences in blockchain data.
Future plans also include open sourcing contributions to aarnâ AI via decentralized protocols.