Navigating the Data Maze
Data Selection
DeFi liquidity constraints pose challenges for autonomous on-chain execution. aarnâ addresses this by restricting its universe to tokens that meet a minimum liquidity threshold on decentralized exchanges (DEXs), so that the model is trained on economically significant, actively traded assets. The same filter removes long-tail assets from both the training and inference datasets, which improves the relevance and accuracy of predictions.
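The liquidity filter described above can be sketched in a few lines. This is a minimal illustration, not aarnâ's implementation: the threshold value, the `TokenLiquidity` type, the USD unit, and the token symbols are all assumptions made for the example, since the source does not state them.

```python
from dataclasses import dataclass

# Hypothetical threshold: the source does not state the actual value or unit.
MIN_LIQUIDITY_USD = 1_000_000.0


@dataclass(frozen=True)
class TokenLiquidity:
    symbol: str
    dex_liquidity_usd: float  # pooled DEX liquidity, assumed to be in USD


def select_tokens(tokens, min_liquidity_usd=MIN_LIQUIDITY_USD):
    """Return the symbols whose DEX liquidity meets the threshold."""
    return [t.symbol for t in tokens if t.dex_liquidity_usd >= min_liquidity_usd]


# Made-up universe: one liquid token, one at the boundary, one long-tail token.
universe = [
    TokenLiquidity("AAA", 25_000_000.0),
    TokenLiquidity("BBB", 1_000_000.0),
    TokenLiquidity("CCC", 40_000.0),
]

selected = select_tokens(universe)
```

Applying the same function to both the training and the inference universe keeps the two datasets consistent, which is the point the paragraph above makes about long-tail assets.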
Data Handling Pipeline
All collected data (OHLCV, Twitter, blogs, and user transactions) is stored daily as raw data in Amazon S3. AWS Glue jobs automate the extract, transform, and load (ETL) steps that turn this raw data into a structured format suitable for analysis. AWS Lambda functions handle event-driven processing, so that data keeps flowing through the pipeline and stays up to date. The transformed data is then queried and analyzed with Amazon Athena.
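As an illustration of the event-driven step, the sketch below shows a Lambda-style handler that reads an S3 event notification and lists the raw objects that just arrived. It assumes the function is triggered by S3 object-created events; the bucket name and key layout are hypothetical, and the downstream action (for example, starting a Glue job) is deliberately left out because the source does not describe it.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        s3 = record.get("s3")
        if s3 is None:
            continue  # skip records that are not S3 notifications
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(s3["object"]["key"])
        objects.append((bucket, key))
    return objects


def handler(event, context):
    """Lambda entry point: report which raw objects were received."""
    objects = extract_s3_objects(event)
    # A real handler would trigger downstream processing here; omitted in this sketch.
    return {"received": [f"s3://{bucket}/{key}" for bucket, key in objects]}


# Hypothetical event for local testing; bucket and key names are made up.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-raw-bucket"},
                "object": {"key": "raw/ohlcv/dt%3D2024-01-01/prices.json"},
            }
        }
    ]
}

result = handler(sample_event, None)
```

Keeping the parsing in a separate pure function makes the handler easy to test locally without any AWS credentials.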
Feature Engineering
The feature set for alpha 30/7 is extensive: more than 93 handcrafted features built from multiple data sources, grouped as outlined in the table below. It includes 17 sentiment-based features from Twitter and blog content, 18 transactional features on whale users (filtered from a universe of 22k users) to capture market impact, and various price-related metrics to capture market dynamics. For sentiment analysis, aarnâ uses the LLAMA and CryptoBERT models to capture trends in social media and blog data.
| Data source | Group | Description of features created |
| --- | --- | --- |
| OHLCV | Price and Market Dynamics | Includes day-specific price data, market volatility, moving averages, and trend and momentum indicators to analyze market behavior. |
| USER TRANSACTION | Transaction and Volume | Features capturing whale users' activities and transaction values, to reflect trading dynamics and volume flow. |
| SOCIAL MEDIA | Sentiment and Social Media | Data from Twitter and news articles to analyze sentiment trends, consistency, and dynamics through various sentiment scores and changes. |
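To make the price and market dynamics group more concrete, here is a minimal sketch of three features of the kind named in the table: a moving average, a momentum indicator, and a volatility measure. The exact definitions, window lengths, and feature names used for alpha 30/7 are not given in the source, so everything below is an assumption chosen for illustration, and the closing prices are made up.

```python
from statistics import mean, pstdev


def simple_moving_average(closes, window):
    """Mean of the last `window` closing prices."""
    if len(closes) < window:
        raise ValueError("not enough data for the requested window")
    return mean(closes[-window:])


def momentum(closes, lookback):
    """Fractional price change over the last `lookback` days."""
    if len(closes) <= lookback:
        raise ValueError("not enough data for the requested lookback")
    return closes[-1] / closes[-1 - lookback] - 1.0


def daily_returns(closes):
    """Simple day-over-day returns."""
    return [curr / prev - 1.0 for prev, curr in zip(closes, closes[1:])]


def volatility(closes):
    """Population standard deviation of daily returns."""
    return pstdev(daily_returns(closes))


# Made-up daily closing prices for a single token.
closes = [100.0, 102.0, 101.0, 105.0, 110.0]

features = {
    "sma_3": simple_moving_average(closes, 3),
    "momentum_4": momentum(closes, 4),
    "volatility": volatility(closes),
}
```

In practice each function would be evaluated per token and per day, producing one row of features for every (token, date) pair.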