# Navigating the Data Maze

#### Data Selection

DeFi liquidity constraints pose challenges for autonomous on-chain execution. aarnâ addresses this by focusing on tokens that meet a minimum liquidity threshold on decentralized exchanges (DEXs), so the model is trained on economically significant, actively traded assets. This filter removes long-tail assets from both the training and inference datasets, improving the relevance and accuracy of predictions.
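
The filter described above can be sketched as follows. This is a minimal illustration, not aarnâ's implementation: the threshold value, the `TokenLiquidity` type, and the token symbols are all hypothetical, since the source does not state the actual threshold or data schema.

```python
from dataclasses import dataclass

# Hypothetical threshold; the documentation does not state the real value.
MIN_LIQUIDITY_USD = 1_000_000.0


@dataclass(frozen=True)
class TokenLiquidity:
    """Illustrative record: a token and its pooled DEX liquidity in USD."""
    symbol: str
    dex_liquidity_usd: float


def select_tokens(tokens, min_liquidity_usd=MIN_LIQUIDITY_USD):
    """Return the symbols whose DEX liquidity meets the threshold."""
    return [t.symbol for t in tokens if t.dex_liquidity_usd >= min_liquidity_usd]


# Made-up universe: the third token is a long-tail asset and is dropped.
universe = [
    TokenLiquidity("AAA", 25_000_000.0),
    TokenLiquidity("BBB", 1_000_000.0),
    TokenLiquidity("CCC", 40_000.0),
]
eligible = select_tokens(universe)
```

Applying the same function to both the training and inference universes keeps the two datasets consistent.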

#### Data Handling Pipeline

All collected data, including OHLCV, Twitter, blog, and user-transaction data, is stored daily as raw data in Amazon S3. AWS Glue jobs automate the extraction, transformation, and loading (ETL) steps that convert this raw data into a structured format suitable for analysis. AWS Lambda functions handle event-driven processing, so that data flows smoothly and stays up to date. The transformed data is then queried and analyzed with Amazon Athena.
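
As a rough sketch of the event-driven step, the Lambda handler below reads a standard S3 event notification and groups newly written raw objects by data source. The key prefixes and the grouping logic are assumptions made for illustration; the source does not describe the actual bucket layout or what the Lambda functions do with each object.

```python
from urllib.parse import unquote_plus

# Hypothetical key prefixes, one per raw data source named in the text.
SOURCE_PREFIXES = ("ohlcv/", "twitter/", "blogs/", "user_transactions/")


def lambda_handler(event, context):
    """Group newly written raw S3 objects by data source.

    `event` follows the S3 event notification shape, where object keys
    arrive URL-encoded and must be decoded before use.
    """
    by_source = {}
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])
        for prefix in SOURCE_PREFIXES:
            if key.startswith(prefix):
                source = prefix.rstrip("/")
                by_source.setdefault(source, []).append(f"s3://{bucket}/{key}")
                break
    # A real handler might trigger a downstream transformation job here;
    # this sketch only reports what it found.
    return by_source
```

Objects outside the known prefixes are ignored, which keeps unrelated uploads from triggering downstream work.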

#### Feature Engineering

The feature set for alpha 30/7 is extensive, with more than 93 handcrafted features drawn from the data sources grouped in the table below. It includes 17 sentiment-based features from Twitter and blog content, 18 transactional features on whale users (filtered from a universe of 22k users) to capture market impact, and various price-related metrics to capture market dynamics. For sentiment analysis, aarnâ uses the Perplexity Sonar Pro and CryptoBERT models to capture trends in social media and blog data.

| Data source      | Group                            | Description of Features Created                                                                                                         |
| ---------------- | -------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |
| OHLCV            | Price and Market Dynamics        | Day-specific price data, market volatility, moving averages, and trend and momentum indicators to analyze market behavior. |
| USER TRANSACTION | Transaction and Volume           | Features capturing whale users' activities and transaction values to reflect trading dynamics and volume flow. |
| SOCIAL MEDIA     | Sentiment and Social Media       | Data from Twitter and news articles to analyze sentiment trends, consistency, and dynamics through various sentiment scores and changes. |
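
To make the price-related group more concrete, here is a minimal sketch of three such features (a simple moving average, return volatility, and momentum) computed from daily closing prices. The function names, the 7-day window, and the sample prices are illustrative assumptions; the source does not list the exact features or formulas used in alpha 30/7.

```python
from statistics import mean, pstdev


def daily_returns(closes):
    """Simple day-over-day returns from a list of closing prices."""
    return [closes[i] / closes[i - 1] - 1.0 for i in range(1, len(closes))]


def price_features(closes, window=7):
    """Compute illustrative features for the latest day.

    Needs window + 1 closes so that `window` returns are available.
    """
    if len(closes) < window + 1:
        raise ValueError("need at least window + 1 closes")
    recent = closes[-window:]
    returns = daily_returns(closes[-(window + 1):])
    return {
        "sma": mean(recent),              # simple moving average of closes
        "volatility": pstdev(returns),    # std. dev. of daily returns
        "momentum": closes[-1] / closes[-(window + 1)] - 1.0,  # window return
    }


# Made-up closing prices for eight consecutive days.
closes = [100.0, 102.0, 101.0, 105.0, 107.0, 106.0, 110.0, 112.0]
features = price_features(closes, window=7)
```

Sentiment and whale-transaction features would follow the same pattern: aggregate raw per-day records into fixed-window statistics per token.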


