XGBoost vs LSTM with Time-Series Cross-Validation
Learn how to use the temporal structure in CrowdCent's training data to improve prediction performance with an LSTM model compared to a traditional XGBoost approach. We'll use sklego.model_selection.TimeGapSplit to set up time-series cross-validation with gap periods that prevent leakage, then visualize performance on several metrics over time with moving averages.
Key Insight: Sequential Feature Processing
CrowdCent's training and inference data contain features with a defined temporal sequence through lag windows (e.g., feature_1_lag15, feature_1_lag10, feature_1_lag5, feature_1_lag0). While traditional models like XGBoost treat these as independent features, an LSTM can process them sequentially, capturing temporal dependencies.
Model Comparison
- XGBoost: Treats all features as independent inputs, achieving ~0.06 average 30-day Spearman correlation
- LSTM (from Centimators): Processes features sequentially by reshaping them along the lag axis, achieving ~0.19 average 30-day Spearman correlation, roughly a 3x improvement in this experiment
Feature Reshaping Example
The LSTM transforms the flat feature vector:
[feature_1_lag10, feature_1_lag5, feature_1_lag0, feature_2_lag10, feature_2_lag5, feature_2_lag0, ...]
into a sequential 2D tensor:
[[feature_1_lag10, feature_2_lag10],
[feature_1_lag5, feature_2_lag5],
[feature_1_lag0, feature_2_lag0]]
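The reshaping above can be sketched in plain Python. This is only an illustration of the index arithmetic, not the Centimators implementation: the helper name reshape_to_sequence is hypothetical, and it assumes the flat vector is ordered feature by feature exactly as in the example (all lags of feature_1, then all lags of feature_2).

```python
def reshape_to_sequence(flat, n_features, n_lags):
    """Turn a feature-major flat vector into n_lags rows of n_features values.

    Assumes flat is ordered [f1_lagA, f1_lagB, ..., f2_lagA, f2_lagB, ...],
    as in the example above. Hypothetical helper, for illustration only.
    """
    if len(flat) != n_features * n_lags:
        raise ValueError("flat vector length must equal n_features * n_lags")
    # Row t collects the t-th lag of every feature.
    return [
        [flat[f * n_lags + t] for f in range(n_features)]
        for t in range(n_lags)
    ]


flat = [
    "feature_1_lag10", "feature_1_lag5", "feature_1_lag0",
    "feature_2_lag10", "feature_2_lag5", "feature_2_lag0",
]
sequence = reshape_to_sequence(flat, n_features=2, n_lags=3)
for timestep in sequence:
    print(timestep)
```

Each printed row is one timestep, ordered from the oldest lag to the most recent, which is the order a recurrent layer would consume.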
What You'll Learn
- How to set up proper time-series cross-validation with gap periods to prevent leakage
- How to train both XGBoost and LSTM models on the same features for fair comparison
- How to evaluate models using CrowdCent's official scoring metrics (Spearman correlation and NDCG@40)
- How to visualize performance over time with moving averages
Key Findings
- 10-day predictions achieve higher raw scores than 30-day predictions (e.g., 0.255 vs 0.19 Spearman correlation for the LSTM)
- The LSTM significantly outperforms XGBoost without any hyperparameter tuning (experimental results)
- All results shown are out-of-sample using proper time-series cross-validation (TimeGapSplit from sklego)
Note: This comparison uses identical features for both models and no hyperparameter optimization, demonstrating the power of sequential processing for this dataset.
import crowdcent_challenge as cc
import polars as pl
from xgboost import XGBRegressor
from datetime import timedelta
from sklego.model_selection import TimeGapSplit
import altair as alt
import os
from crowdcent_challenge.scoring import evaluate_hyperliquid_submission
os.environ["KERAS_BACKEND"] = "jax"
from centimators.model_estimators import LSTMRegressor
Initialize the client
client = cc.ChallengeClient(
    challenge_slug="hyperliquid-ranking",
)
2025-06-16 20:28:18,994 - INFO - ChallengeClient initialized for 'hyperliquid-ranking' at URL: https://crowdcent.com/api
Get CrowdCent's training data
client.download_training_dataset(version="latest", dest_path="training_data.parquet")
data = pl.read_parquet("training_data.parquet")
data.head()
2025-06-16 20:28:19,256 - INFO - Downloading training data for challenge 'hyperliquid-ranking' v1.0 to training_data.parquet
Downloading training_data.parquet: 100%|██████████| 85.1M/85.1M [00:01<00:00, 66.6MB/s]
2025-06-16 20:28:20,974 - INFO - Successfully downloaded training data to training_data.parquet
id | eodhd_id | date | feature_16_lag15 | feature_13_lag15 | feature_14_lag15 | feature_15_lag15 | feature_8_lag15 | feature_5_lag15 | feature_6_lag15 | feature_7_lag15 | feature_12_lag15 | feature_9_lag15 | feature_10_lag15 | feature_11_lag15 | feature_4_lag15 | feature_1_lag15 | feature_2_lag15 | feature_3_lag15 | feature_20_lag15 | feature_17_lag15 | feature_18_lag15 | feature_19_lag15 | feature_16_lag10 | feature_13_lag10 | feature_14_lag10 | feature_15_lag10 | feature_8_lag10 | feature_5_lag10 | feature_6_lag10 | feature_7_lag10 | feature_12_lag10 | feature_9_lag10 | feature_10_lag10 | feature_11_lag10 | feature_4_lag10 | feature_1_lag10 | … | feature_5_lag5 | feature_6_lag5 | feature_7_lag5 | feature_12_lag5 | feature_9_lag5 | feature_10_lag5 | feature_11_lag5 | feature_4_lag5 | feature_1_lag5 | feature_2_lag5 | feature_3_lag5 | feature_20_lag5 | feature_17_lag5 | feature_18_lag5 | feature_19_lag5 | feature_16_lag0 | feature_13_lag0 | feature_14_lag0 | feature_15_lag0 | feature_8_lag0 | feature_5_lag0 | feature_6_lag0 | feature_7_lag0 | feature_12_lag0 | feature_9_lag0 | feature_10_lag0 | feature_11_lag0 | feature_4_lag0 | feature_1_lag0 | feature_2_lag0 | feature_3_lag0 | feature_20_lag0 | feature_17_lag0 | feature_18_lag0 | feature_19_lag0 | target_10d | target_30d |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
str | str | datetime[μs] | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | … | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 |
"BRETT" | "BRETT29743-USD.CC" | 2024-05-06 00:00:00 | 0.55721 | 0.540086 | 0.512754 | 0.508386 | 0.692426 | 0.55288 | 0.549131 | 0.533897 | 0.736748 | 0.617263 | 0.603953 | 0.623957 | 0.450295 | 0.479222 | 0.519303 | 0.528181 | 0.527312 | 0.525878 | 0.497428 | 0.518564 | 0.239416 | 0.398313 | 0.451205 | 0.487323 | 0.118248 | 0.405337 | 0.448927 | 0.492024 | 0.086131 | 0.41144 | 0.485798 | 0.550351 | 0.236496 | 0.343396 | … | 0.177372 | 0.365126 | 0.493638 | 0.351825 | 0.218978 | 0.418121 | 0.546089 | 0.268613 | 0.252555 | 0.365888 | 0.495271 | 0.420438 | 0.551095 | 0.538486 | 0.542948 | 0.233333 | 0.249513 | 0.323913 | 0.447919 | 0.289855 | 0.263176 | 0.334256 | 0.435102 | 0.428986 | 0.390405 | 0.400922 | 0.508626 | 0.272464 | 0.270538 | 0.306967 | 0.431848 | 0.401449 | 0.410944 | 0.507738 | 0.507471 | 0.507246 | 0.985507 |
"BRETT" | "BRETT29743-USD.CC" | 2024-05-07 00:00:00 | 0.456339 | 0.471133 | 0.514562 | 0.508886 | 0.54709 | 0.519471 | 0.515427 | 0.532992 | 0.613331 | 0.547406 | 0.606655 | 0.627392 | 0.563407 | 0.461703 | 0.512452 | 0.548603 | 0.517968 | 0.518984 | 0.491352 | 0.514658 | 0.233577 | 0.344958 | 0.446484 | 0.48631 | 0.116788 | 0.331939 | 0.402621 | 0.492332 | 0.056934 | 0.335133 | 0.450072 | 0.55014 | 0.110949 | 0.337178 | … | 0.175161 | 0.347316 | 0.470913 | 0.351814 | 0.204374 | 0.37589 | 0.527665 | 0.230646 | 0.170798 | 0.31625 | 0.470839 | 0.374209 | 0.602433 | 0.560709 | 0.53946 | 0.272464 | 0.269793 | 0.307375 | 0.428926 | 0.407246 | 0.32039 | 0.326165 | 0.426455 | 0.484058 | 0.417936 | 0.376534 | 0.491264 | 0.365217 | 0.297932 | 0.317555 | 0.420723 | 0.281159 | 0.327684 | 0.500999 | 0.487154 | 0.869565 | 0.985507 |
"BRETT" | "BRETT29743-USD.CC" | 2024-05-08 00:00:00 | 0.566488 | 0.453614 | 0.507711 | 0.529308 | 0.383759 | 0.434102 | 0.471631 | 0.518237 | 0.448497 | 0.491656 | 0.563988 | 0.615796 | 0.37807 | 0.391257 | 0.465007 | 0.524881 | 0.46842 | 0.55421 | 0.480155 | 0.521811 | 0.106569 | 0.336529 | 0.407477 | 0.490267 | 0.287591 | 0.335675 | 0.417691 | 0.514816 | 0.230657 | 0.339577 | 0.45308 | 0.574943 | 0.291971 | 0.33502 | … | 0.262028 | 0.348065 | 0.471507 | 0.332974 | 0.281815 | 0.386736 | 0.525698 | 0.239405 | 0.265688 | 0.328473 | 0.472002 | 0.498498 | 0.616402 | 0.585306 | 0.538537 | 0.365217 | 0.297916 | 0.317222 | 0.417801 | 0.255072 | 0.245769 | 0.290722 | 0.407634 | 0.314493 | 0.323733 | 0.331655 | 0.468477 | 0.171014 | 0.20521 | 0.270115 | 0.396829 | 0.205797 | 0.352147 | 0.476755 | 0.47601 | 0.913043 | 0.985507 |
"BRETT" | "BRETT29743-USD.CC" | 2024-05-09 00:00:00 | 0.382578 | 0.383157 | 0.46101 | 0.505768 | 0.401234 | 0.42504 | 0.441107 | 0.496157 | 0.321855 | 0.413705 | 0.524751 | 0.593716 | 0.40717 | 0.410203 | 0.472939 | 0.505173 | 0.590994 | 0.579914 | 0.497107 | 0.513647 | 0.286131 | 0.334355 | 0.427468 | 0.513196 | 0.268613 | 0.334924 | 0.391343 | 0.490949 | 0.337226 | 0.329541 | 0.440073 | 0.566184 | 0.254015 | 0.330593 | … | 0.261943 | 0.343492 | 0.450375 | 0.403047 | 0.370136 | 0.391921 | 0.52686 | 0.242272 | 0.248143 | 0.329173 | 0.449071 | 0.554988 | 0.581873 | 0.580894 | 0.524828 | 0.171014 | 0.205194 | 0.269774 | 0.393907 | 0.268116 | 0.261695 | 0.298309 | 0.387545 | 0.276812 | 0.339929 | 0.334735 | 0.464148 | 0.321739 | 0.282006 | 0.306299 | 0.392148 | 0.265217 | 0.410103 | 0.50499 | 0.468336 | 0.934783 | 0.992754 |
"BRETT" | "BRETT29743-USD.CC" | 2024-05-10 00:00:00 | 0.407299 | 0.398432 | 0.467491 | 0.485709 | 0.268613 | 0.414399 | 0.446095 | 0.489301 | 0.232117 | 0.413673 | 0.488628 | 0.5732 | 0.240876 | 0.396751 | 0.456653 | 0.506786 | 0.60292 | 0.598595 | 0.535617 | 0.530595 | 0.252555 | 0.329927 | 0.390503 | 0.489134 | 0.235036 | 0.251825 | 0.366699 | 0.490923 | 0.348905 | 0.290511 | 0.440729 | 0.546316 | 0.267153 | 0.254015 | … | 0.268333 | 0.341366 | 0.43541 | 0.420322 | 0.384613 | 0.399143 | 0.530058 | 0.233365 | 0.250259 | 0.323505 | 0.451219 | 0.38994 | 0.498619 | 0.548607 | 0.510192 | 0.321739 | 0.28199 | 0.305958 | 0.389226 | 0.249275 | 0.275452 | 0.263639 | 0.385433 | 0.344928 | 0.382625 | 0.336568 | 0.452784 | 0.343478 | 0.288422 | 0.271218 | 0.394539 | 0.273913 | 0.331926 | 0.468518 | 0.468964 | 0.949275 | 0.985507 |
Create cross-validation folds
cv_kwargs = {
    "val_days": 100,
    "gap_days": 30,
    "n_splits": 3,
    "cv_window_type": "rolling",
}
cv = TimeGapSplit(
    date_serie=data["date"],
    valid_duration=timedelta(days=cv_kwargs["val_days"]),
    gap_duration=timedelta(days=cv_kwargs["gap_days"]),
    n_splits=cv_kwargs["n_splits"],
    window=cv_kwargs["cv_window_type"],
)
cv.summary(data)
Start date | End date | Period | Unique days | nbr samples | part | fold |
---|---|---|---|---|---|---|
datetime[μs] | datetime[μs] | duration[μs] | i64 | i64 | str | i64 |
2020-02-26 00:00:00 | 2024-06-09 00:00:00 | 1565d | 1566 | 128580 | "train" | 0 |
2024-07-10 00:00:00 | 2024-10-17 00:00:00 | 99d | 100 | 14170 | "valid" | 0 |
2020-06-05 00:00:00 | 2024-09-17 00:00:00 | 1565d | 1566 | 138210 | "train" | 1 |
2024-10-18 00:00:00 | 2025-01-25 00:00:00 | 99d | 100 | 15160 | "valid" | 1 |
2020-09-13 00:00:00 | 2024-12-26 00:00:00 | 1565d | 1566 | 148300 | "train" | 2 |
2025-01-26 00:00:00 | 2025-05-05 00:00:00 | 99d | 100 | 16958 | "valid" | 2 |
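One way to read the summary above is to check that each validation window starts more than 30 days after its training window ends. The sketch below hard-codes the train end and validation start dates from the table and tests them against a label horizon. It assumes that target_30d for a given row depends on outcomes over the following 30 days, which is an interpretation of the column name rather than something stated in the data. The helper names are hypothetical.

```python
from datetime import date

# (train end date, validation start date) per fold, copied from the summary table.
folds = [
    (date(2024, 6, 9), date(2024, 7, 10)),
    (date(2024, 9, 17), date(2024, 10, 18)),
    (date(2024, 12, 26), date(2025, 1, 26)),
]


def gap_days(train_end, valid_start):
    """Number of calendar days between the last train date and the first validation date."""
    return (valid_start - train_end).days


def check_no_label_overlap(folds, horizon_days):
    """True if every fold's validation window starts after the last train label's horizon.

    Assumption: a label with horizon h on day d uses information up to day d + h.
    """
    return all(gap_days(train_end, valid_start) > horizon_days
               for train_end, valid_start in folds)


print([gap_days(*fold) for fold in folds])
print(check_no_label_overlap(folds, horizon_days=30))
```

Under that assumption, a gap shorter than the longest target horizon would let training labels overlap the validation period, which is why gap_days is set to match the 30-day target.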
Define features, targets, and lag windows
feature_cols = [
    col
    for col in data.columns
    if col not in ["id", "eodhd_id", "date", "target_10d", "target_30d"]
]
target_cols = ["target_10d", "target_30d"]
lag_windows = [0, 5, 10, 15]
n_features_per_timestep = len(feature_cols) // len(lag_windows)
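The integer division above silently assumes that every lag window carries the same set of features. A small sanity check can be sketched by parsing the column names. The helper summarize_lag_columns and the regular expression are hypothetical, and they assume the feature_<id>_lag<n> naming pattern visible in the table header.

```python
import re
from collections import defaultdict

# Assumed naming pattern, based on the column names shown in the data preview.
LAG_PATTERN = re.compile(r"^feature_(\d+)_lag(\d+)$")


def summarize_lag_columns(columns):
    """Return (sorted lag windows, features per lag) derived from column names.

    Columns that do not match the pattern (ids, dates, targets) are ignored.
    Raises ValueError if lag windows carry different numbers of features.
    """
    features_by_lag = defaultdict(set)
    for name in columns:
        match = LAG_PATTERN.match(name)
        if match is None:
            continue
        feature_id, lag = int(match.group(1)), int(match.group(2))
        features_by_lag[lag].add(feature_id)
    lags = sorted(features_by_lag)
    counts = {len(features_by_lag[lag]) for lag in lags}
    if len(counts) != 1:
        raise ValueError("every lag window should carry the same number of features")
    return lags, counts.pop()


example_columns = [
    "id", "date",
    "feature_1_lag5", "feature_2_lag5",
    "feature_1_lag0", "feature_2_lag0",
    "target_10d",
]
lags, per_step = summarize_lag_columns(example_columns)
print(lags, per_step)
```

Calling summarize_lag_columns(data.columns) on the real dataset should agree with lag_windows and n_features_per_timestep if the naming assumption holds.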
Train models on each train fold and predict validation
print("Training and evaluating XGBoost model with detailed scoring...")
fold_data = []
models = [
    XGBRegressor(n_estimators=2000),
    LSTMRegressor(
        output_units=2,
        lag_windows=lag_windows,
        n_features_per_timestep=n_features_per_timestep,
    ),
]
for fold, (train_idx, val_idx) in enumerate(cv.split(data)):
    print(f"\nFold {fold + 1}/{cv.n_splits}")
    # Get train and validation data
    train_data = data[train_idx]
    val_data = data[val_idx]
    print(f" Train dates: {train_data['date'].min()} to {train_data['date'].max()}")
    print(f" Val dates: {val_data['date'].min()} to {val_data['date'].max()}")
    print(f" Train samples: {len(train_data)}, Val samples: {len(val_data)}")
    # Train each model on this fold
    for model in models:
        fit_kwargs = {}
        if isinstance(model, LSTMRegressor):
            fit_kwargs["epochs"] = 5
            fit_kwargs["validation_data"] = (
                val_data[feature_cols].to_pandas(),
                val_data[target_cols].to_pandas(),
            )
        model.fit(
            train_data[feature_cols].to_pandas(),
            train_data[target_cols].to_pandas(),
            **fit_kwargs,
        )
        # Make predictions
        preds = pl.from_numpy(
            model.predict(val_data[feature_cols].to_pandas()),
            {"pred_10d": pl.Float64, "pred_30d": pl.Float64},
        )
        preds = preds.with_columns(
            pl.lit(fold).alias("fold"), pl.lit(model.__class__.__name__).alias("model")
        )
        fold_data.append(val_data.with_columns(preds))
Training and evaluating XGBoost model with detailed scoring...
Fold 1/3
Train dates: 2020-02-26 00:00:00 to 2024-06-09 00:00:00
Val dates: 2024-07-10 00:00:00 to 2024-10-17 00:00:00
Train samples: 128580, Val samples: 14170
Epoch 1/5
4019/4019 ━━━━━━━━━━━━━━━━━━━━ 22s 5ms/step - loss: 0.0817 - mse: 0.0817 - val_loss: 0.0784 - val_mse: 0.0784
Epoch 2/5
4019/4019 ━━━━━━━━━━━━━━━━━━━━ 21s 5ms/step - loss: 0.0784 - mse: 0.0784 - val_loss: 0.0781 - val_mse: 0.0781
Epoch 3/5
4019/4019 ━━━━━━━━━━━━━━━━━━━━ 21s 5ms/step - loss: 0.0780 - mse: 0.0780 - val_loss: 0.0781 - val_mse: 0.0781
Epoch 4/5
4019/4019 ━━━━━━━━━━━━━━━━━━━━ 21s 5ms/step - loss: 0.0783 - mse: 0.0783 - val_loss: 0.0781 - val_mse: 0.0781
Epoch 5/5
4019/4019 ━━━━━━━━━━━━━━━━━━━━ 21s 5ms/step - loss: 0.0781 - mse: 0.0781 - val_loss: 0.0778 - val_mse: 0.0778
28/28 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step
Fold 2/3
Train dates: 2020-06-05 00:00:00 to 2024-09-17 00:00:00
Val dates: 2024-10-18 00:00:00 to 2025-01-25 00:00:00
Train samples: 138210, Val samples: 15160
Epoch 1/5
4320/4320 ━━━━━━━━━━━━━━━━━━━━ 24s 5ms/step - loss: 0.0773 - mse: 0.0773 - val_loss: 0.0734 - val_mse: 0.0734
Epoch 2/5
4320/4320 ━━━━━━━━━━━━━━━━━━━━ 22s 5ms/step - loss: 0.0775 - mse: 0.0775 - val_loss: 0.0729 - val_mse: 0.0729
Epoch 3/5
4320/4320 ━━━━━━━━━━━━━━━━━━━━ 22s 5ms/step - loss: 0.0773 - mse: 0.0773 - val_loss: 0.0728 - val_mse: 0.0728
Epoch 4/5
4320/4320 ━━━━━━━━━━━━━━━━━━━━ 22s 5ms/step - loss: 0.0772 - mse: 0.0772 - val_loss: 0.0729 - val_mse: 0.0729
Epoch 5/5
4320/4320 ━━━━━━━━━━━━━━━━━━━━ 22s 5ms/step - loss: 0.0773 - mse: 0.0773 - val_loss: 0.0731 - val_mse: 0.0731
30/30 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step
Fold 3/3
Train dates: 2020-09-13 00:00:00 to 2024-12-26 00:00:00
Val dates: 2025-01-26 00:00:00 to 2025-05-05 00:00:00
Train samples: 148300, Val samples: 16958
Epoch 1/5
4635/4635 ━━━━━━━━━━━━━━━━━━━━ 25s 5ms/step - loss: 0.0769 - mse: 0.0769 - val_loss: 0.0767 - val_mse: 0.0767
Epoch 2/5
4635/4635 ━━━━━━━━━━━━━━━━━━━━ 24s 5ms/step - loss: 0.0767 - mse: 0.0767 - val_loss: 0.0783 - val_mse: 0.0783
Epoch 3/5
4635/4635 ━━━━━━━━━━━━━━━━━━━━ 24s 5ms/step - loss: 0.0769 - mse: 0.0769 - val_loss: 0.0770 - val_mse: 0.0770
Epoch 4/5
4635/4635 ━━━━━━━━━━━━━━━━━━━━ 24s 5ms/step - loss: 0.0764 - mse: 0.0764 - val_loss: 0.0767 - val_mse: 0.0767
Epoch 5/5
4635/4635 ━━━━━━━━━━━━━━━━━━━━ 24s 5ms/step - loss: 0.0768 - mse: 0.0768 - val_loss: 0.0787 - val_mse: 0.0787
34/34 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step
combined_preds = pl.concat(fold_data)
combined_preds.head()
id | eodhd_id | date | feature_16_lag15 | feature_13_lag15 | feature_14_lag15 | feature_15_lag15 | feature_8_lag15 | feature_5_lag15 | feature_6_lag15 | feature_7_lag15 | feature_12_lag15 | feature_9_lag15 | feature_10_lag15 | feature_11_lag15 | feature_4_lag15 | feature_1_lag15 | feature_2_lag15 | feature_3_lag15 | feature_20_lag15 | feature_17_lag15 | feature_18_lag15 | feature_19_lag15 | feature_16_lag10 | feature_13_lag10 | feature_14_lag10 | feature_15_lag10 | feature_8_lag10 | feature_5_lag10 | feature_6_lag10 | feature_7_lag10 | feature_12_lag10 | feature_9_lag10 | feature_10_lag10 | feature_11_lag10 | feature_4_lag10 | feature_1_lag10 | … | feature_9_lag5 | feature_10_lag5 | feature_11_lag5 | feature_4_lag5 | feature_1_lag5 | feature_2_lag5 | feature_3_lag5 | feature_20_lag5 | feature_17_lag5 | feature_18_lag5 | feature_19_lag5 | feature_16_lag0 | feature_13_lag0 | feature_14_lag0 | feature_15_lag0 | feature_8_lag0 | feature_5_lag0 | feature_6_lag0 | feature_7_lag0 | feature_12_lag0 | feature_9_lag0 | feature_10_lag0 | feature_11_lag0 | feature_4_lag0 | feature_1_lag0 | feature_2_lag0 | feature_3_lag0 | feature_20_lag0 | feature_17_lag0 | feature_18_lag0 | feature_19_lag0 | target_10d | target_30d | pred_10d | pred_30d | fold | model |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
str | str | datetime[μs] | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | … | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | i32 | str |
"BRETT" | "BRETT29743-USD.CC" | 2024-07-10 00:00:00 | 0.582636 | 0.63196 | 0.567633 | 0.653365 | 0.567062 | 0.530895 | 0.508704 | 0.56644 | 0.505835 | 0.622784 | 0.645015 | 0.691582 | 0.61006 | 0.661289 | 0.568315 | 0.640492 | 0.504829 | 0.469691 | 0.471515 | 0.486004 | 0.591667 | 0.587152 | 0.516999 | 0.629483 | 0.386723 | 0.476893 | 0.436182 | 0.551673 | 0.581011 | 0.543423 | 0.546789 | 0.664751 | 0.453945 | 0.532003 | … | 0.486167 | 0.554475 | 0.619781 | 0.508145 | 0.481045 | 0.571167 | 0.583708 | 0.431666 | 0.397232 | 0.433462 | 0.443389 | 0.4 | 0.460027 | 0.523589 | 0.557202 | 0.367133 | 0.485561 | 0.481227 | 0.496003 | 0.427972 | 0.409647 | 0.476535 | 0.569828 | 0.388811 | 0.448478 | 0.49024 | 0.547042 | 0.544056 | 0.487861 | 0.460837 | 0.467384 | 0.328671 | 0.405594 | 0.262604 | 0.436791 | 0 | "XGBRegressor" |
"TIA" | "TIA-USD.CC" | 2024-07-10 00:00:00 | 0.527324 | 0.462547 | 0.424517 | 0.479083 | 0.667706 | 0.596006 | 0.501783 | 0.502782 | 0.664125 | 0.489914 | 0.433712 | 0.452701 | 0.547324 | 0.464556 | 0.434525 | 0.467587 | 0.532696 | 0.549262 | 0.500588 | 0.489446 | 0.333488 | 0.430406 | 0.42528 | 0.450303 | 0.26735 | 0.467528 | 0.457388 | 0.474888 | 0.385108 | 0.524616 | 0.436511 | 0.439001 | 0.351354 | 0.449339 | … | 0.41404 | 0.451977 | 0.446531 | 0.44049 | 0.395922 | 0.430239 | 0.438368 | 0.548498 | 0.521432 | 0.535347 | 0.489605 | 0.758042 | 0.569009 | 0.499707 | 0.453081 | 0.752448 | 0.547282 | 0.507405 | 0.484174 | 0.883916 | 0.663444 | 0.59403 | 0.471979 | 0.630769 | 0.53563 | 0.492484 | 0.447263 | 0.591608 | 0.570053 | 0.541792 | 0.516922 | 0.013986 | 0.174825 | 0.189965 | 0.13181 | 0 | "XGBRegressor" |
"UNIBOT" | "UNIBOT27009-USD.CC" | 2024-07-10 00:00:00 | 0.197304 | 0.492413 | 0.455958 | 0.434261 | 0.217384 | 0.427973 | 0.415099 | 0.429836 | 0.145171 | 0.438711 | 0.40283 | 0.410917 | 0.204346 | 0.441176 | 0.439864 | 0.43401 | 0.592475 | 0.495348 | 0.519382 | 0.52508 | 0.450291 | 0.323797 | 0.469618 | 0.423626 | 0.39935 | 0.308367 | 0.436676 | 0.406346 | 0.364257 | 0.254714 | 0.410198 | 0.388812 | 0.434758 | 0.319552 | … | 0.533015 | 0.485863 | 0.425274 | 0.539693 | 0.487225 | 0.464201 | 0.42267 | 0.379602 | 0.435206 | 0.465277 | 0.503209 | 0.381818 | 0.435098 | 0.379448 | 0.425286 | 0.51049 | 0.504368 | 0.406368 | 0.417618 | 0.376224 | 0.538998 | 0.396856 | 0.403073 | 0.511888 | 0.52579 | 0.422671 | 0.437971 | 0.604196 | 0.491899 | 0.516771 | 0.511046 | 0.972028 | 0.811189 | 0.312127 | 0.459849 | 0 | "XGBRegressor" |
"GMX" | "GMX11857-USD.CC" | 2024-07-10 00:00:00 | 0.655352 | 0.547352 | 0.52505 | 0.51411 | 0.493541 | 0.478626 | 0.453018 | 0.502233 | 0.582777 | 0.488875 | 0.457043 | 0.480831 | 0.577586 | 0.516362 | 0.489627 | 0.518291 | 0.43994 | 0.512894 | 0.457628 | 0.51247 | 0.511632 | 0.583492 | 0.489914 | 0.525345 | 0.619905 | 0.556723 | 0.478647 | 0.518188 | 0.57777 | 0.580273 | 0.487355 | 0.488807 | 0.597449 | 0.587517 | … | 0.582902 | 0.535889 | 0.505657 | 0.598907 | 0.598178 | 0.55727 | 0.56189 | 0.466148 | 0.491062 | 0.501978 | 0.503281 | 0.488112 | 0.564592 | 0.574042 | 0.535743 | 0.425175 | 0.522023 | 0.539373 | 0.507692 | 0.492308 | 0.54017 | 0.560222 | 0.477271 | 0.461538 | 0.530223 | 0.55887 | 0.523608 | 0.577622 | 0.521885 | 0.499921 | 0.489724 | 0.265734 | 0.447552 | 0.362585 | 0.391194 | 0 | "XGBRegressor" |
"MERL" | "MERL-USD.CC" | 2024-07-10 00:00:00 | 0.526539 | 0.40421 | 0.405903 | 0.454694 | 0.657304 | 0.499628 | 0.401089 | 0.481043 | 0.523863 | 0.432692 | 0.431368 | 0.517578 | 0.523581 | 0.48196 | 0.395542 | 0.446414 | 0.650423 | 0.513289 | 0.508811 | 0.512539 | 0.516113 | 0.521326 | 0.438554 | 0.481455 | 0.491579 | 0.574441 | 0.459491 | 0.504831 | 0.569024 | 0.546444 | 0.467102 | 0.548991 | 0.481897 | 0.502739 | … | 0.548784 | 0.490738 | 0.517577 | 0.364572 | 0.423235 | 0.452597 | 0.444308 | 0.417768 | 0.43888 | 0.476085 | 0.494537 | 0.584615 | 0.535802 | 0.528564 | 0.453047 | 0.535664 | 0.446439 | 0.51044 | 0.44371 | 0.605594 | 0.567069 | 0.556756 | 0.491442 | 0.584615 | 0.474594 | 0.488666 | 0.429555 | 0.544056 | 0.480912 | 0.51806 | 0.489741 | 0.055944 | 0.517483 | 0.277185 | 0.227254 | 0 | "XGBRegressor" |
Calculate scores
def calculate_daily_scores(predictions_df: pl.DataFrame) -> pl.DataFrame:
    """Calculate daily scores for predictions using hyperliquid evaluation metrics.
    Args:
        predictions_df: DataFrame containing predictions and targets
    Returns:
        DataFrame with daily evaluation metrics
    """
    columns = ["pred_10d", "target_10d", "pred_30d", "target_30d"]
    return (
        # Collect each day's predictions and targets into lists, per model
        predictions_df.group_by(["date", "model"])
        .agg([pl.col(col) for col in columns])
        .with_columns(
            pl.struct(columns)
            .alias("daily_scores")
            .map_elements(
                lambda x: evaluate_hyperliquid_submission(
                    y_true_10d=x["target_10d"],
                    y_pred_10d=x["pred_10d"],
                    y_true_30d=x["target_30d"],
                    y_pred_30d=x["pred_30d"],
                ),
                return_dtype=pl.Struct,
            )
        )
    ).unnest("daily_scores")
daily_scores = calculate_daily_scores(combined_preds)
daily_scores
date | model | pred_10d | target_10d | pred_30d | target_30d | spearman_10d | spearman_30d | ndcg@40_10d | ndcg@40_30d |
---|---|---|---|---|---|---|---|---|---|
datetime[μs] | str | list[f64] | list[f64] | list[f64] | list[f64] | f64 | f64 | f64 | f64 |
2025-03-19 00:00:00 | "LSTMRegressor" | [0.510342, 0.589169, … 0.516715] | [0.611429, 0.831429, … 0.177143] | [0.520879, 0.534651, … 0.528912] | [0.845714, 0.231429, … 0.977143] | 0.407122 | 0.284606 | 0.749886 | 0.690413 |
2025-03-05 00:00:00 | "XGBRegressor" | [0.438884, 0.374294, … 0.463834] | [0.212644, 0.066092, … 0.833333] | [0.33935, 0.566031, … 0.488783] | [0.45977, 0.175287, … 0.511494] | 0.152309 | 0.054923 | 0.653985 | 0.550478 |
2024-10-13 00:00:00 | "LSTMRegressor" | [0.614008, 0.540451, … 0.517269] | [0.655629, 0.443709, … 0.907285] | [0.592994, 0.529025, … 0.515621] | [0.741722, 0.125828, … 0.695364] | 0.127505 | 0.073099 | 0.603702 | 0.608025 |
2025-04-08 00:00:00 | "XGBRegressor" | [0.509997, 0.595915, … 0.702079] | [0.938202, 0.176966, … 0.983146] | [0.625393, 0.684723, … 0.639499] | [0.955056, 0.69382, … 0.713483] | 0.287763 | 0.04348 | 0.639029 | 0.585965 |
2024-10-24 00:00:00 | "XGBRegressor" | [0.532243, 0.478276, … 0.520768] | [0.188312, 0.168831, … 0.305195] | [0.66408, 0.632372, … 0.574681] | [0.883117, 0.103896, … 0.350649] | 0.228429 | 0.094441 | 0.655664 | 0.615117 |
… | … | … | … | … | … | … | … | … | … |
2025-03-10 00:00:00 | "LSTMRegressor" | [0.416265, 0.472912, … 0.559355] | [0.554286, 0.597143, … 0.748571] | [0.460617, 0.44555, … 0.547587] | [0.542857, 0.408571, … 0.925714] | 0.178997 | 0.125798 | 0.566258 | 0.631034 |
2024-08-06 00:00:00 | "XGBRegressor" | [0.913959, 0.477772, … 0.412959] | [0.041379, 0.641379, … 0.682759] | [0.760317, 0.657521, … 0.529122] | [0.034483, 0.02069, … 0.462069] | 0.239257 | -0.001098 | 0.696198 | 0.599227 |
2024-07-18 00:00:00 | "XGBRegressor" | [0.68644, 0.285066, … 0.467805] | [0.482517, 0.083916, … 0.951049] | [0.57571, 0.447701, … 0.350697] | [0.160839, 0.286713, … 0.825175] | 0.16556 | -0.128605 | 0.649676 | 0.524151 |
2024-10-14 00:00:00 | "XGBRegressor" | [0.428081, 0.506207, … 0.358323] | [0.112583, 0.298013, … 0.827815] | [0.41764, 0.546764, … 0.229562] | [0.344371, 0.192053, … 0.549669] | 0.269487 | 0.179918 | 0.697752 | 0.658282 |
2024-09-12 00:00:00 | "LSTMRegressor" | [0.458443, 0.373957, … 0.47088] | [0.136986, 0.047945, … 0.979452] | [0.488663, 0.425867, … 0.500952] | [0.60274, 0.205479, … 0.876712] | 0.20172 | 0.056749 | 0.675872 | 0.585648 |
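To make the two metric families in the table concrete, here is a minimal pure-Python sketch of Spearman rank correlation and NDCG@k for a single day. It is not the code behind evaluate_hyperliquid_submission: the function names are hypothetical, ties are given average ranks, and the NDCG variant uses the raw target value as the gain with a log2 position discount, which may differ from the official definition.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(y_true, y_pred):
    """Pearson correlation computed on the ranks of the two inputs."""
    rt, rp = average_ranks(y_true), average_ranks(y_pred)
    n = len(rt)
    mt, mp = sum(rt) / n, sum(rp) / n
    cov = sum((a - mt) * (b - mp) for a, b in zip(rt, rp))
    var_t = sum((a - mt) ** 2 for a in rt)
    var_p = sum((b - mp) ** 2 for b in rp)
    return cov / math.sqrt(var_t * var_p)


def ndcg_at_k(y_true, y_pred, k):
    """DCG of the top-k predicted items divided by the best achievable DCG.

    Assumes at least one positive target value, otherwise the ideal DCG is zero.
    """
    def dcg(relevances):
        return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances[:k]))

    by_pred = [t for _, t in sorted(zip(y_pred, y_true), key=lambda p: p[0], reverse=True)]
    ideal = sorted(y_true, reverse=True)
    return dcg(by_pred) / dcg(ideal)


y_true = [0.9, 0.1, 0.5]
good_pred = [0.8, 0.2, 0.3]  # same ordering as y_true
bad_pred = [0.2, 0.8, 0.3]   # reversed ordering
print(spearman(y_true, good_pred), spearman(y_true, bad_pred))
print(ndcg_at_k(y_true, good_pred, k=2), ndcg_at_k(y_true, bad_pred, k=2))
```

Spearman rewards getting the whole ranking right, while NDCG@k only looks at the k items the model ranks highest, so the two can disagree on which model is better on a given day.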
from centimators.feature_transformers import MovingAverageTransformer
daily_scores = daily_scores.sort("date")
ma_transformer = MovingAverageTransformer(
    feature_names=["spearman_10d", "spearman_30d", "ndcg@40_10d", "ndcg@40_30d"],
    windows=[7, 30],
)
ma_columns = ma_transformer.fit_transform(
    daily_scores, ticker_series=daily_scores["model"]
)
daily_scores_df = daily_scores.with_columns(ma_columns)
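The idea behind the smoothing step can be sketched without any library: a trailing mean computed separately for each model over rows already sorted by date. This is an illustration, not the MovingAverageTransformer implementation. The function names are hypothetical, and leaving the first window - 1 values undefined is an assumption about warm-up handling.

```python
def trailing_moving_average(values, window):
    """Trailing mean over the last `window` values; None until a full window exists."""
    out = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value that left the window
        out.append(running / window if i >= window - 1 else None)
    return out


def moving_average_by_group(rows, window):
    """rows: list of (group, value) pairs sorted by date; returns an aligned list.

    Each group (here, each model) gets its own moving average so that
    scores from different models are never mixed.
    """
    by_group = {}
    for idx, (group, value) in enumerate(rows):
        by_group.setdefault(group, []).append((idx, value))
    out = [None] * len(rows)
    for items in by_group.values():
        smoothed = trailing_moving_average([v for _, v in items], window)
        for (idx, _), m in zip(items, smoothed):
            out[idx] = m
    return out


rows = [("A", 1.0), ("B", 10.0), ("A", 3.0), ("B", 20.0)]
print(moving_average_by_group(rows, window=2))
```

Grouping matters because daily_scores interleaves rows from both models. The same reason explains why the data is sorted by date before smoothing.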
Plot validation metrics (combining all folds)
def plot_metric_comparison(df, metric_name, anchor_ref=0, width=400, height=200):
    """Plot a single metric across 10d and 30d timeframes side by side with shared y-axis.
    Args:
        df: Polars DataFrame with daily scores
        metric_name: Base metric name ('spearman' or 'ndcg@40')
        anchor_ref: y-value of the gray dashed reference line
        width: Chart width in pixels per timeframe
        height: Chart height
    """
    # Convert to pandas once
    pdf = df.to_pandas()
    # Shared selection for zooming
    brush = alt.selection_interval(bind="scales", encodings=["x"])
    # Define timeframes
    timeframes = ["10d", "30d"]
    # Get column names for this metric
    col_10d = f"{metric_name}_10d"
    col_30d = f"{metric_name}_30d"
    # Calculate shared y-axis domain, padded by 5% of the range on each side
    # (scaling the bounds directly would shrink the domain when min_val is negative)
    min_val = min(df[col_10d].min(), df[col_30d].min())
    max_val = max(df[col_10d].max(), df[col_30d].max())
    pad = 0.05 * (max_val - min_val)
    y_domain = [min_val - pad, max_val + pad]
    charts = []
    for timeframe in timeframes:
        col_name = f"{metric_name}_{timeframe}"
        # Calculate per-model statistics
        model_stats = (
            df.group_by("model")
            .agg(
                [
                    pl.col(col_name).mean().alias("mean"),
                    pl.col(col_name).std().alias("std"),
                ]
            )
            .with_columns((pl.col("mean") / pl.col("std")).alias("sharpe"))
        )
        # Create concise title with mean values
        mean_values = model_stats.to_pandas()
        mean_text = " | ".join(
            [f"{row['model']}: {row['mean']:.3f}" for _, row in mean_values.iterrows()]
        )
        title_text = f"{metric_name.upper()} {timeframe}\nMeans: {mean_text}"
        # Daily scores as points
        chart = (
            alt.Chart(pdf)
            .add_params(brush)
            .mark_point(opacity=0.6)
            .encode(
                x=alt.X("date:T", title="Date"),
                y=alt.Y(
                    f"{col_name}:Q",
                    title=f"{metric_name.upper()} {timeframe}",
                    scale=alt.Scale(domain=y_domain),
                ),
                color=alt.Color("model:N", legend=alt.Legend(symbolOpacity=1.0)),
                tooltip=[
                    "date:T",
                    "model:N",
                    f"{col_name}:Q",
                    alt.Tooltip("model:N", title="Model"),
                ],
            )
            .properties(
                width=width,
                height=height,
                title=alt.TitleParams(
                    text=title_text,
                    fontSize=10,
                    anchor="start",
                ),
            )
        )
        # 30-day moving average as a line
        moving_average_chart = (
            alt.Chart(pdf)
            .mark_line(strokeWidth=2, opacity=1)
            .encode(
                x=alt.X("date:T"),
                y=alt.Y(f"{col_name}_ma30:Q", scale=alt.Scale(domain=y_domain)),
                color=alt.Color("model:N", legend=alt.Legend(symbolOpacity=1.0)),
                tooltip=[
                    "date:T",
                    "model:N",
                    f"{col_name}_ma30:Q",
                    alt.Tooltip("model:N", title="Model"),
                ],
            )
        )
        # Add per-model mean reference lines
        mean_lines = []
        for row in model_stats.iter_rows(named=True):
            model = row["model"]
            mean_val = row["mean"]
            mean_line = (
                alt.Chart(pdf[pdf["model"] == model])
                .mark_rule(strokeDash=[5, 5], opacity=1, strokeWidth=2)
                .encode(
                    y=alt.datum(mean_val),
                    color=alt.Color("model:N", legend=alt.Legend(symbolOpacity=0.7)),
                )
            )
            mean_lines.append(mean_line)
        # Add anchor reference line
        anchor_line = (
            alt.Chart(pdf)
            .mark_rule(strokeDash=[2, 2], opacity=0.7, color="gray", strokeWidth=1)
            .encode(y=alt.datum(anchor_ref))
        )
        combined_chart = chart + moving_average_chart + anchor_line
        for mean_line in mean_lines:
            combined_chart += mean_line
        charts.append(combined_chart)
    return alt.hconcat(*charts, spacing=10).resolve_scale(x="shared")
# Plot both metrics
spearman_chart = plot_metric_comparison(daily_scores_df, "spearman", anchor_ref=0)
ndcg_chart = plot_metric_comparison(daily_scores_df, "ndcg@40", anchor_ref=0.5)
# Combine vertically
alt.vconcat(spearman_chart, ndcg_chart, spacing=20)