XGBoost vs LSTM with CV
Learn how to leverage the temporal structure in CrowdCent's training data to dramatically improve prediction performance using an LSTM model compared to a traditional XGBoost approach. We'll use sklego.model_selection.TimeGapSplit to set up proper time-series cross-validation with gap periods that prevent leakage, then visualize performance on several metrics over time with moving averages.
Key Insight: Sequential Feature Processing¶
CrowdCent's training and inference data contains features with a defined temporal sequence through lag windows (e.g., feature_1_lag15, feature_1_lag10, feature_1_lag5, feature_1_lag0). While traditional models like XGBoost treat these as independent features, an LSTM can process them sequentially, capturing temporal dependencies.
Model Comparison¶
- XGBoost: Treats all features as independent inputs, achieving ~0.06 average 30-day Spearman correlation
- LSTM (from Centimators): Processes features sequentially by reshaping them along the lag axis, achieving ~0.19 average 30-day Spearman correlation - over 3x improvement!
Feature Reshaping Example¶
The LSTM transforms the flat feature vector:
[feature_1_lag10, feature_1_lag5, feature_1_lag0, feature_2_lag10, feature_2_lag5, feature_2_lag0, ...]
into a sequential 2D tensor:
[[feature_1_lag10, feature_2_lag10],
[feature_1_lag5, feature_2_lag5],
[feature_1_lag0, feature_2_lag0]]
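This reshaping can be sketched with plain numpy on toy values (LSTMRegressor performs the equivalent transformation internally, driven by its lag_windows and n_features_per_timestep arguments):

```python
import numpy as np

# Toy flat row: two features, three lag windows, grouped feature-first:
# [f1_lag10, f1_lag5, f1_lag0, f2_lag10, f2_lag5, f2_lag0]
flat = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
n_features, n_timesteps = 2, 3

# Reshape to (n_features, n_timesteps), then transpose to (n_timesteps, n_features):
# each row now holds every feature at one lag, ordered oldest to newest
sequential = flat.reshape(n_features, n_timesteps).T
print(sequential.shape)  # (3, 2)
```

Row 0 is the lag10 snapshot of both features and row 2 the lag0 snapshot, which is exactly the time-major layout an LSTM consumes.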
What You'll Learn¶
- How to set up proper time-series cross-validation with gap periods to prevent leakage
- How to train both XGBoost and LSTM models on the same features for fair comparison
- How to evaluate models using CrowdCent's official scoring metrics (Spearman correlation and NDCG@40)
- How to visualize performance over time with moving averages
Key Findings¶
- 10-day predictions achieve higher raw scores than 30-day predictions (e.g., ~0.255 vs ~0.19 Spearman correlation for the LSTM)
- The LSTM significantly outperforms XGBoost in these experiments, without any hyperparameter tuning
- All results shown are out-of-sample using proper time-series cross-validation (TimeGapSplit from sklego)
Note: This comparison uses identical features for both models and no hyperparameter optimization, demonstrating the power of sequential processing for this dataset.
import crowdcent_challenge as cc
import polars as pl
from xgboost import XGBRegressor
from datetime import timedelta
from sklego.model_selection import TimeGapSplit
import altair as alt
import numpy as np
import os
from crowdcent_challenge.scoring import evaluate_hyperliquid_submission
os.environ["KERAS_BACKEND"] = "jax"
from centimators.model_estimators import LSTMRegressor
Initialize the client¶
client = cc.ChallengeClient(
    challenge_slug="hyperliquid-ranking",
)
2026-02-03 21:00:42,153 - INFO - ChallengeClient initialized for 'hyperliquid-ranking' at URL: https://crowdcent.com/api
Get CrowdCent's training data¶
client.download_training_dataset(version="latest", dest_path="training_data.parquet")
data = pl.read_parquet("training_data.parquet")
data.head()
2026-02-03 21:00:42,162 - INFO - Downloading training data v2.0 to training_data.parquet Downloading training_data.parquet: 100%|██████████| 124M/124M [00:02<00:00, 59.4MB/s] 2026-02-03 21:00:44,852 - INFO - Successfully downloaded training data v2.0 to training_data.parquet
(Output: a preview of the first five rows. Each row is one asset on one date; columns are id, eodhd_id, date, 20 base features each at 4 lag windows (_lag15, _lag10, _lag5, _lag0) for 80 feature columns, plus the targets target_10d and target_30d. All feature and target values in the preview lie in [0, 1].)
Create cross validation folds¶
cv_kwargs = {
    "val_days": 100,
    "gap_days": 30,
    "n_splits": 3,
    "cv_window_type": "rolling",
}
cv = TimeGapSplit(
    date_serie=data["date"],
    valid_duration=timedelta(days=cv_kwargs["val_days"]),
    gap_duration=timedelta(days=cv_kwargs["gap_days"]),
    n_splits=cv_kwargs["n_splits"],
    window=cv_kwargs["cv_window_type"],
)
cv.summary(data)
cv.summary(data)
| Start date | End date | Period | Unique days | nbr samples | part | fold |
|---|---|---|---|---|---|---|
| datetime[μs] | datetime[μs] | duration[μs] | i64 | i64 | str | i64 |
| 2019-07-26 00:00:00 | 2025-02-04 00:00:00 | 2020d | 2021 | 182226 | "train" | 0 |
| 2025-03-07 00:00:00 | 2025-06-14 00:00:00 | 99d | 100 | 18348 | "valid" | 0 |
| 2019-11-03 00:00:00 | 2025-05-15 00:00:00 | 2020d | 2021 | 196281 | "train" | 1 |
| 2025-06-15 00:00:00 | 2025-09-22 00:00:00 | 99d | 100 | 19601 | "valid" | 1 |
| 2020-02-11 00:00:00 | 2025-08-23 00:00:00 | 2020d | 2021 | 211223 | "train" | 2 |
| 2025-09-23 00:00:00 | 2025-12-31 00:00:00 | 99d | 100 | 20309 | "valid" | 2 |
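The gap matters because the targets are computed over future return horizons: a training row near the split boundary would otherwise share label information with validation rows. A hand-rolled sketch of one split in plain pandas (synthetic dates, not sklego's implementation) shows how the gap keeps those windows apart:

```python
import pandas as pd
from datetime import timedelta

# Hold out the last 100 days for validation and drop the 30 days before it
# from training, so targets computed over a 30-day horizon never peek into
# the validation window.
dates = pd.date_range("2024-01-01", periods=400, freq="D")
val_days, gap_days = 100, 30

val_start = dates[-val_days]
train_end = val_start - timedelta(days=gap_days)

train_dates = dates[dates < train_end]
val_dates = dates[dates >= val_start]

print((val_dates.min() - train_dates.max()).days)  # a gap of at least 30 days
```

TimeGapSplit applies the same idea per fold, rolling the train/gap/validation windows forward through time.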
Define features, targets, and lag windows¶
feature_cols = [
    col
    for col in data.columns
    if col not in ["id", "eodhd_id", "date", "target_10d", "target_30d"]
]
target_cols = ["target_10d", "target_30d"]
lag_windows = [0, 5, 10, 15]
n_features_per_timestep = len(feature_cols) // len(lag_windows)
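As a quick sanity check (reconstructing the column-name pattern from the data preview), 20 base features across 4 lag windows should yield 80 feature columns, i.e. 20 per timestep:

```python
lag_windows = [0, 5, 10, 15]
base_features = [f"feature_{i}" for i in range(1, 21)]

# One column per (base feature, lag window) pair
feature_cols = [f"{f}_lag{lag}" for f in base_features for lag in lag_windows]

n_features_per_timestep = len(feature_cols) // len(lag_windows)
print(len(feature_cols), n_features_per_timestep)  # 80 20
```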
Train models on each train fold and predict validation¶
print("Training and evaluating XGBoost and LSTM models with detailed scoring...")
fold_data = []
models = [
    XGBRegressor(n_estimators=2000, device="cuda"),
    LSTMRegressor(
        output_units=2,
        lag_windows=lag_windows,
        n_features_per_timestep=n_features_per_timestep,
    ),
]
for fold, (train_idx, val_idx) in enumerate(cv.split(data)):
    print(f"\nFold {fold + 1}/{cv.n_splits}")

    # Get train and validation data
    train_data = data[train_idx]
    val_data = data[val_idx]
    print(f"  Train dates: {train_data['date'].min()} to {train_data['date'].max()}")
    print(f"  Val dates: {val_data['date'].min()} to {val_data['date'].max()}")
    print(f"  Train samples: {len(train_data)}, Val samples: {len(val_data)}")

    # Train each model on the fold
    for model in models:
        fit_kwargs = {}
        if isinstance(model, LSTMRegressor):
            fit_kwargs["epochs"] = 5
            fit_kwargs["validation_data"] = (
                val_data[feature_cols],
                val_data[target_cols],
            )
        model.fit(
            train_data[feature_cols],
            train_data[target_cols],
            **fit_kwargs,
        )

        # Make predictions
        preds = pl.from_numpy(
            np.asarray(model.predict(val_data[feature_cols])),
            schema={"pred_10d": pl.Float64, "pred_30d": pl.Float64},
        )
        preds = preds.with_columns(
            pl.lit(fold).alias("fold"), pl.lit(model.__class__.__name__).alias("model")
        )
        fold_data.append(val_data.with_columns(preds))
Training and evaluating XGBoost model with detailed scoring... Fold 1/3 Train dates: 2019-07-26 00:00:00 to 2025-02-04 00:00:00 Val dates: 2025-03-07 00:00:00 to 2025-06-14 00:00:00 Train samples: 182226, Val samples: 18348
/home/exx/projects/crowdcent-challenge/.venv/lib/python3.12/site-packages/xgboost/core.py:774: UserWarning: [21:01:01] WARNING: /workspace/src/common/error_msg.cc:62: Falling back to prediction using DMatrix due to mismatched devices. This might lead to higher memory usage and slower performance. XGBoost is running on: cuda:0, while the input data is on: cpu. Potential solutions: - Use a data structure that matches the device ordinal in the booster. - Set the device for booster before call to inplace_predict. This warning will only be shown once. return func(**kwargs) INFO:2026-02-03 21:01:01,218:jax._src.xla_bridge:834: Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory 2026-02-03 21:01:01,218 - INFO - Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory WARNING:2026-02-03 21:01:01,219:jax._src.xla_bridge:876: An NVIDIA GPU may be present on this machine, but a CUDA-enabled jaxlib is not installed. Falling back to cpu. 2026-02-03 21:01:01,219 - WARNING - An NVIDIA GPU may be present on this machine, but a CUDA-enabled jaxlib is not installed. Falling back to cpu.
Epoch 1/5 5695/5695 ━━━━━━━━━━━━━━━━━━━━ 13s 2ms/step - loss: 0.9988 - mse: 0.9988 - val_loss: 1.0020 - val_mse: 1.0020 Epoch 2/5 5695/5695 ━━━━━━━━━━━━━━━━━━━━ 9s 2ms/step - loss: 0.9965 - mse: 0.9965 - val_loss: 0.9976 - val_mse: 0.9976 Epoch 3/5 5695/5695 ━━━━━━━━━━━━━━━━━━━━ 10s 2ms/step - loss: 0.9958 - mse: 0.9958 - val_loss: 0.9988 - val_mse: 0.9988 Epoch 4/5 5695/5695 ━━━━━━━━━━━━━━━━━━━━ 10s 2ms/step - loss: 0.9945 - mse: 0.9945 - val_loss: 0.9959 - val_mse: 0.9959 Epoch 5/5 5695/5695 ━━━━━━━━━━━━━━━━━━━━ 10s 2ms/step - loss: 0.9933 - mse: 0.9933 - val_loss: 1.0026 - val_mse: 1.0026 36/36 ━━━━━━━━━━━━━━━━━━━━ 1s 15ms/step Fold 2/3 Train dates: 2019-11-03 00:00:00 to 2025-05-15 00:00:00 Val dates: 2025-06-15 00:00:00 to 2025-09-22 00:00:00 Train samples: 196281, Val samples: 19601 Epoch 1/5 6134/6134 ━━━━━━━━━━━━━━━━━━━━ 12s 2ms/step - loss: 0.9928 - mse: 0.9928 - val_loss: 1.0110 - val_mse: 1.0110 Epoch 2/5 6134/6134 ━━━━━━━━━━━━━━━━━━━━ 9s 2ms/step - loss: 0.9923 - mse: 0.9923 - val_loss: 1.0119 - val_mse: 1.0119 Epoch 3/5 6134/6134 ━━━━━━━━━━━━━━━━━━━━ 11s 2ms/step - loss: 0.9919 - mse: 0.9919 - val_loss: 1.0109 - val_mse: 1.0109 Epoch 4/5 6134/6134 ━━━━━━━━━━━━━━━━━━━━ 11s 2ms/step - loss: 0.9917 - mse: 0.9917 - val_loss: 1.0095 - val_mse: 1.0095 Epoch 5/5 6134/6134 ━━━━━━━━━━━━━━━━━━━━ 12s 2ms/step - loss: 0.9915 - mse: 0.9915 - val_loss: 1.0165 - val_mse: 1.0165 39/39 ━━━━━━━━━━━━━━━━━━━━ 0s 9ms/step Fold 3/3 Train dates: 2020-02-11 00:00:00 to 2025-08-23 00:00:00 Val dates: 2025-09-23 00:00:00 to 2025-12-31 00:00:00 Train samples: 211223, Val samples: 20309 Epoch 1/5 6601/6601 ━━━━━━━━━━━━━━━━━━━━ 12s 2ms/step - loss: 0.9900 - mse: 0.9900 - val_loss: 0.9999 - val_mse: 0.9999 Epoch 2/5 6601/6601 ━━━━━━━━━━━━━━━━━━━━ 12s 2ms/step - loss: 0.9901 - mse: 0.9901 - val_loss: 0.9919 - val_mse: 0.9919 Epoch 3/5 6601/6601 ━━━━━━━━━━━━━━━━━━━━ 12s 2ms/step - loss: 0.9897 - mse: 0.9897 - val_loss: 0.9919 - val_mse: 0.9919 Epoch 4/5 6601/6601 ━━━━━━━━━━━━━━━━━━━━ 
12s 2ms/step - loss: 0.9895 - mse: 0.9895 - val_loss: 0.9932 - val_mse: 0.9932 Epoch 5/5 6601/6601 ━━━━━━━━━━━━━━━━━━━━ 12s 2ms/step - loss: 0.9890 - mse: 0.9890 - val_loss: 0.9922 - val_mse: 0.9922 40/40 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step
combined_preds = pl.concat(fold_data)
combined_preds.head()
(Output: a preview of the first five rows of combined_preds — the original validation rows from every fold, with four columns appended: pred_10d, pred_30d, fold, and model.)
Calculate scores¶
def calculate_daily_scores(predictions_df: pl.DataFrame) -> pl.DataFrame:
    """Calculate daily scores for predictions using hyperliquid evaluation metrics.

    Args:
        predictions_df: DataFrame containing predictions and targets

    Returns:
        DataFrame with daily evaluation metrics
    """
    columns = ["pred_10d", "target_10d", "pred_30d", "target_30d"]

    # Define the struct schema for the scoring output
    score_schema = pl.Struct(
        {
            "spearman_10d": pl.Float64,
            "spearman_30d": pl.Float64,
            "ndcg@40_10d": pl.Float64,
            "ndcg@40_30d": pl.Float64,
        }
    )

    return (
        predictions_df.group_by(["date", "model"])
        .agg([pl.col(col) for col in columns])
        .with_columns(
            pl.struct(columns)
            .map_elements(
                lambda x: evaluate_hyperliquid_submission(
                    y_true_10d=x["target_10d"],
                    y_pred_10d=x["pred_10d"],
                    y_true_30d=x["target_30d"],
                    y_pred_30d=x["pred_30d"],
                ),
                return_dtype=score_schema,
            )
            .alias("daily_scores")
        )
        .unnest("daily_scores")
    )
daily_scores = calculate_daily_scores(combined_preds)
daily_scores
| date | model | pred_10d | target_10d | pred_30d | target_30d | spearman_10d | spearman_30d | ndcg@40_10d | ndcg@40_30d |
|---|---|---|---|---|---|---|---|---|---|
| datetime[μs] | str | list[f64] | list[f64] | list[f64] | list[f64] | f64 | f64 | f64 | f64 |
| 2025-04-23 00:00:00 | "LSTMRegressor" | [0.504178, 0.454119, … 0.517369] | [0.838542, 0.46875, … 0.161458] | [0.502318, 0.46069, … 0.510441] | [0.890625, 0.442708, … 0.135417] | 0.085382 | 0.005518 | 0.636939 | 0.586528 |
| 2025-05-10 00:00:00 | "XGBRegressor" | [0.482643, 0.389933, … 0.60705] | [0.923077, 0.420513, … 0.323077] | [0.436532, 0.5263, … 0.600579] | [0.969231, 0.410256, … 0.210256] | 0.023454 | -0.014625 | 0.548543 | 0.516524 |
| 2025-12-18 00:00:00 | "LSTMRegressor" | [0.40894, 0.479743, … 0.546537] | [0.976303, 0.829384, … 0.085308] | [0.41065, 0.471246, … 0.529192] | [0.668246, 0.791469, … 0.890995] | -0.130748 | -0.097971 | 0.479473 | 0.476082 |
| 2025-12-28 00:00:00 | "LSTMRegressor" | [0.434099, 0.512211, … 0.499404] | [0.028436, 0.123223, … 0.739336] | [0.426927, 0.510298, … 0.475056] | [0.075829, 0.781991, … 0.976303] | -0.042675 | 0.049757 | 0.499236 | 0.52808 |
| 2025-04-22 00:00:00 | "LSTMRegressor" | [0.501167, 0.455346, … 0.515472] | [0.82199, 0.361257, … 0.13089] | [0.49875, 0.453371, … 0.510702] | [0.879581, 0.39267, … 0.151832] | 0.115555 | -0.017872 | 0.618231 | 0.576474 |
| … | … | … | … | … | … | … | … | … | … |
| 2025-03-09 00:00:00 | "LSTMRegressor" | [0.480994, 0.50544, … 0.516951] | [0.091892, 0.481081, … 0.994595] | [0.47823, 0.494458, … 0.504098] | [0.172973, 0.189189, … 0.989189] | -0.015914 | 0.14982 | 0.51775 | 0.600634 |
| 2025-07-13 00:00:00 | "LSTMRegressor" | [0.519002, 0.471842, … 0.487213] | [0.10101, 0.328283, … 0.070707] | [0.510259, 0.476054, … 0.485446] | [0.393939, 0.494949, … 0.515152] | 0.164265 | 0.177585 | 0.646322 | 0.606326 |
| 2025-10-09 00:00:00 | "LSTMRegressor" | [0.539149, 0.470603, … 0.528085] | [0.362319, 0.31401, … 0.164251] | [0.535295, 0.454981, … 0.512345] | [0.429952, 0.487923, … 0.458937] | -0.011584 | 0.202059 | 0.573776 | 0.621784 |
| 2025-03-28 00:00:00 | "XGBRegressor" | [0.539257, 0.514973, … 0.729769] | [0.258065, 0.188172, … 0.83871] | [0.46656, 0.577899, … 0.658475] | [0.198925, 0.284946, … 0.645161] | 0.154568 | -0.014839 | 0.621086 | 0.509221 |
| 2025-03-15 00:00:00 | "LSTMRegressor" | [0.485251, 0.486415, … 0.486219] | [0.372973, 0.535135, … 0.945946] | [0.471388, 0.485451, … 0.473527] | [0.405405, 0.172973, … 0.859459] | -0.245933 | 0.19142 | 0.413213 | 0.649429 |
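For intuition, the correlation half of each daily score can be sketched with scipy on toy data (evaluate_hyperliquid_submission additionally computes NDCG@40, which emphasizes getting the top of the ranking right):

```python
import numpy as np
from scipy.stats import spearmanr

# Toy data: predictions that track the target's ranks with some noise
rng = np.random.default_rng(0)
y_true = rng.random(200)
y_pred = y_true + rng.normal(scale=0.1, size=200)

# Spearman correlation compares ranks, not raw values, so monotone
# transformations of the predictions would not change the score
corr, _ = spearmanr(y_true, y_pred)
print(round(corr, 2))  # strongly positive rank correlation
```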
from centimators.feature_transformers import MovingAverageTransformer

daily_scores = daily_scores.sort("date")
ma_transformer = MovingAverageTransformer(
    feature_names=["spearman_10d", "spearman_30d", "ndcg@40_10d", "ndcg@40_30d"],
    windows=[7, 30],
)
ma_columns = ma_transformer.fit_transform(
    daily_scores, ticker_series=daily_scores["model"]
)
daily_scores_df = daily_scores.with_columns(ma_columns)
Plot validation metrics (combining all folds)¶
def plot_metric_comparison(df, metric_name, anchor_ref=0, width=400, height=200):
    """Plot a single metric across 10d and 30d timeframes side by side with a shared y-axis.

    Args:
        df: Polars DataFrame with daily scores
        metric_name: Base metric name ('spearman' or 'ndcg@40')
        anchor_ref: y-value for the dashed reference line (0 for correlations, 0.5 for NDCG)
        width: Chart width in pixels per timeframe
        height: Chart height
    """
    # Convert to pandas once
    pdf = df.to_pandas()

    # Shared selection for zooming
    brush = alt.selection_interval(bind="scales", encodings=["x"])

    # Define timeframes
    timeframes = ["10d", "30d"]

    # Get column names for this metric
    col_10d = f"{metric_name}_10d"
    col_30d = f"{metric_name}_30d"

    # Calculate shared y-axis domain
    min_val = min(df[col_10d].min(), df[col_30d].min())
    max_val = max(df[col_10d].max(), df[col_30d].max())
    y_domain = [min_val * 0.95, max_val * 1.05]

    charts = []
    for timeframe in timeframes:
        col_name = f"{metric_name}_{timeframe}"

        # Calculate per-model statistics
        model_stats = (
            df.group_by("model")
            .agg(
                [
                    pl.col(col_name).mean().alias("mean"),
                    pl.col(col_name).std().alias("std"),
                ]
            )
            .with_columns((pl.col("mean") / pl.col("std")).alias("sharpe"))
        )

        # Create concise title with mean values
        mean_values = model_stats.to_pandas()
        mean_text = " | ".join(
            [f"{row['model']}: {row['mean']:.3f}" for _, row in mean_values.iterrows()]
        )
        title_text = f"{metric_name.upper()} {timeframe}\nMeans: {mean_text}"

        chart = (
            alt.Chart(pdf)
            .add_params(brush)
            .mark_point(opacity=0.6)
            .encode(
                x=alt.X("date:T", title="Date"),
                y=alt.Y(
                    f"{col_name}:Q",
                    title=f"{metric_name.upper()} {timeframe}",
                    scale=alt.Scale(domain=y_domain),
                ),
                color=alt.Color("model:N", legend=alt.Legend(symbolOpacity=1.0)),
                tooltip=[
                    "date:T",
                    f"{col_name}:Q",
                    alt.Tooltip("model:N", title="Model"),
                ],
            )
            .properties(
                width=width,
                height=height,
                title=alt.TitleParams(
                    text=title_text,
                    fontSize=10,
                    anchor="start",
                ),
            )
        )

        moving_average_chart = (
            alt.Chart(pdf)
            .mark_line(strokeWidth=2, opacity=1)
            .encode(
                x=alt.X("date:T"),
                y=alt.Y(f"{col_name}_ma30:Q", scale=alt.Scale(domain=y_domain)),
                color=alt.Color("model:N", legend=alt.Legend(symbolOpacity=1.0)),
                tooltip=[
                    "date:T",
                    f"{col_name}_ma30:Q",
                    alt.Tooltip("model:N", title="Model"),
                ],
            )
        )

        # Add per-model mean reference lines
        mean_lines = []
        for row in model_stats.iter_rows(named=True):
            model = row["model"]
            mean_val = row["mean"]
            mean_line = (
                alt.Chart(pdf[pdf["model"] == model])
                .mark_rule(strokeDash=[5, 5], opacity=1, strokeWidth=2)
                .encode(
                    y=alt.datum(mean_val),
                    color=alt.Color("model:N", legend=alt.Legend(symbolOpacity=0.7)),
                )
            )
            mean_lines.append(mean_line)

        # Add anchor reference line
        anchor_line = (
            alt.Chart(pdf)
            .mark_rule(strokeDash=[2, 2], opacity=0.7, color="gray", strokeWidth=1)
            .encode(y=alt.datum(anchor_ref))
        )

        combined_chart = chart + moving_average_chart + anchor_line
        for mean_line in mean_lines:
            combined_chart += mean_line
        charts.append(combined_chart)

    return alt.hconcat(*charts, spacing=10).resolve_scale(x="shared")
# Plot both metrics
spearman_chart = plot_metric_comparison(daily_scores_df, "spearman", anchor_ref=0)
ndcg_chart = plot_metric_comparison(daily_scores_df, "ndcg@40", anchor_ref=0.5)
# Combine vertically
alt.vconcat(spearman_chart, ndcg_chart, spacing=20)