Quick Start

This guide walks you through running your first BatteryML experiment in under 5 minutes.

Prerequisites

BatteryML installed (see Installation)
LG M50T dataset available in the Raw Data/ directory

Step 1: Verify Data Location

Ensure your data is structured as follows:

Raw Data/
└── Expt 5 - Standard Cycle Aging (Control)/
    ├── Summary Data/
    │   ├── Performance Summary/
    │   └── Ageing Sets Summary/
    └── Processed Timeseries Data/
        └── 0.1C Voltage Curves/
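
If you would rather check the layout programmatically, the following is a minimal sketch using only the Python standard library. It is not part of BatteryML: the `missing_dirs` helper and the `EXPECTED_DIRS` list are illustrative names, and the script assumes you run it from the directory that contains `Raw Data/`.

```python
from pathlib import Path

# Sub-directories from the tree above, relative to the data root.
EXPECTED_DIRS = [
    "Expt 5 - Standard Cycle Aging (Control)/Summary Data/Performance Summary",
    "Expt 5 - Standard Cycle Aging (Control)/Summary Data/Ageing Sets Summary",
    "Expt 5 - Standard Cycle Aging (Control)/Processed Timeseries Data/0.1C Voltage Curves",
]


def missing_dirs(root="Raw Data"):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing directories:")
        for rel in missing:
            print(f"  Raw Data/{rel}")
    else:
        print("Data layout looks correct.")
```

An empty result means every directory shown above is present. The sketch checks directory names only, not the files inside them.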

Step 2: Run Milestone A

Milestone A demonstrates a complete end-to-end workflow with a LightGBM baseline:

python examples/milestone_a.py

Expected Output

============================================================
Milestone A: LGBM Baseline on Summary Data
============================================================

[1/6] Loading data...
  ✓ Loaded 168 samples from 8 cells

[2/6] Creating feature pipeline...
  ✓ Pipeline: SummarySetPipeline(include_arrhenius=True, ...)

[3/6] Transforming data to samples...
  ✓ Created 168 samples
  ✓ Feature dimension: 15

[4/6] Splitting by temperature...
  ✓ Train: 126 samples (10°C + 40°C)
  ✓ Val:   42 samples (25°C)

[5/6] Training LightGBM model...
  ✓ Model trained: LGBMModel(...)

[6/6] Evaluating...

========================================
Results (25°C Holdout)
========================================
  RMSE: 0.03425
  MAE:  0.02891
  MAPE: 0.61%
  R²:   0.9847

========================================
Top Features
========================================
  cumulative_throughput_Ah: 245.3
  temperature_K: 189.2
  ...

Step 3: View Results

TensorBoard

View training curves and metrics:

tensorboard --logdir artifacts/runs

Then open http://localhost:6006 (TensorBoard's default port) in your browser.

MLflow

Compare experiments:

mlflow ui --backend-store-uri file:./artifacts/mlruns

Then open http://localhost:5000 (the MLflow UI's default port) in your browser.

Understanding the Output

Metrics Explained

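RMSE (Root Mean Squared Error): Square root of the mean squared prediction error; penalizes large errors more heavily than MAE. Lower is better.
MAE (Mean Absolute Error): Average absolute difference between predictions and targets, in the target's units. Lower is better.
MAPE (Mean Absolute Percentage Error): Average absolute error as a percentage of the true value; useful for relative comparison. Lower is better.
R² (Coefficient of Determination): Fraction of the target's variance explained by the model. Closer to 1.0 is better.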
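
To make the four definitions concrete, here is a small pure-Python sketch of the standard formulas. It is an illustration under stated assumptions, not BatteryML's own evaluation code: the `regression_metrics` name is hypothetical, and the function assumes no true value is zero (MAPE divides by it) and that the targets are not all identical (R² divides by their variance).

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute RMSE, MAE, MAPE (in percent) and R² for two equal-length sequences."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    # Residual sum of squares, shared by RMSE and R².
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n

    # Error relative to the true value, expressed as a percentage.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    # R² compares the residuals against a constant mean predictor.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {"rmse": rmse, "mae": mae, "mape": mape, "r2": r2}


# Toy values for illustration only; not taken from the dataset.
print(regression_metrics([2.0, 4.0], [3.0, 3.0]))
# {'rmse': 1.0, 'mae': 1.0, 'mape': 37.5, 'r2': 0.0}
```

In the toy example, predicting the mean for every sample gives an R² of exactly 0, which is why values close to 1.0 indicate a model that does much better than that constant baseline.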

What Happened?

Data Loading: Loaded Performance Summary CSV files for all cells
Feature Extraction: Created features from summary statistics (throughput, resistance, temperature)
Data Splitting: Split by temperature (train on 10°C+40°C, validate on 25°C)
Model Training: Trained LightGBM gradient boosting model
Evaluation: Computed metrics on the held-out 25°C validation set
Tracking: Logged results to local files and MLflow
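
The data-splitting step is worth a closer look: an entire temperature is held out, so the validation score reflects how the model handles a condition it never saw during training. The sketch below shows the idea on plain dictionaries. It is not BatteryML's API; `split_by_temperature` and the `temperature_c` field are hypothetical names chosen for this example.

```python
def split_by_temperature(samples, holdout_temp_c=25):
    """Split samples into (train, val), holding out one temperature entirely."""
    train = [s for s in samples if s["temperature_c"] != holdout_temp_c]
    val = [s for s in samples if s["temperature_c"] == holdout_temp_c]
    return train, val


# Toy records for illustration only; not real dataset values.
samples = [
    {"cell": "A", "temperature_c": 10},
    {"cell": "B", "temperature_c": 25},
    {"cell": "C", "temperature_c": 40},
    {"cell": "D", "temperature_c": 25},
]
train, val = split_by_temperature(samples)
print(len(train), len(val))  # 2 2
```

Splitting on a group attribute like this, rather than shuffling rows at random, keeps every sample from a held-out condition out of the training set.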

Next Steps

Explore More Examples

Milestone B: ICA features + SHAP analysis
```
python examples/milestone_b.py
```
Milestone C: Neural ODE vs LSTM comparison
```
python examples/milestone_c.py
```

Learn More

Core Concepts - Understand Sample, Registry, Caching
User Guide - Detailed usage documentation
API Reference - Complete API documentation

Common Issues

Issue: FileNotFoundError for data files - Solution: Verify that the data path matches the expected structure (see Step 1)

Issue: Import errors - Solution: Ensure the virtual environment is activated and the dependencies are installed

Issue: CUDA out of memory - Solution: Reduce the batch size or run on CPU (models auto-detect the available device)
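
For reference, device auto-detection is commonly written along the following lines. This is a sketch under assumptions, not BatteryML's implementation: this guide does not state which deep learning framework the models use, so the example assumes PyTorch, and `pick_device` and its `force_cpu` flag are hypothetical names.

```python
def pick_device(force_cpu=False):
    """Return "cuda" when a GPU is usable, otherwise "cpu"."""
    if force_cpu:
        return "cpu"
    try:
        # Assumption: PyTorch is the framework in use.
        import torch
    except ImportError:
        # Without PyTorch there is no GPU backend to query.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device(force_cpu=True))  # cpu
```

Independently of any library-level option, setting the `CUDA_VISIBLE_DEVICES` environment variable to an empty string before launching a script hides all GPUs from CUDA-based frameworks, which makes auto-detection fall back to the CPU.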

For more troubleshooting help, see Troubleshooting.