Models API Reference¶
The models module contains the model zoo, including gradient boosting (LGBMModel), deep learning (MLPModel, LSTMAttentionModel), and continuous-time models (NeuralODEModel). All models follow a consistent interface for fitting and predicting Sample objects.
Usage Example¶
```python
from src.models.neural_ode import NeuralODEModel

model = NeuralODEModel(input_dim=5, latent_dim=32, solver='dopri5')
# Trainer will handle the forward/backward passes
```
Special Models¶
ACLA (Attention-CNN-LSTM-ANODE)¶
The ACLAModel is a hybrid architecture that integrates attention for feature weighting, CNN-LSTM for hierarchical temporal feature extraction, and Augmented Neural ODEs for continuous-time degradation modeling.
For a deep dive into the theory, see the ACLA Model Theory page.
base ¶
Base class for models.
Classes¶
BaseModel ¶
Bases: ABC, Module
Abstract base for all neural network models.
All models must:

1. Accept `Sample.x` as input (tensor or dict)
2. Return predictions compatible with `Sample.y`
3. Optionally accept `Sample.t` for time-aware models
4. Provide an `explain()` hook for interpretability
Note: LightGBM is a special case that doesn't inherit from nn.Module but follows the same interface pattern.
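As a sketch of that contract, a conforming subclass only needs a `forward` that maps `Sample.x` to predictions and an optional `explain()` hook. The `TinyModel` below is hypothetical (it inherits from `nn.Module` directly so the sketch is self-contained; the real base class lives in `src/models/base.py`):

```python
import torch
import torch.nn as nn
from typing import Any, Dict, Optional

class TinyModel(nn.Module):
    """Hypothetical minimal model illustrating the BaseModel contract."""

    def __init__(self, input_dim: int, output_dim: int = 1):
        super().__init__()
        self.head = nn.Linear(input_dim, output_dim)

    def forward(self, x: torch.Tensor, t: Optional[torch.Tensor] = None,
                **kwargs) -> torch.Tensor:
        # Accepts Sample.x; t is accepted but ignored (time-unaware model).
        return self.head(x)

    def explain(self, x: torch.Tensor, **kwargs) -> Dict[str, Any]:
        # Interpretability hook: here, just the linear weights.
        return {"weights": self.head.weight.detach()}

model = TinyModel(input_dim=5)
out = model(torch.randn(8, 5))  # shape (batch, output_dim) = (8, 1)
```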
Initialize the model.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_dim` | `int` | Number of input features | required |
| `output_dim` | `int` | Number of output predictions (default: 1 for SOH) | `1` |
Source code in src/models/base.py
Functions¶
count_parameters ¶
Count trainable parameters.
Returns:

| Type | Description |
|---|---|
| `int` | Number of trainable parameters |
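The standard PyTorch idiom for counting trainable parameters (a sketch; the library's implementation may differ in detail) is:

```python
import torch.nn as nn

def count_parameters(model: nn.Module) -> int:
    # Sum element counts of parameters that require gradients.
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

n = count_parameters(nn.Linear(10, 2))  # 10*2 weights + 2 biases = 22
```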
explain ¶
Return interpretability information.
Override in subclasses to provide model-specific explanations (attention weights, SHAP values, etc.).
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | Input features | required |
| `**kwargs` | | Additional arguments | `{}` |

Returns:

| Type | Description |
|---|---|
| `Dict[str, Any]` | Dictionary with explanation data |
Source code in src/models/base.py
forward
abstractmethod
¶
Forward pass.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | Features tensor (batch, features) or (batch, seq_len, features) | required |
| `t` | `Optional[Tensor]` | Optional time tensor for ODE models (batch, seq_len) | `None` |
| `**kwargs` | | Additional model-specific arguments | `{}` |

Returns:

| Name | Type | Description |
|---|---|---|
| `predictions` | `Tensor` | Tensor of shape (batch, output_dim) |
Source code in src/models/base.py
predict ¶
Inference with no gradient computation.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | Input features tensor | required |
| `**kwargs` | | Additional arguments passed to forward | `{}` |

Returns:

| Type | Description |
|---|---|
| `ndarray` | NumPy array of predictions |
Source code in src/models/base.py
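A typical implementation of such a `predict` (a sketch under the assumption that it simply wraps `forward`; the library's version may handle devices or dict inputs as well) disables autograd and converts the result to NumPy:

```python
import numpy as np
import torch
import torch.nn as nn

def predict(model: nn.Module, x: torch.Tensor, **kwargs) -> np.ndarray:
    # Switch to eval mode and disable autograd for inference.
    model.eval()
    with torch.no_grad():
        out = model(x, **kwargs)
    # Move to CPU and convert to a NumPy array.
    return out.cpu().numpy()

preds = predict(nn.Linear(4, 1), torch.randn(3, 4))  # shape (3, 1)
```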
registry ¶
Registry for models.
Classes¶
ModelRegistry ¶
Registry for model classes.
Example usage
```python
@ModelRegistry.register("mlp")
class MLPModel(BaseModel):
    pass

model = ModelRegistry.get("mlp", input_dim=10, hidden_dims=[64, 32])
```
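The register/get pattern is typically a class-level dict plus a decorator. A minimal standalone sketch (illustrative only, not the library's actual code):

```python
from typing import Any, Dict, Type

class MiniRegistry:
    """Hypothetical registry illustrating the decorator + lookup pattern."""
    _models: Dict[str, Type] = {}

    @classmethod
    def register(cls, name: str):
        # Decorator: store the class under `name`, return it unchanged.
        def decorator(model_cls: Type) -> Type:
            cls._models[name] = model_cls
            return model_cls
        return decorator

    @classmethod
    def get(cls, name: str, **kwargs) -> Any:
        # Instantiate by name, forwarding constructor kwargs.
        if name not in cls._models:
            raise ValueError(f"Unknown model: {name!r}")
        return cls._models[name](**kwargs)

@MiniRegistry.register("echo")
class EchoModel:
    def __init__(self, value: int = 0):
        self.value = value

model = MiniRegistry.get("echo", value=3)  # EchoModel with value=3
```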
Functions¶
get
classmethod
¶
Get a model instance by name.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Registry name of the model | required |
| `**kwargs` | | Arguments to pass to model constructor | `{}` |

Returns:

| Type | Description |
|---|---|
| `Union[BaseModel, object]` | Model instance |

Raises:

| Type | Description |
|---|---|
| `ValueError` | If model name is not found |
Source code in src/models/registry.py
get_class
classmethod
¶
Get model class by name (without instantiating).
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Registry name of the model | required |

Returns:

| Type | Description |
|---|---|
| `Optional[Type]` | Model class or None if not found |
list_available
classmethod
¶
List the names of all registered models.
register
classmethod
¶
Decorator to register a model class.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Registry name for the model | required |

Returns:

| Type | Description |
|---|---|
| | Decorator function |
Source code in src/models/registry.py
lgbm ¶
LightGBM wrapper compatible with pipeline interface.
Classes¶
LGBMModel ¶
`LGBMModel(input_dim: int = None, output_dim: int = 1, n_estimators: int = 1000, learning_rate: float = 0.05, max_depth: int = 6, num_leaves: int = 31, reg_alpha: float = 0.1, reg_lambda: float = 0.1, early_stopping_rounds: int = 50, random_state: int = 42)`
LightGBM wrapper compatible with pipeline.
Not an `nn.Module` (sklearn-style interface), but follows the same pattern. Excellent for:

- Fast baselines
- SHAP interpretability
- Summary + ICA peak features
Example usage
```python
model = LGBMModel(n_estimators=500)
model.fit(X_train, y_train, X_val, y_val)
predictions = model.predict(X_test)
```
Initialize the model.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_dim` | `int` | Input dimension (unused, for interface compatibility) | `None` |
| `output_dim` | `int` | Output dimension | `1` |
| `n_estimators` | `int` | Number of boosting rounds | `1000` |
| `learning_rate` | `float` | Learning rate | `0.05` |
| `max_depth` | `int` | Maximum tree depth | `6` |
| `num_leaves` | `int` | Maximum number of leaves | `31` |
| `reg_alpha` | `float` | L1 regularization | `0.1` |
| `reg_lambda` | `float` | L2 regularization | `0.1` |
| `early_stopping_rounds` | `int` | Early stopping patience | `50` |
| `random_state` | `int` | Random seed | `42` |
Source code in src/models/lgbm.py
Attributes¶
Functions¶
explain ¶
Return SHAP values for interpretability.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `X` | `ndarray` | Input features | required |

Returns:

| Type | Description |
|---|---|
| `Dict[str, Any]` | Dictionary with SHAP values, expected value, and feature names |
Source code in src/models/lgbm.py
fit ¶
`fit(X: np.ndarray, y: np.ndarray, X_val: Optional[np.ndarray] = None, y_val: Optional[np.ndarray] = None, feature_names: Optional[List[str]] = None) -> LGBMModel`
Fit the model.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `X` | `ndarray` | Training features | required |
| `y` | `ndarray` | Training targets | required |
| `X_val` | `Optional[ndarray]` | Validation features (for early stopping) | `None` |
| `y_val` | `Optional[ndarray]` | Validation targets | `None` |
| `feature_names` | `Optional[List[str]]` | Feature names for interpretability | `None` |

Returns:

| Type | Description |
|---|---|
| `LGBMModel` | self |
Source code in src/models/lgbm.py
load ¶
Load model from file.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str` | Path to load model from | required |

Returns:

| Type | Description |
|---|---|
| `LGBMModel` | self |
Source code in src/models/lgbm.py
predict ¶
Predict.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `X` | `ndarray` | Input features | required |

Returns:

| Type | Description |
|---|---|
| `ndarray` | Predictions shaped (n_samples, 1) |
Source code in src/models/lgbm.py
save ¶
Save model to file.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str` | Path to save model | required |
mlp ¶
Simple MLP baseline model.
Classes¶
MLPModel ¶
`MLPModel(input_dim: int, output_dim: int = 1, hidden_dims: List[int] = None, dropout: float = 0.1, activation: str = 'relu')`
Bases: BaseModel
Simple MLP baseline for tabular features.
Example usage
```python
model = MLPModel(input_dim=10, hidden_dims=[64, 32])
output = model(x)
```
Initialize the model.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_dim` | `int` | Number of input features | required |
| `output_dim` | `int` | Number of outputs | `1` |
| `hidden_dims` | `List[int]` | List of hidden layer dimensions | `None` |
| `dropout` | `float` | Dropout rate | `0.1` |
| `activation` | `str` | Activation function ("relu" or "tanh") | `'relu'` |
Source code in src/models/mlp.py
Functions¶
forward ¶
Forward pass.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | Input tensor of shape (batch, input_dim) | required |
| `t` | `Optional[Tensor]` | Ignored (for interface compatibility) | `None` |

Returns:

| Type | Description |
|---|---|
| `Tensor` | Output tensor of shape (batch, output_dim) |
Source code in src/models/mlp.py
lstm_attn ¶
BiLSTM with self-attention for sequence modeling.
Classes¶
LSTMAttentionModel ¶
`LSTMAttentionModel(input_dim: int, output_dim: int = 1, hidden_dim: int = 64, num_layers: int = 2, num_heads: int = 4, dropout: float = 0.1)`
Bases: BaseModel
BiLSTM with self-attention for sequence modeling.
Suitable for:

- Variable-length sequences
- Capturing long-range dependencies
- Interpretable attention weights
Example usage
```python
model = LSTMAttentionModel(input_dim=5, hidden_dim=64)
output = model(x)  # x shape: (batch, seq_len, input_dim)
```
Initialize the model.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_dim` | `int` | Number of input features per timestep | required |
| `output_dim` | `int` | Number of outputs | `1` |
| `hidden_dim` | `int` | LSTM hidden dimension | `64` |
| `num_layers` | `int` | Number of LSTM layers | `2` |
| `num_heads` | `int` | Number of attention heads | `4` |
| `dropout` | `float` | Dropout rate | `0.1` |
Source code in src/models/lstm_attn.py
Functions¶
explain ¶
Return attention weights.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | Input tensor | required |

Returns:

| Type | Description |
|---|---|
| `Dict[str, Any]` | Dictionary with attention weights |
Source code in src/models/lstm_attn.py
forward ¶
`forward(x: torch.Tensor, t: Optional[torch.Tensor] = None, mask: Optional[torch.Tensor] = None, **kwargs) -> torch.Tensor`
Forward pass.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | Input tensor of shape (batch, seq_len, input_dim) | required |
| `t` | `Optional[Tensor]` | Ignored (for interface compatibility) | `None` |
| `mask` | `Optional[Tensor]` | Optional attention mask | `None` |

Returns:

| Type | Description |
|---|---|
| `Tensor` | Output tensor of shape (batch, output_dim) |
Source code in src/models/lstm_attn.py
neural_ode ¶
Neural ODE for continuous-time degradation modeling.
Classes¶
NeuralODEModel ¶
`NeuralODEModel(input_dim: int, output_dim: int = 1, latent_dim: int = 32, hidden_dim: int = 64, solver: str = 'dopri5', rtol: float = 0.001, atol: float = 0.0001, use_adjoint: bool = False)`
Bases: BaseModel
Latent ODE for continuous-time degradation modeling.
Architecture:

1. Encoder: maps x_0 → z_0 (initial latent state)
2. ODE: integrates dz/dt = f(z, t) from t_0 to t_N
3. Decoder: maps z_N → y (SOH prediction)
Uses actual time values from Sample.t for integration.
Example usage
```python
model = NeuralODEModel(input_dim=5, latent_dim=32)
output = model(x, t=t)  # x: (batch, seq_len, features), t: (seq_len,)
```
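The encoder → integrate → decode flow can be sketched in a few lines of plain PyTorch. This is a hypothetical stand-in, with a fixed-step Euler loop replacing the adaptive `dopri5` solver, but it shows how the actual `Sample.t` values drive the integration:

```python
import torch
import torch.nn as nn

class LatentODESketch(nn.Module):
    """Illustrative latent ODE; not the library's implementation."""

    def __init__(self, input_dim: int, latent_dim: int = 32, output_dim: int = 1):
        super().__init__()
        self.encoder = nn.Linear(input_dim, latent_dim)    # x_0 -> z_0
        self.ode_func = nn.Sequential(                     # f(z) ~ dz/dt
            nn.Linear(latent_dim, 64), nn.Tanh(), nn.Linear(64, latent_dim))
        self.decoder = nn.Linear(latent_dim, output_dim)   # z_N -> y

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, input_dim); t: (seq_len,) actual time values.
        z = self.encoder(x[:, 0])            # initial latent state z_0
        for i in range(len(t) - 1):          # Euler step: z += f(z) * dt
            dt = t[i + 1] - t[i]
            z = z + self.ode_func(z) * dt
        return self.decoder(z)               # (batch, output_dim)

model = LatentODESketch(input_dim=5)
y = model(torch.randn(2, 10, 5), torch.linspace(0.0, 1.0, 10))  # (2, 1)
```

In the real model the Euler loop would be a call to an adaptive solver with the `rtol`/`atol` tolerances described below.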
Initialize the model.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_dim` | `int` | Number of input features per timestep | required |
| `output_dim` | `int` | Number of output predictions | `1` |
| `latent_dim` | `int` | Dimension of latent ODE state | `32` |
| `hidden_dim` | `int` | Hidden dimension in networks | `64` |
| `solver` | `str` | ODE solver ("dopri5", "euler", "rk4", etc.) | `'dopri5'` |
| `rtol` | `float` | Relative tolerance for the solver; the relaxed default of 1e-3 gave the best accuracy in benchmarks | `0.001` |
| `atol` | `float` | Absolute tolerance for the solver; the relaxed default of 1e-4 gave the best accuracy in benchmarks | `0.0001` |
| `use_adjoint` | `bool` | Use the adjoint method for memory-efficient gradients. The default `False` (direct backprop) gives better accuracy and is faster for short sequences; set `True` only for very long sequences or memory-constrained scenarios. | `False` |
Source code in src/models/neural_ode.py
Functions¶
explain ¶
Return latent trajectory for visualization.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | Input sequence | required |
| `t` | `Optional[Tensor]` | Time points | `None` |

Returns:

| Type | Description |
|---|---|
| `Dict[str, Any]` | Dictionary with latent trajectory |
Source code in src/models/neural_ode.py
forward ¶
Forward pass with ODE integration.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | Input sequence of shape (batch, seq_len, input_dim) | required |
| `t` | `Optional[Tensor]` | Time points of shape (batch, seq_len) or (seq_len,) | `None` |

Returns:

| Type | Description |
|---|---|
| `Tensor` | Predictions of shape (batch, output_dim) |
Source code in src/models/neural_ode.py
forward_trajectory ¶
Get predictions at all timesteps.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | Input sequence | required |
| `t` | `Optional[Tensor]` | Time points | `None` |

Returns:

| Type | Description |
|---|---|
| `Tensor` | Predictions at all timesteps of shape (batch, seq_len, output_dim) |
Source code in src/models/neural_ode.py
ODEFunc ¶
Bases: Module
Neural network defining dz/dt = f(z, t).
The ODE function takes the current state z and time t, and returns the rate of change dz/dt.
Initialize the ODE function.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `latent_dim` | `int` | Dimension of the latent state | required |
| `hidden_dim` | `int` | Hidden layer dimension | `64` |
Source code in src/models/neural_ode.py
Functions¶
forward ¶
Compute dz/dt.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `t` | `Tensor` | Current time (scalar or batch) | required |
| `z` | `Tensor` | Current state of shape (batch, latent_dim) | required |

Returns:

| Type | Description |
|---|---|
| `Tensor` | dz/dt of shape (batch, latent_dim) |
Source code in src/models/neural_ode.py
acla ¶
ACLA: Attention-CNN-LSTM-ANODE hybrid model for sequence-based degradation prediction.
Classes¶
ACLAModel ¶
`ACLAModel(input_dim: int, output_dim: int = 1, hidden_dim: int = 64, augment_dim: int = 20, cnn_filters: Optional[List[int]] = None, solver: str = 'dopri5', rtol: float = 0.0001, atol: float = 1e-05, use_adjoint: bool = False)`
Bases: BaseModel
ACLA: Attention-CNN-LSTM-ANODE hybrid model.
Architecture:

1. Attention: temporal attention across sequence timesteps
2. CNN-LSTM: feature extraction and temporal modeling
3. ANODE: Augmented Neural ODE for continuous-time dynamics
4. Output: sequence-to-sequence predictions

Suitable for:

- Complex sequence-based degradation prediction
- Multi-target prediction (e.g., LAM_NE, LAM_PE, LLI)
- Understanding which timesteps are important (attention)
- Continuous-time trajectory modeling
Example usage
```python
model = ACLAModel(input_dim=20, output_dim=3, hidden_dim=64)
output = model(x, t=t)  # x: (batch, seq_len, features), t: (seq_len,)
```
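The four stages can be sketched end-to-end in plain PyTorch. Everything below is illustrative (an Euler loop stands in for the adaptive solver, and the layer sizes are arbitrary); it exists only to show how the tensors flow through attention → CNN-LSTM → augmented ODE → per-timestep head:

```python
import torch
import torch.nn as nn

class ACLASketch(nn.Module):
    """Hypothetical stand-in for the ACLA data flow; not the library code."""

    def __init__(self, input_dim: int, output_dim: int = 3,
                 hidden_dim: int = 64, augment_dim: int = 20):
        super().__init__()
        # 1. Temporal attention across timesteps.
        self.attn = nn.MultiheadAttention(input_dim, num_heads=1, batch_first=True)
        # 2. CNN then LSTM for hierarchical temporal features.
        self.cnn = nn.Conv1d(input_dim, hidden_dim, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        # 3. ANODE dynamics over the augmented state.
        self.ode_func = nn.Linear(hidden_dim + augment_dim, hidden_dim + augment_dim)
        self.augment_dim = augment_dim
        # 4. Per-timestep output head.
        self.head = nn.Linear(hidden_dim + augment_dim, output_dim)

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        a, _ = self.attn(x, x, x)                            # (batch, seq, feat)
        h = self.cnn(a.transpose(1, 2)).transpose(1, 2)      # local patterns
        h, _ = self.lstm(h)                                  # temporal modeling
        # Augment the initial state with zeros (the "A" in ANODE).
        z = torch.cat([h[:, 0], torch.zeros(x.size(0), self.augment_dim)], dim=-1)
        outs = []
        for i in range(len(t)):
            if i > 0:                                        # Euler integration
                z = z + self.ode_func(z) * (t[i] - t[i - 1])
            outs.append(self.head(z))
        return torch.stack(outs, dim=1)                      # (batch, seq, out)

model = ACLASketch(input_dim=20)
y = model(torch.randn(2, 8, 20), torch.linspace(0.0, 1.0, 8))  # (2, 8, 3)
```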
Initialize the ACLA model.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_dim` | `int` | Number of input features per timestep | required |
| `output_dim` | `int` | Number of output predictions | `1` |
| `hidden_dim` | `int` | Hidden dimension for LSTM and ODE | `64` |
| `augment_dim` | `int` | Number of augmented dimensions for ANODE | `20` |
| `cnn_filters` | `Optional[List[int]]` | List of CNN filter sizes [first, second] | `None` |
| `solver` | `str` | ODE solver ("dopri5", "euler", "rk4", etc.) | `'dopri5'` |
| `rtol` | `float` | Relative tolerance for solver | `0.0001` |
| `atol` | `float` | Absolute tolerance for solver | `1e-05` |
| `use_adjoint` | `bool` | Use the adjoint method for memory-efficient gradients. The default `False` (direct backprop) gives better accuracy and is faster for short sequences; set `True` only for very long sequences or memory-constrained scenarios. | `False` |
Source code in src/models/acla.py
Functions¶
explain ¶
Return interpretability information.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | Input sequence | required |
| `t` | `Optional[Tensor]` | Time points | `None` |

Returns:

| Type | Description |
|---|---|
| `Dict[str, Any]` | Dictionary with attention weights and trajectory |
Source code in src/models/acla.py
forward ¶
Forward pass with attention, CNN-LSTM, and ODE integration.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | Input sequence of shape (batch, seq_len, input_dim) | required |
| `t` | `Optional[Tensor]` | Time points of shape (batch, seq_len) or (seq_len,) | `None` |

Returns:

| Type | Description |
|---|---|
| `Tensor` | Predictions of shape (batch, seq_len, output_dim) |
Source code in src/models/acla.py
AttentionLayer ¶
Bases: Module
Temporal attention layer for sequence data.
Applies attention mechanism across timesteps to focus on important parts of the sequence.
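One common way to build such a layer is to wrap `nn.MultiheadAttention` between input/output projections. The sketch below is hypothetical (names and sizes are illustrative, not the library's), but it returns both the attended sequence and the weights in the shapes documented for `forward` below:

```python
import torch
import torch.nn as nn

class TemporalAttentionSketch(nn.Module):
    """Illustrative temporal self-attention layer (not the library's code)."""

    def __init__(self, input_dim: int, hidden_dim: int = 64, num_heads: int = 4):
        super().__init__()
        # hidden_dim must be divisible by num_heads.
        self.proj_in = nn.Linear(input_dim, hidden_dim)
        self.mha = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.proj_out = nn.Linear(hidden_dim, input_dim)

    def forward(self, x: torch.Tensor):
        h = self.proj_in(x)                         # (batch, seq, hidden)
        attended, weights = self.mha(h, h, h)       # weights: (batch, seq, seq)
        return self.proj_out(attended), weights     # back to input_dim

layer = TemporalAttentionSketch(input_dim=5)
out, w = layer(torch.randn(2, 7, 5))  # out: (2, 7, 5), w: (2, 7, 7)
```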
Initialize attention layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_dim` | `int` | Input feature dimension | required |
| `hidden_dim` | `int` | Hidden dimension for attention (must be divisible by num_heads) | `64` |
| `num_heads` | `int` | Number of attention heads | `4` |
Source code in src/models/acla.py
Functions¶
forward ¶
Apply attention to input sequence.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | Input tensor of shape (batch, seq_len, input_dim) | required |

Returns:

| Name | Type | Description |
|---|---|---|
| `attended` | | Attended sequence (batch, seq_len, input_dim) |
| `attn_weights` | | Attention weights (batch, seq_len, seq_len) |
Source code in src/models/acla.py
CNNLSTMEncoder ¶
Bases: Module
CNN-LSTM encoder for feature extraction from sequences.
Uses 1D CNN to extract local patterns, then LSTM to capture temporal dependencies.
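That two-stage design can be sketched as follows (a hypothetical standalone module; filter sizes and naming are illustrative). Note the transpose dance: `Conv1d` expects channels-first `(batch, channels, seq)`, while `LSTM` with `batch_first=True` expects `(batch, seq, features)`:

```python
import torch
import torch.nn as nn

class CNNLSTMSketch(nn.Module):
    """Illustrative CNN-LSTM encoder; not the library's implementation."""

    def __init__(self, input_dim: int, cnn_filters=(32, 64), lstm_hidden: int = 64):
        super().__init__()
        # 1D convolutions extract local patterns; padding=1 preserves seq_len.
        self.conv = nn.Sequential(
            nn.Conv1d(input_dim, cnn_filters[0], kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(cnn_filters[0], cnn_filters[1], kernel_size=3, padding=1),
            nn.ReLU())
        # LSTM captures temporal dependencies across the CNN features.
        self.lstm = nn.LSTM(cnn_filters[1], lstm_hidden, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)  # (batch, seq, filters)
        out, _ = self.lstm(h)
        return out                                        # (batch, seq, lstm_hidden)

enc = CNNLSTMSketch(input_dim=5)
z = enc(torch.randn(2, 10, 5))  # (2, 10, 64)
```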
Initialize CNN-LSTM encoder.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input_dim` | `int` | Input feature dimension | required |
| `cnn_filters` | `Optional[List[int]]` | List of CNN filter sizes [first_layer, second_layer] | `None` |
| `lstm_hidden` | `int` | LSTM hidden dimension | `64` |
Source code in src/models/acla.py
Functions¶
forward ¶
Encode input sequence.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | Input tensor of shape (batch, seq_len, input_dim) | required |

Returns:

| Type | Description |
|---|---|
| `Tensor` | Encoded representation of shape (batch, seq_len, lstm_hidden) |
Source code in src/models/acla.py
ODEFunc ¶
Bases: Module
Neural network defining ODE dynamics with augmented dimensions.
Implements ANODE (Augmented Neural ODE) by including augmented dimensions in the state space.
Initialize ODE function.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `hidden_dim` | `int` | Dimension of hidden state | required |
| `augment_dim` | `int` | Number of augmented dimensions | `20` |
| `ode_hidden_dim` | `int` | Hidden dimension in ODE network | `128` |
Source code in src/models/acla.py
Functions¶
forward ¶
Compute dy/dt.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `t` | `Tensor` | Current time (scalar or batch) | required |
| `y` | `Tensor` | Current state of shape (batch, hidden_dim + augment_dim) | required |

Returns:

| Type | Description |
|---|---|
| `Tensor` | dy/dt of shape (batch, hidden_dim + augment_dim) |