
Models API Reference

The models module contains the model zoo, including gradient boosting (LGBMModel), deep learning (MLPModel, LSTMAttentionModel), and continuous-time models (NeuralODEModel). All models follow a consistent interface for fitting and predicting Sample objects.

Usage Example

from src.models.neural_ode import NeuralODEModel

model = NeuralODEModel(input_dim=5, latent_dim=32, solver='dopri5')
# Trainer will handle the forward/backward passes

Special Models

ACLA (Attention-CNN-LSTM-ANODE)

The ACLAModel is a hybrid architecture that integrates attention for feature weighting, CNN-LSTM for hierarchical temporal feature extraction, and Augmented Neural ODEs for continuous-time degradation modeling.

For a deep dive into the theory, see the ACLA Model Theory page.

base

Base class for models.

Classes

BaseModel

BaseModel(input_dim: int, output_dim: int = 1)

Bases: ABC, Module

Abstract base for all neural network models.

All models must:

1. Accept Sample.x as input (tensor or dict)
2. Return predictions compatible with Sample.y
3. Optionally accept Sample.t for time-aware models
4. Provide an explain() hook for interpretability

Note: LightGBM is a special case that doesn't inherit from nn.Module but follows the same interface pattern.
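A minimal sketch of this contract (the BaseModel below is a simplified stand-in for src/models/base.py, and LinearSOHModel is a hypothetical subclass for illustration, not part of the zoo):

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional

import torch
import torch.nn as nn


class BaseModel(ABC, nn.Module):
    """Simplified stand-in for the real base class (illustrative only)."""

    def __init__(self, input_dim: int, output_dim: int = 1):
        super().__init__()
        self.input_dim = input_dim
        self.output_dim = output_dim

    @abstractmethod
    def forward(self, x: torch.Tensor, t: Optional[torch.Tensor] = None,
                **kwargs) -> torch.Tensor:
        ...

    def explain(self, x: torch.Tensor, **kwargs) -> Dict[str, Any]:
        return {}  # default: no explanation


class LinearSOHModel(BaseModel):
    """Hypothetical minimal subclass: a single linear layer."""

    def __init__(self, input_dim: int, output_dim: int = 1):
        super().__init__(input_dim, output_dim)
        self.linear = nn.Linear(input_dim, output_dim)

    def forward(self, x, t=None, **kwargs):
        # Sample.t is accepted for interface compatibility but unused here.
        return self.linear(x)


model = LinearSOHModel(input_dim=5)
y = model(torch.randn(8, 5))  # (batch, input_dim) -> (batch, output_dim)
```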

Initialize the model.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `input_dim` | `int` | Number of input features | *required* |
| `output_dim` | `int` | Number of output predictions (1 for SOH) | `1` |
Source code in src/models/base.py
def __init__(self, input_dim: int, output_dim: int = 1):
    """Initialize the model.

    Args:
        input_dim: Number of input features
        output_dim: Number of output predictions (default: 1 for SOH)
    """
    super().__init__()
    self.input_dim = input_dim
    self.output_dim = output_dim
Functions
count_parameters
count_parameters() -> int

Count trainable parameters.

Returns:

| Type | Description |
| --- | --- |
| `int` | Number of trainable parameters |

Source code in src/models/base.py
def count_parameters(self) -> int:
    """Count trainable parameters.

    Returns:
        Number of trainable parameters
    """
    return sum(p.numel() for p in self.parameters() if p.requires_grad)
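For example, a plain nn.Linear(10, 1) has 10 weights plus 1 bias, so the same expression counts 11 trainable parameters:

```python
import torch.nn as nn

layer = nn.Linear(10, 1)

# Same expression as count_parameters(): sum numel over trainable params.
n_params = sum(p.numel() for p in layer.parameters() if p.requires_grad)
# 10 weights + 1 bias = 11
```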
explain
explain(x: torch.Tensor, **kwargs) -> Dict[str, Any]

Return interpretability information.

Override in subclasses to provide model-specific explanations (attention weights, SHAP values, etc.).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Input features | *required* |
| `**kwargs` | | Additional arguments | `{}` |

Returns:

| Type | Description |
| --- | --- |
| `Dict[str, Any]` | Dictionary with explanation data |

Source code in src/models/base.py
def explain(self, x: torch.Tensor, **kwargs) -> Dict[str, Any]:
    """Return interpretability information.

    Override in subclasses to provide model-specific explanations
    (attention weights, SHAP values, etc.).

    Args:
        x: Input features
        **kwargs: Additional arguments

    Returns:
        Dictionary with explanation data
    """
    return {}
forward abstractmethod
forward(x: torch.Tensor, t: Optional[torch.Tensor] = None, **kwargs) -> torch.Tensor

Forward pass.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Features tensor (batch, features) or (batch, seq_len, features) | *required* |
| `t` | `Optional[Tensor]` | Optional time tensor for ODE models (batch, seq_len) | `None` |
| `**kwargs` | | Additional model-specific arguments | `{}` |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| `predictions` | `Tensor` | Tensor of shape (batch, output_dim) |

Source code in src/models/base.py
@abstractmethod
def forward(self, x: torch.Tensor, t: Optional[torch.Tensor] = None, 
            **kwargs) -> torch.Tensor:
    """Forward pass.

    Args:
        x: Features tensor (batch, features) or (batch, seq_len, features)
        t: Optional time tensor for ODE models (batch, seq_len)
        **kwargs: Additional model-specific arguments

    Returns:
        predictions: Tensor of shape (batch, output_dim)
    """
    pass
predict
predict(x: torch.Tensor, **kwargs) -> np.ndarray

Inference with no gradient computation.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Input features tensor | *required* |
| `**kwargs` | | Additional arguments passed to forward | `{}` |

Returns:

| Type | Description |
| --- | --- |
| `ndarray` | Numpy array of predictions |

Source code in src/models/base.py
def predict(self, x: torch.Tensor, **kwargs) -> np.ndarray:
    """Inference with no gradient computation.

    Args:
        x: Input features tensor
        **kwargs: Additional arguments passed to forward

    Returns:
        Numpy array of predictions
    """
    self.eval()
    with torch.no_grad():
        return self.forward(x, **kwargs).cpu().numpy()
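The eval-plus-no_grad pattern can be sketched on any module (a plain nn.Linear stands in for a model here):

```python
import numpy as np
import torch
import torch.nn as nn

model = nn.Linear(4, 1)  # stand-in for any BaseModel subclass

model.eval()                      # disable dropout / batch-norm updates
with torch.no_grad():             # no autograd graph is built
    preds = model(torch.randn(3, 4)).cpu().numpy()
```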

registry

Registry for models.

Classes

ModelRegistry

Registry for model classes.

Example usage

@ModelRegistry.register("mlp")
class MLPModel(BaseModel):
    pass

model = ModelRegistry.get("mlp", input_dim=10, hidden_dims=[64, 32])

Functions
get classmethod
get(name: str, **kwargs) -> Union[BaseModel, object]

Get a model instance by name.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `name` | `str` | Registry name of the model | *required* |
| `**kwargs` | | Arguments to pass to model constructor | `{}` |

Returns:

| Type | Description |
| --- | --- |
| `Union[BaseModel, object]` | Model instance |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If model name is not found |

Source code in src/models/registry.py
@classmethod
def get(cls, name: str, **kwargs) -> Union[BaseModel, object]:
    """Get a model instance by name.

    Args:
        name: Registry name of the model
        **kwargs: Arguments to pass to model constructor

    Returns:
        Model instance

    Raises:
        ValueError: If model name is not found
    """
    if name not in cls._models:
        available = list(cls._models.keys())
        raise ValueError(f"Unknown model: {name}. Available: {available}")
    return cls._models[name](**kwargs)
get_class classmethod
get_class(name: str) -> Optional[Type]

Get model class by name (without instantiating).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `name` | `str` | Registry name of the model | *required* |

Returns:

| Type | Description |
| --- | --- |
| `Optional[Type]` | Model class or None if not found |

Source code in src/models/registry.py
@classmethod
def get_class(cls, name: str) -> Optional[Type]:
    """Get model class by name (without instantiating).

    Args:
        name: Registry name of the model

    Returns:
        Model class or None if not found
    """
    return cls._models.get(name)
list_available classmethod
list_available() -> list

List all registered model names.

Returns:

| Type | Description |
| --- | --- |
| `list` | List of model names |

Source code in src/models/registry.py
@classmethod
def list_available(cls) -> list:
    """List all registered model names.

    Returns:
        List of model names
    """
    return list(cls._models.keys())
register classmethod
register(name: str)

Decorator to register a model class.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `name` | `str` | Registry name for the model | *required* |

Returns:

| Type | Description |
| --- | --- |
| | Decorator function |

Source code in src/models/registry.py
@classmethod
def register(cls, name: str):
    """Decorator to register a model class.

    Args:
        name: Registry name for the model

    Returns:
        Decorator function
    """
    def decorator(model_class):
        cls._models[name] = model_class
        model_class.name = name
        return model_class
    return decorator
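Stripped of the project specifics, the registry is a class-level dict plus a registering decorator; a self-contained sketch of the same pattern (DummyModel is illustrative):

```python
from typing import Dict, Type


class ModelRegistry:
    """Minimal sketch of the decorator-based registry pattern."""

    _models: Dict[str, Type] = {}

    @classmethod
    def register(cls, name: str):
        def decorator(model_class):
            cls._models[name] = model_class   # remember the class under `name`
            model_class.name = name           # tag the class with its registry name
            return model_class
        return decorator

    @classmethod
    def get(cls, name: str, **kwargs):
        if name not in cls._models:
            raise ValueError(f"Unknown model: {name}. Available: {list(cls._models)}")
        return cls._models[name](**kwargs)    # instantiate with forwarded kwargs

    @classmethod
    def list_available(cls) -> list:
        return list(cls._models.keys())


@ModelRegistry.register("dummy")
class DummyModel:
    def __init__(self, input_dim: int = 1):
        self.input_dim = input_dim


model = ModelRegistry.get("dummy", input_dim=10)
```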

lgbm

LightGBM wrapper compatible with pipeline interface.

Classes

LGBMModel

LGBMModel(input_dim: int = None, output_dim: int = 1, n_estimators: int = 1000, learning_rate: float = 0.05, max_depth: int = 6, num_leaves: int = 31, reg_alpha: float = 0.1, reg_lambda: float = 0.1, early_stopping_rounds: int = 50, random_state: int = 42)

LightGBM wrapper compatible with pipeline.

Not an nn.Module (it exposes an sklearn-style interface), but it follows the same pattern. Excellent for:

- Fast baselines
- SHAP interpretability
- Summary + ICA peak features

Example usage

model = LGBMModel(n_estimators=500)
model.fit(X_train, y_train, X_val, y_val)
predictions = model.predict(X_test)

Initialize the model.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `input_dim` | `int` | Input dimension (unused, for interface compatibility) | `None` |
| `output_dim` | `int` | Output dimension | `1` |
| `n_estimators` | `int` | Number of boosting rounds | `1000` |
| `learning_rate` | `float` | Learning rate | `0.05` |
| `max_depth` | `int` | Maximum tree depth | `6` |
| `num_leaves` | `int` | Maximum number of leaves | `31` |
| `reg_alpha` | `float` | L1 regularization | `0.1` |
| `reg_lambda` | `float` | L2 regularization | `0.1` |
| `early_stopping_rounds` | `int` | Early stopping patience | `50` |
| `random_state` | `int` | Random seed | `42` |
Source code in src/models/lgbm.py
def __init__(self,
             input_dim: int = None,  # Unused, for interface compatibility
             output_dim: int = 1,
             n_estimators: int = 1000,
             learning_rate: float = 0.05,
             max_depth: int = 6,
             num_leaves: int = 31,
             reg_alpha: float = 0.1,
             reg_lambda: float = 0.1,
             early_stopping_rounds: int = 50,
             random_state: int = 42):
    """Initialize the model.

    Args:
        input_dim: Input dimension (unused, for interface compatibility)
        output_dim: Output dimension
        n_estimators: Number of boosting rounds
        learning_rate: Learning rate
        max_depth: Maximum tree depth
        num_leaves: Maximum number of leaves
        reg_alpha: L1 regularization
        reg_lambda: L2 regularization
        early_stopping_rounds: Early stopping patience
        random_state: Random seed
    """
    if not HAS_LIGHTGBM:
        raise ImportError("LightGBM required. Install with: pip install lightgbm")

    self.params = {
        'objective': 'regression',
        'metric': 'rmse',
        'n_estimators': n_estimators,
        'learning_rate': learning_rate,
        'max_depth': max_depth,
        'num_leaves': num_leaves,
        'reg_alpha': reg_alpha,
        'reg_lambda': reg_lambda,
        'verbosity': -1,
        'random_state': random_state,
    }
    self.early_stopping_rounds = early_stopping_rounds
    self.model: Optional[lgb.LGBMRegressor] = None
    self.feature_names_: List[str] = []
    self._use_feature_names: bool = False  # Track if feature names were explicitly provided
    self.input_dim = input_dim
    self.output_dim = output_dim
Attributes
feature_importances_ property
feature_importances_: ndarray

Get feature importances.

Functions
explain
explain(X: np.ndarray) -> Dict[str, Any]

Return SHAP values for interpretability.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `X` | `ndarray` | Input features | *required* |

Returns:

| Type | Description |
| --- | --- |
| `Dict[str, Any]` | Dictionary with SHAP values, expected value, feature names |

Source code in src/models/lgbm.py
def explain(self, X: np.ndarray) -> Dict[str, Any]:
    """Return SHAP values for interpretability.

    Args:
        X: Input features

    Returns:
        Dictionary with SHAP values, expected value, feature names
    """
    try:
        import shap
        explainer = shap.TreeExplainer(self.model)
        shap_values = explainer.shap_values(X)
        return {
            'shap_values': shap_values,
            'expected_value': explainer.expected_value,
            'feature_names': self.feature_names_,
        }
    except ImportError:
        return {'error': 'shap not installed'}
    except Exception as e:
        return {'error': str(e)}
fit
fit(X: np.ndarray, y: np.ndarray, X_val: Optional[np.ndarray] = None, y_val: Optional[np.ndarray] = None, feature_names: Optional[List[str]] = None) -> LGBMModel

Fit the model.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `X` | `ndarray` | Training features | *required* |
| `y` | `ndarray` | Training targets | *required* |
| `X_val` | `Optional[ndarray]` | Validation features (for early stopping) | `None` |
| `y_val` | `Optional[ndarray]` | Validation targets | `None` |
| `feature_names` | `Optional[List[str]]` | Feature names for interpretability | `None` |

Returns:

| Type | Description |
| --- | --- |
| `LGBMModel` | self |

Source code in src/models/lgbm.py
def fit(self, X: np.ndarray, y: np.ndarray, 
        X_val: Optional[np.ndarray] = None,
        y_val: Optional[np.ndarray] = None,
        feature_names: Optional[List[str]] = None) -> 'LGBMModel':
    """Fit the model.

    Args:
        X: Training features
        y: Training targets
        X_val: Validation features (for early stopping)
        y_val: Validation targets
        feature_names: Feature names for interpretability

    Returns:
        self
    """
    # Store original feature names for interpretability
    original_feature_names = feature_names or [f"f{i}" for i in range(X.shape[1])]
    # Sanitize feature names for LightGBM compatibility
    self.feature_names_ = self._sanitize_feature_names(original_feature_names)
    self._use_feature_names = feature_names is not None

    # Convert to pandas DataFrames if feature names are provided to avoid warnings
    if self._use_feature_names:
        X_df = pd.DataFrame(X, columns=self.feature_names_)
        X_val_df = pd.DataFrame(X_val, columns=self.feature_names_) if X_val is not None else None
    else:
        X_df = X
        X_val_df = X_val

    self.model = lgb.LGBMRegressor(**self.params)

    eval_set = [(X_val_df, y_val.ravel())] if X_val is not None else None
    callbacks = [lgb.early_stopping(self.early_stopping_rounds)] if eval_set else None

    self.model.fit(
        X_df, y.ravel(),
        eval_set=eval_set,
        callbacks=callbacks,
    )

    logger.info(f"Trained LGBM with {self.model.n_estimators_} estimators")

    return self
load
load(path: str) -> LGBMModel

Load model from file.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `path` | `str` | Path to load model from | *required* |

Returns:

| Type | Description |
| --- | --- |
| `LGBMModel` | self |

Source code in src/models/lgbm.py
def load(self, path: str) -> 'LGBMModel':
    """Load model from file.

    Args:
        path: Path to load model from

    Returns:
        self
    """
    self.model = lgb.LGBMRegressor(**self.params)
    self.model.booster_ = lgb.Booster(model_file=path)
    return self
predict
predict(X: np.ndarray) -> np.ndarray

Predict.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `X` | `ndarray` | Input features | *required* |

Returns:

| Type | Description |
| --- | --- |
| `ndarray` | Predictions shaped (n_samples, 1) |

Source code in src/models/lgbm.py
def predict(self, X: np.ndarray) -> np.ndarray:
    """Predict.

    Args:
        X: Input features

    Returns:
        Predictions shaped (n_samples, 1)
    """
    if self.model is None:
        raise RuntimeError("Model not fitted. Call fit() first.")

    # Use DataFrame if feature names were explicitly provided during fit
    if self._use_feature_names:
        X_df = pd.DataFrame(X, columns=self.feature_names_)
        return self.model.predict(X_df).reshape(-1, 1)
    else:
        return self.model.predict(X).reshape(-1, 1)
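The trailing reshape matters because LGBMRegressor.predict returns a flat (n_samples,) vector, while the rest of the pipeline expects 2-D (n_samples, output_dim) arrays; in numpy terms:

```python
import numpy as np

raw = np.array([0.98, 0.95, 0.91])   # flat regressor output: shape (3,)
preds = raw.reshape(-1, 1)           # pipeline-compatible: shape (3, 1)
```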
save
save(path: str) -> None

Save model to file.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `path` | `str` | Path to save model | *required* |
Source code in src/models/lgbm.py
def save(self, path: str) -> None:
    """Save model to file.

    Args:
        path: Path to save model
    """
    if self.model is None:
        raise RuntimeError("Model not fitted.")
    self.model.booster_.save_model(path)

mlp

Simple MLP baseline model.

Classes

MLPModel

MLPModel(input_dim: int, output_dim: int = 1, hidden_dims: List[int] = None, dropout: float = 0.1, activation: str = 'relu')

Bases: BaseModel

Simple MLP baseline for tabular features.

Example usage

model = MLPModel(input_dim=10, hidden_dims=[64, 32])
output = model(x)

Initialize the model.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `input_dim` | `int` | Number of input features | *required* |
| `output_dim` | `int` | Number of outputs | `1` |
| `hidden_dims` | `List[int]` | List of hidden layer dimensions | `None` |
| `dropout` | `float` | Dropout rate | `0.1` |
| `activation` | `str` | Activation function ("relu" or "tanh") | `'relu'` |
Source code in src/models/mlp.py
def __init__(self,
             input_dim: int,
             output_dim: int = 1,
             hidden_dims: List[int] = None,
             dropout: float = 0.1,
             activation: str = "relu"):
    """Initialize the model.

    Args:
        input_dim: Number of input features
        output_dim: Number of outputs
        hidden_dims: List of hidden layer dimensions
        dropout: Dropout rate
        activation: Activation function ("relu" or "tanh")
    """
    super().__init__(input_dim, output_dim)

    if hidden_dims is None:
        hidden_dims = [64, 32]

    layers = []
    prev_dim = input_dim

    for hidden_dim in hidden_dims:
        layers.extend([
            nn.Linear(prev_dim, hidden_dim),
            nn.ReLU() if activation == "relu" else nn.Tanh(),
            nn.Dropout(dropout),
        ])
        prev_dim = hidden_dim

    layers.append(nn.Linear(prev_dim, output_dim))

    self.net = nn.Sequential(*layers)

    # Initialize weights
    self._init_weights()
Functions
forward
forward(x: torch.Tensor, t: Optional[torch.Tensor] = None, **kwargs) -> torch.Tensor

Forward pass.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Input tensor of shape (batch, input_dim) | *required* |
| `t` | `Optional[Tensor]` | Ignored (for interface compatibility) | `None` |

Returns:

| Type | Description |
| --- | --- |
| `Tensor` | Output tensor of shape (batch, output_dim) |

Source code in src/models/mlp.py
def forward(self, x: torch.Tensor, t: Optional[torch.Tensor] = None,
            **kwargs) -> torch.Tensor:
    """Forward pass.

    Args:
        x: Input tensor of shape (batch, input_dim)
        t: Ignored (for interface compatibility)

    Returns:
        Output tensor of shape (batch, output_dim)
    """
    return self.net(x)
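For the default hidden_dims=[64, 32], the constructor above builds the following stack; a sketch with plain nn.Sequential (not the project class) confirms the shape flow:

```python
import torch
import torch.nn as nn

# Equivalent layer stack for input_dim=10, hidden_dims=[64, 32], output_dim=1.
net = nn.Sequential(
    nn.Linear(10, 64), nn.ReLU(), nn.Dropout(0.1),
    nn.Linear(64, 32), nn.ReLU(), nn.Dropout(0.1),
    nn.Linear(32, 1),
)

out = net(torch.randn(4, 10))   # (batch, input_dim) -> (batch, output_dim)
```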

lstm_attn

BiLSTM with self-attention for sequence modeling.

Classes

LSTMAttentionModel

LSTMAttentionModel(input_dim: int, output_dim: int = 1, hidden_dim: int = 64, num_layers: int = 2, num_heads: int = 4, dropout: float = 0.1)

Bases: BaseModel

BiLSTM with self-attention for sequence modeling.

Suitable for:

- Variable-length sequences
- Capturing long-range dependencies
- Interpretable attention weights

Example usage

model = LSTMAttentionModel(input_dim=5, hidden_dim=64)
output = model(x)  # x shape: (batch, seq_len, input_dim)

Initialize the model.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `input_dim` | `int` | Number of input features per timestep | *required* |
| `output_dim` | `int` | Number of outputs | `1` |
| `hidden_dim` | `int` | LSTM hidden dimension | `64` |
| `num_layers` | `int` | Number of LSTM layers | `2` |
| `num_heads` | `int` | Number of attention heads | `4` |
| `dropout` | `float` | Dropout rate | `0.1` |
Source code in src/models/lstm_attn.py
def __init__(self,
             input_dim: int,
             output_dim: int = 1,
             hidden_dim: int = 64,
             num_layers: int = 2,
             num_heads: int = 4,
             dropout: float = 0.1):
    """Initialize the model.

    Args:
        input_dim: Number of input features per timestep
        output_dim: Number of outputs
        hidden_dim: LSTM hidden dimension
        num_layers: Number of LSTM layers
        num_heads: Number of attention heads
        dropout: Dropout rate
    """
    super().__init__(input_dim, output_dim)

    self.hidden_dim = hidden_dim

    self.lstm = nn.LSTM(
        input_dim, hidden_dim,
        num_layers=num_layers,
        batch_first=True,
        bidirectional=True,
        dropout=dropout if num_layers > 1 else 0,
    )

    self.attention = nn.MultiheadAttention(
        hidden_dim * 2, num_heads,
        batch_first=True,
        dropout=dropout,
    )

    self.layer_norm = nn.LayerNorm(hidden_dim * 2)

    self.fc = nn.Sequential(
        nn.Linear(hidden_dim * 2, hidden_dim),
        nn.ReLU(),
        nn.Dropout(dropout),
        nn.Linear(hidden_dim, output_dim),
    )

    self._last_attn_weights: Optional[torch.Tensor] = None
Functions
explain
explain(x: torch.Tensor, **kwargs) -> Dict[str, Any]

Return attention weights.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Input tensor | *required* |

Returns:

| Type | Description |
| --- | --- |
| `Dict[str, Any]` | Dictionary with attention weights |

Source code in src/models/lstm_attn.py
def explain(self, x: torch.Tensor, **kwargs) -> Dict[str, Any]:
    """Return attention weights.

    Args:
        x: Input tensor

    Returns:
        Dictionary with attention weights
    """
    _ = self.forward(x)
    return {
        'attention_weights': self._last_attn_weights.cpu().numpy() if self._last_attn_weights is not None else None,
    }
forward
forward(x: torch.Tensor, t: Optional[torch.Tensor] = None, mask: Optional[torch.Tensor] = None, **kwargs) -> torch.Tensor

Forward pass.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Input tensor of shape (batch, seq_len, input_dim) | *required* |
| `t` | `Optional[Tensor]` | Ignored (for interface compatibility) | `None` |
| `mask` | `Optional[Tensor]` | Optional attention mask | `None` |

Returns:

| Type | Description |
| --- | --- |
| `Tensor` | Output tensor of shape (batch, seq_len, output_dim), one prediction per timestep |

Source code in src/models/lstm_attn.py
def forward(self, x: torch.Tensor, t: Optional[torch.Tensor] = None,
            mask: Optional[torch.Tensor] = None,
            **kwargs) -> torch.Tensor:
    """Forward pass.

    Args:
        x: Input tensor of shape (batch, seq_len, input_dim)
        t: Ignored (for interface compatibility)
        mask: Optional attention mask

    Returns:
        Output tensor of shape (batch, seq_len, output_dim)
    """
    # x: (batch, seq_len, input_dim)
    lstm_out, _ = self.lstm(x)  # (batch, seq_len, hidden*2)

    # Self-attention
    attn_out, attn_weights = self.attention(lstm_out, lstm_out, lstm_out, 
                                              key_padding_mask=mask)
    self._last_attn_weights = attn_weights.detach()

    # Residual connection + layer norm
    attn_out = self.layer_norm(attn_out + lstm_out)

    # Decode all timesteps
    out = self.fc(attn_out)
    return out
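The shape flow through the same building blocks can be checked directly (a sketch with stock PyTorch modules, using the default hidden_dim=64 and num_heads=4):

```python
import torch
import torch.nn as nn

hidden_dim, num_heads = 64, 4
lstm = nn.LSTM(5, hidden_dim, num_layers=2, batch_first=True, bidirectional=True)
attn = nn.MultiheadAttention(hidden_dim * 2, num_heads, batch_first=True)

x = torch.randn(2, 30, 5)                       # (batch, seq_len, input_dim)
lstm_out, _ = lstm(x)                           # bidirectional -> hidden_dim * 2
attn_out, attn_weights = attn(lstm_out, lstm_out, lstm_out)  # self-attention
```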

neural_ode

Neural ODE for continuous-time degradation modeling.

Classes

NeuralODEModel

NeuralODEModel(input_dim: int, output_dim: int = 1, latent_dim: int = 32, hidden_dim: int = 64, solver: str = 'dopri5', rtol: float = 0.001, atol: float = 0.0001, use_adjoint: bool = False)

Bases: BaseModel

Latent ODE for continuous-time degradation modeling.

Architecture:

1. Encoder: maps x_0 → z_0 (initial latent state)
2. ODE: integrates dz/dt = f(z, t) from t_0 to t_N
3. Decoder: maps z_N → y (SOH prediction)

Uses actual time values from Sample.t for integration.

Example usage

model = NeuralODEModel(input_dim=5, latent_dim=32)
output = model(x, t=t)  # x: (batch, seq_len, features), t: (seq_len,)

Initialize the model.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `input_dim` | `int` | Number of input features per timestep | *required* |
| `output_dim` | `int` | Number of output predictions | `1` |
| `latent_dim` | `int` | Dimension of latent ODE state | `32` |
| `hidden_dim` | `int` | Hidden dimension in networks | `64` |
| `solver` | `str` | ODE solver ("dopri5", "euler", "rk4", etc.) | `'dopri5'` |
| `rtol` | `float` | Relative tolerance for the solver; the relaxed default of 1e-3 was optimal for accuracy in benchmarks | `0.001` |
| `atol` | `float` | Absolute tolerance for the solver; the relaxed default of 1e-4 was optimal for accuracy in benchmarks | `0.0001` |
| `use_adjoint` | `bool` | Use the adjoint method for memory-efficient gradients. The default False (direct backprop) gives better accuracy and is faster for short sequences; set True only for very long sequences or memory-constrained scenarios. | `False` |
Source code in src/models/neural_ode.py
def __init__(self,
             input_dim: int,
             output_dim: int = 1,
             latent_dim: int = 32,
             hidden_dim: int = 64,
             solver: str = "dopri5",
             rtol: float = 1e-3,
             atol: float = 1e-4,
             use_adjoint: bool = False):
    """Initialize the model.

    Args:
        input_dim: Number of input features per timestep
        output_dim: Number of output predictions
        latent_dim: Dimension of latent ODE state
        hidden_dim: Hidden dimension in networks
        solver: ODE solver ("dopri5", "euler", "rk4", etc.)
        rtol: Relative tolerance for solver (default 1e-3 for relaxed tolerance,
              optimal for accuracy based on benchmarks)
        atol: Absolute tolerance for solver (default 1e-4 for relaxed tolerance,
              optimal for accuracy based on benchmarks)
        use_adjoint: Use adjoint method for memory-efficient gradients.
                     Default False (direct backprop) gives better accuracy and is faster
                     for short sequences. Set True only for very long sequences or
                     memory-constrained scenarios.
    """
    super().__init__(input_dim, output_dim)

    if not HAS_TORCHDIFFEQ:
        raise ImportError("torchdiffeq required. Install with: pip install torchdiffeq")

    self.latent_dim = latent_dim
    self.solver = solver
    self.rtol = rtol
    self.atol = atol
    self.use_adjoint = use_adjoint

    # Encoder: x_0 → z_0
    self.encoder = nn.Sequential(
        nn.Linear(input_dim, hidden_dim),
        nn.Tanh(),
        nn.Linear(hidden_dim, hidden_dim),
        nn.Tanh(),
        nn.Linear(hidden_dim, latent_dim),
    )

    # ODE function
    self.ode_func = ODEFunc(latent_dim, hidden_dim)

    # Decoder: z_N → y
    self.decoder = nn.Sequential(
        nn.Linear(latent_dim, hidden_dim),
        nn.Tanh(),
        nn.Linear(hidden_dim, hidden_dim // 2),
        nn.Tanh(),
        nn.Linear(hidden_dim // 2, output_dim),
    )

    self._trajectory: Optional[torch.Tensor] = None
Functions
explain
explain(x: torch.Tensor, t: Optional[torch.Tensor] = None, **kwargs) -> Dict[str, Any]

Return latent trajectory for visualization.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Input sequence | *required* |
| `t` | `Optional[Tensor]` | Time points | `None` |

Returns:

| Type | Description |
| --- | --- |
| `Dict[str, Any]` | Dictionary with latent trajectory |

Source code in src/models/neural_ode.py
def explain(self, x: torch.Tensor, t: Optional[torch.Tensor] = None,
            **kwargs) -> Dict[str, Any]:
    """Return latent trajectory for visualization.

    Args:
        x: Input sequence
        t: Time points

    Returns:
        Dictionary with latent trajectory
    """
    _ = self.forward(x, t)
    return {
        'trajectory': self._trajectory.cpu().numpy() if self._trajectory is not None else None,
        'latent_dim': self.latent_dim,
    }
forward
forward(x: torch.Tensor, t: Optional[torch.Tensor] = None, **kwargs) -> torch.Tensor

Forward pass with ODE integration.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Input sequence of shape (batch, seq_len, input_dim) | *required* |
| `t` | `Optional[Tensor]` | Time points of shape (batch, seq_len) or (seq_len,) | `None` |

Returns:

| Type | Description |
| --- | --- |
| `Tensor` | Predictions of shape (batch, seq_len, output_dim) |

Source code in src/models/neural_ode.py
def forward(self, x: torch.Tensor, t: Optional[torch.Tensor] = None,
            **kwargs) -> torch.Tensor:
    """Forward pass with ODE integration.

    Args:
        x: Input sequence of shape (batch, seq_len, input_dim)
        t: Time points of shape (batch, seq_len) or (seq_len,)

    Returns:
        Predictions of shape (batch, seq_len, output_dim)
    """
    batch_size = x.shape[0]

    # Encode initial state
    z0 = self.encoder(x[:, 0, :])  # (batch, latent_dim)

    # Get integration times
    if t is None:
        # Default: uniform time points
        t_span = torch.linspace(0, 1, x.shape[1], device=x.device)
    else:
        # Use provided times (normalize to start at 0)
        if t.dim() == 2:
            t_span = t[0]  # Assume same times across batch
        else:
            t_span = t
        t_span = t_span - t_span[0]

    # Integrate ODE
    odeint_fn = odeint_adjoint if self.use_adjoint else odeint

    trajectory = odeint_fn(
        self.ode_func,
        z0,
        t_span,
        method=self.solver,
        rtol=self.rtol,
        atol=self.atol,
    )  # (seq_len, batch, latent_dim)

    self._trajectory = trajectory.detach()

    # Decode all timesteps (sequence-to-sequence)
    traj = trajectory.permute(1, 0, 2)  # (batch, seq_len, latent_dim)
    batch_size, seq_len, _ = traj.shape

    # Flatten, decode, reshape
    traj_flat = traj.reshape(-1, self.latent_dim)
    y_flat = self.decoder(traj_flat)
    y = y_flat.reshape(batch_size, seq_len, -1)

    return y
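The time-handling branch normalizes whatever timestamps arrive so that integration always starts at t = 0 (when t is 2-D, the first row is assumed to be shared across the batch):

```python
import torch

t = torch.tensor([[100.0, 150.0, 300.0],
                  [100.0, 150.0, 300.0]])   # (batch, seq_len), e.g. cycle numbers

t_span = t[0] if t.dim() == 2 else t        # assume same times across the batch
t_span = t_span - t_span[0]                 # shift so integration starts at 0
```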
forward_trajectory
forward_trajectory(x: torch.Tensor, t: Optional[torch.Tensor] = None, **kwargs) -> torch.Tensor

Get predictions at all timesteps.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `x` | `Tensor` | Input sequence | *required* |
| `t` | `Optional[Tensor]` | Time points | `None` |

Returns:

| Type | Description |
| --- | --- |
| `Tensor` | Predictions at all timesteps, of shape (batch, seq_len, output_dim) |

Source code in src/models/neural_ode.py
def forward_trajectory(self, x: torch.Tensor, t: Optional[torch.Tensor] = None,
                       **kwargs) -> torch.Tensor:
    """Get predictions at all timesteps.

    Args:
        x: Input sequence
        t: Time points

    Returns:
        Predictions at all timesteps of shape (batch, seq_len, output_dim)
    """
    _ = self.forward(x, t)

    if self._trajectory is None:
        raise RuntimeError("Forward pass failed")

    # Decode at all timesteps
    traj = self._trajectory.permute(1, 0, 2)  # (batch, seq_len, latent_dim)
    batch_size, seq_len, _ = traj.shape

    # Flatten, decode, reshape
    traj_flat = traj.reshape(-1, self.latent_dim)
    y_flat = self.decoder(traj_flat)
    y = y_flat.reshape(batch_size, seq_len, -1)

    return y

ODEFunc

ODEFunc(latent_dim: int, hidden_dim: int = 64)

Bases: Module

Neural network defining dz/dt = f(z, t).

The ODE function takes the current state z and time t, and returns the rate of change dz/dt.

Initialize the ODE function.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `latent_dim` | `int` | Dimension of the latent state | *required* |
| `hidden_dim` | `int` | Hidden layer dimension | `64` |
Source code in src/models/neural_ode.py
def __init__(self, latent_dim: int, hidden_dim: int = 64):
    """Initialize the ODE function.

    Args:
        latent_dim: Dimension of the latent state
        hidden_dim: Hidden layer dimension
    """
    super().__init__()
    self.net = nn.Sequential(
        nn.Linear(latent_dim + 1, hidden_dim),  # +1 for time
        nn.Tanh(),
        nn.Linear(hidden_dim, hidden_dim),
        nn.Tanh(),
        nn.Linear(hidden_dim, latent_dim),
    )
Functions
forward
forward(t: torch.Tensor, z: torch.Tensor) -> torch.Tensor

Compute dz/dt.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `t` | `Tensor` | Current time (scalar or batch) | *required* |
| `z` | `Tensor` | Current state of shape (batch, latent_dim) | *required* |

Returns:

| Type | Description |
| --- | --- |
| `Tensor` | dz/dt of shape (batch, latent_dim) |

Source code in src/models/neural_ode.py
def forward(self, t: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
    """Compute dz/dt.

    Args:
        t: Current time (scalar or batch)
        z: Current state of shape (batch, latent_dim)

    Returns:
        dz/dt of shape (batch, latent_dim)
    """
    # Concatenate time as feature
    if t.dim() == 0:
        t_expanded = t.expand(z.shape[0], 1)
    else:
        t_expanded = t.unsqueeze(-1) if t.dim() == 1 else t

    zt = torch.cat([z, t_expanded], dim=-1)
    return self.net(zt)
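In its simplest fixed-step form, what odeint does with such a function is repeated Euler updates z ← z + Δt · f(z, t); a numpy sketch on the scalar ODE dz/dt = −z, whose exact solution is e^(−t):

```python
import numpy as np

def f(z, t):
    return -z                      # dz/dt = -z, so z(t) = z0 * exp(-t)

t_span = np.linspace(0.0, 1.0, 1001)
z = 1.0                            # initial state z0
for t0, t1 in zip(t_span[:-1], t_span[1:]):
    z = z + (t1 - t0) * f(z, t0)   # one explicit Euler step

# z now approximates exp(-1); adaptive solvers like dopri5 refine this
# idea with higher-order steps and rtol/atol-controlled step sizes.
```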

acla

ACLA: Attention-CNN-LSTM-ANODE hybrid model for sequence-based degradation prediction.

Classes

ACLAModel

ACLAModel(input_dim: int, output_dim: int = 1, hidden_dim: int = 64, augment_dim: int = 20, cnn_filters: Optional[List[int]] = None, solver: str = 'dopri5', rtol: float = 0.0001, atol: float = 1e-05, use_adjoint: bool = False)

Bases: BaseModel

ACLA: Attention-CNN-LSTM-ANODE hybrid model.

Architecture:

1. Attention: temporal attention across sequence timesteps
2. CNN-LSTM: feature extraction and temporal modeling
3. ANODE: Augmented Neural ODE for continuous-time dynamics
4. Output: sequence-to-sequence predictions

Suitable for:

- Complex sequence-based degradation prediction
- Multi-target prediction (e.g., LAM_NE, LAM_PE, LLI)
- Understanding which timesteps are important (attention)
- Continuous-time trajectory modeling

Example usage

model = ACLAModel(input_dim=20, output_dim=3, hidden_dim=64)
output = model(x, t=t)  # x: (batch, seq_len, features), t: (seq_len,)

Initialize the ACLA model.

Parameters:

Name Type Description Default
input_dim int

Number of input features per timestep

required
output_dim int

Number of output predictions

1
hidden_dim int

Hidden dimension for LSTM and ODE

64
augment_dim int

Number of augmented dimensions for ANODE

20
cnn_filters Optional[List[int]]

List of CNN filter sizes [first, second]

None
solver str

ODE solver ("dopri5", "euler", "rk4", etc.)

'dopri5'
rtol float

Relative tolerance for solver

0.0001
atol float

Absolute tolerance for solver

1e-05
use_adjoint bool

Use adjoint method for memory-efficient gradients. Default False (direct backprop) gives better accuracy and is faster for short sequences. Set True only for very long sequences or memory-constrained scenarios.

False
Source code in src/models/acla.py
def __init__(self,
             input_dim: int,
             output_dim: int = 1,
             hidden_dim: int = 64,
             augment_dim: int = 20,
             cnn_filters: Optional[List[int]] = None,
             solver: str = "dopri5",
             rtol: float = 1e-4,
             atol: float = 1e-5,
             use_adjoint: bool = False):
    """Initialize the ACLA model.

    Args:
        input_dim: Number of input features per timestep
        output_dim: Number of output predictions
        hidden_dim: Hidden dimension for LSTM and ODE
        augment_dim: Number of augmented dimensions for ANODE
        cnn_filters: List of CNN filter sizes [first, second]
        solver: ODE solver ("dopri5", "euler", "rk4", etc.)
        rtol: Relative tolerance for solver
        atol: Absolute tolerance for solver
        use_adjoint: Use adjoint method for memory-efficient gradients.
                     Default False (direct backprop) gives better accuracy and is faster
                     for short sequences. Set True only for very long sequences or
                     memory-constrained scenarios.
    """
    super().__init__(input_dim, output_dim)

    if not HAS_TORCHDIFFEQ:
        raise ImportError("torchdiffeq required. Install with: pip install torchdiffeq")

    if cnn_filters is None:
        cnn_filters = [64, 32]

    self.hidden_dim = hidden_dim
    self.augment_dim = augment_dim
    self.solver = solver
    self.rtol = rtol
    self.atol = atol
    self.use_adjoint = use_adjoint

    # 1. Attention layer
    self.attention = AttentionLayer(input_dim, hidden_dim=64, num_heads=4)

    # 2. CNN-LSTM encoder
    self.encoder = CNNLSTMEncoder(
        input_dim=input_dim,
        cnn_filters=cnn_filters,
        lstm_hidden=hidden_dim
    )

    # 3. Linear layer to initialize ODE state
    self.fc_ode_init = nn.Linear(hidden_dim, hidden_dim)

    # 4. ODE function (with augmented dimensions)
    self.ode_func = ODEFunc(hidden_dim, augment_dim)

    # 5. Output layer
    self.fc_out = nn.Linear(hidden_dim + augment_dim, output_dim)

    self._trajectory: Optional[torch.Tensor] = None
    self._last_attn_weights: Optional[torch.Tensor] = None
Functions
explain
explain(x: torch.Tensor, t: Optional[torch.Tensor] = None, **kwargs) -> Dict[str, Any]

Return interpretability information.

Parameters:

Name Type Description Default
x Tensor

Input sequence

required
t Optional[Tensor]

Time points

None

Returns:

Type Description
Dict[str, Any]

Dictionary with attention weights and trajectory

Source code in src/models/acla.py
def explain(self, x: torch.Tensor, t: Optional[torch.Tensor] = None,
            **kwargs) -> Dict[str, Any]:
    """Return interpretability information.

    Args:
        x: Input sequence
        t: Time points

    Returns:
        Dictionary with attention weights and trajectory
    """
    _ = self.forward(x, t)
    return {
        'attention_weights': self._last_attn_weights.cpu().numpy() if self._last_attn_weights is not None else None,
        'trajectory': self._trajectory.cpu().numpy() if self._trajectory is not None else None,
        'hidden_dim': self.hidden_dim,
        'augment_dim': self.augment_dim,
    }
forward
forward(x: torch.Tensor, t: Optional[torch.Tensor] = None, **kwargs) -> torch.Tensor

Forward pass with attention, CNN-LSTM, and ODE integration.

Parameters:

Name Type Description Default
x Tensor

Input sequence of shape (batch, seq_len, input_dim)

required
t Optional[Tensor]

Time points of shape (batch, seq_len) or (seq_len,)

None

Returns:

Type Description
Tensor

Predictions of shape (batch, seq_len, output_dim)

Source code in src/models/acla.py
def forward(self, x: torch.Tensor, t: Optional[torch.Tensor] = None,
            **kwargs) -> torch.Tensor:
    """Forward pass with attention, CNN-LSTM, and ODE integration.

    Args:
        x: Input sequence of shape (batch, seq_len, input_dim)
        t: Time points of shape (batch, seq_len) or (seq_len,)

    Returns:
        Predictions of shape (batch, seq_len, output_dim)
    """
    batch_size, seq_len, _ = x.shape

    # 1. Apply attention
    x_attended, attn_weights = self.attention(x)
    self._last_attn_weights = attn_weights.detach()

    # 2. CNN-LSTM encoding
    encoded = self.encoder(x_attended)  # (batch, seq_len, hidden_dim)

    # 3. Use first timestep's encoding as ODE initial condition
    # This is more efficient: single ODE integration instead of seq_len calls
    z0 = self.fc_ode_init(encoded[:, 0, :])  # (batch, hidden_dim)

    # Augment with zeros for ANODE
    augment = torch.zeros(batch_size, self.augment_dim, device=x.device)
    y0 = torch.cat([z0, augment], dim=1)  # (batch, hidden_dim + augment_dim)

    # Get integration times
    if t is None:
        t_span = torch.linspace(0, 1, seq_len, device=x.device)
    else:
        if t.dim() == 2:
            t_span = t[0]  # Assume same times across batch
        else:
            t_span = t
        t_span = t_span - t_span[0]  # Normalize to start at 0

    # 4. Single ODE integration for entire trajectory (much more efficient)
    odeint_fn = odeint_adjoint if self.use_adjoint else odeint

    trajectory = odeint_fn(
        self.ode_func,
        y0,
        t_span,
        method=self.solver,
        rtol=self.rtol,
        atol=self.atol,
    )  # (seq_len, batch, hidden_dim + augment_dim)

    self._trajectory = trajectory.detach()

    # 5. Decode all timesteps
    traj = trajectory.permute(1, 0, 2)  # (batch, seq_len, hidden_dim + augment_dim)

    # Flatten, decode, reshape
    traj_flat = traj.reshape(-1, self.hidden_dim + self.augment_dim)
    y_flat = self.fc_out(traj_flat)
    output = y_flat.reshape(batch_size, seq_len, -1)

    return output
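The time handling in step 3 above can be mirrored with plain lists (a sketch of the offset logic only, not the tensor code):

```python
def normalize_t_span(t):
    """Mirror ACLAModel.forward's time handling with plain lists (a sketch).

    A 2-D t of shape (batch, seq_len) keeps only its first row, since the
    model assumes shared times across the batch; the span is then shifted
    so integration starts at t = 0.
    """
    if t and isinstance(t[0], list):  # (batch, seq_len) case
        t = t[0]
    t0 = t[0]
    return [ti - t0 for ti in t]

# e.g. cycle indices [100, 150, 200] become integration times [0, 50, 100]
span = normalize_t_span([[100, 150, 200], [100, 150, 200]])
```

In the real forward pass the same shift is done on tensors (`t_span - t_span[0]`), and the result is passed as the integration grid to `odeint`.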

AttentionLayer

AttentionLayer(input_dim: int, hidden_dim: int = 64, num_heads: int = 4)

Bases: Module

Temporal attention layer for sequence data.

Applies attention mechanism across timesteps to focus on important parts of the sequence.

Initialize attention layer.

Parameters:

Name Type Description Default
input_dim int

Input feature dimension

required
hidden_dim int

Hidden dimension for attention (must be divisible by num_heads)

64
num_heads int

Number of attention heads

4
Source code in src/models/acla.py
def __init__(self, input_dim: int, hidden_dim: int = 64, num_heads: int = 4):
    """Initialize attention layer.

    Args:
        input_dim: Input feature dimension
        hidden_dim: Hidden dimension for attention (must be divisible by num_heads)
        num_heads: Number of attention heads
    """
    super().__init__()
    self.input_dim = input_dim
    self.hidden_dim = hidden_dim

    # Project input to hidden_dim (divisible by num_heads) before attention
    self.input_proj = nn.Linear(input_dim, hidden_dim)

    self.attention = nn.MultiheadAttention(
        hidden_dim, num_heads,
        batch_first=True,
        dropout=0.1,
    )

    # Project back to input_dim after attention
    self.output_proj = nn.Linear(hidden_dim, input_dim)

    self.layer_norm = nn.LayerNorm(input_dim)
Functions
forward
forward(x: torch.Tensor)

Apply attention to input sequence.

Parameters:

Name Type Description Default
x Tensor

Input tensor of shape (batch, seq_len, input_dim)

required

Returns:

Name Type Description
attended

Attended sequence (batch, seq_len, input_dim)

attn_weights

Attention weights (batch, seq_len, seq_len)

Source code in src/models/acla.py
def forward(self, x: torch.Tensor):
    """Apply attention to input sequence.

    Args:
        x: Input tensor of shape (batch, seq_len, input_dim)

    Returns:
        attended: Attended sequence (batch, seq_len, input_dim)
        attn_weights: Attention weights (batch, seq_len, seq_len)
    """
    # Project input to hidden_dim (divisible by num_heads)
    x_proj = self.input_proj(x)

    # Self-attention in hidden_dim space
    attn_out, attn_weights = self.attention(x_proj, x_proj, x_proj)

    # Project back to input_dim
    attn_out = self.output_proj(attn_out)

    # Residual connection + layer norm
    attended = self.layer_norm(attn_out + x)

    return attended, attn_weights
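The dimension contract behind this layer can be checked without building it: `hidden_dim` must split evenly across heads, and the output projection returns to `input_dim` so the residual is shape-compatible. A small validation sketch (hypothetical helper, not part of the module):

```python
def check_attention_dims(input_dim, hidden_dim=64, num_heads=4):
    """Validate the AttentionLayer dimension contract (a sketch).

    nn.MultiheadAttention requires hidden_dim % num_heads == 0; the
    output projection maps back to input_dim, so the residual
    x + attn_out works for any input_dim.
    """
    if hidden_dim % num_heads != 0:
        raise ValueError(
            f"hidden_dim={hidden_dim} not divisible by num_heads={num_heads}")
    return hidden_dim // num_heads  # per-head dimension

head_dim = check_attention_dims(20)  # defaults 64 / 4 heads -> 16 per head
```

This is why the layer projects to `hidden_dim` first: the raw `input_dim` (e.g. 20 features) need not be divisible by the head count.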

CNNLSTMEncoder

CNNLSTMEncoder(input_dim: int, cnn_filters: Optional[List[int]] = None, lstm_hidden: int = 64)

Bases: Module

CNN-LSTM encoder for feature extraction from sequences.

Uses 1D CNN to extract local patterns, then LSTM to capture temporal dependencies.

Initialize CNN-LSTM encoder.

Parameters:

Name Type Description Default
input_dim int

Input feature dimension

required
cnn_filters Optional[List[int]]

List of CNN filter sizes [first_layer, second_layer]

None
lstm_hidden int

LSTM hidden dimension

64
Source code in src/models/acla.py
def __init__(self, input_dim: int, cnn_filters: Optional[List[int]] = None,
             lstm_hidden: int = 64):
    """Initialize CNN-LSTM encoder.

    Args:
        input_dim: Input feature dimension
        cnn_filters: List of CNN filter sizes [first_layer, second_layer]
        lstm_hidden: LSTM hidden dimension
    """
    super().__init__()
    if cnn_filters is None:
        cnn_filters = [64, 32]
    self.input_dim = input_dim
    self.cnn_filters = cnn_filters
    self.lstm_hidden = lstm_hidden

    # 1D CNN layers: process features at each timestep
    self.cnn = nn.Sequential(
        nn.Conv1d(1, cnn_filters[0], kernel_size=3, padding=1),
        nn.ReLU(),
        nn.Conv1d(cnn_filters[0], cnn_filters[1], kernel_size=3, padding=1),
        nn.ReLU()
    )

    # LSTM for temporal modeling
    self.lstm = nn.LSTM(
        input_size=cnn_filters[1],
        hidden_size=lstm_hidden,
        num_layers=1,
        batch_first=True
    )
Functions
forward
forward(x: torch.Tensor) -> torch.Tensor

Encode input sequence.

Parameters:

Name Type Description Default
x Tensor

Input tensor of shape (batch, seq_len, input_dim)

required

Returns:

Type Description
Tensor

Encoded representation of shape (batch, seq_len, lstm_hidden)

Source code in src/models/acla.py
def forward(self, x: torch.Tensor) -> torch.Tensor:
    """Encode input sequence.

    Args:
        x: Input tensor of shape (batch, seq_len, input_dim)

    Returns:
        Encoded representation of shape (batch, seq_len, lstm_hidden)
    """
    batch_size, seq_len, feat_dim = x.size()

    # Reshape for CNN: treat each timestep independently
    # (batch * seq_len, 1, feat_dim)
    x_reshaped = x.view(batch_size * seq_len, 1, feat_dim)

    # CNN feature extraction
    cnn_out = self.cnn(x_reshaped)  # (batch * seq_len, cnn_filters[1], feat_dim)

    # Reshape for LSTM: (batch, seq_len, cnn_filters[1])
    # Average over feature dimension to get single value per timestep
    cnn_out = cnn_out.mean(dim=2)  # (batch * seq_len, cnn_filters[1])
    cnn_out = cnn_out.view(batch_size, seq_len, self.cnn_filters[1])

    # LSTM processing
    lstm_out, _ = self.lstm(cnn_out)  # (batch, seq_len, lstm_hidden)

    return lstm_out
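The reshapes above are easiest to follow as pure shape bookkeeping. A sketch tracing each intermediate shape (kernel_size=3 with padding=1 preserves the feature length; the mean over dim=2 collapses it before the LSTM):

```python
def encoder_shape_flow(batch, seq_len, input_dim,
                       cnn_filters=(64, 32), lstm_hidden=64):
    """Trace tensor shapes through CNNLSTMEncoder.forward (bookkeeping only)."""
    return {
        "input":    (batch, seq_len, input_dim),
        "cnn_in":   (batch * seq_len, 1, input_dim),           # timesteps flattened
        "cnn_out":  (batch * seq_len, cnn_filters[1], input_dim),
        "pooled":   (batch * seq_len, cnn_filters[1]),         # mean over dim=2
        "lstm_in":  (batch, seq_len, cnn_filters[1]),
        "lstm_out": (batch, seq_len, lstm_hidden),
    }

flow = encoder_shape_flow(8, 10, 20)
```

For a batch of 8 sequences of 10 timesteps with 20 features, the CNN sees 80 independent "rows" of shape (1, 20) and the LSTM receives (8, 10, 32).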

ODEFunc

ODEFunc(hidden_dim: int, augment_dim: int = 20, ode_hidden_dim: int = 128)

Bases: Module

Neural network defining ODE dynamics with augmented dimensions.

Implements ANODE (Augmented Neural ODE) by including augmented dimensions in the state space.

Initialize ODE function.

Parameters:

Name Type Description Default
hidden_dim int

Dimension of hidden state

required
augment_dim int

Number of augmented dimensions

20
ode_hidden_dim int

Hidden dimension in ODE network

128
Source code in src/models/acla.py
def __init__(self, hidden_dim: int, augment_dim: int = 20, 
             ode_hidden_dim: int = 128):
    """Initialize ODE function.

    Args:
        hidden_dim: Dimension of hidden state
        augment_dim: Number of augmented dimensions
        ode_hidden_dim: Hidden dimension in ODE network
    """
    super().__init__()
    total_dim = hidden_dim + augment_dim

    self.net = nn.Sequential(
        nn.Linear(total_dim + 1, ode_hidden_dim),  # +1 for time
        nn.Tanh(),
        nn.Linear(ode_hidden_dim, ode_hidden_dim),
        nn.Tanh(),
        nn.Linear(ode_hidden_dim, total_dim),
    )

    # Initialize weights with small std for stable ODE integration
    for m in self.net.modules():
        if isinstance(m, nn.Linear):
            nn.init.normal_(m.weight, mean=0, std=0.01)
            nn.init.constant_(m.bias, val=0)
Functions
forward
forward(t: torch.Tensor, y: torch.Tensor) -> torch.Tensor

Compute dy/dt.

Parameters:

Name Type Description Default
t Tensor

Current time (scalar or batch)

required
y Tensor

Current state of shape (batch, hidden_dim + augment_dim)

required

Returns:

Type Description
Tensor

dy/dt of shape (batch, hidden_dim + augment_dim)

Source code in src/models/acla.py
def forward(self, t: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Compute dy/dt.

    Args:
        t: Current time (scalar or batch)
        y: Current state of shape (batch, hidden_dim + augment_dim)

    Returns:
        dy/dt of shape (batch, hidden_dim + augment_dim)
    """
    # Concatenate time as feature
    if t.dim() == 0:
        t_expanded = t.expand(y.shape[0], 1)
    else:
        t_expanded = t.unsqueeze(-1) if t.dim() == 1 else t

    yt = torch.cat([y, t_expanded], dim=-1)
    return self.net(yt)
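Because the MLP width is fixed by `total_dim = hidden_dim + augment_dim`, its parameter count has a closed form (weights plus biases per `nn.Linear`). A bookkeeping sketch:

```python
def ode_func_param_count(hidden_dim, augment_dim=20, ode_hidden_dim=128):
    """Parameter count of the ODEFunc MLP: three Linear layers, +1 input for time."""
    total_dim = hidden_dim + augment_dim
    p1 = (total_dim + 1) * ode_hidden_dim + ode_hidden_dim  # Linear(total+1 -> hidden)
    p2 = ode_hidden_dim * ode_hidden_dim + ode_hidden_dim   # Linear(hidden -> hidden)
    p3 = ode_hidden_dim * total_dim + total_dim             # Linear(hidden -> total)
    return p1 + p2 + p3

n = ode_func_param_count(64)  # defaults used by ACLAModel: 38356 parameters
```

The same figure should be reported by `count_parameters()` on the instantiated `ODEFunc`, assuming no parameters beyond these three layers.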