API reference¶

Generated from docstrings and type hints in the gamfit source. See the topical guides for narrative explanations.

Top-level functions¶

fit ¶

fit(
    data: Any,
    formula: str,
    *,
    family: str = ...,
    offset: str | None = ...,
    weights: str | None = ...,
    transformation_normal: bool | None = ...,
    survival_likelihood: str | None = ...,
    baseline_target: str | None = ...,
    baseline_scale: float | None = ...,
    baseline_shape: float | None = ...,
    baseline_rate: float | None = ...,
    baseline_makeham: float | None = ...,
    z_column: str | None = ...,
    link: str | None = ...,
    logslope_formula: str | None = ...,
    frailty_kind: str | None = ...,
    frailty_sd: float | None = ...,
    hazard_loading: str | None = ...,
    scale_dimensions: bool | None = ...,
    adaptive_regularization: bool | None = ...,
    firth: bool | None = ...,
    precision_hyperpriors: Any | None = ...,
    response_geometry: None = ...,
    response_columns: list[str] | tuple[str, ...] | None = ...,
    response_coordinates: str | None = ...,
    response_reference: int | None = ...,
    config: dict[str, Any] | None = ...,
) -> Model

fit(
    data: Any,
    formula: str,
    *,
    family: str = ...,
    offset: str | None = ...,
    weights: str | None = ...,
    transformation_normal: bool | None = ...,
    survival_likelihood: str | None = ...,
    baseline_target: str | None = ...,
    baseline_scale: float | None = ...,
    baseline_shape: float | None = ...,
    baseline_rate: float | None = ...,
    baseline_makeham: float | None = ...,
    z_column: str | None = ...,
    link: str | None = ...,
    logslope_formula: str | None = ...,
    frailty_kind: str | None = ...,
    frailty_sd: float | None = ...,
    hazard_loading: str | None = ...,
    scale_dimensions: bool | None = ...,
    adaptive_regularization: bool | None = ...,
    firth: bool | None = ...,
    precision_hyperpriors: Any | None = ...,
    response_geometry: str,
    response_columns: list[str] | tuple[str, ...] | None = ...,
    response_coordinates: str | None = ...,
    response_reference: int | None = ...,
    config: dict[str, Any] | None = ...,
) -> ResponseGeometryModel

fit(
    data: Any,
    formula: str,
    *,
    family: str = "auto",
    offset: str | None = None,
    weights: str | None = None,
    transformation_normal: bool | None = None,
    survival_likelihood: str | None = None,
    baseline_target: str | None = None,
    baseline_scale: float | None = None,
    baseline_shape: float | None = None,
    baseline_rate: float | None = None,
    baseline_makeham: float | None = None,
    z_column: str | None = None,
    link: str | None = None,
    logslope_formula: str | None = None,
    frailty_kind: str | None = None,
    frailty_sd: float | None = None,
    hazard_loading: str | None = None,
    scale_dimensions: bool | None = None,
    adaptive_regularization: bool | None = None,
    firth: bool | None = None,
    precision_hyperpriors: Any | None = None,
    response_geometry: str | None = None,
    response_columns: list[str] | tuple[str, ...] | None = None,
    response_coordinates: str | None = None,
    response_reference: int | None = None,
    config: dict[str, Any] | None = None,
) -> Model | ResponseGeometryModel

Fit a GAM from a formula and a tabular dataset.

Parameters:

Name	Type	Description	Default
`data`	`Any`	Input table. Accepts a pandas DataFrame, pyarrow Table, dict of columns, list of records, or any object normalize_table understands.	required
`formula`	`str`	Wilkinson-style formula string (e.g. `"y ~ s(x1) + te(x2, x3)"`).	required
`family`	`str`	Likelihood family, or `"auto"` to infer from the response. Corresponds to the `--family` CLI flag.	`'auto'`
`offset`	`str \| None`	Name of the offset column. Corresponds to `--offset-column`.	`None`
`weights`	`str \| None`	Name of the observation-weight column. Corresponds to `--weights-column`.	`None`
`transformation_normal`	`bool \| None`	Fit a conditional transformation-normal model (`h(Y\|x) ~ N(0,1)`). Corresponds to `--transformation-normal`.	`None`
`survival_likelihood`	`str \| None`	Survival likelihood formulation. One of `"transformation"`, `"weibull"`, `"location-scale"`, `"marginal-slope"`, `"latent"`, or `"latent-binary"`. Corresponds to `--survival-likelihood`.	`None`
`baseline_target`	`str \| None`	Parametric baseline target for survival models. One of `"linear"`, `"weibull"`, `"gompertz"`, `"gompertz-makeham"`. Corresponds to `--baseline-target`.	`None`
`baseline_scale`	`float \| None`	Weibull baseline scale (>0) when `baseline_target="weibull"`. Corresponds to `--baseline-scale`.	`None`
`baseline_shape`	`float \| None`	Weibull baseline shape (>0). Corresponds to `--baseline-shape`.	`None`
`baseline_rate`	`float \| None`	Gompertz hazard rate (>0) when `baseline_target` is `"gompertz"` or `"gompertz-makeham"`. Corresponds to `--baseline-rate`.	`None`
`baseline_makeham`	`float \| None`	Makeham additive hazard (>0) when `baseline_target="gompertz-makeham"`. Corresponds to `--baseline-makeham`.	`None`
`z_column`	`str \| None`	Name of the latent/observed z-score column used by score-warp families and latent transformation models. Corresponds to `--z-column`.	`None`
`link`	`str \| None`	Override the default link function. Corresponds to `--link`.	`None`
`logslope_formula`	`str \| None`	Secondary formula for the logslope / score-warp submodel. Corresponds to `--logslope-formula`.	`None`
`frailty_kind`	`str \| None`	Frailty family for frailty-aware survival models. One of `"gaussian-shift"` or `"hazard-multiplier"`. Corresponds to `--frailty-kind`.	`None`
`response_geometry`	`str \| None`	Optional manifold-valued response geometry. Use `"spherical"` for unit-sphere responses, or `"simplex"` / `"clr"` / `"alr"` for strictly positive compositional responses. The base point is the intrinsic Fréchet mean of the training responses, not an extrinsic arithmetic mean.	`None`
`response_columns`	`list[str] \| tuple[str, ...] \| None`	Sequence of response component columns used when `response_geometry` is set. One scalar Gaussian GAM is fitted for each tangent coordinate.	`None`
`response_coordinates`	`str \| None`	Coordinate chart for simplex responses: `"clr"` (default) or `"alr"`. Spherical responses always use ambient tangent coordinates.	`None`
`response_reference`	`int \| None`	Reference component for `"alr"` coordinates (default: last column).	`None`
`frailty_sd`	`float \| None`	Fixed frailty standard deviation. Omit to let latent hazard-multiplier models learn it. Corresponds to `--frailty-sd`.	`None`
`hazard_loading`	`str \| None`	Hazard loading for `frailty_kind="hazard-multiplier"`. One of `"full"` or `"loaded-vs-unloaded"`. Corresponds to `--hazard-loading`.	`None`
`scale_dimensions`	`bool \| None`	When `True`, enables learned per-axis anisotropic length scales on spatial smooths (e.g. multi-dim Duchon / Matern / TPS). Per-axis scales are learned, not specified. Corresponds to `--scale-dimensions`.	`None`
`adaptive_regularization`	`bool \| None`	Enable exact local adaptive regularization for compatible spatial smooths. Omit to use the quality-first automatic policy, which leaves it off unless explicitly requested.	`None`
`firth`	`bool \| None`	Enable Firth bias-reduced estimation. Corresponds to `--firth`.	`None`
`config`	`dict[str, Any] \| None`	Escape-hatch dict of extra pipeline keys. Any key already set via a dedicated kwarg wins over the same key in `config`.	`None`
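
The precedence rule for `config` can be pictured as a plain dictionary merge. The sketch below is illustrative only: `merge_pipeline_options` is a hypothetical helper, not part of gamfit, and it assumes that a dedicated keyword left at `None` counts as unset.

```python
def merge_pipeline_options(config=None, **kwargs):
    """Hypothetical helper: dedicated kwargs override same-named config keys."""
    merged = dict(config or {})
    for key, value in kwargs.items():
        # Assumption: None means "not set", so the config value survives.
        if value is not None:
            merged[key] = value
    return merged


options = merge_pipeline_options(
    {"link": "probit", "firth": True},
    link="logit",  # set explicitly, so it wins over config["link"]
    firth=None,    # left unset, so config["firth"] is kept
)
print(options)  # {'link': 'logit', 'firth': True}
```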

Returns:

Type	Description
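`Model`	A fitted model object with `predict`, `summary`, and save/load helpers. Returned when `response_geometry` is not set.
`ResponseGeometryModel`	Returned instead when `response_geometry` is set, as declared by the overloaded signatures above.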
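
For the compositional options documented under `response_coordinates`, the two charts correspond to the standard centred and additive log-ratio transforms. The sketch below implements those textbook definitions in plain Python. The function names are illustrative, and it does not show how gamfit centres the coordinates at the Fréchet mean.

```python
import math


def clr(composition):
    """Centred log-ratio: log of each part minus the mean log."""
    logs = [math.log(part) for part in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def alr(composition, reference=-1):
    """Additive log-ratio against a reference part (default: last)."""
    index = reference % len(composition)
    ref = composition[index]
    return [
        math.log(part / ref)
        for i, part in enumerate(composition)
        if i != index
    ]


x = [0.2, 0.3, 0.5]  # strictly positive parts
clr_x = clr(x)       # three coordinates that sum to zero
alr_x = alr(x)       # two coordinates, relative to the last part
```

The clr chart keeps one coordinate per component, while the alr chart drops the reference component, which is why `response_reference` only matters for `"alr"`.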

fit_array ¶

fit_array(
    X: Any,
    Y: Any,
    formula: str,
    *,
    family: str = "auto",
    offset: str | None = None,
    weights: str | None = None,
    transformation_normal: bool | None = None,
    survival_likelihood: str | None = None,
    baseline_target: str | None = None,
    baseline_scale: float | None = None,
    baseline_shape: float | None = None,
    baseline_rate: float | None = None,
    baseline_makeham: float | None = None,
    z_column: str | None = None,
    link: str | None = None,
    logslope_formula: str | None = None,
    frailty_kind: str | None = None,
    frailty_sd: float | None = None,
    hazard_loading: str | None = None,
    scale_dimensions: bool | None = None,
    adaptive_regularization: bool | None = None,
    firth: bool | None = None,
    precision_hyperpriors: Any | None = None,
    config: dict[str, Any] | None = None,
) -> Model

Fit directly from numeric NumPy-compatible arrays.

The columns of X are named x0, x1, ... at the formula boundary. A one-column Y takes its name from the formula response; the columns of a multi-column Y are named y0, y1, ...
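
The naming rule can be sketched in a few lines. `name_array_columns` is a hypothetical helper written for this illustration (gamfit's own implementation is not shown here), and it assumes the response name is whatever appears to the left of `~`.

```python
def name_array_columns(n_features, n_responses, formula):
    """Hypothetical helper mirroring the naming rule described above."""
    feature_names = [f"x{i}" for i in range(n_features)]
    if n_responses == 1:
        # A single response column takes the name on the formula's left side.
        response_names = [formula.split("~", 1)[0].strip()]
    else:
        response_names = [f"y{i}" for i in range(n_responses)]
    return feature_names, response_names


features, responses = name_array_columns(3, 1, "y ~ s(x0) + s(x1) + x2")
print(features)   # ['x0', 'x1', 'x2']
print(responses)  # ['y']
```

The formula passed alongside the arrays therefore refers to features by these positional names rather than by any original column labels.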

load ¶

load(path: str | Path) -> Model

Load a fitted :class:Model previously written with :meth:Model.save.

Reads the raw bytes from path and dispatches to :func:loads.

Parameters:

Name	Type	Description	Default
`path`	`str or Path`	Filesystem path to the serialized model file.	required

Returns:

Type	Description
`Model`	Fitted model ready for prediction.

Raises:

Type	Description
`GamError`	If the file cannot be decoded by the Rust engine.

Examples:

>>> model = gamfit.load("model.gam")
>>> model.predict(test_df)

loads ¶

loads(model_bytes: bytes) -> Model

Load a fitted :class:Model from an in-memory bytes payload.

Parameters:

Name	Type	Description	Default
`model_bytes`	`bytes`	Raw serialized model produced by :meth:`Model.save` or :meth:`Model.dumps`.	required

Returns:

Type	Description
`Model`	Fitted model ready for prediction.

Raises:

Type	Description
`GamError`	If the payload is malformed or incompatible with the current engine.

Examples:

>>> with open("model.gam", "rb") as fh:
...     model = gamfit.loads(fh.read())

load_posterior ¶

load_posterior(path: str | Path) -> PosteriorSamples

Load a :class:PosteriorSamples archive from disk.

Thin wrapper around :meth:PosteriorSamples.load provided for symmetry with :func:gamfit.load / :func:gamfit.fit at module level.

Parameters:

Name	Type	Description	Default
`path`	`str or Path`	Filesystem path to an `.npz` archive previously written by :meth:`PosteriorSamples.save`.	required

Returns:

Type	Description
`PosteriorSamples`	Reconstructed posterior draws and metadata.

Examples:

>>> draws = gamfit.load_posterior("posterior.npz")
>>> draws.beta.shape
(1000, 42)

competing_risks_cif ¶

competing_risks_cif(
    predictions: Mapping[str, "SurvivalPrediction"]
    | Sequence["SurvivalPrediction"],
    *,
    times: Any,
    endpoint_names: Sequence[str] | None = None,
) -> CompetingRisksCIF

Assemble competing-risks CIFs from cause-specific survival predictions.
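
As background, a textbook way to build a cumulative incidence function from cause-specific hazards is CIF_k(t) = ∫ h_k(u) S(u) du over [0, t], where S is the all-cause survival. The sketch below approximates that integral with a trapezoid rule on a shared grid. It illustrates the concept only and is not claimed to be what `competing_risks_cif` does internally; all names are hypothetical.

```python
import math


def cif_from_cause_specific(times, hazards):
    """hazards: dict of cause -> hazard values on `times` (times[0] == 0).

    Returns dict of cause -> CIF values on `times` (trapezoid rule).
    """
    n = len(times)
    # All-cause hazard and its cumulative integral give overall survival.
    total = [sum(h[i] for h in hazards.values()) for i in range(n)]
    cumulative = [0.0] * n
    for i in range(1, n):
        dt = times[i] - times[i - 1]
        cumulative[i] = cumulative[i - 1] + 0.5 * (total[i] + total[i - 1]) * dt
    survival = [math.exp(-c) for c in cumulative]

    cifs = {}
    for cause, h in hazards.items():
        values = [0.0] * n
        for i in range(1, n):
            dt = times[i] - times[i - 1]
            left = h[i - 1] * survival[i - 1]
            right = h[i] * survival[i]
            values[i] = values[i - 1] + 0.5 * (left + right) * dt
        cifs[cause] = values
    return cifs


times = [0.01 * i for i in range(501)]  # grid on [0, 5]
hazards = {"relapse": [0.2] * 501, "death": [0.1] * 501}
cifs = cif_from_cause_specific(times, hazards)
```

With constant hazards the result can be checked against the closed form h_k / (h_1 + h_2) * (1 - exp(-(h_1 + h_2) t)).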

cross_fit_shared_precision_groups ¶

cross_fit_shared_precision_groups(
    models: Sequence[Model] | Mapping[str, Model],
    groups: Sequence[SharedPrecisionGroup | Mapping[str, Any]]
    | Mapping[str, Any],
) -> dict[str, dict[str, Any]]

Compute EB precision updates shared across separately fitted models.

For each declared group p, the update is

lambda_p = (N_fits(p) * d_p + 2 * (a_p - 1)) / (sum_q_p + 2 * b_p),

where sum_q_p pools ||beta_p||² + tr(Sigma_pp) over models where the selected term/column/label appears. If a model does not contain the selected block, it is skipped for that group.
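
The update formula translates directly into code. The sketch below is a standalone illustration with a hypothetical function name, and it assumes that `d_p` is the block dimension and that `a_p`, `b_p` are hyperprior shape and rate parameters, which the text above does not spell out.

```python
def shared_precision_update(blocks, d_p, a_p, b_p):
    """blocks: one (beta_p, trace_sigma_pp) pair per model containing the block.

    Models without the block are simply left out of `blocks`.
    """
    n_fits = len(blocks)
    # Pool ||beta_p||^2 + tr(Sigma_pp) across the contributing models.
    sum_q = sum(sum(b * b for b in beta) + trace for beta, trace in blocks)
    return (n_fits * d_p + 2.0 * (a_p - 1.0)) / (sum_q + 2.0 * b_p)


blocks = [
    ([0.5, -0.5], 0.5),  # model A: ||beta||^2 = 0.5, trace = 0.5
    ([1.0, 0.0], 1.0),   # model B: ||beta||^2 = 1.0, trace = 1.0
]
lam = shared_precision_update(blocks, d_p=2, a_p=1.0, b_p=0.5)
print(lam)  # 1.0
```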

validate_formula ¶

validate_formula(
    data: Any,
    formula: str,
    *,
    family: str = "auto",
    offset: str | None = None,
    weights: str | None = None,
    transformation_normal: bool | None = None,
    survival_likelihood: str | None = None,
    baseline_target: str | None = None,
    baseline_scale: float | None = None,
    baseline_shape: float | None = None,
    baseline_rate: float | None = None,
    baseline_makeham: float | None = None,
    z_column: str | None = None,
    link: str | None = None,
    logslope_formula: str | None = None,
    frailty_kind: str | None = None,
    frailty_sd: float | None = None,
    hazard_loading: str | None = None,
    scale_dimensions: bool | None = None,
    adaptive_regularization: bool | None = None,
    firth: bool | None = None,
    config: dict[str, Any] | None = None,
) -> FormulaValidation

Validate a formula against a dataset without fitting.

Accepts every pipeline kwarg that :func:fit accepts, with identical semantics. See :func:fit for parameter documentation.

build_info ¶

build_info() -> dict[str, Any]

Return build/runtime metadata for the Rust extension.

Reports whether gamfit._rust was importable and, when available, the build-time information exposed by the extension (version, commit, feature flags). Useful for bug reports and for confirming a development build is being used.

Returns:

Type	Description
`dict`	Always contains `available` (bool) and `module` (str). When the extension loaded, additional engine-specific keys are merged in; otherwise `reason` describes why import failed.

Examples:

>>> info = gamfit.build_info()
>>> info["available"]
True

cuda_diagnostics ¶

cuda_diagnostics() -> dict[str, object]

Return CUDA loader diagnostics without forcing Rust GPU dispatch.

format_cuda_diagnostics ¶

format_cuda_diagnostics() -> str

Return CUDA loader diagnostics as stable, grep-friendly text.

explain_error ¶

explain_error(exc: BaseException) -> str

Return a short, actionable hint describing how to recover from exc.

Inspects the exception type and returns a one-line suggestion tailored to the gamfit error hierarchy (:class:FormulaError, :class:SchemaMismatchError, :class:PredictionError, :class:GamError, :class:RustExtensionUnavailableError). Unrecognized exceptions fall back to a generic message.

Parameters:

Name	Type	Description	Default
`exc`	`BaseException`	The exception caught from a gamfit call.	required

Returns:

Type	Description
`str`	Human-readable remediation hint.

Examples:

>>> try:
...     gamfit.fit(df, "y ~ s(nope)")
... except gamfit.GamError as exc:
...     print(gamfit.explain_error(exc))
Check the formula syntax and confirm every referenced column exists.

Fitted model¶

Model ¶

Model(*, _model_bytes: bytes, _training_table_kind: str | None = None)

formula `property` ¶

formula: str

The fitted Wilkinson-style formula string.

family_name `property` ¶

family_name: str

Human-readable family + link name (e.g. "Gaussian Identity").

model_class `property` ¶

model_class: str

Fitted model class string (e.g. "standard", "survival marginal-slope").

is_survival `property` ¶

is_survival: bool

True if this is a survival-family model.

is_marginal_slope `property` ¶

is_marginal_slope: bool

True if this model was fit with a marginal-slope likelihood.

is_transformation_normal `property` ¶

is_transformation_normal: bool

True if this is a conditional transformation-normal model.

response_name `property` ¶

response_name: str | None

Name of the response column, inferred from the formula.

Returns None for survival formulas (Surv(...)) and other cases where the left-hand side isn't a single identifier.
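
A rough picture of that inference rule, using a hypothetical stand-in function rather than gamfit's parser: take the text left of `~` and accept it only if it is a bare identifier.

```python
def infer_response_name(formula):
    """Hypothetical stand-in for the rule described above."""
    lhs = formula.split("~", 1)[0].strip()
    # Only a bare identifier counts; calls such as Surv(...) yield None.
    return lhs if lhs.isidentifier() else None


print(infer_response_name("y ~ s(x1)"))                          # y
print(infer_response_name("Surv(entry, exit, event) ~ s(age)"))  # None
```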

training_table_kind `property` ¶

training_table_kind: str | None

The kind of table the model was fit on.

One of "pandas", "polars", "pyarrow", "numpy", "mapping" (dict of columns), "records" (list of dicts), "rows" (2-D sequence), or None if the input kind wasn't retained. Used as a default return_type for :meth:predict and :meth:diagnose.

group_metadata `property` ¶

group_metadata: dict[str, Any] | None

Per-group metadata persisted with the fitted model, if present.

deployment_extensions `property` ¶

deployment_extensions: tuple[dict[str, Any], ...]

No-refit group extensions applied after fitting.

predict ¶

predict(
    data: Any,
    *,
    interval: float | None = None,
    return_type: str | None = None,
    id_column: str | None = None,
    with_uncertainty: bool = False,
) -> Any

Predict from data.

Default return (when id_column and return_type are both omitted) depends on the fitted model class:

Gaussian / Binomial / Standard models: a table (dict, pandas DataFrame, pyarrow Table, ...) matching the training table kind with an eta and mean column (plus interval columns when interval is given).
Transformation-normal models: a per-row transformed z-score as a 1-D numpy array of shape (n_samples,).
Bernoulli marginal-slope: a calibrated probability vector in (0, 1) as a 1-D numpy array of shape (n_samples,).
Survival models: a :class:SurvivalPrediction whose .hazard_at, .survival_at, .failure_at, and .cumulative_hazard_at helpers evaluate the fitted hazard surface on a user-supplied time grid.

Passing id_column or return_type switches the array-returning model classes (transformation-normal and Bernoulli marginal-slope) to the table form: a 2-column table (id_column, "z" or "mean") rather than a bare 1-D array. Naively flattening that table with np.asarray(...) or .to_numpy() yields shape (n_samples, 2), a common cause of silent broadcasting bugs in downstream metric code that expects a 1-D probability vector. When you need the probabilities as an array after asking for an id column, extract the column explicitly, e.g. out["mean"] or np.asarray(out["mean"], dtype=float).
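
The shape pitfall can be reproduced without gamfit. The sketch below uses a plain dict of columns as a stand-in for the 2-column table (the `row_id` column name and the values are made up) and contrasts row-wise flattening with explicit column extraction.

```python
# Stand-in for the table form returned when id_column is passed.
out = {"row_id": ["a", "b", "c"], "mean": [0.12, 0.80, 0.45]}

# Row-wise flattening keeps both columns: 3 rows x 2 columns.
flattened = [list(row) for row in zip(*out.values())]
shape = (len(flattened), len(flattened[0]))
print(shape)  # (3, 2)

# Explicit extraction gives the 1-D probability vector metrics expect.
probabilities = [float(p) for p in out["mean"]]
print(len(probabilities))  # 3
```

The same distinction applies to array libraries: converting the whole table yields two columns, while converting only the `"mean"` column yields a vector.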

with_uncertainty (survival only): when True, the returned :class:SurvivalPrediction also carries delta-method standard errors on the survival surface (survival_se) and the linear predictor (eta_se). Only honored for the location-scale survival likelihood mode; requesting with_uncertainty=True with any other survival likelihood ("transformation", "weibull", "marginal-slope", "latent", "latent-binary") or with competing-risks survival models raises an error.

predict_array ¶

predict_array(X: Any, *, interval: float | None = None) -> Any

Predict directly from a numeric NumPy-compatible feature matrix.

Columns are named x0, x1, ... at the Rust formula boundary. The return value is a dense NumPy array with columns ordered as eta, mean, then any uncertainty columns.

summary ¶

summary() -> Summary

Return the model summary (coefficients, family, deviance, REML score).

Returns:

Type	Description
`Summary`	A dict-like :class:`Summary` containing the fitted formula, family / link name, model class, deviance, REML or LAML score, iteration count, and the per-coefficient table (estimates, standard errors, credible-interval bounds). The summary is cached on first call.

Examples:

>>> model = gamfit.fit(train, "y ~ s(x)")
>>> s = model.summary()
>>> print(s["family_name"], s["deviance"])
>>> s.coefficients_frame()      # pandas DataFrame, requires pandas

smoothing_parameters ¶

smoothing_parameters() -> dict[int, float]

Return fitted smoothing/precision parameters by penalty index.

check ¶

check(data: Any) -> SchemaCheck

Validate data against the model's training schema.

Inexpensive: runs the schema validator only, no prediction. Use this before :meth:predict to surface column-name or type issues as structured :class:SchemaIssue records rather than as a raised :class:SchemaMismatchError.

Parameters:

Name	Type	Description	Default
`data`	`Any`	Any table-like input (pandas DataFrame, dict of columns, list of records, numpy array, etc.).	required

Returns:

Type	Description
`SchemaCheck`	`check.ok` is `True` when the data matches the training schema; otherwise `check.issues` enumerates the problems. `check.raise_for_error()` raises `ValueError` on failure.

Examples:

>>> check = model.check(test_df)
>>> if not check:
...     for issue in check.issues:
...         print(issue.kind, issue.column, issue.message)

report ¶

report(path: str | Path | None = None) -> str

Generate a standalone HTML report of the fitted model.

The report contains the summary table, smooth-term visualisations, and convergence diagnostics. It is self-contained (no external assets), so the file can be emailed or attached to a PR.

Parameters:

Name	Type	Description	Default
`path`	`str \| Path \| None`	If given, write the HTML to this path and return the path. If `None` (default), return the HTML as a string.	`None`

Returns:

Type	Description
`str`	HTML string (when `path is None`) or the written path.

Examples:

>>> model.report("report.html")
>>> html = model.report()             # for inline Jupyter display

sample ¶

sample(
    data: Any,
    *,
    samples: int | None = None,
    warmup: int | None = None,
    chains: int | None = None,
    target_accept: float | None = None,
    seed: int | None = None,
) -> PosteriorSamples

Draw from the model's posterior with NUTS.

Returns a :class:PosteriorSamples object carrying the raw (n_draws, n_coeffs) numpy matrix, per-coefficient mean / std / credible intervals, and convergence diagnostics (rhat, ess, converged).

Defaults are dimension-aware — leaving every keyword unset gives you a chain count, warmup length, and total sample budget tuned to the fitted coefficient size (see :func:gam::hmc::NutsConfig::for_dimension on the Rust side). That heuristic already covers most usage; the keywords are there for power users who want a longer run, a different acceptance target, or a fixed seed for reproducibility.

Parameters:

Name	Type	Description	Default
`data`	`Any`	Table-like input matching the model's training schema; the same input formats accepted by :meth:`predict` are supported here. For survival models, the entry/exit/event columns are consumed in addition to covariates.	required
`samples`	`int \| None`	Posterior draws per chain after warmup. When omitted, chosen automatically from the coefficient count.	`None`
`warmup`	`int \| None`	Warmup iterations per chain (defaults to `samples` when both are left unset, otherwise to the adaptive default).	`None`
`chains`	`int \| None`	Number of independent chains. Defaults adaptively to 2 or 4.	`None`
`target_accept`	`float \| None`	Target HMC acceptance rate in `(0, 1)`. Higher values give smaller leapfrog steps and slower-but-more-robust mixing.	`None`
`seed`	`int \| None`	RNG seed for deterministic chain initialisation.	`None`

Notes

Sampling currently supports standard GLM family models (Gaussian, Binomial logit/probit/cloglog, Poisson, Gamma — with or without a link-wiggle component) and survival likelihood modes other than the latent and location-scale variants. Unsupported model classes raise :class:gamfit.GamError with a message mirroring the CLI's gam sample behaviour.

sample_paired ¶

sample_paired(
    competing: "Model",
    data: Any,
    competing_data: Any | None = None,
    *,
    samples: int | None = None,
    warmup: int | None = None,
    chains: int | None = None,
    target_accept: float | None = None,
    seed: int | None = None,
) -> PairedPosteriorSamples

Draw this fit and a linked competing fit with paired draw indices.

design_matrix ¶

design_matrix(data: Any) -> Any

Materialised design matrix for data against the saved model.

Returns an (n_rows, n_coeffs) numpy array — exactly the matrix the engine uses internally for linear-predictor evaluation. Useful for custom posterior reasoning (e.g. feeding draws into your own predictive routine) or for debugging term layouts.

Currently restricted to standard non-link-wiggle GAM models; other classes raise a clear error pointing at :meth:Model.predict for the class-specific prediction path.

design_matrix_array ¶

design_matrix_array(X: Any) -> Any

Materialised design matrix for a numeric feature matrix.

predict_with_coverage ¶

predict_with_coverage(
    rows: Any, *, coverage: float = 0.95
) -> tuple[Any, Any, Any, dict[str, Any]]

Predict with covariance-based confidence intervals and group attribution.

Returns (point, lower, upper, per_group_variance_contributions). The first three entries are numpy arrays on the response-mean scale. The fourth entry is a covariance-block variance decomposition: per-group arrays contain x_g' Cov(beta_g, beta_g) x_g and cross-term arrays contain 2 x_g' Cov(beta_g, beta_h) x_h.
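
The decomposition is an algebraic identity: the per-group terms plus the cross terms add up to the full quadratic form x' Cov(beta) x. The sketch below checks this on a toy 3-coefficient example in plain Python. The group names, the `"g:h"` key format, and the numbers are invented for illustration and do not reflect the keys gamfit returns.

```python
def quad(x, cov, y):
    """Compute x' cov y for small dense lists."""
    return sum(
        x[i] * cov[i][j] * y[j] for i in range(len(x)) for j in range(len(y))
    )


def block(cov, rows, cols):
    """Extract the sub-block of cov with the given row and column indices."""
    return [[cov[i][j] for j in cols] for i in rows]


cov = [
    [0.40, 0.05, 0.01],
    [0.05, 0.30, 0.02],
    [0.01, 0.02, 0.20],
]
groups = {"intercept": [0], "smooth": [1, 2]}  # coefficient indices per group
x = [1.0, 0.5, -2.0]                           # one design row

names = list(groups)
contributions = {}
for a, g in enumerate(names):
    xg = [x[i] for i in groups[g]]
    contributions[g] = quad(xg, block(cov, groups[g], groups[g]), xg)
    for h in names[a + 1:]:
        xh = [x[i] for i in groups[h]]
        cross = quad(xg, block(cov, groups[g], groups[h]), xh)
        contributions[f"{g}:{h}"] = 2.0 * cross

total = quad(x, cov, x)  # equals the sum of all contributions
```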

difference_smooth ¶

difference_smooth(
    *,
    view: str,
    group: str | None = None,
    pairs: Sequence[tuple[Any, Any]] | None = None,
    n: int = 100,
    level: float = 0.95,
    simultaneous: bool = False,
    n_sim: int = 10000,
    seed: int | None = 12345,
    marginalise_random: bool = True,
    group_means: bool = True,
    data: Any | None = None,
    return_type: str | None = None,
) -> Any

Covariance-aware pairwise difference smooths.

Builds two model matrices on a grid, subtracts them, and uses the fitted joint coefficient covariance for pointwise bands. With simultaneous=True the band critical value is estimated from posterior coefficient simulation using the max standardized deviation across the whole grid.
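
The simultaneous critical value described above can be sketched as an empirical quantile of the maximum standardized deviation. This is a simplified illustration, not gamfit's code: the function name is hypothetical, the difference design is a toy straight line, and the coefficient covariance is taken to be the identity so that draws can be generated with the standard library.

```python
import math
import random


def simultaneous_critical_value(diff_rows, cov, draws, level=0.95):
    """Estimate the band multiplier from simulated coefficient deviations.

    diff_rows: rows of the difference design, one per grid point.
    cov: coefficient covariance used for the pointwise standard errors.
    draws: deviations from the estimate, assumed drawn from N(0, cov).
    """
    p = len(cov)
    se = [
        math.sqrt(sum(r[i] * cov[i][j] * r[j] for i in range(p) for j in range(p)))
        for r in diff_rows
    ]
    maxima = []
    for beta in draws:
        deviations = (sum(a * b for a, b in zip(r, beta)) for r in diff_rows)
        maxima.append(max(abs(d) / s for d, s in zip(deviations, se)))
    maxima.sort()
    # Empirical quantile of the max standardized deviation over the grid.
    return maxima[min(len(maxima) - 1, int(level * len(maxima)))]


rng = random.Random(12345)
grid = [-1.0 + 0.1 * i for i in range(21)]
diff_rows = [[1.0, x] for x in grid]  # toy difference design
cov = [[1.0, 0.0], [0.0, 1.0]]        # identity covariance for brevity
draws = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(2000)]
crit = simultaneous_critical_value(diff_rows, cov, draws)
```

Because the maximum is taken over the whole grid, the resulting multiplier is larger than the pointwise normal quantile, so the simultaneous band is wider than the pointwise one.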

save ¶

save(path: str | Path) -> None

Serialise the fitted model to path.

Writes a self-contained binary .gam file that :func:gamfit.load round-trips.

Examples:

>>> model.save("model.gam")
>>> loaded = gamfit.load("model.gam")

extend_with_group ¶

extend_with_group(
    new_group_spec: dict[str, Any],
    metadata: Any | None = None,
    prior: Any | None = None,
) -> "Model"

Return a no-refit model extended with deployment-time group levels.

new_group_spec currently targets an existing random-effect term: {"kind": "random-effect-level", "term": "group_term", "level": "new"} or {"term": "group_term", "levels": ["a", "b"]}. The returned model reuses the fitted coefficients and inserts zero-initialized coefficients, or prior["mean"] / prior["mu"] when supplied.

dumps ¶

dumps() -> bytes

Return the serialised model as raw bytes.

Useful for in-memory transport. :func:gamfit.loads is the inverse.

Examples:

>>> blob = model.dumps()
>>> loaded = gamfit.loads(blob)

diagnose ¶

diagnose(
    data: Any, *, y: str | None = None, interval: float | None = 0.95
) -> Diagnostics

Score the fitted model on held-out data.

Calls :meth:predict on the feature columns of data and compares the result against the observed response, packaging the prediction, residuals, observed values, and (when requested) Wald bands into a :class:Diagnostics object. Useful for ad-hoc held-out checks and for feeding the :meth:plot method.

Parameters:

Name	Type	Description	Default
`data`	`table-like`	Any table-like input accepted by :meth:`predict` that also carries the response column.	required
`y`	`str`	Name of the response column. Defaults to :attr:`response_name`; required when that cannot be inferred (e.g. survival formulas).	`None`
`interval`	`float or None`	Pointwise Wald-interval probability passed through to :meth:`predict`. Set to `None` to skip interval columns. Defaults to `0.95`.	`0.95`

Returns:

Type	Description
`Diagnostics`	A :class:`Diagnostics` record containing the formula, response name, observed values, the predicted table, and residuals.

Raises:

Type	Description
`ValueError`	If the response column cannot be inferred or is missing from `data`.

Examples:

>>> diag = model.diagnose(test_df)
>>> diag.rmse, diag.r_squared
(0.42, 0.81)
>>> diag.predicted["mean"][:3]
[1.04, 1.21, 0.99]

See Also

Model.predict
Model.plot

plot ¶

plot(
    data: Any,
    *,
    x: str | None = None,
    y: str | None = None,
    interval: float | None = 0.95,
    kind: str = "prediction",
    ax: Any | None = None,
) -> Any

Plot the model's behaviour on data with matplotlib.

Runs :meth:diagnose against data and then renders one of three standard diagnostic plots onto a matplotlib Axes.

Parameters:

Name	Type	Description	Default
`data`	`table-like`	Held-out data with the response column present (same requirements as :meth:`diagnose`).	required
`x`	`str`	Feature column to plot on the x-axis when `kind="prediction"`. Inferred automatically when there is exactly one non-response feature column.	`None`
`y`	`str`	Response column name. Defaults to :attr:`response_name`.	`None`
`interval`	`float or None`	Pointwise Wald-interval probability for the shaded band on prediction plots. Ignored for `residuals` and `observed_vs_predicted` plots. Defaults to `0.95`.	`0.95`
`kind`	`('prediction', 'residuals', 'observed_vs_predicted')`	`"prediction"` (default) — mean curve over `x` with a pointwise Wald band and observed scatter overlay. `"residuals"` — residuals vs predicted mean. `"observed_vs_predicted"` — observed vs predicted with a reference `y = x` line.	`"prediction"`
`ax`	`Axes`	Existing axes to draw onto. When omitted, a fresh `Axes` is created via `plt.subplots()`.	`None`

Returns:

Type	Description
`Axes`	The axes that were drawn on.

Raises:

Type	Description
`ValueError`	If `kind` is not one of the supported choices, or if `x` cannot be inferred for a multi-feature prediction plot, or if the named `x` column is missing from `data`.

Examples:

>>> model.plot(test_df)                       # prediction with band
>>> model.plot(test_df, kind="residuals")
>>> ax = model.plot(test_df, kind="observed_vs_predicted")
>>> ax.set_title("Calibration on held-out fold")

See Also

Model.diagnose
Model.predict

SurvivalPrediction `dataclass` ¶

SurvivalPrediction(
    model_class: str,
    parameters: Any,
    parameter_names: Sequence[str] = tuple(),
    times: Any | None = None,
    hazard: Any | None = None,
    survival: Any | None = None,
    cumulative_hazard: Any | None = None,
    linear_predictor: Any | None = None,
    id_column: str | None = None,
    row_ids: Sequence[str] | None = None,
    survival_se: Any | None = None,
    eta_se: Any | None = None,
)

Per-row survival functions evaluated on demand.

Returned by :meth:Model.predict for survival-family models. The *_at helpers (:meth:hazard_at, :meth:cumulative_hazard_at, :meth:survival_at, :meth:failure_at) evaluate the fitted hazard surface at any user-supplied time grid.

When the FFI produced a dense (n_samples, n_times) grid of hazard / survival / cumulative-hazard values, the *_at helpers linearly interpolate against that grid. Otherwise they fall back to the legacy plug-in piecewise-constant hazard reconstructed from parameters so bare-dataclass construction keeps working.
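
Linear interpolation against a dense grid can be sketched for a single row as follows. The helper name is hypothetical, and clamping queries outside the grid to the edge values is an assumption made for this illustration; the library's edge handling is not documented here.

```python
from bisect import bisect_right


def interpolate_row(grid_times, grid_values, query_times):
    """Linearly interpolate one row of a dense surface at query_times."""
    out = []
    for t in query_times:
        if t <= grid_times[0]:
            out.append(grid_values[0])   # assumption: clamp below the grid
        elif t >= grid_times[-1]:
            out.append(grid_values[-1])  # assumption: clamp above the grid
        else:
            hi = bisect_right(grid_times, t)
            lo = hi - 1
            w = (t - grid_times[lo]) / (grid_times[hi] - grid_times[lo])
            out.append((1.0 - w) * grid_values[lo] + w * grid_values[hi])
    return out


grid_times = [0.0, 1.0, 2.0, 4.0]
survival_row = [1.0, 0.8, 0.6, 0.4]
values = interpolate_row(grid_times, survival_row, [0.5, 3.0, 10.0])
```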

For very large queries (n_rows * n_times exceeds roughly one million cells), the *_at helpers internally evaluate the surface in blocks via the matching *_at_chunks generator and then assemble the dense result; callers that want to avoid the dense allocation can iterate the chunk generators directly or stream a CSV with :meth:write_survival_at_csv.
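
The blocking idea can be illustrated with a small generator that splits rows so each block stays near a cell budget. This is a sketch under stated assumptions: `iter_row_blocks` is a hypothetical name, and the one-million-cell budget simply echoes the rough threshold mentioned above rather than a documented constant.

```python
def iter_row_blocks(n_rows, n_times, max_cells=1_000_000):
    """Yield (start, stop) row ranges of roughly max_cells cells each."""
    # At least one row per block, even if a single row exceeds the budget.
    rows_per_block = max(1, max_cells // max(1, n_times))
    for start in range(0, n_rows, rows_per_block):
        yield start, min(start + rows_per_block, n_rows)


blocks = list(iter_row_blocks(n_rows=250_000, n_times=10))
print(blocks)  # [(0, 100000), (100000, 200000), (200000, 250000)]
```

A caller streaming results would evaluate the surface for one block at a time and write or reduce it before moving on, so the full dense array never has to exist in memory.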

Attributes:

Name	Type	Description
`model_class`	`str`	The fitted model class string (e.g. `"survival marginal-slope"`).
`parameters`	`ndarray`	Flat per-row parameters returned by the FFI. Shape `(n_samples, n_params_per_row)`. The exact column semantics depend on `model_class`; callers should treat this as opaque and prefer the `*_at` helpers.
`parameter_names`	`tuple of str`	Column names corresponding to `parameters`, in order.
`times`	`ndarray or None`	Shared 1-D time grid at which the hazard surfaces were evaluated.
`hazard`	`ndarray or None`	`(n_samples, len(times))` dense hazard surface from the FFI.
`survival`	`ndarray or None`	`(n_samples, len(times))` dense survival surface from the FFI.
`cumulative_hazard`	`ndarray or None`	`(n_samples, len(times))` dense cumulative-hazard surface from the FFI.
`linear_predictor`	`ndarray or None`	`(n_samples,)` per-row linear predictor at each row's own exit time.
`id_column`	`str or None`	Optional name of the id column carried through from :meth:`Model.predict` for use by :meth:`write_survival_at_csv`.
`row_ids`	`sequence of str or None`	Per-row identifiers aligned with `parameters` rows, populated when `id_column` was supplied to :meth:`Model.predict`.
`survival_se`	`ndarray or None`	`(n_samples, len(times))` delta-method standard errors on the survival surface (response scale). `None` unless the prediction was issued with `with_uncertainty=True`; then populated for location-scale survival models.
`eta_se`	`ndarray or None`	`(n_samples,)` delta-method SE on the linear predictor at each row's own exit time, under the same conditions as `survival_se`.

Examples:

>>> import numpy as np
>>> pred = model.predict(test_df)        # survival model
>>> times = np.linspace(0.0, 10.0, 50)
>>> S = pred.survival_at(times)          # (n_rows, 50) ndarray
>>> h = pred.hazard_at(times)
>>> H = pred.cumulative_hazard_at(times)

See Also

Model.predict : Returns a :class:SurvivalPrediction for survival models.

hazard_at ¶

hazard_at(times: Any) -> Any

Evaluate the hazard rate h(t) at each requested time.

When the FFI produced a dense hazard surface this linearly interpolates against the returned grid; otherwise the hazard is reconstructed from the cumulative-hazard differences. Large requests are evaluated in chunks internally before assembling the dense result.

Parameters:

Name	Type	Description	Default
`times`	`array_like`	1-D sequence of finite, non-negative times at which to evaluate the per-row hazard.	required

Returns:

Type	Description
`ndarray`	`(n_samples, len(times))` array of non-negative hazard values, one row per prediction sample.

Examples:

>>> import numpy as np
>>> pred = model.predict(test_df)
>>> h = pred.hazard_at(np.linspace(0.0, 5.0, 11))
>>> h.shape
(len(test_df), 11)
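
The two evaluation paths described above can be sketched with plain NumPy. This is only an illustration on a synthetic constant-hazard surface; the helper names `interp_rows` and `hazard_from_cumulative` are hypothetical and are not part of gamfit, and the library's actual interpolation and differencing details may differ.

```python
import numpy as np

def interp_rows(grid, surface, times):
    """Linearly interpolate each row of `surface` (n_rows, len(grid)) at `times`."""
    times = np.asarray(times, dtype=float)
    return np.stack([np.interp(times, grid, row) for row in surface])

def hazard_from_cumulative(grid, cum_hazard):
    """Finite-difference hazard: slope of H between consecutive grid points."""
    d_hazard = np.diff(cum_hazard, axis=1)
    d_time = np.diff(grid)
    # Clip so tiny negative differences cannot produce a negative hazard.
    return np.clip(d_hazard / d_time, 0.0, None)   # (n_rows, len(grid) - 1)

grid = np.linspace(0.0, 5.0, 6)                 # shared time grid 0, 1, ..., 5
rates = np.array([0.1, 0.3])                    # synthetic constant hazards, two rows
cum_hazard = rates[:, None] * grid[None, :]     # H(t) = rate * t

h_interval = hazard_from_cumulative(grid, cum_hazard)
H_query = interp_rows(grid, cum_hazard, [0.5, 2.25])
```

For a constant hazard the finite differences recover the rate on every interval, which makes this a convenient sanity check for either path.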

See Also

SurvivalPrediction.hazard_at_chunks : streaming chunked variant.
SurvivalPrediction.cumulative_hazard_at

cumulative_hazard_at ¶

cumulative_hazard_at(times: Any) -> Any

Evaluate the cumulative hazard H(t) = -log S(t).

When the FFI provided a dense cumulative-hazard surface this interpolates against it directly; otherwise H(t) is derived from :meth:survival_at via -log S(t) (clipped away from zero for numerical safety).

Parameters:

Name	Type	Description	Default
`times`	`array_like`	1-D sequence of finite, non-negative times.	required

Returns:

Type	Description
`ndarray`	`(n_samples, len(times))` array of non-negative cumulative hazard values.

Examples:

>>> import numpy as np
>>> H = pred.cumulative_hazard_at(np.array([1.0, 2.0, 5.0]))
>>> np.all(np.diff(H, axis=1) >= 0)   # monotone non-decreasing
True
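
A minimal sketch of the fallback identity H(t) = -log S(t) with clipping, assuming a hypothetical floor of `1e-300`; the floor gamfit actually uses is not stated here, and `cumulative_hazard_from_survival` is an illustrative name.

```python
import numpy as np

def cumulative_hazard_from_survival(survival, floor=1e-300):
    """Return H = -log(S), clipping S into [floor, 1] so the log stays finite."""
    clipped = np.clip(np.asarray(survival, dtype=float), floor, 1.0)
    return -np.log(clipped)

S = np.array([[1.0, 0.8, 0.5, 0.0]])            # last entry would give log(0) unclipped
H = cumulative_hazard_from_survival(S)
```

Without the clip, the `S = 0` cell would evaluate to `inf`; with it, the result stays finite and still non-decreasing along the time axis.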

See Also

SurvivalPrediction.survival_at
SurvivalPrediction.hazard_at

survival_at ¶

survival_at(times: Any) -> Any

Evaluate the survival probability S(t) at each requested time.

When the FFI produced a dense hazard/survival surface this linearly interpolates against the returned grid. Otherwise it falls back to the plug-in identity S(t) = exp(-H(t)) using a per-row piecewise-constant hazard derived from parameters, which also covers instances constructed directly as a bare dataclass. Large requests are evaluated in chunks internally before assembling the dense result.

Parameters:

Name	Type	Description	Default
`times`	`array_like`	1-D sequence of finite, non-negative times.	required

Returns:

Type	Description
`ndarray`	`(n_samples, len(times))` array of survival probabilities in `[0, 1]`.

Examples:

>>> import numpy as np
>>> times = np.linspace(0.0, 5.0, 6)
>>> S = pred.survival_at(times)
>>> S[:, 0]                  # S(0) is 1 for every row
array([1., 1., ..., 1.])
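
The plug-in identity S(t) = exp(-H(t)) for a piecewise-constant hazard can be worked through for a single row as follows. The break points, hazard levels, and the function name `survival_piecewise_constant` are made up for illustration; how gamfit derives the per-row hazard from `parameters` is not shown here.

```python
import numpy as np

def survival_piecewise_constant(breaks, hazards, times):
    """S(t) = exp(-H(t)) where the hazard is hazards[j] on [breaks[j], breaks[j+1])."""
    breaks = np.asarray(breaks, dtype=float)
    hazards = np.asarray(hazards, dtype=float)
    times = np.asarray(times, dtype=float)
    # Cumulative hazard at each break point, starting from H(breaks[0]) = 0.
    H_breaks = np.concatenate([[0.0], np.cumsum(hazards * np.diff(breaks))])
    # H is piecewise linear, so linear interpolation between breaks is exact.
    # np.interp holds the end values constant outside [breaks[0], breaks[-1]].
    H = np.interp(times, breaks, H_breaks)
    return np.exp(-H)

breaks = [0.0, 1.0, 3.0, 6.0]
hazards = [0.2, 0.1, 0.05]
S = survival_piecewise_constant(breaks, hazards, [0.0, 1.0, 2.0, 6.0])
```

Here H takes the values 0, 0.2, 0.3, and 0.55 at the four query times, so S(0) is exactly 1 and S decreases from there.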

See Also

SurvivalPrediction.failure_at : returns 1 - S(t).
SurvivalPrediction.survival_se_at : delta-method standard error.
SurvivalPrediction.survival_at_chunks : streaming chunked variant.

failure_at ¶

failure_at(times: Any) -> Any

Evaluate the failure (event) probability F(t) = 1 - S(t).

Convenience wrapper around :meth:survival_at; the output is clipped to [0, 1] to guard against tiny interpolation excursions.

Parameters:

Name	Type	Description	Default
`times`	`array_like`	1-D sequence of finite, non-negative times.	required

Returns:

Type	Description
`ndarray`	`(n_samples, len(times))` array of failure probabilities in `[0, 1]`.

Examples:

>>> F = pred.failure_at([1.0, 5.0, 10.0])
>>> F.shape[1]
3

See Also

SurvivalPrediction.survival_at

survival_se_at ¶

survival_se_at(times: Any) -> Any

Delta-method standard error on S(t) at each requested time.

Returns None when the prediction was not issued with with_uncertainty=True (or the model class does not yet support response-scale uncertainty). When available, the returned array has shape (n_samples, len(times)) and is clipped to be non-negative.

Parameters:

Name	Type	Description	Default
`times`	`array_like`	1-D sequence of finite, non-negative times.	required

Returns:

Type	Description
`ndarray or None`	`(n_samples, len(times))` array of standard errors on the survival surface, or `None` if no uncertainty was requested.

Notes

Pair with :meth:survival_at for response-scale Wald-style bands: S +/- z * SE with the standard caveats around the Gaussian approximation near the [0, 1] boundaries.

Examples:

>>> pred = model.predict(test_df, with_uncertainty=True)
>>> S = pred.survival_at([1.0, 2.0])
>>> SE = pred.survival_se_at([1.0, 2.0])
>>> lower = (S - 1.96 * SE).clip(0.0, 1.0)

See Also

SurvivalPrediction.survival_at
Model.predict : pass with_uncertainty=True to populate this.

survival_at_chunks ¶

survival_at_chunks(
    times: Any,
    *,
    people_chunk: int = DEFAULT_SURVIVAL_PEOPLE_CHUNK,
    time_grid_chunk: int = DEFAULT_SURVIVAL_TIME_GRID_CHUNK,
) -> Any

Yield S(t) evaluations in row/time blocks.

Streaming counterpart to :meth:survival_at for queries large enough that the dense (n_samples, len(times)) allocation is unwelcome. Each yielded block can be consumed (written to disk, reduced, fed into a metric) and discarded before the next one is produced.

Parameters:

Name	Type	Description	Default
`times`	`array_like`	1-D sequence of finite, non-negative times.	required
`people_chunk`	`int`	Maximum number of rows per yielded block. Defaults to `DEFAULT_SURVIVAL_PEOPLE_CHUNK` (50 000).	`DEFAULT_SURVIVAL_PEOPLE_CHUNK`
`time_grid_chunk`	`int`	Maximum number of time points per yielded block. Defaults to `DEFAULT_SURVIVAL_TIME_GRID_CHUNK` (64).	`DEFAULT_SURVIVAL_TIME_GRID_CHUNK`

Yields:

Type	Description
`tuple of (slice, slice, ndarray)`	`(row_slice, time_slice, block)` where `block` has shape `(row_slice.stop - row_slice.start, time_slice.stop - time_slice.start)` and the slices index into the full `(n_samples, len(times))` result.

Examples:

>>> import numpy as np
>>> times = np.linspace(0.0, 10.0, 200)
>>> total = 0.0
>>> for _r, _t, block in pred.survival_at_chunks(times):
...     total += float(block.sum())
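
To make the `(row_slice, time_slice, block)` contract concrete, here is a stand-alone sketch of a block generator and of how the slices reassemble the dense result. The toy exponential model and the names `iter_blocks` and `survival_blocks` are assumptions for illustration only; the iteration order gamfit uses is not specified here.

```python
import numpy as np

def iter_blocks(n_rows, n_times, row_chunk, time_chunk):
    """Yield (row_slice, time_slice) pairs that tile an (n_rows, n_times) array."""
    for r0 in range(0, n_rows, row_chunk):
        for t0 in range(0, n_times, time_chunk):
            yield (slice(r0, min(r0 + row_chunk, n_rows)),
                   slice(t0, min(t0 + time_chunk, n_times)))

def survival_blocks(rates, times, row_chunk=2, time_chunk=3):
    """Stream S(t) = exp(-rate * t) block by block (toy exponential model)."""
    for rows, cols in iter_blocks(len(rates), len(times), row_chunk, time_chunk):
        yield rows, cols, np.exp(-np.outer(rates[rows], times[cols]))

rates = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
times = np.linspace(0.0, 4.0, 7)

dense = np.empty((len(rates), len(times)))
for rows, cols, block in survival_blocks(rates, times):
    dense[rows, cols] = block       # the slices index into the full result
```

A streaming consumer would skip the `dense` allocation entirely and reduce or write each block as it arrives.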

See Also

SurvivalPrediction.survival_at
SurvivalPrediction.write_survival_at_csv

cumulative_hazard_at_chunks ¶

cumulative_hazard_at_chunks(
    times: Any,
    *,
    people_chunk: int = DEFAULT_SURVIVAL_PEOPLE_CHUNK,
    time_grid_chunk: int = DEFAULT_SURVIVAL_TIME_GRID_CHUNK,
) -> Any

Yield H(t) evaluations in row/time blocks.

Streaming counterpart to :meth:cumulative_hazard_at. When the FFI provided a dense cumulative-hazard surface this iterates that surface directly; otherwise it derives H(t) from each survival block returned by :meth:survival_at_chunks.

Parameters:

Name	Type	Description	Default
`times`	`array_like`	1-D sequence of finite, non-negative times.	required
`people_chunk`	`int`	Maximum number of rows per yielded block. Defaults to `DEFAULT_SURVIVAL_PEOPLE_CHUNK`.	`DEFAULT_SURVIVAL_PEOPLE_CHUNK`
`time_grid_chunk`	`int`	Maximum number of time points per yielded block. Defaults to `DEFAULT_SURVIVAL_TIME_GRID_CHUNK`.	`DEFAULT_SURVIVAL_TIME_GRID_CHUNK`

Yields:

Type	Description
`tuple of (slice, slice, ndarray)`	`(row_slice, time_slice, block)` of cumulative-hazard values with shape matching the slice extents.

Examples:

>>> with open("cumulative_hazard.bin", "wb") as handle:   # hypothetical output path
...     for _r, _t, H_block in pred.cumulative_hazard_at_chunks(times):
...         _ = handle.write(H_block.tobytes())

See Also

SurvivalPrediction.cumulative_hazard_at
SurvivalPrediction.survival_at_chunks

hazard_at_chunks ¶

hazard_at_chunks(
    times: Any,
    *,
    people_chunk: int = DEFAULT_SURVIVAL_PEOPLE_CHUNK,
    time_grid_chunk: int = DEFAULT_SURVIVAL_TIME_GRID_CHUNK,
) -> Any

Yield h(t) evaluations in row/time blocks.

Streaming counterpart to :meth:hazard_at. When the FFI provided a dense hazard surface this iterates that surface directly; otherwise the hazard is derived from successive cumulative-hazard blocks, carrying the previous block's tail forward so the finite-difference at each block boundary stays consistent with the non-chunked :meth:hazard_at result.

Parameters:

Name	Type	Description	Default
`times`	`array_like`	1-D sequence of finite, non-negative times.	required
`people_chunk`	`int`	Maximum number of rows per yielded block. Defaults to `DEFAULT_SURVIVAL_PEOPLE_CHUNK`.	`DEFAULT_SURVIVAL_PEOPLE_CHUNK`
`time_grid_chunk`	`int`	Maximum number of time points per yielded block. Defaults to `DEFAULT_SURVIVAL_TIME_GRID_CHUNK`.	`DEFAULT_SURVIVAL_TIME_GRID_CHUNK`

Yields:

Type	Description
`tuple of (slice, slice, ndarray)`	`(row_slice, time_slice, block)` of non-negative hazard values with shape matching the slice extents.

Examples:

>>> peak = 0.0
>>> for _r, _t, h_block in pred.hazard_at_chunks(times):
...     peak = max(peak, float(h_block.max()))
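
The idea of carrying the previous block's tail forward can be illustrated with a small NumPy sketch. It assumes strictly positive, increasing times and treats `t = 0, H = 0` as the implicit starting point; both conventions, and the names `hazard_dense` and `hazard_chunked`, are assumptions made for this example rather than a description of gamfit's internals.

```python
import numpy as np

def hazard_dense(times, H):
    """Backward difference of H along time, with (t=0, H=0) as the origin."""
    t_prev = np.concatenate([[0.0], times[:-1]])
    H_prev = np.column_stack([np.zeros(H.shape[0]), H[:, :-1]])
    return (H - H_prev) / (times - t_prev)

def hazard_chunked(times, H, time_chunk):
    """Same difference, one time block at a time, carrying the last column forward."""
    carry_t = 0.0
    carry_H = np.zeros(H.shape[0])
    for start in range(0, len(times), time_chunk):
        cols = slice(start, min(start + time_chunk, len(times)))
        t_blk, H_blk = times[cols], H[:, cols]
        t_prev = np.concatenate([[carry_t], t_blk[:-1]])
        H_prev = np.column_stack([carry_H, H_blk[:, :-1]])
        yield cols, (H_blk - H_prev) / (t_blk - t_prev)
        carry_t, carry_H = t_blk[-1], H_blk[:, -1]   # tail for the next block

times = np.linspace(0.5, 5.0, 10)
H = np.outer([0.1, 0.2], times ** 2)              # synthetic cumulative hazard

assembled = np.empty_like(H)
for cols, h_block in hazard_chunked(times, H, time_chunk=4):
    assembled[:, cols] = h_block
reference = hazard_dense(times, H)
```

Without the carried tail, the first column of every block after the first would have no predecessor to difference against, and the chunked result would disagree with the dense one at block boundaries.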

See Also

SurvivalPrediction.hazard_at
SurvivalPrediction.cumulative_hazard_at_chunks

write_survival_at_csv ¶

write_survival_at_csv(
    path: str | Path,
    times: Any,
    *,
    people_chunk: int = DEFAULT_SURVIVAL_PEOPLE_CHUNK,
    time_grid_chunk: int = DEFAULT_SURVIVAL_TIME_GRID_CHUNK,
) -> str

Stream survival predictions to a CSV file.

Iterates :meth:survival_at_chunks and writes one row per (prediction_row, time) pair, avoiding materialising the full (n_samples, len(times)) matrix in memory. When the prediction was issued with an id_column (via :meth:Model.predict), that column is included.

Parameters:

Name	Type	Description	Default
`path`	`str or Path`	Destination CSV file. Overwritten if it already exists.	required
`times`	`array_like`	1-D sequence of finite, non-negative times at which to evaluate `S(t)`.	required
`people_chunk`	`int`	Maximum number of rows per internal block. Defaults to `DEFAULT_SURVIVAL_PEOPLE_CHUNK`.	`DEFAULT_SURVIVAL_PEOPLE_CHUNK`
`time_grid_chunk`	`int`	Maximum number of time points per internal block. Defaults to `DEFAULT_SURVIVAL_TIME_GRID_CHUNK`.	`DEFAULT_SURVIVAL_TIME_GRID_CHUNK`

Returns:

Type	Description
`str`	The string form of `path`.

Notes

Columns written are row, time, survival (or row, <id_column>, time, survival when an id column is present). The file is opened in text mode with UTF-8 encoding.

Examples:

>>> import numpy as np
>>> pred = model.predict(test_df, id_column="patient_id")
>>> pred.write_survival_at_csv(
...     "survival.csv", np.linspace(0.0, 10.0, 64)
... )
'survival.csv'
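
A rough sketch of the long-format layout described in the notes, written with the standard `csv` module against an in-memory buffer. The function `write_long_csv` and the hand-built blocks are hypothetical; the row ordering and number formatting of the real file may differ.

```python
import csv
import io

import numpy as np

def write_long_csv(handle, blocks, times, row_ids=None, id_column="id"):
    """Write one CSV row per (prediction_row, time) pair from streamed blocks."""
    writer = csv.writer(handle)
    header = ["row", "time", "survival"]
    if row_ids is not None:
        header.insert(1, id_column)              # row, <id_column>, time, survival
    writer.writerow(header)
    for rows, cols, block in blocks:
        for i, r in enumerate(range(rows.start, rows.stop)):
            for j, c in enumerate(range(cols.start, cols.stop)):
                record = [r, float(times[c]), float(block[i, j])]
                if row_ids is not None:
                    record.insert(1, row_ids[r])
                writer.writerow(record)

times = np.array([1.0, 2.0])
blocks = [
    (slice(0, 1), slice(0, 2), np.array([[0.9, 0.8]])),
    (slice(1, 2), slice(0, 2), np.array([[0.7, 0.5]])),
]
buffer = io.StringIO()
write_long_csv(buffer, blocks, times, row_ids=["a", "b"], id_column="patient_id")
lines = buffer.getvalue().splitlines()
```

Only one block is held in memory at a time, which is the point of streaming to disk instead of building the dense matrix first.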

See Also

SurvivalPrediction.survival_at_chunks

CompetingRisksPrediction `dataclass` ¶

CompetingRisksPrediction(
    model_class: str,
    likelihood_mode: str,
    endpoint_names: tuple[str, ...],
    times: Any,
    hazard: Any,
    survival: Any,
    cumulative_hazard: Any,
    cif: Any,
    overall_survival: Any,
    linear_predictor: Any,
    columns: dict[str, list[float]],
)

Rust-computed joint cause-specific competing-risks prediction.

CompetingRisksCIF `dataclass` ¶

CompetingRisksCIF(
    times: Any,
    cif: Any,
    overall_survival: Any,
    cumulative_hazard: Any,
    endpoint_names: tuple[str, ...],
)

Cause-specific cumulative incidence assembled by the Rust core.

Posterior sampling¶

SamplingConfig `dataclass` ¶

SamplingConfig(
    n_samples: int,
    n_warmup: int,
    n_chains: int,
    target_accept: float,
    seed: int,
)

Echo of the NUTS configuration the engine ran with.

All fields are populated from the FFI payload so callers can reconstruct exactly which sampler invocation produced the draws — useful for reproducibility logs and for telling whether an explicit samples=... request was honored or auto-derived from the model dimension.

Attributes:

Name	Type	Description
`n_samples`	`int`	Post-warmup draws kept per chain.
`n_warmup`	`int`	Warmup draws discarded per chain before collecting `n_samples`.
`n_chains`	`int`	Number of independent NUTS chains run by the engine.
`target_accept`	`float`	Step-size adaptation target acceptance probability in `(0, 1)`.
`seed`	`int`	RNG seed actually consumed by the sampler.

Examples:

>>> post = model.sample(samples=500)
>>> post.config.n_samples
500
>>> post.config.target_accept
0.95

to_dict ¶

to_dict() -> dict[str, Any]

Return the config as a plain JSON-serialisable dict.

Returns:

Type	Description
`dict[str, Any]`	Mapping with keys `n_samples`, `n_warmup`, `n_chains`, `target_accept`, `seed`.

Examples:

>>> cfg = SamplingConfig(500, 1000, 4, 0.95, 42)
>>> cfg.to_dict()["n_chains"]
4
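
For reproducibility logs, a plain dict like this can be passed straight to `json.dumps`. The sketch below uses a stand-in dataclass, `SamplingConfigSketch`, with the same five fields; it is not the gamfit class, and whether the real class is frozen or builds its dict via `dataclasses.asdict` is an assumption.

```python
import json
from dataclasses import asdict, dataclass

@dataclass(frozen=True)
class SamplingConfigSketch:
    """Stand-in with the documented fields; not the gamfit implementation."""
    n_samples: int
    n_warmup: int
    n_chains: int
    target_accept: float
    seed: int

    def to_dict(self):
        return asdict(self)

cfg = SamplingConfigSketch(500, 1000, 4, 0.95, 42)
encoded = json.dumps(cfg.to_dict(), sort_keys=True)     # log line for a run record
restored = SamplingConfigSketch(**json.loads(encoded))  # rebuild from the log
```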

PosteriorSamples `dataclass` ¶

PosteriorSamples(
    samples: Any,
    coefficient_names: tuple[str, ...],
    mean: Any,
    std: Any,
    rhat: float,
    ess: float,
    converged: bool,
    method: str,
    model_class: str,
    family_kind: str,
    config: SamplingConfig,
    _model_bytes: bytes = _NO_MODEL,
    _name_index: Mapping[str, int] = dict(),
)

Posterior draws over the model's coefficient vector.

Returned by :meth:gamfit.Model.sample. This is the user-facing surface for posterior reasoning: a numpy-first container with named-column subscripting, credible-interval helpers, posterior predictive utilities, :meth:save / :meth:load round-trip, trace plotting, a concise :meth:__repr__, and a notebook-friendly rich-HTML representation (_repr_html_) that delegates to :meth:summary.

Attributes:

Name	Type	Description
`samples`	`ndarray`	`(n_draws, n_coeffs)` numpy array of draws. `n_draws` is `n_chains * n_samples` (warmup is already discarded by the engine).
`coefficient_names`	`tuple[str, ...]`	Column labels for `samples`. Currently the FFI emits `("beta_0", "beta_1", ...)`; future releases may carry the same names the fitted model exposes via :class:`Summary`.
`mean`	`ndarray`	Per-coefficient posterior mean reported by the sampler.
`std`	`ndarray`	Per-coefficient posterior standard deviation reported by the sampler.
`rhat`	`float`	Maximum split-Rhat across coefficients (exact NUTS only; `1.0` exactly for Laplace iid draws).
`ess`	`float`	Minimum effective sample size across coefficients.
`converged`	`bool`	Boolean convenience for `rhat < 1.1`.
`method`	`str`	`"nuts"` for exact NUTS, `"laplace"` for the Gaussian Laplace approximation around the fitted joint mode.
`model_class`	`str`	Saved-model predictive class string the draws came from.
`family_kind`	`str`	Inverse-link tag (`"identity"`, `"logit"`, `"probit"`, `"cloglog"`, `"log"`, ...). Used by :meth:`predict` to push draws through the correct inverse link.
`config`	`SamplingConfig`	:class:`SamplingConfig` recording the chain count, warmup, `target_accept`, and seed actually used.

Examples:

>>> post = model.sample(samples=1000, warmup=1000, chains=4)
>>> post.n_draws, post.n_coeffs
(4000, 12)
>>> post["beta_1"].mean()
0.342
>>> bands = post.predict(new_data, level=0.9)
>>> post.save("posterior.npz")
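
For readers unfamiliar with the `rhat` and `converged` attributes, the following sketch computes one common textbook form of split-Rhat for a single coefficient and applies the `rhat < 1.1` rule quoted above. The exact variant the Rust engine implements is not documented here, so treat `split_rhat` as an illustrative assumption; the reported `rhat` is the maximum of such values across coefficients.

```python
import numpy as np

def split_rhat(chains):
    """Split-Rhat for one coefficient; `chains` has shape (n_chains, n_iter)."""
    half = chains.shape[1] // 2
    # Split every chain in two so within-chain drift shows up as between-chain variance.
    split = np.concatenate([chains[:, :half], chains[:, half:2 * half]], axis=0)
    n = split.shape[1]
    within = split.var(axis=1, ddof=1).mean()
    between = n * split.mean(axis=1).var(ddof=1)
    pooled = (n - 1) / n * within + between / n
    return float(np.sqrt(pooled / within))

rng = np.random.default_rng(0)
mixed = rng.normal(size=(4, 1000))            # four chains sampling the same target
stuck = mixed + np.arange(4)[:, None]         # chains centred at 0, 1, 2, 3

rhat_mixed = split_rhat(mixed)
rhat_stuck = split_rhat(stuck)
converged = rhat_mixed < 1.1                  # the threshold quoted for `converged`
```

Chains that agree give a value close to 1, while chains stuck around different centres push the statistic well above the threshold.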

n_draws `property` ¶

n_draws: int

Total number of post-warmup draws across all chains.

Returns:

Type	Description
`int`	`n_chains * n_samples`; the leading axis length of :attr:`samples`.

Examples:

>>> post.n_draws
4000

n_coeffs `property` ¶

n_coeffs: int

Number of model coefficients (columns of :attr:samples).

Returns:

Type	Description
`int`	Trailing axis length of :attr:`samples`.

Examples:

>>> post.n_coeffs
12

shape `property` ¶

shape: tuple[int, int]

Shape of the underlying draws matrix.

Returns:

Type	Description
`tuple[int, int]`	`(n_draws, n_coeffs)`.

Examples:

>>> post.shape
(4000, 12)

is_exact `property` ¶

is_exact: bool

Whether the draws are exact NUTS rather than Laplace iid.

Returns:

Type	Description
`bool`	`True` if :attr:`method` is `"nuts"`, `False` for `"laplace"` (the Gaussian Laplace approximation).

Examples:

>>> post = model.sample(samples=1000)
>>> post.is_exact
True

from_ffi_payload `classmethod` ¶

from_ffi_payload(
    payload: Mapping[str, Any], *, model_bytes: bytes = _NO_MODEL
) -> "PosteriorSamples"

Internal factory: build a :class:PosteriorSamples from the FFI payload.

Used by :meth:gamfit.Model.sample to wrap the dict produced by the Rust sampler. End users should not call this directly.

Parameters:

Name	Type	Description	Default
`payload`	`Mapping[str, Any]`	Decoded FFI JSON payload. Must contain `n_draws`, `n_coeffs`, `samples_flat` (row-major), `rhat`, `ess`, `converged` and may contain `coefficient_names`, `posterior_mean`, `posterior_std`, `method`, `model_class`, `family_kind`, and `config`.	required
`model_bytes`	`bytes`	Saved-model byte blob to bundle so downstream methods like :meth:`predict` work without the user re-passing the model.	`_NO_MODEL`

Returns:

Type	Description
`PosteriorSamples`	Reified posterior with samples reshaped to `(n_draws, n_coeffs)`.

Notes

samples_flat is sent flat (row-major) so we round-trip through numpy.reshape once. Building a nested list of lists from JSON would otherwise dominate decode time for biobank-scale draws.
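
As a small illustration of the row-major reshape, the snippet below decodes a toy payload containing only the three keys needed for the reshape. The values are invented; a real payload carries the additional keys listed in the parameter table.

```python
import json

import numpy as np

raw = '{"n_draws": 3, "n_coeffs": 2, "samples_flat": [0.1, 1.1, 0.2, 1.2, 0.3, 1.3]}'
payload = json.loads(raw)

# One flat list in, one reshape out: row i holds draw i, column j holds coefficient j.
samples = np.asarray(payload["samples_flat"], dtype=float).reshape(
    payload["n_draws"], payload["n_coeffs"]
)
```

Row-major order means consecutive entries of `samples_flat` fill a draw's coefficients before moving to the next draw.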

from_ffi_json `classmethod` ¶

from_ffi_json(
    raw: str, *, model_bytes: bytes = _NO_MODEL
) -> "PosteriorSamples"

Internal factory: build a :class:PosteriorSamples from a raw FFI JSON string.

Thin convenience around :meth:from_ffi_payload that decodes the JSON itself. Used by :meth:gamfit.Model.sample; not intended as a public API.

Parameters:

Name	Type	Description	Default
`raw`	`str`	JSON-encoded FFI payload from the Rust sampler.	required
`model_bytes`	`bytes`	Saved-model byte blob bundled into the returned object.	`_NO_MODEL`

Returns:

Type	Description
`PosteriorSamples`	Same as :meth:`from_ffi_payload`.

to_numpy ¶

to_numpy() -> Any

Return the raw draws as a numpy array.

Returns:

Type	Description
`ndarray`	`(n_draws, n_coeffs)` view of :attr:`samples` (not a copy).

Examples:

>>> arr = post.to_numpy()
>>> arr.shape
(4000, 12)

to_pandas ¶

to_pandas() -> Any

Return draws as a pandas DataFrame with named coefficient columns.

Returns:

Type	Description
`DataFrame`	`(n_draws, n_coeffs)` DataFrame whose columns are :attr:`coefficient_names`.

Examples:

>>> df = post.to_pandas()
>>> df.columns.tolist()[:2]
['beta_0', 'beta_1']
>>> df["beta_1"].mean()
0.342

interval ¶

interval(level: float = 0.95) -> Any

Equal-tailed credible interval for each coefficient.

Parameters:

Name	Type	Description	Default
`level`	`float`	Coverage probability in `(0, 1)`. Default `0.95`.	`0.95`

Returns:

Type	Description
`ndarray`	`(n_coeffs, 2)` array of `(lower, upper)` bounds at the requested coverage.

Raises:

Type	Description
`ValueError`	If `level` is not strictly between 0 and 1.

Examples:

>>> ci = post.interval(level=0.9)
>>> ci.shape
(12, 2)
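
An equal-tailed interval at coverage `level` takes the `(1 - level) / 2` and `1 - (1 - level) / 2` quantiles of each column of draws. The sketch below shows this with `numpy.quantile` on synthetic draws; `equal_tailed_interval` is an illustrative name, and gamfit's quantile interpolation method may differ.

```python
import numpy as np

def equal_tailed_interval(samples, level=0.95):
    """Return an (n_coeffs, 2) array of (lower, upper) quantile bounds."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must be strictly between 0 and 1")
    alpha = (1.0 - level) / 2.0
    lower, upper = np.quantile(samples, [alpha, 1.0 - alpha], axis=0)
    return np.column_stack([lower, upper])

rng = np.random.default_rng(0)
samples = rng.normal(loc=[0.0, 2.0, -1.0], scale=1.0, size=(4000, 3))  # synthetic draws
ci = equal_tailed_interval(samples, level=0.9)
```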

summary ¶

summary(level: float = 0.95) -> Summary

Per-coefficient posterior summary as a :class:Summary.

Parameters:

Name	Type	Description	Default
`level`	`float`	Coverage probability for the credible interval columns, in `(0, 1)`. Default `0.95`.	`0.95`

Returns:

Type	Description
`Summary`	Coefficient rows (`index`, `name`, `estimate`, `std_error`, `ci_lower`, `ci_upper`) plus top-level convergence diagnostics (`rhat`, `ess`, `converged`), sampler `method`, and the :class:`SamplingConfig` echo. Renders nicely in a notebook via :class:`Summary` HTML.

Notes

The payload mirrors what :meth:gamfit.Model.summary returns for fitted models, so downstream rendering helpers work uniformly on both fitted and sampled views.

Examples:

>>> post.summary(level=0.95)
Summary(method='nuts', n_coeffs=12, rhat=1.0021, converged=True)

predict ¶

predict(
    new_data: Any, *, chunk_size: int | None = 4096, level: float = 0.95
) -> dict[str, Any]

Posterior credible bands for eta and E[y | x] on new data.

Parameters:

Name	Type	Description	Default
`new_data`	`Any`	Tabular new data (DataFrame, dict of columns, or any object accepted by the engine's table normaliser) at which to evaluate the posterior fitted means.	required
`chunk_size`	`int or None`	Number of prediction rows processed at once. Default `4096`. Pass `None` to disable chunking and form the full `(n_draws, n_rows)` matrix (consider :meth:`predict_draws` instead in that case).	`4096`
`level`	`float`	Coverage probability for the credible bands in `(0, 1)`. Default `0.95`.	`0.95`

Returns:

Type	Description
`dict[str, ndarray]`	Six length-`n_rows` arrays: `eta_mean`, `eta_lower`, `eta_upper` (link scale) and `mean`, `mean_lower`, `mean_upper` (response scale, inverse link applied).

Raises:

Type	Description
`RuntimeError`	If this :class:`PosteriorSamples` was loaded from disk without a model context.
`NotImplementedError`	For model classes lacking a closed-form design matrix (e.g. link-wiggle, survival) — use :meth:`gamfit.Model.predict` instead.

Notes

Walks chunks of rows through draws @ X.T and reduces each chunk to quantiles immediately, so memory stays bounded at roughly n_draws * chunk_size floats regardless of the prediction-set size. For Laplace-method posteriors the returned bands match what model.predict(new_data, interval=level) produces analytically, up to Monte Carlo error.

Examples:

>>> bands = post.predict(new_data, level=0.9)
>>> bands["mean_lower"].shape
(50,)
>>> bands["mean_upper"][0]
0.812
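
The memory-bounded strategy described in the notes can be sketched as follows, restricted to the link scale. The design matrix `X` is random here because building it from `new_data` is engine-specific; `chunked_eta_bands` is a hypothetical helper, not gamfit's implementation.

```python
import numpy as np

def chunked_eta_bands(draws, X, level=0.95, chunk_size=4096):
    """Link-scale bands from (n_draws, n_coeffs) draws and an (n_rows, n_coeffs) design."""
    alpha = (1.0 - level) / 2.0
    n_rows = X.shape[0]
    step = max(1, n_rows) if chunk_size is None else chunk_size
    bands = {key: np.empty(n_rows) for key in ("eta_mean", "eta_lower", "eta_upper")}
    for start in range(0, n_rows, step):
        rows = slice(start, min(start + step, n_rows))
        eta = draws @ X[rows].T                  # (n_draws, rows in this chunk)
        lower, upper = np.quantile(eta, [alpha, 1.0 - alpha], axis=0)
        bands["eta_mean"][rows] = eta.mean(axis=0)
        bands["eta_lower"][rows] = lower
        bands["eta_upper"][rows] = upper         # the chunk's eta is discarded after this
    return bands

rng = np.random.default_rng(0)
draws = rng.normal(size=(500, 3))                # synthetic posterior draws
X = rng.normal(size=(50, 3))                     # synthetic design matrix

streamed = chunked_eta_bands(draws, X, level=0.9, chunk_size=7)
dense = chunked_eta_bands(draws, X, level=0.9, chunk_size=None)
```

Because each row's quantiles depend only on that row's column of `eta`, the chunked and unchunked reductions agree; only the peak memory differs.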

predict_draws ¶

predict_draws(new_data: Any) -> PosteriorPredictive

Full posterior fitted-mean draws on new data.

Parameters:

Name	Type	Description	Default
`new_data`	`Any`	Tabular new data (DataFrame, dict of columns, or any object accepted by the engine's table normaliser).	required

Returns:

Type	Description
`PosteriorPredictive`	Container whose :attr:`PosteriorPredictive.eta` and :attr:`PosteriorPredictive.mean` are `(n_draws, n_rows)` matrices on the link and response scales respectively.

Raises:

Type	Description
`RuntimeError`	If this :class:`PosteriorSamples` was loaded from disk without a model context.

Notes

Materialises the full (n_draws, n_rows) matrix in memory. For very large prediction sets prefer :meth:predict, which streams per-row credible bands chunk-by-chunk.

Examples:

>>> pp = post.predict_draws(new_data)
>>> pp.shape
(4000, 50)
>>> pp.mean.std(axis=0).mean()
0.087

save ¶

save(path: str | Path) -> str

Save the posterior to an .npz archive.

Parameters:

Name	Type	Description	Default
`path`	`str or Path`	Destination `.npz` file path.	required

Returns:

Type	Description
`str`	String form of the resolved output path.

Notes

The archive carries the full (n_draws, n_coeffs) samples matrix, the per-coefficient mean and std, convergence diagnostics, method / class / family tags, the :class:SamplingConfig, and the saved model bytes (so :meth:predict continues to work after a round-trip via :meth:load).

Examples:

>>> post.save("posterior.npz")
'posterior.npz'
>>> reloaded = PosteriorSamples.load("posterior.npz")

load `classmethod` ¶

load(path: str | Path) -> 'PosteriorSamples'

Load a :class:PosteriorSamples from an .npz archive.

Parameters:

Name	Type	Description	Default
`path`	`str or Path`	Path to an archive previously written by :meth:`save`.	required

Returns:

Type	Description
`PosteriorSamples`	Reconstructed posterior, including bundled model bytes so :meth:`predict` keeps working.

Notes

The archive uses allow_pickle=True to round-trip the JSON metadata stored as a 0-d object array; only load archives you produced via :meth:save.

Examples:

>>> post.save("posterior.npz")
'posterior.npz'
>>> reloaded = PosteriorSamples.load("posterior.npz")
>>> reloaded.n_draws == post.n_draws
True

plot_trace ¶

plot_trace(
    *, coefficients: Any = None, max_panels: int = 8, ax: Any = None
) -> Any

Matplotlib trace + marginal-density plot.

Parameters:

Name	Type	Description	Default
`coefficients`	`None, str, int, or iterable of str/int`	Coefficients to plot. If `None`, auto-selects the first `max_panels` coefficients. A single name or integer index plots one panel row; an iterable plots one row per element.	`None`
`max_panels`	`int`	Cap on the number of panel rows when `coefficients` is `None`. Default `8`.	`8`
`ax`	`numpy.ndarray of matplotlib Axes`	Pre-existing 2-D axes array of shape `(n_panels, 2)`. If `None`, a fresh `(n_panels, 2)` figure is created.	`None`

Returns:

Type	Description
`Figure`	The figure containing the trace and density panels.

Raises:

Type	Description
`ValueError`	If the resolved coefficient selection is empty.

Notes

Each row has two panels: trace (draws vs iteration index) on the left and a marginal density histogram on the right.

Examples:

>>> fig = post.plot_trace()
>>> fig = post.plot_trace(coefficients=["beta_0", "beta_1"])
>>> fig.savefig("trace.png")

PairedPosteriorSamples `dataclass` ¶

PairedPosteriorSamples(
    target: "PosteriorSamples", competing: "PosteriorSamples"
)

Posterior samples from two linked fits with draw rows paired by index.

cumulative_incidence ¶

cumulative_incidence(
    new_data: Any, times: Any, *, level: float = 0.95
) -> CumulativeIncidenceDraws

Compute target-cause CIF draws using paired target/competing rows.

PosteriorPredictive `dataclass` ¶

PosteriorPredictive(eta: Any, mean: Any, family_kind: str, model_class: str)

Per-row posterior fitted-mean draws on the link and response scales.

Returned by :meth:PosteriorSamples.predict_draws, this container holds the full (n_draws, n_rows) matrices of fitted-mean draws on both the linear-predictor (eta) and response (mean) scales, along with link/class metadata used to re-apply the inverse link on demand.

Attributes:

Name	Type	Description
`eta`	`ndarray`	`(n_draws, n_rows)` float matrix of draws on the link scale.
`mean`	`ndarray`	`(n_draws, n_rows)` float matrix of draws pushed through the model's inverse link (mean response scale).
`family_kind`	`str`	Inverse-link tag emitted by the engine (`"identity"`, `"logit"`, `"probit"`, `"cloglog"`, `"log"`, ...).
`model_class`	`str`	Saved-model predictive class string the underlying :class:`PosteriorSamples` came from.

Notes

Use :meth:summary to collapse the matrices to per-row credible bands without writing the quantile reductions yourself. For very large prediction sets, prefer :meth:PosteriorSamples.predict which streams chunk-by-chunk instead of materialising the full (n_draws, n_rows) matrix here.

Examples:

>>> pp = post.predict_draws(new_data)
>>> pp.shape
(1000, 50)
>>> bands = pp.summary(level=0.9)

shape `property` ¶

shape: tuple[int, int]

Shape of the link-scale draw matrix.

Returns:

Type	Description
`tuple[int, int]`	`(n_draws, n_rows)`.

Examples:

>>> pp = post.predict_draws(new_data)
>>> pp.shape
(1000, 50)

n_draws `property` ¶

n_draws: int

Number of posterior fitted-mean draws.

Returns:

Type	Description
`int`	Length of the leading axis of :attr:`eta`.

Examples:

>>> pp = post.predict_draws(new_data)
>>> pp.n_draws
1000

n_rows `property` ¶

n_rows: int

Number of prediction rows.

Returns:

Type	Description
`int`	Length of the trailing axis of :attr:`eta`.

Examples:

>>> pp = post.predict_draws(new_data)
>>> pp.n_rows
50

summary ¶

summary(level: float = 0.95) -> dict[str, Any]

Collapse fitted-mean draws to per-row credible bands.

Parameters:

Name	Type	Description	Default
`level`	`float`	Coverage probability of the equal-tailed credible interval in `(0, 1)`. Default `0.95`.	`0.95`

Returns:

Type	Description
`dict[str, ndarray]`	Dict with six length-`n_rows` arrays: `eta_mean`, `eta_lower`, `eta_upper` (link scale) and `mean`, `mean_lower`, `mean_upper` (response scale).

Notes

Because the supported inverse links are monotone, response-scale quantiles are computed as the inverse link applied to the link quantiles rather than as quantiles of :attr:mean directly — the two agree up to numerical noise and the link-quantile form avoids re-walking the response-scale matrix.

Examples:

>>> pp = post.predict_draws(new_data)
>>> bands = pp.summary(level=0.9)
>>> bands["mean_lower"].shape
(50,)
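
The claim in the notes, that quantiles commute with a monotone inverse link up to numerical noise, can be checked numerically. The sketch below uses a logistic inverse link on synthetic draws; the small remaining gap comes from `numpy.quantile` interpolating linearly between adjacent order statistics, and both the data and the `expit` helper are assumptions for illustration.

```python
import numpy as np

def expit(x):
    """Logistic inverse link, monotone increasing in x."""
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
eta = rng.normal(loc=0.5, scale=1.0, size=(4000, 5))   # synthetic link-scale draws
q = [0.05, 0.95]

via_link = expit(np.quantile(eta, q, axis=0))    # inverse link of link-scale quantiles
direct = np.quantile(expit(eta), q, axis=0)      # quantiles of response-scale draws
max_gap = float(np.max(np.abs(via_link - direct)))
```

The first form touches only the link-scale matrix, which is why it avoids a second pass over the response-scale draws.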

CumulativeIncidenceDraws `dataclass` ¶

CumulativeIncidenceDraws(
    times: Any, draws: Any, mean: Any, lower: Any, upper: Any, level: float
)

Paired posterior draws for a target-cause cumulative incidence curve.

Diagnostics and metadata¶

Summary `dataclass` ¶

Summary(payload: dict[str, Any])

Frozen view of a fitted-model summary payload.

A Summary is the structured equivalent of print(model) for a fitted GAM. It wraps a plain dict returned by the Rust engine and exposes convenient accessors plus a notebook-friendly HTML representation. The typical entry point is :meth:Model.summary.

The payload typically contains keys such as formula, family_name, model_class, deviance, reml_score, and coefficients (a list of per-term dictionaries). Use :meth:coefficients_frame to view the coefficient table as a pandas DataFrame.

Examples:

>>> summary = model.summary()
>>> summary["family_name"]
'gaussian'
>>> summary.coefficients_frame().head()

coefficients `property` ¶

coefficients: list[dict[str, Any]]

List of per-term coefficient records.

Returns:

Type	Description
`list of dict`	One record per fitted term, each with keys such as `term`, `estimate`, `std_error`, and `edf` depending on the model.

Examples:

>>> summary.coefficients[0]["term"]
'(Intercept)'

from_dict `classmethod` ¶

from_dict(payload: dict[str, Any]) -> 'Summary'

Build a :class:Summary from a raw payload dictionary.

Parameters:

Name	Type	Description	Default
`payload`	`dict`	Mapping of summary keys to values, as produced by the Rust engine.	required

Returns:

Type	Description
`Summary`	A new immutable summary view over a shallow copy of `payload`.

Examples:

>>> Summary.from_dict({"formula": "y ~ s(x)", "family_name": "gaussian"})
Summary(formula='y ~ s(x)', family_name='gaussian')

get ¶

get(key: str, default: Any = None) -> Any

Return payload[key] if present, otherwise default.

Parameters:

Name	Type	Description	Default
`key`	`str`	Payload key to look up.	required
`default`	`Any`	Value returned when `key` is not in the payload.	`None`

Returns:

Type	Description
`Any`	The looked-up value, or `default` when `key` is absent.

Examples:

>>> summary.get("deviance", float("nan"))
12.34

to_dict ¶

to_dict() -> dict[str, Any]

Return a shallow copy of the underlying payload dictionary.

Returns:

Type	Description
`dict`	Plain `dict` mirror of the summary payload, safe to mutate.

Examples:

>>> raw = summary.to_dict()
>>> sorted(raw)[:3]
['coefficients', 'deviance', 'family_name']

coefficients_frame ¶

coefficients_frame() -> Any

Return :attr:coefficients as a :class:pandas.DataFrame.

Returns:

Type	Description
`DataFrame`	One row per term, columns mirror the keys in :attr:`coefficients` records.

Examples:

>>> frame = summary.coefficients_frame()
>>> frame.columns.tolist()[:2]
['term', 'estimate']

Diagnostics `dataclass` ¶

Diagnostics(
    formula: str,
    response_name: str,
    observed: list[float],
    residuals: list[float],
    predicted: dict[str, list[float]],
    metrics: dict[str, float],
    interval_lower: list[float] | None = None,
    interval_upper: list[float] | None = None,
)

Held-out / in-sample diagnostics for a fitted GAM.

Bundles observed responses, model-implied predictions, residuals, and aggregate fit metrics (MAE, RMSE, bias, optional R^2) into a single immutable record. Returned by :meth:Model.diagnose and rendered inline in notebooks via :meth:_repr_html_.

Key fields:

formula: the model formula used to produce the predictions.
response_name: name of the response column in the input table.
observed: actual response values aligned with predicted["mean"].
residuals: observed - predicted["mean"] per row.
predicted: dictionary of prediction series (mean plus optional mean_lower / mean_upper interval bounds).
metrics: scalar fit metrics (n_obs, mae, rmse, bias, and r_squared when the response varies).
interval_lower / interval_upper: optional pointwise prediction bands when the underlying call requested an interval.

Examples:

>>> diag = model.diagnose(test)
>>> diag.metrics["rmse"]
0.42

from_predictions `classmethod` ¶

from_predictions(
    *,
    formula: str,
    response_name: str,
    observed: list[float],
    predicted: dict[str, list[float]],
) -> "Diagnostics"

Construct a :class:Diagnostics from raw observed and predicted series.

Computes residuals and aggregate fit metrics (n_obs, MAE, RMSE, bias, and R^2 when the response variance is positive) from the inputs.

Parameters:

Name	Type	Description	Default
`formula`	`str`	Model formula associated with the predictions.	required
`response_name`	`str`	Name of the response column.	required
`observed`	`list of float`	Observed response values.	required
`predicted`	`dict of str to list of float`	Prediction series. Must contain key `"mean"`; may contain `"mean_lower"` and `"mean_upper"` for interval bands.	required

Returns:

Type	Description
`Diagnostics`	Populated diagnostics record with computed residuals and metrics.

Examples:

>>> Diagnostics.from_predictions(
...     formula="y ~ s(x)",
...     response_name="y",
...     observed=[1.0, 2.0, 3.0],
...     predicted={"mean": [1.1, 1.9, 3.2]},
... ).metrics["mae"]
0.13333333333333336
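
The metric definitions above can be reproduced by hand. The following is a minimal sketch, not the library's implementation: `fit_metrics` is a hypothetical helper, and taking bias as the mean residual (observed minus predicted) is an assumption about the sign convention.

```python
import math

def fit_metrics(observed, predicted_mean):
    """Residuals and aggregate metrics for two equal-length series."""
    if len(observed) != len(predicted_mean) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    n = len(observed)
    # Residual convention from the field description: observed - predicted["mean"].
    residuals = [o - p for o, p in zip(observed, predicted_mean)]
    metrics = {
        "n_obs": float(n),
        "mae": sum(abs(r) for r in residuals) / n,
        "rmse": math.sqrt(sum(r * r for r in residuals) / n),
        # Assumption: bias is the mean residual (sign follows observed - predicted).
        "bias": sum(residuals) / n,
    }
    mean_obs = sum(observed) / n
    total_ss = sum((o - mean_obs) ** 2 for o in observed)
    if total_ss > 0.0:  # R^2 is undefined when the response does not vary
        metrics["r_squared"] = 1.0 - sum(r * r for r in residuals) / total_ss
    return residuals, metrics

residuals, metrics = fit_metrics([1.0, 2.0, 3.0], [1.1, 1.9, 3.2])
```

With a constant response, the sketch omits `r_squared`, mirroring the "when the response varies" condition in the field list.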

to_dict ¶

to_dict() -> dict[str, Any]

Return a plain dict snapshot of the diagnostics record.

Returns:

Type	Description
`dict`	Mapping with copies of every field, suitable for JSON-style serialization or further inspection.

Examples:

>>> diag.to_dict()["metrics"]["rmse"]
0.42

SchemaCheck `dataclass` ¶

SchemaCheck(ok: bool, issues: tuple[SchemaIssue, ...])

Result of comparing serving data against a fitted model's training schema.

Returned by :meth:Model.check. Truthy when the check passes (ok=True with no issues); rendered as an HTML table in notebooks.

Key fields:

ok: True when the data matches the training schema.
issues: tuple of :class:SchemaIssue records describing each detected problem (empty when ok is True).

Examples:

>>> check = model.check(serving_df)
>>> if not check:
...     check.raise_for_error()

from_dict `classmethod` ¶

from_dict(payload: dict[str, Any]) -> 'SchemaCheck'

Build a :class:SchemaCheck from a raw payload dictionary.

Parameters:

Name	Type	Description	Default
`payload`	`dict`	Mapping with keys `ok` (bool) and `issues` (list of dicts with `kind`, `message`, and optional `column`).	required

Returns:

Type	Description
`SchemaCheck`	Parsed schema-check result.

Examples:

>>> SchemaCheck.from_dict({"ok": True, "issues": []}).ok
True

raise_for_error ¶

raise_for_error() -> None

Raise :class:ValueError if the schema check failed.

Concatenates every issue message into a single ValueError. A no-op when :attr:ok is True.

Raises:

Type	Description
`ValueError`	If at least one :class:`SchemaIssue` is recorded.

Examples:

>>> check = model.check(serving_df)
>>> check.raise_for_error()  # raises ValueError on mismatch
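
The truthiness and error-concatenation behaviour described for `SchemaCheck` can be pictured with stand-in dataclasses. This is a sketch under assumptions: `IssueSketch` and `CheckSketch` are hypothetical names, and joining messages with `"; "` is a guess at the separator, which the source does not specify.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class IssueSketch:  # hypothetical stand-in for SchemaIssue
    kind: str
    message: str
    column: Optional[str] = None

@dataclass(frozen=True)
class CheckSketch:  # hypothetical stand-in for SchemaCheck
    ok: bool
    issues: Tuple[IssueSketch, ...] = ()

    def __bool__(self) -> bool:
        # Truthy only when the check passed and nothing was recorded.
        return self.ok and not self.issues

    def raise_for_error(self) -> None:
        if self:
            return  # no-op on a passing check
        # Assumption: messages are joined with "; "; the real separator may differ.
        raise ValueError("; ".join(issue.message for issue in self.issues))

failed = CheckSketch(
    ok=False,
    issues=(
        IssueSketch("missing", "column 'age' is missing", "age"),
        IssueSketch("type_mismatch", "column 'dose' should be numeric", "dose"),
    ),
)
try:
    failed.raise_for_error()
except ValueError as exc:
    combined = str(exc)  # one message covering every recorded issue
```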

SchemaIssue `dataclass` ¶

SchemaIssue(kind: str, message: str, column: str | None = None)

A single schema-validation problem detected against the training schema.

Key fields:

kind: short tag describing the issue category (e.g. "missing", "type_mismatch").
message: human-readable explanation.
column: name of the offending column, when applicable.

Examples:

>>> SchemaIssue(kind="missing", message="column 'age' is missing", column="age")
SchemaIssue(kind='missing', message="column 'age' is missing", column='age')

FormulaValidation `dataclass` ¶

FormulaValidation(payload: dict[str, Any])

Outcome of :func:gamfit.validate_formula (no fit performed).

Wraps the JSON payload returned by the Rust validator. Typical keys include formula, model_class, family_name, and supported_by_python. Use this to confirm a formula parses, infer the family that would be picked, and check whether the Python binding can fit the resulting model before committing to a full :func:gamfit.fit call.

Examples:

>>> info = gamfit.validate_formula(df, "y ~ s(x)")
>>> info["family_name"]
'gaussian'
>>> info.supported_by_python
True

supported_by_python `property` ¶

supported_by_python: bool

Whether the Python binding can fit the validated model.

Returns:

Type	Description
`bool`	`True` when :func:`gamfit.fit` can produce a fitted model for this formula/family combination, `False` if only the CLI / Rust engine can handle it.

Examples:

>>> info.supported_by_python
True

from_dict `classmethod` ¶

from_dict(payload: dict[str, Any]) -> 'FormulaValidation'

Build a :class:FormulaValidation from a raw payload dictionary.

Parameters:

Name	Type	Description	Default
`payload`	`dict`	Mapping of validation keys to values, as produced by the Rust validator.	required

Returns:

Type	Description
`FormulaValidation`	Immutable view over a shallow copy of `payload`.

Examples:

>>> FormulaValidation.from_dict({"formula": "y ~ x", "supported_by_python": True})
FormulaValidation(formula='y ~ x', model_class=None, family_name=None, supported_by_python=True)

to_dict ¶

to_dict() -> dict[str, Any]

Return a shallow copy of the underlying payload dictionary.

Returns:

Type	Description
`dict`	Plain `dict` mirror of the validation payload.

Examples:

>>> raw = info.to_dict()
>>> raw["formula"]
'y ~ s(x)'

SharedPrecisionGroup `dataclass` ¶

SharedPrecisionGroup(
    name: str,
    shape: float = 1.0,
    rate: float = 0.0,
    labels: str | Mapping[str | int, str] | None = None,
)

Cross-fit coefficient precision group.

name is the shared precision coordinate. By default it selects the same named coefficient term/column/label in every model. labels can override that with either one label for all models or a mapping keyed by the model name/index supplied to :func:cross_fit_shared_precision_groups.

Basis and ridge primitives¶

bspline_basis ¶

bspline_basis(
    t: Any, knots: Any = None, *, degree: int = 3, periodic: bool = False
) -> Any

Evaluate the Rust B-spline basis as a NumPy array.

knots may be:

None — auto-derive a clamped knot vector with quantile-spaced interior knots inferred from t.
an int K — auto-derive with K interior knots.
an array-like — used verbatim (must be a valid clamped knot vector).
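
To make the auto-derived case concrete, here is a rough sketch of a clamped knot vector with quantile-spaced interior knots. `clamped_knots` is a hypothetical helper, and the linear-interpolation quantile rule is an assumption; the Rust engine may place knots differently.

```python
def clamped_knots(t, n_interior, degree=3):
    """Clamped knot vector with interior knots at evenly spaced quantiles of t."""
    xs = sorted(float(v) for v in t)
    if len(xs) < 2 or xs[0] == xs[-1]:
        raise ValueError("need at least two distinct positions")

    def quantile(q):
        # Linear interpolation between neighbouring order statistics.
        pos = q * (len(xs) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(xs) - 1)
        return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

    interior = [quantile((i + 1) / (n_interior + 1)) for i in range(n_interior)]
    # Clamping repeats each boundary knot degree + 1 times.
    return [xs[0]] * (degree + 1) + interior + [xs[-1]] * (degree + 1)

knots = clamped_knots([0.0, 0.1, 0.4, 0.5, 0.9, 1.0], n_interior=3)
n_basis = len(knots) - 3 - 1  # cubic basis: len(knots) - degree - 1 functions
```

A clamped vector of this form yields `len(knots) - degree - 1` basis functions, which is the standard B-spline count and does not depend on the placement rule.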

bspline_basis_derivative ¶

bspline_basis_derivative(
    t: Any,
    knots: Any = None,
    *,
    degree: int = 3,
    order: int = 1,
    periodic: bool = False,
) -> Any

Evaluate derivatives of the Rust B-spline basis as a NumPy array.

knots accepts None / int / array — see :func:bspline_basis.

duchon_basis_1d ¶

duchon_basis_1d(
    t: Any, centers: Any = None, *, m: int = 2, periodic: bool = False
) -> Any

Evaluate the Rust one-dimensional Duchon basis as a NumPy array.

centers may be:

None — auto-derive K = 10 centers at empirical quantiles of t.
an int K — auto-derive K quantile centers.
an array-like — used verbatim.

duchon_basis_1d_derivative ¶

duchon_basis_1d_derivative(
    t: Any,
    centers: Any = None,
    *,
    m: int = 2,
    order: int = 1,
    periodic: bool = False,
) -> Any

Evaluate derivatives of the Rust one-dimensional Duchon basis.

centers accepts None / int / array — see :func:duchon_basis_1d.

smoothness_penalty ¶

smoothness_penalty(
    knots: Any, *, degree: int = 3, order: int = 2
) -> tuple[Any, Any]

Return (S, null_basis) for the Rust B-spline difference penalty.

knots must be a knot vector here — auto-derivation requires sample positions, which this penalty constructor does not take. Build one with :func:bspline_basis's defaults (or pass any 1D array).
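
As an illustration of what a difference penalty looks like, the sketch below builds `S = D^T D` from the `order`-th difference matrix over the basis coefficients (the P-spline construction). Whether the Rust implementation uses exactly this form is an assumption, and `difference_penalty` and `apply` are hypothetical helpers.

```python
def difference_penalty(n_basis, order=2):
    """Return S = D^T D, with D the order-th difference matrix over n_basis coefficients."""
    # Start from the identity and difference its rows `order` times.
    d = [[1.0 if i == j else 0.0 for j in range(n_basis)] for i in range(n_basis)]
    for _ in range(order):
        d = [[b - a for a, b in zip(d[i], d[i + 1])] for i in range(len(d) - 1)]
    return [
        [sum(row[i] * row[j] for row in d) for j in range(n_basis)]
        for i in range(n_basis)
    ]

def apply(matrix, vector):
    """Matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

n_basis = 7  # e.g. len(knots) - degree - 1 for a clamped cubic knot vector
S = difference_penalty(n_basis, order=2)
# Constant and linear coefficient sequences are unpenalised for order=2.
on_constant = apply(S, [1.0] * n_basis)
on_linear = apply(S, [float(i) for i in range(n_basis)])
on_quadratic = apply(S, [float(i * i) for i in range(n_basis)])
```

In this construction the unpenalised directions (constant and linear sequences for `order=2`) play the role that a returned null-space basis would describe.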

gaussian_weighted_ridge ¶

gaussian_weighted_ridge(
    X: Any, Y: Any, penalty: Any, weights: Any, *, ridge_lambda: float
) -> tuple[Any, Any]

Closed-form Gaussian row-weighted ridge on NumPy-compatible arrays.

weights are likelihood row weights. They are not a multiplicative gate on the mean/design row.
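
The distinction between likelihood weights and a gate on the design row is easiest to see with a single-column design. The sketch below assumes the objective is `sum_i w_i (y_i - x_i b)^2 + ridge_lambda * penalty * b^2`, whose matrix form is `(X^T W X + ridge_lambda * P)^{-1} X^T W Y`; how `ridge_lambda` combines with `penalty` is an assumption, and both helpers are hypothetical.

```python
def weighted_ridge_1d(x, y, weights, penalty, ridge_lambda):
    """Minimise sum_i w_i * (y_i - x_i * b)^2 + ridge_lambda * penalty * b^2 over b."""
    xtwx = sum(w * xi * xi for w, xi in zip(weights, x))
    xtwy = sum(w * xi * yi for w, xi, yi in zip(weights, x, y))
    return xtwy / (xtwx + ridge_lambda * penalty)

def gated_ridge_1d(x, y, gate, penalty, ridge_lambda):
    """Contrast case: the gate scales the design row, so the mean is gate_i * x_i * b."""
    gx = [g * xi for g, xi in zip(gate, x)]
    return weighted_ridge_1d(gx, y, [1.0] * len(x), penalty, ridge_lambda)

x = [1.0, 2.0, 3.0]
y = [1.0, 2.0, 6.0]
w = [1.0, 1.0, 0.5]

# Likelihood weighting: the third row's squared residual counts half as much.
b_weighted = weighted_ridge_1d(x, y, w, penalty=1.0, ridge_lambda=0.0)
# Gating: the third row's predicted mean is halved instead, giving a different fit.
b_gated = gated_ridge_1d(x, y, w, penalty=1.0, ridge_lambda=0.0)
```

The two estimates differ (14/9.5 versus 14/7.25), which is why the same values cannot be read as both a weight and a gate.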

gaussian_weighted_ridge_batch ¶

gaussian_weighted_ridge_batch(
    X: Any,
    Y: Any,
    penalty: Any,
    weights: Any,
    *,
    ridge_lambda: float,
    row_counts: Any | None = None,
) -> tuple[Any, Any]

Batched closed-form Gaussian row-weighted ridge.

X has shape (K, Nmax, M), Y has shape (K, Nmax, D), and weights has shape (K, Nmax). row_counts optionally marks the active row prefix for each problem in a padded ragged batch.
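
One way to prepare a padded ragged batch with these shapes is sketched below. `pad_batch` is a hypothetical helper; zero-filling the padding rows and giving them zero weight are assumptions, since the source only states that `row_counts` marks the active prefix.

```python
def pad_batch(problems):
    """Pack (rows, targets) problems into padded nested lists plus row_counts.

    rows is a list of length-M feature lists; targets is a list of length-D lists.
    """
    n_max = max(len(rows) for rows, _ in problems)
    m = len(problems[0][0][0])
    d = len(problems[0][1][0])
    X, Y, W, row_counts = [], [], [], []
    for rows, targets in problems:
        pad = n_max - len(rows)
        X.append([list(r) for r in rows] + [[0.0] * m for _ in range(pad)])
        Y.append([list(t) for t in targets] + [[0.0] * d for _ in range(pad)])
        # Assumption: padding rows carry zero weight; row_counts records the active prefix.
        W.append([1.0] * len(rows) + [0.0] * pad)
        row_counts.append(len(rows))
    return X, Y, W, row_counts

problems = [
    ([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]], [[0.1], [0.9], [2.1]]),
    ([[1.0, 0.5]], [[0.4]]),
]
X, Y, W, row_counts = pad_batch(problems)  # shapes (2, 3, 2), (2, 3, 1), (2, 3)
```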

Gaussian REML primitives¶

gaussian_reml_fit ¶

gaussian_reml_fit(
    x: Any,
    y: Any,
    penalty: Any,
    *,
    weights: Any | None = None,
    init_lambda: float | None = None,
    by: Any | None = None,
    by_start_col: int = 0,
) -> dict[str, Any]

Fit a closed-form Gaussian REML problem from NumPy-compatible arrays.

gaussian_reml_fit_backward ¶

gaussian_reml_fit_backward(
    x: Any,
    y: Any,
    penalty: Any,
    *,
    grad_lambda: float = 0.0,
    grad_coefficients: Any | None = None,
    grad_fitted: Any | None = None,
    grad_reml_score: float = 0.0,
    grad_edf: float = 0.0,
    forward_state: dict[str, Any] | None = None,
    weights: Any | None = None,
    init_lambda: float | None = None,
    by: Any | None = None,
    by_start_col: int = 0,
) -> dict[str, Any]

Run the analytic VJP for gaussian_reml_fit outputs.

gaussian_reml_fit_batched ¶

gaussian_reml_fit_batched(
    x: Any,
    y: Any,
    row_offsets: Any,
    penalty: Any,
    *,
    weights: Any | None = None,
    init_lambda: float | None = None,
    by: Any | None = None,
    by_start_col: int = 0,
) -> dict[str, Any]

Fit K closed-form Gaussian REML problems packed by row offsets.
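
The source does not spell out the `row_offsets` layout. A common convention for packed ragged data, assumed here, is `K + 1` cumulative offsets starting at 0, so that problem `k` owns rows `row_offsets[k]:row_offsets[k + 1]`. Both helpers below are hypothetical.

```python
from itertools import accumulate

def pack_rows(groups):
    """Concatenate per-problem rows and return (packed_rows, row_offsets)."""
    packed = [row for group in groups for row in group]
    # Assumed convention: K + 1 offsets, starting at 0 and ending at the total row count.
    row_offsets = [0] + list(accumulate(len(group) for group in groups))
    return packed, row_offsets

def unpack_rows(packed, row_offsets):
    """Recover the per-problem row blocks from the packed layout."""
    return [packed[a:b] for a, b in zip(row_offsets, row_offsets[1:])]

groups = [[[0.0], [0.5], [1.0]], [[0.2]], [[0.1], [0.9]]]
packed, row_offsets = pack_rows(groups)
```

Compared with padding to a common length, this layout stores no filler rows, at the cost of an extra offsets array.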

gaussian_reml_fit_batched_backward ¶

gaussian_reml_fit_batched_backward(
    x: Any,
    y: Any,
    row_offsets: Any,
    penalty: Any,
    *,
    grad_lambda: Any | None = None,
    grad_coefficients: Any | None = None,
    grad_fitted: Any | None = None,
    grad_reml_score: Any | None = None,
    grad_edf: Any | None = None,
    forward_state: dict[str, Any] | None = None,
    weights: Any | None = None,
    init_lambda: float | None = None,
    by: Any | None = None,
    by_start_col: int = 0,
) -> dict[str, Any]

Run packed ragged analytic VJPs for gaussian_reml_fit_batched.

gaussian_reml_fit_positions ¶

gaussian_reml_fit_positions(
    t: Any,
    y: Any,
    basis_kind: str | None = None,
    knots_or_centers: Any = None,
    penalty: Any | None = None,
    *,
    basis: str | None = None,
    basis_order: int | None = None,
    periodic: bool = False,
    period: float | None = None,
    weights: Any | None = None,
    init_lambda: float | None = None,
    by: Any | None = None,
    by_start_col: int = 0,
) -> dict[str, Any]

Fit closed-form Gaussian REML from 1D positions and an internal basis.

knots_or_centers may be None, an int (basis count), or an array; the basis-location vector is auto-derived from t when not supplied. penalty may be None for a neutral identity ridge of matching size.

gaussian_reml_fit_positions_backward ¶

gaussian_reml_fit_positions_backward(
    t: Any,
    y: Any,
    basis_kind: str | None = None,
    knots_or_centers: Any = None,
    penalty: Any | None = None,
    *,
    basis: str | None = None,
    grad_lambda: float = 0.0,
    grad_coefficients: Any | None = None,
    grad_fitted: Any | None = None,
    grad_reml_score: float = 0.0,
    grad_edf: float = 0.0,
    forward_state: dict[str, Any] | None = None,
    basis_order: int | None = None,
    periodic: bool = False,
    period: float | None = None,
    weights: Any | None = None,
    init_lambda: float | None = None,
    by: Any | None = None,
    by_start_col: int = 0,
) -> dict[str, Any]

Run the analytic VJP for gaussian_reml_fit_positions outputs.

knots_or_centers and penalty accept the same auto-derived defaults as :func:gaussian_reml_fit_positions.

gaussian_reml_fit_positions_batched ¶

gaussian_reml_fit_positions_batched(
    t: Any,
    y: Any,
    row_offsets: Any,
    basis_kind: str | None = None,
    knots_or_centers: Any = None,
    penalty: Any | None = None,
    *,
    basis: str | None = None,
    basis_order: int | None = None,
    periodic: bool = False,
    period: float | None = None,
    weights: Any | None = None,
    init_lambda: float | None = None,
    by: Any | None = None,
    by_start_col: int = 0,
) -> dict[str, Any]

Fit packed ragged closed-form Gaussian REML problems from positions.

knots_or_centers and penalty accept the same auto-derived defaults as :func:gaussian_reml_fit_positions. The basis locations are inferred from the concatenated positions across all groups.

gaussian_reml_fit_positions_batched_backward ¶

gaussian_reml_fit_positions_batched_backward(
    t: Any,
    y: Any,
    row_offsets: Any,
    basis_kind: str | None = None,
    knots_or_centers: Any = None,
    penalty: Any | None = None,
    *,
    basis: str | None = None,
    grad_lambda: Any | None = None,
    grad_coefficients: Any | None = None,
    grad_fitted: Any | None = None,
    grad_reml_score: Any | None = None,
    grad_edf: Any | None = None,
    forward_state: dict[str, Any] | None = None,
    basis_order: int | None = None,
    periodic: bool = False,
    period: float | None = None,
    weights: Any | None = None,
    init_lambda: float | None = None,
    by: Any | None = None,
    by_start_col: int = 0,
) -> dict[str, Any]

Run the analytic VJP for packed position-based Gaussian REML fits.

knots_or_centers and penalty accept the same auto-derived defaults as :func:gaussian_reml_fit_positions_batched.

gaussian_reml_fit_formula ¶

gaussian_reml_fit_formula(
    data: Any, formula: str, y: Any, *, config: dict[str, Any] | None = None
) -> dict[str, Any]

Fit closed-form Gaussian REML after materialising a formula design.

scikit-learn integration¶

GAMRegressor `dataclass` ¶

GAMRegressor(
    formula: str,
    family: str = "auto",
    offset: str | None = None,
    weights: str | None = None,
    config: dict[str, Any] | None = None,
)

Bases: _BaseGAMEstimator, RegressorMixin

scikit-learn-compatible regressor wrapping :func:gamfit.fit.

Construct with a formula string and (optionally) pipeline kwargs such as family, offset, weights, or a free-form config dict, then call :meth:fit with either a fully-formed table (X) or a feature table plus a target column / vector (y). After fitting, the estimator exposes the standard predict / score interface plus pass-through helpers :meth:summary, :meth:report, and :meth:check from the underlying :class:Model.

Parameters:

Name	Type	Description	Default
`formula`	`str`	Wilkinson-style formula. May or may not include the response on the left-hand side; the response is resolved from `y` if missing.	required
`family`	`str`	Likelihood family forwarded to :func:`gamfit.fit`.	`"auto"`
`offset`	`str or None`	Offset column name, forwarded to :func:`gamfit.fit`.	`None`
`weights`	`str or None`	Observation-weight column name.	`None`
`config`	`dict or None`	Escape-hatch dict of extra pipeline keys.	`None`

Examples:

>>> from gamfit.sklearn import GAMRegressor
>>> reg = GAMRegressor(formula="y ~ s(x1) + s(x2)").fit(X_train, y_train)
>>> preds = reg.predict(X_test)
>>> reg.score(X_test, y_test)
0.87

fit ¶

fit(X: Any, y: Any = None) -> 'GAMRegressor'

Fit the underlying GAM and return self.

Parameters:

Name	Type	Description	Default
`X`	`Any`	Training table (pandas DataFrame, pyarrow Table, dict of columns, list of records, or anything :func:`gamfit.fit` accepts). May or may not include the response column.	required
`y`	`str, array-like, or None`	Target. `str` names a column already in `X`; an array-like is bound to `X` under the response name implied by `formula`; `None` means `X` already contains the response named by `formula`.	`None`

Returns:

Type	Description
`GAMRegressor`	Fitted estimator (`self`) with `model_`, `formula_`, `feature_names_in_`, and `n_features_in_` attributes set.

Examples:

>>> GAMRegressor(formula="y ~ s(x)").fit(df, y="y")

predict ¶

predict(X: Any) -> np.ndarray

Predict the conditional mean for each row in X.

Parameters:

Name	Type	Description	Default
`X`	`Any`	Serving table with the feature columns seen at fit time.	required

Returns:

Type	Description
`ndarray`	One-dimensional float array of predicted means, one per row.

Examples:

>>> reg.predict(X_test)[:3]
array([1.02, 0.98, 1.41])

score ¶

score(X: Any, y: Any, sample_weight: Any = None) -> float

Return the coefficient of determination :math:R^2.

Parameters:

Name	Type	Description	Default
`X`	`Any`	Test feature table.	required
`y`	`array-like`	True response values.	required
`sample_weight`	`array-like or None`	Per-row weights forwarded to :func:`sklearn.metrics.r2_score`.	`None`

Returns:

Type	Description
`float`	:math:`R^2` of the predictions.

Examples:

>>> reg.score(X_test, y_test)
0.87

GAMClassifier `dataclass` ¶

GAMClassifier(
    formula: str,
    family: str = "auto",
    offset: str | None = None,
    weights: str | None = None,
    config: dict[str, Any] | None = None,
)

Bases: _BaseGAMEstimator, ClassifierMixin

scikit-learn-compatible binary classifier wrapping :func:gamfit.fit.

Same construction and fit semantics as :class:GAMRegressor (see that class for parameter documentation). Predictions interpret the model's mean as the probability of the positive class; classes are fixed to [0, 1] and a threshold of 0.5 is used by :meth:predict.

Examples:

>>> from gamfit.sklearn import GAMClassifier
>>> clf = GAMClassifier(formula="y ~ s(x1) + s(x2)", family="binomial")
>>> clf.fit(X_train, y_train)
>>> clf.predict_proba(X_test)[:1]
array([[0.34, 0.66]])

fit ¶

fit(X: Any, y: Any = None) -> 'GAMClassifier'

Fit the binary GAM classifier and return self.

Parameters:

Name	Type	Description	Default
`X`	`Any`	Training table. See :meth:`GAMRegressor.fit` for accepted forms.	required
`y`	`str, array-like, or None`	Binary target. See :meth:`GAMRegressor.fit` for accepted forms.	`None`

Returns:

Type	Description
`GAMClassifier`	Fitted estimator (`self`) with `classes_` set to `[0, 1]`.

Examples:

>>> GAMClassifier(formula="y ~ s(x)", family="binomial").fit(df, y="y")

predict_proba ¶

predict_proba(X: Any) -> np.ndarray

Predict class probabilities for each row in X.

Parameters:

Name	Type	Description	Default
`X`	`Any`	Serving table with the feature columns seen at fit time.	required

Returns:

Type	Description
`ndarray`	Two-column float array `[[P(y=0), P(y=1)], ...]`, clipped to `[0, 1]`.

Examples:

>>> clf.predict_proba(X_test).shape
(100, 2)

predict ¶

predict(X: Any) -> np.ndarray

Predict the binary class label using a 0.5 threshold on the positive class.

Parameters:

Name	Type	Description	Default
`X`	`Any`	Serving table with the feature columns seen at fit time.	required

Returns:

Type	Description
`ndarray`	One-dimensional integer array of class labels (`0` or `1`).

Examples:

>>> clf.predict(X_test)[:5]
array([1, 0, 1, 1, 0])
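
The relationship between the fitted mean, `predict_proba`, and `predict` can be sketched in a few lines. The helpers are hypothetical, and treating a probability exactly equal to 0.5 as class 1 is an assumption; the source only states that a 0.5 threshold is used.

```python
def proba_from_mean(mean):
    """Two-column probabilities [[P(y=0), P(y=1)], ...] from the fitted mean."""
    # Clip to [0, 1] so slightly out-of-range means still give valid probabilities.
    clipped = [min(1.0, max(0.0, float(m))) for m in mean]
    return [[1.0 - p, p] for p in clipped]

def labels_from_proba(proba, threshold=0.5):
    """Class labels from the positive-class column."""
    # Assumption: a probability exactly at the threshold maps to class 1.
    return [1 if row[1] >= threshold else 0 for row in proba]

proba = proba_from_mean([0.66, 0.2, 1.03, -0.01, 0.5])
labels = labels_from_proba(proba)
```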

score ¶

score(X: Any, y: Any, sample_weight: Any = None) -> float

Return classification accuracy.

Parameters:

Name	Type	Description	Default
`X`	`Any`	Test feature table.	required
`y`	`array-like`	True binary labels.	required
`sample_weight`	`array-like or None`	Per-row weights forwarded to :func:`sklearn.metrics.accuracy_score`.	`None`

Returns:

Type	Description
`float`	Accuracy in `[0, 1]`.

Examples:

>>> clf.score(X_test, y_test)
0.91

Exceptions¶

GamError ¶

Bases: Exception

Base class for Python-facing GAM errors.

All gamfit-specific exceptions raised by the Python binding inherit from GamError, so catching this class is the broadest way to handle a failure originating from the Rust engine or the binding layer.

Examples:

>>> try:
...     gamfit.fit(df, "y ~ s(x)")
... except gamfit.GamError as exc:
...     print(gamfit.explain_error(exc))

FormulaError ¶

Bases: GamError

The formula is invalid or unsupported.

Raised when the Wilkinson-style formula string cannot be parsed, references columns missing from the input table, or describes a model the engine does not support.

Examples:

>>> try:
...     gamfit.fit(df, "y ~ s(nope)")
... except gamfit.FormulaError as exc:
...     print(exc)

SchemaMismatchError ¶

Bases: GamError

Prediction input does not match the training schema.

Raised when the table passed to :meth:Model.predict or related methods lacks columns that were present at fit time, has incompatible dtypes, or introduces unknown categorical levels.

Examples:

>>> try:
...     model.predict(serving_df)
... except gamfit.SchemaMismatchError as exc:
...     print(model.check(serving_df))

PredictionError ¶

Bases: GamError

Prediction failed.

Raised for runtime failures during prediction that are not pure schema problems (numerical issues, unsupported prediction modes for the fitted model, etc.).

Examples:

>>> try:
...     model.predict(test_df)
... except gamfit.PredictionError as exc:
...     print(gamfit.explain_error(exc))

RustExtensionUnavailableError ¶

Bases: ImportError

Raised when the compiled gamfit._rust extension cannot be imported.

The Rust engine ships as a maturin-built extension module. When it is missing (typical in a fresh source checkout that has not been built yet), every Rust-backed API in :mod:gamfit raises this error eagerly so users see a single, actionable message instead of an opaque ImportError.

The fix is to build or install the package, e.g. maturin develop from the gamfit source tree, or pip install gamfit from PyPI.

Examples:

>>> try:
...     gamfit.fit(df, "y ~ s(x)")
... except gamfit.RustExtensionUnavailableError as exc:
...     print("build the extension first:", exc)
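
One consequence of the base classes listed in this section is that `except GamError` does not catch `RustExtensionUnavailableError`, which derives from `ImportError`. The sketch below uses local stand-in classes with the same bases so it runs without gamfit installed; `classify` and its bucket names are hypothetical.

```python
# Local stand-ins mirroring the documented base classes.
class GamError(Exception):
    pass

class FormulaError(GamError):
    pass

class SchemaMismatchError(GamError):
    pass

class PredictionError(GamError):
    pass

class RustExtensionUnavailableError(ImportError):
    pass

def classify(exc):
    """Route an exception to a coarse handling bucket, most specific first."""
    if isinstance(exc, RustExtensionUnavailableError):
        return "build-or-install"
    if isinstance(exc, SchemaMismatchError):
        return "fix-serving-data"
    if isinstance(exc, FormulaError):
        return "fix-formula"
    if isinstance(exc, GamError):
        return "other-gam-error"
    return "unrelated"

buckets = [classify(e) for e in (FormulaError(), PredictionError(), RustExtensionUnavailableError())]
```

A handler that should cover both families needs `except (GamError, ImportError)` or a separate clause for the extension error.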

Response geometry¶

ResponseGeometryModel `dataclass` ¶

ResponseGeometryModel(
    models: Sequence[Any],
    response_geometry: str,
    response_columns: tuple[str, ...],
    base_point: Any,
    coordinates: str,
    reference: int = -1,
    training_table_kind: str | None = None,
    shared_tangent_fit: SharedGaussianRemlTangentFit | None = None,
)

A fitted response-geometry GAM with shared smoothing across tangent coordinates.

clr ¶

clr(values: Any) -> Any

Centered log-ratio coordinates for positive compositions.

alr ¶

alr(values: Any, *, reference: int = -1) -> Any

Additive log-ratio coordinates for positive compositions.

closure ¶

closure(values: Any) -> Any

Normalize rows onto the probability simplex.
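
For reference, the three transforms have short textbook definitions. The sketch below applies them to a single composition rather than to an array of rows, and the `_row` helpers are hypothetical; reading `reference=-1` as "last part" is an assumption based on Python indexing.

```python
import math

def closure_row(row):
    """Rescale positive parts so they sum to 1."""
    total = sum(row)
    return [v / total for v in row]

def clr_row(row):
    """Centered log-ratio: log parts minus their mean log."""
    logs = [math.log(v) for v in row]
    center = sum(logs) / len(logs)
    return [lv - center for lv in logs]

def alr_row(row, reference=-1):
    """Additive log-ratio: log of each non-reference part over the reference part."""
    ref = reference % len(row)  # assumption: -1 selects the last part
    return [math.log(v / row[ref]) for i, v in enumerate(row) if i != ref]

comp = closure_row([2.0, 3.0, 5.0])
clr_coords = clr_row(comp)   # same length as comp, sums to zero
alr_coords = alr_row(comp)   # one coordinate fewer than comp
```

Both log-ratio transforms are unchanged by rescaling the input, so applying them before or after closure gives the same coordinates.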

simplex_frechet_mean ¶

simplex_frechet_mean(values: Any, weights: Any | None = None) -> Any

Intrinsic Fréchet mean under Aitchison simplex geometry.
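
Under Aitchison geometry the Fréchet mean has a closed form: the closure of the weighted component-wise geometric mean. The sketch below shows that formula on nested lists; `simplex_frechet_mean_sketch` is a hypothetical name and makes no claim about how the library computes it.

```python
import math

def simplex_frechet_mean_sketch(rows, weights=None):
    """Closed weighted geometric mean of strictly positive compositions."""
    if weights is None:
        weights = [1.0] * len(rows)
    w_total = sum(weights)
    n_parts = len(rows[0])
    # Weighted mean of logs per part; the final closure removes any overall scale.
    mean_log = [
        sum(w * math.log(row[j]) for w, row in zip(weights, rows)) / w_total
        for j in range(n_parts)
    ]
    geo = [math.exp(v) for v in mean_log]
    total = sum(geo)
    return [g / total for g in geo]

mean = simplex_frechet_mean_sketch([[0.2, 0.3, 0.5], [0.5, 0.3, 0.2]])
```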

sphere_frechet_mean ¶

sphere_frechet_mean(
    values: Any,
    weights: Any | None = None,
    *,
    tol: float = 1e-12,
    max_iter: int = 256,
) -> Any

Intrinsic Fréchet/Karcher mean on the unit sphere.

If the minimizer is not unique, as for an exactly antipodal pair, this returns one deterministic minimizer rather than an endpoint surrogate.
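
The role of `tol` and `max_iter` is easier to see in a plain fixed-point Karcher iteration: average the log maps at the current estimate, then step along the exponential map. This is a sketch of that generic iteration, not the library's algorithm. `sphere_mean_sketch` is a hypothetical name, and the sketch does not reproduce the deterministic handling of antipodal inputs described above.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _normalize(v):
    norm = math.sqrt(_dot(v, v))
    return [x / norm for x in v]

def sphere_mean_sketch(points, weights=None, tol=1e-12, max_iter=256):
    """Fixed-point Karcher iteration on the unit sphere."""
    pts = [_normalize(p) for p in points]
    if weights is None:
        weights = [1.0] * len(pts)
    w_total = sum(weights)
    # Start from the normalised weighted Euclidean mean (assumed nonzero).
    mu = _normalize(
        [sum(w * p[j] for w, p in zip(weights, pts)) for j in range(len(pts[0]))]
    )
    for _ in range(max_iter):
        step = [0.0] * len(mu)
        for w, p in zip(weights, pts):
            c = max(-1.0, min(1.0, _dot(mu, p)))
            theta = math.acos(c)  # geodesic distance from mu to p
            if theta < 1e-15:
                continue  # the log map of mu itself is the zero vector
            # Log map at mu: tangent component of p rescaled to length theta.
            tangent = [pj - c * mj for pj, mj in zip(p, mu)]
            scale = theta / math.sin(theta)
            step = [s + w * scale * tj / w_total for s, tj in zip(step, tangent)]
        length = math.sqrt(_dot(step, step))
        if length < tol:
            break  # mean tangent vector vanished: mu is a stationary point
        # Exp map at mu along the averaged tangent vector.
        mu = _normalize(
            [math.cos(length) * mj + math.sin(length) * sj / length
             for mj, sj in zip(mu, step)]
        )
    return mu

mu = sphere_mean_sketch([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], weights=[3.0, 1.0])
```

In the usage line, the two points are a quarter circle apart and the first carries three times the weight, so the mean sits a quarter of the way along the arc, at angle pi/8 from the first point.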
