API reference
Generated from docstrings and type hints in the gamfit source. See the
topical guides for narrative explanations.
Top-level functions
fit
fit(
data: Any,
formula: str,
*,
family: str = ...,
offset: str | None = ...,
weights: str | None = ...,
transformation_normal: bool | None = ...,
survival_likelihood: str | None = ...,
baseline_target: str | None = ...,
baseline_scale: float | None = ...,
baseline_shape: float | None = ...,
baseline_rate: float | None = ...,
baseline_makeham: float | None = ...,
z_column: str | None = ...,
link: str | None = ...,
logslope_formula: str | None = ...,
frailty_kind: str | None = ...,
frailty_sd: float | None = ...,
hazard_loading: str | None = ...,
scale_dimensions: bool | None = ...,
adaptive_regularization: bool | None = ...,
firth: bool | None = ...,
precision_hyperpriors: Any | None = ...,
response_geometry: None = ...,
response_columns: list[str] | tuple[str, ...] | None = ...,
response_coordinates: str | None = ...,
response_reference: int | None = ...,
config: dict[str, Any] | None = ...,
) -> Model
fit(
data: Any,
formula: str,
*,
family: str = ...,
offset: str | None = ...,
weights: str | None = ...,
transformation_normal: bool | None = ...,
survival_likelihood: str | None = ...,
baseline_target: str | None = ...,
baseline_scale: float | None = ...,
baseline_shape: float | None = ...,
baseline_rate: float | None = ...,
baseline_makeham: float | None = ...,
z_column: str | None = ...,
link: str | None = ...,
logslope_formula: str | None = ...,
frailty_kind: str | None = ...,
frailty_sd: float | None = ...,
hazard_loading: str | None = ...,
scale_dimensions: bool | None = ...,
adaptive_regularization: bool | None = ...,
firth: bool | None = ...,
precision_hyperpriors: Any | None = ...,
response_geometry: str,
response_columns: list[str] | tuple[str, ...] | None = ...,
response_coordinates: str | None = ...,
response_reference: int | None = ...,
config: dict[str, Any] | None = ...,
) -> ResponseGeometryModel
fit(
data: Any,
formula: str,
*,
family: str = "auto",
offset: str | None = None,
weights: str | None = None,
transformation_normal: bool | None = None,
survival_likelihood: str | None = None,
baseline_target: str | None = None,
baseline_scale: float | None = None,
baseline_shape: float | None = None,
baseline_rate: float | None = None,
baseline_makeham: float | None = None,
z_column: str | None = None,
link: str | None = None,
logslope_formula: str | None = None,
frailty_kind: str | None = None,
frailty_sd: float | None = None,
hazard_loading: str | None = None,
scale_dimensions: bool | None = None,
adaptive_regularization: bool | None = None,
firth: bool | None = None,
precision_hyperpriors: Any | None = None,
response_geometry: str | None = None,
response_columns: list[str] | tuple[str, ...] | None = None,
response_coordinates: str | None = None,
response_reference: int | None = None,
config: dict[str, Any] | None = None,
) -> Model | ResponseGeometryModel
Fit a GAM from a formula and a tabular dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | Any | Input table. Accepts a pandas DataFrame, pyarrow Table, dict of columns, list of records, or any object normalize_table understands. | required |
| formula | str | Wilkinson-style formula string (e.g. | required |
| family | str | Likelihood family, or | 'auto' |
| offset | str \| None | Name of the offset column. Corresponds to | None |
| weights | str \| None | Name of the observation-weight column. Corresponds to | None |
| transformation_normal | bool \| None | Fit a conditional transformation-normal model ( | None |
| survival_likelihood | str \| None | Survival likelihood formulation. One of | None |
| baseline_target | str \| None | Parametric baseline target for survival models. One of | None |
| baseline_scale | float \| None | Weibull baseline scale (>0) when | None |
| baseline_shape | float \| None | Weibull baseline shape (>0). Corresponds to | None |
| baseline_rate | float \| None | Gompertz hazard rate (>0) when | None |
| baseline_makeham | float \| None | Makeham additive hazard (>0) when | None |
| z_column | str \| None | Name of the latent/observed z-score column used by score-warp families and latent transformation models. Corresponds to | None |
| link | str \| None | Override the default link function. Corresponds to | None |
| logslope_formula | str \| None | Secondary formula for the logslope / score-warp submodel. Corresponds to | None |
| frailty_kind | str \| None | Frailty family for frailty-aware survival models. One of | None |
| response_geometry | str \| None | Optional manifold-valued response geometry. Use | None |
| response_columns | list[str] \| tuple[str, ...] \| None | Sequence of response component columns used when | None |
| response_coordinates | str \| None | Coordinate chart for simplex responses: | None |
| response_reference | int \| None | Reference component for | None |
| frailty_sd | float \| None | Fixed frailty standard deviation. Omit to let latent hazard-multiplier models learn it. Corresponds to | None |
| hazard_loading | str \| None | Hazard loading for | None |
| scale_dimensions | bool \| None | When | None |
| adaptive_regularization | bool \| None | Enable exact local adaptive regularization for compatible spatial smooths. Omit to use the quality-first automatic policy, which leaves it off unless explicitly requested. | None |
| firth | bool \| None | Enable Firth bias-reduced estimation. Corresponds to | None |
| config | dict[str, Any] \| None | Escape-hatch dict of extra pipeline keys. Any key already set via a dedicated kwarg wins over the same key in | None |
Returns:
| Type | Description |
|---|---|
| Model | A fitted model object with |
fit_array
fit_array(
X: Any,
Y: Any,
formula: str,
*,
family: str = "auto",
offset: str | None = None,
weights: str | None = None,
transformation_normal: bool | None = None,
survival_likelihood: str | None = None,
baseline_target: str | None = None,
baseline_scale: float | None = None,
baseline_shape: float | None = None,
baseline_rate: float | None = None,
baseline_makeham: float | None = None,
z_column: str | None = None,
link: str | None = None,
logslope_formula: str | None = None,
frailty_kind: str | None = None,
frailty_sd: float | None = None,
hazard_loading: str | None = None,
scale_dimensions: bool | None = None,
adaptive_regularization: bool | None = None,
firth: bool | None = None,
precision_hyperpriors: Any | None = None,
config: dict[str, Any] | None = None,
) -> Model
Fit directly from numeric NumPy-compatible arrays.
X is named x0, x1, ... at the formula boundary. A one-column
Y is named from the formula response; multi-column Y is named
y0, y1, ...
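The naming rule above can be sketched in plain Python. The helper `boundary_column_names` is hypothetical (it is not part of gamfit), and it assumes the response name is whatever precedes `~` in the formula; it only illustrates which names the array columns would take.

```python
def boundary_column_names(n_features, n_responses, formula):
    """Hypothetical helper: names that array columns take at the formula boundary."""
    x_names = [f"x{j}" for j in range(n_features)]
    if n_responses == 1:
        # Assumption: the response name is the text left of "~".
        y_names = [formula.split("~", 1)[0].strip()]
    else:
        y_names = [f"y{k}" for k in range(n_responses)]
    return x_names, y_names


x_names, y_names = boundary_column_names(3, 1, "y ~ x0 + x1 + x2")
print(x_names, y_names)
```

A formula passed to `fit_array` would therefore refer to features as `x0`, `x1`, and so on, whatever the columns were called upstream.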
load
Load a fitted :class:Model previously written with :meth:Model.save.
Reads the raw bytes from path and dispatches to :func:loads.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str or Path | Filesystem path to the serialized model file. | required |
Returns:
| Type | Description |
|---|---|
| Model | Fitted model ready for prediction. |
Raises:
| Type | Description |
|---|---|
| GamError | If the file cannot be decoded by the Rust engine. |
Examples:
loads
Load a fitted :class:Model from an in-memory bytes payload.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model_bytes | bytes | Raw serialized model produced by :meth: | required |
Returns:
| Type | Description |
|---|---|
| Model | Fitted model ready for prediction. |
Raises:
| Type | Description |
|---|---|
| GamError | If the payload is malformed or incompatible with the current engine. |
Examples:
load_posterior
Load a :class:PosteriorSamples archive from disk.
Thin wrapper around :meth:PosteriorSamples.load provided for symmetry
with :func:gamfit.load / :func:gamfit.fit at module level.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str or Path | Filesystem path to an | required |
Returns:
| Type | Description |
|---|---|
| PosteriorSamples | Reconstructed posterior draws and metadata. |
Examples:
competing_risks_cif
competing_risks_cif(
predictions: Mapping[str, "SurvivalPrediction"]
| Sequence["SurvivalPrediction"],
*,
times: Any,
endpoint_names: Sequence[str] | None = None,
) -> CompetingRisksCIF
Assemble competing-risks CIFs from cause-specific survival predictions.
cross_fit_shared_precision_groups
cross_fit_shared_precision_groups(
models: Sequence[Model] | Mapping[str, Model],
groups: Sequence[SharedPrecisionGroup | Mapping[str, Any]]
| Mapping[str, Any],
) -> dict[str, dict[str, Any]]
Compute EB precision updates shared across separately fitted models.
For each declared group p, the update is
lambda_p = (N_fits(p) * d_p + 2 * (a_p - 1)) / (sum_q_p + 2 * b_p),
where sum_q_p pools ||beta_p||² + tr(Sigma_pp) over models where
the selected term/column/label appears. If a model does not contain the
selected block, it is skipped for that group.
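The update can be checked by hand with a small sketch. The function names below are hypothetical, and the sketch assumes that d_p is the block dimension and that a_p, b_p act as Gamma-style shape and rate hyperparameters (defaults a = 1, b = 0 are chosen here only so the prior terms vanish); this reference does not spell those meanings out.

```python
def eb_precision_update(n_fits, d, sum_q, a=1.0, b=0.0):
    """lambda_p = (N_fits * d_p + 2 * (a_p - 1)) / (sum_q_p + 2 * b_p)."""
    denom = sum_q + 2.0 * b
    if denom <= 0.0:
        raise ValueError("sum_q + 2*b must be positive")
    return (n_fits * d + 2.0 * (a - 1.0)) / denom


def pooled_sum_q(blocks):
    """Pool ||beta_p||^2 + tr(Sigma_pp) over the models that contain the block.

    `blocks` holds one (beta, sigma_diag) pair per model; None marks a model
    without the selected block, which is skipped.
    """
    present = [blk for blk in blocks if blk is not None]
    total = sum(
        sum(v * v for v in beta) + sum(sigma_diag)
        for beta, sigma_diag in present
    )
    return len(present), total


n_fits, sum_q = pooled_sum_q([
    ([1.0, 0.0, 1.0], [0.5, 0.25, 0.25]),  # model A: ||beta||^2 = 2, trace = 1
    None,                                   # model B lacks the block
    ([0.0, 1.0, 0.0], [1.0, 0.5, 0.5]),    # model C: ||beta||^2 = 1, trace = 2
])
lam = eb_precision_update(n_fits, d=3, sum_q=sum_q)
print(n_fits, sum_q, lam)
```

Here two of three models contribute, so N_fits(p) = 2, sum_q_p = 6, and the update is (2 * 3) / 6 = 1.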
validate_formula
validate_formula(
data: Any,
formula: str,
*,
family: str = "auto",
offset: str | None = None,
weights: str | None = None,
transformation_normal: bool | None = None,
survival_likelihood: str | None = None,
baseline_target: str | None = None,
baseline_scale: float | None = None,
baseline_shape: float | None = None,
baseline_rate: float | None = None,
baseline_makeham: float | None = None,
z_column: str | None = None,
link: str | None = None,
logslope_formula: str | None = None,
frailty_kind: str | None = None,
frailty_sd: float | None = None,
hazard_loading: str | None = None,
scale_dimensions: bool | None = None,
adaptive_regularization: bool | None = None,
firth: bool | None = None,
config: dict[str, Any] | None = None,
) -> FormulaValidation
Validate a formula against a dataset without fitting.
Accepts every pipeline kwarg that :func:fit accepts, with identical
semantics. See :func:fit for parameter documentation.
build_info
Return build/runtime metadata for the Rust extension.
Reports whether gamfit._rust was importable and, when available, the
build-time information exposed by the extension (version, commit, feature
flags). Useful for bug reports and for confirming a development build is
being used.
Returns:
| Type | Description |
|---|---|
| dict | Always contains |
Examples:
cuda_diagnostics
Return CUDA loader diagnostics without forcing Rust GPU dispatch.
format_cuda_diagnostics
Return CUDA loader diagnostics as stable, grep-friendly text.
explain_error
Return a short, actionable hint describing how to recover from exc.
Inspects the exception type and returns a one-line suggestion tailored to
the gamfit error hierarchy (:class:FormulaError,
:class:SchemaMismatchError, :class:PredictionError, :class:GamError,
:class:RustExtensionUnavailableError). Unrecognized exceptions fall back
to a generic message.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| exc | BaseException | The exception caught from a gamfit call. | required |
Returns:
| Type | Description |
|---|---|
| str | Human-readable remediation hint. |
Examples:
Fitted model
Model
family_name
property
Human-readable family + link name (e.g. "Gaussian Identity").
model_class
property
Fitted model class string (e.g. "standard", "survival marginal-slope").
is_marginal_slope
property
True if this model was fit with a marginal-slope likelihood.
is_transformation_normal
property
True if this is a conditional transformation-normal model.
response_name
property
Name of the response column, inferred from the formula.
Returns None for survival formulas (Surv(...)) and other
cases where the left-hand side isn't a single identifier.
training_table_kind
property
The kind of table the model was fit on.
One of "pandas", "polars", "pyarrow", "numpy",
"mapping" (dict of columns), "records" (list of dicts),
"rows" (2-D sequence), or None if the input kind wasn't
retained. Used as a default return_type for :meth:predict
and :meth:diagnose.
group_metadata
property
Per-group metadata persisted with the fitted model, if present.
deployment_extensions
property
No-refit group extensions applied after fitting.
predict
predict(
data: Any,
*,
interval: float | None = None,
return_type: str | None = None,
id_column: str | None = None,
with_uncertainty: bool = False,
) -> Any
Predict from data.
Default return (when id_column and return_type are both
omitted) depends on the fitted model class:
- Gaussian / Binomial / Standard models: a table (dict, pandas DataFrame, pyarrow Table, ...) matching the training table kind with eta and mean columns (plus interval columns when interval is given).
- Transformation-normal models: a per-row transformed z-score as a 1-D numpy array of shape (n_samples,).
- Bernoulli marginal-slope: a calibrated probability vector in (0, 1) as a 1-D numpy array of shape (n_samples,).
- Survival models: a :class:SurvivalPrediction whose .hazard_at, .survival_at, .failure_at, and .cumulative_hazard_at helpers evaluate the fitted hazard surface on a user-supplied time grid.
Passing id_column or return_type switches the
array-returning model classes (transformation-normal and
Bernoulli marginal-slope) to the table form: a 2-column table
(id_column, "z" or "mean") rather than a bare 1-D array.
Naively flattening that table with np.asarray(...) /
.to_numpy() yields shape (n_samples, 2), which is a
common cause of silent broadcasting bugs in downstream metric
code that expects a 1-D probability vector. When you need the
probabilities as an array after asking for an id column, extract
the column explicitly, e.g. out["mean"] /
np.asarray(out["mean"], dtype=float).
with_uncertainty (survival only): when True, the returned
:class:SurvivalPrediction also carries delta-method standard
errors on the survival surface (survival_se) and the linear
predictor (eta_se). Only honored for the location-scale
survival likelihood mode; requesting with_uncertainty=True
with any other survival likelihood ("transformation",
"weibull", "marginal-slope", "latent",
"latent-binary") or with competing-risks survival models
raises an error.
predict_array
Predict directly from a numeric NumPy-compatible feature matrix.
Columns are named x0, x1, ... at the Rust formula boundary.
The return value is a dense NumPy array with columns ordered as
eta, mean, then any uncertainty columns.
summary
Return the model summary (coefficients, family, deviance, REML score).
Returns:
| Type | Description |
|---|---|
| Summary | A dict-like :class: |
Examples:
smoothing_parameters
Return fitted smoothing/precision parameters by penalty index.
check
Validate data against the model's training schema.
Inexpensive: runs the schema validator only, no prediction. Use
this before :meth:predict to surface column-name or type issues
as structured :class:SchemaIssue records rather than as a raised
:class:SchemaMismatchError.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | Any | Any table-like input (pandas DataFrame, dict of columns, list of records, numpy array, etc.). | required |
Returns:
| Type | Description |
|---|---|
| SchemaCheck | |
Examples:
report
Generate a standalone HTML report of the fitted model.
The report contains the summary table, smooth-term visualisations, and convergence diagnostics. It is self-contained (no external assets), so the file can be emailed or attached to a PR.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str \| Path \| None | If given, write the HTML to this path and return the path. If | None |
Returns:
| Type | Description |
|---|---|
| str | HTML string (when |
Examples:
sample
sample(
data: Any,
*,
samples: int | None = None,
warmup: int | None = None,
chains: int | None = None,
target_accept: float | None = None,
seed: int | None = None,
) -> PosteriorSamples
Draw from the model's posterior with NUTS.
Returns a :class:PosteriorSamples object carrying the raw
(n_draws, n_coeffs) numpy matrix, per-coefficient mean / std /
credible intervals, and convergence diagnostics (rhat,
ess, converged).
Defaults are dimension-aware — leaving every keyword unset gives
you a chain count, warmup length, and total sample budget tuned
to the fitted coefficient size (see
:func:gam::hmc::NutsConfig::for_dimension on the Rust side).
That heuristic already covers most usage; the keywords are there
for power users who want a longer run, a different acceptance
target, or a fixed seed for reproducibility.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | Any | Table-like input matching the model's training schema; the same input formats accepted by :meth: | required |
| samples | int \| None | Posterior draws per chain after warmup. When omitted, chosen automatically from the coefficient count. | None |
| warmup | int \| None | Warmup iterations per chain (defaults to | None |
| chains | int \| None | Number of independent chains. Defaults adaptively to 2 or 4. | None |
| target_accept | float \| None | Target HMC acceptance rate in | None |
| seed | int \| None | RNG seed for deterministic chain initialisation. | None |
Notes
Sampling currently supports standard GLM family models (Gaussian,
Binomial logit/probit/cloglog, Poisson, Gamma — with or without a
link-wiggle component) and survival likelihood modes other than
the latent and location-scale variants. Unsupported model
classes raise :class:gamfit.GamError with a message mirroring
the CLI's gam sample behaviour.
sample_paired
sample_paired(
competing: "Model",
data: Any,
competing_data: Any | None = None,
*,
samples: int | None = None,
warmup: int | None = None,
chains: int | None = None,
target_accept: float | None = None,
seed: int | None = None,
) -> PairedPosteriorSamples
Draw this fit and a linked competing fit with paired draw indices.
design_matrix
Materialised design matrix for data against the saved model.
Returns an (n_rows, n_coeffs) numpy array — exactly the
matrix the engine uses internally for linear-predictor
evaluation. Useful for custom posterior reasoning (e.g.
feeding draws into your own predictive routine) or for
debugging term layouts.
Currently restricted to standard non-link-wiggle GAM models;
other classes raise a clear error pointing at
:meth:Model.predict for the class-specific prediction path.
design_matrix_array
Materialised design matrix for a numeric feature matrix.
predict_with_coverage
predict_with_coverage(
rows: Any, *, coverage: float = 0.95
) -> tuple[Any, Any, Any, dict[str, Any]]
Predict with covariance-based confidence intervals and group attribution.
Returns (point, lower, upper, per_group_variance_contributions).
The first three entries are numpy arrays on the response-mean scale.
The fourth entry is a covariance-block variance decomposition:
per-group arrays contain x_g' Cov(beta_g, beta_g) x_g and
cross-term arrays contain 2 x_g' Cov(beta_g, beta_h) x_h.
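The block decomposition can be illustrated for a single prediction row. This is a sketch on a toy covariance, not gamfit's implementation: the helper names and the `"g:h"` key format for cross-terms are assumptions made here, and the quantities are computed on the linear-predictor scale.

```python
def quad(x, cov, y):
    """Return x' cov y for list-based vectors and a nested-list matrix."""
    return sum(x[i] * cov[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))


def sub(cov, rows, cols):
    """Extract the covariance block indexed by rows and cols."""
    return [[cov[i][j] for j in cols] for i in rows]


def variance_contributions(x, cov, groups):
    """Split x' cov x into per-group and pairwise cross-term pieces."""
    names = list(groups)
    out = {}
    for a, g in enumerate(names):
        ig = groups[g]
        xg = [x[i] for i in ig]
        out[g] = quad(xg, sub(cov, ig, ig), xg)          # x_g' Cov(b_g, b_g) x_g
        for h in names[a + 1:]:
            ih = groups[h]
            xh = [x[i] for i in ih]
            out[f"{g}:{h}"] = 2.0 * quad(xg, sub(cov, ig, ih), xh)  # cross-term
    return out


x = [1.0, 2.0, -1.0]
cov = [[0.5, 0.1, 0.0],
       [0.1, 0.4, 0.2],
       [0.0, 0.2, 0.3]]
groups = {"smooth": [0, 1], "site": [2]}
parts = variance_contributions(x, cov, groups)
total = quad(x, cov, x)
print(parts, total)
```

The per-group and cross-term pieces sum to the full quadratic form x' Cov x, which is why negative cross-terms can appear even though each per-group piece is non-negative.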
difference_smooth
difference_smooth(
*,
view: str,
group: str | None = None,
pairs: Sequence[tuple[Any, Any]] | None = None,
n: int = 100,
level: float = 0.95,
simultaneous: bool = False,
n_sim: int = 10000,
seed: int | None = 12345,
marginalise_random: bool = True,
group_means: bool = True,
data: Any | None = None,
return_type: str | None = None,
) -> Any
Covariance-aware pairwise difference smooths.
Builds two model matrices on a grid, subtracts them, and uses the
fitted joint coefficient covariance for pointwise bands. With
simultaneous=True the band critical value is estimated from
posterior coefficient simulation using the max standardized deviation
across the whole grid.
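The simultaneous critical value can be sketched as a quantile of simulated maxima. The code below is an illustration under simplifying assumptions, not the library's routine: coefficients are drawn independently (a real fit has a full joint covariance), and `simultaneous_critical_value` is a hypothetical name.

```python
import math
import random


def simultaneous_critical_value(diff_rows, coef_sd, level=0.95, n_sim=2000, seed=12345):
    """Estimate the band multiplier from the max standardized deviation.

    diff_rows: difference of the two model matrices, one row per grid point.
    coef_sd:   coefficient standard deviations (independence assumed here).
    """
    rng = random.Random(seed)
    # Pointwise standard error of the difference at each grid point.
    se = [math.sqrt(sum((r * s) ** 2 for r, s in zip(row, coef_sd))) for row in diff_rows]
    maxima = []
    for _ in range(n_sim):
        delta = [rng.gauss(0.0, s) for s in coef_sd]
        dev = [sum(r * d for r, d in zip(row, delta)) for row in diff_rows]
        maxima.append(max(abs(v) / s for v, s in zip(dev, se)))
    maxima.sort()
    return maxima[min(n_sim - 1, int(math.ceil(level * n_sim)) - 1)]


grid = [i / 4.0 for i in range(5)]
diff_rows = [[1.0, x] for x in grid]  # intercept and slope difference
crit = simultaneous_critical_value(diff_rows, coef_sd=[0.3, 0.5])
print(round(crit, 2))
```

The resulting multiplier is larger than the pointwise normal quantile because it has to cover the worst grid point in each simulated draw.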
save
extend_with_group
extend_with_group(
new_group_spec: dict[str, Any],
metadata: Any | None = None,
prior: Any | None = None,
) -> "Model"
Return a no-refit model extended with deployment-time group levels.
new_group_spec currently targets an existing random-effect term:
{"kind": "random-effect-level", "term": "group_term", "level": "new"}
or {"term": "group_term", "levels": ["a", "b"]}. The returned
model reuses the fitted coefficients and inserts zero-initialized
coefficients, or prior["mean"] / prior["mu"] when supplied.
dumps
diagnose
Score the fitted model on held-out data.
Calls :meth:predict on the feature columns of data and
compares the result against the observed response, packaging
the prediction, residuals, observed values, and (when
requested) Wald bands into a :class:Diagnostics object.
Useful for ad-hoc held-out checks and for feeding the
:meth:plot method.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | table-like | Any table-like input accepted by :meth: | required |
| y | str | Name of the response column. Defaults to :attr: | None |
| interval | float or None | Pointwise Wald-interval probability passed through to :meth: | 0.95 |
Returns:
| Type | Description |
|---|---|
| Diagnostics | A :class: |
Raises:
| Type | Description |
|---|---|
| ValueError | If the response column cannot be inferred or is missing from |
Examples:
>>> diag = model.diagnose(test_df)
>>> diag.rmse, diag.r_squared
(0.42, 0.81)
>>> diag.predicted["mean"][:3]
[1.04, 1.21, 0.99]
See Also
Model.predict
Model.plot
plot
plot(
data: Any,
*,
x: str | None = None,
y: str | None = None,
interval: float | None = 0.95,
kind: str = "prediction",
ax: Any | None = None,
) -> Any
Plot the model's behaviour on data with matplotlib.
Runs :meth:diagnose against data and then renders one of
three standard diagnostic plots onto a matplotlib Axes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | table-like | Held-out data with the response column present (same requirements as :meth: | required |
| x | str | Feature column to plot on the x-axis when | None |
| y | str | Response column name. Defaults to :attr: | None |
| interval | float or None | Pointwise Wald-interval probability for the shaded band on prediction plots. Ignored for | 0.95 |
| kind | ('prediction', 'residuals', 'observed_vs_predicted') | | "prediction" |
| ax | Axes | Existing axes to draw onto. When omitted, a fresh | None |
Returns:
| Type | Description |
|---|---|
| Axes | The axes that were drawn on. |
Raises:
| Type | Description |
|---|---|
| ValueError | If |
Examples:
>>> model.plot(test_df) # prediction with band
>>> model.plot(test_df, kind="residuals")
>>> ax = model.plot(test_df, kind="observed_vs_predicted")
>>> ax.set_title("Calibration on held-out fold")
See Also
Model.diagnose
Model.predict
SurvivalPrediction
dataclass
SurvivalPrediction(
model_class: str,
parameters: Any,
parameter_names: Sequence[str] = tuple(),
times: Any | None = None,
hazard: Any | None = None,
survival: Any | None = None,
cumulative_hazard: Any | None = None,
linear_predictor: Any | None = None,
id_column: str | None = None,
row_ids: Sequence[str] | None = None,
survival_se: Any | None = None,
eta_se: Any | None = None,
)
Per-row survival functions evaluated on demand.
Returned by :meth:Model.predict for survival-family models. The
*_at helpers (:meth:hazard_at, :meth:cumulative_hazard_at,
:meth:survival_at, :meth:failure_at) evaluate the fitted hazard
surface at any user-supplied time grid.
When the FFI produced a dense (n_samples, n_times) grid of
hazard / survival / cumulative-hazard values, the *_at helpers
linearly interpolate against that grid. Otherwise they fall back to
the legacy plug-in piecewise-constant hazard reconstructed from
parameters so bare-dataclass construction keeps working.
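A minimal sketch of the dense-grid path, for one prediction row, is shown below. It assumes queries outside the grid are clamped to the end values; whether the library clamps or extrapolates is not stated in this reference, and `interp_row` is a hypothetical helper.

```python
from bisect import bisect_right


def interp_row(grid_times, grid_values, query_times):
    """Linearly interpolate one row of a dense surface at the query times."""
    out = []
    for t in query_times:
        if t <= grid_times[0]:
            out.append(grid_values[0])       # clamp below the grid (assumption)
        elif t >= grid_times[-1]:
            out.append(grid_values[-1])      # clamp above the grid (assumption)
        else:
            hi = bisect_right(grid_times, t)
            lo = hi - 1
            w = (t - grid_times[lo]) / (grid_times[hi] - grid_times[lo])
            out.append((1.0 - w) * grid_values[lo] + w * grid_values[hi])
    return out


times = [0.0, 1.0, 2.0, 4.0]
survival_row = [1.0, 0.8, 0.6, 0.4]
values = interp_row(times, survival_row, [0.5, 3.0, 10.0])
print(values)
```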
For very large queries (n_rows * n_times exceeds roughly one
million cells), the *_at helpers internally evaluate the surface
in blocks via the matching *_at_chunks generator and then
assemble the dense result; callers that want to avoid the dense
allocation can iterate the chunk generators directly or stream a CSV
with :meth:write_survival_at_csv.
Attributes:
| Name | Type | Description |
|---|---|---|
| model_class | str | The fitted model class string (e.g. |
| parameters | ndarray | Flat per-row parameters returned by the FFI. Shape |
| parameter_names | tuple of str | Column names corresponding to |
| times | ndarray or None | Shared 1-D time grid at which the hazard surfaces were evaluated. |
| hazard | ndarray or None | |
| survival | ndarray or None | |
| cumulative_hazard | ndarray or None | |
| linear_predictor | ndarray or None | |
| id_column | str or None | Optional name of the id column carried through from :meth: |
| row_ids | sequence of str or None | Per-row identifiers aligned with |
| survival_se | ndarray or None | |
| eta_se | ndarray or None | |
Examples:
>>> import numpy as np
>>> pred = model.predict(test_df) # survival model
>>> times = np.linspace(0.0, 10.0, 50)
>>> S = pred.survival_at(times) # (n_rows, 50) ndarray
>>> h = pred.hazard_at(times)
>>> H = pred.cumulative_hazard_at(times)
See Also
Model.predict : Returns a :class:SurvivalPrediction for survival models.
hazard_at
Evaluate the hazard rate h(t) at each requested time.
When the FFI produced a dense hazard surface this linearly interpolates against the returned grid; otherwise the hazard is reconstructed from the cumulative-hazard differences. Large requests are evaluated in chunks internally before assembling the dense result.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| times | array_like | 1-D sequence of finite, non-negative times at which to evaluate the per-row hazard. | required |
Returns:
| Type | Description |
|---|---|
| ndarray | |
Examples:
>>> import numpy as np
>>> pred = model.predict(test_df)
>>> h = pred.hazard_at(np.linspace(0.0, 5.0, 11))
>>> h.shape
(len(test_df), 11)
See Also
SurvivalPrediction.hazard_at_chunks : streaming chunked variant.
SurvivalPrediction.cumulative_hazard_at
cumulative_hazard_at
Evaluate the cumulative hazard H(t) = -log S(t).
When the FFI provided a dense cumulative-hazard surface this
interpolates against it directly; otherwise H(t) is derived
from :meth:survival_at via -log S(t) (clipped away from
zero for numerical safety).
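The fallback identity can be written in a few lines. The floor of 1e-12 is an assumption made for this sketch; the reference only says that S(t) is clipped away from zero.

```python
import math


def cumulative_hazard_from_survival(survival, floor=1e-12):
    """H(t) = -log S(t), with S clipped into [floor, 1] before the log."""
    return [0.0 - math.log(min(1.0, max(floor, s))) for s in survival]


H = cumulative_hazard_from_survival([1.0, math.exp(-0.5), 0.0])
print(H)
```

Without the clip, a survival value of exactly zero would give an infinite cumulative hazard; with it, the result stays finite.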
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| times | array_like | 1-D sequence of finite, non-negative times. | required |
Returns:
| Type | Description |
|---|---|
| ndarray | |
Examples:
>>> import numpy as np
>>> H = pred.cumulative_hazard_at(np.array([1.0, 2.0, 5.0]))
>>> np.all(np.diff(H, axis=1) >= 0) # monotone non-decreasing
True
See Also
SurvivalPrediction.survival_at
SurvivalPrediction.hazard_at
survival_at
Evaluate the survival probability S(t) at each requested time.
When the FFI produced a dense hazard/survival surface this
linearly interpolates against the returned grid. Otherwise it
falls back to the plug-in identity S(t) = exp(-H(t)) using
a per-row piecewise-constant hazard derived from
parameters (supports bare-dataclass construction). Large
requests are evaluated in chunks internally before assembling
the dense result.
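The plug-in identity S(t) = exp(-H(t)) for a piecewise-constant hazard can be sketched as follows. How the per-row hazard pieces are derived from `parameters` is not described in this reference, so the `breaks` and `rates` inputs here are hypothetical.

```python
import math


def survival_piecewise_constant(breaks, rates, times):
    """S(t) = exp(-H(t)) for a hazard that is constant on each interval.

    breaks: increasing interval starts, beginning at 0.0.
    rates:  hazard on [breaks[k], breaks[k + 1]); the last rate extends onward.
    """
    out = []
    for t in times:
        cum = 0.0
        for k, rate in enumerate(rates):
            start = breaks[k]
            end = breaks[k + 1] if k + 1 < len(breaks) else math.inf
            if t <= start:
                break
            cum += rate * (min(t, end) - start)  # hazard mass accrued in this piece
        out.append(math.exp(-cum))
    return out


S = survival_piecewise_constant([0.0, 1.0], [0.5, 1.0], [0.0, 1.0, 3.0])
print(S)
```

With hazard 0.5 on [0, 1) and 1.0 afterwards, H(1) = 0.5 and H(3) = 2.5, so S(0) = 1, S(1) = exp(-0.5), and S(3) = exp(-2.5).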
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| times | array_like | 1-D sequence of finite, non-negative times. | required |
Returns:
| Type | Description |
|---|---|
| ndarray | |
Examples:
>>> import numpy as np
>>> times = np.linspace(0.0, 5.0, 6)
>>> S = pred.survival_at(times)
>>> S[:, 0] # S(0) is 1 for every row
array([1., 1., ..., 1.])
See Also
SurvivalPrediction.failure_at : returns 1 - S(t).
SurvivalPrediction.survival_se_at : delta-method standard error.
SurvivalPrediction.survival_at_chunks : streaming chunked variant.
failure_at
Evaluate the failure (event) probability F(t) = 1 - S(t).
Convenience wrapper around :meth:survival_at; the output is
clipped to [0, 1] to guard against tiny interpolation
excursions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| times | array_like | 1-D sequence of finite, non-negative times. | required |
Returns:
| Type | Description |
|---|---|
| ndarray | |
Examples:
See Also
SurvivalPrediction.survival_at
survival_se_at
Delta-method standard error on S(t) at each requested time.
Returns None when the prediction was not issued with
with_uncertainty=True (or the model class does not yet
support response-scale uncertainty). When available, the
returned array has shape (n_samples, len(times)) and is
clipped to be non-negative.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| times | array_like | 1-D sequence of finite, non-negative times. | required |
Returns:
| Type | Description |
|---|---|
| ndarray or None | |
Notes
Pair with :meth:survival_at for response-scale Wald-style
bands: S +/- z * SE with the standard caveats around the
Gaussian approximation near the [0, 1] boundaries.
Examples:
>>> pred = model.predict(test_df, with_uncertainty=True)
>>> S = pred.survival_at([1.0, 2.0])
>>> SE = pred.survival_se_at([1.0, 2.0])
>>> lower = (S - 1.96 * SE).clip(0.0, 1.0)
See Also
SurvivalPrediction.survival_at
Model.predict : pass with_uncertainty=True to populate this.
survival_at_chunks
survival_at_chunks(
times: Any,
*,
people_chunk: int = DEFAULT_SURVIVAL_PEOPLE_CHUNK,
time_grid_chunk: int = DEFAULT_SURVIVAL_TIME_GRID_CHUNK,
) -> Any
Yield S(t) evaluations in row/time blocks.
Streaming counterpart to :meth:survival_at for queries large
enough that the dense (n_samples, len(times)) allocation is
unwelcome. Each yielded block can be consumed (written to disk,
reduced, fed into a metric) and discarded before the next one
is produced.
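The block structure can be pictured with a toy generator. This is a sketch only: `iter_blocks` and `survival_blocks` are hypothetical names, the rows-outer, times-inner ordering is an assumption, and a simple exponential model with plain lists stands in for the fitted surface and its ndarray blocks.

```python
import math


def iter_blocks(n_rows, n_times, people_chunk, time_grid_chunk):
    """Yield (row_slice, time_slice) pairs that tile an n_rows x n_times grid."""
    for r0 in range(0, n_rows, people_chunk):
        rows = slice(r0, min(r0 + people_chunk, n_rows))
        for t0 in range(0, n_times, time_grid_chunk):
            yield rows, slice(t0, min(t0 + time_grid_chunk, n_times))


def survival_blocks(rates, times, people_chunk=2, time_grid_chunk=3):
    """Yield (row_slice, time_slice, block) for a toy exponential model."""
    for rows, cols in iter_blocks(len(rates), len(times), people_chunk, time_grid_chunk):
        block = [[math.exp(-lam * t) for t in times[cols]] for lam in rates[rows]]
        yield rows, cols, block


rates = [0.1, 0.2, 0.3, 0.4, 0.5]
times = [0.0, 0.5, 1.0, 1.5]
covered = 0
total = 0.0
for rows, cols, block in survival_blocks(rates, times):
    covered += sum(len(r) for r in block)   # cells seen so far
    total += sum(sum(r) for r in block)     # running reduction, block discarded after
print(covered, round(total, 4))
```

Only one block is alive at a time, so peak memory is bounded by people_chunk * time_grid_chunk cells rather than the full grid.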
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| times | array_like | 1-D sequence of finite, non-negative times. | required |
| people_chunk | int | Maximum number of rows per yielded block. Defaults to | DEFAULT_SURVIVAL_PEOPLE_CHUNK |
| time_grid_chunk | int | Maximum number of time points per yielded block. Defaults to | DEFAULT_SURVIVAL_TIME_GRID_CHUNK |
Yields:
| Type | Description |
|---|---|
| tuple of (slice, slice, ndarray) | |
Examples:
>>> import numpy as np
>>> times = np.linspace(0.0, 10.0, 200)
>>> total = 0.0
>>> for _r, _t, block in pred.survival_at_chunks(times):
... total += float(block.sum())
See Also
SurvivalPrediction.survival_at
SurvivalPrediction.write_survival_at_csv
cumulative_hazard_at_chunks
cumulative_hazard_at_chunks(
times: Any,
*,
people_chunk: int = DEFAULT_SURVIVAL_PEOPLE_CHUNK,
time_grid_chunk: int = DEFAULT_SURVIVAL_TIME_GRID_CHUNK,
) -> Any
Yield H(t) evaluations in row/time blocks.
Streaming counterpart to :meth:cumulative_hazard_at. When the
FFI provided a dense cumulative-hazard surface this iterates
that surface directly; otherwise it derives H(t) from each
survival block returned by :meth:survival_at_chunks.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| times | array_like | 1-D sequence of finite, non-negative times. | required |
| people_chunk | int | Maximum number of rows per yielded block. Defaults to | DEFAULT_SURVIVAL_PEOPLE_CHUNK |
| time_grid_chunk | int | Maximum number of time points per yielded block. Defaults to | DEFAULT_SURVIVAL_TIME_GRID_CHUNK |
Yields:
| Type | Description |
|---|---|
| tuple of (slice, slice, ndarray) | |
Examples:
>>> for r, t, H_block in pred.cumulative_hazard_at_chunks(times):
... handle.write(H_block.tobytes())
See Also
SurvivalPrediction.cumulative_hazard_at
SurvivalPrediction.survival_at_chunks
hazard_at_chunks
hazard_at_chunks(
times: Any,
*,
people_chunk: int = DEFAULT_SURVIVAL_PEOPLE_CHUNK,
time_grid_chunk: int = DEFAULT_SURVIVAL_TIME_GRID_CHUNK,
) -> Any
Yield h(t) evaluations in row/time blocks.
Streaming counterpart to :meth:hazard_at. When the FFI
provided a dense hazard surface this iterates that surface
directly; otherwise the hazard is derived from successive
cumulative-hazard blocks, carrying the previous block's tail
forward so the finite-difference at each block boundary stays
consistent with the non-chunked :meth:hazard_at result.
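Carrying the previous block's tail forward can be sketched for a single row. The backward-difference scheme and the (0, 0) starting point are assumptions made for this illustration; the reference does not specify which finite difference is used, and both function names are hypothetical.

```python
def hazard_from_cumulative(times, cum_hazard, prev=None):
    """Backward-difference hazard for one row.

    prev is the (time, H) pair just before this block; without it the first
    point is differenced against (0, 0).
    """
    t_prev, h_prev = prev if prev is not None else (0.0, 0.0)
    out = []
    for t, h in zip(times, cum_hazard):
        dt = t - t_prev
        out.append((h - h_prev) / dt if dt > 0.0 else 0.0)
        t_prev, h_prev = t, h
    return out


def hazard_chunked(times, cum_hazard, time_grid_chunk):
    """Same result as hazard_from_cumulative, computed block by block."""
    out, prev = [], None
    for start in range(0, len(times), time_grid_chunk):
        stop = start + time_grid_chunk
        out.extend(hazard_from_cumulative(times[start:stop], cum_hazard[start:stop], prev))
        last = min(stop, len(times)) - 1
        prev = (times[last], cum_hazard[last])  # tail carried into the next block
    return out


times = [0.5, 1.0, 1.5, 2.0, 2.5]
cum_hazard = [t * t for t in times]  # toy H(t) = t^2
dense = hazard_from_cumulative(times, cum_hazard)
chunked = hazard_chunked(times, cum_hazard, time_grid_chunk=2)
print(dense == chunked)
```

Dropping the carried tail would difference the first point of every block against the origin, which is the inconsistency the streaming variant is described as avoiding.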
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| times | array_like | 1-D sequence of finite, non-negative times. | required |
| people_chunk | int | Maximum number of rows per yielded block. Defaults to | DEFAULT_SURVIVAL_PEOPLE_CHUNK |
| time_grid_chunk | int | Maximum number of time points per yielded block. Defaults to | DEFAULT_SURVIVAL_TIME_GRID_CHUNK |
Yields:
| Type | Description |
|---|---|
| tuple of (slice, slice, ndarray) | |
Examples:
>>> peak = 0.0
>>> for _r, _t, h_block in pred.hazard_at_chunks(times):
... peak = max(peak, float(h_block.max()))
See Also
SurvivalPrediction.hazard_at SurvivalPrediction.cumulative_hazard_at_chunks
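The tail-carrying idea described above can be sketched for a single prediction row. This is an illustration only: it assumes a backward finite difference (the exact differencing scheme used by gamfit is not stated here), and both function names are hypothetical.

```python
def hazard_non_chunked(times, H):
    """Backward difference of H over the whole grid, one value per interval."""
    return [(H[j] - H[j - 1]) / (times[j] - times[j - 1])
            for j in range(1, len(times))]

def hazard_chunked(times, H, time_grid_chunk):
    """Same differences, block by block, carrying the previous block's tail."""
    prev = None  # (time, H) of the last point seen in the previous block
    for start in range(0, len(times), time_grid_chunk):
        stop = min(start + time_grid_chunk, len(times))
        out = []
        for j in range(start, stop):
            if prev is not None:
                # At a block boundary `prev` is the carried tail, so the
                # difference spans the boundary exactly as in the full pass.
                out.append((H[j] - prev[1]) / (times[j] - prev[0]))
            prev = (times[j], H[j])
        yield out

times = [0.0, 1.0, 2.0, 4.0, 5.0]
H = [0.0, 0.1, 0.3, 0.9, 1.0]
full = hazard_non_chunked(times, H)
streamed = [h for block in hazard_chunked(times, H, 2) for h in block]
```

Without the carried tail, the first point of every block after the first would have no predecessor and the streamed result would drop one interval per block.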
write_survival_at_csv
¶
write_survival_at_csv(
path: str | Path,
times: Any,
*,
people_chunk: int = DEFAULT_SURVIVAL_PEOPLE_CHUNK,
time_grid_chunk: int = DEFAULT_SURVIVAL_TIME_GRID_CHUNK,
) -> str
Stream survival predictions to a CSV file.
Iterates :meth:survival_at_chunks and writes one row per
(prediction_row, time) pair, avoiding materialising the full
(n_samples, len(times)) matrix in memory. When the
prediction was issued with an id_column (via
:meth:Model.predict), that column is included.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str or Path | Destination CSV file. Overwritten if it already exists. | required |
| times | array_like | 1-D sequence of finite, non-negative times at which to evaluate | required |
| people_chunk | int | Maximum number of rows per internal block. Defaults to DEFAULT_SURVIVAL_PEOPLE_CHUNK. | DEFAULT_SURVIVAL_PEOPLE_CHUNK |
| time_grid_chunk | int | Maximum number of time points per internal block. Defaults to DEFAULT_SURVIVAL_TIME_GRID_CHUNK. | DEFAULT_SURVIVAL_TIME_GRID_CHUNK |
Returns:
| Type | Description |
|---|---|
| str | The string form of |
Notes
Columns written are row, time, survival (or
row, <id_column>, time, survival when an id column is
present). The file is opened in text mode with UTF-8 encoding.
Examples:
>>> import numpy as np
>>> pred = model.predict(test_df, id_column="patient_id")
>>> pred.write_survival_at_csv(
... "survival.csv", np.linspace(0.0, 10.0, 64)
... )
'survival.csv'
See Also
SurvivalPrediction.survival_at_chunks
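The long-format layout described in the Notes (one CSV row per prediction row and time) can be sketched with the standard library. The helper below is hypothetical and only mirrors the documented column order; the zero-based `row` index is an assumption.

```python
import csv
import io

def write_long_csv(handle, times, survival_rows, ids=None, id_column="id"):
    """Write one CSV line per (prediction_row, time) pair."""
    writer = csv.writer(handle)
    if ids is None:
        writer.writerow(["row", "time", "survival"])
    else:
        writer.writerow(["row", id_column, "time", "survival"])
    for i, row in enumerate(survival_rows):
        for t, s in zip(times, row):
            # Emit each pair immediately instead of building the full matrix.
            writer.writerow([i, t, s] if ids is None else [i, ids[i], t, s])

buf = io.StringIO()
write_long_csv(
    buf,
    times=[0.0, 5.0],
    survival_rows=[[1.0, 0.8], [1.0, 0.6]],
    ids=["a", "b"],
    id_column="patient_id",
)
lines = buf.getvalue().splitlines()
```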
CompetingRisksPrediction
dataclass
¶
CompetingRisksPrediction(
model_class: str,
likelihood_mode: str,
endpoint_names: tuple[str, ...],
times: Any,
hazard: Any,
survival: Any,
cumulative_hazard: Any,
cif: Any,
overall_survival: Any,
linear_predictor: Any,
columns: dict[str, list[float]],
)
Rust-computed joint cause-specific competing-risks prediction.
CompetingRisksCIF
dataclass
¶
CompetingRisksCIF(
times: Any,
cif: Any,
overall_survival: Any,
cumulative_hazard: Any,
endpoint_names: tuple[str, ...],
)
Cause-specific cumulative incidence assembled by the Rust core.
Posterior sampling¶
SamplingConfig
dataclass
¶
Echo of the NUTS configuration the engine ran with.
All fields are populated from the FFI payload so callers can reconstruct
exactly which sampler invocation produced the draws — useful for
reproducibility logs and for telling whether an explicit samples=...
request was honored or auto-derived from the model dimension.
Attributes:
| Name | Type | Description |
|---|---|---|
| n_samples | int | Post-warmup draws kept per chain. |
| n_warmup | int | Warmup draws discarded per chain before collecting |
| n_chains | int | Number of independent NUTS chains run by the engine. |
| target_accept | float | Step-size adaptation target acceptance probability in |
| seed | int | RNG seed actually consumed by the sampler. |
Examples:
>>> post = model.sample(samples=500)
>>> post.config.n_samples
500
>>> post.config.target_accept
0.95
to_dict
¶
PosteriorSamples
dataclass
¶
PosteriorSamples(
samples: Any,
coefficient_names: tuple[str, ...],
mean: Any,
std: Any,
rhat: float,
ess: float,
converged: bool,
method: str,
model_class: str,
family_kind: str,
config: SamplingConfig,
_model_bytes: bytes = _NO_MODEL,
_name_index: Mapping[str, int] = dict(),
)
Posterior draws over the model's coefficient vector.
Returned by :meth:gamfit.Model.sample. This is the user-facing
surface for posterior reasoning: a numpy-first container with
named-column subscripting, credible-interval helpers, posterior
predictive utilities, .save / :meth:load round-trip, trace
plotting, a concise :meth:__repr__, and a notebook-friendly
rich-HTML representation (_repr_html_) that delegates to
:meth:summary.
Attributes:
| Name | Type | Description |
|---|---|---|
| samples | ndarray | |
| coefficient_names | tuple[str, ...] | Column labels for |
| mean | ndarray | Per-coefficient posterior mean reported by the sampler. |
| std | ndarray | Per-coefficient posterior standard deviation reported by the sampler. |
| rhat | float | Maximum split-Rhat across coefficients (exact NUTS only; |
| ess | float | Minimum effective sample size across coefficients. |
| converged | bool | Boolean convenience for |
| method | str | |
| model_class | str | Saved-model predictive class string the draws came from. |
| family_kind | str | Inverse-link tag ( |
| config | SamplingConfig | :class: |
Examples:
>>> post = model.sample(samples=1000, warmup=1000, chains=4)
>>> post.n_draws, post.n_coeffs
(4000, 12)
>>> post["x1"].mean()
0.342
>>> bands = post.predict(new_data, level=0.9)
>>> post.save("posterior.npz")
n_draws
property
¶
n_coeffs
property
¶
shape
property
¶
is_exact
property
¶
from_ffi_payload
classmethod
¶
from_ffi_payload(
payload: Mapping[str, Any], *, model_bytes: bytes = _NO_MODEL
) -> "PosteriorSamples"
Internal factory: build a :class:PosteriorSamples from the FFI payload.
Used by :meth:gamfit.Model.sample to wrap the dict produced by
the Rust sampler. End users should not call this directly.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| payload | Mapping[str, Any] | Decoded FFI JSON payload. Must contain | required |
| model_bytes | bytes | Saved-model byte blob to bundle so downstream methods like :meth: | _NO_MODEL |
Returns:
| Type | Description |
|---|---|
| PosteriorSamples | Reified posterior with samples reshaped to |
Notes
samples_flat is sent flat (row-major) so we round-trip
through numpy.reshape once. Building a nested list of
lists from JSON would otherwise dominate decode time for
biobank-scale draws.
from_ffi_json
classmethod
¶
Internal factory: build a :class:PosteriorSamples from a raw FFI JSON string.
Thin convenience around :meth:from_ffi_payload that decodes
the JSON itself. Used by :meth:gamfit.Model.sample; not
intended as a public API.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| raw | str | JSON-encoded FFI payload from the Rust sampler. | required |
| model_bytes | bytes | Saved-model byte blob bundled into the returned object. | _NO_MODEL |
Returns:
| Type | Description |
|---|---|
| PosteriorSamples | Same as :meth: |
to_numpy
¶
to_pandas
¶
interval
¶
Equal-tailed credible interval for each coefficient.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| level | float | Coverage probability in | 0.95 |
Returns:
| Type | Description |
|---|---|
| ndarray | |
Raises:
| Type | Description |
|---|---|
| ValueError | If |
Examples:
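An equal-tailed interval at coverage `level` takes the `(1 - level) / 2` and `1 - (1 - level) / 2` quantiles of the draws. The following is a standard-library sketch for one coefficient, not the gamfit implementation; the linear-interpolation quantile rule and the open interval (0, 1) check on `level` are assumptions.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac

def equal_tailed_interval(draws, level=0.95):
    """Return (lower, upper) leaving equal probability mass in each tail."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1")
    alpha = (1.0 - level) / 2.0
    ordered = sorted(draws)
    return quantile(ordered, alpha), quantile(ordered, 1.0 - alpha)

draws = [float(i) for i in range(101)]  # stand-in draws 0, 1, ..., 100
lower, upper = equal_tailed_interval(draws, level=0.9)
```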
summary
¶
Per-coefficient posterior summary as a :class:Summary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| level | float | Coverage probability for the credible interval columns, in | 0.95 |
Returns:
| Type | Description |
|---|---|
| Summary | Coefficient rows ( |
Notes
The payload mirrors what :meth:gamfit.Model.summary returns
for fitted models, so downstream rendering helpers work
uniformly on both fitted and sampled views.
Examples:
predict
¶
Posterior credible bands for eta and E[y | x] on new data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| new_data | Any | Tabular new data (DataFrame, dict of columns, or any object accepted by the engine's table normaliser) at which to evaluate the posterior fitted means. | required |
| chunk_size | int or None | Number of prediction rows processed at once. | 4096 |
| level | float | Coverage probability for the credible bands in | 0.95 |
Returns:
| Type | Description |
|---|---|
| dict[str, ndarray] | Six length- |
Raises:
| Type | Description |
|---|---|
| RuntimeError | If this :class: |
| NotImplementedError | For model classes lacking a closed-form design matrix (e.g. link-wiggle, survival); use :meth: |
Notes
Walks chunks of rows through draws @ X.T and reduces each
chunk to quantiles immediately, so memory stays bounded at
roughly n_draws * chunk_size floats regardless of the
prediction-set size. For Laplace-method posteriors the
returned bands match what
model.predict(new_data, interval=level) produces
analytically, up to Monte Carlo error.
Examples:
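The chunked reduction described in the Notes can be illustrated without gamfit. The sketch below forms `draws @ X.T` one row chunk at a time and immediately collapses each chunk to link-scale quantiles; the function names, the plain-list data layout, and the interpolation rule are assumptions made for illustration.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac

def chunked_eta_bands(draws, X, level=0.95, chunk_size=4096):
    """Per-row (lower, upper) bands for eta, holding one chunk in memory."""
    alpha = (1.0 - level) / 2.0
    lower, upper = [], []
    for start in range(0, len(X), chunk_size):
        chunk = X[start:start + chunk_size]
        # (n_draws x chunk_rows) block of linear predictors: draws @ chunk.T
        eta = [[sum(b * v for b, v in zip(beta, x)) for x in chunk]
               for beta in draws]
        for j in range(len(chunk)):
            column = sorted(row[j] for row in eta)
            lower.append(quantile(column, alpha))
            upper.append(quantile(column, 1.0 - alpha))
        # `eta` is discarded here, before the next chunk is built.
    return lower, upper

draws = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]   # 3 draws, 2 coefficients
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]       # 3 prediction rows
small = chunked_eta_bands(draws, X, level=0.5, chunk_size=2)
large = chunked_eta_bands(draws, X, level=0.5, chunk_size=10)
```

The chunk size changes only peak memory, not the result: each row's band depends on that row's column of draws alone.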
predict_draws
¶
Full posterior fitted-mean draws on new data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| new_data | Any | Tabular new data (DataFrame, dict of columns, or any object accepted by the engine's table normaliser). | required |
Returns:
| Type | Description |
|---|---|
| PosteriorPredictive | Container whose :attr: |
Raises:
| Type | Description |
|---|---|
| RuntimeError | If this :class: |
Notes
Materialises the full (n_draws, n_rows) matrix in memory.
For very large prediction sets prefer :meth:predict, which
streams per-row credible bands chunk-by-chunk.
Examples:
save
¶
Save the posterior to an .npz archive.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str or Path | Destination | required |
Returns:
| Type | Description |
|---|---|
| str | String form of the resolved output path. |
Notes
The archive carries the full (n_draws, n_coeffs) samples
matrix, the per-coefficient mean and std, convergence
diagnostics, method / class / family tags, the
:class:SamplingConfig, and the saved model bytes (so
:meth:predict continues to work after a round-trip via
:meth:load).
Examples:
load
classmethod
¶
Load a :class:PosteriorSamples from an .npz archive.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str or Path | Path to an archive previously written by :meth: | required |
Returns:
| Type | Description |
|---|---|
| PosteriorSamples | Reconstructed posterior, including bundled model bytes so :meth: |
Notes
The archive uses allow_pickle=True to round-trip the JSON
metadata stored as a 0-d object array; only load archives you
produced via :meth:save.
Examples:
plot_trace
¶
Matplotlib trace + marginal-density plot.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| coefficients | None, str, int, or iterable of str/int | Coefficients to plot. If | None |
| max_panels | int | Cap on the number of panel rows when | 8 |
| ax | numpy.ndarray of matplotlib Axes | Pre-existing 2-D axes array of shape | None |
Returns:
| Type | Description |
|---|---|
| Figure | The figure containing the trace and density panels. |
Raises:
| Type | Description |
|---|---|
| ValueError | If the resolved coefficient selection is empty. |
Notes
Each row has two panels: trace (draws vs iteration index) on the left and a marginal density histogram on the right.
Examples:
PairedPosteriorSamples
dataclass
¶
Posterior samples from two linked fits with draw rows paired by index.
cumulative_incidence
¶
cumulative_incidence(
new_data: Any, times: Any, *, level: float = 0.95
) -> CumulativeIncidenceDraws
Compute target-cause CIF draws using paired target/competing rows.
PosteriorPredictive
dataclass
¶
Per-row posterior fitted-mean draws on the link and response scales.
Returned by :meth:PosteriorSamples.predict_draws, this container
holds the full (n_draws, n_rows) matrices of fitted-mean draws
on both the linear-predictor (eta) and response (mean)
scales, along with link/class metadata used to re-apply the inverse
link on demand.
Attributes:
| Name | Type | Description |
|---|---|---|
| eta | ndarray | |
| mean | ndarray | |
| family_kind | str | Inverse-link tag emitted by the engine ( |
| model_class | str | Saved-model predictive class string the underlying :class: |
Notes
Use :meth:summary to collapse the matrices to per-row credible
bands without writing the quantile reductions yourself. For very
large prediction sets, prefer :meth:PosteriorSamples.predict
which streams chunk-by-chunk instead of materialising the full
(n_draws, n_rows) matrix here.
Examples:
shape
property
¶
n_draws
property
¶
n_rows
property
¶
summary
¶
Collapse fitted-mean draws to per-row credible bands.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| level | float | Coverage probability of the equal-tailed credible interval in | 0.95 |
Returns:
| Type | Description |
|---|---|
| dict[str, ndarray] | Dict with six length- |
Notes
Because the supported inverse links are monotone, response-scale
quantiles are computed as the inverse link applied to the link
quantiles rather than as quantiles of :attr:mean directly —
the two agree up to numerical noise and the link-quantile form
avoids re-walking the response-scale matrix.
Examples:
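The monotone-link argument in the Notes can be checked on a toy row. This sketch uses a logistic inverse link purely for illustration and picks the median of an odd number of draws, where the quantile is an order statistic and the two routes coincide exactly; between order statistics, interpolation makes them differ slightly.

```python
import math

def inv_logit(eta):
    """Logistic inverse link, an increasing function of eta."""
    return 1.0 / (1.0 + math.exp(-eta))

eta_draws = [-2.0, 0.5, 3.0, -0.7, 1.2]  # link-scale draws for one row
mid = len(eta_draws) // 2

# Route 1: quantile on the link scale, then apply the inverse link.
median_via_link = inv_logit(sorted(eta_draws)[mid])

# Route 2: apply the inverse link to every draw, then take the quantile.
median_via_mean = sorted(inv_logit(e) for e in eta_draws)[mid]
```

Route 1 applies the inverse link once per row and quantile, while Route 2 has to sort the full response-scale column.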
CumulativeIncidenceDraws
dataclass
¶
Paired posterior draws for a target-cause cumulative incidence curve.
Diagnostics and metadata¶
Summary
dataclass
¶
Frozen view of a fitted-model summary payload.
A Summary is the structured equivalent of print(model) for a fitted
GAM. It wraps a plain dict returned by the Rust engine and exposes
convenient accessors plus a notebook-friendly HTML representation. The
typical entry point is :meth:Model.summary.
The payload typically contains keys such as formula, family_name,
model_class, deviance, reml_score, and coefficients (a list
of per-term dictionaries). Use :meth:coefficients_frame to view the
coefficient table as a pandas DataFrame.
Examples:
>>> summary = model.summary()
>>> summary["family_name"]
'gaussian'
>>> summary.coefficients_frame().head()
coefficients
property
¶
from_dict
classmethod
¶
Build a :class:Summary from a raw payload dictionary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| payload | dict | Mapping of summary keys to values, as produced by the Rust engine. | required |
Returns:
| Type | Description |
|---|---|
| Summary | A new immutable summary view over a shallow copy of |
Examples:
get
¶
Return payload[key] if present, otherwise default.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| key | str | Payload key to look up. | required |
| default | Any | Value returned when | None |
Returns:
| Type | Description |
|---|---|
| Any | The looked-up value, or |
Examples:
to_dict
¶
coefficients_frame
¶
Diagnostics
dataclass
¶
Diagnostics(
formula: str,
response_name: str,
observed: list[float],
residuals: list[float],
predicted: dict[str, list[float]],
metrics: dict[str, float],
interval_lower: list[float] | None = None,
interval_upper: list[float] | None = None,
)
Held-out / in-sample diagnostics for a fitted GAM.
Bundles observed responses, model-implied predictions, residuals, and
aggregate fit metrics (MAE, RMSE, bias, optional :math:R^2) into a
single immutable record. Returned by :meth:Model.diagnose and rendered
inline in notebooks via :meth:_repr_html_.
Key fields:
- formula: the model formula used to produce the predictions.
- response_name: name of the response column in the input table.
- observed: actual response values aligned with predicted["mean"].
- residuals: observed - predicted["mean"] per row.
- predicted: dictionary of prediction series (mean plus optional mean_lower / mean_upper interval bounds).
- metrics: scalar fit metrics (n_obs, mae, rmse, bias, and r_squared when the response varies).
- interval_lower / interval_upper: optional pointwise prediction bands when the underlying call requested an interval.
Examples:
from_predictions
classmethod
¶
from_predictions(
*,
formula: str,
response_name: str,
observed: list[float],
predicted: dict[str, list[float]],
) -> "Diagnostics"
Construct a :class:Diagnostics from raw observed and predicted series.
Computes residuals and aggregate fit metrics (n, MAE, RMSE, bias, and
:math:R^2 when the response variance is positive) from the inputs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| formula | str | Model formula associated with the predictions. | required |
| response_name | str | Name of the response column. | required |
| observed | list of float | Observed response values. | required |
| predicted | dict of str to list of float | Prediction series. Must contain key | required |
Returns:
| Type | Description |
|---|---|
| Diagnostics | Populated diagnostics record with computed residuals and metrics. |
Examples:
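The residual and metric definitions above can be reproduced by hand. The sketch below is a standalone illustration, not the gamfit code: `fit_metrics` is a hypothetical helper, and defining bias as the mean residual (observed minus predicted) is an assumption about the sign convention.

```python
import math

def fit_metrics(observed, predicted_mean):
    """Residuals plus n_obs, MAE, RMSE, bias, and R^2 when defined."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted_mean)]
    ss_res = sum(r * r for r in residuals)
    metrics = {
        "n_obs": n,
        "mae": sum(abs(r) for r in residuals) / n,
        "rmse": math.sqrt(ss_res / n),
        "bias": sum(residuals) / n,  # assumed convention: mean residual
    }
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot > 0.0:
        # R^2 is only defined when the response actually varies.
        metrics["r_squared"] = 1.0 - ss_res / ss_tot
    return residuals, metrics

residuals, metrics = fit_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
_, constant_metrics = fit_metrics([2.0, 2.0], [1.0, 3.0])
```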
to_dict
¶
SchemaCheck
dataclass
¶
Result of comparing serving data against a fitted model's training schema.
Returned by :meth:Model.check. Truthy when the check passes
(ok=True with no issues); rendered as an HTML table in notebooks.
Key fields:
- ok: True when the data matches the training schema.
- issues: tuple of :class:SchemaIssue records describing each detected problem (empty when ok is True).
Examples:
from_dict
classmethod
¶
Build a :class:SchemaCheck from a raw payload dictionary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| payload | dict | Mapping with keys | required |
Returns:
| Type | Description |
|---|---|
| SchemaCheck | Parsed schema-check result. |
Examples:
raise_for_error
¶
Raise :class:ValueError if the schema check failed.
Concatenates every issue message into a single ValueError. A no-op
when :attr:ok is True.
Raises:
| Type | Description |
|---|---|
| ValueError | If at least one :class: |
Examples:
SchemaIssue
dataclass
¶
A single schema-validation problem detected against the training schema.
Key fields:
- kind: short tag describing the issue category (e.g. "missing", "type_mismatch").
- message: human-readable explanation.
- column: name of the offending column, when applicable.
Examples:
>>> SchemaIssue(kind="missing", message="column 'age' is missing", column="age")
SchemaIssue(kind='missing', message="column 'age' is missing", column='age')
FormulaValidation
dataclass
¶
Outcome of :func:gamfit.validate_formula (no fit performed).
Wraps the JSON payload returned by the Rust validator. Typical keys include
formula, model_class, family_name, and supported_by_python.
Use this to confirm a formula parses, infer the family that would be
picked, and check whether the Python binding can fit the resulting model
before committing to a full :func:gamfit.fit call.
Examples:
>>> info = gamfit.validate_formula(df, "y ~ s(x)")
>>> info["family_name"]
'gaussian'
>>> info.supported_by_python
True
supported_by_python
property
¶
from_dict
classmethod
¶
Build a :class:FormulaValidation from a raw payload dictionary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| payload | dict | Mapping of validation keys to values, as produced by the Rust validator. | required |
Returns:
| Type | Description |
|---|---|
| FormulaValidation | Immutable view over a shallow copy of |
Examples:
to_dict
¶
SharedPrecisionGroup
dataclass
¶
SharedPrecisionGroup(
name: str,
shape: float = 1.0,
rate: float = 0.0,
labels: str | Mapping[str | int, str] | None = None,
)
Cross-fit coefficient precision group.
name is the shared precision coordinate. By default it selects the
same named coefficient term/column/label in every model. labels can
override that with either one label for all models or a mapping keyed by
the model name/index supplied to :func:cross_fit_shared_precision_groups.
Basis and ridge primitives¶
bspline_basis
¶
Evaluate the Rust B-spline basis as a NumPy array.
knots may be:
- None: auto-derive a clamped knot vector with quantile-spaced interior knots inferred from t.
- an int K: auto-derive with K interior knots.
- an array-like: used verbatim (must be a valid clamped knot vector).
bspline_basis_derivative
¶
bspline_basis_derivative(
t: Any,
knots: Any = None,
*,
degree: int = 3,
order: int = 1,
periodic: bool = False,
) -> Any
Evaluate derivatives of the Rust B-spline basis as a NumPy array.
knots accepts None / int / array — see :func:bspline_basis.
duchon_basis_1d
¶
Evaluate the Rust one-dimensional Duchon basis as a NumPy array.
centers may be:
- None: auto-derive K = 10 centers at empirical quantiles of t.
- an int K: auto-derive K quantile centers.
- an array-like: used verbatim.
duchon_basis_1d_derivative
¶
duchon_basis_1d_derivative(
t: Any,
centers: Any = None,
*,
m: int = 2,
order: int = 1,
periodic: bool = False,
) -> Any
Evaluate derivatives of the Rust one-dimensional Duchon basis.
centers accepts None / int / array — see :func:duchon_basis_1d.
smoothness_penalty
¶
Return (S, null_basis) for the Rust B-spline difference penalty.
knots must be a knot vector here — auto-derivation requires
sample positions, which this penalty constructor does not take. Build
one with :func:bspline_basis's defaults (or pass any 1D array).
gaussian_weighted_ridge
¶
gaussian_weighted_ridge(
X: Any, Y: Any, penalty: Any, weights: Any, *, ridge_lambda: float
) -> tuple[Any, Any]
Closed-form Gaussian row-weighted ridge on NumPy-compatible arrays.
weights are likelihood row weights. They are not a multiplicative
gate on the mean/design row.
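The distinction between likelihood weights and a design-row gate can be made concrete with a small standard-library sketch. It assumes the textbook closed form beta = (X^T W X + lambda * P)^{-1} X^T W y for a single response column; how gamfit combines `ridge_lambda` with `penalty`, and what the returned tuple holds, are not specified here, so `weighted_ridge` and `solve` are illustrative helpers only.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def weighted_ridge(X, y, penalty, weights, ridge_lambda):
    """Closed-form coefficients for row-weighted ridge (assumed form)."""
    p = len(X[0])
    # Each weight enters X^T W X and X^T W y once, as a likelihood weight.
    A = [[sum(w * row[a] * row[b] for w, row in zip(weights, X))
          + ridge_lambda * penalty[a][b] for b in range(p)] for a in range(p)]
    rhs = [sum(w * row[a] * yi for w, row, yi in zip(weights, X, y))
           for a in range(p)]
    return solve(A, rhs)

X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [1.0, 2.5, 5.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
beta_weighted = weighted_ridge(X, y, identity, [1.0, 2.0, 1.0], 0.5)
# A weight of 2 behaves like observing that row twice.
beta_duplicated = weighted_ridge(X + [X[1]], y + [y[1]], identity, [1.0] * 4, 0.5)
```

Scaling the design row itself by the weight would instead contribute the weight squared to X^T X, which is the "multiplicative gate" reading the text rules out.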
gaussian_weighted_ridge_batch
¶
gaussian_weighted_ridge_batch(
X: Any,
Y: Any,
penalty: Any,
weights: Any,
*,
ridge_lambda: float,
row_counts: Any | None = None,
) -> tuple[Any, Any]
Batched closed-form Gaussian row-weighted ridge.
X has shape (K, Nmax, M), Y has shape (K, Nmax, D), and
weights has shape (K, Nmax). row_counts optionally marks the
active row prefix for each problem in a padded ragged batch.
Gaussian REML primitives¶
gaussian_reml_fit
¶
gaussian_reml_fit(
x: Any,
y: Any,
penalty: Any,
*,
weights: Any | None = None,
init_lambda: float | None = None,
by: Any | None = None,
by_start_col: int = 0,
) -> dict[str, Any]
Fit a closed-form Gaussian REML problem from NumPy-compatible arrays.
gaussian_reml_fit_backward
¶
gaussian_reml_fit_backward(
x: Any,
y: Any,
penalty: Any,
*,
grad_lambda: float = 0.0,
grad_coefficients: Any | None = None,
grad_fitted: Any | None = None,
grad_reml_score: float = 0.0,
grad_edf: float = 0.0,
forward_state: dict[str, Any] | None = None,
weights: Any | None = None,
init_lambda: float | None = None,
by: Any | None = None,
by_start_col: int = 0,
) -> dict[str, Any]
Run the analytic VJP for gaussian_reml_fit outputs.
gaussian_reml_fit_batched
¶
gaussian_reml_fit_batched(
x: Any,
y: Any,
row_offsets: Any,
penalty: Any,
*,
weights: Any | None = None,
init_lambda: float | None = None,
by: Any | None = None,
by_start_col: int = 0,
) -> dict[str, Any]
Fit K closed-form Gaussian REML problems packed by row offsets.
gaussian_reml_fit_batched_backward
¶
gaussian_reml_fit_batched_backward(
x: Any,
y: Any,
row_offsets: Any,
penalty: Any,
*,
grad_lambda: Any | None = None,
grad_coefficients: Any | None = None,
grad_fitted: Any | None = None,
grad_reml_score: Any | None = None,
grad_edf: Any | None = None,
forward_state: dict[str, Any] | None = None,
weights: Any | None = None,
init_lambda: float | None = None,
by: Any | None = None,
by_start_col: int = 0,
) -> dict[str, Any]
Run packed ragged analytic VJPs for gaussian_reml_fit_batched.
gaussian_reml_fit_positions
¶
gaussian_reml_fit_positions(
t: Any,
y: Any,
basis_kind: str | None = None,
knots_or_centers: Any = None,
penalty: Any | None = None,
*,
basis: str | None = None,
basis_order: int | None = None,
periodic: bool = False,
period: float | None = None,
weights: Any | None = None,
init_lambda: float | None = None,
by: Any | None = None,
by_start_col: int = 0,
) -> dict[str, Any]
Fit closed-form Gaussian REML from 1D positions and an internal basis.
knots_or_centers may be None, an int (basis count), or an
array; the basis-location vector is auto-derived from t when not
supplied. penalty may be None for a neutral identity ridge of
matching size.
gaussian_reml_fit_positions_backward
¶
gaussian_reml_fit_positions_backward(
t: Any,
y: Any,
basis_kind: str | None = None,
knots_or_centers: Any = None,
penalty: Any | None = None,
*,
basis: str | None = None,
grad_lambda: float = 0.0,
grad_coefficients: Any | None = None,
grad_fitted: Any | None = None,
grad_reml_score: float = 0.0,
grad_edf: float = 0.0,
forward_state: dict[str, Any] | None = None,
basis_order: int | None = None,
periodic: bool = False,
period: float | None = None,
weights: Any | None = None,
init_lambda: float | None = None,
by: Any | None = None,
by_start_col: int = 0,
) -> dict[str, Any]
Run the analytic VJP for gaussian_reml_fit_positions outputs.
knots_or_centers and penalty accept the same auto-derived
defaults as :func:gaussian_reml_fit_positions.
gaussian_reml_fit_positions_batched
¶
gaussian_reml_fit_positions_batched(
t: Any,
y: Any,
row_offsets: Any,
basis_kind: str | None = None,
knots_or_centers: Any = None,
penalty: Any | None = None,
*,
basis: str | None = None,
basis_order: int | None = None,
periodic: bool = False,
period: float | None = None,
weights: Any | None = None,
init_lambda: float | None = None,
by: Any | None = None,
by_start_col: int = 0,
) -> dict[str, Any]
Fit packed ragged closed-form Gaussian REML problems from positions.
knots_or_centers and penalty accept the same auto-derived
defaults as :func:gaussian_reml_fit_positions. The basis locations
are inferred from the concatenated positions across all groups.
gaussian_reml_fit_positions_batched_backward
¶
gaussian_reml_fit_positions_batched_backward(
t: Any,
y: Any,
row_offsets: Any,
basis_kind: str | None = None,
knots_or_centers: Any = None,
penalty: Any | None = None,
*,
basis: str | None = None,
grad_lambda: Any | None = None,
grad_coefficients: Any | None = None,
grad_fitted: Any | None = None,
grad_reml_score: Any | None = None,
grad_edf: Any | None = None,
forward_state: dict[str, Any] | None = None,
basis_order: int | None = None,
periodic: bool = False,
period: float | None = None,
weights: Any | None = None,
init_lambda: float | None = None,
by: Any | None = None,
by_start_col: int = 0,
) -> dict[str, Any]
Run the analytic VJP for packed position-based Gaussian REML fits.
knots_or_centers and penalty accept the same auto-derived
defaults as :func:gaussian_reml_fit_positions_batched.
gaussian_reml_fit_formula
¶
gaussian_reml_fit_formula(
data: Any, formula: str, y: Any, *, config: dict[str, Any] | None = None
) -> dict[str, Any]
Fit closed-form Gaussian REML after materialising a formula design.
scikit-learn integration¶
GAMRegressor
dataclass
¶
GAMRegressor(
formula: str,
family: str = "auto",
offset: str | None = None,
weights: str | None = None,
config: dict[str, Any] | None = None,
)
Bases: _BaseGAMEstimator, RegressorMixin
scikit-learn-compatible regressor wrapping :func:gamfit.fit.
Construct with a formula string and (optionally) pipeline kwargs such as
family, offset, weights, or a free-form config dict, then
call :meth:fit with either a fully-formed table (X) or a feature
table plus a target column / vector (y). After fitting, the estimator
exposes the standard predict / score interface plus pass-through
helpers :meth:summary, :meth:report, and :meth:check from the
underlying :class:Model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| formula | str | Wilkinson-style formula. May or may not include the response on the left-hand side; the response is resolved from | required |
| family | str | Likelihood family forwarded to :func: | "auto" |
| offset | str or None | Offset column name, forwarded to :func: | None |
| weights | str or None | Observation-weight column name. | None |
| config | dict or None | Escape-hatch dict of extra pipeline keys. | None |
Examples:
>>> from gamfit.sklearn import GAMRegressor
>>> reg = GAMRegressor(formula="y ~ s(x1) + s(x2)").fit(X_train, y_train)
>>> preds = reg.predict(X_test)
>>> reg.score(X_test, y_test)
0.87
fit
¶
Fit the underlying GAM and return self.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| X | Any | Training table (pandas DataFrame, pyarrow Table, dict of columns, list of records, or anything :func: | required |
| y | str, array-like, or None | Target. | None |
Returns:
| Type | Description |
|---|---|
| GAMRegressor | Fitted estimator ( |
Examples:
predict
¶
Predict the conditional mean for each row in X.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| X | Any | Serving table with the feature columns seen at fit time. | required |
Returns:
| Type | Description |
|---|---|
| ndarray | One-dimensional float array of predicted means, one per row. |
Examples:
score
¶
Return the coefficient of determination :math:R^2.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| X | Any | Test feature table. | required |
| y | array-like | True response values. | required |
| sample_weight | array-like or None | Per-row weights forwarded to :func: | None |
Returns:
| Type | Description |
|---|---|
| float | :math: |
Examples:
GAMClassifier
dataclass
¶
GAMClassifier(
formula: str,
family: str = "auto",
offset: str | None = None,
weights: str | None = None,
config: dict[str, Any] | None = None,
)
Bases: _BaseGAMEstimator, ClassifierMixin
scikit-learn-compatible binary classifier wrapping :func:gamfit.fit.
Same construction and fit semantics as :class:GAMRegressor (see that
class for parameter documentation). Predictions interpret the model's
mean as the probability of the positive class; classes are fixed to
[0, 1] and a threshold of 0.5 is used by :meth:predict.
Examples:
>>> from gamfit.sklearn import GAMClassifier
>>> clf = GAMClassifier(formula="y ~ s(x1) + s(x2)", family="binomial")
>>> clf.fit(X_train, y_train)
>>> clf.predict_proba(X_test)[:1]
array([[0.34, 0.66]])
fit
¶
Fit the binary GAM classifier and return self.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| X | Any | Training table. See :meth: | required |
| y | str, array-like, or None | Binary target. See :meth: | None |
Returns:
| Type | Description |
|---|---|
| GAMClassifier | Fitted estimator ( |
Examples:
predict_proba
¶
Predict class probabilities for each row in X.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| X | Any | Serving table with the feature columns seen at fit time. | required |
Returns:
| Type | Description |
|---|---|
| ndarray | Two-column float array |
Examples:
predict
¶
Predict the binary class label using a 0.5 threshold on the positive class.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| X | Any | Serving table with the feature columns seen at fit time. | required |
Returns:
| Type | Description |
|---|---|
| ndarray | One-dimensional integer array of class labels ( |
Examples:
score
¶
Return classification accuracy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| X | Any | Test feature table. | required |
| y | array-like | True binary labels. | required |
| sample_weight | array-like or None | Per-row weights forwarded to :func: | None |
Returns:
| Type | Description |
|---|---|
| float | Accuracy in |
Examples:
Exceptions¶
GamError
¶
Bases: Exception
Base class for Python-facing GAM errors.
All gamfit-specific exceptions raised by the Python binding inherit from
GamError, so catching this class is the broadest way to handle a
failure originating from the Rust engine or the binding layer.
Examples:
FormulaError
¶
Bases: GamError
The formula is invalid or unsupported.
Raised when the Wilkinson-style formula string cannot be parsed, references columns missing from the input table, or describes a model the engine does not support.
Examples:
SchemaMismatchError
¶
Bases: GamError
Prediction input does not match the training schema.
Raised when the table passed to :meth:Model.predict or related methods
lacks columns that were present at fit time, has incompatible dtypes, or
introduces unknown categorical levels.
Examples:
PredictionError
¶
Bases: GamError
Prediction failed.
Raised for runtime failures during prediction that are not pure schema problems (numerical issues, unsupported prediction modes for the fitted model, etc.).
Examples:
RustExtensionUnavailableError
¶
Bases: ImportError
Raised when the compiled gamfit._rust extension cannot be imported.
The Rust engine ships as a maturin-built extension module. When it is
missing (typical in a fresh source checkout that has not been built yet),
every Rust-backed API in :mod:gamfit raises this error eagerly so users
see a single, actionable message instead of an opaque ImportError.
The fix is to build or install the package, e.g. maturin develop from
the gamfit source tree, or pip install gamfit from PyPI.
Examples:
Response geometry¶
ResponseGeometryModel
dataclass
¶
ResponseGeometryModel(
models: Sequence[Any],
response_geometry: str,
response_columns: tuple[str, ...],
base_point: Any,
coordinates: str,
reference: int = -1,
training_table_kind: str | None = None,
shared_tangent_fit: SharedGaussianRemlTangentFit | None = None,
)
A fitted response-geometry GAM with shared smoothing across tangent coordinates.
alr
¶
Additive log-ratio coordinates for positive compositions.
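For orientation, the additive log-ratio transform maps a positive composition with D parts to D - 1 coordinates log(x_i / x_ref). The sketch below is a standard-library illustration with hypothetical helper names; using the last part as the default reference is an assumption that mirrors the `reference: int = -1` field shown above.

```python
import math

def alr_coords(composition, reference=-1):
    """log(x_i / x_ref) for every part except the reference part."""
    ref = reference % len(composition)
    return [math.log(x / composition[ref])
            for i, x in enumerate(composition) if i != ref]

def alr_inverse(coords, reference=-1):
    """Map ALR coordinates back to a composition that sums to 1."""
    d = len(coords) + 1
    logs = list(coords)
    logs.insert(reference % d, 0.0)  # the reference part has log-ratio 0
    expd = [math.exp(v) for v in logs]
    total = sum(expd)
    return [e / total for e in expd]

comp = [0.2, 0.3, 0.5]
coords = alr_coords(comp)
back = alr_inverse(coords)
```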
simplex_frechet_mean
¶
Intrinsic Fréchet mean under Aitchison simplex geometry.
sphere_frechet_mean
¶
sphere_frechet_mean(
values: Any,
weights: Any | None = None,
*,
tol: float = 1e-12,
max_iter: int = 256,
) -> Any
Intrinsic Fréchet/Karcher mean on the unit sphere.
If the minimizer is not unique, as for an exactly antipodal pair, this returns one deterministic minimizer rather than an endpoint surrogate.