Diagnostics

This module provides a small set of NumPyro utilities for diagnosing posterior samples.

Autocorrelation

autocorrelation(x: NDArray, axis: int = 0, bias: bool = True) → NDArray

Computes the autocorrelation of samples along dimension axis.

Parameters:

x (numpy.ndarray) – the input array.
axis (int) – the dimension along which to calculate the autocorrelation.
bias (bool) – whether to use a biased estimator.

Returns:

autocorrelation of x.

Return type:

numpy.ndarray
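
To illustrate what is being estimated, here is a minimal NumPy sketch for a 1-D sequence. It is not NumPyro's implementation: the helper name autocorrelation_1d is hypothetical, and the way bias is handled (dividing every lag by n when biased, and lag k by n - k otherwise) is an assumption based on the conventional estimators.

```python
import numpy as np


def autocorrelation_1d(x, bias=True):
    """Naive O(n^2) autocorrelation of a 1-D sequence (illustrative only)."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    centered = x - x.mean()
    # Sum of lagged products for every lag k = 0, ..., n - 1.
    sums = np.array([np.dot(centered[: n - k], centered[k:]) for k in range(n)])
    # Assumed convention: the biased estimator divides every lag by n,
    # the unbiased one divides lag k by the number of terms, n - k.
    denom = n if bias else n - np.arange(n)
    autocov = sums / denom
    # Normalize by the lag-0 value so that the result is 1 at lag 0.
    return autocov / autocov[0]


# A strictly alternating sequence is perfectly anti-correlated at lag 1.
x = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
rho_biased = autocorrelation_1d(x, bias=True)
rho_unbiased = autocorrelation_1d(x, bias=False)
```

For this sequence the unbiased lag-1 value is exactly -1, while the biased estimator shrinks it to -5/6, because it divides the five lagged products by n = 6.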

Autocovariance

autocovariance(x: NDArray, axis: int = 0, bias: bool = True) → NDArray

Computes the autocovariance of samples along dimension axis.

Parameters:

x (numpy.ndarray) – the input array.
axis (int) – the dimension along which to calculate the autocovariance.
bias (bool) – whether to use a biased estimator.

Returns:

autocovariance of x.

Return type:

numpy.ndarray
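
The role of axis can be sketched with plain NumPy: a 1-D estimator is applied independently to every slice along the chosen dimension, so the output keeps the input's shape. The helpers autocovariance_1d and autocovariance_along are hypothetical names used for illustration, not part of NumPyro, and the estimator convention is the same assumption as for a conventional biased or unbiased autocovariance.

```python
import numpy as np


def autocovariance_1d(x, bias=True):
    """Naive O(n^2) autocovariance of a 1-D sequence (illustrative only)."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    centered = x - x.mean()
    # Sum of lagged products for every lag k = 0, ..., n - 1.
    sums = np.array([np.dot(centered[: n - k], centered[k:]) for k in range(n)])
    # Assumed convention: divide by n (biased) or by n - k (unbiased).
    denom = n if bias else n - np.arange(n)
    return sums / denom


def autocovariance_along(x, axis=0, bias=True):
    """Apply the 1-D estimator to every slice along `axis`."""
    x = np.asarray(x, dtype=float)
    return np.apply_along_axis(autocovariance_1d, axis, x, bias=bias)


# Two sequences of four draws each; the draw dimension is axis 1.
x = np.array([[2.0, 4.0, 6.0, 8.0], [1.0, 1.0, 3.0, 3.0]])
acov = autocovariance_along(x, axis=1)
```

With the biased estimator, the lag-0 entry of each row equals that row's population variance (x.var(axis=1)).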

Effective Sample Size

effective_sample_size(x: NDArray, bias: bool = True) → NDArray

Computes the effective sample size of input x, where the first dimension of x is the chain dimension and the second dimension is the draw dimension.

References:

Introduction to Markov Chain Monte Carlo, Charles J. Geyer
Stan Reference Manual version 2.18, Stan Development Team

Parameters:

x (numpy.ndarray) – the input array.
bias (bool) – whether to use a biased estimator of the autocovariance.

Returns:

effective sample size of x.

Return type:

numpy.ndarray
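
The following is a simplified NumPy sketch of an effective sample size estimate for a single scalar variable with shape (num_chains, num_draws). It is loosely modeled on the approach described in the references above (autocorrelations combined across chains, then truncated with Geyer's initial monotone sequence). The function name ess_sketch and every detail of the computation are assumptions made for illustration; this is not NumPyro's code and may differ from it numerically.

```python
import numpy as np


def ess_sketch(x, bias=True):
    """Simplified ESS for x of shape (num_chains, num_draws); illustrative only."""
    x = np.asarray(x, dtype=float)
    m, n = x.shape
    centered = x - x.mean(axis=1, keepdims=True)
    # Per-chain autocovariance at every lag (naive O(n^2) computation).
    denom = n if bias else n - np.arange(n)
    gamma = np.stack([
        np.array([np.dot(c[: n - k], c[k:]) for k in range(n)]) / denom
        for c in centered
    ])
    # Within-chain variance and the pooled variance estimator.
    var_within = x.var(axis=1, ddof=1).mean()
    var_estimator = var_within * (n - 1) / n
    if m > 1:
        var_estimator += x.mean(axis=1).var(ddof=1)
    # Combined autocorrelation at each lag.
    rho = 1.0 - (var_within - gamma.mean(axis=0)) / var_estimator
    rho[0] = 1.0
    # Sum adjacent lags in pairs (initial positive sequence).
    pairs = rho[: n - n % 2].reshape(-1, 2).sum(axis=1)
    # Clip negative pair sums and force the sequence to be non-increasing.
    tail = np.minimum.accumulate(np.clip(pairs[1:], 0.0, None))
    tau = -1.0 + 2.0 * (pairs[0] + tail.sum())
    return m * n / tau


rng = np.random.default_rng(0)
# Independent draws: ESS should be close to the total number of draws.
iid = rng.normal(size=(4, 500))
# Strongly autocorrelated AR(1) chains: ESS should be much smaller.
noise = rng.normal(size=(4, 500))
ar = np.zeros_like(noise)
ar[:, 0] = noise[:, 0]
for t in range(1, noise.shape[1]):
    ar[:, t] = 0.9 * ar[:, t - 1] + noise[:, t]

ess_iid = ess_sketch(iid)
ess_ar = ess_sketch(ar)
```

The comparison between ess_iid and ess_ar shows the intended reading of the diagnostic: the same number of draws carries far less information when successive draws are strongly correlated.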

Gelman Rubin

gelman_rubin(x: NDArray) → NDArray

Computes R-hat over chains of samples x, where the first dimension of x is the chain dimension and the second dimension is the draw dimension. Requires x.shape[0] >= 2 and x.shape[1] >= 2.

Parameters:

x (numpy.ndarray) – the input array.

Returns:

R-hat of x.

Return type:

numpy.ndarray
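
As a rough illustration of the statistic, the sketch below computes the classic Gelman-Rubin ratio with NumPy: the square root of a pooled variance estimate divided by the mean within-chain variance. The name r_hat_sketch is hypothetical, and the exact variance formulas are an assumption based on the standard definition rather than a copy of NumPyro's code.

```python
import numpy as np


def r_hat_sketch(x):
    """Classic R-hat for x of shape (num_chains, num_draws, ...); illustrative only."""
    x = np.asarray(x, dtype=float)
    num_chains, num_draws = x.shape[:2]
    assert num_chains >= 2 and num_draws >= 2
    # Mean of the per-chain sample variances.
    var_within = x.var(axis=1, ddof=1).mean(axis=0)
    # Sample variance of the per-chain means.
    var_between = x.mean(axis=1).var(axis=0, ddof=1)
    # Pooled estimate of the posterior variance.
    var_estimator = var_within * (num_draws - 1) / num_draws + var_between
    return np.sqrt(var_estimator / var_within)


# Two chains that agree, and two chains centered on different values.
agree = np.array([[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]])
disagree = np.array([[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]])
rhat_agree = r_hat_sketch(agree)
rhat_disagree = r_hat_sketch(disagree)
```

Chains that explore the same region give a value near or below 1, while chains stuck in different regions inflate the between-chain term and push the value well above 1.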

Split Gelman Rubin

split_gelman_rubin(x: NDArray) → NDArray

Computes split R-hat over chains of samples x, where the first dimension of x is the chain dimension and the second dimension is the draw dimension. Requires x.shape[1] >= 4.

Parameters:

x (numpy.ndarray) – the input array.

Returns:

split R-hat of x.

Return type:

numpy.ndarray
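
The idea behind the split variant can be sketched as follows: each chain is cut into a first and a second half, the halves are treated as separate chains, and the ordinary R-hat formula is applied. The helper names below are hypothetical, and dropping the middle draw when the number of draws is odd is an assumption of this sketch.

```python
import numpy as np


def split_chains(x):
    """Stack the first and second half of every chain along the chain axis."""
    x = np.asarray(x, dtype=float)
    assert x.shape[1] >= 4
    half = x.shape[1] // 2
    # Assumption: with an odd number of draws the middle draw is dropped.
    return np.concatenate([x[:, :half], x[:, -half:]], axis=0)


def r_hat_sketch(x):
    """Classic R-hat, repeated here so the example is self-contained."""
    num_draws = x.shape[1]
    var_within = x.var(axis=1, ddof=1).mean(axis=0)
    var_between = x.mean(axis=1).var(axis=0, ddof=1)
    var_estimator = var_within * (num_draws - 1) / num_draws + var_between
    return np.sqrt(var_estimator / var_within)


# A single chain that drifts upward over time.
drifting = np.arange(8.0).reshape(1, 8)
halves = split_chains(drifting)
split_rhat = r_hat_sketch(halves)
```

Because the two halves of the drifting chain have different means, the split statistic is far above 1 even though only one chain was supplied, which is the kind of non-stationarity the split is meant to expose.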

HPDI

hpdi(x: NDArray, prob: float = 0.9, axis: int = 0) → NDArray

Computes the “highest posterior density interval” (HPDI), which is the narrowest interval containing probability mass prob.

Parameters:

x (numpy.ndarray) – the input array.
prob (float) – the probability mass of samples within the interval.
axis (int) – the dimension along which to calculate the HPDI.

Returns:

Array containing the lower and upper bounds of the HPDI along the specified axis. The output has the same shape as x except that the size along axis is 2 (lower bound first, upper bound second).

Return type:

numpy.ndarray
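
A minimal NumPy sketch of the "narrowest interval" idea for 1-D samples: sort the draws, slide a window that spans a fraction prob of them, and keep the shortest window. The helper hpdi_1d is a hypothetical name, and the window size int(prob * n) is an assumption of this sketch rather than a statement about NumPyro's internals.

```python
import numpy as np


def hpdi_1d(x, prob=0.9):
    """Narrowest interval spanning a fraction `prob` of sorted 1-D samples."""
    sorted_x = np.sort(np.asarray(x, dtype=float))
    n = sorted_x.shape[0]
    width = int(prob * n)
    # Length of every candidate interval [sorted_x[i], sorted_x[i + width]].
    lengths = sorted_x[width:] - sorted_x[: n - width]
    start = int(np.argmin(lengths))
    return np.array([sorted_x[start], sorted_x[start + width]])


# Right-skewed samples: most of the mass sits near zero.
samples = np.array([0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 2.0, 4.0, 8.0, 16.0])
interval = hpdi_1d(samples, prob=0.5)
# Equal-tailed interval with the same mass, for comparison.
central = np.quantile(samples, [0.25, 0.75])
```

For skewed samples like these, the HPDI hugs the dense region near zero and is shorter than the equal-tailed interval with the same probability mass.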

Summary

summary(samples: dict | ndarray, prob: float = 0.9, group_by_chain: bool = True) → dict

Returns a summary table displaying diagnostics of samples from the posterior. The diagnostics displayed are the mean, standard deviation, median, the credible interval from hpdi() (90% by default, controlled by prob), effective_sample_size(), and split_gelman_rubin().

Parameters:

samples (dict or numpy.ndarray) – a collection of input samples whose leftmost dimension is the chain dimension and whose second dimension is the draw dimension.
prob (float) – the probability mass of samples within the HPDI interval.
group_by_chain (bool) – If True, each variable in samples will be treated as having shape num_chains x num_samples x sample_shape. Otherwise, the corresponding shape will be num_samples x sample_shape (i.e. without a chain dimension).
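
The shape convention behind group_by_chain can be sketched with NumPy alone. The helper basic_summary below is hypothetical: it only shows how samples without a chain dimension can be given one, and how the chain and draw dimensions can be merged before computing the mean, standard deviation, and median. The dictionary keys and the use of ddof=1 are assumptions of this sketch, not NumPyro's documented output.

```python
import numpy as np


def basic_summary(samples, group_by_chain=True):
    """Hypothetical helper: mean, std and median per variable."""
    out = {}
    for name, value in samples.items():
        value = np.asarray(value, dtype=float)
        if not group_by_chain:
            # num_samples x sample_shape -> 1 x num_samples x sample_shape
            value = value[None, ...]
        # Merge the chain and draw dimensions into one leading dimension.
        flat = value.reshape((-1,) + value.shape[2:])
        out[name] = {
            "mean": flat.mean(axis=0),
            "std": flat.std(axis=0, ddof=1),
            "median": np.median(flat, axis=0),
        }
    return out


# The same eight draws, stored with and without a chain dimension.
grouped = {"mu": np.arange(8.0).reshape(2, 4)}  # 2 chains x 4 draws
ungrouped = {"mu": np.arange(8.0)}              # 8 draws, no chain dimension

stats_grouped = basic_summary(grouped, group_by_chain=True)
stats_ungrouped = basic_summary(ungrouped, group_by_chain=False)
```

Both calls describe the same draws, so these pooled statistics agree; only chain-aware diagnostics such as split R-hat depend on how the draws are grouped.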

print_summary(samples: dict | NDArray, prob: float = 0.9, group_by_chain: bool = True) → None

Prints a summary table displaying diagnostics of samples from the posterior. The diagnostics displayed are the mean, standard deviation, median, the credible interval from hpdi() (90% by default, controlled by prob), effective_sample_size(), and split_gelman_rubin().

Parameters:

samples (dict or numpy.ndarray) – a collection of input samples whose leftmost dimension is the chain dimension and whose second dimension is the draw dimension.
prob (float) – the probability mass of samples within the HPDI interval.
group_by_chain (bool) – If True, each variable in samples will be treated as having shape num_chains x num_samples x sample_shape. Otherwise, the corresponding shape will be num_samples x sample_shape (i.e. without a chain dimension).