lib.samples.samples.Samples

class lib.samples.samples.Samples(data=None, parameters=None, attrs=None)

Bases: cosmopipe.lib.catalog.base.BaseCatalog

Class that holds samples drawn from likelihood.

Initialize Samples.

Parameters
  • data (dict, Samples) – Dictionary name: array. If Samples instance, update self attributes.

  • parameters (list, ParameterCollection, default=None) – Parameters. Defaults to data.keys().

  • attrs (dict) – Other attributes.

Methods

argmax

Return parameter value for maximum of cost.

argmin

Return parameter value for minimum of cost.

autocorrelation

Return weighted autocorrelation.

average

Return global average of column(s) column, with weights weights (defaults to 1).

columns

Return parameter names, after optional selections.

concatenate

Concatenate catalogs together.

copy

Return copy, including column names columns (defaults to all columns).

corrcoef

Estimate weighted parameter correlation matrix.

corrpair

Estimate weighted correlation of a pair of parameters column1 and column2.

cov

Estimate weighted parameter covariance.

covpair

Estimate weighted covariance of a pair of parameters column1 and column2.

deepcopy

eval

Evaluate input literal and return results.

extend

Extend catalog with other.

falses

Return array of size size filled with False.

from_array

Build BaseCatalog from input array.

from_nbodykit

Build new catalog from nbodykit.

from_state

Instantiate and initialize class with state dictionary.

full

Return array of size size filled with fill_value.

gelman_rubin

Return Gelman-Rubin statistics, which compare covariance of chain means to (mean of) intra-chain covariances.

get

Return samples for parameter name.

gget

Return on process rank root catalog global column column if exists, else return provided default.

gindices

Row numbers in the global catalog.

gslice

Perform global slicing of catalog, e.g.

integrated_autocorrelation_time

Return integrated autocorrelation time.

interval

Return n-sigmas confidence interval(s).

invcov

Estimate weighted parameter inverse covariance.

is_mpi_broadcast

is_mpi_gathered

is_mpi_root

is_mpi_scattered

load

Load catalog in numpy binary format from disk.

load_auto

Load samples from disk.

load_cosmomc

Load samples in CosmoMC format, i.e.:

load_fits

Load catalog in fits binary format from disk.

log_critical

log_debug

log_error

log_info

log_warning

maximum

Return global maximum of column(s) column.

mean

Return weighted mean.

median

Return global median of column(s) column.

minimum

Return global minimum of column(s) column.

mpi_broadcast

mpi_collect

Return new instance corresponding to self on larger mpicomm.

mpi_distribute

Return new instance corresponding to self on smaller mpicomm.

mpi_gather

Gather catalog on a single process.

mpi_recv

Receive catalog from rank source with tag tag.

mpi_scatter

Scatter catalog on all processes.

mpi_send

Send catalog to rank dest with tag tag.

mpi_to_state

Return instance, changing current MPI state to mpistate.

nans

Return array of size size filled with numpy.nan.

ones

Return array of size size filled with one.

percentile

Return global percentiles of column(s) column.

quantile

Return weighted quantiles.

remove_burnin

Return new samples with burn-in removed.

save

Save class to disk.

save_auto

Save samples to disk.

save_cosmomc

Save samples to disk in CosmoMC format.

save_fits

Save catalog to filename as fits file.

set

Set parameter name samples to item.

set_default_parameter

Add default parameter of name name.

std

Estimate weighted standard deviation.

sum

Return global sum of column(s) column.

to_array

Return samples as numpy array.

to_getdist

Return GetDist hook to samples.

to_mesh

Interpolate samples to mesh.

to_nbodykit

Return catalog in nbodykit format.

to_stats

Export samples summary quantities.

trues

Return array of size size filled with True.

var

Estimate weighted parameter variance.

zeros

Return array of size size filled with zero.

Attributes

gsize

Return catalog global size, i.e. sum of size in each process.

logger

mpiattrs

MPI attributes

mpistate

size

Equivalent to __len__().

argmax(column, cost='metrics.logposterior')

Return parameter value for maximum of cost.

argmin(column, cost='metrics.chi2')

Return parameter value for minimum of cost.
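As a minimal sketch of what argmax() and argmin() describe, the snippet below picks the parameter value at the sample where a cost column is extremal. The helper name value_at_extremum and the toy numbers are hypothetical; only the default cost column names ('metrics.logposterior', 'metrics.chi2') come from the signatures above.

```python
def value_at_extremum(values, cost, maximize=True):
    """Hypothetical helper: value of a parameter where cost is extremal."""
    pick = max if maximize else min
    # Index of the sample with the largest (or smallest) cost.
    index = pick(range(len(cost)), key=lambda i: cost[i])
    return values[index]

# Toy chain: one parameter and two cost columns (made-up numbers).
omega_m = [0.29, 0.31, 0.33]
logposterior = [-4.0, -1.5, -2.5]
chi2 = [8.0, 3.0, 5.0]

best_from_logposterior = value_at_extremum(omega_m, logposterior, maximize=True)
best_from_chi2 = value_at_extremum(omega_m, chi2, maximize=False)
```

In this toy chain both costs single out the same sample, so both calls return the same parameter value.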

autocorrelation(column)

Return weighted autocorrelation. Adapted from https://github.com/dfm/emcee/blob/main/src/emcee/autocorr.py

Parameters

columns (list, ParameterCollection) – Parameters to compute autocorrelation for. Defaults to all parameters.

Returns

autocorr

Return type

array

average(column, weights=None)

Return global average of column(s) column, with weights weights (defaults to 1).

columns(include=None, exclude=None, **kwargs)

Return parameter names, after optional selections.

Parameters
  • include (list, string, default=None) – Single or list of regex patterns to select parameter names to include. Defaults to all parameters.

  • exclude (list, string, default=None) – Single or list of regex patterns to select parameter names to exclude. Defaults to no parameters.

  • kwargs (dict) – Selections on parameter attributes, e.g. varied=True for varied parameters.

Returns

columns – Return parameters, after optional selections.

Return type

list
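The include / exclude selection can be sketched with the standard re module. This is an illustration only: the helper select_names is hypothetical, and matching patterns against the full name (re.fullmatch) is an assumption, since the matching rule used by columns() is not stated here.

```python
import re

def select_names(names, include=None, exclude=None):
    """Hypothetical helper: filter names with regex include/exclude patterns."""
    def as_list(patterns):
        # Accept a single pattern or a list of patterns.
        if patterns is None:
            return None
        return [patterns] if isinstance(patterns, str) else list(patterns)

    include, exclude = as_list(include), as_list(exclude)
    selected = []
    for name in names:
        # Assumption: a pattern must match the whole name.
        if include is not None and not any(re.fullmatch(p, name) for p in include):
            continue
        if exclude is not None and any(re.fullmatch(p, name) for p in exclude):
            continue
        selected.append(name)
    return selected

names = ['omega_m', 'omega_b', 'sigma8', 'metrics.logposterior']
only_omega = select_names(names, include='omega_.*')
no_metrics = select_names(names, exclude=[r'metrics\..*'])
```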

classmethod concatenate(*others)

Concatenate catalogs together.

Parameters

others (list) – List of BaseCatalog instances.

Returns

new

Return type

BaseCatalog

Warning

attrs of returned catalog contains, for each key, the last value found in others attrs dictionaries.

copy(columns=None)

Return copy, including column names columns (defaults to all columns).

corrcoef(columns=None, **kwargs)

Estimate weighted parameter correlation matrix. See cov().

corrpair(column1, column2, **kwargs)

Estimate weighted correlation of a pair of parameters column1 and column2. See cov().

cov(columns=None, ddof=1)

Estimate weighted parameter covariance.

Parameters
  • columns (list, ParameterCollection, default=None) – Parameters to compute covariance for. Defaults to all varied parameters.

  • ddof (int, default=1) – Number of degrees of freedom.

Returns

cov – If single parameter provided as columns, returns variance for that parameter (scalar). Else returns covariance (2D array).

Return type

scalar, array
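A weighted covariance with a ddof correction can be sketched in plain Python as follows. The normalization used here (sum of weights minus ddof, i.e. weights treated as repeat counts) is an assumption; cov() may normalize differently. The helper weighted_cov is hypothetical.

```python
import math

def weighted_cov(x, y, weights=None, ddof=1):
    """Hypothetical helper: weighted covariance of two sample arrays."""
    if weights is None:
        weights = [1.0] * len(x)
    wsum = sum(weights)
    mean_x = sum(w * xi for w, xi in zip(weights, x)) / wsum
    mean_y = sum(w * yi for w, yi in zip(weights, y)) / wsum
    cross = sum(w * (xi - mean_x) * (yi - mean_y)
                for w, xi, yi in zip(weights, x, y))
    # Assumption: weights act as repeat counts, so normalize by (sum(w) - ddof).
    return cross / (wsum - ddof)

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
var_x = weighted_cov(x, x)    # single parameter: variance
cov_xy = weighted_cov(x, y)   # pair of parameters: covariance
# Correlation of the pair, as in corrpair(): covariance over product of std.
corr_xy = cov_xy / math.sqrt(var_x * weighted_cov(y, y))
```

With this convention, a sample of weight 2 contributes exactly like the same sample listed twice.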

covpair(column1, column2, **kwargs)

Estimate weighted covariance of a pair of parameters column1 and column2. See cov().

eval(literal='None')

Evaluate input literal and return results. Python’s eval() is provided access to numpy (np), catalog global size gsize and columns.

extend(other)

Extend catalog with other.

falses()

Return array of size size filled with False.

classmethod from_array(array, columns=None, mpiroot=0, mpistate=0, mpicomm=None, **kwargs)

Build BaseCatalog from input array.

Parameters
  • columns (list) – List of columns to read from array.

  • mpiroot (int, default=0) – Rank of process where input array lives.

  • mpistate (string, mpi.CurrentMPIState) – MPI state of the input array: ‘scattered’, ‘gathered’, ‘broadcast’?

  • mpicomm (MPI communicator, default=None) – MPI communicator.

  • kwargs (dict) – Other arguments for __init__().

Returns

catalog

Return type

BaseCatalog

classmethod from_nbodykit(catalog, columns=None)

Build new catalog from nbodykit.

Parameters
  • catalog (nbodykit.base.catalog.CatalogSource) – nbodykit catalog.

  • columns (list, default=None) – Columns to import. Defaults to all columns.

Returns

catalog

Return type

BaseCatalog

classmethod from_state(state, mpistate=1, mpiroot=0, mpicomm=None)

Instantiate and initialize class with state dictionary.

full(fill_value, dtype=<class 'numpy.float64'>)

Return array of size size filled with fill_value.

classmethod gelman_rubin(chains, columns=None, statistic='mean', method='eigen', return_matrices=False, check=True)

Return Gelman-Rubin statistics, which compare covariance of chain means to (mean of) intra-chain covariances.

Parameters
  • chains (list) – List of Samples instances.

  • columns (list, ParameterCollection) – Parameters to compute Gelman-Rubin statistics for. Defaults to all parameters.

  • statistic (string, callable, default='mean') – If ‘mean’, compares covariance of chain means to (mean of) intra-chain covariances. Else, must be a callable taking Samples instance and parameter list as input and returning array of values (one for each parameter).

  • method (string, default='eigen') – If eigen, return eigenvalues of covariance ratios, else diagonal.

  • return_matrices (bool, default=False) – If True, also return pair of covariance matrices.

  • check (bool, default=True) – Whether to check for reliable inverse of intra-chain covariances.

Reference

http://www.stat.columbia.edu/~gelman/research/published/brooksgelman2.pdf
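For a single parameter, the comparison between the variance of chain means and the mean intra-chain variance can be sketched as below. This is the textbook scalar form, not necessarily what gelman_rubin() computes: the method works on covariance matrices (eigenvalues or diagonal), and its exact normalization is not stated here. The helper gelman_rubin_1d is hypothetical.

```python
from statistics import mean, variance

def gelman_rubin_1d(chains):
    """Hypothetical helper: scalar Gelman-Rubin ratio for one parameter."""
    # Truncate all chains to the same length n.
    n = min(len(chain) for chain in chains)
    chains = [chain[:n] for chain in chains]
    # W: mean of intra-chain variances.
    within = mean([variance(chain) for chain in chains])
    # B / n: variance of the chain means.
    between = variance([mean(chain) for chain in chains])
    # Pooled variance estimate, compared to the intra-chain variance.
    pooled = (n - 1) / n * within + between
    return pooled / within

converged = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
separated = [[0.0, 1.0, 2.0, 3.0], [10.0, 11.0, 12.0, 13.0]]
ratio_converged = gelman_rubin_1d(converged)
ratio_separated = gelman_rubin_1d(separated)
```

Chains that sample the same region give a ratio close to (or below) one; chains stuck in different regions give a much larger ratio.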

get(name, *args, **kwargs)

Return samples for parameter name. If not found, return default if provided.

Parameters

name (ParamName, string, tuple, Parameter) – Parameter name.

Returns

samples

Return type

array

gget(column, root=None)

Return on process rank root catalog global column column if exists, else return provided default. If root is None or Ellipsis return result on all processes.

gindices()

Row numbers in the global catalog.

property gsize

Return catalog global size, i.e. sum of size in each process.

gslice(*args)

Perform global slicing of catalog, e.g. catalog.gslice(0,100,1) will return a new catalog of global size 100. Same reference to attrs.

integrated_autocorrelation_time(chains, column, min_corr=None, c=5, reliable=50, check=False)

Return integrated autocorrelation time. Adapted from https://github.com/dfm/emcee/blob/main/src/emcee/autocorr.py

Parameters
  • chains (list) – List of Samples instances.

  • columns (list, ParameterCollection) – Parameters to compute integrated autocorrelation time for.

  • min_corr (float, default=None) – Integrate starting from this lower autocorrelation threshold. If None, use c.

  • c (float, int) – Step size for the window search.

  • reliable (float, int, default=50) – Minimum ratio between the chain length and estimated autocorrelation time for it to be considered reliable.

  • check (bool, default=False) – Whether to check for reliable estimate of autocorrelation time (based on reliable).

Returns

iat

Return type

scalar, array
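The windowed estimate used by emcee, from which this method is adapted, can be sketched for a single unweighted chain as follows. The sketch is a simplification: it uses a direct sum rather than an FFT, and it ignores weights, multiple chains, min_corr and the reliable check. The helper names are hypothetical, and a non-constant chain is assumed.

```python
import random

def autocorr_at(x, mean, var, lag):
    """Normalized autocorrelation of x at a given lag (direct sum)."""
    n = len(x)
    return sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)) / (n * var)

def integrated_autocorr_time(x, c=5):
    """Hypothetical helper: tau = 1 + 2 * sum of autocorrelations up to a window."""
    n = len(x)
    mean = sum(x) / n
    var = sum((xi - mean) ** 2 for xi in x) / n
    tau = 1.0
    for window in range(1, n):
        tau += 2.0 * autocorr_at(x, mean, var, window)
        # Stop at the smallest window that is at least c times the estimate.
        if window >= c * tau:
            break
    return tau

rng = random.Random(0)
# Uncorrelated samples: tau should be close to 1.
white = [rng.gauss(0.0, 1.0) for _ in range(2000)]
# AR(1) chain with coefficient 0.9: strongly correlated samples.
correlated = [0.0]
for _ in range(1999):
    correlated.append(0.9 * correlated[-1] + rng.gauss(0.0, 1.0))

tau_white = integrated_autocorr_time(white)
tau_correlated = integrated_autocorr_time(correlated)
```

The reliable parameter then expresses how many such autocorrelation times the chain must span for the estimate to be trusted.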

interval(column, nsigmas=1.0, bins=100, method='gaussian_kde', bw_method='scott')

Return n-sigmas confidence interval(s).

Parameters
  • columns (list, ParameterCollection, default=None) – Parameters to compute confidence interval for.

  • nsigmas (int) – Return interval for this number of sigmas.

  • bins (int, default=100) – Number of bins i.e. mesh nodes. See Mesh.from_samples().

  • method (string) – Method to interpolate (weighted) samples on mesh. See Mesh.from_samples().

  • bw_method (string, default='scott') – If method is 'gaussian_kde', method to determine KDE bandwidth, see scipy.stats.gaussian_kde.

Returns

interval

Return type

array

invcov(columns=None, ddof=1)

Estimate weighted parameter inverse covariance.

Parameters
  • columns (list, ParameterCollection, default=None) – Parameters to compute inverse covariance for. Defaults to all varied parameters.

  • ddof (int, default=1) – Number of degrees of freedom.

Returns

cov – If single parameter provided as columns, returns inverse variance for that parameter (scalar). Else returns inverse covariance (2D array).

Return type

scalar, array

classmethod load(*args, **kwargs)

Load catalog in numpy binary format from disk.

classmethod load_auto(filename, *args, **kwargs)

Load samples from disk.

Parameters
  • filename (string) – File name of samples. If ends with ‘.txt’, calls load_cosmomc(). Else (numpy binary format), calls load().

  • args (list) – Arguments for load function.

  • kwargs (dict) – Other arguments for load function.
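The dispatch on file extension described for filename amounts to the following sketch. The helper loader_name is hypothetical and only returns the name of the method that would be called; how load_auto() converts filename into the base_filename expected by load_cosmomc() is not specified here.

```python
def loader_name(filename):
    """Hypothetical helper: name of the method load_auto() would dispatch to."""
    if filename.endswith('.txt'):
        return 'load_cosmomc'
    # Anything else is treated as numpy binary format.
    return 'load'

choice_text = loader_name('chains/run_0.txt')
choice_binary = loader_name('chains/run.npy')
```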

classmethod load_cosmomc(base_filename, ichains=None, mpiroot=0, mpistate=1, mpicomm=None)

Load samples in CosmoMC format, i.e.:

  • ‘_{ichain}.txt’ files for sample values

  • ‘.paramnames’ files for parameter names / latex

  • ‘.ranges’ for parameter ranges

Parameters
  • base_filename (string) – Base CosmoMC file name. ‘_{ichain}.txt’ is appended for sample values, ‘.paramnames’ for parameter names and ‘.ranges’ for parameter ranges.

  • ichains (int, tuple, list, default=None) – Chain numbers to load. Defaults to all chains matching pattern ‘{base_filename}*.txt’

  • mpiroot (int, default=0) – Rank of root process.

  • mpistate (string, mpi.CurrentMPIState) – MPI state: ‘scattered’, ‘gathered’, ‘broadcast’?

  • mpicomm (MPI communicator, default=None) – MPI communicator.

Returns

samples

Return type

Samples

classmethod load_fits(filename, columns=None, ext=None, mpiroot=0, mpistate=0, mpicomm=None)

Load catalog in fits binary format from disk.

Parameters
  • columns (list, default=None) – List of column names to read. Defaults to all columns.

  • ext (int, default=None) – fits extension. Defaults to first extension with data.

  • mpiroot (int, default=0) – Rank of process where input array lives.

  • mpistate (string, mpi.CurrentMPIState) – MPI state of the input array: ‘scattered’, ‘gathered’, ‘broadcast’?

  • mpicomm (MPI communicator, default=None) – MPI communicator.

Returns

catalog

Return type

BaseCatalog

maximum(column)

Return global maximum of column(s) column.

mean(column)

Return weighted mean.

median(column)

Return global median of column(s) column.

minimum(column)

Return global minimum of column(s) column.

classmethod mpi_collect(self=None, sources=None, mpicomm=None)

Return new instance corresponding to self on larger mpicomm.

Parameters
  • self (object, None) – Instance to spread on mpicomm.

  • sources (list, None) – Ranks of processes of mpicomm where self lives. If None, takes the ranks of processes where self is not None.

  • mpicomm (MPI communicator) – New mpi communicator.

Returns

new

Return type

object

mpi_distribute(dests, mpicomm=None)

Return new instance corresponding to self on smaller mpicomm.

Parameters
  • self (object, None) – Instance to concentrate on mpicomm.

  • dests (list, None) – Ranks of processes of mpicomm to send self to. If None, takes the ranks of processes where self is not None.

  • mpicomm (MPI communicator) – New mpi communicator.

Returns

new

Return type

object, None

mpi_gather()

Gather catalog on a single process.

Warning

May blow up memory of the node this process runs on.

mpi_recv(source, tag=42)

Receive catalog from rank source with tag tag.

mpi_scatter()

Scatter catalog on all processes.

mpi_send(dest, tag=42)

Send catalog to rank dest with tag tag.

mpi_to_state(mpistate)

Return instance, changing current MPI state to mpistate.

property mpiattrs

MPI attributes

nans()

Return array of size size filled with numpy.nan.

ones(dtype=<class 'numpy.float64'>)

Return array of size size filled with one.

percentile(column, q=(15.87, 84.13))

Return global percentiles of column(s) column.

quantile(column, q=(0.1587, 0.8413))

Return weighted quantiles.
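The default q values of percentile() and quantile() correspond to the ±1 sigma bounds of a Gaussian, which can be checked with the standard library. The weighted quantile below is only a sketch with one possible convention (first sorted sample whose cumulative weight reaches the requested fraction); quantile() may interpolate differently. The helper weighted_quantile is hypothetical.

```python
from statistics import NormalDist

# Cumulative probability of a standard normal at -1 and +1 sigma.
lower = NormalDist().cdf(-1.0)
upper = NormalDist().cdf(1.0)

def weighted_quantile(x, q, weights=None):
    """Hypothetical helper: weighted quantile without interpolation."""
    if weights is None:
        weights = [1.0] * len(x)
    pairs = sorted(zip(x, weights))
    total = sum(weight for _, weight in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        # Assumption: return the first sample reaching the target fraction.
        if cumulative >= q * total:
            return value
    return pairs[-1][0]

median_unweighted = weighted_quantile([1.0, 2.0, 3.0, 4.0], 0.5)
median_weighted = weighted_quantile([1.0, 2.0, 3.0, 4.0], 0.5, weights=[1.0, 1.0, 1.0, 5.0])
```

Giving the last sample most of the weight pulls the median toward it.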

remove_burnin(burnin=0)

Return new samples with burn-in removed.

Parameters

burnin (float, int) – If burnin between 0 and 1, remove that fraction of samples. Else, remove burnin first points (in global samples).

Returns

samples

Return type

Samples
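The two interpretations of burnin can be sketched on a plain list. The helper burnin_start is hypothetical, and both the rounding of fractional burn-in (truncation) and the treatment of the boundary values 0 and 1 as counts are assumptions.

```python
def burnin_start(nsamples, burnin=0):
    """Hypothetical helper: index of the first sample kept after burn-in."""
    if 0 < burnin < 1:
        # Fraction of the chain to drop (assumption: truncate to an integer).
        return int(burnin * nsamples)
    # Otherwise burnin is a number of leading samples to drop.
    return int(burnin)

chain = list(range(10))
kept_after_fraction = chain[burnin_start(len(chain), 0.5):]
kept_after_count = chain[burnin_start(len(chain), 2):]
```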

save(filename)

Save class to disk.

save_auto(filename, *args, **kwargs)

Save samples to disk.

Parameters
  • filename (string) – File name of samples. If ends with ‘.txt’, calls save_cosmomc(). Else (numpy binary format), calls save().

  • args (list) – Arguments for save function.

  • kwargs (dict) – Other arguments for save function.

save_cosmomc(base_filename, columns=None, ichain=None, fmt='%.18e', delimiter=' ', **kwargs)

Save samples to disk in CosmoMC format.

Parameters
  • base_filename (string) – Base CosmoMC file name. ‘_{ichain}.txt’ is appended for sample values, ‘.paramnames’ for parameter names and ‘.ranges’ for parameter ranges.

  • columns (list, ParameterCollection, default=None) – Parameters to save samples of. Defaults to all parameters (weight and logposterior treated separately).

  • ichain (int, default=None) – Chain number to append to file name, i.e. sample values will be saved as ‘{base_filename}_{ichain}.txt’. If None, does not append any number, sample values will be saved as ‘{base_filename}.txt’.

  • kwargs (dict) – Arguments for numpy.savetxt().
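The file names implied by base_filename and ichain can be sketched as below. The helper cosmomc_filenames is hypothetical; it only assembles the names described above and does not write anything.

```python
def cosmomc_filenames(base_filename, ichain=None):
    """Hypothetical helper: CosmoMC-style file names for a given base name."""
    # Without a chain number the sample file is '{base_filename}.txt'.
    suffix = '.txt' if ichain is None else '_{:d}.txt'.format(ichain)
    return {
        'samples': base_filename + suffix,
        'paramnames': base_filename + '.paramnames',
        'ranges': base_filename + '.ranges',
    }

single = cosmomc_filenames('chains/run')
third = cosmomc_filenames('chains/run', ichain=3)
```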

save_fits(filename)

Save catalog to filename as fits file. Possible to change fitsio to write by chunks?

set(name, item)

Set parameter name samples to item.

Parameters
  • name (ParamName, string, tuple, Parameter) – Parameter name. If does not exist in current samples, creates new parameter.

  • item (array) – Samples for this parameter.

set_default_parameter(name=None)

Add default parameter of name name.

Parameters

name (ParamName, string, tuple, Parameter) – Parameter name.

property size

Equivalent to __len__().

std(column, **kwargs)

Estimate weighted standard deviation. Same arguments as var().

sum(column)

Return global sum of column(s) column.

to_array(columns=None, struct=True)

Return samples as numpy array.

Parameters
  • columns (list, default=None) – Columns to use. Defaults to all columns.

  • struct (bool, default=True) – Whether to return structured array, with columns accessible through e.g. array['Position']. If False, numpy will attempt to cast types of different columns.

Returns

array

Return type

array

to_getdist(columns=None)

Return GetDist hook to samples.

Parameters

columns (list, ParameterCollection, default=None) – Parameters to share to GetDist. Defaults to all parameters (weight and logposterior treated separately).

Returns

samples

Return type

getdist.MCSamples

to_mesh(columns, **kwargs)

Interpolate samples to mesh.

Parameters
  • columns (list, ParameterCollection) – List of parameters to build a mesh for.

  • kwargs (dict) – Arguments for Mesh.from_samples().

Returns

mesh – Mesh with interpolated samples.

Return type

Mesh

to_nbodykit(columns=None)

Return catalog in nbodykit format.

Parameters

columns (list, default=None) – Columns to export. Defaults to all columns.

Returns

catalog

Return type

nbodykit.base.catalog.CatalogSource

to_stats(columns=None, quantities=None, sigfigs=2, tablefmt='latex_raw', filename=None)

Export samples summary quantities.

Parameters
  • columns (list, default=None) – Parameters to export quantities for. Defaults to all parameters.

  • quantities (list, default=None) – Quantities to export. Defaults to ['argmax','mean','median','std','quantile:1sigma','interval:1sigma'].

  • sigfigs (int, default=2) – Number of significant digits. See utils.round_measurement().

  • tablefmt (string, default='latex_raw') – Format for summary table. See tabulate.tabulate().

  • filename (string, default=None) – If not None, file name where to save summary table.

Returns

tab – Summary table.

Return type

string

trues()

Return array of size size filled with True.

var(column, ddof=1)

Estimate weighted parameter variance.

Parameters
  • columns (list, ParameterCollection, default=None) – Parameters to compute variance for.

  • ddof (int, default=1) – Number of degrees of freedom.

Returns

var – If single parameter provided as columns, returns variance for that parameter (scalar). Else returns variance array.

Return type

scalar, array

zeros(dtype=<class 'numpy.float64'>)

Return array of size size filled with zero.