lib.samples.samples.Samples¶
- class lib.samples.samples.Samples(data=None, parameters=None, attrs=None)¶
Bases: cosmopipe.lib.catalog.base.BaseCatalog
Class that holds samples drawn from likelihood.
Initialize Samples.
- Parameters
data (dict, Samples) – Dictionary name: array. If Samples instance, update self attributes.
parameters (list, ParameterCollection, default=None) – Parameters. Defaults to data.keys().
attrs (dict) – Other attributes.
Methods
argmax – Return parameter value for maximum of cost.
argmin – Return parameter value for minimum of cost.
autocorrelation – Return weighted autocorrelation.
average – Return global average of column(s) column, with weights weights (defaults to 1).
columns – Return parameter names, after optional selections.
concatenate – Concatenate catalogs together.
copy – Return copy, including column names columns (defaults to all columns).
Estimate weighted parameter correlation matrix.
corrpair – Estimate weighted correlation of a pair of parameters column1 and column2.
cov – Estimate weighted parameter covariance.
covpair – Estimate weighted covariance of a pair of parameters column1 and column2.
deepcopy
eval – Evaluate input literal and return results.
extend – Extend catalog with other.
Return array of size size filled with False.
from_array – Build BaseCatalog from input array.
from_nbodykit – Build new catalog from nbodykit.
from_state – Instantiate and initialize class with state dictionary.
Return array of size size filled with fill_value.
gelman_rubin – Return Gelman-Rubin statistics, which compares covariance of chain means to (mean of) intra-chain covariances.
get – Return samples for parameter name.
gget – Return on process rank root catalog global column column if it exists, else return provided default.
gindices – Row numbers in the global catalog.
gslice – Perform global slicing of catalog.
integrated_autocorrelation_time – Return integrated autocorrelation time.
interval – Return n-sigmas confidence interval(s).
invcov – Estimate weighted parameter inverse covariance.
is_mpi_broadcast
is_mpi_gathered
is_mpi_root
is_mpi_scattered
load – Load catalog in numpy binary format from disk.
load_auto – Load samples from disk.
load_cosmomc – Load samples in CosmoMC format.
load_fits – Load catalog in fits binary format from disk.
log_critical
log_debug
log_error
log_info
log_warning
maximum – Return global maximum of column(s) column.
mean – Return weighted mean.
median – Return global median of column(s) column.
minimum – Return global minimum of column(s) column.
mpi_broadcast
mpi_collect – Return new instance corresponding to self on larger mpicomm.
mpi_distribute – Return new instance corresponding to self on smaller mpicomm.
mpi_gather – Gather catalog on a single process.
mpi_recv – Receive catalog from rank source with tag tag.
mpi_scatter – Scatter catalog on all processes.
mpi_send – Send catalog to rank dest with tag tag.
mpi_to_state – Return instance, changing current MPI state to mpistate.
Return array of size size filled with numpy.nan.
Return array of size size filled with one.
percentile – Return global percentiles of column(s) column.
quantile – Return weighted quantiles.
remove_burnin – Return new samples with burn-in removed.
save – Save class to disk.
save_auto – Save samples to disk.
save_cosmomc – Save samples to disk in CosmoMC format.
save_fits – Save catalog to filename as fits file.
set – Set parameter name samples to item.
set_default_parameter – Add default parameter of name name.
Estimate weighted standard deviation.
sum – Return global sum of column(s) column.
to_array – Return samples as numpy array.
to_getdist – Return GetDist hook to samples.
to_mesh – Interpolate samples to mesh.
to_nbodykit – Return catalog in nbodykit format.
to_stats – Export samples summary quantities.
Return array of size size filled with True.
var – Estimate weighted parameter variance.
Return array of size size filled with zero.
Attributes
gsize – Return catalog global size, i.e. sum of size in each process.
logger
mpiattrs – MPI attributes
mpistate
size – Equivalent for __len__().
- argmax(column, cost='metrics.logposterior')¶
Return parameter value for maximum of cost.
- argmin(column, cost='metrics.chi2')¶
Return parameter value for minimum of cost.
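As a rough illustration of what argmax() and argmin() return, the sketch below picks the parameter value at the sample where a cost array is largest or smallest. It uses plain NumPy arrays rather than a Samples instance. The array names are hypothetical stand-ins for a parameter column and for the metrics.logposterior and metrics.chi2 columns, and the relation chi2 = -2 * logposterior is an assumption made only for this example.

```python
import numpy as np

# Hypothetical stand-ins for three columns of a chain.
omega_m = np.array([0.28, 0.31, 0.30, 0.33])
logposterior = np.array([-5.2, -1.1, -0.4, -3.0])
chi2 = -2.0 * logposterior  # assumption made for this example only

# Parameter value at the sample that maximises / minimises the cost.
best_by_logposterior = omega_m[np.argmax(logposterior)]
best_by_chi2 = omega_m[np.argmin(chi2)]
```

Both selections point at the same sample here, because the two costs are monotonically related in this toy setup.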
- autocorrelation(column)¶
Return weighted autocorrelation. Adapted from https://github.com/dfm/emcee/blob/main/src/emcee/autocorr.py
- Parameters
columns (list, ParameterCollection) – Parameters to compute autocorrelation for. Defaults to all parameters.
- Returns
autocorr
- Return type
array
- average(column, weights=None)¶
Return global average of column(s) column, with weights weights (defaults to 1).
- columns(include=None, exclude=None, **kwargs)¶
Return parameter names, after optional selections.
- Parameters
include (list, string, default=None) – Single or list of regex patterns to select parameter names to include. Defaults to all parameters.
exclude (list, string, default=None) – Single or list of regex patterns to select parameter names to exclude. Defaults to no parameters.
kwargs (dict) – Selections on parameter attributes, e.g. varied=True for varied parameters.
- Returns
columns – Return parameters, after optional selections.
- Return type
list
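The include and exclude selections of columns() can be pictured with the following minimal sketch. The helper select_names is hypothetical and is not the library code; in particular, matching patterns from the start of the name with re.match is an assumption, and the selections on parameter attributes (kwargs) are left out.

```python
import re

def select_names(names, include=None, exclude=None):
    """Filter names with regex patterns (hypothetical helper, not the library code)."""
    def as_list(patterns):
        if patterns is None:
            return None
        return [patterns] if isinstance(patterns, str) else list(patterns)

    include, exclude = as_list(include), as_list(exclude)
    selected = []
    for name in names:
        # Keep a name if no include pattern is given or if any include pattern matches.
        if include is not None and not any(re.match(p, name) for p in include):
            continue
        # Drop a name as soon as one exclude pattern matches.
        if exclude is not None and any(re.match(p, name) for p in exclude):
            continue
        selected.append(name)
    return selected

names = ['cosmo.Omega_m', 'cosmo.h', 'galaxy.b1', 'metrics.logposterior']
cosmo_only = select_names(names, include=r'cosmo\..*')
no_metrics = select_names(names, exclude=[r'metrics\..*'])
```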
- classmethod concatenate(*others)¶
Concatenate catalogs together.
- Parameters
others (list) – List of BaseCatalog instances.
- Returns
new
- Return type
Warning
attrs of the returned catalog contains, for each key, the last value found in the attrs dictionaries of others.
- copy(columns=None)¶
Return copy, including column names columns (defaults to all columns).
- corrpair(column1, column2, **kwargs)¶
Estimate weighted correlation of a pair of parameters column1 and column2. See cov().
- cov(columns=None, ddof=1)¶
Estimate weighted parameter covariance.
- Parameters
columns (list, ParameterCollection, default=None) – Parameters to compute covariance for. Defaults to all varied parameters.
ddof (int, default=1) – Number of degrees of freedom.
- Returns
cov – If single parameter provided as columns, returns variance for that parameter (scalar). Else returns covariance (2D array).
- Return type
scalar, array
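To make the role of the weights and of ddof concrete, here is a minimal NumPy sketch of a weighted covariance estimate. It is not the implementation behind cov(): the function name weighted_cov is hypothetical, and treating weights as sample counts (so that the normalisation is the sum of weights minus ddof) is an assumption.

```python
import numpy as np

def weighted_cov(values, weights=None, ddof=1):
    """Weighted covariance of values with shape (nparams, nsamples)."""
    values = np.atleast_2d(np.asarray(values, dtype=float))
    if weights is None:
        weights = np.ones(values.shape[1])
    weights = np.asarray(weights, dtype=float)
    mean = np.average(values, weights=weights, axis=1)
    delta = values - mean[:, None]
    # Assumption: weights act as sample counts, so the normalisation is sum(w) - ddof.
    return (weights * delta) @ delta.T / (weights.sum() - ddof)

rng = np.random.default_rng(0)
chain = rng.normal(size=(2, 1000))
cov = weighted_cov(chain)
```

With unit weights this reduces to the usual unbiased estimate, and giving every sample a weight of 2 is equivalent to duplicating the chain.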
- covpair(column1, column2, **kwargs)¶
Estimate weighted covariance of a pair of parameters column1 and column2. See cov().
- eval(literal='None')¶
Evaluate input literal and return results. Python’s eval() is provided access to numpy (np), catalog global size gsize and columns.
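The namespace handed to Python's eval() can be sketched as follows. This is only an illustration under stated assumptions: the helper evaluate is hypothetical, columns are passed as a plain dictionary of arrays, and builtins are disabled here as a precaution that the library may or may not take.

```python
import numpy as np

def evaluate(literal, columns, gsize):
    """Evaluate literal with np, gsize and column arrays in scope (illustrative only)."""
    namespace = {'np': np, 'gsize': gsize}
    namespace.update(columns)
    # eval() executes arbitrary code: only pass trusted strings.
    return eval(literal, {'__builtins__': {}}, namespace)

columns = {'x': np.array([3.0, 6.0]), 'y': np.array([4.0, 8.0])}
radius = evaluate('np.sqrt(x**2 + y**2)', columns, gsize=2)
nothing = evaluate('None', columns, gsize=2)  # same literal as the documented default
```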
- extend(other)¶
Extend catalog with other.
- classmethod from_array(array, columns=None, mpiroot=0, mpistate=0, mpicomm=None, **kwargs)¶
Build BaseCatalog from input array.
- Parameters
columns (list) – List of columns to read from array.
mpiroot (int, default=0) – Rank of process where input array lives.
mpistate (string, mpi.CurrentMPIState) – MPI state of the input array: ‘scattered’, ‘gathered’, ‘broadcast’?
mpicomm (MPI communicator, default=None) – MPI communicator.
kwargs (dict) – Other arguments for __init__().
- Returns
catalog
- Return type
- classmethod from_nbodykit(catalog, columns=None)¶
Build new catalog from nbodykit.
- Parameters
catalog (nbodykit.base.catalog.CatalogSource) – nbodykit catalog.
columns (list, default=None) – Columns to import. Defaults to all columns.
- Returns
catalog
- Return type
- classmethod from_state(state, mpistate=1, mpiroot=0, mpicomm=None)¶
Instantiate and initialize class with state dictionary.
- classmethod gelman_rubin(chains, columns=None, statistic='mean', method='eigen', return_matrices=False, check=True)¶
Return Gelman-Rubin statistics, which compares covariance of chain means to (mean of) intra-chain covariances.
- Parameters
chains (list) – List of Samples instances.
columns (list, ParameterCollection) – Parameters to compute Gelman-Rubin statistics for. Defaults to all parameters.
statistic (string, callable, default='mean') – If ‘mean’, compares covariance of chain means to (mean of) intra-chain covariances. Else, must be a callable taking Samples instance and parameter list as input and returning array of values (one for each parameter).
method (string, default='eigen') – If ‘eigen’, return eigenvalues of covariance ratios, else diagonal.
return_matrices (bool, default=False) – If True, also return pair of covariance matrices.
check (bool, default=True) – Whether to check for reliable inverse of intra-chain covariances.
Reference: http://www.stat.columbia.edu/~gelman/research/published/brooksgelman2.pdf
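The comparison between the variance of chain means and the mean intra-chain variance can be sketched per parameter as below. This is a minimal NumPy version of the diagonal statistic from the Brooks and Gelman reference, not the code behind gelman_rubin(): it ignores weights, assumes chains of equal length stored in one array, and does not implement the eigenvalue method. The function name gelman_rubin_diag is hypothetical.

```python
import numpy as np

def gelman_rubin_diag(chains):
    """Per-parameter Gelman-Rubin ratio for chains of shape (nchains, nsteps, nparams)."""
    chains = np.asarray(chains, dtype=float)
    nchains, nsteps, _ = chains.shape
    # W: mean of the intra-chain variances.
    within = chains.var(axis=1, ddof=1).mean(axis=0)
    # B / n: variance of the chain means.
    between = chains.mean(axis=1).var(axis=0, ddof=1)
    # Pooled variance estimate, compared to W.
    pooled = (nsteps - 1.0) / nsteps * within + (1.0 + 1.0 / nchains) * between
    return pooled / within

rng = np.random.default_rng(42)
mixed = rng.normal(size=(4, 5000, 2))
stuck = mixed + np.arange(4)[:, None, None]  # each chain offset by its index
r_mixed = gelman_rubin_diag(mixed)
r_stuck = gelman_rubin_diag(stuck)
```

Chains sampling the same distribution give ratios close to 1, while chains centred on different values give ratios well above 1.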
- get(name, *args, **kwargs)¶
Return samples for parameter name. If not found, return default if provided.
- gget(column, root=None)¶
Return on process rank root catalog global column column if it exists, else return provided default. If root is None or Ellipsis, return result on all processes.
- gindices()¶
Row numbers in the global catalog.
- property gsize¶
Return catalog global size, i.e. sum of size in each process.
- gslice(*args)¶
Perform global slicing of catalog, e.g. catalog.gslice(0,100,1) will return a new catalog of global size 100. Same reference to attrs.
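One way to picture a global slice of a catalog scattered over several processes is to map the selected global row numbers back to local row numbers on each rank. The sketch below simulates the ranks with a list of local sizes instead of using MPI; the helper local_indices is hypothetical and says nothing about how gslice() is actually implemented.

```python
import numpy as np

def local_indices(sizes, start, stop, step=1):
    """Split a global slice into local row indices for each (simulated) process."""
    gsize = sum(sizes)
    selected = np.arange(gsize)[start:stop:step]  # global row numbers kept by the slice
    result, offset = [], 0
    for size in sizes:
        on_rank = selected[(selected >= offset) & (selected < offset + size)]
        result.append(on_rank - offset)  # convert to local row numbers
        offset += size
    return result

# Three ranks holding 60 rows each, sliced globally as (0, 100, 1).
per_rank = local_indices([60, 60, 60], 0, 100, 1)
new_gsize = sum(len(idx) for idx in per_rank)
```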
- integrated_autocorrelation_time(chains, column, min_corr=None, c=5, reliable=50, check=False)¶
Return integrated autocorrelation time. Adapted from https://github.com/dfm/emcee/blob/main/src/emcee/autocorr.py
- Parameters
chains (list) – List of Samples instances.
columns (list, ParameterCollection) – Parameters to compute integrated autocorrelation time for.
min_corr (float, default=None) – Integrate starting from this lower autocorrelation threshold. If None, use c.
c (float, int) – Step size for the window search.
reliable (float, int, default=50) – Minimum ratio between the chain length and estimated autocorrelation time for it to be considered reliable.
check (bool, default=False) – Whether to check for reliable estimate of autocorrelation time (based on reliable).
- Returns
iat
- Return type
scalar, array
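The window search controlled by c can be illustrated with a minimal sketch in the spirit of the emcee routine cited above. It is not the library implementation: it handles a single unweighted 1D chain, ignores min_corr, reliable and check, and the function names are hypothetical.

```python
import numpy as np

def autocorrelation_1d(x):
    """Normalised autocorrelation function of a 1D series, computed with FFTs."""
    x = np.asarray(x, dtype=float)
    nfft = 1
    while nfft < 2 * len(x):  # zero-pad to avoid circular wrap-around
        nfft *= 2
    f = np.fft.rfft(x - x.mean(), n=nfft)
    acf = np.fft.irfft(f * np.conjugate(f), n=nfft)[:len(x)]
    return acf / acf[0]

def integrated_time(x, c=5):
    """Integrated autocorrelation time with an automatic window of width c * tau."""
    acf = autocorrelation_1d(x)
    taus = 2.0 * np.cumsum(acf) - 1.0
    # Smallest window M such that M >= c * tau(M).
    too_small = np.arange(len(taus)) < c * taus
    window = np.argmin(too_small) if np.any(~too_small) else len(taus) - 1
    return taus[window]

rng = np.random.default_rng(1)
white = rng.normal(size=100_000)
ar1 = np.empty_like(white)
ar1[0] = white[0]
for i in range(1, len(white)):
    ar1[i] = 0.9 * ar1[i - 1] + white[i]  # AR(1) process with coefficient 0.9

tau_white = integrated_time(white)  # uncorrelated samples: tau close to 1
tau_ar1 = integrated_time(ar1)      # theory: (1 + 0.9) / (1 - 0.9) = 19
```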
- interval(column, nsigmas=1.0, bins=100, method='gaussian_kde', bw_method='scott')¶
Return n-sigmas confidence interval(s).
- Parameters
columns (list, ParameterCollection, default=None) – Parameters to compute confidence interval for.
nsigmas (int) – Return interval for this number of sigmas.
bins (int, default=100) – Number of bins i.e. mesh nodes. See Mesh.from_samples().
method (string) – Method to interpolate (weighted) samples on mesh. See Mesh.from_samples().
bw_method (string, default='scott') – If method is 'gaussian_kde', method to determine KDE bandwidth, see scipy.stats.gaussian_kde.
- Returns
interval
- Return type
array
- invcov(columns=None, ddof=1)¶
Estimate weighted parameter inverse covariance.
- Parameters
columns (list, ParameterCollection, default=None) – Parameters to compute inverse covariance for. Defaults to all varied parameters.
ddof (int, default=1) – Number of degrees of freedom.
- Returns
cov – If single parameter provided as columns, returns inverse variance for that parameter (scalar). Else returns inverse covariance (2D array).
- Return type
scalar, array
- classmethod load(*args, **kwargs)¶
Load catalog in numpy binary format from disk.
- classmethod load_auto(filename, *args, **kwargs)¶
Load samples from disk.
- Parameters
filename (string) – File name of samples. If ends with ‘.txt’, calls load_cosmomc(). Else (numpy binary format), calls load().
args (list) – Arguments for load function.
kwargs (dict) – Other arguments for load function.
- classmethod load_cosmomc(base_filename, ichains=None, mpiroot=0, mpistate=1, mpicomm=None)¶
Load samples in CosmoMC format, i.e.:
‘_{ichain}.txt’ files for sample values
‘.paramnames’ files for parameter names / latex
‘.ranges’ for parameter ranges
- Parameters
base_filename (string) – Base CosmoMC file name. The suffixes ‘_{ichain}.txt’ for sample values, ‘.paramnames’ for parameter names and ‘.ranges’ for parameter ranges are appended to it.
ichains (int, tuple, list, default=None) – Chain numbers to load. Defaults to all chains matching pattern ‘{base_filename}*.txt’
mpiroot (int, default=0) – Rank of root process.
mpistate (string, mpi.CurrentMPIState) – MPI state: ‘scattered’, ‘gathered’, ‘broadcast’?
mpicomm (MPI communicator, default=None) – MPI communicator.
- Returns
samples
- Return type
- classmethod load_fits(filename, columns=None, ext=None, mpiroot=0, mpistate=0, mpicomm=None)¶
Load catalog in fits binary format from disk.
- Parameters
columns (list, default=None) – List of column names to read. Defaults to all columns.
ext (int, default=None) – fits extension. Defaults to first extension with data.
mpiroot (int, default=0) – Rank of process where input array lives.
mpistate (string, mpi.CurrentMPIState) – MPI state of the input array: ‘scattered’, ‘gathered’, ‘broadcast’?
mpicomm (MPI communicator, default=None) – MPI communicator.
- Returns
catalog
- Return type
- maximum(column)¶
Return global maximum of column(s) column.
- mean(column)¶
Return weighted mean.
- median(column)¶
Return global median of column(s) column.
- minimum(column)¶
Return global minimum of column(s) column.
- classmethod mpi_collect(self=None, sources=None, mpicomm=None)¶
Return new instance corresponding to self on larger mpicomm.
- Parameters
self (object, None) – Instance to spread on mpicomm.
sources (list, None) – Ranks of processes of mpicomm where self lives. If None, takes the ranks of processes where self is not None.
mpicomm (MPI communicator) – New mpi communicator.
- Returns
new
- Return type
object
- mpi_distribute(dests, mpicomm=None)¶
Return new instance corresponding to self on smaller mpicomm.
- Parameters
self (object, None) – Instance to concentrate on mpicomm.
dests (list, None) – Ranks of processes of mpicomm where to send self. If None, takes the ranks of processes where self is not None.
mpicomm (MPI communicator) – New mpi communicator.
- Returns
new
- Return type
object, None
- mpi_gather()¶
Gather catalog on a single process.
Warning
May blow up memory of the node this process runs on.
- mpi_recv(source, tag=42)¶
Receive catalog from rank source with tag tag.
- mpi_scatter()¶
Scatter catalog on all processes.
- mpi_send(dest, tag=42)¶
Send catalog to rank dest with tag tag.
- mpi_to_state(mpistate)¶
Return instance, changing current MPI state to mpistate.
- property mpiattrs¶
MPI attributes
- percentile(column, q=(15.87, 84.13))¶
Return global percentiles of column(s) column.
- quantile(column, q=(0.1587, 0.8413))¶
Return weighted quantiles.
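The default q values of quantile(), (0.1587, 0.8413), bracket the central 68.27% of a distribution, i.e. roughly the ±1 sigma range of a Gaussian; the defaults of percentile() are the same numbers multiplied by 100. The sketch below shows one common way to compute weighted quantiles with NumPy. The function weighted_quantile is hypothetical, and the mid-point definition of the cumulative weights is an assumption: other conventions exist and the library may use a different one.

```python
import numpy as np

def weighted_quantile(x, q, weights=None):
    """Weighted quantiles of a 1D sample by interpolating the cumulative weights."""
    x = np.asarray(x, dtype=float)
    if weights is None:
        weights = np.ones_like(x)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(x)
    x, weights = x[order], weights[order]
    # Mid-point cumulative weights, normalised to lie between 0 and 1.
    cdf = (np.cumsum(weights) - 0.5 * weights) / weights.sum()
    return np.interp(q, cdf, x)

rng = np.random.default_rng(3)
samples = rng.normal(size=100_000)
low, high = weighted_quantile(samples, (0.1587, 0.8413))
```

For a unit Gaussian sample, low and high come out close to -1 and +1.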
- remove_burnin(burnin=0)¶
Return new samples with burn-in removed.
- Parameters
burnin (float, int) – If burnin between 0 and 1, remove that fraction of samples. Else, remove burnin first points (in global samples).
- Returns
samples
- Return type
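The two interpretations of burnin described above can be sketched with a small helper that returns the first global row to keep. The function burnin_start is hypothetical, and treating only values strictly between 0 and 1 as fractions is an assumption about the boundary cases.

```python
def burnin_start(gsize, burnin):
    """First global row kept after burn-in removal (hypothetical helper)."""
    if 0 < burnin < 1:
        # Fractional burn-in: drop that fraction of the global samples.
        return int(burnin * gsize)
    # Otherwise interpret burnin as a number of samples to drop.
    return min(int(burnin), gsize)

rows = list(range(1000))
kept_fraction = rows[burnin_start(len(rows), 0.25):]  # drop the first 25%
kept_count = rows[burnin_start(len(rows), 200):]      # drop the first 200 rows
```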
- save(filename)¶
Save class to disk.
- save_auto(filename, *args, **kwargs)¶
Save samples to disk.
- Parameters
filename (string) – File name of samples. If ends with ‘.txt’, calls save_cosmomc(). Else (numpy binary format), calls save().
args (list) – Arguments for save function.
kwargs (dict) – Other arguments for save function.
- save_cosmomc(base_filename, columns=None, ichain=None, fmt='%.18e', delimiter=' ', **kwargs)¶
Save samples to disk in CosmoMC format.
- Parameters
base_filename (string) – Base CosmoMC file name. The suffixes ‘_{ichain}.txt’ for sample values, ‘.paramnames’ for parameter names and ‘.ranges’ for parameter ranges are appended to it.
columns (list, ParameterCollection, default=None) – Parameters to save samples of. Defaults to all parameters (weight and logposterior treated separately).
ichain (int, default=None) – Chain number to append to file name, i.e. sample values will be saved as ‘{base_filename}_{ichain}.txt’. If None, does not append any number, sample values will be saved as ‘{base_filename}.txt’.
kwargs (dict) – Arguments for numpy.savetxt().
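The file naming convention described for base_filename and ichain can be summarised with the following sketch. The helper cosmomc_filenames is hypothetical and only builds the file names; it does not write or read anything, and the path 'chains/run' is made up for the example.

```python
def cosmomc_filenames(base_filename, ichain=None):
    """File names implied by the CosmoMC convention (hypothetical helper)."""
    suffix = '' if ichain is None else '_{:d}'.format(ichain)
    return {
        'samples': '{}{}.txt'.format(base_filename, suffix),
        'paramnames': '{}.paramnames'.format(base_filename),
        'ranges': '{}.ranges'.format(base_filename),
    }

single = cosmomc_filenames('chains/run')          # no chain number appended
third = cosmomc_filenames('chains/run', ichain=3)
```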
- save_fits(filename)¶
Save catalog to filename as fits file. Possible to change fitsio to write by chunks?
- set(name, item)¶
Set parameter name samples to item.
- set_default_parameter(name=None)¶
Add default parameter of name name.
- property size¶
Equivalent for __len__().
- sum(column)¶
Return global sum of column(s) column.
- to_array(columns=None, struct=True)¶
Return samples as numpy array.
- Parameters
columns (list, default=None) – Columns to use. Defaults to all columns.
struct (bool, default=True) – Whether to return structured array, with columns accessible through e.g. array['Position']. If False, numpy will attempt to cast types of different columns.
- Returns
array
- Return type
array
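The difference between a structured array and a plain array can be seen with NumPy alone. The sketch below is an analogue of the two struct settings, not the code of to_array(): the column 'NGAL' is a hypothetical integer column, and the (ncolumns, nrows) layout of the plain array is an assumption.

```python
import numpy as np

columns = {'Position': np.array([0.5, 1.5, 2.5]), 'NGAL': np.array([1, 2, 3])}

# struct=True analogue: one named field per column, each keeping its own dtype.
dtype = [(name, values.dtype) for name, values in columns.items()]
structured = np.empty(3, dtype=dtype)
for name, values in columns.items():
    structured[name] = values

# struct=False analogue: a plain 2D array, so numpy casts all columns to a common dtype.
plain = np.array([columns[name] for name in columns])
```

In the structured array the integer column keeps its integer type, whereas in the plain array it is promoted to float to match the other column.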
- to_getdist(columns=None)¶
Return GetDist hook to samples.
- Parameters
columns (list, ParameterCollection, default=None) – Parameters to share to GetDist. Defaults to all parameters (weight and logposterior treated separately).
- Returns
samples
- Return type
getdist.MCSamples
- to_mesh(columns, **kwargs)¶
Interpolate samples to mesh.
- Parameters
columns (list, ParameterCollection) – List of parameters to build a mesh for.
kwargs (dict) – Arguments for Mesh.from_samples().
- Returns
mesh – Mesh with interpolated samples.
- Return type
- to_nbodykit(columns=None)¶
Return catalog in nbodykit format.
- Parameters
columns (list, default=None) – Columns to export. Defaults to all columns.
- Returns
catalog
- Return type
nbodykit.base.catalog.CatalogSource
- to_stats(columns=None, quantities=None, sigfigs=2, tablefmt='latex_raw', filename=None)¶
Export samples summary quantities.
- Parameters
columns (list, default=None) – Parameters to export quantities for. Defaults to all parameters.
quantities (list, default=None) – Quantities to export. Defaults to ['argmax','mean','median','std','quantile:1sigma','interval:1sigma'].
sigfigs (int, default=2) – Number of significant digits. See utils.round_measurement().
tablefmt (string, default='latex_raw') – Format for summary table. See tabulate.tabulate().
filename (string, default=None) – If not None, file name where to save summary table.
- Returns
tab – Summary table.
- Return type
string
- var(column, ddof=1)¶
Estimate weighted parameter variance.
- Parameters
columns (list, ParameterCollection, default=None) – Parameters to compute variance for.
ddof (int, default=1) – Number of degrees of freedom.
- Returns
var – If single parameter provided as columns, returns variance for that parameter (scalar). Else returns variance array.
- Return type
scalar, array