lib.catalog.catalog.Catalog¶
- class lib.catalog.catalog.Catalog(data=None, columns=None, attrs=None)¶
Bases: lib.catalog.base.BaseCatalog
Class that represents a standard catalog.
Initialize BaseCatalog.
- Parameters
data (dict, BaseCatalog) – Dictionary name: array. If BaseCatalog instance, update self attributes.
columns (list, default=None) – List of column names. Defaults to data.keys().
attrs (dict) – Other attributes.
Methods
average – Return global average of column(s) column, with weights weights (defaults to 1).
columns – Return catalog column names, after optional selections.
concatenate – Concatenate catalogs together.
copy – Return copy, including column names columns (defaults to all columns).
cov – Estimate weighted covariance.
deepcopy
eval – Evaluate input literal and return results.
extend – Extend catalog with other.
Return array of size size filled with False.
from_array – Build BaseCatalog from input array.
from_nbodykit – Build new catalog from nbodykit.
from_state – Instantiate and initialize class with state dictionary.
Return array of size size filled with fill_value.
get – Return catalog (local) column column if it exists, else return provided default.
gget – Return on process rank root the catalog global column column if it exists, else return provided default.
gindices – Row numbers in the global catalog.
gslice – Perform global slicing of catalog.
is_mpi_broadcast
is_mpi_gathered
is_mpi_root
is_mpi_scattered
load – Load catalog in numpy binary format from disk.
load_auto – Load catalog from disk.
load_fits – Load catalog in fits binary format from disk.
log_critical
log_debug
log_error
log_info
log_warning
maximum – Return global maximum of column(s) column.
mean – Return global mean of column(s) column.
median – Return global median of column(s) column.
minimum – Return global minimum of column(s) column.
mpi_broadcast
mpi_collect – Return new instance corresponding to self on larger mpicomm.
mpi_distribute – Return new instance corresponding to self on smaller mpicomm.
mpi_gather – Gather catalog on a single process.
mpi_recv – Receive catalog from rank source with tag tag.
mpi_scatter – Scatter catalog on all processes.
mpi_send – Send catalog to rank dest with tag tag.
mpi_to_state – Return instance, changing current MPI state to mpistate.
Return array of size size filled with numpy.nan.
Return array of size size filled with one.
percentile – Return global percentiles of column(s) column.
quantile – Return global quantiles of column(s) column.
save – Save class to disk.
save_auto – Write catalog to disk.
save_fits – Save catalog to filename as fits file.
set – Set column of name column.
Estimate weighted standard deviation.
sum – Return global sum of column(s) column.
to_array – Return catalog as numpy array.
to_nbodykit – Return catalog in nbodykit format.
to_stats – Export catalog summary quantities.
Return array of size size filled with True.
var – Estimate weighted parameter variance.
Return array of size size filled with zero.
Attributes
gsize – Return catalog global size, i.e. sum of size in each process.
logger
mpiattrs – MPI attributes
mpistate
size – Equivalent for __length__().
- average(column, weights=None)¶
Return global average of column(s) column, with weights weights (defaults to 1).
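A global weighted average can be assembled from per-process partial sums. The sketch below simulates processes with a plain list of local chunks; the helper name global_average and the chunking are illustrative assumptions, not the library's implementation.

```python
import numpy as np

def global_average(local_columns, local_weights=None):
    """Weighted average over all chunks, as if each chunk lived on one process."""
    if local_weights is None:
        # Mirror the documented default: every row gets weight 1.
        local_weights = [np.ones(len(col)) for col in local_columns]
    # Each "process" contributes two numbers: sum(w * x) and sum(w).
    weighted_sum = sum(np.sum(w * x) for x, w in zip(local_columns, local_weights))
    total_weight = sum(np.sum(w) for w in local_weights)
    return weighted_sum / total_weight

chunks = [np.array([1.0, 2.0]), np.array([3.0, 4.0, 5.0])]
weights = [np.array([1.0, 1.0]), np.array([2.0, 2.0, 2.0])]
unweighted = global_average(chunks)
weighted = global_average(chunks, weights)
```

Only two scalars per process need to be combined, so the full column never has to sit on a single process.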
- columns(include=None, exclude=None)¶
Return catalog column names, after optional selections.
- Parameters
include (list, string, default=None) – Single or list of regex patterns to select column names to include. Defaults to all columns.
exclude (list, string, default=None) – Single or list of regex patterns to select column names to exclude. Defaults to no columns.
- Returns
columns – Return catalog column names, after optional selections.
- Return type
list
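The include and exclude selections can be pictured with a small regex filter. This is a sketch only: select_columns is a hypothetical helper, and matching with re.match (anchored at the start of the name) is an assumption, since the documentation does not say how patterns are applied.

```python
import re

def select_columns(names, include=None, exclude=None):
    """Filter column names with regex patterns (hypothetical helper)."""
    def as_list(patterns):
        # Accept a single pattern or a list of patterns.
        return [patterns] if isinstance(patterns, str) else list(patterns)

    selected = list(names)
    if include is not None:
        selected = [name for name in selected
                    if any(re.match(p, name) for p in as_list(include))]
    if exclude is not None:
        selected = [name for name in selected
                    if not any(re.match(p, name) for p in as_list(exclude))]
    return selected

names = ['Position', 'Velocity', 'Weight_sys', 'Weight_fkp']
only_weights = select_columns(names, include='Weight_.*')
no_weights = select_columns(names, exclude=['Weight_.*'])
```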
- classmethod concatenate(*others)¶
Concatenate catalogs together.
- Parameters
others (list) – List of BaseCatalog instances.
- Returns
new
- Return type
Warning
attrs of returned catalog contains, for each key, the last value found in others attrs dictionaries.
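The warning describes a "last value wins" merge of the attrs dictionaries. A minimal sketch of that behaviour with plain dictionaries follows; merge_attrs is a hypothetical name, and the actual merging code of concatenate may differ.

```python
def merge_attrs(*attrs_dicts):
    """Merge attrs dictionaries; for duplicate keys the last value wins."""
    merged = {}
    for attrs in attrs_dicts:
        merged.update(attrs)
    return merged

first = {'boxsize': 1000.0, 'name': 'north'}
second = {'name': 'south', 'zmax': 1.2}
merged = merge_attrs(first, second)
```

Here the key name is present in both inputs, so the merged dictionary keeps the value from second and silently drops the one from first.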
- copy(columns=None)¶
Return copy, including column names columns (defaults to all columns).
- cov(columns=None, fweights=None, aweights=None, ddof=1)¶
Estimate weighted covariance.
- Parameters
columns (list, default=None) – Columns to compute covariance for.
fweights (array, int, default=None) – 1D array of integer frequency weights; the number of times each observation vector should be repeated.
aweights (array, default=None) – 1D array of observation vector weights. These relative weights are typically large for observations considered “important” and smaller for observations considered less “important”. If ddof=0 the array of weights can be used to assign probabilities to observation vectors.
ddof (int, default=1) – Number of degrees of freedom.
- Returns
cov – If single parameter provided as columns, returns variance for that parameter (scalar). Else returns covariance (2D array).
- Return type
scalar, array
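The fweights, aweights and ddof descriptions read like those of numpy.cov. Assuming the same semantics (an assumption, not something this page confirms), the effect of frequency weights can be checked directly with numpy:

```python
import numpy as np

# Two columns (variables), three observations each.
samples = np.array([[1.0, 2.0, 4.0],
                    [0.5, 1.5, 1.0]])
fweights = np.array([1, 2, 1])  # second observation counted twice

cov_weighted = np.cov(samples, fweights=fweights, ddof=1)

# Same result when the second observation is physically repeated.
repeated = np.array([[1.0, 2.0, 2.0, 4.0],
                     [0.5, 1.5, 1.5, 1.0]])
cov_repeated = np.cov(repeated, ddof=1)

# With a single column, the "covariance" reduces to a scalar variance.
var_single = np.cov(samples[0], fweights=fweights, ddof=1)
```

The diagonal of the 2D result holds the per-column variances, which is why a single column yields a scalar.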
- eval(literal='None')¶
Evaluate input literal and return results. Python’s eval() is provided access to numpy (np), catalog global size gsize and columns.
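One way to picture this is an eval() call with a namespace holding np, gsize and one variable per column. The sketch below assumes exactly that layout; eval_literal and the column names are hypothetical.

```python
import numpy as np

def eval_literal(literal, columns, gsize):
    """Evaluate literal with numpy, gsize and the columns in scope."""
    namespace = {'np': np, 'gsize': gsize}
    namespace.update(columns)  # each column name becomes a variable
    return eval(literal, namespace)

columns = {'Z': np.array([0.5, 1.0, 1.5]), 'Weight': np.array([1.0, 2.0, 1.0])}
mask = eval_literal('(Z > 0.6) & (Weight < 2.0)', columns, gsize=3)
scaled = eval_literal('np.sum(Weight) / gsize', columns, gsize=3)
```

Because eval() executes arbitrary Python, literals of this kind should only come from trusted input.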
- extend(other)¶
Extend catalog with other.
- classmethod from_array(array, columns=None, mpiroot=0, mpistate=0, mpicomm=None, **kwargs)¶
Build BaseCatalog from input array.
- Parameters
columns (list) – List of columns to read from array.
mpiroot (int, default=0) – Rank of process where input array lives.
mpistate (string, mpi.CurrentMPIState) – MPI state of the input array: ‘scattered’, ‘gathered’, ‘broadcast’?
mpicomm (MPI communicator, default=None) – MPI communicator.
kwargs (dict) – Other arguments for __init__().
- Returns
catalog
- Return type
- classmethod from_nbodykit(catalog, columns=None)¶
Build new catalog from nbodykit.
- Parameters
catalog (nbodykit.base.catalog.CatalogSource) – nbodykit catalog.
columns (list, default=None) – Columns to import. Defaults to all columns.
- Returns
catalog
- Return type
- classmethod from_state(state, mpistate=1, mpiroot=0, mpicomm=None)¶
Instantiate and initialize class with state dictionary.
- get(column, *args, **kwargs)¶
Return catalog (local) column column if it exists, else return provided default.
- gget(column, root=None)¶
Return on process rank root the catalog global column column if it exists, else return provided default. If root is None or Ellipsis, return result on all processes.
- gindices()¶
Row numbers in the global catalog.
- property gsize¶
Return catalog global size, i.e. sum of size in each process.
- gslice(*args)¶
Perform global slicing of catalog, e.g. catalog.gslice(0,100,1) will return a new catalog of global size 100. Same reference to attrs.
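A global slice has to be translated into local row numbers on each process. The sketch below assumes rows are stored contiguously in rank order, which this page does not state; local_rows_for_global_slice is a hypothetical helper, not part of the class.

```python
def local_rows_for_global_slice(local_sizes, start, stop, step=1):
    """For each process, list the local row numbers selected by a global slice."""
    gsize = sum(local_sizes)
    selected = range(*slice(start, stop, step).indices(gsize))
    rows_per_process, offset = [], 0
    for size in local_sizes:
        # Global rows [offset, offset + size) live on this process.
        rows_per_process.append([g - offset for g in selected
                                 if offset <= g < offset + size])
        offset += size
    return rows_per_process

# Three processes holding 60, 50 and 40 rows: global size 150.
rows = local_rows_for_global_slice([60, 50, 40], 0, 100, 1)
new_gsize = sum(len(r) for r in rows)
```

With these sizes the first process keeps all of its rows, the second keeps its first 40, and the third keeps none, giving a new global size of 100.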
- classmethod load(*args, **kwargs)¶
Load catalog in numpy binary format from disk.
- classmethod load_auto(filename, *args, **kwargs)¶
Load catalog from disk.
- Parameters
filename (string) – File name of catalog. If ends with ‘.fits’, calls load_fits(). Else (numpy binary format), calls load().
args (list) – Arguments for load function.
kwargs (dict) – Other arguments for load function.
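The dispatch rule is a plain suffix test. The sketch below uses stand-in loaders that only report which branch was taken; pick_format, load_auto_sketch and the loaders dictionary are hypothetical names used for illustration.

```python
def pick_format(filename):
    """Return which documented loader family a file name maps to."""
    # Per the description: '.fits' suffix -> fits routine, anything else -> numpy binary.
    return 'fits' if filename.endswith('.fits') else 'numpy'

def load_auto_sketch(filename, loaders, *args, **kwargs):
    """Dispatch to loaders['fits'] or loaders['numpy'] (hypothetical callables)."""
    return loaders[pick_format(filename)](filename, *args, **kwargs)

# Stand-in loaders that only report which branch was taken.
loaders = {'fits': lambda filename, *a, **kw: ('load_fits', filename),
           'numpy': lambda filename, *a, **kw: ('load', filename)}
fits_call = load_auto_sketch('catalog.fits', loaders)
npy_call = load_auto_sketch('catalog.npy', loaders)
```

Any file name that does not end with '.fits' falls through to the numpy binary branch, whatever its extension.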
- classmethod load_fits(filename, columns=None, ext=None, mpiroot=0, mpistate=0, mpicomm=None)¶
Load catalog in fits binary format from disk.
- Parameters
columns (list, default=None) – List of column names to read. Defaults to all columns.
ext (int, default=None) – fits extension. Defaults to first extension with data.
mpiroot (int, default=0) – Rank of process where input array lives.
mpistate (string, mpi.CurrentMPIState) – MPI state of the input array: ‘scattered’, ‘gathered’, ‘broadcast’?
mpicomm (MPI communicator, default=None) – MPI communicator.
- Returns
catalog
- Return type
- maximum(column)¶
Return global maximum of column(s) column.
- mean(column)¶
Return global mean of column(s) column.
- median(column)¶
Return global median of column(s) column.
- minimum(column)¶
Return global minimum of column(s) column.
- classmethod mpi_collect(self=None, sources=None, mpicomm=None)¶
Return new instance corresponding to self on larger mpicomm.
- Parameters
self (object, None) – Instance to spread on mpicomm.
sources (list, None) – Ranks of processes of mpicomm where self lives. If None, takes the ranks of processes where self is not None.
mpicomm (MPI communicator) – New mpi communicator.
- Returns
new
- Return type
object
- mpi_distribute(dests, mpicomm=None)¶
Return new instance corresponding to self on smaller mpicomm.
- Parameters
self (object, None) – Instance to concentrate on mpicomm.
dests (list, None) – Ranks of processes of mpicomm where to send self. If None, takes the ranks of processes where self is not None.
mpicomm (MPI communicator) – New mpi communicator.
- Returns
new
- Return type
object, None
- mpi_gather()¶
Gather catalog on a single process.
Warning
May blow up memory of the node this process runs on.
- mpi_recv(source, tag=42)¶
Receive catalog from rank source with tag tag.
- mpi_scatter()¶
Scatter catalog on all processes.
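The gather and scatter operations can be pictured without MPI by splitting columns into per-process chunks and joining them again. This single-process simulation is an assumption-laden sketch: the real methods use an MPI communicator, and the contiguous split chosen here (numpy.array_split) may not match the library's layout.

```python
import numpy as np

def scatter_columns(columns, nprocs):
    """Split each column into nprocs contiguous chunks, one per process."""
    chunks = [{} for _ in range(nprocs)]
    for name, values in columns.items():
        for rank, part in enumerate(np.array_split(values, nprocs)):
            chunks[rank][name] = part
    return chunks

def gather_columns(chunks):
    """Concatenate the per-process chunks back into full columns."""
    return {name: np.concatenate([chunk[name] for chunk in chunks])
            for name in chunks[0]}

columns = {'Z': np.linspace(0.0, 1.0, 10), 'Weight': np.ones(10)}
scattered = scatter_columns(columns, nprocs=3)
local_sizes = [len(chunk['Z']) for chunk in scattered]
gathered = gather_columns(scattered)
```

The gathered dictionary holds every row at once, which is the memory cost the mpi_gather warning refers to.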
- mpi_send(dest, tag=42)¶
Send catalog to rank dest with tag tag.
- mpi_to_state(mpistate)¶
Return instance, changing current MPI state to mpistate.
- property mpiattrs¶
MPI attributes
- percentile(column, q=(15.87, 84.13))¶
Return global percentiles of column(s) column.
- quantile(column, q=(0.1587, 0.8413), weights=None)¶
Return global quantiles of column(s) column.
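The default q values of percentile and quantile coincide with the levels of a standard normal distribution at -1 and +1 standard deviations. The page does not state this motivation, so treat it as an observation; the short check below uses only the standard library.

```python
import math

def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

lower = 100.0 * normal_cdf(-1.0)  # percentile at -1 sigma
upper = 100.0 * normal_cdf(+1.0)  # percentile at +1 sigma
central_fraction = (upper - lower) / 100.0  # mass enclosed within +/- 1 sigma
```

Rounded to two decimals, lower and upper reproduce the defaults 15.87 and 84.13, so for a Gaussian-distributed column the default call brackets the central 68.27% of the values.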
- save(filename)¶
Save class to disk.
- save_auto(filename, *args, **kwargs)¶
Write catalog to disk.
- Parameters
filename (string) – File name of catalog. If ends with ‘.fits’, calls save_fits(). Else (numpy binary format), calls save().
args (list) – Arguments for save function.
kwargs (dict) – Other arguments for save function.
- save_fits(filename)¶
Save catalog to filename as fits file. Possible to change fitsio to write by chunks?
- set(column, item)¶
Set column of name column.
- property size¶
Equivalent for __length__().
- sum(column)¶
Return global sum of column(s) column.
- to_array(columns=None, struct=True)¶
Return catalog as numpy array.
- Parameters
columns (list, default=None) – Columns to use. Defaults to all catalog columns.
struct (bool, default=True) – Whether to return structured array, with columns accessible through e.g. array['Position']. If False, numpy will attempt to cast types of different columns.
- Returns
array
- Return type
array
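The difference between the two values of struct can be illustrated with numpy alone. The column names and the way the arrays are built below are assumptions for illustration; to_array itself may construct its output differently.

```python
import numpy as np

columns = {'ID': np.array([1, 2, 3]), 'Z': np.array([0.5, 1.0, 1.5])}

# struct=True analogue: one named field per column, original dtypes kept.
dtype = [(name, values.dtype) for name, values in columns.items()]
structured = np.empty(3, dtype=dtype)
for name, values in columns.items():
    structured[name] = values

# struct=False analogue: a plain 2D array, so numpy promotes to a common dtype.
plain = np.column_stack([columns[name] for name in columns])
```

In the structured array the ID field stays an integer, while in the plain array the integer column is promoted to floating point alongside Z.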
- to_nbodykit(columns=None)¶
Return catalog in nbodykit format.
- Parameters
columns (list, default=None) – Columns to export. Defaults to all columns.
- Returns
catalog
- Return type
nbodykit.base.catalog.CatalogSource
- to_stats(columns=None, quantities=None, sigfigs=2, tablefmt='latex_raw', filename=None)¶
Export catalog summary quantities.
- Parameters
columns (list, default=None) – Columns to export quantities for. Defaults to all columns.
quantities (list, default=None) – Quantities to export. Defaults to ['mean','median','std'].
sigfigs (int, default=2) – Number of significant digits. See utils.round_measurement().
tablefmt (string, default='latex_raw') – Format for summary table. See tabulate.tabulate().
filename (string, default=None) – If not None, file name where to save summary table.
- Returns
tab – Summary table.
- Return type
string
- var(column, fweights=None, aweights=None, ddof=1)¶
Estimate weighted parameter variance.
- Parameters
columns (list, default=None) – Columns to compute variance for.
fweights (array, int, default=None) – 1D array of integer frequency weights; the number of times each observation vector should be repeated.
aweights (array, default=None) – 1D array of observation vector weights. These relative weights are typically large for observations considered “important” and smaller for observations considered less “important”. If ddof=0 the array of weights can be used to assign probabilities to observation vectors.
- Returns
var – If single parameter provided as columns, returns variance for that parameter (scalar). Else returns variance array.
- Return type
scalar, array