lib.data_vector.binned_statistic.BinnedStatistic

class lib.data_vector.binned_statistic.BinnedStatistic(data=None, edges=None, dims=None, attrs=None)

Bases: cosmopipe.lib.utils.BaseClass

Class representing a binned statistic, similar to https://github.com/bccp/nbodykit/blob/master/nbodykit/binned_statistic.py.

data

Dictionary of data arrays, of same shape.

Type

dict

edges

Dictionary of edges.

Type

dict

dims

List of dimension names.

Type

list

attrs

Dictionary of other attributes.

Type

dict

Initialize BinnedStatistic.

Parameters
  • data (dict, default=None) – Dictionary of data arrays, of same shape. Defaults to empty dictionary.

  • edges (dict, list, default=None) – Dictionary of edges, or list of edges corresponding to dims. If None, no edges are considered.

  • dims (list, default=None) – List of dimension names. If None, defaults to edges dictionary keys.

  • attrs (dict, default=None) – Dictionary of other attributes.
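The defaulting rules for edges and dims can be pictured with a small stand-in. The helper normalize_edges_dims below is hypothetical and not part of cosmopipe; it only mirrors the documented behaviour, assuming that a list of edges is paired with dims by position.

```python
def normalize_edges_dims(edges=None, dims=None):
    """Hypothetical helper mirroring the documented defaults (not cosmopipe code)."""
    if edges is None:
        # No edges considered; dims stay as given (or empty).
        return {}, list(dims or [])
    if isinstance(edges, dict):
        # dims default to the keys of the edges dictionary.
        dims = list(edges.keys()) if dims is None else list(dims)
        return {dim: edges[dim] for dim in dims}, dims
    # Assumption: a list of edges is matched to dims by position.
    if dims is None or len(dims) != len(edges):
        raise ValueError('a list of edges requires one dimension name per entry')
    return dict(zip(dims, edges)), list(dims)


edges, dims = normalize_edges_dims(edges={'k': [0.0, 0.1, 0.2], 'mu': [0.0, 0.5, 1.0]})
```

With a dictionary of edges and no dims, the dimension names come out as ['k', 'mu'], in the order of the dictionary keys.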

Methods

average

Average binned data along dimensions dims.

copy

Return shallow copy of self.

deepcopy

from_state

Instantiate and initialize class with state dictionary.

get_header_txt

Dump header.

get_title_label

Return title, as module.class.

has_edges

Has specified edges?

is_mpi_broadcast

is_mpi_gathered

is_mpi_root

is_mpi_scattered

load

Load class in numpy binary format from disk.

load_auto

If different formats are possible, this method should choose between them based on the file name extension.

load_txt

Load BinnedStatistic from disk.

log_critical

log_debug

log_error

log_info

log_warning

read_header_txt

Read and decode header.

read_title_label

Decode title line, splitting module.class into (module, class).

rebin

Rebin data by factor factors.

save

Save class to disk.

save_auto

If different formats are possible, this method should choose between them based on the file name extension.

save_txt

Dump BinnedStatistic.

set_new_edges

Interpolate binned data within new edges.

squeeze

Squeeze binned data along dimensions dims.

Attributes

columns

Entries in data.

logger

mpiattrs

MPI attributes

mpicomm

mpiroot

mpistate

ndim

Number of dimensions of binned data.

shape

Shape of binned data; if edges are set, return a tuple of the length of each edges array minus 1, else the shape of the first array in data.

size

Total size of binned data, i.e. product of length over all dimensions.

average(dims=None, weights=None, columns_to_sum=None)

Average binned data along dimensions dims. This is equivalent to rebin() with factors corresponding to lengths along dimensions dims, followed by squeeze() along those dimensions.
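The equivalence with rebin() followed by squeeze() can be checked on a toy 2D array. This is a minimal sketch with unit weights and nested lists; the two helper functions are hypothetical and stand in for the methods of the class.

```python
def rebin_last_axis(rows, factor):
    """Average consecutive groups of `factor` values in each row (unit weights)."""
    return [[sum(row[i:i + factor]) / factor for i in range(0, len(row), factor)]
            for row in rows]


def squeeze_last_axis(rows):
    """Drop the last axis, assumed to have length 1."""
    return [row[0] for row in rows]


data = [[1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0]]
# Rebinning by the full length of the axis leaves a single bin along it...
rebinned = rebin_last_axis(data, factor=len(data[0]))  # shape (2, 1)
# ...which squeezing then removes.
averaged = squeeze_last_axis(rebinned)                 # shape (2,)
```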

property columns

Entries in data.

copy()

Return shallow copy of self.

classmethod from_state(state, mpiroot=0, mpicomm=None)

Instantiate and initialize class with state dictionary.

get_header_txt(comments='#', ignore_json_errors=True)

Dump header.

Parameters
  • comments (string, default='#') – String to be prepended to the header lines.

  • ignore_json_errors (bool, default=True) – When trying to dump attrs using json, ignore errors.

Returns

header – List of strings (lines).

Return type

list

classmethod get_title_label()

Return title, as module.class.

has_edges()

Has specified edges?

classmethod load(filename, mpiroot=0, mpicomm=None)

Load class in numpy binary format from disk. If the loaded state contains __class__ and that class exists in cls._registry, return an instance of cls._registry[__class__] (instead of cls).
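The registry dispatch described above might look like the following sketch. The classes Statistic and Projection are hypothetical, and how cosmopipe fills and keys _registry is not stated here; registration by class name through __init_subclass__ is an assumption made for illustration.

```python
class Statistic:
    """Hypothetical base class with a name -> class registry."""
    _registry = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Assumption: subclasses register themselves under their class name.
        cls._registry[cls.__name__] = cls

    def __init__(self, attrs=None):
        self.attrs = dict(attrs or {})

    @classmethod
    def from_state(cls, state):
        # Dispatch to the registered class if the state names one, else use cls.
        target = cls._registry.get(state.get('__class__'), cls)
        return target(attrs=state.get('attrs'))


class Projection(Statistic):
    pass


loaded = Statistic.from_state({'__class__': 'Projection', 'attrs': {'z': 0.8}})
fallback = Statistic.from_state({'attrs': {}})
```

Here loaded is a Projection even though from_state was called on Statistic, while a state without __class__ falls back to the calling class.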

load_auto(*args, **kwargs)

If different formats are possible, this method should choose between them based on the file name extension.

classmethod load_txt(filename, comments='#', usecols=None, skip_rows=0, max_rows=None, mapping_header=None, pattern_header=None, attrs=None, **kwargs)

Load BinnedStatistic from disk.

Note

If previously saved using save_txt(), loading the BinnedStatistic only requires filename. In this case, the returned instance will be of the class that was used to create it (e.g. BinnedProjection below) - not necessarily BinnedStatistic.

Parameters
  • filename (string) – File name to read in.

  • comments (string, default='#') – Characters used to indicate the start of a comment.

  • usecols (list, default=None) – Which columns to read, with 0 being the first.

  • skip_rows (int, default=0) – Skip the first skip_rows lines, including comments.

  • max_rows (int, default=None) – Read max_rows lines of content after skip_rows lines. The default is to read all the lines.

  • mapping_header (dict, default=None) – Dictionary holding key:regex mapping or (regex, type) to provide the type. The corresponding values, read in the header, will be saved in the attrs dictionary.

  • pattern_header (string, default=None) – A regex pattern for header with groups corresponding to key, value to add into the attrs dictionary.

  • attrs (dict, default=None) – Attributes to save in the attrs dictionary.

  • kwargs (dict) – Arguments for __init__() (other than data and attrs).

Returns

data

Return type

BinnedStatistic

property mpiattrs

MPI attributes

property ndim

Number of dimensions of binned data.

classmethod read_header_txt(file, comments='#', mapping_header=None, pattern_header=None, ignore_json_errors=True)

Read and decode header.

Parameters
  • file (list, iterator) – List of lines.

  • comments (string, default='#') – Characters used to indicate the start of a header line.

  • mapping_header (dict, default=None) – Dictionary holding key:regex mapping or (regex, type) to provide the type. Type can be unspecified (or None), in which case decoding is attempted with json; otherwise it can be a string naming an entry of __builtins__, or a callable.

  • pattern_header (string, default=None) – A regex pattern with groups corresponding to key:value.

  • ignore_json_errors (bool, default=True) – When trying to decode header values using json, ignore errors.

Returns

attrs

Return type

dict
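A rough picture of how mapping_header and pattern_header could act on header lines is sketched below. The functions decode_value and read_header are hypothetical stand-ins, the header content is invented, and stopping at the first non-comment line is an assumption; only the type rules (json when unspecified, builtin name, or callable) follow the parameter description.

```python
import builtins
import json
import re


def decode_value(text, astype=None):
    """Decode a header value following the documented type rules (sketch)."""
    if astype is None:
        # Unspecified type: try json, fall back to the raw string.
        try:
            return json.loads(text)
        except json.JSONDecodeError:
            return text
    if isinstance(astype, str):
        # A string names a builtin, e.g. 'float' or 'int'.
        astype = getattr(builtins, astype)
    return astype(text)


def read_header(lines, comments='#', mapping_header=None, pattern_header=None):
    """Hypothetical stand-in for the header decoding; not cosmopipe code."""
    attrs = {}
    for line in lines:
        if not line.startswith(comments):
            break  # assumption: the header ends at the first non-comment line
        text = line[len(comments):].strip()
        for key, spec in (mapping_header or {}).items():
            regex, astype = spec if isinstance(spec, tuple) else (spec, None)
            match = re.match(regex, text)
            if match:
                attrs[key] = decode_value(match.group(1), astype)
        if pattern_header is not None:
            match = re.match(pattern_header, text)
            if match:
                key, value = match.groups()
                attrs[key] = decode_value(value)
    return attrs


lines = ['# shotnoise = 1200.5', '# nmodes = 42', '# label = "P0"', '0.01 1.0e4']
attrs = read_header(lines,
                    mapping_header={'shotnoise': (r'shotnoise = (.*)', 'float'),
                                    'nmodes': (r'nmodes = (.*)', int)},
                    pattern_header=r'(label) = (.*)')
```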

classmethod read_title_label(line)

Decode title line, splitting module.class into (module, class). It loads module, then if class is in _registry, return corresponding class. Else return None.
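The decoding steps can be sketched with the standard library. The function resolve_title_label and the registry contents are hypothetical, and keying the registry by class name is an assumption; a stdlib module is used so the example runs without cosmopipe.

```python
import collections
import importlib


def resolve_title_label(label, registry):
    """Hypothetical stand-in: 'module.class' -> registered class, or None."""
    module_name, class_name = label.strip().rsplit('.', 1)
    # Importing the module gives it the chance to define (and register) its classes.
    importlib.import_module(module_name)
    return registry.get(class_name)


registry = {'OrderedDict': collections.OrderedDict}
found = resolve_title_label('collections.OrderedDict', registry)
missing = resolve_title_label('collections.Counter', registry)
```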

rebin(factors, dims=None, weights=None, columns_to_sum=None)

Rebin data by factor factors.

Parameters
  • factors (dict, list) – dim: rebinning factor mapping. If list, should contain rebinning factor for each dim of dims.

  • dims (list, default=None) – List of dimension names. Defaults to dims.

  • weights (array, default=None) – Array of weights (of shape shape). If None, defaults to 1.

  • columns_to_sum (list, default=None) – List of columns to sum, i.e. to not renormalize after rebinning.
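The difference between averaged and summed columns can be illustrated in one dimension. The function rebin_1d is a hypothetical sketch, not the library's implementation; in particular, whether the library applies weights to summed columns is not stated here, so the summed column below is rebinned without weights.

```python
def rebin_1d(values, factor, weights=None, to_sum=False):
    """Merge consecutive groups of `factor` bins (hypothetical 1D sketch)."""
    if len(values) % factor:
        raise ValueError('factor must divide the number of bins')
    if weights is None:
        weights = [1.0] * len(values)
    out = []
    for start in range(0, len(values), factor):
        chunk = values[start:start + factor]
        wchunk = weights[start:start + factor]
        total = sum(w * v for w, v in zip(wchunk, chunk))
        # Summed columns keep the total; other columns are renormalized.
        out.append(total if to_sum else total / sum(wchunk))
    return out


power = [10.0, 20.0, 30.0, 40.0]
nmodes = [1.0, 3.0, 2.0, 2.0]
power_rebinned = rebin_1d(power, 2, weights=nmodes)  # weighted mean per pair of bins
nmodes_rebinned = rebin_1d(nmodes, 2, to_sum=True)   # counts simply add up
```

A column such as a number of modes is a natural candidate for columns_to_sum, since merging two bins should add their counts rather than average them.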

save(filename)

Save class to disk.

save_auto(*args, **kwargs)

If different formats are possible, this method should choose between them based on the file name extension.

save_txt(filename=None, fmt='.18e', comments='#', ignore_json_errors=True)

Dump BinnedStatistic.

Parameters
  • filename (string, default=None) – ASCII file name where to save binned data. If None, do not write on disk.

  • fmt (string, default='.18e') – Floating point format.

  • comments (string, default='#') – String that will be prepended to the header lines.

  • ignore_json_errors (bool, default=True) – When trying to dump attrs using json, ignore errors.

Returns

lines – List of strings (lines).

Return type

list
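The exact file layout is not described here. The sketch below only shows how a comments prefix and a fmt of '.18e' act on header lines and values; the function dump_lines and its layout (commented header lines, then one space-separated row per line) are assumptions.

```python
def dump_lines(header, rows, fmt='.18e', comments='#'):
    """Hypothetical layout: commented header lines, then one row per line."""
    lines = ['{} {}'.format(comments, text) for text in header]
    for row in rows:
        # Each value is written with the requested floating point format.
        lines.append(' '.join(format(value, fmt) for value in row))
    return lines


lines = dump_lines(['k power'], [(0.5, 1.0)])
```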

set_new_edges(edges, dims=None, weights=None, columns_to_sum=None)

Interpolate binned data within new edges. Perform linear interpolation if edges are not a simple concatenation of current edges.

Parameters
  • edges (dict, list) – New dim: edges mapping. If list, should contain edges for each dim of dims.

  • dims (list, default=None) – List of dimension names. Defaults to dims.

  • weights (array, default=None) – Array of weights (of shape shape). If None, defaults to 1.

  • columns_to_sum (list, default=None) – List of columns to sum, i.e. to not renormalize after rebinning.

Warning

Requires edges to be set.
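One reading of "simple concatenation" is that every new edge coincides with a current edge, so that each new bin is a union of current bins and no interpolation is needed. The sketch below follows that reading with unit weights; both functions are hypothetical and the interpolation branch is not shown.

```python
def is_concatenation(old_edges, new_edges, tol=1e-12):
    """True if each new edge coincides with one of the current edges."""
    return all(any(abs(new - old) <= tol for old in old_edges) for new in new_edges)


def merge_bins(values, old_edges, new_edges):
    """Average the current bins that fall inside each new bin (unit weights)."""
    merged = []
    for low, high in zip(new_edges[:-1], new_edges[1:]):
        inside = [v for v, lo, hi in zip(values, old_edges[:-1], old_edges[1:])
                  if lo >= low and hi <= high]
        merged.append(sum(inside) / len(inside))
    return merged


old_edges = [0.0, 1.0, 2.0, 3.0, 4.0]
values = [1.0, 3.0, 5.0, 7.0]
aligned = is_concatenation(old_edges, [0.0, 2.0, 4.0])     # bins can be merged
misaligned = is_concatenation(old_edges, [0.0, 1.5, 4.0])  # would need interpolation
merged = merge_bins(values, old_edges, [0.0, 2.0, 4.0])
```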

property shape

Shape of binned data; if edges are set, return a tuple of the length of each edges array minus 1, else the shape of the first array in data.

property size

Total size of binned data, i.e. product of length over all dimensions.
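The relation between edges, shape, ndim and size can be written out directly. This is a minimal sketch with made-up edges; shape_from_edges is a hypothetical helper, not a cosmopipe function.

```python
from math import prod


def shape_from_edges(edges, dims):
    """N + 1 edges along a dimension delimit N bins."""
    return tuple(len(edges[dim]) - 1 for dim in dims)


edges = {'k': [0.0, 0.1, 0.2, 0.3], 'mu': [0.0, 0.5, 1.0]}
shape = shape_from_edges(edges, ['k', 'mu'])
ndim = len(shape)   # number of dimensions
size = prod(shape)  # product of lengths over all dimensions
```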

squeeze(dims=None)

Squeeze binned data along dimensions dims. If dims is None, defaults to dimensions with length <= 1. dims and edges are updated to match new shape.
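The bookkeeping on dims and edges can be sketched as follows. The function squeeze_dims is hypothetical and assumes that every squeezed dimension has length at most 1; the data arrays themselves are left out.

```python
def squeeze_dims(shape, dims, edges, squeeze=None):
    """Return the dims and edges left after squeezing (hypothetical sketch)."""
    if squeeze is None:
        # Default: squeeze every dimension of length <= 1.
        squeeze = [dim for dim, n in zip(dims, shape) if n <= 1]
    kept = [dim for dim in dims if dim not in squeeze]
    return kept, {dim: edges[dim] for dim in kept}


edges = {'k': [0.0, 0.1, 0.2, 0.3], 'mu': [0.0, 1.0]}
kept_dims, kept_edges = squeeze_dims((3, 1), ['k', 'mu'], edges)
```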