lib.data_vector.binned_statistic.BinnedProjection

class lib.data_vector.binned_statistic.BinnedProjection(data=None, x=None, y=None, edges=None, dims=None, weights=None, proj=None, attrs=None)

Bases: lib.data_vector.binned_statistic.BinnedStatistic

Class representing a binned projection, i.e. a BinnedStatistic with a ProjectionName attribute. It can represent, e.g., the power spectrum monopole. Dimensions are x-coordinates.

proj

Projection.

Type

ProjectionName

Initialize BinnedProjection.

Parameters
  • data (dict, default=None) – Dictionary of data arrays, of same shape. Defaults to empty dictionary.

  • x (tuple, string, array, default=None) – Name of x-coordinate(s) in data, or array (e.g. 'k', ('s','mu'))

  • y (string, array, default=None) – Name of y-coordinate in data, or array.

  • edges (dict, list, default=None) – Dictionary of edges, or list of edges corresponding to dims. If None, no edges considered.

  • dims (list, default=None) – List of dimension names. If None, defaults to edges dictionary keys.

  • weights (string, array, default=None) – Name of weights in data, or array. These will be used to rebin data; e.g. number of modes for the power spectrum, RR pair counts for the correlation function.

  • proj (ProjectionName, string, tuple, dict, default=None) – Projection.

  • attrs (dict, default=None) – Dictionary of other attributes.

Methods

average

Average binned data along dimensions dims.

copy

Return shallow copy of self.

deepcopy

from_state

Instantiate and initialize class with state dictionary.

get_edges

Return x-edges within xlim or mask.

get_header_txt

Return header, adding proj to that of BinnedStatistic.

get_index

Return index.

get_title_label

Return title, as module.class.

get_weights

Same as get_x(), for weights.

get_x

Return x-coordinates within xlim or mask.

get_x_average

Return average of x-coordinates.

get_y

Same as get_x(), for y-coordinate.

has_edges

Are edges specified?

has_weights

Are weights specified?

has_x

Are x-coordinates specified?

has_y

Are y-coordinates specified?

is_mpi_broadcast

is_mpi_gathered

is_mpi_root

is_mpi_scattered

load

Load class in numpy binary format from disk.

load_auto

If different formats are possible, this method should choose between them based on file name extension.

load_txt

Load BinnedStatistic from disk.

log_critical

log_debug

log_error

log_info

log_warning

read_header_txt

Read and decode header.

read_title_label

Decode title line, splitting module.class into (module, class).

rebin

Rebin data by factor factors.

save

Save class to disk.

save_auto

If different formats are possible, this method should choose between them based on file name extension.

save_txt

Dump BinnedStatistic.

set_new_edges

Interpolate binned data within new edges.

set_x

Set x-coordinates.

set_y

Same as set_x(), for y-coordinate.

squeeze

Squeeze binned data along dimensions dims.

Attributes

columns

Entries in data.

logger

mpiattrs

MPI attributes

mpicomm

mpiroot

mpistate

ndim

Number of dimensions of binned data.

shape

Shape of binned data; if edges are set, return tuple of (length of edges - 1) along each dimension, else shape of first array in data.

size

Total size of binned data, i.e. product of length over all dimensions.

average(dims=None, weights=None, columns_to_sum=None)

Average binned data along dimensions dims. This is equivalent to rebin() with factors corresponding to lengths along dimensions dims, followed by squeeze() along those dimensions.
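
The equivalence stated above (rebin over the full length of a dimension, then squeeze it) can be illustrated with plain NumPy. This is a minimal sketch, not the library's implementation: the arrays, the dimension order (s, mu) and the variable names are made up for the example.

```python
import numpy as np

# Hypothetical binned data of shape (2, 3), with dims ('s', 'mu').
values = np.array([[1., 2., 3.], [4., 5., 6.]])
weights = np.array([[1., 1., 2.], [1., 3., 1.]])
axis = 1  # average along 'mu'

# Step 1: "rebin" with a factor equal to the full length along the axis.
# All bins along that axis collapse into one weighted mean (length-1 axis kept).
rebinned = (values * weights).sum(axis=axis, keepdims=True) / weights.sum(axis=axis, keepdims=True)

# Step 2: "squeeze" the now length-1 dimension.
averaged = np.squeeze(rebinned, axis=axis)

# Same result as a direct weighted average along the axis.
direct = np.average(values, axis=axis, weights=weights)
```

Here rebinned has shape (2, 1) and averaged has shape (2,), matching the direct weighted average.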

property columns

Entries in data.

copy()

Return shallow copy of self.

classmethod from_state(state, mpiroot=0, mpicomm=None)

Instantiate and initialize class with state dictionary.

get_edges(xlim=None, mask=Ellipsis)

Return x-edges within xlim or mask.

Parameters
  • xlim (list, tuple, default=None) – x-limits for each x-coordinate.

  • mask (list, tuple, default=Ellipsis) – Mask for each x-coordinate, of same length as data along each x-coordinate.

Returns

edges – (masked) edges along each x-coordinate.

Return type

tuple

get_header_txt(comments='#', **kwargs)

Return header, adding proj to that of BinnedStatistic.

get_index(xlim=None, mask=Ellipsis, flatten=True)

Return index.

Parameters
  • xlim (list, tuple, default=None) – x-limits for each x-coordinate.

  • mask (list, tuple, default=Ellipsis) – Mask for each x-coordinate. If flatten is False, must be of same length as data along each x-coordinate, else same size (i.e. product of shape) as data.

  • flatten (bool, default=True) – If True, return index in flattened data array. Else, return tuple of 1D index along each dimension.
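
The difference between the two index layouts controlled by flatten can be sketched with NumPy alone. The masks and shape below are invented for illustration, and the exact array types returned by get_index() are not specified here, so treat this as an assumption-laden picture of the two conventions rather than the method's actual output.

```python
import numpy as np

shape = (3, 4)  # hypothetical binned data with dims ('s', 'mu')
mask_s = np.array([True, False, True])
mask_mu = np.array([False, True, True, False])

# "flatten=False" layout: one 1D index array per dimension.
index_per_dim = (np.flatnonzero(mask_s), np.flatnonzero(mask_mu))

# "flatten=True" layout: positions in the flattened (C-ordered) data array.
ii, jj = np.meshgrid(*index_per_dim, indexing='ij')
flat_index = np.ravel_multi_index((ii.ravel(), jj.ravel()), shape)

# Both layouts select the same entries.
data = np.arange(12).reshape(shape)
selected_nd = data[np.ix_(*index_per_dim)]
selected_flat = data.ravel()[flat_index]
```

With these masks, flat_index is [1, 2, 9, 10] and selected_flat equals selected_nd.ravel().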

classmethod get_title_label()

Return title, as module.class.

get_weights(xlim=None, mask=Ellipsis, flatten=True)

Same as get_x(), for weights.

get_x(xlim=None, mask=Ellipsis, flatten=True)

Return x-coordinates within xlim or mask.

Parameters
  • xlim (list, tuple, default=None) – x-limits for each x-coordinate.

  • mask (list, tuple, default=Ellipsis) – Mask for each x-coordinate. If flatten is False, must be of same length as data along each x-coordinate, else same size (i.e. product of shape) as data.

  • flatten (bool, default=True) – If True, return flattened x. Else, ndim-D x.

Returns

x – (masked) x-coordinates (1D if flatten, else ndim-D), stacked along last axis. If only one x-coordinate, last dimension is removed.

Return type

array

get_x_average(xlim=None, mask=Ellipsis, weights=None, from_edges=None)

Return average of x-coordinates, e.g. if x is (s, mu), the average of s over mu and the average of mu over s.

Parameters
  • xlim (list, tuple, default=None) – x-limits for each x-coordinate.

  • mask (list, tuple, default=Ellipsis) – 1D mask for each x-coordinate.

  • weights (array, default=None) – Array of same shape as data.

  • from_edges (bool, default=None) – If True, return points at mid-edges. If None and x-coordinates are not in data, return mid-points. Otherwise, return the average of x-coordinates along each dimension.

Returns

x

Return type

tuple of 1D arrays
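
As a rough illustration of what such averages look like for x = (s, mu), the NumPy sketch below computes the weighted average of s over mu, the weighted average of mu over s, and the mid-points of a set of edges. All arrays are invented, and how get_x_average() combines weights internally is an assumption.

```python
import numpy as np

# Hypothetical x-coordinates stored per (s, mu) bin, shape (2, 3).
s = np.array([[10., 11., 12.], [20., 21., 22.]])
mu = np.array([[0.1, 0.5, 0.9], [0.2, 0.5, 0.8]])
weights = np.array([[1., 2., 1.], [1., 1., 2.]])

# Average of s over mu: one value per s-bin.
s_avg = np.average(s, axis=1, weights=weights)
# Average of mu over s: one value per mu-bin.
mu_avg = np.average(mu, axis=0, weights=weights)

# Mid-edge alternative (the from_edges=True case described above).
edges_s = np.array([5., 15., 25.])
s_mid = 0.5 * (edges_s[:-1] + edges_s[1:])

x_average = (s_avg, mu_avg)  # tuple of 1D arrays, one per x-coordinate
```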

get_y(xlim=None, mask=Ellipsis, flatten=True)

Same as get_x(), for y-coordinate.

has_edges()

Are edges specified?

has_weights()

Are weights specified?

has_x()

Are x-coordinates specified?

has_y()

Are y-coordinates specified?

classmethod load(filename, mpiroot=0, mpicomm=None)

Load class in numpy binary format from disk. If the loaded state contains __class__ and that exists in cls._registry, return instance of cls._registry[__class__] (instead of cls).

load_auto(*args, **kwargs)

If different formats are possible, this method should choose between them based on file name extension.

classmethod load_txt(filename, comments='#', usecols=None, skip_rows=0, max_rows=None, mapping_header=None, pattern_header=None, attrs=None, **kwargs)

Load BinnedStatistic from disk.

Note

If previously saved using save_txt(), loading the BinnedStatistic only requires filename. In this case, the returned instance will be of the class that was used to create it (e.g. BinnedProjection below) - not necessarily BinnedStatistic.

Parameters
  • filename (string) – File name to read in.

  • comments (string, default='#') – Characters used to indicate the start of a comment.

  • usecols (list, default=None) – Which columns to read, with 0 being the first.

  • skip_rows (int, default=0) – Skip the first skip_rows lines, including comments.

  • max_rows (int, default=None) – Read max_rows lines of content after skip_rows lines. The default is to read all the lines.

  • mapping_header (dict, default=None) – Dictionary holding key:regex mapping or (regex, type) to provide the type. The corresponding values, read in the header, will be saved in the attrs dictionary.

  • pattern_header (string, default=None) – A regex pattern for header with groups corresponding to key, value to add into the attrs dictionary.

  • attrs (dict, default=None) – Attributes to save in the attrs dictionary.

  • kwargs (dict) – Arguments for __init__() (other than data and attrs).

Returns

data

Return type

BinnedStatistic

property mpiattrs

MPI attributes

property ndim

Number of dimensions of binned data.

classmethod read_header_txt(file, comments='#', mapping_header=None, pattern_header=None, ignore_json_errors=True)

Read and decode header.

Parameters
  • file (list, iterator) – List of lines.

  • comments (string, default='#') – Characters used to indicate the start of a header line.

  • mapping_header (dict, default=None) – Dictionary holding key:regex mapping or (regex, type) to provide the type. Type can be unspecified (or None), in which case decoding is attempted with json; it can also be a string corresponding to __builtins__, or a callable.

  • pattern_header (string, default=None) – A regex pattern with groups corresponding to key:value.

  • ignore_json_errors (bool, default=True) – When trying to decode header values using json, ignore errors.

Returns

attrs

Return type

dict
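
The mapping_header mechanism can be pictured with a small standalone parser. This is a hypothetical helper written for illustration only, not the method's code: it handles a plain regex or a (regex, callable) pair, does not cover the string-type case, and assumes that a value failing JSON decoding is kept as a raw string when errors are ignored.

```python
import json
import re


def read_header(lines, comments='#', mapping_header=None, ignore_json_errors=True):
    """Hypothetical header reader: collect attrs from leading comment lines."""
    attrs = {}
    mapping_header = mapping_header or {}
    for line in lines:
        if not line.startswith(comments):
            break  # assumption: the header ends at the first non-comment line
        text = line[len(comments):].strip()
        for key, spec in mapping_header.items():
            # spec is either a regex, or a (regex, type) tuple.
            regex, type_ = spec if isinstance(spec, tuple) else (spec, None)
            match = re.search(regex, text)
            if match is None:
                continue
            raw = match.group(1)
            if type_ is None:
                try:
                    attrs[key] = json.loads(raw)
                except json.JSONDecodeError:
                    if not ignore_json_errors:
                        raise
                    attrs[key] = raw  # assumption: fall back to the raw string
            else:
                attrs[key] = type_(raw)
    return attrs


lines = ['# nmodes = 128', '# boxsize = 1000.0', '# tracer = ELG', '0.01 1234.5']
mapping = {
    'nmodes': (r'nmodes = (\d+)', int),  # explicit type
    'boxsize': r'boxsize = (.*)',        # decoded with json
    'tracer': r'tracer = (.*)',          # not valid JSON, kept as string
}
attrs = read_header(lines, mapping_header=mapping)
```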

classmethod read_title_label(line)

Decode title line, splitting module.class into (module, class). It loads module, then if class is in _registry, return corresponding class. Else return None.

rebin(factors, dims=None, weights=None, columns_to_sum=None)

Rebin data by factor factors.

Parameters
  • factors (dict, list) – dim: rebinning factor mapping. If list, should contain rebinning factor for each dim of dims.

  • dims (list, default=None) – List of dimension names. Defaults to dims.

  • weights (array, default=None) – Array of weights (of shape shape). If None, defaults to 1.

  • columns_to_sum (list) – List of columns to sum, i.e. to not renormalize after rebinning.
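
To make the role of weights and columns_to_sum concrete, here is a minimal one-dimensional NumPy sketch. The helper rebin_column and the power / nmodes arrays are hypothetical; the sketch assumes an integer factor that divides the number of bins, and does not reproduce the method's handling of dims or edges.

```python
import numpy as np


def rebin_column(column, factor, weights=None, sum_only=False):
    """Hypothetical helper: rebin a 1D column by an integer factor."""
    column = np.asarray(column, dtype=float)
    if column.size % factor:
        raise ValueError('factor must divide the number of bins')
    weights = np.ones_like(column) if weights is None else np.asarray(weights, dtype=float)
    col = column.reshape(-1, factor)
    w = weights.reshape(-1, factor)
    if sum_only:
        # Column listed in columns_to_sum: summed, not renormalized.
        return col.sum(axis=1)
    # Other columns: weighted mean within each group of merged bins.
    return (col * w).sum(axis=1) / w.sum(axis=1)


power = np.array([100., 80., 60., 50.])
nmodes = np.array([2., 6., 10., 30.])

power_rebinned = rebin_column(power, 2, weights=nmodes)   # weighted by number of modes
nmodes_rebinned = rebin_column(nmodes, 2, sum_only=True)  # mode counts simply add up
```

Averaged quantities such as the power spectrum are renormalized by the weights, whereas count-like columns such as the number of modes are natural candidates for columns_to_sum.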

save(filename)

Save class to disk.

save_auto(*args, **kwargs)

If different formats are possible, this method should choose between them based on file name extension.

save_txt(filename=None, fmt='.18e', comments='#', ignore_json_errors=True)

Dump BinnedStatistic.

Parameters
  • filename (string, default=None) – Name of ASCII file in which to save binned data. If None, nothing is written to disk.

  • fmt (string, default='.18e') – Floating point format.

  • comments (string, default='#') – String that will be prepended to the header lines.

  • ignore_json_errors (bool, default=True) – When trying to dump attrs using json, ignore errors.

Returns

lines – List of strings (lines).

Return type

list
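
The effect of the fmt and comments parameters can be previewed with Python's built-in format(). The header line and column layout below are invented for the example and are not the file layout written by save_txt(); only the '.18e' format string and the '#' prefix come from the signature above.

```python
import numpy as np

k = np.array([0.01, 0.02])
power = np.array([1234.5, 987.25])
fmt = '.18e'
comments = '#'

# Hypothetical header line, prefixed with the comment string.
lines = [f'{comments} columns = k power']
for row in zip(k, power):
    lines.append(' '.join(format(value, fmt) for value in row))

# '.18e' keeps 19 significant digits, enough to round-trip float64 values.
k_back = np.array([float(line.split()[0]) for line in lines[1:]])
```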

set_new_edges(edges, dims=None, weights=None, columns_to_sum=None)

Interpolate binned data within new edges. Perform linear interpolation if edges are not a simple concatenation of current edges.

Parameters
  • edges (dict, list) – New dim: edges mapping. If list, should contain edges for each dim of dims.

  • dims (list, default=None) – List of dimension names. Defaults to dims.

  • weights (array, default=None) – Array of weights (of shape shape). If None, defaults to 1.

  • columns_to_sum (list) – List of columns to sum, i.e. to not renormalize after rebinning.

Warning

Requires edges to be set.
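
The two regimes mentioned above (new edges that are a subset of the current ones, versus arbitrary new edges) can be sketched in one dimension. This is only an illustration under assumptions: the helper to_new_edges is hypothetical, merged bins are combined by a weighted average, and the interpolation case evaluates np.interp at bin mid-points, which may differ from what set_new_edges() actually does.

```python
import numpy as np


def to_new_edges(values, weights, old_edges, new_edges):
    """Hypothetical helper: move 1D binned values onto new edges."""
    old_edges = np.asarray(old_edges, dtype=float)
    new_edges = np.asarray(new_edges, dtype=float)
    if np.all(np.isin(new_edges, old_edges)):
        # New bins are unions of old bins: weighted average within each union.
        idx = np.searchsorted(old_edges, new_edges)
        return np.array([np.average(values[lo:hi], weights=weights[lo:hi])
                         for lo, hi in zip(idx[:-1], idx[1:])])
    # Otherwise: linear interpolation between bin mid-points (assumption).
    old_mid = 0.5 * (old_edges[:-1] + old_edges[1:])
    new_mid = 0.5 * (new_edges[:-1] + new_edges[1:])
    return np.interp(new_mid, old_mid, values)


old_edges = np.array([0., 1., 2., 3., 4.])
values = np.array([10., 20., 30., 40.])
weights = np.array([1., 3., 1., 1.])

merged = to_new_edges(values, weights, old_edges, [0., 2., 4.])
interpolated = to_new_edges(values, weights, old_edges, [0.5, 1.5, 2.5])
```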

set_x(x, mask=Ellipsis, flatten=True)

Set x-coordinates.

Parameters
  • x (list, array) – New x-coordinates. Can be a single array if ndim is 1. If flatten is False, arrays must be of same shape as (masked) data. Else, arrays must be 1D, of same size as (masked) data.

  • mask (list, tuple, default=Ellipsis) – Mask for each x-coordinate. If flatten is False, must be of same length as data along each x-coordinate, else same size (i.e. product of shape) as data.

  • flatten (bool, default=True) – Whether input is flattened.

set_y(y, mask=Ellipsis, flatten=True)

Same as set_x(), for y-coordinate.

property shape

Shape of binned data; if edges are set, return tuple of (length of edges - 1) along each dimension, else shape of first array in data.

property size

Total size of binned data, i.e. product of length over all dimensions.
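
The relation between edges, shape, ndim and size described above amounts to a few lines of arithmetic. The edges below are made up for the example.

```python
import numpy as np

# Hypothetical edges for two dimensions: 4 edges give 3 bins, 6 edges give 5 bins.
edges = [np.linspace(0., 0.3, 4), np.linspace(-1., 1., 6)]

shape = tuple(len(edge) - 1 for edge in edges)  # number of bins per dimension
ndim = len(shape)                               # number of dimensions
size = int(np.prod(shape))                      # product of lengths over all dimensions
```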

squeeze(dims=None)

Squeeze binned data along dimensions dims. If dims is None, defaults to dimensions with length <= 1. dims and edges are updated to match new shape.
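
The bookkeeping involved in squeezing (dropping the dimension from the data, dims and edges together) can be sketched as follows. The data, dims and edges containers are plain Python stand-ins for the instance's state, and the sketch only handles length-1 dimensions.

```python
import numpy as np

# Hypothetical state: one column of shape (3, 1), with dims ('k', 'mu').
data = {'power': np.array([[1.0], [2.0], [3.0]])}
dims = ['k', 'mu']
edges = {'k': np.array([0., 0.1, 0.2, 0.3]), 'mu': np.array([0., 1.])}

# Default choice when dims is None: dimensions of length 1.
shape = data['power'].shape
to_squeeze = [dim for dim, n in zip(dims, shape) if n == 1]
axes = tuple(dims.index(dim) for dim in to_squeeze)

# Drop those axes from every column, then update dims and edges to match.
data = {name: np.squeeze(array, axis=axes) for name, array in data.items()}
dims = [dim for dim in dims if dim not in to_squeeze]
edges = {dim: edges[dim] for dim in dims}
```

After these steps the data column has shape (3,), and both dims and edges only refer to 'k'.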