Data Type Classes

TXPipe stage input and output files are all represented by a class which defines their type and various other pieces of information.

Some of these classes are generic, like HDFFile or TextFile, and some are more specific, like ShearCatalog.

The available types are described below.

class txpipe.data_types.base.DataFile(path, mode, extra_provenance=None, validate=True, **kwargs)[source]

Bases: object

A class representing a DataFile to be made by pipeline stages and passed on to subsequent ones.

DataFile itself should not be instantiated - instead subclasses should be defined for different file types.

These subclasses are used in the definition of pipeline stages to indicate what kind of file is expected. The “suffix” attribute, which must be defined on subclasses, indicates the file suffix.

The open method, which can optionally be overridden, is used by the machinery of the PipelineStage class to open an input or output named by a tag.
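
As an illustration of the role of the suffix attribute, here is a minimal sketch. It uses a hypothetical stand-in class (DataFileSketch) rather than the real DataFile, and it assumes that make_name joins the tag and the suffix with a dot; that naming convention is an assumption, not something stated above.

```python
class DataFileSketch:
    """Hypothetical stand-in for DataFile, not the TXPipe implementation."""

    suffix = None  # concrete subclasses must set this

    @classmethod
    def make_name(cls, tag):
        # Assumed convention: "<tag>.<suffix>", or the bare tag if the suffix is empty.
        return f"{tag}.{cls.suffix}" if cls.suffix else tag


class TextFileSketch(DataFileSketch):
    suffix = "txt"


class DirectorySketch(DataFileSketch):
    suffix = ""


text_name = TextFileSketch.make_name("summary")
dir_name = DirectorySketch.make_name("plots")
print(text_name, dir_name)
```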

add_provenance(key, value)[source]

Concrete subclasses (for which it is possible) should override this method to save a new string key/value pair to the file.

close()[source]
static generate_provenance(extra_provenance=None)[source]

Generate provenance information - a dictionary of useful information about the origina

classmethod make_name(tag)[source]
classmethod open(path, mode)[source]

Open a data file. The base implementation of this function just opens and returns a standard python file object.

Subclasses can override to either open files using different openers (like fitsio.FITS), or, for more specific data types, return an instance of the class itself to use as an intermediary for the file.
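
The two behaviours described above can be sketched as follows. Both classes are hypothetical stand-ins written for this example (JsonFileSketch is not a TXPipe type): the base open returns a plain file object, while the subclass returns an instance of itself that acts as an intermediary for the file.

```python
import json
import os
import tempfile


class DataFileSketch:
    """Hypothetical stand-in for DataFile, not the TXPipe implementation."""

    @classmethod
    def open(cls, path, mode):
        # Base behaviour: a standard Python file object.
        return open(path, mode)


class JsonFileSketch(DataFileSketch):
    """Hypothetical specific type that acts as an intermediary for its file."""

    suffix = "json"

    def __init__(self, path, mode):
        self.path = path
        self.mode = mode

    @classmethod
    def open(cls, path, mode):
        # Override: return an instance of the class itself instead of a raw file.
        return cls(path, mode)

    def write(self, data):
        with open(self.path, "w") as f:
            json.dump(data, f)

    def read(self):
        with open(self.path) as f:
            return json.load(f)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.json")
    JsonFileSketch.open(path, "w").write({"nbin": 4})
    loaded = JsonFileSketch.open(path, "r").read()

    # The base class hands back an ordinary file object.
    with DataFileSketch.open(path, "r") as raw:
        raw_is_file = hasattr(raw, "readline")
```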

read_provenance()[source]

Concrete subclasses for which it is possible should override this method and read the provenance information from the file.

Other classes will return this dictionary of UNKNOWNs

suffix = None
supports_parallel_write = False
validate()[source]

Concrete subclasses should override this method to check that all expected columns are present.

write_provenance()[source]

Concrete subclasses (for which it is possible) should override this method to save the dictionary self.provenance to the file.

class txpipe.data_types.base.Directory(path, mode, extra_provenance=None, validate=True, **kwargs)[source]

Bases: DataFile

classmethod open(path, mode)[source]

Open a data file. The base implementation of this function just opens and returns a standard python file object.

Subclasses can override to either open files using different openers (like fitsio.FITS), or, for more specific data types, return an instance of the class itself to use as an intermediary for the file.

read_provenance()[source]

Concrete subclasses for which it is possible should override this method and read the provenance information from the file.

Other classes will return this dictionary of UNKNOWNs

suffix = ''
write_provenance()[source]

Write provenance information to a new group, called ‘provenance’

class txpipe.data_types.base.FileCollection(path, mode, extra_provenance=None, validate=True, **kwargs)[source]

Bases: Directory

Represents a grouped bundle of files, for cases where you don’t know the exact list in advance.

path_for_file(filename)[source]

Get the path for a file inside the collection. Does not check whether the file exists.

read_listing()[source]

Read a listing file from the directory.

suffix = ''
write_listing(filenames)[source]

Write a listing file in the directory recording the supplied filenames, which are presumably the files put in it.
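
A minimal sketch of how these three methods might fit together, written as free functions over a plain directory. The listing file name (listing.txt) and the one-name-per-line format are assumptions made for this example, not the TXPipe implementation.

```python
import os
import tempfile

LISTING_NAME = "listing.txt"  # hypothetical name for the listing file


def path_for_file(directory, filename):
    # Pure path arithmetic: no check that the file exists.
    return os.path.join(directory, filename)


def write_listing(directory, filenames):
    # Record one filename per line (assumed format).
    with open(path_for_file(directory, LISTING_NAME), "w") as f:
        for name in filenames:
            f.write(name + "\n")


def read_listing(directory):
    with open(path_for_file(directory, LISTING_NAME)) as f:
        return [line.strip() for line in f if line.strip()]


with tempfile.TemporaryDirectory() as directory:
    write_listing(directory, ["map_0.png", "map_1.png"])
    listing = read_listing(directory)

    # A path can be built for a file that was never written.
    unwritten = path_for_file(directory, "map_2.png")
    unwritten_exists = os.path.exists(unwritten)
```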

exception txpipe.data_types.base.FileValidationError[source]

Bases: Exception

class txpipe.data_types.base.FitsFile(path, mode, extra_provenance=None, validate=True, **kwargs)[source]

Bases: DataFile

A data file in the FITS format. Using these files requires the fitsio package.

close()[source]
missing_columns(columns, hdu=1)[source]

Check that all supplied columns exist and are in the chosen HDU
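
The underlying check can be sketched without fitsio as a comparison between the requested names and the names found in the HDU. The function below is a hypothetical stand-in that takes the found column names directly; returning the list of missing names is an assumption suggested by the method name.

```python
def missing_columns_sketch(found_columns, columns):
    # Keep the requested order, reporting only names that are absent.
    found = set(found_columns)
    return [col for col in columns if col not in found]


hdu_columns = ["ra", "dec", "g1", "g2"]
missing = missing_columns_sketch(hdu_columns, ["ra", "dec", "weight"])
none_missing = missing_columns_sketch(hdu_columns, ["g1", "g2"])
print(missing, none_missing)
```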

classmethod open(path, mode, **kwargs)[source]

Open a data file. The base implementation of this function just opens and returns a standard python file object.

Subclasses can override to either open files using different openers (like fitsio.FITS), or, for more specific data types, return an instance of the class itself to use as an intermediary for the file.

read_provenance()[source]

Concrete subclasses for which it is possible should override this method and read the provenance information from the file.

Other classes will return this dictionary of UNKNOWNs

required_columns = []
suffix = 'fits'
validate()[source]

Check that the catalog has all the required columns and complain otherwise

write_provenance()[source]

Write provenance information to a new group, called ‘provenance’

class txpipe.data_types.base.HDFFile(path, mode, extra_provenance=None, validate=True, **kwargs)[source]

Bases: DataFile

A data file in the HDF5 format. Using these files requires the h5py package, which in turn requires an HDF5 library installation.

close()[source]
classmethod open(path, mode, **kwargs)[source]

Open a data file. The base implementation of this function just opens and returns a standard python file object.

Subclasses can override to either open files using different openers (like fitsio.FITS), or, for more specific data types, return an instance of the class itself to use as an intermediary for the file.

read_provenance()[source]

Concrete subclasses for which it is possible should override this method and read the provenance information from the file.

Other classes will return this dictionary of UNKNOWNs

required_datasets = []
suffix = 'hdf5'
supports_parallel_write = True

validate()[source]

Concrete subclasses should override this method to check that all expected columns are present.
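
For an HDF5-based type, such a check might compare required_datasets against the contents of the file. The sketch below avoids h5py and models the file as a nested dictionary of groups. The helper names and the exception class are hypothetical, and raising an error on failure is an assumption; the dataset paths are the ones listed for RandomsCatalog.

```python
class FileValidationErrorSketch(Exception):
    """Hypothetical stand-in for FileValidationError."""


def has_dataset(tree, path):
    # Walk "group/dataset" paths through nested dictionaries.
    node = tree
    for part in path.split("/"):
        if not isinstance(node, dict) or part not in node:
            return False
        node = node[part]
    return True


def validate_sketch(tree, required_datasets):
    missing = [p for p in required_datasets if not has_dataset(tree, p)]
    if missing:
        raise FileValidationErrorSketch(f"Missing datasets: {missing}")


required_datasets = ["randoms/ra", "randoms/dec"]
complete = {"randoms": {"ra": [10.0, 20.0], "dec": [-5.0, 5.0]}}
incomplete = {"randoms": {"ra": [10.0, 20.0]}}

validate_sketch(complete, required_datasets)  # passes silently

try:
    validate_sketch(incomplete, required_datasets)
    error_message = None
except FileValidationErrorSketch as err:
    error_message = str(err)
```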

write_provenance()[source]

Write provenance information to a new group, called ‘provenance’

class txpipe.data_types.base.PNGFile(path, mode, extra_provenance=None, validate=True, **kwargs)[source]

Bases: DataFile

close()[source]
classmethod open(path, mode, **kwargs)[source]

Open a data file. The base implementation of this function just opens and returns a standard python file object.

Subclasses can override to either open files using different openers (like fitsio.FITS), or, for more specific data types, return an instance of the class itself to use as an intermediary for the file.

read_provenance()[source]

Concrete subclasses for which it is possible should override this method and read the provenance information from the file.

Other classes will return this dictionary of UNKNOWNs

suffix = 'png'
write_provenance()[source]

Concrete subclasses (for which it is possible) should override this method to save the dictionary self.provenance to the file.

class txpipe.data_types.base.ParquetFile(path, mode, extra_provenance=None, validate=True, **kwargs)[source]

Bases: DataFile

close()[source]
open(path, mode)[source]

Open a data file. The base implementation of this function just opens and returns a standard python file object.

Subclasses can override to either open files using different openers (like fitsio.FITS), or, for more specific data types, return an instance of the class itself to use as an intermediary for the file.

suffiz = 'pq'
class txpipe.data_types.base.PickleFile(path, mode, extra_provenance=None, validate=True, **kwargs)[source]

Bases: DataFile

classmethod open(path, mode, **kwargs)[source]

Open a data file. The base implementation of this function just opens and returns a standard python file object.

Subclasses can override to either open files using different openers (like fitsio.FITS), or, for more specific data types, return an instance of the class itself to use as an intermediary for the file.

read()[source]
read_provenance()[source]

Concrete subclasses for which it is possible should override this method and read the provenance information from the file.

Other classes will return this dictionary of UNKNOWNs

suffix = 'pkl'
write(obj)[source]
write_provenance()[source]

Concrete subclasses (for which it is possible) should override this method to save the dictionary self.provenance to the file.

class txpipe.data_types.base.TextFile(path, mode, extra_provenance=None, validate=True, **kwargs)[source]

Bases: DataFile

A data file in plain text format.

suffix = 'txt'
class txpipe.data_types.base.YamlFile(path, mode, extra_provenance=None, validate=True, load_mode='full')[source]

Bases: DataFile

A data file in yaml format. The top-level object in TXPipe YAML files should always be a dictionary.

read(key)[source]
read_provenance()[source]

Concrete subclasses for which it is possible should override this method and read the provenance information from the file.

Other classes will return this dictionary of UNKNOWNs

suffix = 'yml'
write(d)[source]
write_provenance()[source]

Concrete subclasses (for which it is possible) should override this method to save the dictionary self.provenance to the file.

This file contains TXPipe-specific file types, subclassing the more generic types in base.py.

class txpipe.data_types.types.CSVFile(path, mode, extra_provenance=None, validate=True, **kwargs)[source]

Bases: DataFile

save_file(name, dataframe)[source]
suffix = 'csv'
class txpipe.data_types.types.ClusteringNoiseMaps(path, mode, extra_provenance=None, validate=True, **kwargs)[source]

Bases: MapsFile

number_of_realizations()[source]
read_density_split(realization_index, bin_index)[source]
class txpipe.data_types.types.FiducialCosmology(path, mode, extra_provenance=None, validate=True, load_mode='full')[source]

Bases: YamlFile

to_ccl(**kwargs)[source]
class txpipe.data_types.types.LensingNoiseMaps(path, mode, extra_provenance=None, validate=True, **kwargs)[source]

Bases: MapsFile

number_of_realizations()[source]
read_rotation(realization_index, bin_index)[source]
required_datasets = []
class txpipe.data_types.types.MapsFile(path, mode, extra_provenance=None, validate=True, **kwargs)[source]

Bases: HDFFile

list_maps()[source]
plot(map_name, **kwargs)[source]
plot_gnomonic(map_name, **kwargs)[source]
plot_healpix(map_name, view='cart', **kwargs)[source]
read_gnomonic(map_name)[source]
read_healpix(map_name, return_all=False)[source]
read_map(map_name)[source]
read_map_info(map_name)[source]
read_mask()[source]
required_datasets = []
write_map(map_name, pixel, value, metadata)[source]

Save an output map to an HDF5 subgroup.

The pixel numbering and the metadata are also saved.

Parameters:
  • map_name (str) – The name of this map, used as the name of the subgroup in which the data is stored.

  • pixel (array) – Array of indices of observed pixels

  • value (array) – Array of values of observed pixels

  • metadata (mapping) – Dict or other mapping of metadata to store along with the map
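
The pixel and value arguments together describe a sparse map: only observed pixels are stored. As a rough illustration of how a reader might expand such a pair into a full-length map, here is a sketch that assumes a HEALPix pixelization with 12 * nside**2 pixels. The function name and the NaN fill value are hypothetical choices for this example, not part of TXPipe.

```python
def expand_sparse_map(pixel, value, nside, fill=float("nan")):
    # A HEALPix map at resolution nside has 12 * nside**2 pixels.
    npix = 12 * nside * nside
    full = [fill] * npix
    # Place each observed value at its pixel index; the rest keep the fill value.
    for p, v in zip(pixel, value):
        full[p] = v
    return full


full_map = expand_sparse_map(pixel=[0, 5, 47], value=[1.5, 2.5, 3.5], nside=2)
print(len(full_map), full_map[5])
```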

class txpipe.data_types.types.NOfZFile(path, mode, extra_provenance=None, validate=True, **kwargs)[source]

Bases: HDFFile

get_n_of_z(kind, bin_index)[source]
get_n_of_z_spline(bin_index, kind='cubic', **kwargs)[source]
get_nbin(kind)[source]
plot(kind)[source]
required_datasets = []
save_plot(filename, **fig_kw)[source]
class txpipe.data_types.types.PhotozPDFFile(path, mode, extra_provenance=None, validate=True, **kwargs)[source]

Bases: HDFFile

required_datasets = []
class txpipe.data_types.types.QPFile(path, mode, extra_provenance=None, validate=True, **kwargs)[source]

Bases: DataFile

suffix = 'hdf5'
class txpipe.data_types.types.RandomsCatalog(path, mode, extra_provenance=None, validate=True, **kwargs)[source]

Bases: HDFFile

required_datasets = ['randoms/ra', 'randoms/dec']
class txpipe.data_types.types.SACCFile(path, mode, extra_provenance=None, validate=True, **kwargs)[source]

Bases: DataFile

close()[source]
classmethod open(path, mode, **kwargs)[source]

Open a data file. The base implementation of this function just opens and returns a standard python file object.

Subclasses can override to either open files using different openers (like fitsio.FITS), or, for more specific data types, return an instance of the class itself to use as an intermediary for the file.

read_provenance()[source]

Concrete subclasses for which it is possible should override this method and read the provenance information from the file.

Other classes will return this dictionary of UNKNOWNs

suffix = 'sacc'
class txpipe.data_types.types.ShearCatalog(*args, **kwargs)[source]

Bases: HDFFile

A generic shear catalog

property catalog_type
get_primary_catalog_group()[source]
get_primary_catalog_names(true_shear=False)[source]
get_size()[source]
read_catalog_info()[source]
class txpipe.data_types.types.TomographyCatalog(path, mode, extra_provenance=None, validate=True, **kwargs)[source]

Bases: HDFFile

read_nbin(bin_type)[source]
read_zbins(bin_type)[source]

Read saved redshift bin edges from attributes

required_datasets = []
txpipe.data_types.types.metacalibration_names(names)[source]

Generate the metacalibrated variants of the input names, that is, variants with _1p, _1m, _2p, and _2m on the end of each name.
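
A minimal sketch of the behaviour described above. The function is a stand-in written for this example; the ordering of the output (all four variants of one name before moving to the next) is an assumption.

```python
def metacalibration_names_sketch(names):
    # The four metacalibration variants named in the description above.
    suffixes = ["1p", "1m", "2p", "2m"]
    return [f"{name}_{suffix}" for name in names for suffix in suffixes]


variants = metacalibration_names_sketch(["g1", "T"])
print(variants)
```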