ModelContainer

class jwst.datamodels.ModelContainer(init=None, asn_exptypes=None, asn_n_members=None, iscopy=False, **kwargs)

Bases: JwstDataModel, Sequence

A container for holding DataModels.

This container functions like a list of DataModel objects. It can be iterated like a list, and the DataModels within it can be addressed by index. The DataModels can also be grouped into a list of lists for grouped looping. This is useful for NIRCam, for example, where some pipeline steps need all detectors of a given exposure grouped together.
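The list-like behavior comes from the Sequence base class. The sketch below is a minimal stand-in, not the jwst class: `TinyContainer` and its `_models` attribute are hypothetical names, and plain strings take the place of DataModels.

```python
from collections.abc import Sequence


class TinyContainer(Sequence):
    """Hypothetical list-like container; strings stand in for DataModels."""

    def __init__(self, models=None):
        self._models = list(models or [])

    def __len__(self):
        return len(self._models)

    def __getitem__(self, index):
        return self._models[index]

    def append(self, model):
        self._models.append(model)


container = TinyContainer(["model_a", "model_b"])
container.append("model_c")

# Iteration and membership tests are provided by Sequence,
# built on top of __len__ and __getitem__.
names = [m for m in container]
last = container[-1]
```

Only `__len__` and `__getitem__` need to be written by hand; Sequence supplies iteration, `in`, `index()`, and `count()`.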

Parameters
  • init (file path, list of DataModels, or None) –

    • file path: initialize from an association table

    • list: a list of DataModels of any type

    • None: initializes an empty ModelContainer instance, to which DataModels can be added via the append() method.

  • asn_exptypes (str) – List of exposure types from the ASN file to read into the ModelContainer. If None, read all the given files.

  • asn_n_members (int) – Open only the first N qualifying members.

  • iscopy (bool) – Presume this model is a copy. Members will not be closed when the model is closed/garbage-collected.
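The effect of asn_exptypes and asn_n_members can be pictured with plain dictionaries standing in for association members. This is a sketch under stated assumptions, not the library's code: `select_members` is a hypothetical helper, and the case-insensitive comparison is an assumption.

```python
def select_members(members, asn_exptypes=None, asn_n_members=None):
    """Return the members a filtered load would keep (illustrative only)."""
    if asn_exptypes is not None:
        # Assumption: exposure types are compared case-insensitively.
        wanted = {t.lower() for t in asn_exptypes}
        members = [m for m in members if m["exptype"].lower() in wanted]
    if asn_n_members is not None:
        # Keep only the first N qualifying members.
        members = members[:asn_n_members]
    return list(members)


members = [
    {"expname": "a_cal.fits", "exptype": "science"},
    {"expname": "b_cal.fits", "exptype": "background"},
    {"expname": "c_cal.fits", "exptype": "science"},
]
science_only = select_members(members, asn_exptypes=["science"])
first_science = select_members(members, asn_exptypes=["science"], asn_n_members=1)
```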

Examples

>>> container = ModelContainer('example_asn.json')
>>> for model in container:
...     print(model.meta.filename)

Say the association was a NIRCam dithered dataset. The models_grouped attribute is a list of lists: the outer list holds the exposure groups, and each inner list holds the individual datamodels for the detectors in that exposure (2 or 8 in the case of NIRCam).

>>> total_exposure_time = 0.0
>>> for group in container.models_grouped:
...     total_exposure_time += group[0].meta.exposure.exposure_time

An empty ModelContainer can also be filled with the append() method:

>>> c = ModelContainer()
>>> m = datamodels.open('myfile.fits')
>>> c.append(m)

Notes

The optional parameters save_open and return_open control how the ModelContainer handles its DataModels. If save_open is set to False, each input DataModel instance in init is written to disk and closed, and only its filename is used to initialize the ModelContainer object. Subsequent access to a member then opens the DataModel file to work with it. If return_open is also False, the DataModel is closed once access to it is complete. These parameters can reduce the memory used by this object during processing; OutlierDetectionStep uses them for this purpose.

When an ASN table’s members contain attributes listed in RECOGNIZED_MEMBER_FIELDS, ModelContainer reads those attribute values and updates the corresponding attributes in the meta of the input models.

Example of an ASN table with additional model attributes to supply custom catalogs:
"products": [
    {
        "name": "resampled_image",
        "members": [
            {
                "expname": "input_image1_cal.fits",
                "exptype": "science",
                "tweakreg_catalog": "custom_catalog1.ecsv"
            },
            {
                "expname": "input_image2_cal.fits",
                "exptype": "science",
                "tweakreg_catalog": "custom_catalog2.ecsv"
            }
        ]
    }
]

Warning

Input files will be updated in place with new meta attribute values when an ASN table’s members contain additional attributes.
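To see which members of an association carry extra attributes before loading it, the table can be inspected with the standard json module. The sketch below is illustrative only: `RECOGNIZED_FIELDS` is a hypothetical stand-in for RECOGNIZED_MEMBER_FIELDS (whose actual contents may differ), `extra_member_fields` is a made-up helper, and the fragment above is wrapped in braces to form a complete JSON document.

```python
import json

# Assumption: stand-in for RECOGNIZED_MEMBER_FIELDS; the real list may differ.
RECOGNIZED_FIELDS = ("tweakreg_catalog",)

asn_text = """
{
  "products": [
    {
      "name": "resampled_image",
      "members": [
        {"expname": "input_image1_cal.fits", "exptype": "science",
         "tweakreg_catalog": "custom_catalog1.ecsv"},
        {"expname": "input_image2_cal.fits", "exptype": "science",
         "tweakreg_catalog": "custom_catalog2.ecsv"}
      ]
    }
  ]
}
"""


def extra_member_fields(asn, fields=RECOGNIZED_FIELDS):
    """Map each expname to the recognized extra attributes it carries."""
    result = {}
    for product in asn["products"]:
        for member in product["members"]:
            extras = {k: member[k] for k in fields if k in member}
            if extras:
                result[member["expname"]] = extras
    return result


extras_by_file = extra_member_fields(json.loads(asn_text))
```

Files that appear in `extras_by_file` are the ones whose metadata would be affected by the in-place update described in the warning.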

Parameters
  • init (str, tuple, HDUList, ndarray, dict, None) –

    • None : Create a default data model with no shape.

    • tuple : Shape of the data array. Initialize with an empty data array of the specified shape.

    • file path: Initialize from the given file (FITS or ASDF)

    • readable file object: Initialize from the given file object

    • HDUList : Initialize from the given HDUList.

    • A numpy array: Used to initialize the data array

    • dict: The object model tree for the data model

  • schema (dict, str (optional)) – Tree of objects representing a JSON schema, or string naming a schema. The schema to use to understand the elements on the model. If not provided, the schema associated with this class will be used.

  • memmap (bool) – Turn memmap of FITS file on or off. (default: False). Ignored for ASDF files.

  • pass_invalid_values (bool or None) – If True, values that do not validate against the schema will be added to the metadata. If False, they will be set to None. If None, the value is taken from the environment variable PASS_INVALID_VALUES. Otherwise, the default value is False.

  • strict_validation (bool or None) – If True, schema validation errors will generate an exception. If False, they will generate a warning. If None, the value is taken from the environment variable STRICT_VALIDATION. Otherwise, the default value is False.

  • validate_on_assignment (bool or None) – Defaults to None. If None, the value is taken from the environment variable VALIDATE_ON_ASSIGNMENT, defaulting to True if no environment variable is set. If True, attribute assignments are validated at the time of assignment; validation errors generate warnings and the values are set to None. If False, schema validation occurs only once, at the time of write; validation errors generate warnings.

  • cast_fits_arrays (bool) – If True, arrays will be cast to the dtype described by the schema when read from a FITS file. If False, arrays will be read without casting.

  • validate_arrays (bool) – If True, arrays will be validated against ndim, max_ndim, and datatype validators in the schemas.

  • ignore_missing_extensions (bool) – When False, raise warnings when a file is read that contains metadata about extensions that are not available. Defaults to True.

  • kwargs (dict) –

    Additional keyword arguments passed to lower level functions. These arguments are generally file format-specific. Arguments of note are:

    • FITS

      skip_fits_update - bool or None

      True to skip updating the ASDF tree from the FITS headers, if possible. If None, the value is taken from the environment variable SKIP_FITS_UPDATE. Otherwise, the default value is True.

Attributes Summary

crds_observatory

Get the CRDS observatory for this container.

group_names

Return list of names for the DataModel groups by exposure.

models_grouped

Returns a list of lists of datamodels grouped by exposure.

schema_url

The schema URI to validate the model against.

Methods Summary

append(model)

close()

Close all datamodels.

copy([memo])

Returns a deep copy of the models in this model container.

extend(model)

from_asn(asn_data[, asn_file_path])

Load FITS files from a JWST association file.

get_crds_parameters()

Get CRDS parameters for this container.

get_sections()

Iterator to return the sections from all members of the container.

ind_asn_type(asn_exptype)

Determine the indices of models corresponding to asn_exptype.

insert(index, model)

pop([index])

read_asn(filepath)

Load FITS files from a JWST association file.

save([path, dir_path, save_model_func])

Write out models in container to FITS or ASDF.

set_buffer(buffer_size[, overlap])

Set buffer size for scrolling section-by-section access.

Attributes Documentation

crds_observatory

Get the CRDS observatory for this container. Used when selecting step/pipeline parameter files when the container is a pipeline input.

Return type

str

group_names

Return list of names for the DataModel groups by exposure.

models_grouped

Returns a list of lists of datamodels grouped by exposure. Each model is assigned a group ID based on its exposure.

Data from different detectors of the same exposure will have the same group id, which allows grouping by exposure. The following metadata is used for grouping:

  • meta.observation.program_number

  • meta.observation.observation_number

  • meta.observation.visit_number

  • meta.observation.visit_group

  • meta.observation.sequence_id

  • meta.observation.activity_id

  • meta.observation.exposure_number
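As a rough illustration of the grouping described above, the sketch below buckets stand-in objects by a tuple of those seven metadata fields. It does not reproduce the library's implementation: the `fake_model` helper, the `GROUP_KEYS` tuple, the sample field values, and the use of a plain tuple as the group key are all assumptions made for the example.

```python
from collections import defaultdict
from types import SimpleNamespace

# The seven meta.observation fields listed above.
GROUP_KEYS = (
    "program_number", "observation_number", "visit_number",
    "visit_group", "sequence_id", "activity_id", "exposure_number",
)


def group_by_exposure(models):
    """Return a list of lists, one inner list per distinct exposure key."""
    groups = defaultdict(list)
    for model in models:
        obs = model.meta.observation
        key = tuple(getattr(obs, name) for name in GROUP_KEYS)
        groups[key].append(model)
    # dicts preserve insertion order, so groups appear in first-seen order.
    return list(groups.values())


def fake_model(detector, exposure_number):
    """Hypothetical stand-in for a DataModel with only the needed metadata."""
    obs = SimpleNamespace(
        program_number="01234", observation_number="001", visit_number="001",
        visit_group="02", sequence_id="1", activity_id="01",
        exposure_number=exposure_number,
    )
    instrument = SimpleNamespace(detector=detector)
    return SimpleNamespace(meta=SimpleNamespace(observation=obs, instrument=instrument))


models = [
    fake_model("NRCA1", "00001"),
    fake_model("NRCA2", "00001"),
    fake_model("NRCA1", "00002"),
]
grouped = group_by_exposure(models)
```

Here the two detectors of exposure 00001 land in the same inner list, while exposure 00002 forms a group of its own.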

schema_url = None

The schema URI to validate the model against. If None, only basic validation of required metadata properties (filename, model_type) will occur.

Methods Documentation

append(model)

close()

Close all datamodels.

copy(memo=None)

Returns a deep copy of the models in this model container.

extend(model)

from_asn(asn_data, asn_file_path=None)

Load FITS files from a JWST association file.

Parameters
  • asn_data (Association) – An association dictionary

  • asn_file_path (str) – Filepath of the association, if known.

get_crds_parameters()

Get CRDS parameters for this container. Used when selecting step/pipeline parameter files when the container is a pipeline input.

Return type

dict

get_sections()

Iterator to return the sections from all members of the container.

ind_asn_type(asn_exptype)

Determine the indices of models corresponding to asn_exptype.

Parameters

asn_exptype (str) – Exposure type as defined in an association, e.g. “science”.

Returns

ind – Indices of models in ModelContainer._models matching asn_exptype.

Return type

list
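A minimal sketch of this kind of index lookup follows. It assumes each model exposes its association exposure type as `model.meta.asn.exptype`; that attribute path, the lowercase comparison, and the `indices_for_exptype` helper are assumptions for illustration, with SimpleNamespace objects standing in for DataModels.

```python
from types import SimpleNamespace


def indices_for_exptype(models, asn_exptype):
    """Return positions of models whose exposure type matches asn_exptype."""
    wanted = asn_exptype.lower()
    return [
        i for i, model in enumerate(models)
        if model.meta.asn.exptype.lower() == wanted
    ]


def stub(exptype):
    """Hypothetical stand-in carrying only meta.asn.exptype."""
    return SimpleNamespace(meta=SimpleNamespace(asn=SimpleNamespace(exptype=exptype)))


models = [stub("science"), stub("background"), stub("SCIENCE")]
science_indices = indices_for_exptype(models, "science")
```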

insert(index, model)

pop(index=-1)

static read_asn(filepath)

Load FITS files from a JWST association file.

Parameters

filepath (str) – The path to an association file.

save(path=None, dir_path=None, save_model_func=None, **kwargs)

Write out models in container to FITS or ASDF.

Parameters
  • path (str or func or None) –

    • If None, the meta.filename is used for each model.

    • If a string, the string is used as a root and an index is appended.

    • If a function, it takes two arguments, the value of model.meta.filename and the index idx, and returns the constructed file name.

  • dir_path (str) – Directory in which to write the files. Defaults to the current working directory. The directory is created if it does not exist. Filenames are taken from meta.filename of each datamodel in the container.

  • save_model_func (func or None) – Alternate function used to save each model instead of the model’s save method. Takes one argument, the model, and the keyword argument idx for an index.

Returns

output_paths – List of file paths where the models were saved.

Return type

[str[, …]]
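The three forms of the path argument can be pictured with a small name-resolution helper. This is a sketch only: `resolve_output_names` is hypothetical, and placing the index between the root and the extension is an assumption about how the index is appended, not confirmed behavior of save().

```python
import os


def resolve_output_names(filenames, path=None, dir_path=None):
    """Build one output name per model filename (illustrative only)."""
    output_paths = []
    for idx, filename in enumerate(filenames):
        if path is None:
            # Use each model's own meta.filename.
            name = filename
        elif callable(path):
            # The function receives the filename and the index.
            name = path(filename, idx)
        else:
            # Assumption: the index is inserted before the extension.
            root, ext = os.path.splitext(path)
            name = f"{root}{idx}{ext}"
        if dir_path is not None:
            name = os.path.join(dir_path, os.path.basename(name))
        output_paths.append(name)
    return output_paths


default_names = resolve_output_names(["a.fits", "b.fits"])
indexed_names = resolve_output_names(["a.fits", "b.fits"], path="stack.fits")
custom_names = resolve_output_names(["a.fits"], path=lambda f, i: f"{i}_{f}")
```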

set_buffer(buffer_size, overlap=None)

Set buffer size for scrolling section-by-section access.

Parameters
  • buffer_size (float, None) – Size of the buffer in MB for each section. If None, a default buffer size of 1 MB is used.

  • overlap (int, optional) – Number of rows of overlap between sections. If None, no overlap is used.
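To get a feel for how a buffer size in MB translates into row sections, the sketch below computes (start, stop) row ranges for a single 2-D array. It is not the library's algorithm: `row_sections` is a hypothetical helper, and the assumption is that each section holds as many full rows as fit in the buffer, with consecutive sections sharing `overlap` rows.

```python
def row_sections(n_rows, n_cols, itemsize, buffer_size_mb=1.0, overlap=0):
    """Yield (start, stop) row ranges whose data fit in the buffer."""
    row_bytes = n_cols * itemsize
    buffer_bytes = buffer_size_mb * 1024 * 1024
    # At least one row per section, and never more rows than the image has.
    rows_per_section = max(1, min(n_rows, int(buffer_bytes // row_bytes)))
    # Advance by fewer rows than a full section so that sections overlap.
    step = max(1, rows_per_section - overlap)
    start = 0
    while start < n_rows:
        stop = min(n_rows, start + rows_per_section)
        yield start, stop
        if stop == n_rows:
            break
        start += step


# A 2048 x 2048 float32 image with a 1 MB buffer: 128 rows per section.
sections = list(row_sections(n_rows=2048, n_cols=2048, itemsize=4))
sections_overlap = list(row_sections(n_rows=2048, n_cols=2048, itemsize=4, overlap=8))
```

With no overlap the image splits into 16 sections of 128 rows; an 8-row overlap shortens the step to 120 rows and yields 17 sections.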