ascat package
Subpackages
- ascat.aggregate package
- ascat.download namespace
- ascat.eumetsat package
- ascat.grids package
- ascat.product_info package
- Submodules
- ascat.product_info.product_info module
AscatH121Cell, AscatH121Swath, AscatH122Cell, AscatH122Swath, AscatH129Cell, AscatH129Swath, AscatH130Cell, AscatH130Swath, AscatH139Cell, AscatH139Swath, AscatH29Cell, AscatH29Swath, AscatSIG0Cell12500m, AscatSIG0Cell6250m, AscatSIG0Swath12500m, AscatSIG0Swath6250m, AscatSwathProduct, BaseCellProduct, ErsCell, ErsHCell, ErsNCell, OrthoMultiArrayCellProduct, RaggedArrayCellProduct, SwathProduct, get_swath_product_id()
- Module contents
- ascat.read_native package
- Submodules
- ascat.read_native.base module
- ascat.read_native.bufr module
- ascat.read_native.cdr module
- ascat.read_native.eps_native module
- ascat.read_native.generate_test_data module
- ascat.read_native.hdf5 module
- ascat.read_native.nc module
- ascat.read_native.ragged_array_ts module
CRANcFile, CellFileCollection, CellFileCollectionStack, IRANcFile, RAFile, SwathFileCollection, SwathFileCollection.path, SwathFileCollection.ioclass, SwathFileCollection.ioclass_kws, SwathFileCollection.grid, SwathFileCollection.ts_dtype, SwathFileCollection.beams_vars, SwathFileCollection.date_format, SwathFileCollection.cell_fn_format, SwathFileCollection.chron_files, SwathFileCollection.previous_cell, SwathFileCollection.fid, SwathFileCollection.max_buffer_memory_mb, SwathFileCollection.close(), SwathFileCollection.from_product_id(), SwathFileCollection.get_filenames(), SwathFileCollection.process(), SwathFileCollection.read(), SwathFileCollection.stack(), SwathFileCollection.swath_data_generator()
braces_to_re_groups(), vrange()
- ascat.read_native.xarray_io module
AscatH121v1Cell, AscatH121v1Swath, AscatH121v1Swath.beams_vars, AscatH121v1Swath.cell_fn_format, AscatH121v1Swath.date_format, AscatH121v1Swath.fn_pattern, AscatH121v1Swath.fn_read_fmt(), AscatH121v1Swath.grid, AscatH121v1Swath.grid_cell_size, AscatH121v1Swath.grid_sampling_km, AscatH121v1Swath.sf_pattern, AscatH121v1Swath.sf_read_fmt(), AscatH121v1Swath.ts_dtype
AscatH122Cell, AscatH122Swath, AscatH129Cell, AscatH129Swath, AscatH129v1Cell, AscatH129v1Swath, AscatH129v1Swath.beams_vars, AscatH129v1Swath.cell_fn_format, AscatH129v1Swath.date_format, AscatH129v1Swath.fn_pattern, AscatH129v1Swath.fn_read_fmt(), AscatH129v1Swath.grid, AscatH129v1Swath.grid_cell_size, AscatH129v1Swath.grid_sampling_km, AscatH129v1Swath.sf_pattern, AscatH129v1Swath.sf_read_fmt(), AscatH129v1Swath.ts_dtype
AscatNetCDFCellBase, AscatSIG0Cell12500m, AscatSIG0Cell6250m, AscatSIG0Swath12500m, AscatSIG0Swath12500m.beams_vars, AscatSIG0Swath12500m.cell_fn_format, AscatSIG0Swath12500m.date_format, AscatSIG0Swath12500m.fn_pattern, AscatSIG0Swath12500m.fn_read_fmt(), AscatSIG0Swath12500m.grid, AscatSIG0Swath12500m.grid_cell_size, AscatSIG0Swath12500m.grid_sampling_km, AscatSIG0Swath12500m.sf_pattern, AscatSIG0Swath12500m.sf_read_fmt(), AscatSIG0Swath12500m.ts_dtype
AscatSIG0Swath6250m, AscatSIG0Swath6250m.beams_vars, AscatSIG0Swath6250m.cell_fn_format, AscatSIG0Swath6250m.date_format, AscatSIG0Swath6250m.fn_pattern, AscatSIG0Swath6250m.fn_read_fmt(), AscatSIG0Swath6250m.grid, AscatSIG0Swath6250m.grid_cell_size, AscatSIG0Swath6250m.grid_sampling_km, AscatSIG0Swath6250m.sf_pattern, AscatSIG0Swath6250m.sf_read_fmt(), AscatSIG0Swath6250m.ts_dtype
CellGridCache, RaggedXArrayCellIOBase, SwathIOBase, append_to_netcdf(), create_variable_encodings(), get_swath_product_id(), set_attributes(), trim_dates(), var_order()
- Module contents
- ascat.regrid package
- ascat.resample package
- ascat.stack namespace
Submodules
ascat.accessors module
- class ascat.accessors.CFDiscreteGeometryAccessor(xarray_obj: Dataset)[source]
Bases:
object
- class ascat.accessors.PyGeoGriddedArrayAccessor(xarray_obj: Dataset)[source]
Bases:
object
- property grid
- lonlat_vars_from_gpi_var(gpi_var, lon_var='lon', lat_var='lat') -> tuple[DataArray, DataArray][source]
ascat.cell module
- class ascat.cell.CellGridFiles(root_path, file_class, grid, fn_format='{cell:04d}.nc', sf_format=None, preprocessor=None)[source]
Bases:
object
- convert_to_contiguous(out_dir, print_progress=True, **kwargs)[source]
Convert all files in the collection to contiguous format and write to disk.
- read(cell=None, location_id=None, coords=None, bbox=None, geom=None, max_coord_dist=inf, date_range=None, **kwargs)[source]
Read data matching a spatial and temporal criterion.
- Parameters:
coords (tuple of numeric or tuple of iterable of numeric) –
Tuple of (lon, lat) coordinates. lon and lat can each be numpy arrays in order to read multiple coordinates. For each coordinate, the nearest grid point within max_coord_dist (in spherical cartesian coordinates) will be selected.
Note that if any passed coordinates share the same nearest grid point, that grid point will only be represented once in the output dataset.
bbox (tuple) – Tuple of (latmin, latmax, lonmin, lonmax) coordinates.
geom (shapely.geometry) – Geometry object.
max_coord_dist (float) – The maximum distance a coordinate’s nearest grid point can be from it to be selected (in spherical cartesian coordinates). Default is np.inf.
date_range (tuple of np.datetime64) – Tuple of (start, end) dates.
- Returns:
Filtered and merged data for the specified spatiotemporal region.
- Return type:
xarray.Dataset
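The coordinate lookup described for coords and max_coord_dist can be pictured with a small stand-alone sketch. It is not the package's implementation: the toy grid, the helper names (lonlat_to_xyz, nearest_gpis), the list-of-pairs input and the use of a chord distance on a unit sphere are all assumptions made for illustration.

```python
import math


def lonlat_to_xyz(lon, lat):
    """Convert lon/lat in degrees to cartesian coordinates on a unit sphere."""
    lon_r, lat_r = math.radians(lon), math.radians(lat)
    return (
        math.cos(lat_r) * math.cos(lon_r),
        math.cos(lat_r) * math.sin(lon_r),
        math.sin(lat_r),
    )


def nearest_gpis(coords, grid_points, max_coord_dist=math.inf):
    """Return the nearest grid point index per (lon, lat) pair, each index once.

    grid_points maps a hypothetical grid point index to its (lon, lat).
    """
    selected = []
    for lon, lat in coords:
        target = lonlat_to_xyz(lon, lat)
        best_gpi, best_dist = None, math.inf
        for gpi, (glon, glat) in grid_points.items():
            dist = math.dist(target, lonlat_to_xyz(glon, glat))
            if dist < best_dist:
                best_gpi, best_dist = gpi, dist
        # Keep the point only if it is close enough and not selected already.
        if best_gpi is not None and best_dist <= max_coord_dist:
            if best_gpi not in selected:
                selected.append(best_gpi)
    return selected


grid = {10: (0.0, 0.0), 11: (1.0, 0.0), 12: (0.0, 1.0)}
# The first two coordinates share grid point 10, so it appears only once.
print(nearest_gpis([(0.1, 0.1), (0.2, 0.0), (0.9, 0.1)], grid))  # [10, 11]
```

With a very small max_coord_dist, coordinates whose nearest grid point is too far away simply contribute nothing to the result.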
- reprocess(out_dir, func, parallel=True, **kwargs)[source]
Use Filenames.reprocess to apply a function to all files in the collection and save the results to out_dir.
- spatial_search(cell=None, location_id=None, coords=None, bbox=None, geom=None)[source]
Search files for cells matching a spatial criterion. All arguments are optional, but exactly one should be passed.
- class ascat.cell.OrthoMultiTimeseriesCell(filenames)[source]
Bases:
Filenames
Class to read and merge orthomulti cell files.
- read(date_range=None, location_id=None, lookup_vector=None, preprocessor=None, parallel=False, **kwargs)[source]
Read data from OrthoMulti Cell files.
- class ascat.cell.RaggedArrayTs(filenames)[source]
Bases:
Filenames
Class to read and merge ragged array cell files.
- read(date_range=None, location_id=None, lookup_vector=None, preprocessor=None, return_format=None, parallel=False, **kwargs)[source]
Read data from Ragged Array Cell files.
- Parameters:
date_range (tuple of np.datetime64) – Tuple of (start, end) dates.
lookup_vector (np.ndarray) – Lookup vector.
preprocessor (callable, optional) – Function to preprocess the dataset.
return_format (str, optional) – CF discrete geometry format to return data as. Can be “point”, “indexed”, or “contiguous”.
parallel (bool, optional) – Whether or not to read/preprocess in parallel. Default is False.
**kwargs (dict) –
ascat.cf_array module
- class ascat.cf_array.CFDiscreteGeom(xarray_obj: Dataset, coord_vars: Sequence[str] | None = None, instance_vars: Sequence[str] | None = None, contiguous_sort_vars: Sequence[str] | None = None)[source]
Bases:
object
- property array_type
- class ascat.cf_array.OrthoMultiTimeseriesArray(xarray_obj: Dataset, coord_vars: Sequence[str] | None = None, instance_vars: Sequence[str] | None = None, contiguous_sort_vars: Sequence[str] | None = None)[source]
Bases:
CFDiscreteGeom
- property array_type
- class ascat.cf_array.PointArray(xarray_obj: Dataset, coord_vars: Sequence[str] | None = None, instance_vars: Sequence[str] | None = None, contiguous_sort_vars: Sequence[str] | None = None)[source]
Bases:
CFDiscreteGeom
- class ascat.cf_array.RaggedArray(xarray_obj: Dataset, coord_vars: Sequence[str] | None = None, instance_vars: Sequence[str] | None = None, contiguous_sort_vars: Sequence[str] | None = None)[source]
Bases:
CFDiscreteGeom
- property array_type
- sel_instances(instance_vals: Sequence[int | str] | ndarray | None = None, instance_lookup_vector: ndarray | None = None) -> Dataset[source]
- property timeseries_id
- class ascat.cf_array.TimeseriesPointArray(xarray_obj: Dataset, coord_vars: Sequence[str] | None = None, instance_vars: Sequence[str] | None = None, contiguous_sort_vars: Sequence[str] | None = None)[source]
Bases:
PointArray
Assumptions made beyond basic CF conventions:
- cf_role="timeseries_id" is used to identify the timeseries ID variable for purposes of selecting instances and converting to ragged arrays. If you only have a single timeseries there’s not much point in using this class.
- resample_to_orthomulti(instance_dim: str = 'locations', timeseries_id: str = 'location_id', count_var: str = 'row_size', instance_vars: Sequence[str] | None = None, coord_vars: Sequence[str] | None = None, sort_vars: Sequence[str] | None = None, vars_to_resample: Sequence[str] | None = None, resample_method: callable = <function mean>, resample_period: str = '1M')[source]
- sel_instances(instance_vals: Sequence[int | str] | ndarray | None = None, instance_lookup_vector: ndarray | None = None, timeseries_id: str = 'location_id')[source]
- property timeseries_id
- to_contiguous_ragged(instance_dim: str = 'locations', timeseries_id: str = 'location_id', count_var: str = 'row_size', instance_vars: Sequence[str] | None = None, coord_vars: Sequence[str] | None = None, sort_vars: Sequence[str] | None = None) -> Dataset[source]
- to_indexed_ragged(instance_dim: str = 'locations', timeseries_id: str = 'location_id', index_var: str = 'locationIndex', instance_vars: Sequence[str] | None = None, coord_vars: Sequence[str] | None = None) -> Dataset[source]
- ascat.cf_array.contiguous_to_indexed(ds: Dataset, sample_dim: str, instance_dim: str, count_var: str, index_var: str) -> Dataset[source]
Convert a contiguous ragged array dataset to an indexed ragged array dataset.
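The idea behind this conversion can be shown without xarray. The sketch below is a plain-Python illustration under assumed names (counts_to_index is hypothetical, not part of ascat): each per-instance count is expanded into one instance index per sample.

```python
def counts_to_index(row_size):
    """Expand per-instance sample counts into a per-sample instance index."""
    index = []
    for instance, count in enumerate(row_size):
        index.extend([instance] * count)
    return index


row_size = [2, 0, 3]  # three instances with 2, 0 and 3 samples
print(counts_to_index(row_size))  # [0, 0, 2, 2, 2]
```

An instance with a count of zero (position 1 here) never appears in the index, which is why the instance dimension has to be kept separately.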
- ascat.cf_array.contiguous_to_point(ds: Dataset, sample_dim: str, instance_dim: str, count_var: str)[source]
Convert a contiguous ragged array dataset to a Point Array.
- ascat.cf_array.indexed_to_contiguous(ds: Dataset, sample_dim: str, instance_dim: str, count_var: str, index_var: str, sort_vars: Sequence[str] | None = None) -> Dataset[source]
Convert an indexed ragged array dataset to a contiguous ragged array dataset.
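The reverse direction needs a reordering step, because samples of one instance may be scattered along the sample dimension. The following plain-Python sketch (index_to_counts is a hypothetical helper, and it sorts by instance index only, ignoring any additional sort_vars) groups the samples and counts them.

```python
def index_to_counts(index, values, n_instances):
    """Group samples by instance and count them.

    Returns (row_size, reordered_values). Samples keep their original
    relative order within each instance because sorted() is stable.
    """
    order = sorted(range(len(index)), key=lambda k: index[k])
    row_size = [0] * n_instances
    for inst in index:
        row_size[inst] += 1
    return row_size, [values[k] for k in order]


index = [2, 0, 2, 0, 2]
values = ["a", "b", "c", "d", "e"]
row_size, contiguous = index_to_counts(index, values, n_instances=3)
print(row_size)    # [2, 0, 3]
print(contiguous)  # ['b', 'd', 'a', 'c', 'e']
```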
- ascat.cf_array.indexed_to_point(ds: Dataset, sample_dim: str, instance_dim: str, index_var: str)[source]
ascat.cgls module
ascat.file_handling module
File search methods.
- class ascat.file_handling.ChronFiles(root_path, cls, fn_templ, sf_templ, cls_kwargs=None, err=True, fn_read_fmt=None, sf_read_fmt=None, fn_write_fmt=None, sf_write_fmt=None, cache_size=0)[source]
Bases:
MultiFileHandler
Managing chronological files with a date field in the filename.
- read_period(dt_start, dt_end, dt_delta=datetime.timedelta(days=1), dt_buffer=datetime.timedelta(days=1), search_date_fmt='%Y%m%d*', date_field='date', date_field_fmt='%Y%m%d', end_inclusive=True, fmt_kwargs={}, **kwargs)[source]
Read data for given interval.
- Parameters:
dt_start (datetime) – Start datetime.
dt_end (datetime) – End datetime.
dt_delta (timedelta, optional) – Time delta used to jump through search date.
dt_buffer (timedelta, optional) – Search buffer used to find files which could possibly contain data but would be left out because of dt_start.
search_date_fmt (str, optional) – Search date string format used during file search (default: %Y%m%d*).
date_field (str, optional) – Date field name (default: “date”).
date_field_fmt (str, optional) – Date field string format (default: %Y%m%d).
- Returns:
data – Data stored in file.
- Return type:
- search_date(timestamp, search_date_fmt='%Y%m%d*', date_field='date', date_field_fmt='%Y%m%d', return_date=False, **fmt_kwargs)[source]
Search files for given date.
- Parameters:
timestamp (datetime) – Search date.
search_date_fmt (str, optional) – Search date string format used during file search (default: %Y%m%d*).
date_field (str, optional) – Date field name (default: “date”)
date_field_fmt (str, optional) – Date field string format (default: %Y%m%d).
return_date (bool, optional) – Return date parsed from filename (default: False).
- Returns:
filenames (list of str) – Filenames.
dates (list of datetime) – Parsed date of filename (only returned if return_date=True).
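How a search date format such as %Y%m%d* turns into a file pattern, and how a date can be parsed back from a filename, can be sketched with the standard library alone. The filenames below are hypothetical, and the assumption that the date field occupies the first eight characters is made only for this example; the class itself works from its filename templates.

```python
from datetime import datetime
import fnmatch

timestamp = datetime(2021, 3, 5)
search_date_fmt = "%Y%m%d*"  # default from the signature above
date_field_fmt = "%Y%m%d"

# strftime leaves the trailing "*" untouched, giving a glob-style pattern.
pattern = timestamp.strftime(search_date_fmt)

# Hypothetical filenames that start with the date field.
candidates = ["20210305_120000.nc", "20210305_235959.nc", "20210306_000000.nc"]
matches = fnmatch.filter(candidates, pattern)

# Parse the date field back out of each matching filename.
dates = [datetime.strptime(name[:8], date_field_fmt) for name in matches]

print(pattern)  # 20210305*
print(matches)  # ['20210305_120000.nc', '20210305_235959.nc']
```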
- search_period(dt_start, dt_end, dt_delta=datetime.timedelta(days=1), search_date_fmt='%Y%m%d*', date_field='date', date_field_fmt='%Y%m%d', end_inclusive=True, **fmt_kwargs)[source]
Search files for time period.
- Parameters:
dt_start (datetime) – Start datetime.
dt_end (datetime) – End datetime.
dt_delta (timedelta, optional) – Time delta used to jump through search date.
search_date_fmt (str, optional) – Search date string format used during file search (default: %Y%m%d*).
date_field (str, optional) – Date field name (default: “date”).
date_field_fmt (str, optional) – Date field string format (default: %Y%m%d).
end_inclusive (bool, optional) – Include files from a dt_delta length period beyond dt_end if True (default: True).
- Returns:
filenames – Filenames.
- Return type:
- class ascat.file_handling.CsvFile(filename, mode='r')[source]
Bases:
Filenames
Read and write single CSV file.
- header2dtype(header)[source]
Convert header string to dtype info.
- Parameters:
header (str) – Header string with dtype info.
- Returns:
dtype – Data type.
- Return type:
- class ascat.file_handling.CsvFiles(root_path)[source]
Bases:
ChronFiles
Write CSV files.
- class ascat.file_handling.FileSearch(root_path, fn_pattern, sf_pattern=None)[source]
Bases:
object
FileSearch class.
- create_isearch_func(func, recursive=False)[source]
Create custom search function returning it.
- Parameters:
func (function) – Search function with its own args/kwargs returning a filename format dictionary and subfolder format dictionary depending on the passed arguments.
recursive (bool, optional) – If recursive is true, the pattern “**” will match any files and zero or more directories, subdirectories and symbolic links to directories (default: False).
- Returns:
custom_search – Custom search function returning an iterator of path/file names that match.
- Return type:
function
- create_search_func(func, recursive=False)[source]
Create custom search function returning it.
- Parameters:
func (function) – Search function with its own args/kwargs returning a filename format dictionary and subfolder format dictionary depending on the passed arguments.
recursive (bool, optional) – If recursive is true, the pattern “**” will match any files and zero or more directories, subdirectories and symbolic links to directories (default: False).
- Returns:
custom_search – Custom search function returning a possibly-empty list of path/file names that match.
- Return type:
function
- isearch(fn_fmt, sf_fmt=None, recursive=False)[source]
Search filesystem for given pattern returning iterator.
- Parameters:
fn_fmt (dict) – Filename format dictionary.
sf_fmt (dict of dicts, optional) – Format dictionary for subfolders (default: None).
recursive (bool, optional) – If recursive is true, the pattern “**” will match any files and zero or more directories, subdirectories and symbolic links to directories (default: False).
- Returns:
filenames – Iterator which yields the same values as search() without actually storing them all simultaneously.
- Return type:
iterator
- search(fn_fmt, sf_fmt=None, recursive=False)[source]
Search filesystem for given pattern returning list.
- Parameters:
fn_fmt (dict) – Filename format dictionary.
sf_fmt (dict of dicts, optional) – Format dictionary for subfolders (default: None).
recursive (bool, optional) – If recursive is true, the pattern “**” will match any files and zero or more directories, subdirectories and symbolic links to directories (default: False).
- Returns:
filenames – Return a possibly-empty list of path/file names that match.
- Return type:
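The difference between the list-returning search() and the iterator-returning isearch(), and the effect of the recursive flag, resembles the glob and iglob functions of Python's glob module, whose wording the parameter description mirrors. Whether FileSearch calls glob internally is an assumption here; the sketch only demonstrates the standard-library behaviour on a temporary, made-up directory tree.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a tiny, hypothetical tree: root/2000/01/a.nc and root/2000/02/b.nc
    for sub, name in [("2000/01", "a.nc"), ("2000/02", "b.nc")]:
        os.makedirs(os.path.join(root, sub))
        open(os.path.join(root, sub, name), "w").close()

    # With recursive=True, "**" matches zero or more directory levels.
    pattern = os.path.join(root, "**", "*.nc")

    # List variant: all matches are collected before returning.
    found = sorted(glob.glob(pattern, recursive=True))

    # Iterator variant: matches are produced one at a time.
    lazy = glob.iglob(pattern, recursive=True)
    first = next(lazy)

    names = [os.path.basename(p) for p in found]
    print(names)  # ['a.nc', 'b.nc']
```

The iterator form is useful when a directory tree holds many files and only the first few matches are needed.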
- class ascat.file_handling.FilenameTemplate(root_path, fn_templ, sf_templ=None)[source]
Bases:
object
FilenameTemplate class.
- build_filename(fn_fmt, sf_fmt=None)[source]
Create filename from format dictionary.
- Parameters:
fn_fmt (dict) – Filename format applied to the filename template (fn_templ), e.g. fn_templ = “{date}*.{suffix}” with fn_fmt = {“date”: “20000101”, “suffix”: “nc”} returns “20000101*.nc”.
sf_fmt (dict of dicts) –
Format dictionary for subfolders. Each subfolder contains a dictionary defining the format of the folder name, e.g. sf_templ = {“years”: “{year}”, “months”: “{month}”} with sf_fmt = {“years”: {“year”: “2000”}, “months”: {“month”: “02”}} returns [“2000”, “02”].
- Returns:
filename – Filename with format_dict applied.
- Return type:
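The template examples given for fn_fmt and sf_fmt can be reproduced with plain str.format. This is a minimal sketch of the substitution step only, assuming the templates are ordinary Python format strings; the variable names follow the signature, and "/root_path" is a placeholder.

```python
import os

fn_templ = "{date}*.{suffix}"
sf_templ = {"years": "{year}", "months": "{month}"}

fn_fmt = {"date": "20000101", "suffix": "nc"}
sf_fmt = {"years": {"year": "2000"}, "months": {"month": "02"}}

# Fill the filename template with its format dictionary.
filename = fn_templ.format(**fn_fmt)

# Fill each subfolder template with the matching inner dictionary.
subfolders = [sf_templ[key].format(**sf_fmt[key]) for key in sf_templ]

print(filename)    # 20000101*.nc
print(subfolders)  # ['2000', '02']

# Join root path, subfolders and filename into one search path.
print(os.path.join("/root_path", *subfolders, filename))
```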
- build_subfolder(fmt)[source]
Create subfolder path from format dictionary.
- Parameters:
fmt (dict of dicts) –
Format dictionary for subfolders. Each subfolder contains a dictionary defining the format of the folder name, e.g. sf_templ = {“years”: “{year}”, “months”: “{month}”} with fmt = {“years”: {“year”: “2000”}, “months”: {“month”: “02”}} returns [“2000”, “02”].
- Returns:
subfolder – Subfolder with format_dict applied.
- Return type:
- property template
Name property.
- class ascat.file_handling.Filenames(filenames)[source]
Bases:
object
A class to handle operations on multiple filenames.
This class provides methods for reading from, writing to, and merging data from multiple files.
- iter_read(print_progress=False, **kwargs)[source]
Iterate over all files and yield data.
- Yields:
object – Data read from each file.
- iter_read_nbytes(max_nbytes, print_progress=False, **kwargs)[source]
Iterate over all files and yield data until the specified number of bytes is reached. If _read returns dask objects, they are computed (in parallel) before merging the data.
- read(parallel=False, closer_attr=None, **kwargs)[source]
Read all data from files.
- Returns:
Merged data from all files.
- Return type:
- reprocess(out_dir, func, parallel=False, print_progress=False, read_kwargs=None, **write_kwargs)[source]
Reprocess data from all files through func, writing the results to out_dir. Assumes that if any files have the same name, they should be merged.
- Parameters:
out_dir (Path) – Directory to write the output files. This will be prepended to the filenames.
func (function) – The function to apply to the data before writing out.
parallel (bool, optional) – Whether to process the data in parallel (default: False).
**write_kwargs (dict) – Additional keyword arguments for writing.
- write(data, parallel=False, print_progress=False, **kwargs)[source]
Write data to file.
If there’s only one filename in self.filenames, write provided data to that file. If there is more than one filename, write each element of the provided data list to the corresponding filename.
- Parameters:
data (list of objects) – The data to write. Should be a list with the same length as self.filenames, where each element is the data to be written to the corresponding filename.
- class ascat.file_handling.MultiFileHandler(root_path, cls, fn_templ, sf_templ=None, cls_kwargs=None, err=False, cache_size=0)[source]
Bases:
object
MultiFileHandler class.
- read(*fmt_args, fmt_kwargs=None, cls_kwargs=None)[source]
Read data.
- Parameters:
- Returns:
data – Data stored in file.
- Return type:
- read_file(filename, cls_kwargs=None)[source]
Read data for given filename.
- Parameters:
filename (str) – Filename.
- search(fn_search_pattern, sf_search_pattern=None, custom_fn_templ=None, custom_sf_templ=None)[source]
Search files for given root path and filename/folder pattern.
ascat.h_saf module
ascat.ragged_array module
- class ascat.ragged_array.ContiguousRaggedArray(ds: Dataset, count_var: str, instance_dim: str, instance_id_var: str = None)[source]
Bases:
object
Contiguous ragged array representation (CF convention).
In a contiguous ragged array representation, the data for all time series are stored in a single 1D array. Additional variables or dimensions provide the metadata needed to map these values back to their respective time series.
The contiguous ragged array representation can be used only if the size of each instance is known at the time that it is created. In this representation the data for each instance will be contiguous on disk.
If the instance dimension exists as a variable, it is assumed that the values represent the identifiers for each instance; otherwise the instances are counted upwards from 0.
- sample_dim
Name of the sample dimension. The variable bearing the sample_dimension attribute (i.e. count_var) must have the instance dimension as its single dimension, and must have an integer type.
- Type:
- count_var
Name of the count variable. The count variable must be an integer type and must have the instance dimension as its sole dimension. The count variable is identifiable by the presence of an attribute, sample_dimension, found on the count variable, which names the sample dimension being counted.
- Type:
- ds
Contiguous ragged array dataset.
- Type:
xarray.Dataset
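How the count variable maps the flat sample array back to individual time series can be illustrated with plain lists. The variable names (row_size, sm) and values are made up for this sketch; the class itself operates on an xarray.Dataset.

```python
from itertools import accumulate

# Hypothetical contiguous ragged array: one flat sample array plus counts.
row_size = [3, 1, 2]                  # samples per instance
sm = [0.1, 0.2, 0.3, 0.9, 0.5, 0.6]   # flat 1D data along the sample dimension

# The stop offset of each instance is the cumulative sum of the counts;
# the start offset is the stop offset minus the instance's own count.
stops = list(accumulate(row_size))
starts = [stop - count for stop, count in zip(stops, row_size)]


def instance_samples(i):
    """Return the samples belonging to the instance at position i."""
    return sm[starts[i]:stops[i]]


print(instance_samples(0))  # [0.1, 0.2, 0.3]
print(instance_samples(2))  # [0.5, 0.6]
```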
- property ds
Dataset.
- Returns:
ds – Contiguous ragged array dataset.
- Return type:
xr.Dataset
- classmethod from_file(filename: str, count_var: str, instance_dim: str, instance_id_var: str = None, **kwargs)[source]
Load time series from file.
- Parameters:
- Returns:
data – ContiguousRaggedArray object loaded from a file.
- Return type:
- iter()[source]
Explicit iterator method.
- Returns:
ds – Time series for instance.
- Return type:
xr.Dataset
- sel_instances(i: ndarray) -> Dataset[source]
Read time series for given instance IDs using a LUT and preserve order.
- Parameters:
i (np.ndarray) – Array of instance IDs.
- Returns:
ds – Dataset containing the selected instances in the correct order.
- Return type:
xr.Dataset
- to_indexed()[source]
Convert to indexed ragged array.
- Returns:
data – Indexed ragged array time series.
- Return type:
- class ascat.ragged_array.IndexedRaggedArray(ds: Dataset, index_var: str, sample_dim: str)[source]
Bases:
object
Indexed ragged array representation (CF convention).
In an indexed ragged array representation, the dataset is structured to store variable-length data (e.g., time series with varying lengths) compactly. To achieve this, auxiliary indexing variables map the flat array storage to meaningful groups (e.g. locations).
If the instance dimension exists as a variable, it is assumed that the values represent the identifiers for each instance; otherwise the instances are counted upwards from 0.
- index_var
The indexed ragged array representation must contain an index variable, which must be an integer type, and must have the sample dimension as its single dimension. The index variable can be identified by having an attribute ‘instance_dimension’ whose value is the instance dimension.
- Type:
- sample_dim
Name of the sample dimension. The sample dimension runs over all samples (observations) of all instances; the index variable assigns each sample to its instance (e.g. station, location).
- Type:
- instance_dim
The name of the instance dimension. The value is defined by the ‘instance_dimension’ attribute, which must be present on the index variable. All variables having the instance dimension are instance variables, i.e. variables holding one value per instance.
- Type:
- ds
Indexed ragged array dataset.
- Type:
xarray.Dataset
- append(ds: Dataset)[source]
Append indexed ragged array time series.
- Parameters:
ds (xarray.Dataset) – Indexed ragged array time series.
- property ds: Dataset
Dataset.
- Returns:
ds – Indexed ragged array dataset.
- Return type:
xr.Dataset
- iter() -> Dataset[source]
Explicit iterator method.
- Returns:
ds – Time series for instance.
- Return type:
xr.Dataset
- sel_instance(i: int) -> Dataset[source]
Read time series.
- Parameters:
i (int) – Instance identifier.
- Returns:
ds – Time series for instance.
- Return type:
xr.Dataset
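Selecting one time series from an indexed ragged array amounts to keeping the samples whose index value points at the requested instance. A plain-list sketch, with made-up identifiers and values and a stand-in function that is not the class method itself:

```python
# Hypothetical indexed ragged array: every sample carries its instance position.
location_id = [101, 205, 309]         # instance identifiers (instance dimension)
location_index = [0, 2, 0, 1, 2, 2]   # index variable (sample dimension)
sm = [0.1, 0.7, 0.2, 0.4, 0.8, 0.9]   # data along the sample dimension


def sel_instance_sketch(instance_id):
    """Return all samples whose index points at the requested instance."""
    position = location_id.index(instance_id)
    return [v for v, idx in zip(sm, location_index) if idx == position]


print(sel_instance_sketch(309))  # [0.7, 0.8, 0.9]
```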
- sel_instances(i: array, ignore_missing: bool = True) -> Dataset[source]
Select multiple instances (time series).
- Parameters:
i (numpy.array) – Instance identifiers.
- Returns:
ds – Time series for the selected instances.
- Return type:
xr.Dataset
- to_contiguous(count_var: str = 'row_size') -> ContiguousRaggedArray[source]
Convert to contiguous ragged array.
- Parameters:
count_var (str, optional) – Count variable (default: “row_size”).
- Returns:
data – Contiguous ragged array time series.
- Return type:
- to_orthomulti() -> OrthoMultiArray[source]
Convert to orthogonal multidimensional array.
- Returns:
data – Orthogonal multidimensional array time series.
- Return type:
- class ascat.ragged_array.OrthoMultiArray(ds: Dataset, instance_dim: str = 'loc', element_dim: str = 'time')[source]
Bases:
object
Orthogonal multidimensional array.
- ds
Orthomulti array dataset.
- Type:
xarray.Dataset
- property ds
- class ascat.ragged_array.PointData(ds: Dataset, sample_dim: str)[source]
Bases:
object
Point data represent scattered locations and times with no implied relationship among the coordinate positions. Both data and coordinates must share the same (sample) instance dimension.
- property ds
- to_contiguous(count_var: str = 'row_size', instance_dim: str = 'loc')[source]
Convert point data to contiguous ragged array.
- Parameters:
- Returns:
contiguous – Contiguous ragged array object.
- Return type:
- ascat.ragged_array.pad_to_2d(var: DataArray, x: array, y: array, shape: tuple) -> array[source]
Pad each time series so that the 1d array fits into a 2d array.
- Parameters:
var (xarray.DataArray) – 1d array to be converted into 2d array.
x (np.array) – Row indices.
y (np.array) – Column indices.
shape (tuple) – Array shape.
- Returns:
padded – Padded 2d array.
- Return type:
numpy.array
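The padding step can be pictured as scattering 1D values into a pre-filled 2D container using row and column indices. The sketch below uses nested lists and None as the fill value; the function name, the fill value and the sample data are assumptions for illustration, and the library function works on arrays instead.

```python
def pad_to_2d_sketch(values, x, y, shape, fill=None):
    """Place 1D values at (row, column) positions; untouched cells stay fill."""
    n_rows, n_cols = shape
    padded = [[fill] * n_cols for _ in range(n_rows)]
    for value, row, col in zip(values, x, y):
        padded[row][col] = value
    return padded


values = [1.0, 2.0, 3.0, 4.0]
x = [0, 0, 1, 2]  # row index (time series) of each value
y = [0, 1, 0, 0]  # column index (position within the series)
print(pad_to_2d_sketch(values, x, y, shape=(3, 2)))
# [[1.0, 2.0], [3.0, None], [4.0, None]]
```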
- ascat.ragged_array.verify_contiguous_ragged(ds: Dataset, count_var: str, instance_dim: str) -> None[source]
Verify dataset follows contiguous ragged array CF definition.
- Parameters:
ds (xarray.Dataset) – Dataset to be verified.
count_var (str) – Name of the count variable. Count variable contains the length of each time series feature. It is identified by having an attribute with name ‘sample_dimension’ whose value is name of the sample dimension. The count variable implicitly partitions into individual instances all variables that have the sample dimension.
- Raises:
RuntimeError – If verification fails.
- ascat.ragged_array.verify_indexed_ragged(ds: Dataset, index_var: str, sample_dim: str) -> None[source]
Verify dataset follows indexed ragged array CF definition.
- ascat.ragged_array.verify_ortho_multi(ds: Dataset, instance_dim: str, element_dim: str) -> None[source]
Verify dataset follows orthogonal multidimensional array CF definition.
- ascat.ragged_array.verify_point_array(ds: Dataset, sample_dim: str) -> None[source]
Verify dataset follows the CF point data array convention.
- Parameters:
ds (xarray.Dataset) – Dataset to be verified.
sample_dim (str) – Name of the sample dimension.
- Raises:
RuntimeError – If verification fails.
- ascat.ragged_array.vrange(starts, stops)[source]
Create concatenated ranges of integers for multiple start/stop values.
- Parameters:
starts (numpy.ndarray) – Starts for each range.
stops (numpy.ndarray) – Stops for each range (same shape as starts).
- Returns:
ranges – Concatenated ranges.
- Return type:
Example
>>> starts = [1, 3, 4, 6]
>>> stops = [1, 5, 7, 6]
>>> vrange(starts, stops)
array([3, 4, 4, 5, 6])
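For readers who want to see what "concatenated ranges" means step by step, here is a pure-Python equivalent of the documented behaviour. It is a sketch only (vrange_sketch is a hypothetical name, and it returns a list rather than an array); a pair with start equal to stop contributes an empty range.

```python
def vrange_sketch(starts, stops):
    """Concatenate range(start, stop) for each start/stop pair."""
    out = []
    for start, stop in zip(starts, stops):
        out.extend(range(start, stop))
    return out


# Same inputs as the example above: the pairs (1, 1) and (6, 6) are empty.
print(vrange_sketch([1, 3, 4, 6], [1, 5, 7, 6]))  # [3, 4, 4, 5, 6]
```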
ascat.swath module
- class ascat.swath.Swath(filenames)[source]
Bases:
Filenames
Class to read and merge swath files given one or more file paths.
- static combine_attributes(attrs_list, context)[source]
Decides which attributes to keep when merging swath files.
- Parameters:
attrs_list (list of dict) – List of attributes dictionaries.
context (None) – This currently is None, but will eventually be passed information about the context in which this was called. (see https://github.com/pydata/xarray/issues/6679#issuecomment-1150946521)
- class ascat.swath.SwathGridFiles(root_path, fn_templ, sf_templ, grid_name, date_field_fmt, cell_fn_format=None, cls_kwargs=None, err=True, fn_read_fmt=None, sf_read_fmt=None, fn_write_fmt=None, sf_write_fmt=None, preprocessor=None, postprocessor=None, cache_size=0)[source]
Bases:
ChronFiles
Class to manage chronological swath files with a date field in the filename.
- classmethod from_product_class(path, product_class)[source]
Create a SwathGridFiles object from a given product class.
Returns a SwathGridFiles object initialized with the given product class.
- Parameters:
path (str or Path) – Path to the swath file collection.
product_class (class) – Class to use for reading and writing the swath files.
Examples
>>> my_swath_collection = SwathGridFiles.from_product_class(
...     "/path/to/swath/files",
...     AscatH129Swath,
... )
- classmethod from_product_id(path, product_id)[source]
Create a SwathGridFiles object based on a product_id.
Returns a SwathGridFiles object initialized with an io_class specified by product_id (case-insensitive).
- Parameters:
- Raises:
ValueError – If product_id is not recognized.
Examples
>>> my_swath_collection = SwathGridFiles.from_product_id(
...     "/path/to/swath/files",
...     "H129",
... )
- read(date_range, dt_delta=None, search_date_fmt='%Y%m%d*', date_field='date', end_inclusive=True, cell=None, location_id=None, coords=None, max_coord_dist=None, bbox=None, geom=None, read_kwargs=None, **fmt_kwargs)[source]
Extract data from swath files within a time range and spatial criterion.
- Parameters:
date_range (tuple of datetime.datetime) – Start and end date.
dt_delta (timedelta) – Time delta.
search_date_fmt (str) – Search date format.
date_field (str) – Date field.
end_inclusive (bool) – If True (default), include data from the end date in the result. Otherwise, exclude it.
coords (tuple of numeric or tuple of iterable of numeric) – Tuple of (lon, lat) coordinates to read.
max_coord_dist (float) – Maximum distance in meters to search for grid points near the given coordinates. If None, the default is np.inf.
bbox (tuple) – Tuple of (latmin, latmax, lonmin, lonmax) coordinates to bound the data.
geom (shapely.geometry) – Geometry to bound the data.
- Returns:
Dataset.
- Return type:
xarray.Dataset
- stack_to_cell_files(out_dir, max_nbytes, date_range=None, fmt_kwargs=None, cells=None, print_progress=True, parallel=True)[source]
Stack all swath files to cell files, writing them in parallel.
- Parameters:
out_dir (str) – Output directory.
max_nbytes (int) – Maximum number of bytes to open as xarray datasets before dumping to disk.
date_range (tuple of datetime.datetime, optional) – Start and end date for the search.
fmt_kwargs (dict, optional) – Additional keyword arguments passed to ascat.file_handling.ChronFiles.search_period.
cells (list of int, optional) – List of grid cell numbers to read. If None (default), all cells are read.
print_progress (bool, optional) – If True (default), print progress bars.
parallel (bool, optional) – If True, write data to files in parallel (use all available resources).
- swath_search(dt_start, dt_end, dt_delta=None, search_date_fmt='%Y%m%d*', date_field='date', end_inclusive=True, cell=None, location_id=None, coords=None, bbox=None, geom=None, **fmt_kwargs)[source]
Search for swath files within a time range and spatial criterion.
- Parameters:
dt_start (datetime) – Start date.
dt_end (datetime) – End date.
dt_delta (timedelta) – Time delta.
search_date_fmt (str) – Search date format.
date_field (str) – Date field.
end_inclusive (bool) – End date inclusive.
coords (tuple of numeric or tuple of iterable of numeric) – Tuple of (lon, lat) coordinates.
bbox (tuple) – Tuple of (latmin, latmax, lonmin, lonmax) coordinates.
geom (shapely.geometry) – Geometry.
fmt_kwargs (dict) – Additional keyword arguments passed to ascat.file_handling.ChronFiles.search_period.
- Returns:
Filenames.
- Return type:
ascat.utils module
- class ascat.utils.Spacecraft(name)[source]
Bases:
object
Spacecraft class.
- valid_spacecraft_names = ['METOPA', 'METOPB', 'METOPC', 'METOP-A', 'METOP-B', 'METOP-C', 'METOP-SG B1', 'METOP-SG B2', 'METOP-SG B3']
- ascat.utils.append_to_netcdf(filename, ds_to_append, unlimited_dim)[source]
Appends an xarray dataset to an existing netCDF file along a given unlimited dim.
- Parameters:
- Raises:
ValueError – If more than one unlimited dim is given.
- ascat.utils.boxcar(radius, distance)[source]
Boxcar filter
- Parameters:
n (int) – Length.
- Returns:
weights (numpy.ndarray) – Distance weights.
tw (float32) – Sum of weights.
- ascat.utils.create_variable_encodings(ds, custom_variable_encodings=None, custom_dtypes=None)[source]
Create an encoding dictionary for a dataset, optionally overriding the default encoding or adding additional encoding parameters. New parameters cannot be added to default encoding for a variable, only overridden.
E.g. if you want to add a “units” encoding to “lon”, you should also pass “dtype”, “zlib”, “complevel”, and “_FillValue” if you don’t want to lose those.
- Parameters:
ds (xarray.Dataset) – Dataset.
custom_variable_encodings (dict, optional) – Custom encodings.
- Returns:
ds – Dataset with encodings.
- Return type:
xarray.Dataset
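The override rule described above can be illustrated with plain dictionaries. This is a minimal sketch of the documented behaviour, not the library's implementation: the helper name `merge_encodings_sketch` and the default values are hypothetical.

```python
def merge_encodings_sketch(defaults, custom=None):
    """Replace (not merge) the default encoding of every variable in `custom`."""
    merged = {name: dict(enc) for name, enc in defaults.items()}
    for name, enc in (custom or {}).items():
        # The custom entry replaces the whole default entry for this variable.
        merged[name] = dict(enc)
    return merged


# Hypothetical defaults, for illustration only.
defaults = {
    "lon": {"dtype": "float32", "zlib": True, "complevel": 4, "_FillValue": None},
}

# Passing only "units" drops dtype, zlib, complevel and _FillValue.
lossy = merge_encodings_sketch(defaults, {"lon": {"units": "degrees_east"}})

# Repeating the defaults alongside "units" keeps them.
kept = merge_encodings_sketch(
    defaults, {"lon": {**defaults["lon"], "units": "degrees_east"}}
)
```

Under this assumption, `lossy["lon"]` contains only the "units" key, while `kept["lon"]` carries the defaults plus "units".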
- ascat.utils.daterange(start_date, end_date)[source]
Generator for daily datetimes.
- Parameters:
start_date (datetime) – Start date.
end_date (datetime) – End date.
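A daily generator of this kind could look as follows. This is a sketch only: the name `daterange_sketch` is hypothetical, and the sketch assumes that end_date itself is not yielded, which the description above does not state.

```python
from datetime import datetime, timedelta


def daterange_sketch(start_date, end_date):
    """Yield one datetime per day from start_date up to, but excluding, end_date."""
    # Assumption: the end date is exclusive.
    for n in range((end_date - start_date).days):
        yield start_date + timedelta(days=n)


days = list(daterange_sketch(datetime(2020, 1, 1), datetime(2020, 1, 4)))
```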
- ascat.utils.db2lin(val)[source]
Converting from dB to linear domain.
- Parameters:
val (numpy.ndarray) – Values in dB domain.
- Returns:
val – Values in linear domain.
- Return type:
- ascat.utils.get_bit(a, bit_pos)[source]
Returns 1 if the bit is set and 0 otherwise.
- Parameters:
a (int or numpy.ndarray) – Input array.
bit_pos (int) – Bit position, counted from the right (the first bit is the rightmost one).
- Returns:
b – 1 if bit is set and 0 if not.
- Return type:
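Reading a single bit can be sketched with a shift and a mask. The helper `get_bit_sketch` is hypothetical, and the sketch assumes that positions are zero-based from the least significant bit; the description above does not say whether the library starts counting at 0 or 1.

```python
def get_bit_sketch(a, bit_pos):
    """Return 1 if the bit at bit_pos is set in a, else 0."""
    # Assumption: bit_pos = 0 addresses the rightmost (least significant) bit.
    return (a >> bit_pos) & 1


flags = 0b0101
first = get_bit_sketch(flags, 0)
second = get_bit_sketch(flags, 1)
third = get_bit_sketch(flags, 2)
```

The same operators apply elementwise to integer NumPy arrays.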
- ascat.utils.get_grid_gpis(grid, cell=None, location_id=None, coords=None, bbox=None, geom=None, max_coord_dist=inf, return_lookup: bool = False)[source]
Get grid point indices.
- Parameters:
grid (pygeogrids.CellGrid) – Grid object.
location_id (int or iterable of int, optional) – Location ID.
coords (tuple, optional) – Tuple of (lon, lat) coordinates.
bbox (tuple, optional) – Tuple of (latmin, latmax, lonmin, lonmax) coordinates.
geom (shapely.geometry.BaseGeometry, optional) – Geometry object.
max_coord_dist (float, optional) – Maximum distance from coordinates to return a gpi.
- Returns:
gpi (int) – Grid point index.
lookup_vector (numpy.ndarray) – Lookup vector. (only if return_lookup is True)
- ascat.utils.get_roi_subset(ds, roi)[source]
Filter dataset for given region of interest.
- Parameters:
ds (xarray.Dataset) – Dataset to be filtered for region of interest.
roi (tuple of 4 float) – Region of interest: latmin, lonmin, latmax, lonmax
- Returns:
ds – Filtered dataset.
- Return type:
xarray.Dataset
- ascat.utils.get_toi_subset(ds, toi)[source]
Filter dataset for given time of interest.
- Parameters:
ds (xarray.Dataset) – Dataset to be filtered for time of interest.
toi (tuple of datetime) – Time of interest.
- Returns:
ds – Filtered dataset.
- Return type:
xarray.Dataset
- ascat.utils.get_window_radius(window, hp_radius)[source]
Calculates the required radius of a window function in order to achieve the provided half power radius.
- Parameters:
window (string) –
Window function name. Current supported windows:
Hamming
Boxcar
hp_radius (float32) – Half power radius. Radius of window function for weight equal to 0.5 (-3 dB). In the spatial domain this corresponds to half of the spatial resolution one would like to achieve with the given window.
- Returns:
r – Window radius needed to achieve the given half power radius
- Return type:
float32
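The relation between half power radius and window radius can be sketched by inverting the window formula. The code below assumes the common Hamming form w(d) = 0.54 + 0.46 cos(pi d / r) and, for the boxcar window, a radius equal to the half power radius; both are assumptions, and the names `hamming_weight` and `window_radius_sketch` are hypothetical.

```python
import math

ALPHA = 0.54  # Assumed Hamming coefficient.


def hamming_weight(radius, distance):
    """Hamming weight at a given distance for a window of the given radius."""
    return ALPHA + (1 - ALPHA) * math.cos(math.pi * distance / radius)


def window_radius_sketch(window, hp_radius):
    """Window radius for which the weight at hp_radius equals 0.5."""
    window = window.lower()
    if window == "hamming":
        # Solve ALPHA + (1 - ALPHA) * cos(pi * hp_radius / r) = 0.5 for r.
        return math.pi * hp_radius / math.acos((0.5 - ALPHA) / (1 - ALPHA))
    if window == "boxcar":
        # Assumption: a boxcar window has no taper, so the radius is used as is.
        return hp_radius
    raise ValueError(f"Window name not supported: {window}")


r = window_radius_sketch("Hamming", 10.0)
w_at_hp = hamming_weight(r, 10.0)
```

Under these assumptions a half power radius of 10 gives a Hamming window radius of roughly 18.95, in the same units as the input.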
- ascat.utils.get_window_weights(window, radius, distance, norm=False)[source]
Returns weights for the given window function.
- Parameters:
window (str) – Window function name
radius (float) – Radius of the window.
distance (numpy.ndarray) – Distance array
norm (boolean) – If true, normalised weights will be returned.
- Returns:
weights – Weights according to distances and given window function
- Return type:
- ascat.utils.gpis_to_lookup(grid, gpis)[source]
Create lookup vector from grid point indices.
- Parameters:
grid (pygeogrids.BasicGrid) – Grid object.
gpis (numpy.ndarray) – Grid point indices.
- Returns:
lookup_vector – Lookup vector.
- Return type:
- ascat.utils.hamming_window(radius, distances)[source]
Hamming window filter.
- Parameters:
radius (float32) – Radius of the window.
distances (numpy.ndarray) – Array with distances.
- Returns:
weights (numpy.ndarray) – Distance weights.
tw (float32) – Sum of weights.
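The two window functions documented in this module both return distance weights and their sum. A plain-Python sketch of that contract is shown here, assuming a boxcar weight of 1 within the radius and 0 outside, and the common Hamming form with coefficient 0.54. The names `boxcar_sketch` and `hamming_sketch` are hypothetical, and lists stand in for NumPy arrays.

```python
import math


def boxcar_sketch(radius, distances):
    """Weights of 1 inside the radius, 0 outside, plus their sum."""
    weights = [1.0 if d <= radius else 0.0 for d in distances]
    return weights, sum(weights)


def hamming_sketch(radius, distances):
    """Hamming weights (assumed coefficient 0.54) plus their sum."""
    alpha = 0.54
    weights = [
        alpha + (1 - alpha) * math.cos(math.pi * d / radius) for d in distances
    ]
    return weights, sum(weights)


distances = [0.0, 5.0, 10.0]
bw, btw = boxcar_sketch(10.0, distances)
hw, htw = hamming_sketch(10.0, distances)

# Dividing by the sum yields normalised weights that add up to 1.
hw_norm = [w / htw for w in hw]
```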
- ascat.utils.lin2db(val)[source]
Converting from linear to dB domain.
- Parameters:
val (numpy.ndarray) – Values in linear domain.
- Returns:
val – Values in dB domain.
- Return type:
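The conversion between the linear and dB domains can be sketched with the standard library. This sketch assumes power-like quantities (factor 10 rather than 20) and scalar inputs; the names `lin2db_sketch` and `db2lin_sketch` are hypothetical.

```python
import math


def lin2db_sketch(val):
    """Convert a linear value to decibels (assumes a power quantity)."""
    return 10.0 * math.log10(val)


def db2lin_sketch(val):
    """Convert a decibel value back to the linear domain."""
    return 10.0 ** (val / 10.0)


sigma0_db = lin2db_sketch(0.1)
sigma0_lin = db2lin_sketch(sigma0_db)
```

The two functions are inverses of each other, so a round trip returns the original value up to floating point error.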
- ascat.utils.mask_dtype_nans(ds)[source]
Mask NaNs in a dataset based on the dtypes of its variables.
- ascat.utils.set_bit(a, bit_pos, value=1)[source]
Set bit at given position.
- Parameters:
a (int or numpy.ndarray) – Input array.
bit_pos (int) – Bit position, counted from the right (the first bit is the rightmost one).
value (1 or 0, optional) – Set bit either to 1 or 0 (default: 1).
- Returns:
a – Modified input array with bit=value.
- Return type:
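Setting or clearing a single bit can be sketched with a mask. The helper `set_bit_sketch` is hypothetical and assumes zero-based positions from the least significant bit, which may differ from the numbering the library uses.

```python
def set_bit_sketch(a, bit_pos, value=1):
    """Return a with the bit at bit_pos set to value (1 or 0)."""
    # Assumption: bit_pos = 0 addresses the rightmost (least significant) bit.
    mask = 1 << bit_pos
    return a | mask if value else a & ~mask


raised = set_bit_sketch(0b0100, 0)
cleared = set_bit_sketch(0b0101, 2, value=0)
```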