ascat.read_native package

Submodules

ascat.read_native.base module

class ascat.read_native.base.AscatFile(filename)[source]

Bases: Filenames

Class reading ASCAT files.

read(toi=None, roi=None, **kwargs)[source]

Read ASCAT Level 1b data.

Parameters:
  • toi (tuple of datetime, optional) – Filter data for given time of interest (default: None). e.g. (datetime(2020, 1, 1, 12), datetime(2020, 1, 2))

  • roi (tuple of 4 float, optional) – Filter data for region of interest (default: None). e.g. latmin, lonmin, latmax, lonmax

Returns:

  • data (xarray.Dataset or numpy.ndarray) – ASCAT data.

  • metadata (dict) – Metadata.

read_period(dt_start, dt_end, **kwargs)[source]

Read interval.

Parameters:
  • dt_start (datetime) – Start datetime.

  • dt_end (datetime) – End datetime.

Returns:

  • data (xarray.Dataset or numpy.ndarray) – ASCAT data.

  • metadata (dict) – Metadata.
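The toi and roi filters can be pictured with a small stand-alone sketch. It does not call the ascat package: the record layout (dicts with time, lat and lon keys), the helper name filter_toi_roi and the inclusive bounds are assumptions made for illustration only. The roi order follows the one documented above (latmin, lonmin, latmax, lonmax).

```python
from datetime import datetime


def filter_toi_roi(records, toi=None, roi=None):
    """Keep records inside a time of interest and a region of interest.

    records: iterable of dicts with 'time', 'lat', 'lon' (hypothetical layout).
    toi: (start, end) datetimes; roi: (latmin, lonmin, latmax, lonmax).
    Bounds are treated as inclusive, which is an assumption.
    """
    selected = []
    for rec in records:
        if toi is not None and not (toi[0] <= rec["time"] <= toi[1]):
            continue
        if roi is not None:
            latmin, lonmin, latmax, lonmax = roi
            inside = latmin <= rec["lat"] <= latmax and lonmin <= rec["lon"] <= lonmax
            if not inside:
                continue
        selected.append(rec)
    return selected


records = [
    {"time": datetime(2020, 1, 1, 13), "lat": 48.2, "lon": 16.4},
    {"time": datetime(2020, 1, 1, 11), "lat": 48.2, "lon": 16.4},  # too early
    {"time": datetime(2020, 1, 1, 18), "lat": -10.0, "lon": 16.4},  # outside roi
]
toi = (datetime(2020, 1, 1, 12), datetime(2020, 1, 2))
roi = (40.0, 10.0, 50.0, 20.0)
subset = filter_toi_roi(records, toi=toi, roi=roi)
```

With both filters left at None, every record is returned unchanged.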

ascat.read_native.bufr module

Readers for ASCAT Level 1b and Level 2 data in BUFR format.

class ascat.read_native.bufr.AscatL1bBufrFile(filename, **kwargs)[source]

Bases: AscatFile

Read ASCAT Level 1b file in BUFR format.

class ascat.read_native.bufr.AscatL1bBufrFileGeneric(filename, **kwargs)[source]

Bases: AscatL1bBufrFile

The same as AscatL1bBufrFile but with generic=True by default.

class ascat.read_native.bufr.AscatL2BufrFile(filename, **kwargs)[source]

Bases: AscatFile

Read ASCAT Level 2 file in BUFR format.

class ascat.read_native.bufr.AscatL2BufrFileGeneric(filename, **kwargs)[source]

Bases: AscatL2BufrFile

The same as AscatL2BufrFile but with generic=True by default.

ascat.read_native.bufr.conv_bufrl1b_generic(data, metadata)[source]

Rename and convert data types of dataset.

Spacecraft_id vs sat_id encoding

BUFR encoding (Spacecraft_id):
  • 1 – ERS 1
  • 2 – ERS 2
  • 3 – Metop-1 (Metop-B)
  • 4 – Metop-2 (Metop-A)
  • 5 – Metop-3 (Metop-C)

Internal encoding (sat_id):
  • 1 – ERS 1
  • 2 – ERS 2
  • 3 – Metop-2 (Metop-A)
  • 4 – Metop-1 (Metop-B)
  • 5 – Metop-3 (Metop-C)

Parameters:
Returns:

data – Converted dataset.

Return type:

dict of numpy.ndarray

ascat.read_native.bufr.conv_bufrl2_generic(data, metadata)[source]

Rename and convert data types of dataset.

Spacecraft_id vs sat_id encoding

BUFR encoding (Spacecraft_id):
  • 1 – ERS 1
  • 2 – ERS 2
  • 3 – Metop-1 (Metop-B)
  • 4 – Metop-2 (Metop-A)
  • 5 – Metop-3 (Metop-C)

Internal encoding (sat_id):
  • 1 – ERS 1
  • 2 – ERS 2
  • 3 – Metop-2 (Metop-A)
  • 4 – Metop-1 (Metop-B)
  • 5 – Metop-3 (Metop-C)

Parameters:
Returns:

data – Converted dataset.

Return type:

dict of numpy.ndarray
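The Spacecraft_id and sat_id encodings listed for conv_bufrl1b_generic and conv_bufrl2_generic differ only in that codes 3 and 4 are swapped. A minimal sketch of that remapping, using a plain dictionary, might look as follows. The names BUFR_TO_SAT_ID, SAT_NAME and to_internal_sat_id are hypothetical and are not taken from the package.

```python
# Mapping derived from the two encoding lists: only Metop-A and Metop-B swap.
BUFR_TO_SAT_ID = {1: 1, 2: 2, 3: 4, 4: 3, 5: 5}

# Internal sat_id to satellite name, as given in the internal encoding list.
SAT_NAME = {1: "ERS 1", 2: "ERS 2", 3: "Metop-A", 4: "Metop-B", 5: "Metop-C"}


def to_internal_sat_id(spacecraft_ids):
    """Convert a sequence of BUFR Spacecraft_id codes to internal sat_id codes."""
    return [BUFR_TO_SAT_ID[code] for code in spacecraft_ids]


internal = to_internal_sat_id([3, 4, 5])
names = [SAT_NAME[code] for code in internal]
```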

ascat.read_native.cdr module

ascat.read_native.eps_native module

ascat.read_native.generate_test_data module

ascat.read_native.hdf5 module

ascat.read_native.nc module

Readers for ASCAT Level 1b and Level 2 data in NetCDF format.

class ascat.read_native.nc.AscatL1bNcFile(filename, **kwargs)[source]

Bases: AscatFile

Read ASCAT Level 1b file in NetCDF format.

class ascat.read_native.nc.AscatL1bNcFileGeneric(filename, **kwargs)[source]

Bases: AscatL1bNcFile

The same as AscatL1bNcFile but with generic=True by default.

class ascat.read_native.nc.AscatL2NcFile(filename, **kwargs)[source]

Bases: AscatFile

Read ASCAT Level 2 file in NetCDF format.

class ascat.read_native.nc.AscatL2NcFileGeneric(filename, **kwargs)[source]

Bases: AscatL2NcFile

The same as AscatL2NcFile but with generic=True by default.

class ascat.read_native.nc.AscatSsmNcSwathFile(filename)[source]

Bases: AscatFile

Class reading ASCAT Surface Soil Moisture NetCDF swath file.

class ascat.read_native.nc.AscatSsmNcSwathFileList(path, filename_template=None, subfolder_template=None, sat='?', cls_kwargs=None)[source]

Bases: ChronFiles

Class reading ASCAT Surface Soil Moisture NetCDF swath file list.

iter_daterange(start_date, end_date)[source]

Generator returning filenames between start and end date.

Parameters:
  • start_date (datetime) – Start date.

  • end_date (datetime) – End date.

Yields:

filename (str) – Filename.

read_date(timestamp)[source]

Read data for given timestamp.

Parameters:

timestamp (datetime) – Date.

Returns:

data – Data.

Return type:

xarray.Dataset

read_period(start_dt, end_dt, delta_dt=datetime.timedelta(seconds=3600), buffer_dt=datetime.timedelta(seconds=3600), **kwargs)[source]

Read data for given interval.

Parameters:
  • start_dt (datetime) – Start datetime.

  • end_dt (datetime) – End datetime.

  • delta_dt (timedelta, optional) – Time delta used to jump through search date.

  • buffer_dt (timedelta, optional) – Search buffer used to find files that could contain data for the interval but would otherwise be left out because they start before start_dt.

Returns:

data – Data stored in file.

Return type:

dict, numpy.ndarray
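One way to read the interplay of delta_dt and buffer_dt is as a stepping loop over search timestamps that starts buffer_dt before start_dt. The sketch below illustrates that reading only. It is an assumption about the behaviour, not the package's implementation, and search_timestamps is a hypothetical helper.

```python
from datetime import datetime, timedelta


def search_timestamps(start_dt, end_dt,
                      delta_dt=timedelta(hours=1),
                      buffer_dt=timedelta(hours=1)):
    """List the timestamps at which files would be searched (assumed logic)."""
    current = start_dt - buffer_dt  # start early to catch files overlapping start_dt
    stamps = []
    while current <= end_dt:
        stamps.append(current)
        current += delta_dt
    return stamps


stamps = search_timestamps(datetime(2020, 1, 1, 12), datetime(2020, 1, 1, 14))
```

With the default one-hour values, the search here covers 11:00, 12:00, 13:00 and 14:00.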

search_date(timestamp, **kwargs)[source]

Search date.

Parameters:

timestamp (datetime) – Date.

Returns:

filenames – Filenames.

Return type:

list

ascat.read_native.nc.read_nc(filename, generic, to_xarray, skip_fields, gen_fields_lut)[source]

Read NetCDF file.

Parameters:
  • filename (str) – Filename.

  • generic (bool) – If True, read and convert the data into the generic format; if False, keep the original field names.

  • to_xarray (bool) – If True, return data as xarray.Dataset; if False, return data as numpy.ndarray.

  • skip_fields (list) – Variables to skip.

  • gen_fields_lut (dict) – Conversion look-up table for generic names.

Returns:

  • data (xarray.Dataset or numpy.ndarray) – ASCAT data.

  • metadata (dict) – Metadata.
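The roles of skip_fields and gen_fields_lut can be sketched on a plain dictionary. This is an illustration under assumptions: the look-up table is taken to map original names directly to generic names, and apply_generic_names and the field names are made up for the example.

```python
def apply_generic_names(data, skip_fields, gen_fields_lut):
    """Drop skipped fields and rename the rest via a look-up table.

    Fields missing from the look-up table keep their original name.
    """
    converted = {}
    for name, values in data.items():
        if name in skip_fields:
            continue
        converted[gen_fields_lut.get(name, name)] = values
    return converted


raw = {"latitude": [48.2], "longitude": [16.4], "internal_counter": [7]}
lut = {"latitude": "lat", "longitude": "lon"}  # hypothetical generic names
generic = apply_generic_names(raw, skip_fields=["internal_counter"], gen_fields_lut=lut)
```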

ascat.read_native.ragged_array_ts module

class ascat.read_native.ragged_array_ts.CRANcFile(filename, row_var='row_size', **kwargs)[source]

Bases: RAFile

Contiguous ragged array file reader.

property ids

Location IDs property.

Returns:

location_id – Location IDs.

Return type:

numpy.ndarray

property lats

Latitude coordinates property.

Returns:

lat – Latitude coordinates.

Return type:

numpy.ndarray

property lons

Longitude coordinates property.

Returns:

lon – Longitude coordinates.

Return type:

numpy.ndarray

read(location_id, variables=None)[source]

Read a timeseries for a given location_id.

Parameters:
  • location_id (int) – Location_id to read.

  • variables (list or None) – A list of parameter names to read. If None, all parameters are read. The default is None.

Returns:

df – A pandas.DataFrame containing the timeseries for the location_id.

Return type:

pandas.DataFrame
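In a contiguous ragged array, the observations of each location are stored back to back and a row_size variable records how many belong to each location. Reading one location then amounts to slicing between cumulative sums of row_size. The following sketch uses plain lists, and read_contiguous is a hypothetical helper rather than the class method itself.

```python
from itertools import accumulate


def read_contiguous(location_ids, row_size, values, location_id):
    """Return the observations of one location from a contiguous ragged array."""
    ends = list(accumulate(row_size))      # exclusive end index per location
    idx = location_ids.index(location_id)  # position of the requested location
    start = ends[idx] - row_size[idx]
    return values[start:ends[idx]]


location_ids = [101, 102, 103, 104]
row_size = [3, 2, 1, 2]
values = [1, 2, 3, 4, 5, 6, 7, 8]
series = read_contiguous(location_ids, row_size, values, 102)
```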

read_2d(variables=None)[source]

(Draft!) Read all time series into 2d array.

1d data: 1, 2, 3, 4, 5, 6, 7, 8
row_size: 3, 2, 1, 2
2d data:
1 2 3 0 0
4 5 0 0 0
6 0 0 0 0
7 8 0 0 0
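The padding scheme in the read_2d example can be reproduced with a few lines of plain Python. This is a sketch of the idea only: ragged_to_2d is a hypothetical helper, and the width of five columns is passed explicitly because the example does not say how the width is chosen.

```python
def ragged_to_2d(values, row_size, n_cols=None, fill=0):
    """Expand a contiguous ragged array into zero-padded rows of equal width."""
    n_cols = max(row_size) if n_cols is None else n_cols
    rows, start = [], 0
    for size in row_size:
        row = values[start:start + size]
        rows.append(row + [fill] * (n_cols - size))  # pad the row on the right
        start += size
    return rows


table = ragged_to_2d([1, 2, 3, 4, 5, 6, 7, 8], [3, 2, 1, 2], n_cols=5)
```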

class ascat.read_native.ragged_array_ts.CellFileCollection(path, ioclass, ioclass_kws=None, dir_name_format='{date1}_{date2}', dir_date_format='%Y%m%d%H%M%S')[source]

Bases: object

Collection of grid cell files.

Represents a collection of grid cell files that live in the same directory, and contains methods to read data from them.

property cells_in_collection

Return a list of the cells in the collection.

Returns:

List of cells in the collection.

Return type:

list of int

close()[source]

Close file.

create_cell_lookup(out_cell_size)[source]

Create a lookup table self.cell_lut mapping a new cell-size grid to the existing one.

Format of the table is a dictionary, where the keys are the cell numbers in the new cell-size grid, and the values are the cell numbers in the old cell-size grid which the new cell overlaps.

Parameters:

out_cell_size (int) – Cell size of the new grid.
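The look-up table described for create_cell_lookup can be sketched for a regular latitude/longitude cell grid. The sketch assumes integer cell sizes in degrees that divide 180, and a cell numbering that starts in the south-west corner and increases northwards first (cell = lon_index * cells_per_column + lat_index). Both are assumptions for illustration, and create_cell_lookup_sketch is not the package method.

```python
def create_cell_lookup_sketch(old_cell_size, out_cell_size):
    """Map each cell of the new grid to the old-grid cells it overlaps."""
    n_lat_old = 180 // old_cell_size
    n_lat_new = 180 // out_cell_size
    n_lon_new = 360 // out_cell_size
    lut = {}
    for lon_idx in range(n_lon_new):
        for lat_idx in range(n_lat_new):
            new_cell = lon_idx * n_lat_new + lat_idx
            # Extent of the new cell, in degrees offset from (-180, -90).
            lon_lo, lon_hi = lon_idx * out_cell_size, (lon_idx + 1) * out_cell_size
            lat_lo, lat_hi = lat_idx * out_cell_size, (lat_idx + 1) * out_cell_size
            # Old-grid index ranges touched by that extent.
            old_lons = range(lon_lo // old_cell_size, (lon_hi - 1) // old_cell_size + 1)
            old_lats = range(lat_lo // old_cell_size, (lat_hi - 1) // old_cell_size + 1)
            lut[new_cell] = [lo * n_lat_old + la for lo in old_lons for la in old_lats]
    return lut


cell_lut = create_cell_lookup_sketch(old_cell_size=5, out_cell_size=10)
```

Under these assumptions, each 10 degree cell overlaps four 5 degree cells, so the first entry maps new cell 0 to old cells 0, 1, 36 and 37.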

property date_range

Return the start and end date of the collection based on its directory name.

classmethod from_product_id(collections, product_id, ioclass_kws=None)[source]

Create a CellFileCollection based on a product_id.

Returns a CellFileCollection object initialized with an io_class specified by product_id (case-insensitive).

Parameters:
  • collections (list of str or Path) – A path to a cell file collection or a list of paths to cell file collections, or a list of CellFileCollection.

  • product_id (str) – ASCAT ID of the cell file collections.

  • ioclass_kws (dict, optional) – Keyword arguments to pass to the ioclass initialization.

Raises:

ValueError – If product_id is not recognized.

get_cell_path(cell=None, location_id=None)[source]

Get path to cell file given cell number or location id.

Returns a path to a cell file in the collection’s directory, whether the file exists or not, as long as the cell number or location id is within the grid.

Parameters:
  • cell (int, optional) – Cell number.

  • location_id (int, optional) – Location identifier.

Returns:

path – Path to cell file.

Return type:

pathlib.Path

Raises:
  • ValueError – If neither cell nor location_id is given.

  • ValueError – If the given cell number or location_id is not within the grid.
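The behaviour described for get_cell_path, returning a path whether or not the file exists and raising ValueError for invalid input, can be sketched with pathlib. The file name format and the cell range 0 to 2591 are taken from the cell classes in ascat.read_native.xarray_io. The helper cell_path_sketch is hypothetical and leaves out the location_id lookup, which needs a grid object.

```python
from pathlib import Path

FN_FORMAT = "{:04d}.nc"         # fn_format of the cell IO classes
MIN_CELL, MAX_CELL = 0, 2591    # min_cell and max_cell of the cell IO classes


def cell_path_sketch(root, cell=None):
    """Build the path of a cell file without checking that it exists."""
    if cell is None:
        raise ValueError("a cell number must be given")
    if not MIN_CELL <= cell <= MAX_CELL:
        raise ValueError(f"cell {cell} is not within the grid")
    return Path(root) / FN_FORMAT.format(cell)


path = cell_path_sketch("/data/h129", cell=29)
```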

read(cell=None, location_id=None, coords=None, bbox=None, geom=None, mask_and_scale=True, date_range=None, **kwargs)[source]

Read data from the collection for a cell, location_id, or set of coordinates.

Parameters:
  • cell (int) – Grid cell number to read.

  • location_id (int) – Location id.

  • coords (tuple) – Tuple of (lat, lon) coordinates.

  • bbox (tuple) – Tuple of (latmin, latmax, lonmin, lonmax) coordinates.

  • mask_and_scale (bool, optional) – If True, mask and scale the data according to its scale_factor and _FillValue/missing_value before returning. Default: True.

  • **kwargs (dict) – Keyword arguments passed to the ioclass.

Returns:

Dataset containing the data for the given cell, location_id, or coordinates.

Return type:

xarray.Dataset

Raises:

ValueError – If neither cell, location_id, nor coords is given.

to_contiguous(out_dir, out_cell_size, processes=8)[source]

class ascat.read_native.ragged_array_ts.CellFileCollectionStack(collections, ioclass, dupe_window=None, dask_scheduler='threads', **kwargs)[source]

Bases: object

Collection of grid cell file collections.

add_collection(collections, product_id=None)[source]

Add a cell file collection to the stack, based on file path.

Parameters:
  • collections (str or list of str or CellFileCollection) – Path to the cell file collection to add, or a list of paths.

  • product_id (str, optional) – ASCAT ID of the collections to add. Needed if collections is a string or list of strings.

Raises:

ValueError – If collections is a string or list of strings and product_id is not given.

close()[source]

Close all the collections.

classmethod from_product_id(collections, product_id, dupe_window=None, dask_scheduler=None)[source]

Create a CellFileCollectionStack based on a product_id.

Returns a CellFileCollectionStack object initialized with an io_class specified by product_id (case-insensitive).

Parameters:
  • collections (list of str or CellFileCollection) – A path to a cell file collection or a list of paths to cell file collections, or a list of CellFileCollection.

  • product_id (str) – ASCAT ID of the cell file collections. Either this or ioclass must be specified.

  • dupe_window (numpy.timedelta64) – Time difference between two observations at the same location_id below which the second observation will be considered a duplicate. Will be set to np.timedelta64(“10”, “m”) if None. Default: None

  • dask_scheduler (str, optional) – Dask scheduler to use for parallel processing. Will be set to “threads” when class is initialized if None. Default: None
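The effect of dupe_window can be illustrated with a small duplicate filter over (location_id, time) pairs. The sketch uses datetime.timedelta instead of numpy.timedelta64 to stay dependency-free, and it compares each observation with the last one kept at the same location. Both choices are assumptions, and drop_duplicates is a hypothetical helper.

```python
from datetime import datetime, timedelta


def drop_duplicates(observations, dupe_window=timedelta(minutes=10)):
    """Drop observations closer than dupe_window to the last kept one per location."""
    last_kept = {}
    kept = []
    for loc, time in sorted(observations):
        previous = last_kept.get(loc)
        if previous is not None and time - previous < dupe_window:
            continue  # second observation inside the window counts as a duplicate
        last_kept[loc] = time
        kept.append((loc, time))
    return kept


observations = [
    (1, datetime(2020, 1, 1, 12, 0)),
    (1, datetime(2020, 1, 1, 12, 5)),   # within 10 minutes of the first one
    (1, datetime(2020, 1, 1, 12, 30)),
    (2, datetime(2020, 1, 1, 12, 5)),   # different location, kept
]
unique = drop_duplicates(observations)
```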

merge_and_write(out_dir, cells=None, date_range=None, out_cell_size=None, processes=8)[source]

Merge the data in all the collections by cell, and write each cell to disk.

Parameters:
  • out_dir (str or Path) – Path to output directory.

  • cells (list of int, optional) – Cells to write. If None, write all cells.

  • date_range (tuple of numpy.datetime64, optional) – Start and end dates to read data for before writing.

  • out_cell_size (tuple, optional) – Size of the output cells in degrees (assumes they are square). If None, and the component collections all have the same cell size, use that.

  • processes (int, optional) – Number of processes to use for parallel processing. Default: 8

Raises:

ValueError – If out_cell_size is None and the component collections do not all have the same cell size.

read(cell=None, location_id=None, bbox=None, geom=None, mask_and_scale=True, date_range=None, **kwargs)[source]

Read data for a cell or location_id.

Parameters:
  • cell (int) – Cell number to read data for.

  • location_id (int) – Location ID to read data for.

  • bbox (tuple) – Tuple of (latmin, latmax, lonmin, lonmax) coordinates to read data within.

  • mask_and_scale (bool, optional) – If True, mask and scale the data according to its scale_factor and _FillValue/missing_value before returning. Default: True.

  • date_range (tuple of numpy.datetime64, optional) – Start and end dates to read data for.

  • **kwargs (dict) – Keyword arguments to pass to the read function of the collection

Returns:

Dataset containing the combined data for the given cell or location_id from all the collections in the stack.

Return type:

xarray.Dataset

Raises:

ValueError – If neither cell nor location_id is given.

subcollection_cells(cells=None, out_cell_size=None, date_range=None)[source]

Get the cells that are covered by all the subcollections. If out_cell_size is passed, then it returns the cells in the new cell-scheme that are covered by the subcollections.

Parameters:
  • cells (list of int, optional) – Cells to check. If None, check all cells.

  • out_cell_size (int, optional) – The size of the cells in the new cell-scheme.

Returns:

Cells covered by all subcollections.

Return type:

set

class ascat.read_native.ragged_array_ts.IRANcFile(filename, **kwargs)[source]

Bases: RAFile

Indexed ragged array file reader.

property ids

Location IDs property.

Returns:

location_id – Location IDs.

Return type:

numpy.ndarray

property lats

Latitude coordinates property.

Returns:

lat – Latitude coordinates.

Return type:

numpy.ndarray

property lons

Longitude coordinates property.

Returns:

lon – Longitude coordinates.

Return type:

numpy.ndarray

read(location_id, variables=None)[source]

Read a timeseries for a given location_id.

Parameters:
  • location_id (int) – Location_id to read.

  • variables (list or None) – A list of parameter names to read. If None, all parameters are read. The default is None.

Returns:

df – A pandas.DataFrame containing the timeseries for the location_id.

Return type:

pandas.DataFrame

class ascat.read_native.ragged_array_ts.RAFile(loc_dim_name='locations', obs_dim_name='time', loc_ids_name='location_id', loc_descr_name='location_description', time_units='days since 1900-01-01 00:00:00', time_var='time', lat_var='lat', lon_var='lon', alt_var='alt', cache=False, mask_and_scale=False)[source]

Bases: object

Base class used for Ragged Array (RA) time series data.

class ascat.read_native.ragged_array_ts.SwathFileCollection(path, ioclass, ioclass_kws=None, dask_scheduler=None)[source]

Bases: object

Collection of time-series swath files.

Parameters:
  • path (str or Path) – Path to the swath file collection.

  • ioclass (ascat.read_native.xarray_io.SwathIOBase) – IO class to use for reading the data.

  • ioclass_kws (dict, optional) – Keyword arguments to pass to the ioclass initialization. Default: None

  • dask_scheduler (str, optional) – Dask scheduler to use for parallel processing in xarray. In testing this just made most things slower, but it may be useful in some cases. Default: None

path

Path to the swath file collection.

Type:

Path

ioclass

IO class to use for reading the data.

Type:

class

ioclass_kws

Keyword arguments to pass to the ioclass initialization. May include ioclass attributes that will override any that are set in the current ioclass.

Type:

dict

grid

Grid object defining the grid the data is on.

Type:

pygeogrids.CellGrid object

ts_dtype

Data types to encode the time series data as when writing.

Type:

numpy.dtype

beams_vars

List of names of the variables that have a beams dimension.

Type:

list of str

date_format

Format of the date in the filename.

Type:

str

cell_fn_format

Format for the names of the cell files that will be written out.

Type:

str

chron_files

Function to search for files in the collection based on their date.

Type:

function

previous_cell

Type:

int or list of int

fid

The currently open instance of self.ioclass.

Type:

ascat.read_native.xarray_io.SwathIOBase object

max_buffer_memory_mb

Maximum amount of memory to use for buffering data when stacking to disk.

Type:

int

close()[source]

Close collection and constituent xarray datasets.

classmethod from_product_id(path, product_id, ioclass_kws=None, dask_scheduler=None)[source]

Create a SwathFileCollection based on a product_id.

Returns a SwathFileCollection object initialized with an io_class specified by product_id (case-insensitive).

Parameters:
  • path (str or Path) – Path to the swath file collection.

  • product_id (str) – Identifier for the specific ASCAT product the swath files are part of.

  • ioclass_kws (dict, optional) – Keyword arguments to pass to the ioclass initialization. Default: None

  • dask_scheduler (str, optional) – Dask scheduler to use for parallel processing. Will be set to “threads” when class is initialized if None. Default: None

Raises:

ValueError – If product_id is not recognized.

Examples

>>> my_swath_collection = SwathFileCollection.from_product_id(
...     "/path/to/swath/files",
...     "H129",
... )

get_filenames(start_dt=None, end_dt=None, cell=None, location_id=None, coords=None, bbox=None, geom=None)[source]

Get filenames for the given time range.

Parameters:
Returns:

fnames – List of filenames.

Return type:

list of pathlib.Path

Raises:

NotImplementedError – If the ioclass does not have a file search method named chron_files.

process(data)[source]

Process a stacked dataset of swath data into a format that is ready to be split into cell timeseries datasets, and return the processed dataset.

Parameters:

data (xarray.Dataset) – Stacked dataset to process.

read(date_range, cell=None, location_id=None, coords=None, bbox=None, geom=None, **kwargs)[source]

Read data for the given date range and selection. A grid point index (gpi) given as location_id is read directly; if (lat, lon) coordinates are given instead, the nearest gpi is found first and then read.

If the time range is large, this can be slow. It may make more sense to convert to cell files first and access that data from disk using a CellFileCollection or CellFileCollectionStack.

Parameters:
  • date_range (tuple of datetime.datetime) – Start and end dates.

  • cell (int or list of int, optional) – Grid cell number to read.

  • location_id (int, optional) – Location id.

  • coords (tuple, optional) – Tuple of (lat, lon) coordinates.

  • bbox (tuple, optional) – Tuple of (latmin, latmax, lonmin, lonmax) coordinates.

  • geom (shapely.geometry, optional) – Geometry object; use to read data that intersects the geometry.

stack(out_dir, fnames=None, date_range=None, mode='w', processes=1, buffer_memory_mb=None, dupe_window=None)[source]

Stack swath files and split them into cell timeseries files.

Reads swath files into memory, stacking their datasets in a buffer until the sum of their sizes exceeds self.max_buffer_memory_mb. Then, splits the buffer into cell timeseries datasets, writes them to disk in parallel, and clears the buffer. This process repeats until all files have been processed, with subsequent writes appending new data to existing cell files when appropriate.

Parameters:
  • out_dir (pathlib.Path) – Output directory to write the stacked files to.

  • fnames (list of pathlib.Path, optional) – List of swath filenames to stack.

  • date_range (tuple of datetime.datetime) – Start and end dates to read data for before writing.

  • mode (str, optional) – Write mode. Default is “w”, which will clear all files from out_dir before processing. Use “a” to append data to existing files (only if those have also been produced by this function).

  • processes (int, optional) – Number of processes to use for parallel writing. Default is 1.

  • buffer_memory_mb (numeric, optional) – Maximum amount of memory to use for the buffer, in megabytes. Will be set to self.max_buffer_memory_mb if None. Default is None.

  • dupe_window (numpy.timedelta64, optional) – Time window within which duplicate observations will be removed. Default is None.

Raises:

ValueError – If mode is not “w” or “a”.
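The buffering loop described for stack can be sketched without any file IO: items are added to a buffer until their combined size exceeds the limit, the buffer is flushed as one batch, and the loop continues. The sketch below only groups file names by size. The function stack_in_batches and the sizes are made up for illustration, and the real method additionally splits each batch into cells and writes them to disk.

```python
def stack_in_batches(files_with_size_mb, max_buffer_memory_mb):
    """Group files into batches, flushing once the buffered size exceeds the limit."""
    batches, buffer, buffered_mb = [], [], 0.0
    for name, size_mb in files_with_size_mb:
        buffer.append(name)
        buffered_mb += size_mb
        if buffered_mb > max_buffer_memory_mb:
            batches.append(buffer)         # stands in for "split and write to disk"
            buffer, buffered_mb = [], 0.0  # clear the buffer
    if buffer:
        batches.append(buffer)             # flush whatever is left at the end
    return batches


files = [("a.nc", 40), ("b.nc", 40), ("c.nc", 40), ("d.nc", 10)]
batches = stack_in_batches(files, max_buffer_memory_mb=100)
```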

swath_data_generator(start_dt=None, end_dt=None, cell=None, location_id=None, coords=None, bbox=None, geom=None)[source]

Return a generator producing the data for each requested swath file.

Parameters:
  • start_dt (datetime.datetime) – Start time.

  • end_dt (datetime.datetime) – End time.

  • cell (int) – Grid cell number to select.

  • location_id (int) – Location id.

  • coords (tuple) – Tuple of (lat, lon) coordinates.

  • bbox (tuple) – Tuple of (latmin, latmax, lonmin, lonmax) coordinates.

  • geom (shapely.geometry) – Geometry object; use to select data that intersects the geometry.

Yields:
  • start_timestamp (numpy.datetime64) – Sensing start time of the swath file.

  • end_timestamp (numpy.datetime64) – Sensing end time of the swath file.

  • sat (str) – Satellite name.

  • data (xarray.Dataset) – Dataset for each swath file intersecting the requested extent.

ascat.read_native.ragged_array_ts.braces_to_re_groups(string)[source]

Convert braces to character patterns defining regular expression groups. If any group name is repeated in the template string, a backreference is used for subsequent appearances.

Parameters:

string (str) – String with braces.

Returns:

string – String with regular expression groups.

Return type:

str
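The conversion from brace templates to named groups with backreferences can be approximated with re.sub and a replacement callback. This is a re-implementation sketch written for illustration, not the package source, and it assumes group names consist of word characters only.

```python
import re


def braces_to_re_groups_sketch(template):
    """Turn {name} fields into named regex groups, reusing repeats as backreferences."""
    seen = set()

    def replace(match):
        name = match.group(1)
        if name in seen:
            return f"(?P={name})"   # repeated name: backreference
        seen.add(name)
        return f"(?P<{name}>.+)"    # first appearance: named group

    return re.sub(r"\{(\w+)\}", replace, template)


pattern = braces_to_re_groups_sketch("{year}-{month}-{day}")
fields = re.fullmatch(pattern, "2020-01-02").groupdict()
```

The resulting pattern can be passed to re.fullmatch to pull the named fields out of a directory or file name.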

Examples

>>> braces_to_re_groups("{year}-{month}-{day}")
"(?P<year>.+)-(?P<month>.+)-(?P<day>.+)"
>>> braces_to_re_groups("{year}-{month}-{day}_{year}-{month}-{day2}")
"(?P<year>.+)-(?P<month>.+)-(?P<day>.+)_(?P=year)-(?P=month)-(?P<day2>.+)"

ascat.read_native.ragged_array_ts.vrange(starts, stops)[source]

Create concatenated ranges of integers for multiple start/stop values.

Parameters:
Returns:

ranges – Concatenated ranges.

Return type:

numpy.ndarray

Example

>>> starts = [1, 3, 4, 6]
>>> stops  = [1, 5, 7, 6]
>>> vrange(starts, stops)
array([3, 4, 4, 5, 6])
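The same result can be reproduced with a plain-Python equivalent, which may help when checking the expected output by hand. This sketch returns a list rather than a numpy.ndarray, and vrange_sketch is a hypothetical name.

```python
from itertools import chain


def vrange_sketch(starts, stops):
    """Concatenate range(start, stop) for each start/stop pair."""
    return list(chain.from_iterable(range(a, b) for a, b in zip(starts, stops)))


ranges = vrange_sketch([1, 3, 4, 6], [1, 5, 7, 6])
```

Pairs with equal start and stop contribute nothing, which is why the first and last pairs leave no trace in the output.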

ascat.read_native.xarray_io module

class ascat.read_native.xarray_io.AscatH121v1Cell(filename, **kwargs)[source]

Bases: AscatNetCDFCellBase

fn_format = '{:04d}.nc'
grid = <fibgrid.realization.FibGrid object>
grid_cell_size = 5
grid_info = {'grid': <fibgrid.realization.FibGrid object>, 'max_cell': np.int16(2591), 'min_cell': np.int16(0), 'possible_cells': array([   0,    1,    2, ..., 2589, 2590, 2591],       shape=(2592,), dtype=int16)}
max_cell = np.int16(2591)
min_cell = np.int16(0)
possible_cells = array([   0,    1,    2, ..., 2589, 2590, 2591],       shape=(2592,), dtype=int16)

class ascat.read_native.xarray_io.AscatH121v1Swath(filename, **kwargs)[source]

Bases: SwathIOBase

beams_vars = []
cell_fn_format = '{:04d}.nc'
date_format = '%Y%m%d%H%M%S'
fn_pattern = 'W_IT-HSAF-ROME,SAT,SSM-ASCAT-METOP{sat}-12.5km-H121_C_LIIB_{placeholder}_{placeholder1}_{date}____.nc'
static fn_read_fmt(timestamp)[source]

TODO: figure out a sane way to describe what this does. Also decide if this /needs/ to be enforced. If the user doesn’t want to use all the filesearch functionality (or if they want to use their own filesearch logic), then they should still be able to use this class. They could of course override this and just return None, but that seems like a hack.

grid = <fibgrid.realization.FibGrid object>
grid_cell_size = 5
grid_sampling_km = 12.5
sf_pattern = {'satellite_folder': 'metop_[abc]', 'year_folder': '{year}'}
static sf_read_fmt(timestamp)[source]

TODO: same as above

ts_dtype = dtype([('sat_id', 'i1'), ('as_des_pass', 'i1'), ('swath_indicator', 'i1'), ('surface_soil_moisture', '<f4'), ('surface_soil_moisture_noise', '<f4'), ('backscatter40', '<f4'), ('slope40', '<f4'), ('curvature40', '<f4'), ('surface_soil_moisture_sensitivity', '<f4'), ('backscatter_flag', 'u1'), ('correction_flag', 'u1'), ('processing_flag', 'u1'), ('surface_flag', 'u1'), ('snow_cover_probability', 'i1'), ('frozen_soil_probability', 'i1'), ('wetland_fraction', 'i1'), ('topographic_complexity', 'i1'), ('subsurface_scattering_probability', 'i1')])
class ascat.read_native.xarray_io.AscatH122Cell(filename, **kwargs)[source]

Bases: AscatNetCDFCellBase

fn_format = '{:04d}.nc'
grid = <fibgrid.realization.FibGrid object>
grid_cell_size = 5
grid_info = {'grid': <fibgrid.realization.FibGrid object>, 'max_cell': np.int16(2591), 'min_cell': np.int16(0), 'possible_cells': array([   0,    1,    2, ..., 2589, 2590, 2591],       shape=(2592,), dtype=int16)}
max_cell = np.int16(2591)
min_cell = np.int16(0)
possible_cells = array([   0,    1,    2, ..., 2589, 2590, 2591],       shape=(2592,), dtype=int16)

class ascat.read_native.xarray_io.AscatH122Swath(filename, **kwargs)[source]

Bases: SwathIOBase

beams_vars = []
cell_fn_format = '{:04d}.nc'
date_format = '%Y%m%d%H%M%S'
fn_pattern = 'ascat_ssm_nrt_6.25km_{placeholder}Z_{date}Z_metop-{sat}_h122.nc'
static fn_read_fmt(timestamp)[source]

TODO: figure out a sane way to describe what this does. Also decide if this /needs/ to be enforced. If the user doesn’t want to use all the filesearch functionality (or if they want to use their own filesearch logic), then they should still be able to use this class. They could of course override this and just return None, but that seems like a hack.

grid = <fibgrid.realization.FibGrid object>
grid_cell_size = 5
grid_sampling_km = 6.25
sf_pattern = {'satellite_folder': 'metop_[abc]', 'year_folder': '{year}'}
static sf_read_fmt(timestamp)[source]

TODO: same as above

ts_dtype = dtype([('sat_id', '<i8'), ('as_des_pass', 'i1'), ('swath_indicator', 'i1'), ('surface_soil_moisture', '<f4'), ('surface_soil_moisture_noise', '<f4'), ('sigma40', '<f4'), ('sigma40_noise', '<f4'), ('slope40', '<f4'), ('slope40_noise', '<f4'), ('curvature40', '<f4'), ('curvature40_noise', '<f4'), ('dry40', '<f4'), ('dry40_noise', '<f4'), ('wet40', '<f4'), ('wet40_noise', '<f4'), ('surface_soil_moisture_sensitivity', '<f4'), ('surface_soil_moisture_climatology', '<f4'), ('correction_flag', 'u1'), ('processing_flag', 'u1'), ('snow_cover_probability', 'i1'), ('frozen_soil_probability', 'i1'), ('wetland_fraction', 'i1'), ('topographic_complexity', 'i1')])
class ascat.read_native.xarray_io.AscatH129Cell(filename, **kwargs)[source]

Bases: AscatNetCDFCellBase

fn_format = '{:04d}.nc'
grid = <fibgrid.realization.FibGrid object>
grid_cell_size = 5
grid_info = {'grid': <fibgrid.realization.FibGrid object>, 'max_cell': np.int16(2591), 'min_cell': np.int16(0), 'possible_cells': array([   0,    1,    2, ..., 2589, 2590, 2591],       shape=(2592,), dtype=int16)}
max_cell = np.int16(2591)
min_cell = np.int16(0)
possible_cells = array([   0,    1,    2, ..., 2589, 2590, 2591],       shape=(2592,), dtype=int16)

class ascat.read_native.xarray_io.AscatH129Swath(filename, **kwargs)[source]

Bases: SwathIOBase

beams_vars = ['backscatter', 'incidence_angle', 'azimuth_angle', 'kp']
cell_fn_format = '{:04d}.nc'
date_format = '%Y%m%d%H%M%S'
fn_pattern = 'W_IT-HSAF-ROME,SAT,SSM-ASCAT-METOP{sat}-6.25-H129_C_LIIB_{date}_{placeholder}_{placeholder1}____.nc'
static fn_read_fmt(timestamp)[source]

TODO: figure out a sane way to describe what this does. Also decide if this /needs/ to be enforced. If the user doesn’t want to use all the filesearch functionality (or if they want to use their own filesearch logic), then they should still be able to use this class. They could of course override this and just return None, but that seems like a hack.

grid = <fibgrid.realization.FibGrid object>
grid_cell_size = 5
grid_sampling_km = 6.25
sf_pattern = {'satellite_folder': 'metop_[abc]', 'year_folder': '{year}'}
static sf_read_fmt(timestamp)[source]

TODO: same as above

ts_dtype = dtype([('sat_id', 'i1'), ('as_des_pass', 'i1'), ('swath_indicator', 'i1'), ('backscatter_for', '<f4'), ('backscatter_mid', '<f4'), ('backscatter_aft', '<f4'), ('incidence_angle_for', '<f4'), ('incidence_angle_mid', '<f4'), ('incidence_angle_aft', '<f4'), ('azimuth_angle_for', '<f4'), ('azimuth_angle_mid', '<f4'), ('azimuth_angle_aft', '<f4'), ('kp_for', '<f4'), ('kp_mid', '<f4'), ('kp_aft', '<f4'), ('surface_soil_moisture', '<f4'), ('surface_soil_moisture_noise', '<f4'), ('backscatter40', '<f4'), ('slope40', '<f4'), ('curvature40', '<f4'), ('surface_soil_moisture_sensitivity', '<f4'), ('correction_flag', 'u1'), ('processing_flag', 'u1'), ('surface_flag', 'u1'), ('snow_cover_probability', 'i1'), ('frozen_soil_probability', 'i1'), ('wetland_fraction', 'i1'), ('topographic_complexity', 'i1')])
class ascat.read_native.xarray_io.AscatH129v1Cell(filename, **kwargs)[source]

Bases: AscatNetCDFCellBase

fn_format = '{:04d}.nc'
grid = <fibgrid.realization.FibGrid object>
grid_cell_size = 5
grid_info = {'grid': <fibgrid.realization.FibGrid object>, 'max_cell': np.int16(2591), 'min_cell': np.int16(0), 'possible_cells': array([   0,    1,    2, ..., 2589, 2590, 2591],       shape=(2592,), dtype=int16)}
max_cell = np.int16(2591)
min_cell = np.int16(0)
possible_cells = array([   0,    1,    2, ..., 2589, 2590, 2591],       shape=(2592,), dtype=int16)

class ascat.read_native.xarray_io.AscatH129v1Swath(filename, **kwargs)[source]

Bases: SwathIOBase

beams_vars = []
cell_fn_format = '{:04d}.nc'
date_format = '%Y%m%d%H%M%S'
fn_pattern = 'W_IT-HSAF-ROME,SAT,SSM-ASCAT-METOP{sat}-6.25km-H129_C_LIIB_{placeholder}_{placeholder1}_{date}____.nc'
static fn_read_fmt(timestamp)[source]

TODO: figure out a sane way to describe what this does. Also decide if this /needs/ to be enforced. If the user doesn’t want to use all the filesearch functionality (or if they want to use their own filesearch logic), then they should still be able to use this class. They could of course override this and just return None, but that seems like a hack.

grid = <fibgrid.realization.FibGrid object>
grid_cell_size = 5
grid_sampling_km = 6.25
sf_pattern = {'satellite_folder': 'metop_[abc]', 'year_folder': '{year}'}
static sf_read_fmt(timestamp)[source]

TODO: same as above

ts_dtype = dtype([('sat_id', 'i1'), ('as_des_pass', 'i1'), ('swath_indicator', 'i1'), ('surface_soil_moisture', '<f4'), ('surface_soil_moisture_noise', '<f4'), ('backscatter40', '<f4'), ('slope40', '<f4'), ('curvature40', '<f4'), ('surface_soil_moisture_sensitivity', '<f4'), ('backscatter_flag', 'u1'), ('correction_flag', 'u1'), ('processing_flag', 'u1'), ('surface_flag', 'u1'), ('snow_cover_probability', 'i1'), ('frozen_soil_probability', 'i1'), ('wetland_fraction', 'i1'), ('topographic_complexity', 'i1'), ('subsurface_scattering_probability', 'i1')])
class ascat.read_native.xarray_io.AscatNetCDFCellBase(filename, **kwargs)[source]

Bases: RaggedXArrayCellIOBase

read(date_range=None, location_id=None, mask_and_scale=True)[source]

Read data from netCDF4 file.

Read all or a subset of data from a netCDF4 file, with subset specified by the location_id argument.

Parameters:
  • date_range (tuple of datetime.datetime, optional) – Date range to read data for. If None, all data is read.

  • location_id (int or list of int, optional) – The location_id(s) to read data for. If None, all data is read. Default is None.

  • mask_and_scale (bool, optional) – If True, mask and scale the data according to its scale_factor and _FillValue/missing_value before returning. Default: True.

write(filename, ra_type='indexed', **kwargs)[source]

Write data to a netCDF file.

Parameters:
  • filename (str) – Output filename.

  • ra_type (str, optional) – Type of ragged array to write. Default is “indexed”.

  • **kwargs (dict) – Additional keyword arguments passed to xarray.to_netcdf().
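The two ragged array types differ in how observations are tied to locations: an indexed ragged array stores a location index per observation, while a contiguous ragged array stores the observations grouped by location together with a count per location. A minimal conversion sketch on plain lists follows. The helper indexed_to_contiguous and the variable names are hypothetical and not taken from the package.

```python
def indexed_to_contiguous(location_index, values, n_locations):
    """Convert an indexed ragged array into contiguous form.

    Returns (row_size, sorted_values): the number of observations per
    location and the values grouped by location in their original order.
    """
    # sorted() is stable, so observations keep their order within a location.
    order = sorted(range(len(values)), key=lambda i: location_index[i])
    sorted_values = [values[i] for i in order]
    row_size = [0] * n_locations
    for loc in location_index:
        row_size[loc] += 1
    return row_size, sorted_values


location_index = [0, 2, 0, 1, 2, 0]
values = [10, 30, 11, 20, 31, 12]
row_size, contiguous_values = indexed_to_contiguous(location_index, values, n_locations=3)
```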

class ascat.read_native.xarray_io.AscatSIG0Cell12500m(filename, **kwargs)[source]

Bases: AscatNetCDFCellBase

fn_format = '{:04d}.nc'
grid = <fibgrid.realization.FibGrid object>
grid_cell_size = 5
grid_info = {'grid': <fibgrid.realization.FibGrid object>, 'max_cell': np.int16(2591), 'min_cell': np.int16(0), 'possible_cells': array([   0,    1,    2, ..., 2589, 2590, 2591],       shape=(2592,), dtype=int16)}
max_cell = np.int16(2591)
min_cell = np.int16(0)
possible_cells = array([   0,    1,    2, ..., 2589, 2590, 2591],       shape=(2592,), dtype=int16)
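
The values above are consistent with a regular 5 degree cell partition of the globe: 360/5 × 180/5 = 2592 cells, numbered 0 to 2591, with one file per cell named according to fn_format. A quick check in plain Python (the cell number 29 is an arbitrary example):

```python
grid_cell_size = 5        # degrees, as listed above
fn_format = "{:04d}.nc"   # as listed above

# Number of cells in a regular lon/lat partition with 5 degree cells.
n_cells = (360 // grid_cell_size) * (180 // grid_cell_size)
min_cell, max_cell = 0, n_cells - 1

# File name for an arbitrarily chosen cell number.
example_filename = fn_format.format(29)
```
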
class ascat.read_native.xarray_io.AscatSIG0Cell6250m(filename, **kwargs)[source]

Bases: AscatNetCDFCellBase

fn_format = '{:04d}.nc'
grid = <fibgrid.realization.FibGrid object>
grid_cell_size = 5
grid_info = {'grid': <fibgrid.realization.FibGrid object>, 'max_cell': np.int16(2591), 'min_cell': np.int16(0), 'possible_cells': array([   0,    1,    2, ..., 2589, 2590, 2591],       shape=(2592,), dtype=int16)}
max_cell = np.int16(2591)
min_cell = np.int16(0)
possible_cells = array([   0,    1,    2, ..., 2589, 2590, 2591],       shape=(2592,), dtype=int16)
class ascat.read_native.xarray_io.AscatSIG0Swath12500m(filename, **kwargs)[source]

Bases: SwathIOBase

Class for reading and writing ASCAT sigma0 swath data.

beams_vars = ['backscatter', 'backscatter_std', 'incidence_angle', 'azimuth_angle', 'kp', 'n_echos', 'all_backscatter', 'all_backscatter_std', 'all_incidence_angle', 'all_azimuth_angle', 'all_kp', 'all_n_echos']
cell_fn_format = '{:04d}.nc'
date_format = '%Y%m%d%H%M%S'
fn_pattern = 'W_IT-HSAF-ROME,SAT,SIG0-ASCAT-METOP{sat}-12.5_C_LIIB_{placeholder}_{placeholder1}_{date}____.nc'
static fn_read_fmt(timestamp)[source]

Format a timestamp to search as YYYYMMDD*, for use in a regex that will match all files covering a single given date.

Parameters:

timestamp (datetime.datetime) – Timestamp to format

Returns:

Dictionary of formatted strings

Return type:

dict
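
A minimal sketch of how such a dictionary could be combined with fn_pattern to build a search pattern for one day. The dictionary keys and wildcard values are assumptions derived from the placeholders in fn_pattern, the function name is hypothetical, and the two file names are synthetic (built from the pattern itself), not real product names.

```python
from datetime import datetime
from fnmatch import fnmatchcase

fn_pattern = (
    "W_IT-HSAF-ROME,SAT,SIG0-ASCAT-METOP{sat}-12.5_C_LIIB_"
    "{placeholder}_{placeholder1}_{date}____.nc"
)


def fn_read_fmt_sketch(timestamp):
    """Return wildcard values for every placeholder in fn_pattern (assumed keys)."""
    return {
        "date": timestamp.strftime("%Y%m%d*"),  # YYYYMMDD followed by a wildcard
        "sat": "[ABC]",
        "placeholder": "*",
        "placeholder1": "*",
    }


search = fn_pattern.format(**fn_read_fmt_sketch(datetime(2021, 1, 15)))

# Synthetic file names for the same day and for the following day.
same_day = fn_pattern.format(
    sat="B", placeholder="x", placeholder1="y", date="20210115103000"
)
next_day = fn_pattern.format(
    sat="B", placeholder="x", placeholder1="y", date="20210116010000"
)
matches = [fnmatchcase(name, search) for name in (same_day, next_day)]
```

The sketch uses shell-style matching from fnmatch for brevity; the package's own file search may use a different matching mechanism.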

grid = <fibgrid.realization.FibGrid object>
grid_cell_size = 5
grid_sampling_km = 12.5
sf_pattern = {'satellite_folder': 'metop_[abc]', 'year_folder': '{year}'}
static sf_read_fmt(timestamp)[source]

TODO: document this method, as for fn_read_fmt above.

ts_dtype = dtype([('sat_id', 'i1'), ('as_des_pass', 'i1'), ('swath_indicator', 'i1'), ('backscatter_for', '<f4'), ('backscatter_mid', '<f4'), ('backscatter_aft', '<f4'), ('backscatter_std_for', '<f4'), ('backscatter_std_mid', '<f4'), ('backscatter_std_aft', '<f4'), ('incidence_angle_for', '<f4'), ('incidence_angle_mid', '<f4'), ('incidence_angle_aft', '<f4'), ('azimuth_angle_for', '<f4'), ('azimuth_angle_mid', '<f4'), ('azimuth_angle_aft', '<f4'), ('kp_for', '<f4'), ('kp_mid', '<f4'), ('kp_aft', '<f4'), ('n_echos_for', 'i1'), ('n_echos_mid', 'i1'), ('n_echos_aft', 'i1'), ('all_backscatter_for', '<f4'), ('all_backscatter_mid', '<f4'), ('all_backscatter_aft', '<f4'), ('all_backscatter_std_for', '<f4'), ('all_backscatter_std_mid', '<f4'), ('all_backscatter_std_aft', '<f4'), ('all_incidence_angle_for', '<f4'), ('all_incidence_angle_mid', '<f4'), ('all_incidence_angle_aft', '<f4'), ('all_azimuth_angle_for', '<f4'), ('all_azimuth_angle_mid', '<f4'), ('all_azimuth_angle_aft', '<f4'), ('all_kp_for', '<f4'), ('all_kp_mid', '<f4'), ('all_kp_aft', '<f4'), ('all_n_echos_for', 'i1'), ('all_n_echos_mid', 'i1'), ('all_n_echos_aft', 'i1')])
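
The field names in ts_dtype follow a regular pattern: three bookkeeping fields, then every entry of beams_vars repeated for the fore, mid and aft beams. The sketch below rebuilds an equivalent dtype with NumPy. How the package itself constructs ts_dtype is not shown here, so this only illustrates the naming scheme; the name ts_dtype_sketch is hypothetical.

```python
import numpy as np

beams_vars = [
    "backscatter", "backscatter_std", "incidence_angle", "azimuth_angle",
    "kp", "n_echos", "all_backscatter", "all_backscatter_std",
    "all_incidence_angle", "all_azimuth_angle", "all_kp", "all_n_echos",
]
beams = ("for", "mid", "aft")

fields = [("sat_id", "i1"), ("as_des_pass", "i1"), ("swath_indicator", "i1")]
for var in beams_vars:
    # Echo counts are 1-byte integers, everything else is float32.
    kind = "i1" if var.endswith("n_echos") else "<f4"
    fields.extend((f"{var}_{beam}", kind) for beam in beams)

ts_dtype_sketch = np.dtype(fields)
```
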
class ascat.read_native.xarray_io.AscatSIG0Swath6250m(filename, **kwargs)[source]

Bases: SwathIOBase

Class for reading ASCAT sigma0 swath data and writing it to cells.

beams_vars = ['backscatter', 'backscatter_std', 'incidence_angle', 'azimuth_angle', 'kp', 'n_echos', 'all_backscatter', 'all_backscatter_std', 'all_incidence_angle', 'all_azimuth_angle', 'all_kp', 'all_n_echos']
cell_fn_format = '{:04d}.nc'
date_format = '%Y%m%d%H%M%S'
fn_pattern = 'W_IT-HSAF-ROME,SAT,SIG0-ASCAT-METOP{sat}-6.25_C_LIIB_{placeholder}_{placeholder1}_{date}____.nc'
static fn_read_fmt(timestamp)[source]

Format a timestamp to search as YYYYMMDD*, for use in a regex that will match all files covering a single given date.

Parameters:

timestamp (datetime.datetime) – Timestamp to format

Returns:

Dictionary of formatted strings

Return type:

dict

grid = <fibgrid.realization.FibGrid object>
grid_cell_size = 5
grid_sampling_km = 6.25
sf_pattern = {'satellite_folder': 'metop_[abc]', 'year_folder': '{year}'}
static sf_read_fmt(timestamp)[source]

TODO: document this method, as for fn_read_fmt above.

ts_dtype = dtype([('sat_id', 'i1'), ('as_des_pass', 'i1'), ('swath_indicator', 'i1'), ('backscatter_for', '<f4'), ('backscatter_mid', '<f4'), ('backscatter_aft', '<f4'), ('backscatter_std_for', '<f4'), ('backscatter_std_mid', '<f4'), ('backscatter_std_aft', '<f4'), ('incidence_angle_for', '<f4'), ('incidence_angle_mid', '<f4'), ('incidence_angle_aft', '<f4'), ('azimuth_angle_for', '<f4'), ('azimuth_angle_mid', '<f4'), ('azimuth_angle_aft', '<f4'), ('kp_for', '<f4'), ('kp_mid', '<f4'), ('kp_aft', '<f4'), ('n_echos_for', 'i1'), ('n_echos_mid', 'i1'), ('n_echos_aft', 'i1'), ('all_backscatter_for', '<f4'), ('all_backscatter_mid', '<f4'), ('all_backscatter_aft', '<f4'), ('all_backscatter_std_for', '<f4'), ('all_backscatter_std_mid', '<f4'), ('all_backscatter_std_aft', '<f4'), ('all_incidence_angle_for', '<f4'), ('all_incidence_angle_mid', '<f4'), ('all_incidence_angle_aft', '<f4'), ('all_azimuth_angle_for', '<f4'), ('all_azimuth_angle_mid', '<f4'), ('all_azimuth_angle_aft', '<f4'), ('all_kp_for', '<f4'), ('all_kp_mid', '<f4'), ('all_kp_aft', '<f4'), ('all_n_echos_for', 'i1'), ('all_n_echos_mid', 'i1'), ('all_n_echos_aft', 'i1')])
class ascat.read_native.xarray_io.CellGridCache[source]

Bases: object

Cache for CellGrid objects.

fetch_or_store(key, cell_grid_type=None, *args)[source]

Fetch a CellGrid object from the cache given a key, or store a new one.
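
The fetch-or-store pattern can be sketched with a dictionary-backed cache. The classes below are hypothetical stand-ins written for illustration, not the package's implementation; in particular, raising ValueError for an unknown key without a type is an assumption.

```python
class CellGridCacheSketch:
    """Minimal fetch-or-store cache (illustrative only)."""

    def __init__(self):
        self._grids = {}

    def fetch_or_store(self, key, cell_grid_type=None, *args):
        # Return the cached object if the key is already known.
        if key in self._grids:
            return self._grids[key]
        if cell_grid_type is None:
            raise ValueError(f"nothing cached under {key!r} and no type given")
        # Otherwise build the object once and remember it.
        self._grids[key] = cell_grid_type(*args)
        return self._grids[key]


class DummyGrid:
    """Stand-in for a grid object that is expensive to build."""

    def __init__(self, sampling_km):
        self.sampling_km = sampling_km


cache = CellGridCacheSketch()
first = cache.fetch_or_store("fibgrid_12.5", DummyGrid, 12.5)
second = cache.fetch_or_store("fibgrid_12.5")
```

The second call returns the object built by the first call, so the grid is constructed only once per key.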

class ascat.read_native.xarray_io.RaggedXArrayCellIOBase(source, engine, obs_dim='time', **kwargs)[source]

Bases: ABC

Base class for ASCAT xarray IO classes.

source

Input filename(s).

Type:

str, Path, list

engine

Engine to use for reading/writing files.

Type:

str

close()[source]

Close file.

property date_range

Return date range of dataset.

Returns:

Date range of dataset.

Return type:

tuple

abstract read(location_id=None, **kwargs)[source]

Read data from file. Should be implemented by subclasses.

Parameters:
  • location_id (int, list, optional) – Location id(s) to read.

  • **kwargs – Additional keyword arguments passed to the read method.

Returns:

Dataset containing the data for any specified location_id(s), or all location_ids in the file if none are specified.

Return type:

xarray.Dataset

abstract write(filename, ra_type, **kwargs)[source]

Write data to file. Should be implemented by subclasses.

Parameters:
  • filename (str, Path) – Filename to write data to.

  • ra_type (str) – Type of ragged array to write.

  • **kwargs – Additional keyword arguments passed to the write method.

class ascat.read_native.xarray_io.SwathIOBase(source, engine, **kwargs)[source]

Bases: ABC

Base class for reading swath data. Writes ragged array cell data in indexed or contiguous format.

beams_vars = []
classmethod chron_files(path)[source]

Return a ChronFiles object for this class type based on a path.

close()[source]

Close the dataset.

static combine_attributes(attrs_list, context)[source]

Decides which attributes to keep when merging swath files.

contains_location_ids(location_ids=None, lookup_vector=None)[source]

Check if the dataset contains any of the given location_ids.

Parameters:

location_ids (list of int) – Location ids to check.

Returns:

True if the dataset contains any of the given location_ids, False otherwise.

Return type:

bool
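
A minimal sketch of such a membership test with NumPy. It assumes that lookup_vector is a boolean array indexed by location id, which is not stated in this documentation; the function name and the example ids are hypothetical.

```python
import numpy as np

# Location ids present in a hypothetical swath file.
swath_location_ids = np.array([5, 9, 9, 12, 40])


def contains_any(swath_location_ids, location_ids=None, lookup_vector=None):
    """Return True if any observation belongs to a requested location."""
    if lookup_vector is not None:
        # Assumed: lookup_vector[i] is True when location id i is requested.
        return bool(lookup_vector[swath_location_ids].any())
    return bool(np.isin(swath_location_ids, location_ids).any())


lookup = np.zeros(50, dtype=bool)
lookup[[12, 13]] = True

hit_by_list = contains_any(swath_location_ids, location_ids=[1, 9])
miss_by_list = contains_any(swath_location_ids, location_ids=[1, 2])
hit_by_lookup = contains_any(swath_location_ids, lookup_vector=lookup)
```

A precomputed lookup vector avoids repeating the set membership search when the same location ids are checked against many swath files.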

abstract static fn_read_fmt()[source]

TODO: figure out a sane way to describe what this does. Also decide if this /needs/ to be enforced. If the user doesn’t want to use all the filesearch functionality (or if they want to use their own filesearch logic), then they should still be able to use this class. They could of course override this and just return None, but that seems like a hack.

read(cell=None, location_id=None, mask_and_scale=True, lookup_vector=None)[source]

Return data for a cell or location_id if one is specified, or for the entire swath file otherwise.

Parameters:
  • cell (int, optional) – Cell to read data for.

  • location_id (int, optional) – Location id to read data for.

  • mask_and_scale (bool, optional) – Whether to mask and scale the data. Default is True.

abstract static sf_read_fmt()[source]

TODO: same as above

write(filename, mode='w', **kwargs)[source]
ascat.read_native.xarray_io.append_to_netcdf(filename, ds_to_append, unlimited_dim)[source]

Appends an xarray dataset to an existing netCDF file along a given unlimited dim.

Parameters:
  • filename (str or Path) – Filename of netCDF file to append to.

  • ds_to_append (xarray.Dataset) – Dataset to append.

  • unlimited_dim (str or list of str) – Name of the unlimited dimension to append along.

Raises:

ValueError – If more than one unlimited dim is given.

ascat.read_native.xarray_io.create_variable_encodings(ds, custom_variable_encodings=None, custom_dtypes=None)[source]

Create an encoding dictionary for a dataset, optionally overriding the default encoding or adding additional encoding parameters. A custom encoding for a variable replaces that variable’s default encoding entirely; new parameters are not merged into the defaults.

E.g. if you want to add a “units” encoding to “lon”, you should also pass “dtype”, “zlib”, “complevel”, and “_FillValue” if you don’t want to lose those.

Parameters:
  • ds (xarray.Dataset) – Dataset.

  • custom_variable_encodings (dict, optional) – Custom encodings.

Returns:

ds – Dataset with encodings.

Return type:

xarray.Dataset
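
The override behaviour described for create_variable_encodings can be sketched with plain dictionaries. The default values and the helper name below are hypothetical; only the per-variable replacement logic is the point.

```python
# Hypothetical default encodings; the package's real defaults may differ.
default_encodings = {
    "lon": {"dtype": "float32", "zlib": True, "complevel": 4, "_FillValue": None},
    "lat": {"dtype": "float32", "zlib": True, "complevel": 4, "_FillValue": None},
}


def merge_encodings(defaults, custom=None):
    """Per-variable override: a custom entry replaces the default entry whole."""
    merged = dict(defaults)
    merged.update(custom or {})
    return merged


# Passing only "units" for "lon" drops its default dtype, zlib, etc.
lossy = merge_encodings(default_encodings, {"lon": {"units": "degrees_east"}})

# To keep the defaults, repeat them alongside the new key.
kept = merge_encodings(
    default_encodings,
    {"lon": {**default_encodings["lon"], "units": "degrees_east"}},
)
```
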

ascat.read_native.xarray_io.get_swath_product_id(filename)[source]

Get product identifier from filename.

Parameters:

filename (str) – Filename.

Returns:

product_id – Product identifier.

Return type:

str

ascat.read_native.xarray_io.set_attributes(ds, variable_attributes=None, global_attributes=None)[source]
Parameters:
  • ds (xarray.Dataset, Path) – Dataset.

  • variable_attributes (dict, optional) – User-defined variable attributes to set. Should be a dictionary with format {“varname”: {“attr1”: “value1”, “attr2”: “value2”}, “varname2”: {“attr1”: “value1”}}

  • global_attributes (dict, optional) – User-defined global attributes to set. Should be a dictionary with format {“attr1”: “value1”, “attr2”: “value2”}

Returns:

ds – Dataset with the given variable and global attributes set.

Return type:

xarray.Dataset

ascat.read_native.xarray_io.trim_dates(ds, date_range)[source]

Trim dates of dataset to a given date range. Assumes the time variable is named “time” and the observation dimension is named “obs”.

Parameters:
  • ds – Dataset to trim.

  • date_range – Date range to trim the dataset to.

Returns:

Dataset with trimmed dates.

Return type:

xarray.Dataset
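
The trimming step amounts to a boolean mask over the observation times. A minimal sketch with NumPy arrays standing in for the “time” variable and one data variable; the inclusive bounds, the helper name and the sample values are assumptions made for the example.

```python
import numpy as np

# Hypothetical observation times along the "obs" dimension.
time = np.array(
    ["2020-01-01T06", "2020-01-02T12", "2020-01-05T00"], dtype="datetime64[s]"
)
sigma0 = np.array([-10.5, -11.0, -9.8])


def trim_by_date(time, values, date_range):
    """Keep observations with start <= time <= end (inclusive bounds assumed)."""
    start, end = (np.datetime64(d) for d in date_range)
    keep = (time >= start) & (time <= end)
    return time[keep], values[keep]


trimmed_time, trimmed_sigma0 = trim_by_date(
    time, sigma0, ("2020-01-02", "2020-01-04")
)
```
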

ascat.read_native.xarray_io.var_order(ds)[source]

Returns a reasonable variable order for a ragged array dataset, based on that used in existing datasets.

Puts the count/index variable first depending on the ragged array type, then lon, lat, alt, location_id, location_description, and time, followed by the rest of the variables in the dataset.

Parameters:

ds (xarray.Dataset) – Dataset.

Returns:

order – List of dataset variable names in the determined order.

Return type:

list of str
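
The ordering rule can be sketched in a few lines of plain Python. The names locationIndex and row_size for the index and count variables are assumed here (they are common in CF-style ragged arrays) and may not match what the package uses; the function name is hypothetical.

```python
def var_order_sketch(variable_names, ra_type="indexed"):
    """Order variables: count/index first, then coordinates, then the rest."""
    # Assumed names for the ragged-array bookkeeping variable.
    first = "locationIndex" if ra_type == "indexed" else "row_size"
    preferred = [first, "lon", "lat", "alt", "location_id",
                 "location_description", "time"]
    # Keep only the preferred names that actually exist, in preferred order.
    head = [name for name in preferred if name in variable_names]
    # Everything else follows in its original order.
    tail = [name for name in variable_names if name not in head]
    return head + tail


order = var_order_sketch(
    ["backscatter", "time", "lat", "locationIndex", "lon", "location_id"]
)
```
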

Module contents