ascat.read_native package

Submodules

ascat.read_native.base module

class ascat.read_native.base.AscatFile(filename)[source]

Bases: Filenames

Class reading ASCAT files.

read(toi=None, roi=None, **kwargs)[source]

Read ASCAT Level 1b data.

Parameters:

toi (tuple of datetime, optional) – Filter data for given time of interest (default: None). Example: (datetime(2020, 1, 1, 12), datetime(2020, 1, 2))
roi (tuple of 4 float, optional) – Filter data for region of interest (default: None). Order: latmin, lonmin, latmax, lonmax

Returns:

data (xarray.Dataset or numpy.ndarray) – ASCAT data.
metadata (dict) – Metadata.
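
The toi and roi filters can be pictured as a time-window test combined with a bounding-box test. The following is a minimal sketch in plain Python, not the library's implementation: the function name filter_toi_roi, the (time, lat, lon) record layout and the inclusive bounds are all assumptions made for illustration.

```python
from datetime import datetime


def filter_toi_roi(records, toi=None, roi=None):
    """Keep records inside a time window and/or a bounding box.

    records : list of (time, lat, lon) tuples (hypothetical layout)
    toi     : (start, end) datetimes, or None for no time filter
    roi     : (latmin, lonmin, latmax, lonmax), or None for no region filter
    """
    kept = []
    for time, lat, lon in records:
        # Time of interest: assumed inclusive on both ends.
        if toi is not None and not (toi[0] <= time <= toi[1]):
            continue
        # Region of interest: assumed inclusive on all four edges.
        if roi is not None:
            latmin, lonmin, latmax, lonmax = roi
            if not (latmin <= lat <= latmax and lonmin <= lon <= lonmax):
                continue
        kept.append((time, lat, lon))
    return kept


records = [
    (datetime(2020, 1, 1, 13), 48.2, 16.4),  # inside both filters
    (datetime(2020, 1, 1, 11), 48.2, 16.4),  # before the time window
    (datetime(2020, 1, 1, 14), 10.0, 16.4),  # outside the region
]
toi = (datetime(2020, 1, 1, 12), datetime(2020, 1, 2))
roi = (45.0, 10.0, 50.0, 20.0)
selected = filter_toi_roi(records, toi=toi, roi=roi)
```

Passing None for either filter leaves that dimension unrestricted, which mirrors the documented defaults.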

read_period(dt_start, dt_end, **kwargs)[source]

Read interval.

Parameters:

dt_start (datetime) – Start datetime.
dt_end (datetime) – End datetime.

Returns:

data (xarray.Dataset or numpy.ndarray) – ASCAT data.
metadata (dict) – Metadata.

ascat.read_native.bufr module

Readers for ASCAT Level 1b and Level 2 data in BUFR format.

class ascat.read_native.bufr.AscatL1bBufrFile(filename, **kwargs)[source]

Bases: AscatFile

Read ASCAT Level 1b file in BUFR format.

class ascat.read_native.bufr.AscatL1bBufrFileGeneric(filename, **kwargs)[source]

Bases: AscatL1bBufrFile

The same as AscatL1bBufrFile but with generic=True by default.

class ascat.read_native.bufr.AscatL2BufrFile(filename, **kwargs)[source]

Bases: AscatFile

Read ASCAT Level 2 file in BUFR format.

class ascat.read_native.bufr.AscatL2BufrFileGeneric(filename, **kwargs)[source]

Bases: AscatL2BufrFile

The same as AscatL2BufrFile but with generic=True by default.

ascat.read_native.bufr.conv_bufrl1b_generic(data, metadata)[source]

Rename and convert data types of dataset.

Spacecraft_id vs sat_id encoding

BUFR encoding - Spacecraft_id

- 1 ERS 1
- 2 ERS 2
- 3 Metop-1 (Metop-B)
- 4 Metop-2 (Metop-A)
- 5 Metop-3 (Metop-C)

Internal encoding - sat_id

- 1 ERS 1
- 2 ERS 2
- 3 Metop-2 (Metop-A)
- 4 Metop-1 (Metop-B)
- 5 Metop-3 (Metop-C)

Parameters:

data (dict of numpy.ndarray) – Original dataset.
metadata (dict) – Metadata.

Returns:

data – Converted dataset.

Return type:

dict of numpy.ndarray
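
The two encodings listed above differ only in the codes for Metop-A and Metop-B, which are swapped. A minimal sketch of that translation follows; the names BUFR_TO_SAT_ID and spacecraft_id_to_sat_id are hypothetical and are not taken from the library.

```python
# BUFR Spacecraft_id -> internal sat_id, following the two encodings above.
# Only codes 3 and 4 (Metop-B and Metop-A) change.
BUFR_TO_SAT_ID = {1: 1, 2: 2, 3: 4, 4: 3, 5: 5}


def spacecraft_id_to_sat_id(spacecraft_ids):
    """Translate a sequence of BUFR Spacecraft_id values to sat_id values."""
    return [BUFR_TO_SAT_ID[s] for s in spacecraft_ids]


sat_ids = spacecraft_id_to_sat_id([3, 4, 5])
```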

ascat.read_native.bufr.conv_bufrl2_generic(data, metadata)[source]

Rename and convert data types of dataset.

Spacecraft_id vs sat_id encoding

BUFR encoding - Spacecraft_id

- 1 ERS 1
- 2 ERS 2
- 3 Metop-1 (Metop-B)
- 4 Metop-2 (Metop-A)
- 5 Metop-3 (Metop-C)

Internal encoding - sat_id

- 1 ERS 1
- 2 ERS 2
- 3 Metop-2 (Metop-A)
- 4 Metop-1 (Metop-B)
- 5 Metop-3 (Metop-C)

Parameters:

data (dict of numpy.ndarray) – Original dataset.
metadata (dict) – Metadata.

Returns:

data – Converted dataset.

Return type:

dict of numpy.ndarray

ascat.read_native.bufr.read_bufr_data(filename, key_lookup)[source]

Read selected fields from a BUFR file using eccodes array access.

This reads the requested (rank-qualified) keys directly with codes_get_array instead of expanding every key of every subset, which is orders of magnitude faster than pdbufr.read_bufr(..., flat=True) for the large ASCAT BUFR messages.

Parameters:

filename (str) – BUFR filename.
key_lookup (dict) – Mapping of output field name to the eccodes key to read, e.g. {"f_Backscatter": "#1#backscatter"}. Keys yielding a single value per message (compressed scalars) are broadcast to all subsets.

Returns:

data – One row per observation with the requested fields plus lat, lon and time.

Return type:

pandas.DataFrame
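
The broadcasting of compressed scalars described for key_lookup can be illustrated without eccodes. The sketch below assumes each field of one message has already been read into a list; the helper name broadcast_fields and the field name "sat_id" are hypothetical.

```python
def broadcast_fields(fields, n_subsets):
    """Expand single-valued fields so every field has n_subsets entries.

    fields : dict mapping output field name -> list of values for one message
    """
    out = {}
    for name, values in fields.items():
        if len(values) == 1:
            # Compressed scalar: one value stands for every subset.
            out[name] = list(values) * n_subsets
        elif len(values) == n_subsets:
            out[name] = list(values)
        else:
            raise ValueError(f"{name}: expected 1 or {n_subsets} values")
    return out


message = {"f_Backscatter": [-12.1, -11.8, -13.0], "sat_id": [4]}
rows = broadcast_fields(message, n_subsets=3)
```

After this step every field has one entry per observation, so the fields can be assembled into a table with one row per observation.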

ascat.read_native.cdr module

class ascat.read_native.cdr.AscatGriddedNcTs(path, fn_format, grid_filename, static_layer_path=None, cache_static_layer=False, thresholds=None, **kwargs)[source]

Bases: GriddedNcContiguousRaggedTs

Class reading Metop ASCAT soil moisture Climate Data Record (CDR).

Parameters:

path (str) – Path to Climate Data Record (CDR) data set.
fn_format (str) – Filename format string, typically ‘<prefix>_{:04d}’
grid_filename (str) – Grid filename.
static_layer_path (str, optional) – Path to static layer files (default: None).
thresholds (dict, optional) – Thresholds for topographic complexity (default 50) and wetland fraction (default 50).

grid

Cell grid.

Type:: pygeogrids.CellGrid

thresholds

Thresholds for topographic complexity (default 50) and wetland fraction (default 50).

Type:: dict

slayer

StaticLayer object

Type:: str

class ascat.read_native.cdr.StaticFile(filename, variables, cache=False)[source]

Bases: object

StaticFile class.

Parameters:

filename (str) – File name.
variables (list of str) – List of variables.
cache (bool, optional) – Flag to cache data stored in file (default: False).

filename

Static layer file name.

Type:: str

variables

List of variables.

Type:: list of str

cache

Flag to cache data stored in file.

Type:: bool

data

Dictionary containing static layer data.

Type:: dict

class ascat.read_native.cdr.StaticLayers(path, topo_wetland_file=None, frozen_snow_file=None, porosity_file=None, cache=False)[source]

Bases: object

Class to read static layer files.

Parameters:

path (str) – Path of static layer files.
topo_wetland_file (str, optional) – Topographic complexity and wetland fraction file (default: None).
frozen_snow_file (str, optional) – Frozen and snow cover probability file (default: None).
porosity_file (str, optional) – Porosity file (default: None).
cache (bool, optional) – If true all static layers are loaded into memory (default: False).

topo_wetland

Topographic complexity and inundation and wetland fraction.

Type:: dict

frozen_snow_prob

Frozen soil/canopy probability and snow cover probability.

Type:: dict

porosity

Soil porosity information.

Type:: dict

class ascat.read_native.cdr.TimeSeries(gpi, lon, lat, cell, data, topo_complex=None, wetland_frac=None, porosity_gldas=None, porosity_hwsd=None)[source]

Bases: object

Container class for a time series.

Parameters:

gpi (int) – Grid point index
lon (float) – Longitude of grid point
lat (float) – Latitude of grid point
cell (int) – Cell number of grid point
data (pandas.DataFrame) – DataFrame which contains the data
topo_complex (int, optional) – Topographic complexity at the grid point
wetland_frac (int, optional) – Wetland fraction at the grid point
porosity_gldas (float, optional) – Porosity taken from GLDAS model
porosity_hwsd (float, optional) – Porosity calculated from Harmonised World Soil Database

gpi

Grid point index

Type:: int

lon

Longitude of grid point

Type:: float

lat

Latitude of grid point

Type:: float

cell

Cell number of grid point

Type:: int

data

DataFrame which contains the data

Type:: pandas.DataFrame

topo_complex

Topographic complexity at the grid point

Type:: int

wetland_frac

Wetland fraction at the grid point

Type:: int

porosity_gldas

Porosity taken from GLDAS model

Type:: float

porosity_hwsd

Porosity calculated from Harmonised World Soil Database

Type:: float

ascat.read_native.cdr.load_grid(filename)[source]

Load grid file.

Parameters:: filename (str) – Grid filename.
Returns:: grid – Grid.
Return type:: pygeogrids.CellGrid

ascat.read_native.eps_native module

Readers for ASCAT Level 1b and Level 2 data in EPS Native format.

class ascat.read_native.eps_native.AscatL1bEpsFile(filename)[source]

Bases: AscatFile

ASCAT Level 1b EPS Native reader class.

class ascat.read_native.eps_native.AscatL1bEpsFileGeneric(filename)[source]

Bases: AscatL1bEpsFile

The same as AscatL1bEpsFile but with generic=True by default.

class ascat.read_native.eps_native.AscatL1bEpsSzfFile(filename)[source]

Bases: AscatFile

Class reading ASCAT Level 1b file in EPS Native format.

class ascat.read_native.eps_native.AscatL2EpsFile(filename)[source]

Bases: AscatFile

ASCAT Level 2 EPS Native reader class.

class ascat.read_native.eps_native.AscatL2EpsFileGeneric(filename)[source]

Bases: AscatL2EpsFile

The same as AscatL2EpsFile but with generic=True by default.

class ascat.read_native.eps_native.EPSProduct(filename)[source]

Bases: object

Class for reading EPS products.

read(full=True, unsafe=False, scale_mdr=True)[source]

Read EPS file.

Parameters:

full (bool, optional) – Read full file content (True) or just Main Product Header Record (MPHR) and Main Data Record (MDR) (False). Default: True
unsafe (bool, optional) – If True it is (unsafely) assumed that MDR are continuously stacked until the end of file. Makes reading a lot faster. Default: False
scale_mdr (bool, optional) – Compute scaled MDR (True) or not (False). Default: True

Returns:

mphr (dict) – Main Product Header Record (MPHR).
sphr (dict) – Secondary Product Header Record (SPHR).
aux (dict) – Auxiliary Header Products.
mdr (numpy.ndarray) – Main Data Record (MDR).
scaled_mdr (numpy.ndarray) – Scaled Main Data Record (MDR) or None if not computed.

read_mphr()[source]: Read only Main Product Header Record (MPHR).

read_record_class(grh, record_count)[source]

Read record class.

Parameters:

grh (numpy.ndarray) – Generic record header.
record_count (int) – Number of records.

ascat.read_native.eps_native.conv_epsl1bszf_generic(data, metadata, gen_fields_lut, skip_fields)[source]

Rename and convert data types of dataset.

Parameters:

data (dict of numpy.ndarray) – Original dataset.
metadata (dict) – Metadata.

Returns:

data – Converted dataset.

Return type:

dict of numpy.ndarray

ascat.read_native.eps_native.conv_epsl1bszx_generic(data, metadata)[source]

Rename and convert data types of dataset.

Parameters:

data (dict of numpy.ndarray) – Original dataset.
metadata (dict) – Metadata.

Returns:

data – Converted dataset.

Return type:

dict of numpy.ndarray

ascat.read_native.eps_native.conv_epsl2szx_generic(data, metadata)[source]

Rename and convert data types of dataset.

Parameters:

data (dict of numpy.ndarray) – Original dataset.
metadata (dict) – Metadata.

Returns:

data – Converted dataset.

Return type:

dict of numpy.ndarray

ascat.read_native.eps_native.gen_flagfield(data)[source]

The new flagfield collects the fields previously split across the RF1 / RF2 / PL / GEN1 / GEN2 flagfields. Its structure is described in the PFS, Tab. 14: Structure of FLAGFIELD.

The old RF1 flagfield (related to the quality of the raw echo correction functions) contains the following bit flags and maps to the v11 flagfield as follows:

RF1 bit   flag          v11 bit   description
---------------------------------------------
0         F_NOISE       0         Noise measurement missing, interpolated value used
1         F_PG          1         Degraded power gain product
2         V_PG          2         Very degraded power gain product
3         F_FILTER      3         Degraded filter shape
4         V_FILTER      4         Very degraded filter shape

RF2 bit   flag          v11 bit   description
---------------------------------------------
0         F_PGP         5         Estimated power gain product outside limits
1         F_NP          6         Measured noise outside limits
2         F_PGP_DROP    7         Small drop in power gain product detected

PL bit    flag          v11 bit   description
---------------------------------------------
0         F_ORBIT       n/a       Orbit height used for the NRCS normalisation is outside limits
1         F_ATTITUDE    8         No yaw steering
2         F_OMEGA       9         Unexpected instrument configuration
3         F_MAN         10        Satellite manoeuvre
4         F_OSV         11        Input orbit prediction file missing, OSV taken from L0 header

GEN1 bit  flag          v11 bit   description
---------------------------------------------
0         F_E_TEL_PRES  12        Instrument or platform HKTM missing
1         F_E_TEL_IR    13        Instrument or platform HKTM out of limits
2         F_CE          n/a
3         V_CE          n/a
4         F_OA          n/a       Quality of satellite orbit and attitude
5         F_TEL         n/a
6         F_REF         14

GEN2 bit  flag          v11 bit   description
---------------------------------------------
0         F_S_A         15        Potential interference from solar array
1         F_LAND        16        Measurement over land in the generation of NRCS value
2         F_GEO         17        Geolocation algorithm failed
3         F_SIGN        18        The NRCS value is negative
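
One way to read the mapping listed above is as a bit relocation: each bit set in an old flagfield is copied to its v11 position, and bits marked n/a are dropped. The sketch below shows that reading for single integer values. It is an illustration of the tables, not the body of gen_flagfield; the names V11_BIT and combine_flagfields are hypothetical.

```python
# (old flagfield, old bit) -> v11 bit, following the mapping listed above.
# Entries marked n/a have no v11 counterpart and are omitted.
V11_BIT = {
    ("rf1", 0): 0, ("rf1", 1): 1, ("rf1", 2): 2, ("rf1", 3): 3, ("rf1", 4): 4,
    ("rf2", 0): 5, ("rf2", 1): 6, ("rf2", 2): 7,
    ("pl", 1): 8, ("pl", 2): 9, ("pl", 3): 10, ("pl", 4): 11,
    ("gen1", 0): 12, ("gen1", 1): 13, ("gen1", 6): 14,
    ("gen2", 0): 15, ("gen2", 1): 16, ("gen2", 2): 17, ("gen2", 3): 18,
}


def combine_flagfields(rf1=0, rf2=0, pl=0, gen1=0, gen2=0):
    """Build a single v11-style flagfield from the five old flagfields."""
    old = {"rf1": rf1, "rf2": rf2, "pl": pl, "gen1": gen1, "gen2": gen2}
    v11 = 0
    for (name, old_bit), new_bit in V11_BIT.items():
        if (old[name] >> old_bit) & 1:
            v11 |= 1 << new_bit
    return v11


# RF1 F_NOISE, PL F_ORBIT (dropped) + F_MAN, GEN2 F_GEO
flagfield = combine_flagfields(rf1=0b00001, pl=0b01001, gen2=0b0100)
```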

ascat.read_native.eps_native.read_eps(filename, mphr_only=False, full=True, unsafe=False, scale_mdr=True)[source]

Read EPS file.

Parameters:: filename (str) – Filename
Returns:: prod – EPS data.
Return type:: EPSProduct

ascat.read_native.eps_native.read_eps_l1b(filename, generic=False, to_xarray=False, full=True, unsafe=False, scale_mdr=True, ignore_noise_ool=False, return_ptype=False)[source]

Level 1b reader and data preparation.

Parameters:

filename (str) – ASCAT Level 1b file name in EPS Native format.
generic (bool, optional) – “True” reading and converting into generic format or “False” reading original field names (default: False).
to_xarray (bool, optional) – “True” return data as xarray.Dataset “False” return data as numpy.ndarray (default: False).
full (bool, optional) – Read full file content (True) or just Main Product Header Record (MPHR) and Main Data Record (MDR) (False). Default: True
unsafe (bool, optional) – If True it is (unsafely) assumed that MDR are continuously stacked until the end of file. Makes reading a lot faster. Default: False
scale_mdr (bool, optional) – Compute scaled MDR (True) or not (False). Default: True
ignore_noise_ool (bool, optional) – Ignore noise out of limit flag (default: False).

Returns:

ds – ASCAT Level 1b data.

Return type:

xarray.Dataset, dict of xarray.Dataset

ascat.read_native.eps_native.read_eps_l2(filename, generic=False, to_xarray=False, return_ptype=False)[source]

Level 2 reader and data preparation.

Parameters:

filename (str) – ASCAT Level 2 file name in EPS Native format.
generic (bool, optional) – “True” reading and converting into generic format or “False” reading original field names (default: False).
to_xarray (bool, optional) – “True” return data as xarray.Dataset “False” return data as numpy.ndarray (default: False).

Returns:

data (xarray.Dataset or numpy.ndarray) – ASCAT data.
metadata (dict) – Metadata.

ascat.read_native.eps_native.read_smx_fmv_11(eps_file)[source]

Read SMO/SMR format version 11.

Parameters:: eps_file (EPSProduct object) – EPS Product object.
Returns:: data – SMO/SMR data.
Return type:: numpy.ndarray

ascat.read_native.eps_native.read_smx_fmv_12(eps_file)[source]

Read SMO/SMR format version 12.

Parameters:: eps_file (EPSProduct object) – EPS Product object.
Returns:: data – SMO/SMR data.
Return type:: numpy.ndarray

ascat.read_native.eps_native.read_szf_fmv_12(eps_file, ignore_noise_ool=False)[source]

Read SZF format version 12.

beam_num

- 1 Left Fore Antenna
- 2 Left Mid Antenna
- 3 Left Aft Antenna
- 4 Right Fore Antenna
- 5 Right Mid Antenna
- 6 Right Aft Antenna

as_des_pass

- 0 Ascending
- 1 Descending

swath_indicator

- 0 Left
- 1 Right

Parameters:

eps_file (EPSProduct object) – EPS Product object.
ignore_noise_ool (bool, optional) – Ignore noise out of limit flag (default: False).

Returns:

data – SZF data.

Return type:

numpy.ndarray

ascat.read_native.eps_native.read_szf_fmv_13(eps_file, ignore_noise_ool=False)[source]

Read SZF format version 13.

beam_num

- 1 Left Fore Antenna
- 2 Left Mid Antenna
- 3 Left Aft Antenna
- 4 Right Fore Antenna
- 5 Right Mid Antenna
- 6 Right Aft Antenna

as_des_pass

- 0 Ascending
- 1 Descending

swath_indicator

- 0 Left
- 1 Right

Parameters:

eps_file (EPSProduct object) – EPS Product object.
ignore_noise_ool (bool, optional) – Ignore noise out of limit flag (default: False).

Returns:

data – SZF data.

Return type:

numpy.ndarray

ascat.read_native.eps_native.read_szx_fmv_11(eps_file)[source]

Read SZO/SZR format version 11.

Parameters:: eps_file (EPSProduct object) – EPS Product object.
Returns:: data – SZO/SZR data.
Return type:: numpy.ndarray

ascat.read_native.eps_native.read_szx_fmv_12(eps_file)[source]

Read SZO/SZR format version 12.

Parameters:: eps_file (EPSProduct object) – EPS Product object.
Returns:: data – SZO/SZR data.
Return type:: numpy.ndarray

ascat.read_native.eps_native.read_szx_fmv_13(eps_file)[source]

Read SZO/SZR format version 13.

Parameters:: eps_file (EPSProduct object) – EPS Product object.
Returns:: data – SZO/SZR data.
Return type:: numpy.ndarray

ascat.read_native.eps_native.set_flags(data, ignore_noise_ool=False)[source]

Compute summary flag for each measurement with a value of 0, 1 or 2 indicating nominal, slightly degraded or severely degraded data.

The format of ASCAT products is defined by “EPS programme generic product format specification” (EPS.GGS.SPE.96167) and “ASCAT level 1 product format specification” (EPS.MIS.SPE.97233).

The flag bits are defined as follows:

bit name      category   description
------------------------------------

flagfield_rf1
fnoise     amber     noise missing, interpolated noise value used instead
fpgp       amber     degraded power gain product
vpgp       red       very degraded power gain product
fhrx       amber     degraded filter shape
vhrx       red       very degraded filter shape

flagfield_rf2
pgp_ool    red       power gain product is outside limits
noise_ool  red       measured noise value is outside limits

flagfield_pl
forb       red       orbit height is outside limits
fatt       red       no yaw steering
fcfg       red       unexpected instrument configuration
fman       red       satellite maneuver
fosv       warning   osv file missing (fman may be incorrect)

flagfield_gen1
ftel       warning   telemetry missing (ftool may be incorrect)
ftool      red       telemetry out of limits

flagfield_gen2
fsol   amber     possible interference from solar array
fland  warning   lat/long position is over land
fgeo   red       geolocation algorithm failed

Each flag belongs to a particular category which indicates the impact on data quality. Flags in the “amber” category indicate that the data is slightly degraded but still usable. Flags in the “red” category indicate that the data is severely degraded and should be discarded or used with caution.

A simple algorithm for calculating a single summary flag with a value of 0, 1 or 2 indicating nominal, slightly degraded or severely degraded is

function calc_status(flags):
    status = 0
    if any amber flags are set then status = 1
    if any red flags are set then status = 2
    return status

Parameters:: data (numpy.ndarray) – SZF data.
Returns:: f_usable – Flag indicating nominal (0), slightly degraded (1) or severely degraded (2).
Return type:: numpy.ndarray
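
The calc_status pseudocode above translates almost directly into Python. The sketch below works on the names of the raised flags rather than on the packed flagfields, which is a simplification for illustration. The amber and red sets follow the category table above; dropping noise_ool when ignore_noise_ool is True is an assumption about what that option does.

```python
# Flag categories taken from the table above; "warning" flags do not
# influence the summary status.
AMBER = {"fnoise", "fpgp", "fhrx", "fsol"}
RED = {"vpgp", "vhrx", "pgp_ool", "noise_ool", "forb", "fatt",
       "fcfg", "fman", "ftool", "fgeo"}


def calc_status(raised_flags, ignore_noise_ool=False):
    """Return 0 (nominal), 1 (slightly degraded) or 2 (severely degraded).

    raised_flags : iterable with the names of the flags that are set
    """
    raised = set(raised_flags)
    if ignore_noise_ool:
        # Assumption: the option simply excludes noise_ool from the check.
        raised.discard("noise_ool")
    status = 0
    if raised & AMBER:
        status = 1
    if raised & RED:
        status = 2
    return status
```

Because the red check comes last, a measurement with both amber and red flags set ends up with status 2, as in the pseudocode.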

ascat.read_native.eps_native.set_flags_fmv13(flagfield, ignore_noise_ool=False)[source]

Compute summary flag for each measurement with a value of 0, 1 or 2 indicating nominal, slightly degraded or severely degraded data.

The format of ASCAT products is defined by “EPS programme generic product format specification” (EPS.GGS.SPE.96167) and “ASCAT level 1 product format specification” (EPS.MIS.SPE.97233).

The flag bits are defined as follows:

bit name         category  description
------------------------------------
f_noise       amber     1: noise missing/interpolated during processing
f_pg          amber     1: degraded power gain product (pgp)
v_pg          red       1: not valid power gain product (pgp)
f_filter      amber     1: degraded hrx
v_filter      red       1: no valid hrx
f_pgp_ool     red       1: estimated power gain product out of limits
f_np_ool      red       1: measured noise value is outside limits
f_pgp_drop    amber     0: continuous pgp 1: drop in pgp
f_attitude    red       1: non-normal attitude
f_omega       red       1: instrument parameter configuration mismatch
f_man         red       0: no-manoeuvre 1: manoeuvre
f_osv         info      1: osv file not available
f_e_tel_pres  amber     1: interpolated HKTM telemetry missing
f_e_tel_ir    red       1: some interpolated HKTM telemetry parameters
                               out of prescribed thresholds
f_ref         info      1: if f_pgp or f_np are 1
f_sa          amber     1: risk of solar array panel reflections
                               interference
f_land        info      0: no-land 1: land
f_geo         red       1: geolocation algorithm failed
f_sign        info         sigma0 in linear units is negative and value
                               in dB has been calculated from its
                               unsigned value
f_com_op      info      1: data taken during commissioning phase

20-31 spare

Each flag belongs to a particular category which indicates the impact on data quality. Flags in the “amber” category indicate that the data is slightly degraded but still usable. Flags in the “red” category indicate that the data is severely degraded and should be discarded or used with caution.

Parameters:: flagfield (numpy.ndarray) – Flags in decimal format.
Returns:: f_usable – Flag indicating nominal (0), minor degraded (1) or major degraded (2).
Return type:: numpy.ndarray
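
For the packed format version 13 flagfield, the same summary can be computed with bit masks. The sketch below assumes that the flags occupy bits 0 to 19 in the order of the table above (consistent with bits 20-31 being spare) and that ignore_noise_ool excludes f_np_ool from the red check; both points are assumptions, and the names are hypothetical.

```python
# Bit positions assumed from the order of the table above (f_noise = bit 0).
AMBER_BITS = (0, 1, 3, 7, 12, 15)           # f_noise, f_pg, f_filter,
                                            # f_pgp_drop, f_e_tel_pres, f_sa
RED_BITS = (2, 4, 5, 6, 8, 9, 10, 13, 17)   # v_pg, v_filter, f_pgp_ool,
                                            # f_np_ool, f_attitude, f_omega,
                                            # f_man, f_e_tel_ir, f_geo
NOISE_OOL_BIT = 6                           # f_np_ool


def bitmask(bits):
    """Combine bit positions into a single integer mask."""
    mask = 0
    for b in bits:
        mask |= 1 << b
    return mask


def summary_flag(flagfield, ignore_noise_ool=False):
    """Return 0 (nominal), 1 (minor degraded) or 2 (major degraded)."""
    red_bits = [b for b in RED_BITS
                if not (ignore_noise_ool and b == NOISE_OOL_BIT)]
    if flagfield & bitmask(red_bits):
        return 2
    if flagfield & bitmask(AMBER_BITS):
        return 1
    return 0
```

Flags in the info category (for example f_land at the assumed bit 16) leave the summary at 0.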

ascat.read_native.eps_native.shortcdstime2jd(days, milliseconds)[source]

Convert short CDS time to Julian date.

Parameters:

days (int) – Days since 2000-01-01
milliseconds (int) – Milliseconds.

Returns:

jd – Julian date.

Return type:

float
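
The conversion amounts to adding the elapsed days, including the fractional day, to the Julian date of the epoch. A minimal sketch follows, assuming the epoch is 2000-01-01 00:00:00, which corresponds to Julian date 2451544.5; the function name short_cds_to_jd is hypothetical.

```python
JD_2000_01_01 = 2451544.5   # Julian date of 2000-01-01 00:00:00 (assumed epoch)
MS_PER_DAY = 86_400_000     # milliseconds in one day


def short_cds_to_jd(days, milliseconds):
    """Convert (days since 2000-01-01, milliseconds of day) to a Julian date."""
    return JD_2000_01_01 + days + milliseconds / MS_PER_DAY


noon_first_day = short_cds_to_jd(0, 43_200_000)
```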

ascat.read_native.generate_test_data module

ascat.read_native.hdf5 module

Readers for ASCAT Level 1b in HDF5 format.

class ascat.read_native.hdf5.AscatL1bHdf5File(filename)[source]

Bases: AscatFile

Class reading ASCAT Level 1b file in HDF5 format.

class ascat.read_native.hdf5.AscatL1bHdf5FileGeneric(filename)[source]

Bases: AscatL1bHdf5File

The same as AscatL1bHdf5File but with generic=True by default.

ascat.read_native.hdf5.conv_hdf5l1b_generic(data, metadata)[source]

Rename and convert data types of dataset.

Parameters:

data (dict of numpy.ndarray) – Original dataset.
metadata (dict) – Metadata.

Returns:

data – Converted dataset.

Return type:

dict of numpy.ndarray

ascat.read_native.nc module

Readers for ASCAT Level 1b and Level 2 data in NetCDF format.

class ascat.read_native.nc.AscatL1bNcFile(filename, **kwargs)[source]

Bases: AscatFile

Read ASCAT Level 1b file in NetCDF format.

class ascat.read_native.nc.AscatL1bNcFileGeneric(filename, **kwargs)[source]

Bases: AscatL1bNcFile

The same as AscatL1bNcFile but with generic=True by default.

class ascat.read_native.nc.AscatL2NcFile(filename, **kwargs)[source]

Bases: AscatFile

Read ASCAT Level 2 file in NetCDF format.

class ascat.read_native.nc.AscatL2NcFileGeneric(filename, **kwargs)[source]

Bases: AscatL2NcFile

The same as AscatL2NcFile but with generic=True by default.

class ascat.read_native.nc.AscatSsmNcSwathFile(filename)[source]

Bases: AscatFile

Class reading ASCAT Surface Soil Moisture NetCDF swath file.

class ascat.read_native.nc.AscatSsmNcSwathFileList(path, filename_template=None, subfolder_template=None, sat='?', cls_kwargs=None)[source]

Bases: ChronFiles

Class reading ASCAT Surface Soil Moisture NetCDF swath file list.

iter_daterange(start_date, end_date)[source]

Generator returning filenames between start and end date.

Parameters:

start_date (datetime) – Start date.
end_date (datetime) – End date.

Yields:

filename (str) – Filename.

read_date(timestamp)[source]

Read data for given timestamp.

Parameters:: timestamp (datetime) – Date.
Returns:: data – Data.
Return type:: xarray.Dataset

read_period(start_dt, end_dt, delta_dt=datetime.timedelta(seconds=3600), buffer_dt=datetime.timedelta(seconds=3600), **kwargs)[source]

Read data for given interval.

Parameters:

start_dt (datetime) – Start datetime.
end_dt (datetime) – End datetime.
delta_dt (timedelta, optional) – Time delta used to jump through search date.
buffer_dt (timedelta, optional) – Search buffer used to find files which could possibly contain data but would be left out because of start_dt.

Returns:

data – Data stored in file.

Return type:

dict, numpy.ndarray

search_date(timestamp, **kwargs)[source]

Search date.

Parameters:: timestamp (datetime) – Date.
Returns:: filenames – Filenames.
Return type:: list

ascat.read_native.nc.read_nc(filename, generic, to_xarray, skip_fields, gen_fields_lut)[source]

Read NetCDF file.

Parameters:

filename (str) – Filename.
generic (bool) – ‘True’ reading and converting into generic format or ‘False’ reading original field names.
to_xarray (bool) – ‘True’ return data as xarray.Dataset ‘False’ return data as numpy.ndarray.
skip_fields (list) – Variables to skip.
gen_fields_lut (dict) – Conversion look-up table for generic names.

Returns:

data (xarray.Dataset or numpy.ndarray) – ASCAT data.
metadata (dict) – Metadata.

ascat.read_native.ragged_array_ts module

class ascat.read_native.ragged_array_ts.CRANcFile(filename, row_var='row_size', **kwargs)[source]

Bases: RAFile

Contiguous ragged array file reader.

property ids

Location IDs property.

Returns:: location_id – Location IDs.
Return type:: numpy.ndarray

property lats

Latitude coordinates property.

Returns:: lat – Latitude coordinates.
Return type:: numpy.ndarray

property lons

Longitude coordinates property.

Returns:: lon – Longitude coordinates.
Return type:: numpy.ndarray

read(location_id, variables=None)[source]

Read a timeseries for a given location_id.

Parameters:

location_id (int) – Location_id to read.
variables (list or None) – A list of parameter names to read. If None, all parameters are read. The default is None.

Returns:

df – A pandas.DataFrame containing the timeseries for the location_id.

Return type:

pandas.DataFrame

read_2d(variables=None)[source]

(Draft!) Read all time series into 2d array.

1d data: 1, 2, 3, 4, 5, 6, 7, 8
row_size: 3, 2, 1, 2
2d data:
1 2 3 0 0
4 5 0 0 0
6 0 0 0 0
7 8 0 0 0
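
The layout in the example above can be reproduced by slicing the 1d data according to row_size and padding each row with a fill value. The sketch below is a plain-Python illustration, not the draft read_2d implementation; the function name ragged_to_2d and the n_cols and fill parameters are hypothetical. The example uses five columns, so n_cols=5 is passed explicitly.

```python
def ragged_to_2d(values, row_size, n_cols=None, fill=0):
    """Unpack a contiguous ragged array into a padded list of rows.

    values   : flat sequence holding all rows back to back
    row_size : number of valid entries in each row
    n_cols   : width of the output (defaults to the longest row)
    """
    if sum(row_size) != len(values):
        raise ValueError("row_size must sum to the number of values")
    longest = max(row_size, default=0)
    if n_cols is None:
        n_cols = longest
    if n_cols < longest:
        raise ValueError("n_cols is smaller than the longest row")
    grid = []
    start = 0
    for size in row_size:
        row = list(values[start:start + size])
        grid.append(row + [fill] * (n_cols - size))  # pad to full width
        start += size
    return grid


grid = ragged_to_2d([1, 2, 3, 4, 5, 6, 7, 8], [3, 2, 1, 2], n_cols=5)
```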

class ascat.read_native.ragged_array_ts.CellFileCollection(path, ioclass, ioclass_kws=None, dir_name_format='{date1}_{date2}', dir_date_format='%Y%m%d%H%M%S')[source]

Bases: object

Collection of grid cell files.

Represents a collection of grid cell files that live in the same directory, and contains methods to read data from them.

property cells_in_collection

Return a list of the cells in the collection.

Returns:: List of cells in the collection.
Return type:: list of int

close()[source]: Close file.

create_cell_lookup(out_cell_size)[source]

Create a lookup table self.cell_lut mapping a new cell-size grid to the existing one.

Format of the table is a dictionary, where the keys are the cell numbers in the new cell-size grid, and the values are the cell numbers in the old cell-size grid which the new cell overlaps.

Parameters:: out_cell_size (int) – Cell size of the new grid.
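
The shape of such a lookup table can be sketched for a regular global grid. The example below assumes square cells with integer sizes in degrees that divide 180, numbered column by column starting at (-180, -90), so that cell = column * (180 / cell_size) + row. This numbering and the function name build_cell_lookup are assumptions made for illustration and may differ from the grid used by the collection.

```python
def build_cell_lookup(old_cell_size, new_cell_size):
    """Map each new cell number to the old cell numbers it overlaps.

    Assumes integer cell sizes (degrees) that divide 180, and cells numbered
    as column * n_rows + row, starting at longitude -180 and latitude -90.
    """
    n_rows_old = 180 // old_cell_size
    n_rows_new = 180 // new_cell_size
    n_cols_new = 360 // new_cell_size
    lut = {}
    for col in range(n_cols_new):
        for row in range(n_rows_new):
            # Extent of the new cell in degrees, offset from (-180, -90).
            lon0, lon1 = col * new_cell_size, (col + 1) * new_cell_size
            lat0, lat1 = row * new_cell_size, (row + 1) * new_cell_size
            # Old columns/rows overlapping the half-open ranges above.
            old_cols = range(lon0 // old_cell_size,
                             (lon1 - 1) // old_cell_size + 1)
            old_rows = range(lat0 // old_cell_size,
                             (lat1 - 1) // old_cell_size + 1)
            lut[col * n_rows_new + row] = sorted(
                c * n_rows_old + r for c in old_cols for r in old_rows
            )
    return lut


cell_lut = build_cell_lookup(old_cell_size=5, new_cell_size=10)
```

With these assumptions each 10 degree cell overlaps four 5 degree cells, and the table has one key per cell of the new grid.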

property date_range: Return the start and end date of the collection based on its dir name

classmethod from_product_id(collections, product_id, ioclass_kws=None)[source]

Create a CellFileCollection based on a product_id.

Returns a CellFileCollection object initialized with an io_class specified by product_id (case-insensitive).

Parameters:

collections (list of str or Path) – A path to a cell file collection or a list of paths to cell file collections, or a list of CellFileCollection.
product_id (str) – ASCAT ID of the cell file collections.
ioclass_kws (dict, optional) – Keyword arguments to pass to the ioclass initialization.

Raises:

ValueError – If product_id is not recognized.

get_cell_path(cell=None, location_id=None)[source]

Get path to cell file given cell number or location id.

Returns a path to a cell file in the collection’s directory, whether the file exists or not, as long as the cell number or location id is within the grid.

Parameters:

cell (int, optional) – Cell number.
location_id (int, optional) – Location identifier.

Returns:

path – Path to cell file.

Return type:

pathlib.Path

Raises:

ValueError – If neither cell nor location_id is given.
ValueError – If the given cell number or location_id is not within the grid.

read(cell=None, location_id=None, coords=None, bbox=None, geom=None, mask_and_scale=True, date_range=None, **kwargs)[source]

Read data from the collection for a cell, location_id, or set of coordinates.

Parameters:

cell (int) – Grid cell number to read.
location_id (int) – Location id.
coords (tuple) – Tuple of (lat, lon) coordinates.
bbox (tuple) – Tuple of (latmin, latmax, lonmin, lonmax) coordinates.
mask_and_scale (bool, optional) – If True, mask and scale the data according to its scale_factor and _FillValue/missing_value before returning. Default: True.
**kwargs (dict) – Keyword arguments passed to the ioclass.

Returns:

Dataset containing the data for the given cell, location_id, or coordinates.

Return type:

xarray.Dataset

Raises:

ValueError – If neither cell, location_id, nor coords is given.

to_contiguous(out_dir, out_cell_size, processes=8)[source]

class ascat.read_native.ragged_array_ts.CellFileCollectionStack(collections, ioclass, dupe_window=None, dask_scheduler='threads', **kwargs)[source]

Bases: object

Collection of grid cell file collections.

add_collection(collections, product_id=None)[source]

Add a cell file collection to the stack, based on file path.

Parameters:

collections (str or list of str or CellFileCollection) – Path to the cell file collection to add, or a list of paths.
product_id (str, optional) – ASCAT ID of the collections to add. Needed if collections is a string or list of strings.

Raises:

ValueError – If collections is a string or list of strings and product_id is not given.

close()[source]: Close all the collections.

classmethod from_product_id(collections, product_id, dupe_window=None, dask_scheduler=None)[source]

Create a CellFileCollectionStack based on a product_id.

Returns a CellFileCollectionStack object initialized with an io_class specified by product_id (case-insensitive).

Parameters:

collections (list of str or CellFileCollection) – A path to a cell file collection or a list of paths to cell file collections, or a list of CellFileCollection.
product_id (str) – ASCAT ID of the cell file collections. Either this or ioclass must be specified.
dupe_window (numpy.timedelta64) – Time difference between two observations at the same location_id below which the second observation will be considered a duplicate. Will be set to np.timedelta64(“10”, “m”) if None. Default: None
dask_scheduler (str, optional) – Dask scheduler to use for parallel processing. Will be set to “threads” when class is initialized if None. Default: None
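
The effect of dupe_window can be illustrated on a small list of observations. The sketch below uses the standard library instead of NumPy and compares each observation with the last one kept at the same location; that comparison rule, the (location_id, time) layout and the name drop_duplicates are assumptions, not the library's implementation.

```python
from datetime import datetime, timedelta


def drop_duplicates(observations, dupe_window=timedelta(minutes=10)):
    """Drop observations that follow a kept one too closely in time.

    observations : list of (location_id, time) tuples (hypothetical layout)
    """
    kept = []
    last_kept = {}  # location_id -> time of the last observation kept
    for loc, time in sorted(observations):
        prev = last_kept.get(loc)
        if prev is not None and time - prev < dupe_window:
            continue  # closer than dupe_window to the previous one: duplicate
        kept.append((loc, time))
        last_kept[loc] = time
    return kept


t0 = datetime(2020, 1, 1, 12)
observations = [
    (1, t0),
    (1, t0 + timedelta(minutes=5)),   # within 10 minutes of the first
    (1, t0 + timedelta(minutes=20)),
    (2, t0 + timedelta(minutes=5)),   # different location, kept
]
unique = drop_duplicates(observations)
```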

merge_and_write(out_dir, cells=None, date_range=None, out_cell_size=None, processes=8)[source]

Merge the data in all the collections by cell, and write each cell to disk.

Parameters:

out_dir (str or Path) – Path to output directory.
cells (list of int, optional) – Cells to write. If None, write all cells.
date_range (tuple of numpy.datetime64, optional) – Start and end dates to read data for before writing.
out_cell_size (tuple, optional) – Size of the output cells in degrees (assumes they are square). If None, and the component collections all have the same cell size, use that.
processes (int, optional) – Number of processes to use for parallel processing. Default: 8

Raises:

ValueError – If out_cell_size is None and the component collections do not all have the same cell size.

read(cell=None, location_id=None, bbox=None, geom=None, mask_and_scale=True, date_range=None, **kwargs)[source]

Read data for a cell or location_id.

Parameters:

cell (int) – Cell number to read data for.
location_id (int) – Location ID to read data for.
bbox (tuple) – Tuple of (latmin, latmax, lonmin, lonmax) coordinates to read data within.
mask_and_scale (bool, optional) – If True, mask and scale the data according to its scale_factor and _FillValue/missing_value before returning. Default: True.
date_range (tuple of numpy.datetime64, optional) – Start and end dates to read data for.
**kwargs (dict) – Keyword arguments to pass to the read function of the collection

Returns:

Dataset containing the combined data for the given cell or location_id from all the collections in the stack.

Return type:

xarray.Dataset

Raises:

ValueError – If neither cell nor location_id is given.

subcollection_cells(cells=None, out_cell_size=None, date_range=None)[source]

Get the cells that are covered by all the subcollections. If out_cell_size is passed, then it returns the cells in the new cell-scheme that are covered by the subcollections.

Parameters:

cells (list of int, optional) – Cells to check. If None, check all cells.
out_cell_size (int, optional) – The size of the cells in the new cell-scheme.

Returns:

Cells covered by all subcollections.

Return type:

set

class ascat.read_native.ragged_array_ts.IRANcFile(filename, **kwargs)[source]

Bases: RAFile

Indexed ragged array file reader.

property ids

Location IDs property.

Returns:: location_id – Location IDs.
Return type:: numpy.ndarray

property lats

Latitude coordinates property.

Returns:: lat – Latitude coordinates.
Return type:: numpy.ndarray

property lons

Longitude coordinates property.

Returns:: lon – Longitude coordinates.
Return type:: numpy.ndarray

read(location_id, variables=None)[source]

Read a timeseries for a given location_id.

Parameters:

location_id (int) – Location_id to read.
variables (list or None) – A list of parameter-names to read. If None, all parameters are read. The default is None.

Returns:

df – A pandas.DataFrame containing the timeseries for the location_id.

Return type:

pandas.DataFrame
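
As a rough illustration of the indexed ragged array layout this reader works with, the sketch below keeps one index per observation that points into the list of locations, and reads a time series by selecting the observations whose index matches. It uses plain Python lists instead of a netCDF file; the names location_index, obs_time, obs_value and read_indexed are hypothetical and not part of the package.

```python
# Hypothetical in-memory stand-in for an indexed ragged array file.
location_id = [101, 102, 103]          # one entry per location
location_index = [0, 2, 0, 1, 2, 0]    # one entry per observation, pointing into location_id
obs_time = ["2020-01-01", "2020-01-01", "2020-01-02",
            "2020-01-02", "2020-01-03", "2020-01-04"]
obs_value = [0.31, 0.55, 0.29, 0.47, 0.52, 0.33]


def read_indexed(loc_id):
    """Return (time, value) pairs of all observations belonging to loc_id."""
    pos = location_id.index(loc_id)  # raises ValueError for unknown ids
    return [
        (t, v)
        for t, v, idx in zip(obs_time, obs_value, location_index)
        if idx == pos
    ]


series = read_indexed(101)
```

In this layout observations of different locations can be interleaved in any order, which is why the whole index has to be scanned for a single location.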

class ascat.read_native.ragged_array_ts.RAFile(loc_dim_name='locations', obs_dim_name='time', loc_ids_name='location_id', loc_descr_name='location_description', time_units='days since 1900-01-01 00:00:00', time_var='time', lat_var='lat', lon_var='lon', alt_var='alt', cache=False, mask_and_scale=False)[source]

Bases: object

Base class used for Ragged Array (RA) time series data.

class ascat.read_native.ragged_array_ts.SwathFileCollection(path, ioclass, ioclass_kws=None, dask_scheduler=None)[source]

Bases: object

Collection of time-series swath files.

Parameters:

path (str or Path) – Path to the swath file collection.
ioclass (ascat.read_native.xarray_io.SwathIOBase) – IO class to use for reading the data.
ioclass_kws (dict, optional) – Keyword arguments to pass to the ioclass initialization. Default: None
dask_scheduler (str, optional) – Dask scheduler to use for parallel processing in xarray. In testing this just made most things slower, but it may be useful in some cases. Default: None

path

Path to the swath file collection.

Type:: Path

ioclass

IO class to use for reading the data.

Type:: class

ioclass_kws

Keyword arguments to pass to the ioclass initialization. May include ioclass attributes that will override any that are set in the current ioclass.

Type:: dict

grid

Grid object defining the grid the data is on.

Type:: pygeogrids.CellGrid object

ts_dtype

Data types to encode the time series data as when writing.

Type:: numpy.dtype

beams_vars

List of names of the variables that have a beams dimension.

Type:: list of str

date_format

Format of the date in the filename.

Type:: str

cell_fn_format

Format for the names of the cell files that will be written out.

Type:: str

chron_files

Function to search for files in the collection based on their date.

Type:: function

previous_cell

Type:: int or list of int

fid

The currently open instance of self.ioclass.

Type:: ascat.read_native.xarray_io.SwathIOBase object

max_buffer_memory_mb

Maximum amount of memory to use for buffering data when stacking to disk.

Type:: int

close()[source]: Close collection and constituent xarray datasets.

classmethod from_product_id(path, product_id, ioclass_kws=None, dask_scheduler=None)[source]

Create a SwathFileCollection based on a product_id.

Returns a SwathFileCollection object initialized with the ioclass specified by product_id (case-insensitive).

Parameters:

path (str or Path) – Path to the swath file collection.
product_id (str) – Identifier for the specific ASCAT product the swath files are part of.
ioclass_kws (dict, optional) – Keyword arguments to pass to the ioclass initialization. Default: None
dask_scheduler (str, optional) – Dask scheduler to use for parallel processing. Will be set to “threads” when class is initialized if None. Default: None

Raises:

ValueError – If product_id is not recognized.

Examples

>>> my_swath_collection = SwathFileCollection.from_product_id(
...     "/path/to/swath/files",
...     "H129",
... )

get_filenames(start_dt=None, end_dt=None, cell=None, location_id=None, coords=None, bbox=None, geom=None)[source]

Get filenames for the given time range.

Parameters:

start_dt (datetime.datetime) – Start time.
end_dt (datetime.datetime) – End time.

Returns:

fnames – List of filenames.

Return type:

list of pathlib.Path

Raises:

NotImplementedError – If the ioclass does not have a file search method named chron_files.

process(data)[source]

Process a stacked dataset of swath data into a format that is ready to be split into cell timeseries datasets, and return the processed dataset.

Parameters:: data (xarray.Dataset) – Stacked dataset to process.

read(date_range, cell=None, location_id=None, coords=None, bbox=None, geom=None, **kwargs)[source]

Read data for the given date range, optionally restricted to a cell, location_id, coordinates, bounding box, or geometry. A grid point (gpi) can be read directly via location_id, or located as the nearest grid point to the given (lat, lon) coordinates and then read.

If the time range is large, this can be slow. It may make more sense to convert to cell files first and access that data from disk using a CellFileCollection or CellFileCollectionStack.

Parameters:

date_range (tuple of datetime.datetime) – Start and end dates.
cell (int or list of int, optional) – Grid cell number to read.
location_id (int, optional) – Location id.
coords (tuple, optional) – Tuple of (lat, lon) coordinates.
bbox (tuple, optional) – Tuple of (latmin, latmax, lonmin, lonmax) coordinates.
geom (shapely.geometry, optional) – Geometry object; use to read data that intersects the geometry.

stack(out_dir, fnames=None, date_range=None, mode='w', processes=1, buffer_memory_mb=None, dupe_window=None)[source]

Stack swath files and split them into cell timeseries files.

Reads swath files into memory, stacking their datasets in a buffer until the sum of their sizes exceeds self.max_buffer_memory_mb. Then, splits the buffer into cell timeseries datasets, writes them to disk in parallel, and clears the buffer. This process repeats until all files have been processed, with subsequent writes appending new data to existing cell files when appropriate.

Parameters:

out_dir (pathlib.Path) – Output directory to write the stacked files to.
fnames (list of pathlib.Path, optional) – List of swath filenames to stack.
date_range (tuple of datetime.datetime, optional) – Start and end dates to read data for before writing.
mode (str, optional) – Write mode. Default is “w”, which will clear all files from out_dir before processing. Use “a” to append data to existing files (only if those have also been produced by this function).
processes (int, optional) – Number of processes to use for parallel writing. Default is 1.
buffer_memory_mb (numeric, optional) – Maximum amount of memory to use for the buffer, in megabytes. Will be set to self.max_buffer_memory_mb if None. Default is None.
dupe_window (numpy.timedelta64, optional) – Time window within which duplicate observations will be removed. Default is None.

Raises:

ValueError – If mode is not “w” or “a”.
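
The buffering loop described for stack() can be sketched in isolation. This is a minimal illustration, not the package's implementation: buffered_batches is a hypothetical helper, the sizes are made-up numbers in megabytes, and each yielded batch stands in for the point where the buffer would be split into cells, written to disk, and cleared.

```python
def buffered_batches(items, sizes_mb, max_buffer_mb):
    """Group items into batches, flushing once the summed size exceeds the limit."""
    buffer, used = [], 0.0
    for item, size in zip(items, sizes_mb):
        buffer.append(item)
        used += size
        if used > max_buffer_mb:
            # Buffer is over the limit: hand it off and start a new one.
            yield buffer
            buffer, used = [], 0.0
    if buffer:
        # Flush whatever is left after the last file.
        yield buffer


batches = list(
    buffered_batches(["a", "b", "c", "d", "e"], [40, 40, 40, 40, 40], 100)
)
```

Because the check happens after a file is added, a batch can exceed the limit by up to the size of one file, which matches the wording "until the sum of their sizes exceeds" the limit.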

swath_data_generator(start_dt=None, end_dt=None, cell=None, location_id=None, coords=None, bbox=None, geom=None)[source]

Return a generator producing the data for each requested swath file.

Parameters:

start_dt (datetime.datetime) – Start time.
end_dt (datetime.datetime) – End time.
cell (int) – Grid cell number to select.
location_id (int) – Location id.
coords (tuple) – Tuple of (lat, lon) coordinates.
bbox (tuple) – Tuple of (latmin, latmax, lonmin, lonmax) coordinates.
geom (shapely.geometry) – Geometry object; use to select data that intersects the geometry.

Yields:

start_timestamp (numpy.datetime64) – Sensing start time of the swath file.
end_timestamp (numpy.datetime64) – Sensing end time of the swath file.
sat (str) – Satellite name.
data (xarray.Dataset) – Dataset for each swath file intersecting the requested extent.

ascat.read_native.ragged_array_ts.braces_to_re_groups(string)[source]

Convert braces to character patterns defining regular expression groups. If any group name is repeated in the template string, a backreference is used for subsequent appearances.

Parameters:: string (str) – String with braces.
Returns:: string – String with regular expression groups.
Return type:: str

Examples

>>> braces_to_re_groups("{year}-{month}-{day}")
"(?P<year>.+)-(?P<month>.+)-(?P<day>.+)"
>>> braces_to_re_groups("{year}-{month}-{day}_{year}-{month}-{day2}")
"(?P<year>.+)-(?P<month>.+)-(?P<day>.+)_(?P=year)-(?P=month)-(?P<day2>.+)"

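The conversion described for braces_to_re_groups can be approximated with the standard re module. The sketch below is an assumption about how such a function might work, not the package's code: braces_to_re_groups_sketch is a hypothetical name, and literal characters in the template are not escaped.

```python
import re


def braces_to_re_groups_sketch(template):
    """Turn {name} fields into named groups, reusing repeats as backreferences."""
    seen = set()

    def replace(match):
        name = match.group(1)
        if name in seen:
            # Repeated field: must equal the text captured the first time.
            return f"(?P={name})"
        seen.add(name)
        return f"(?P<{name}>.+)"

    return re.sub(r"\{(\w+)\}", replace, template)


pattern = braces_to_re_groups_sketch("{year}-{month}-{day}_{year}-{month}-{day2}")
parsed = re.fullmatch(pattern, "2020-01-05_2020-01-06").groupdict()
```

The backreferences force both halves of the string to share the same year and month, so a string such as "2020-01-05_2021-01-06" would not match this pattern.
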
ascat.read_native.xarray_io module

class ascat.read_native.xarray_io.AscatH121v1Cell(filename, **kwargs)[source]

Bases: AscatNetCDFCellBase

fn_format = '{:04d}.nc'

grid = <fibgrid.realization.FibGrid object>

grid_cell_size = 5

grid_info = {'grid': <fibgrid.realization.FibGrid object>, 'max_cell': np.int16(2591), 'min_cell': np.int16(0), 'possible_cells': array([ 0, 1, 2, ..., 2589, 2590, 2591], shape=(2592,), dtype=int16)}

max_cell = np.int16(2591)

min_cell = np.int16(0)

possible_cells = array([ 0, 1, 2, ..., 2589, 2590, 2591], shape=(2592,), dtype=int16)
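
The attribute values above fit together arithmetically. The sketch below assumes that cells tile the globe in squares of grid_cell_size degrees (an assumption, although it is consistent with the 2592 possible cells listed) and shows how fn_format turns a cell number into a file name.

```python
grid_cell_size = 5          # degrees, as listed above
fn_format = "{:04d}.nc"     # as listed above

# Assumption: square cells covering 360 degrees of longitude and 180 of latitude.
n_cells = (360 // grid_cell_size) * (180 // grid_cell_size)

first_name = fn_format.format(0)
last_name = fn_format.format(n_cells - 1)
```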

class ascat.read_native.xarray_io.AscatH121v1Swath(filename, **kwargs)[source]

Bases: SwathIOBase

beams_vars = []

cell_fn_format = '{:04d}.nc'

date_format = '%Y%m%d%H%M%S'

fn_pattern = 'W_IT-HSAF-ROME,SAT,SSM-ASCAT-METOP{sat}-12.5km-H121_C_LIIB_{placeholder}_{placeholder1}_{date}____.nc'

static fn_read_fmt(timestamp)[source]: TODO: figure out a sane way to describe what this does. Also decide if this /needs/ to be enforced. If the user doesn’t want to use all the filesearch functionality (or if they want to use their own filesearch logic), then they should still be able to use this class. They could of course override this and just return None, but that seems like a hack.

grid = <fibgrid.realization.FibGrid object>

grid_cell_size = 5

grid_sampling_km = 12.5

sf_pattern = {'satellite_folder': 'metop_[abc]', 'year_folder': '{year}'}

static sf_read_fmt(timestamp)[source]: TODO: same as above

ts_dtype = dtype([('sat_id', 'i1'), ('as_des_pass', 'i1'), ('swath_indicator', 'i1'), ('surface_soil_moisture', '<f4'), ('surface_soil_moisture_noise', '<f4'), ('backscatter40', '<f4'), ('slope40', '<f4'), ('curvature40', '<f4'), ('surface_soil_moisture_sensitivity', '<f4'), ('backscatter_flag', 'u1'), ('correction_flag', 'u1'), ('processing_flag', 'u1'), ('surface_flag', 'u1'), ('snow_cover_probability', 'i1'), ('frozen_soil_probability', 'i1'), ('wetland_fraction', 'i1'), ('topographic_complexity', 'i1'), ('subsurface_scattering_probability', 'i1')])

class ascat.read_native.xarray_io.AscatH122Cell(filename, **kwargs)[source]

Bases: AscatNetCDFCellBase

fn_format = '{:04d}.nc'

grid = <fibgrid.realization.FibGrid object>

grid_cell_size = 5

grid_info = {'grid': <fibgrid.realization.FibGrid object>, 'max_cell': np.int16(2591), 'min_cell': np.int16(0), 'possible_cells': array([ 0, 1, 2, ..., 2589, 2590, 2591], shape=(2592,), dtype=int16)}

max_cell = np.int16(2591)

min_cell = np.int16(0)

possible_cells = array([ 0, 1, 2, ..., 2589, 2590, 2591], shape=(2592,), dtype=int16)

class ascat.read_native.xarray_io.AscatH122Swath(filename, **kwargs)[source]

Bases: SwathIOBase

beams_vars = []

cell_fn_format = '{:04d}.nc'

date_format = '%Y%m%d%H%M%S'

fn_pattern = 'ascat_ssm_nrt_6.25km_{placeholder}Z_{date}Z_metop-{sat}_h122.nc'

static fn_read_fmt(timestamp)[source]: TODO: figure out a sane way to describe what this does. Also decide if this /needs/ to be enforced. If the user doesn’t want to use all the filesearch functionality (or if they want to use their own filesearch logic), then they should still be able to use this class. They could of course override this and just return None, but that seems like a hack.

grid = <fibgrid.realization.FibGrid object>

grid_cell_size = 5

grid_sampling_km = 6.25

sf_pattern = {'satellite_folder': 'metop_[abc]', 'year_folder': '{year}'}

static sf_read_fmt(timestamp)[source]: TODO: same as above

ts_dtype = dtype([('sat_id', '<i8'), ('as_des_pass', 'i1'), ('swath_indicator', 'i1'), ('surface_soil_moisture', '<f4'), ('surface_soil_moisture_noise', '<f4'), ('sigma40', '<f4'), ('sigma40_noise', '<f4'), ('slope40', '<f4'), ('slope40_noise', '<f4'), ('curvature40', '<f4'), ('curvature40_noise', '<f4'), ('dry40', '<f4'), ('dry40_noise', '<f4'), ('wet40', '<f4'), ('wet40_noise', '<f4'), ('surface_soil_moisture_sensitivity', '<f4'), ('surface_soil_moisture_climatology', '<f4'), ('correction_flag', 'u1'), ('processing_flag', 'u1'), ('snow_cover_probability', 'i1'), ('frozen_soil_probability', 'i1'), ('wetland_fraction', 'i1'), ('topographic_complexity', 'i1')])

class ascat.read_native.xarray_io.AscatH129Cell(filename, **kwargs)[source]

Bases: AscatNetCDFCellBase

fn_format = '{:04d}.nc'

grid = <fibgrid.realization.FibGrid object>

grid_cell_size = 5

grid_info = {'grid': <fibgrid.realization.FibGrid object>, 'max_cell': np.int16(2591), 'min_cell': np.int16(0), 'possible_cells': array([ 0, 1, 2, ..., 2589, 2590, 2591], shape=(2592,), dtype=int16)}

max_cell = np.int16(2591)

min_cell = np.int16(0)

possible_cells = array([ 0, 1, 2, ..., 2589, 2590, 2591], shape=(2592,), dtype=int16)

class ascat.read_native.xarray_io.AscatH129Swath(filename, **kwargs)[source]

Bases: SwathIOBase

beams_vars = ['backscatter', 'incidence_angle', 'azimuth_angle', 'kp']

cell_fn_format = '{:04d}.nc'

date_format = '%Y%m%d%H%M%S'

fn_pattern = 'W_IT-HSAF-ROME,SAT,SSM-ASCAT-METOP{sat}-6.25-H129_C_LIIB_{date}_{placeholder}_{placeholder1}____.nc'

static fn_read_fmt(timestamp)[source]: TODO: figure out a sane way to describe what this does. Also decide if this /needs/ to be enforced. If the user doesn’t want to use all the filesearch functionality (or if they want to use their own filesearch logic), then they should still be able to use this class. They could of course override this and just return None, but that seems like a hack.

grid = <fibgrid.realization.FibGrid object>

grid_cell_size = 5

grid_sampling_km = 6.25

sf_pattern = {'satellite_folder': 'metop_[abc]', 'year_folder': '{year}'}

static sf_read_fmt(timestamp)[source]: TODO: same as above

ts_dtype = dtype([('sat_id', 'i1'), ('as_des_pass', 'i1'), ('swath_indicator', 'i1'), ('backscatter_for', '<f4'), ('backscatter_mid', '<f4'), ('backscatter_aft', '<f4'), ('incidence_angle_for', '<f4'), ('incidence_angle_mid', '<f4'), ('incidence_angle_aft', '<f4'), ('azimuth_angle_for', '<f4'), ('azimuth_angle_mid', '<f4'), ('azimuth_angle_aft', '<f4'), ('kp_for', '<f4'), ('kp_mid', '<f4'), ('kp_aft', '<f4'), ('surface_soil_moisture', '<f4'), ('surface_soil_moisture_noise', '<f4'), ('backscatter40', '<f4'), ('slope40', '<f4'), ('curvature40', '<f4'), ('surface_soil_moisture_sensitivity', '<f4'), ('correction_flag', 'u1'), ('processing_flag', 'u1'), ('surface_flag', 'u1'), ('snow_cover_probability', 'i1'), ('frozen_soil_probability', 'i1'), ('wetland_fraction', 'i1'), ('topographic_complexity', 'i1')])
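
The relation between beams_vars and the per-beam fields in ts_dtype can be illustrated as follows. The suffixes for, mid and aft are taken from the field names listed above; the helper expand_beam_vars is hypothetical and only shows the naming scheme, not how the package actually builds the dtype.

```python
BEAMS = ("for", "mid", "aft")  # suffixes seen in the ts_dtype field names


def expand_beam_vars(beams_vars):
    """Return one field name per variable and beam, e.g. 'kp' -> 'kp_for', ..."""
    return [f"{var}_{beam}" for var in beams_vars for beam in BEAMS]


fields = expand_beam_vars(["backscatter", "incidence_angle", "azimuth_angle", "kp"])
```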

class ascat.read_native.xarray_io.AscatH129v1Cell(filename, **kwargs)[source]

Bases: AscatNetCDFCellBase

fn_format = '{:04d}.nc'

grid = <fibgrid.realization.FibGrid object>

grid_cell_size = 5

grid_info = {'grid': <fibgrid.realization.FibGrid object>, 'max_cell': np.int16(2591), 'min_cell': np.int16(0), 'possible_cells': array([ 0, 1, 2, ..., 2589, 2590, 2591], shape=(2592,), dtype=int16)}

max_cell = np.int16(2591)

min_cell = np.int16(0)

possible_cells = array([ 0, 1, 2, ..., 2589, 2590, 2591], shape=(2592,), dtype=int16)

class ascat.read_native.xarray_io.AscatH129v1Swath(filename, **kwargs)[source]

Bases: SwathIOBase

beams_vars = []

cell_fn_format = '{:04d}.nc'

date_format = '%Y%m%d%H%M%S'

fn_pattern = 'W_IT-HSAF-ROME,SAT,SSM-ASCAT-METOP{sat}-6.25km-H129_C_LIIB_{placeholder}_{placeholder1}_{date}____.nc'

static fn_read_fmt(timestamp)[source]: TODO: figure out a sane way to describe what this does. Also decide if this /needs/ to be enforced. If the user doesn’t want to use all the filesearch functionality (or if they want to use their own filesearch logic), then they should still be able to use this class. They could of course override this and just return None, but that seems like a hack.

grid = <fibgrid.realization.FibGrid object>

grid_cell_size = 5

grid_sampling_km = 6.25

sf_pattern = {'satellite_folder': 'metop_[abc]', 'year_folder': '{year}'}

static sf_read_fmt(timestamp)[source]: TODO: same as above

ts_dtype = dtype([('sat_id', 'i1'), ('as_des_pass', 'i1'), ('swath_indicator', 'i1'), ('surface_soil_moisture', '<f4'), ('surface_soil_moisture_noise', '<f4'), ('backscatter40', '<f4'), ('slope40', '<f4'), ('curvature40', '<f4'), ('surface_soil_moisture_sensitivity', '<f4'), ('backscatter_flag', 'u1'), ('correction_flag', 'u1'), ('processing_flag', 'u1'), ('surface_flag', 'u1'), ('snow_cover_probability', 'i1'), ('frozen_soil_probability', 'i1'), ('wetland_fraction', 'i1'), ('topographic_complexity', 'i1'), ('subsurface_scattering_probability', 'i1')])

class ascat.read_native.xarray_io.AscatNetCDFCellBase(filename, **kwargs)[source]

Bases: RaggedXArrayCellIOBase

read(date_range=None, location_id=None, mask_and_scale=True)[source]

Read data from netCDF4 file.

Read all or a subset of data from a netCDF4 file, with subset specified by the location_id argument.

Parameters:

date_range (tuple of datetime.datetime, optional) – Date range to read data for. If None, all data is read.
location_id (int or list of int, optional) – The location_id(s) to read data for. If None, all data is read. Default is None.
mask_and_scale (bool, optional) – If True, mask and scale the data according to its scale_factor and _FillValue/missing_value before returning. Default: True.
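
What mask_and_scale does to stored values can be sketched with plain Python. This follows the usual netCDF convention (fill values become NaN, other values are multiplied by scale_factor and shifted by add_offset); the function name and the sample numbers are made up, and the package delegates the real work to its netCDF backend.

```python
import math


def mask_and_scale_sketch(raw, scale_factor=1.0, add_offset=0.0, fill_value=None):
    """Replace fill values with NaN and unpack the remaining stored values."""
    return [
        math.nan if value == fill_value else value * scale_factor + add_offset
        for value in raw
    ]


decoded = mask_and_scale_sketch([50, 120, -128], scale_factor=0.5, fill_value=-128)
```

With mask_and_scale=False the caller would instead receive the raw stored integers, including the fill value.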

write(filename, ra_type='indexed', **kwargs)[source]

Write data to a netCDF file.

Parameters:

filename (str) – Output filename.
ra_type (str, optional) – Type of ragged array to write. Default is “indexed”.
**kwargs (dict) – Additional keyword arguments passed to xarray.to_netcdf().

class ascat.read_native.xarray_io.AscatSIG0Cell12500m(filename, **kwargs)[source]

Bases: AscatNetCDFCellBase

fn_format = '{:04d}.nc'

grid = <fibgrid.realization.FibGrid object>

grid_cell_size = 5

grid_info = {'grid': <fibgrid.realization.FibGrid object>, 'max_cell': np.int16(2591), 'min_cell': np.int16(0), 'possible_cells': array([ 0, 1, 2, ..., 2589, 2590, 2591], shape=(2592,), dtype=int16)}

max_cell = np.int16(2591)

min_cell = np.int16(0)

possible_cells = array([ 0, 1, 2, ..., 2589, 2590, 2591], shape=(2592,), dtype=int16)

class ascat.read_native.xarray_io.AscatSIG0Cell6250m(filename, **kwargs)[source]

Bases: AscatNetCDFCellBase

fn_format = '{:04d}.nc'

grid = <fibgrid.realization.FibGrid object>

grid_cell_size = 5

grid_info = {'grid': <fibgrid.realization.FibGrid object>, 'max_cell': np.int16(2591), 'min_cell': np.int16(0), 'possible_cells': array([ 0, 1, 2, ..., 2589, 2590, 2591], shape=(2592,), dtype=int16)}

max_cell = np.int16(2591)

min_cell = np.int16(0)

possible_cells = array([ 0, 1, 2, ..., 2589, 2590, 2591], shape=(2592,), dtype=int16)

class ascat.read_native.xarray_io.AscatSIG0Swath12500m(filename, **kwargs)[source]

Bases: SwathIOBase

Class for reading and writing ASCAT sigma0 swath data.

beams_vars = ['backscatter', 'backscatter_std', 'incidence_angle', 'azimuth_angle', 'kp', 'n_echos', 'all_backscatter', 'all_backscatter_std', 'all_incidence_angle', 'all_azimuth_angle', 'all_kp', 'all_n_echos']

cell_fn_format = '{:04d}.nc'

date_format = '%Y%m%d%H%M%S'

fn_pattern = 'W_IT-HSAF-ROME,SAT,SIG0-ASCAT-METOP{sat}-12.5_C_LIIB_{placeholder}_{placeholder1}_{date}____.nc'

static fn_read_fmt(timestamp)[source]

Format a timestamp to search as YYYYMMDD*, for use in a regex that will match all files covering a single given date.

Parameters:: timestamp (datetime.datetime) – Timestamp to format
Returns:: Dictionary of formatted strings
Return type:: dict
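
A sketch of how such a dictionary could be combined with fn_pattern to find the files of one day. The keys mirror the fields in fn_pattern above, but the returned values, the use of glob-style matching via fnmatch, and the example file name are all assumptions made for illustration.

```python
from datetime import datetime
from fnmatch import fnmatchcase

FN_PATTERN = (
    "W_IT-HSAF-ROME,SAT,SIG0-ASCAT-METOP{sat}-12.5_C_LIIB_"
    "{placeholder}_{placeholder1}_{date}____.nc"
)


def fn_read_fmt_sketch(timestamp):
    """Hypothetical formatter: fix the day, leave everything else as wildcards."""
    return {
        "date": timestamp.strftime("%Y%m%d*"),
        "sat": "[ABC]",
        "placeholder": "*",
        "placeholder1": "*",
    }


search = FN_PATTERN.format(**fn_read_fmt_sketch(datetime(2021, 1, 15)))

# Made-up file name that follows the pattern above.
example = (
    "W_IT-HSAF-ROME,SAT,SIG0-ASCAT-METOPB-12.5_C_LIIB_"
    "20210116000000_20210115223000_20210115231500____.nc"
)
is_match = fnmatchcase(example, search)
```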

grid = <fibgrid.realization.FibGrid object>

grid_cell_size = 5

grid_sampling_km = 12.5

sf_pattern = {'satellite_folder': 'metop_[abc]', 'year_folder': '{year}'}

static sf_read_fmt(timestamp)[source]: TODO: same as above

ts_dtype = dtype([('sat_id', 'i1'), ('as_des_pass', 'i1'), ('swath_indicator', 'i1'), ('backscatter_for', '<f4'), ('backscatter_mid', '<f4'), ('backscatter_aft', '<f4'), ('backscatter_std_for', '<f4'), ('backscatter_std_mid', '<f4'), ('backscatter_std_aft', '<f4'), ('incidence_angle_for', '<f4'), ('incidence_angle_mid', '<f4'), ('incidence_angle_aft', '<f4'), ('azimuth_angle_for', '<f4'), ('azimuth_angle_mid', '<f4'), ('azimuth_angle_aft', '<f4'), ('kp_for', '<f4'), ('kp_mid', '<f4'), ('kp_aft', '<f4'), ('n_echos_for', 'i1'), ('n_echos_mid', 'i1'), ('n_echos_aft', 'i1'), ('all_backscatter_for', '<f4'), ('all_backscatter_mid', '<f4'), ('all_backscatter_aft', '<f4'), ('all_backscatter_std_for', '<f4'), ('all_backscatter_std_mid', '<f4'), ('all_backscatter_std_aft', '<f4'), ('all_incidence_angle_for', '<f4'), ('all_incidence_angle_mid', '<f4'), ('all_incidence_angle_aft', '<f4'), ('all_azimuth_angle_for', '<f4'), ('all_azimuth_angle_mid', '<f4'), ('all_azimuth_angle_aft', '<f4'), ('all_kp_for', '<f4'), ('all_kp_mid', '<f4'), ('all_kp_aft', '<f4'), ('all_n_echos_for', 'i1'), ('all_n_echos_mid', 'i1'), ('all_n_echos_aft', 'i1')])

class ascat.read_native.xarray_io.AscatSIG0Swath6250m(filename, **kwargs)[source]

Bases: SwathIOBase

Class for reading ASCAT sigma0 swath data and writing it to cells.

beams_vars = ['backscatter', 'backscatter_std', 'incidence_angle', 'azimuth_angle', 'kp', 'n_echos', 'all_backscatter', 'all_backscatter_std', 'all_incidence_angle', 'all_azimuth_angle', 'all_kp', 'all_n_echos']

cell_fn_format = '{:04d}.nc'

date_format = '%Y%m%d%H%M%S'

fn_pattern = 'W_IT-HSAF-ROME,SAT,SIG0-ASCAT-METOP{sat}-6.25_C_LIIB_{placeholder}_{placeholder1}_{date}____.nc'

static fn_read_fmt(timestamp)[source]

Format a timestamp to search as YYYYMMDD*, for use in a regex that will match all files covering a single given date.

Parameters:: timestamp (datetime.datetime) – Timestamp to format
Returns:: Dictionary of formatted strings
Return type:: dict

grid = <fibgrid.realization.FibGrid object>

grid_cell_size = 5

grid_sampling_km = 6.25

sf_pattern = {'satellite_folder': 'metop_[abc]', 'year_folder': '{year}'}

static sf_read_fmt(timestamp)[source]: TODO: same as above

ts_dtype = dtype([('sat_id', 'i1'), ('as_des_pass', 'i1'), ('swath_indicator', 'i1'), ('backscatter_for', '<f4'), ('backscatter_mid', '<f4'), ('backscatter_aft', '<f4'), ('backscatter_std_for', '<f4'), ('backscatter_std_mid', '<f4'), ('backscatter_std_aft', '<f4'), ('incidence_angle_for', '<f4'), ('incidence_angle_mid', '<f4'), ('incidence_angle_aft', '<f4'), ('azimuth_angle_for', '<f4'), ('azimuth_angle_mid', '<f4'), ('azimuth_angle_aft', '<f4'), ('kp_for', '<f4'), ('kp_mid', '<f4'), ('kp_aft', '<f4'), ('n_echos_for', 'i1'), ('n_echos_mid', 'i1'), ('n_echos_aft', 'i1'), ('all_backscatter_for', '<f4'), ('all_backscatter_mid', '<f4'), ('all_backscatter_aft', '<f4'), ('all_backscatter_std_for', '<f4'), ('all_backscatter_std_mid', '<f4'), ('all_backscatter_std_aft', '<f4'), ('all_incidence_angle_for', '<f4'), ('all_incidence_angle_mid', '<f4'), ('all_incidence_angle_aft', '<f4'), ('all_azimuth_angle_for', '<f4'), ('all_azimuth_angle_mid', '<f4'), ('all_azimuth_angle_aft', '<f4'), ('all_kp_for', '<f4'), ('all_kp_mid', '<f4'), ('all_kp_aft', '<f4'), ('all_n_echos_for', 'i1'), ('all_n_echos_mid', 'i1'), ('all_n_echos_aft', 'i1')])

class ascat.read_native.xarray_io.CellGridCache[source]

Bases: object

Cache for CellGrid objects.

fetch_or_store(key, cell_grid_type=None, *args)[source]: Fetch a CellGrid object from the cache given a key, or store a new one.

class ascat.read_native.xarray_io.RaggedXArrayCellIOBase(source, engine, obs_dim='time', **kwargs)[source]

Bases: ABC

Base class for ascat xarray IO classes.

source

Input filename(s).

Type:: str, Path, list

engine

Engine to use for reading/writing files.

Type:: str

close()[source]: Close file.

property date_range

Return date range of dataset.

Returns:: Date range of dataset.
Return type:: tuple

abstractmethod read(location_id=None, **kwargs)[source]

Read data from file. Should be implemented by subclasses.

Parameters:

location_id (int, list, optional) – Location id(s) to read.
**kwargs – Additional keyword arguments passed to the read method.

Returns:

Dataset containing the data for any specified location_id(s), or all location_ids in the file if none are specified.

Return type:

xarray.Dataset

abstractmethod write(filename, ra_type, **kwargs)[source]

Write data to file. Should be implemented by subclasses.

Parameters:

filename (str, Path) – Filename to write data to.
ra_type (str, optional) – Type of ragged array to write.
**kwargs – Additional keyword arguments passed to the write method.

class ascat.read_native.xarray_io.SwathIOBase(source, engine, **kwargs)[source]

Bases: ABC

Base class for reading swath data. Writes ragged array cell data in indexed or contiguous format.

beams_vars = []

classmethod chron_files(path)[source]: Return a ChronFiles object for this class type based on a path.

close()[source]: Close the dataset.

static combine_attributes(attrs_list, context)[source]

Decides which attributes to keep when merging swath files.

Parameters:

attrs_list (list of dict) – List of attributes dictionaries.
context (None) – This currently is None, but will eventually be passed information about the context in which this was called. (see https://github.com/pydata/xarray/issues/6679#issuecomment-1150946521)

contains_location_ids(location_ids=None, lookup_vector=None)[source]

Check if the dataset contains any of the given location_ids.

Parameters:: location_ids (list of int) – Location ids to check.
Returns:: True if the dataset contains any of the given location_ids, False otherwise.
Return type:: bool

abstractmethod static fn_read_fmt()[source]: TODO: figure out a sane way to describe what this does. Also decide if this /needs/ to be enforced. If the user doesn’t want to use all the filesearch functionality (or if they want to use their own filesearch logic), then they should still be able to use this class. They could of course override this and just return None, but that seems like a hack.

read(cell=None, location_id=None, mask_and_scale=True, lookup_vector=None)[source]

Returns data for a cell or location_id if specified, or for the entire swath file if not specified.

Parameters:

cell (int, optional) – Cell to read data for.
location_id (int, optional) – Location id to read data for.
mask_and_scale (bool, optional) – Whether to mask and scale the data. Default is True.

abstractmethod static sf_read_fmt()[source]: TODO: same as above

write(filename, mode='w', **kwargs)[source]

ascat.read_native.xarray_io.append_to_netcdf(filename, ds_to_append, unlimited_dim)[source]

Appends an xarray dataset to an existing netCDF file along a given unlimited dim.

Parameters:

filename (str or Path) – Filename of netCDF file to append to.
ds_to_append (xarray.Dataset) – Dataset to append.
unlimited_dim (str or list of str) – Name of the unlimited dimension to append along.

Raises:

ValueError – If more than one unlimited dim is given.

ascat.read_native.xarray_io.create_variable_encodings(ds, custom_variable_encodings=None, custom_dtypes=None)[source]

Create an encoding dictionary for a dataset, optionally overriding the default encoding or adding additional encoding parameters. New parameters cannot be added to default encoding for a variable, only overridden.

E.g. if you want to add a “units” encoding to “lon”, you should also pass “dtype”, “zlib”, “complevel”, and “_FillValue” if you don’t want to lose those.

Parameters:

ds (xarray.Dataset) – Dataset.
custom_variable_encodings (dict, optional) – Custom encodings.

Returns:

ds – Dataset with encodings.

Return type:

xarray.Dataset
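
The override behaviour described above (a custom encoding replaces the default for that variable rather than being merged into it) can be sketched with plain dictionaries. The default values and the helper build_encodings are hypothetical; the package's real defaults are not shown in this document.

```python
# Hypothetical defaults, for illustration only.
DEFAULT_ENCODINGS = {
    "lon": {"dtype": "float32", "zlib": True, "complevel": 4, "_FillValue": None},
    "lat": {"dtype": "float32", "zlib": True, "complevel": 4, "_FillValue": None},
}


def build_encodings(var_names, custom=None):
    """Use the custom encoding for a variable if given, else its default."""
    custom = custom or {}
    return {
        name: dict(custom.get(name, DEFAULT_ENCODINGS.get(name, {})))
        for name in var_names
    }


# Passing only "units" drops dtype, zlib, complevel and _FillValue for "lon".
lossy = build_encodings(["lon", "lat"], {"lon": {"units": "degrees_east"}})

# Repeating the defaults alongside the new key keeps them.
kept = build_encodings(
    ["lon", "lat"],
    {"lon": {**DEFAULT_ENCODINGS["lon"], "units": "degrees_east"}},
)
```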

ascat.read_native.xarray_io.get_swath_product_id(filename)[source]

Get product identifier from filename.

Parameters:: filename (str) – Filename.
Returns:: product_id – Product identifier.
Return type:: str

ascat.read_native.xarray_io.set_attributes(ds, variable_attributes=None, global_attributes=None)[source]

Parameters:

ds (xarray.Dataset, Path) – Dataset.
variable_attributes (dict, optional) – User-defined variable attributes to set. Should be a dictionary with format {“varname”: {“attr1”: “value1”, “attr2”: “value2”}, “varname2”: {“attr1”: “value1”}}
global_attributes (dict, optional) – User-defined global attributes to set. Should be a dictionary with format {“attr1”: “value1”, “attr2”: “value2”}

Returns:

ds – Dataset with variable_attributes.

Return type:

xarray.Dataset

ascat.read_native.xarray_io.trim_dates(ds, date_range)[source]

Trim dates of dataset to a given date range. Assumes the time variable is named “time”, and observation dimension is named “obs”.

Parameters:

ds (xarray.Dataset) – Dataset.
date_range (tuple of datetime.datetime) – Date range to trim to.

Returns:

Dataset with trimmed dates.

Return type:

xarray.Dataset
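
The idea behind trimming can be shown on plain lists. This sketch assumes both ends of date_range are inclusive, which the description above does not state; trim_dates_sketch is a hypothetical stand-in that works on lists rather than an xarray.Dataset.

```python
from datetime import datetime


def trim_dates_sketch(times, values, date_range):
    """Keep the observations whose time lies within date_range (inclusive)."""
    start, end = date_range
    return [(t, v) for t, v in zip(times, values) if start <= t <= end]


times = [datetime(2020, 1, day) for day in (1, 2, 3, 4)]
values = [0.1, 0.2, 0.3, 0.4]
trimmed = trim_dates_sketch(times, values, (datetime(2020, 1, 2), datetime(2020, 1, 3)))
```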

ascat.read_native.xarray_io.var_order(ds)[source]

Returns a reasonable variable order for a ragged array dataset, based on that used in existing datasets.

Puts the count/index variable first depending on the ragged array type, then lon, lat, alt, location_id, location_description, and time, followed by the rest of the variables in the dataset.

Parameters:: ds (xarray.Dataset) – Dataset.
Returns:: order – List of dataset variable names in the determined order.
Return type:: list of str
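
The ordering rule can be sketched on a list of names. The count and index variable names used here (row_size and locationIndex) are assumptions, as is the helper var_order_sketch, which takes names instead of a dataset.

```python
# Assumed names for the count/index variables, followed by the order given above.
LEADING = [
    "row_size", "locationIndex",
    "lon", "lat", "alt", "location_id", "location_description", "time",
]


def var_order_sketch(names):
    """Put the known leading variables first, then the rest in original order."""
    first = [name for name in LEADING if name in names]
    rest = [name for name in names if name not in LEADING]
    return first + rest


order = var_order_sketch(
    ["surface_soil_moisture", "time", "lat", "locationIndex", "lon", "location_id"]
)
```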
