ecmwf_models package

Submodules

ecmwf_models.cli module

ecmwf_models.extract module

ecmwf_models.extract.create_dt_fpath(dt, root, fname, subdirs=[])[source]

Create filepaths from root + fname and a list of subdirectories. fname and subdirs will be put through dt.strftime.

Parameters:

dt (datetime.datetime) – Date used as the basis for the path
root (string) – Root of the filepath
fname (string) – Filename to use
subdirs (list, optional) – List of strings. Each element represents a subdirectory. For example, the list [‘%Y’, ‘%m’] leads to a path of root/YYYY/MM/fname; for a dt of datetime(2000, 12, 31) this is root/2000/12/fname.

Returns:

fpath – Full filename including path

Return type:

string
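
The path construction described above can be sketched with the standard library alone. This is a minimal illustration, assuming that each subdirectory template and the filename are expanded with dt.strftime and then joined onto root. The helper name sketch_dt_fpath and the example filename are hypothetical and not part of ecmwf_models.

```python
import os
from datetime import datetime


def sketch_dt_fpath(dt, root, fname, subdirs=None):
    """Hypothetical stand-in that mirrors the documented behaviour."""
    # Avoid a mutable default argument; treat None as "no subdirectories".
    subdirs = subdirs or []
    # Expand each subdirectory template with the date, e.g. '%Y' -> '2000'.
    parts = [dt.strftime(sub) for sub in subdirs]
    # The filename may contain strftime codes as well.
    return os.path.join(root, *parts, dt.strftime(fname))


fpath = sketch_dt_fpath(
    datetime(2000, 12, 31), "root", "img_%Y%m%d.nc", subdirs=["%Y", "%m"]
)
print(fpath)
```

Passing subdirs as None by default, rather than a literal empty list, is a choice made for this sketch only; it sidesteps Python's shared mutable default.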

ecmwf_models.extract.save_gribs_from_grib(input_grib, output_path, product_name, keep_original=True, keep_prelim=True)[source]

Split the downloaded grib file into daily files and add to folder structure necessary for reshuffling.

Parameters:

input_grib (str) – Filepath of the downloaded .grb file
output_path (str) – Where to save the resulting grib files
product_name (str) – Name of the ECMWF model (only for filename generation)
keep_original (bool) – keep the original downloaded data too, before it is sliced into individual images.
keep_prelim (bool, optional (default: True)) – True to keep preliminary data from ERA5T with a different file name, or False to drop these files and only keep the final records.

ecmwf_models.extract.save_ncs_from_nc(input_nc, output_path, product_name, grid=None, keep_original=True, remap_method='bil', keep_prelim=True)[source]

Split the downloaded netcdf file into daily files and add to folder structure necessary for reshuffling.

Parameters:

input_nc (str) – Filepath of the downloaded .nc file
output_path (str) – Where to save the resulting netcdf files
product_name (str) – Name of the ECMWF model (only for filename generation)
keep_original (bool) – keep the original downloaded data too, before it is sliced into individual images.
keep_prelim (bool, optional (default: True)) – True to keep preliminary data from ERA5T with a different file name, or False to drop these files and only keep the final records.

ecmwf_models.extract.unzip_nc(input_zip, output_nc)[source]

Unzip and merge all netcdf files downloaded from CDS. If the zip file contains only 1 netcdf file, it is only extracted.

Parameters:

input_zip (str) – Path to the downloaded zip file containing one or more (datastream) netcdf files.
output_nc (str) – Path to the netcdf file to write

ecmwf_models.globals module

exception ecmwf_models.globals.CdoNotFoundError(msg=None)[source]: Bases: ModuleNotFoundError

exception ecmwf_models.globals.PygribNotFoundError(msg=None)[source]: Bases: ModuleNotFoundError

ecmwf_models.grid module

Common grid definitions for ECMWF model reanalysis products (regular gridded)

ecmwf_models.grid.ERA5_RegularImgLandGrid(resolution: float = 0.25, bbox: Tuple[float, float, float, float] = None, cellsize: float = 5.0) → CellGrid[source]

Uses the 0.25 DEG ERA5 land mask to create a land grid of the same size, which also excludes Antarctica.

Parameters:

resolution (float, optional (default: 0.25)) – Grid resolution in degrees. Either 0.25 (ERA5) or 0.1 (ERA5-Land)
bbox (tuple, optional (default: None)) – WGS84 bounding box (min_lon, min_lat, max_lon, max_lat) to cut the global grid to. Longitudes must be between -180 and 180, latitudes between -90 and 90.
cellsize (float, optional (default: 5.)) – Chunk size of the grid

Returns:

landgrid – ERA Land grid at the given resolution, cut to the given bounding box

Return type:

CellGrid

ecmwf_models.grid.ERA_RegularImgGrid(resolution: float = 0.25, bbox: Tuple[float, float, float, float] = None, cellsize: float = 5.0) → CellGrid[source]

Create regular cell grid for bounding box with the selected resolution. GPI 0 is at Lon: 0, Lat: 90

Parameters:

resolution (float, optional (default: 0.25)) – Grid resolution (in degrees) in both directions. Either 0.25 (ERA5) or 0.1 (ERA5-Land)
bbox (tuple, optional (default: None)) – WGS84 bounding box (min_lon, min_lat, max_lon, max_lat), with longitudes from -180 to 180, to cut the global grid to.
cellsize (float, optional (default: 5.)) – Cell chunking of the grid

Returns:

CellGrid – Regular CellGrid with 5DEG*5DEG cells for the passed bounding box.

Return type:

CellGrid

ecmwf_models.grid.get_grid_resolution(lats: numpy.ndarray, lons: numpy.ndarray) → (float, float)[source]: Try to derive the grid resolution from the given coordinates.

ecmwf_models.grid.safe_arange(start, stop, step)[source]

Like numpy.arange, but floating point precision is kept. Compare: np.arange(0, 100, 0.01)[-1] vs safe_arange(0, 100, 0.01)[-1]

Parameters:

start (float) – Start of interval
stop (float) – End of interval (not included)
step (float) – Stepsize

Returns:

arange – Range of values in interval at the given step size / sampling

Return type:

np.array
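
The precision issue mentioned above can be illustrated without numpy. The sketch below assumes one common remedy: count in whole steps and scale back once per value, so that rounding errors do not accumulate. The names scaled_arange and accumulated_arange are hypothetical, and this is not necessarily how safe_arange is implemented.

```python
import math


def scaled_arange(start, stop, step):
    """Hypothetical pure-Python sketch; not the ecmwf_models implementation."""
    factor = 1.0 / float(step)
    first = float(start) * factor
    last = float(stop) * factor
    # Number of steps in the half-open interval [start, stop).
    n = max(0, math.ceil(last - first))
    # Count in whole steps, then scale back once per value.
    return [(first + i) / factor for i in range(n)]


def accumulated_arange(start, stop, step):
    """Naive variant that adds the step repeatedly and accumulates error."""
    values, current = [], float(start)
    while current < stop:
        values.append(current)
        current += step
    return values


# Compare the last elements of both variants for a fine step size.
print(scaled_arange(0, 100, 0.01)[-1])
print(accumulated_arange(0, 100, 0.01)[-1])
```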

ecmwf_models.grid.trafo_lon(lon)[source]

Transform longitudes from the 0…360 range to the 0…180…-180 range.

Parameters:: lon (np.array) – Longitude array
Returns:: lon_transformed – Transformed longitude array
Return type:: np.array
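
A minimal sketch of this longitude transformation, written for plain Python lists instead of numpy arrays. It assumes that values above 180 are shifted by -360 and that 180 itself is left unchanged; the function name sketch_trafo_lon is hypothetical.

```python
def sketch_trafo_lon(lons):
    """Hypothetical list-based sketch of the 0...360 -> -180...180 mapping."""
    # Values above 180 wrap to the negative range; all others stay as they are.
    return [lon - 360.0 if lon > 180.0 else lon for lon in lons]


transformed = sketch_trafo_lon([0.0, 90.0, 180.0, 270.0, 359.75])
print(transformed)
```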

ecmwf_models.interface module

Base classes for reading downloaded ERA netcdf and grib images and stacks

class ecmwf_models.interface.ERAGrbDs(root_path, product, parameter=None, subgrid=None, mask_seapoints=False, h_steps=(0, 6, 12, 18), array_1D=True)[source]

Bases: MultiTemporalImageBase

tstamps_for_daterange(start_date, end_date)[source]

Get datetimes in the correct sub-daily resolution between 2 dates

Parameters:

start_date (datetime) – Start datetime
end_date (datetime) – End datetime

Returns:

timestamps – List of datetime values (between start and end date) for all required time stamps.

Return type:

list

class ecmwf_models.interface.ERAGrbImg(filename, product, parameter=None, subgrid=None, mask_seapoints=False, array_1D=True, mode='r')[source]

Bases: ImageBase

close()[source]: Close file.

flush()[source]: Flush data.

read(timestamp=None)[source]

Read data from the loaded image file.

Parameters:: timestamp (datetime, optional (default: None)) – Specific date (time) to read the data for.

write(data)[source]

Write data to an image file.

Parameters:: image (object) – pygeobase.object_base.Image object

class ecmwf_models.interface.ERANcDs(root_path, product, parameter=None, subgrid=None, mask_seapoints=False, h_steps=(0, 6, 12, 18), array_1D=False)[source]

Bases: MultiTemporalImageBase

Reader to extract individual images from a multi-image netcdf dataset. The main purpose of this class is to use it in the time series conversion routine. To read downloaded image files, we recommend using xarray (https://docs.xarray.dev/en/stable/).

Parameters:

root_path (str) – Root path where image data is stored.
product (str) – ERA5 or ERA5-LAND
parameter (list[str] or str, optional (default: None)) – Parameter or list of parameters to read. None reads all available parameters.
subgrid (ERA_RegularImgGrid or ERA_RegularImgLandGrid, optional) – Read only data for points of this grid. If None is passed, we read all points from the file. The main purpose of this parameter is when reshuffling to time series, to include only e.g. points over land.
mask_seapoints (bool, optional (default: False)) – All points that are not over land are replaced with NaN values. This requires that the land sea mask (lsm) parameter is included in the image files!
h_steps (tuple, optional (default: (0, 6, 12, 18))) – Time stamps available for each day. Numbers refer to full hours.
array_1D (bool, optional (default: False)) – Read data as 1d arrays. This is required when the passed subgrid is 1-dimensional (e.g. when only landpoints are read). Otherwise, when a 2d subgrid is used and this switch is off, the extracted image data is also 2-dimensional (lon, lat).

tstamps_for_daterange(start_date, end_date)[source]

Get datetimes in the correct sub-daily resolution between 2 dates

Parameters:

start_date (datetime) – Start datetime
end_date (datetime) – End datetime

Returns:

timestamps – List of datetimes

Return type:

list
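
How h_steps translates into a list of time stamps can be sketched as follows. This is an illustration only: it assumes that every day from start_date to end_date (inclusive) contributes one time stamp per entry in h_steps. The function name is hypothetical, and the class method may treat partial days differently.

```python
from datetime import datetime, time, timedelta


def sketch_tstamps_for_daterange(start_date, end_date, h_steps=(0, 6, 12, 18)):
    """Hypothetical sketch: one time stamp per day and per hour in h_steps."""
    timestamps = []
    day = start_date.date()
    # Walk through all calendar days, including the end date.
    while day <= end_date.date():
        for hour in h_steps:
            timestamps.append(datetime.combine(day, time(hour)))
        day += timedelta(days=1)
    return timestamps


stamps = sketch_tstamps_for_daterange(datetime(2000, 1, 1), datetime(2000, 1, 2))
print(stamps)
```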

class ecmwf_models.interface.ERANcImg(filename, product, parameter=None, subgrid=None, mask_seapoints=False, array_1D=False, mode='r')[source]

Bases: ImageBase

Reader for a single ERA netcdf file. The main purpose of this class is to use it in the time series conversion routine. To read downloaded image files, we recommend using xarray (https://docs.xarray.dev/en/stable/).

Parameters:

filename (str) – Path to the image file to read.
product (str) – ‘era5’ or ‘era5-land’
parameter (list or str, optional (default: None)) – Name of parameters to read from the image file.
subgrid (ERA_RegularImgGrid or ERA_RegularImgLandGrid or None, optional) – Read only data for points of this grid. If None is passed, we read all points from the file. The main purpose of this parameter is when reshuffling to time series, to include only e.g. points over land.
mask_seapoints (bool, optional (default: False)) – Read the land-sea mask to mask points over water and set them to nan. This option needs the ‘lsm’ parameter to be in the file!
array_1D (bool, optional (default: False)) – Read data as list, instead of 2D array, used for reshuffling.
mode (str, optional (default: 'r')) – Mode in which to open the file, changing this can cause data loss. This argument should not be changed!

close()[source]: Close file.

flush()[source]: Flush data.

read(timestamp=None)[source]

Read data from the loaded image file.

Parameters:: timestamp (datetime, optional (default: None)) – Specific date (time) to read the data for.

write(data)[source]

Write data to an image file.

Parameters:: image (object) – pygeobase.object_base.Image object

class ecmwf_models.interface.ERATs(ts_path, grid_path=None, **kwargs)[source]

Bases: GriddedNcOrthoMultiTs

Time series reader for all reshuffled ERA reanalysis products in time series format (pynetcf OrthoMultiTs format). Use the read_ts(lon, lat) or read_ts(gpi) function of this class to read data for a location.

ecmwf_models.utils module

Utility functions for all data products in this package.

ecmwf_models.utils.assert_product(product: str) → str[source]

ecmwf_models.utils.check_api_ready() → bool[source]

Verify that the API is ready to be used. Otherwise raise an Error.

Returns:

api_ready (bool) – True if the API is ready

ecmwf_models.utils.default_variables(product='era5', format='dl_name')[source]

These variables are downloaded when none are passed by the user.

Parameters:

product (str, optional (default: 'era5')) – Name of the era5 product to read the default variables for. Either ‘era5’ or ‘era5-land’.
format (str, optional (default: 'dl_name')) – ‘dl_name’ for the name as in the downloaded image data, ‘short_name’ for the short name, ‘long_name’ for the long name.

ecmwf_models.utils.get_default_params(name='era5')[source]

Read only lines that are marked as default variable in the csv file

Parameters:: name (str) – Name of the product to get the default parameters for

ecmwf_models.utils.get_first_last_image_date(path, start_from_last=True)[source]

Parse files in the given directory (or any subdir) using the passed filename template. props will contain all fields specified in the template. The datetime field is required and used to determine the last image date.

Parameters:

path (str) – Path to the directory containing the image files
start_from_last (bool, optional (default: True)) – Get date from last available file instead of the first available one.

Returns:

date – Date parsed from the first or last found image file (depending on start_from_last) that matches fntempl.

Return type:

str

ecmwf_models.utils.img_infer_file_props(img_root_path: str, fntempl: str = '{product}_{type}_{datetime}.{ext}', start_from_last=False) → dict[source]

Parse file names to retrieve properties from fntempl. Does not open any files.

Parameters:

img_root_path (str) – Root directory where annual directories are located
fntempl (str, optional) – Filename template to parse filenames with
start_from_last (bool, optional) – Use the last available file instead of the first one.
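
Parsing properties out of a file name with a template such as '{product}_{type}_{datetime}.{ext}' can be sketched with a regular expression. This is an illustration under assumptions: the helper names and the example file name are hypothetical, and the package may use a different parsing approach internally.

```python
import re


def template_to_regex(fntempl):
    """Hypothetical helper: turn '{field}' placeholders into named groups."""
    # Splitting on the placeholders alternates literal text and field names.
    parts = re.split(r"\{(\w+)\}", fntempl)
    pattern = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            pattern += re.escape(part)  # literal text between fields
        else:
            pattern += f"(?P<{part}>.+?)"  # lazy match for the field value
    return re.compile(pattern)


def sketch_infer_props(filename, fntempl="{product}_{type}_{datetime}.{ext}"):
    """Return the template fields found in filename, or None if no match."""
    match = template_to_regex(fntempl).fullmatch(filename)
    return match.groupdict() if match else None


# The file name below is made up for illustration.
props = sketch_infer_props("ERA5_AN_20100101_0000.nc")
print(props)
```

Only the file name is inspected here, which matches the note above that no files are opened.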

ecmwf_models.utils.load_var_table(name='era5', lut=False)[source]

Load the variables table for supported variables to download.

Parameters:: lut (bool, optional (default: False)) – If set to True, only names are loaded so that they can be used for a LUT; otherwise the full table is loaded.

ecmwf_models.utils.lookup(name, variables)[source]: Search the passed elements in the lookup table. If one does not exist, print a warning.

ecmwf_models.utils.make_era5_land_definition_file(data_file, out_file, data_file_y_res=0.25, ref_var='lsm', threshold=0.5, exclude_antarctica=True)[source]

Create a land grid definition file from a variable within a downloaded, regular (netcdf) era5 file.

Parameters:

data_file (str) – Path to the downloaded file that contains the image that is used as the reference for creating the land definition file.
out_file (str) – Full output path to the land definition file to create.
data_file_y_res (float, optional (default: 0.25)) – The resolution of the data file in latitude direction.
ref_var (str, optional (default: 'lsm')) – A variable in the data_file that is the reference for the land definition. By default, we use the land-sea-mask variable.
threshold (float, optional (default: 0.5)) – Threshold value below which a point is declared water, and above (or equal) which it is declared a land-point. If None is passed, then a point is declared a land point if it is not masked (numpy masked array) in the reference variable.
exclude_antarctica (bool, optional (default: True)) – Cut off the definition file at -60° Lat to exclude Land Points in Antarctica.
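
The interplay of threshold and exclude_antarctica can be sketched on plain lists. This is a simplified illustration: the function name is hypothetical, the masked-array case (threshold=None) is left out, and treating a latitude of exactly -60 as still included is an assumption.

```python
def sketch_land_flags(lats, values, threshold=0.5, exclude_antarctica=True):
    """Hypothetical sketch: True where a point would count as land."""
    flags = []
    for lat, value in zip(lats, values):
        # At or above the threshold a point is land, below it is water.
        is_land = value >= threshold
        # Optionally drop everything south of -60 degrees latitude.
        if exclude_antarctica and lat < -60.0:
            is_land = False
        flags.append(is_land)
    return flags


lats = [45.0, 10.0, -30.0, -75.0]
lsm = [0.9, 0.5, 0.2, 1.0]
print(sketch_land_flags(lats, lsm))
print(sketch_land_flags(lats, lsm, exclude_antarctica=False))
```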

ecmwf_models.utils.parse_filetype(inpath: str) → str[source]

Tries to find out the file type by parsing filenames in the passed directory.

Parameters:: inpath (str) – Input path where ERA data was downloaded to. Contains annual folders.
Returns:: product – Product name
Return type:: str

ecmwf_models.utils.parse_product(inpath: str) → str[source]

Tries to find out what product is stored in the path. This is done based on the name of the first file in the path that is found.

Parameters:: inpath (str) – Input path where ERA data was downloaded to. Contains annual folders.
Returns:: product – Product name
Return type:: str

ecmwf_models.utils.split_array(array, chunk_size)[source]

Split an array into chunks of a given size.

Parameters:

array (array-like) – Array to split into chunks
chunk_size (int) – Size of each chunk

Returns:

chunks – List of chunks

Return type:

list
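
Chunking of this kind can be sketched with list slicing. The function name is hypothetical, and the assumption that a shorter final chunk holds the remainder is not confirmed by the description above.

```python
def sketch_split_array(array, chunk_size):
    """Hypothetical sketch: consecutive slices of at most chunk_size items."""
    return [array[i:i + chunk_size] for i in range(0, len(array), chunk_size)]


chunks = sketch_split_array([1, 2, 3, 4, 5, 6, 7], 3)
print(chunks)
```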

ecmwf_models.utils.update_image_summary_file(data_path: str, other_props: dict = None, out_file=None)[source]

Summarize image metadata as a yml file.

Parameters:

data_path (str) – Root path to the image archive
other_props (dict, optional (default: None)) – Other properties to write into the yml file. E.g. download options to enable time series update.
out_file (str, optional (default: None)) – Path to summary file. File will be created/updated. If not specified, then data_path is used. If a file already exists, it will be overwritten.
