pytRIBS Main Classes

class pytRIBS.classes.Land(input_file=None, meta=None)

Bases: LandProcessor

pytRIBS Land Class.

This class handles land-related data and options for the tRIBS model. It manages attributes related to land mapping, land tables, and land use grids.

Parameters:

input_file (str, optional) – Path to the input file containing the necessary options for initializing the Land class attributes.
meta (dict, optional) – Metadata associated with the Land instance.

Attributes:

landmapname (str) – The name of the land map file.
landtablename (str) – The name of the land table file.
lugrid (str) – The path to the LUGRID file.
optlanduse (int) – Option for land use.
optluintercept (int) – Option for land use interpolation.

class pytRIBS.classes.Mesh(preprocess_args=None, generate_mesh_args=None, input_file=None, meta=None)

Bases: object

A pytRIBS Mesh Class.

This class manages the creation and processing of mesh data for tRIBS simulations. It handles preprocessing of watershed and stream network data, and integrates with mesh generation routines. For more details see base classes and example below.

Parameters:

preprocess_args (tuple, optional) – Arguments for initializing the Preprocess class. Required if generate_mesh_args is provided.
generate_mesh_args (tuple, optional) – Arguments for initializing the GenerateMesh class. Must be provided if preprocess_args is given.
input_file (str, optional) – Path to the input file for initializing attributes.
meta (dict, optional) – Metadata associated with the mesh.

Attributes:

pointfilename (str) – The name of the file containing the mesh points.
graphfile (str) – The name of the file containing the mesh graph.
optmeshinput (int) – Option flag for mesh input processing.
graphoption (int) – Option for graph generation.
demfile (str) – The name of the file containing the Digital Elevation Model (DEM) data.
preprocess (Preprocess, optional) – An instance of the Preprocess class used for initial data extraction and processing.
mesh_generator (GenerateMesh, optional) – An instance of the GenerateMesh class used for mesh generation.

Example

To create and use an instance of the Mesh class (TODO: update this example):

>>> mesh = Mesh(preprocess_args=(arg1, arg2, arg3), generate_mesh_args=(arg4, arg5, arg6, arg7))
>>> print(mesh.pointfilename)
path/to/pointfile
>>> print(mesh.demfile)
path/to/demfile

class pytRIBS.classes.Met(input_file=None, meta=None)

Bases: MetProcessor

A pytRIBS Met Class.

This class handles the meteorological data for tRIBS simulations. It initializes various parameters related to meteorological stations, rain files, and other related metadata. The class is used to configure and manage the meteorological input options required for the simulation.

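Parameters:

input_file (str, optional) – Path to the input file containing the necessary options for initializing the Met class attributes.
meta (dict, optional) – Metadata associated with the Met instance.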

Attributes:

hydrometstations (str) – The path or name of the file containing hydrometeorological station data.
gaugestations (str) – The path or name of the file containing gauge station data.
hydrometbasename (str) – The base name for hydrometeorological data files.
rainfile (str) – The path or name of the file containing rainfall data.
hydrometgrid (str) – The path or name of the file containing the hydrometeorological grid data.
metdataoption (int) – Option flag for meteorological data processing.
rainsource (str) – The source of the rainfall data.
gaugebasename (str) – The base name for gauge data files.
rainextension (str) – The file extension for the rainfall data files.

class pytRIBS.classes.Model(input_file=None, met=None, land=None, soil=None, mesh=None, meta=None)

Bases: Infile, Shared, Aux, ModelProcessor, Preprocess, InOut

pytRIBS Model class.

This class provides access to the underlying framework of a tRIBS (TIN-based Real-time Integrated Basin Simulator) simulation. The Model class can be initialized at the top level to facilitate model setup, simulation, and post-processing, and can be used to manipulate and generate multiple simulations efficiently.

Parameters:

input_file (str, optional) – Path to a template .in file. Default is None.
met (object, optional) – pytRIBS Met object. Default is None.
land (object, optional) – pytRIBS Land object. Default is None.
soil (object, optional) – pytRIBS Soil object. Default is None.
mesh (object, optional) – pytRIBS Mesh object. Default is None.
meta (dict, optional) – Metadata associated with the Model instance. Default is None.

Attributes:

input_options (dict) – A dictionary of the necessary keywords for a tRIBS .in file.
model_input_file (str) – Path to a template .in file with the specified paths for model results, inputs, etc.

class pytRIBS.classes.Project(base_dir, name, epsg)

Bases: object

pytRIBS Project Class for managing directories and metadata in a specified root directory.

This class initializes with a base directory, a project name, and an EPSG code. It sets up a predefined set of directories for data, results, and various sub-categories. It also provides functionality to create these directories if they do not already exist.

Parameters:

base_dir (str) – The base directory path for the project.
name (str) – The name of the project.
epsg (int) – The EPSG code representing the coordinate system.

Attributes:

base_dir (str) – The base directory where all project-related directories will be created.
meta (dict) – A dictionary to store metadata about the project, including ‘Name’ and ‘EPSG’.
directories (dict) – A dictionary defining the structure of directories to be created within the base directory.
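
The create-if-missing behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the pytRIBS implementation: the `DIRECTORIES` layout and the `create_directories` helper are hypothetical names chosen for this example, and the real directory structure may differ.

```python
import os
import tempfile

# Hypothetical layout; the directory names used by pytRIBS may differ.
DIRECTORIES = {
    "data": ["met", "soil", "land", "mesh"],
    "results": [],
}


def create_directories(base_dir, directories):
    """Create every directory under base_dir, skipping those that already exist."""
    created = []
    for parent, children in directories.items():
        relative_paths = [parent] + [os.path.join(parent, c) for c in children]
        for rel in relative_paths:
            path = os.path.join(base_dir, rel)
            os.makedirs(path, exist_ok=True)  # no error if the directory exists
            created.append(path)
    return created


base = tempfile.mkdtemp()
meta = {"Name": "demo", "EPSG": 32612}  # mirrors the 'Name' and 'EPSG' keys
paths = create_directories(base, DIRECTORIES)
create_directories(base, DIRECTORIES)  # a second call leaves the tree unchanged
```

Using `exist_ok=True` is what makes repeated setup calls safe on an existing project tree.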

class pytRIBS.classes.Results(input_file, meta=None)

Bases: Infile, Shared, WaterBalance, Read, Viz, Evaluate

pytRIBS Results Class.

This class provides a framework for analyzing and visualizing individual tRIBS simulations. It takes an instance of the Simulation class and provides time-series and water balance analysis of the model results.

Parameters:

input_file (str, required) – Path to the input file containing the necessary options for initializing the Results class attributes.
meta (dict, optional) – Metadata associated with the Results instance.

Attributes:

options (dict) – A dictionary of input options for the tRIBS model run.
element (dict) – A dictionary for storing elements related to the results.
mrf (dict) – A dictionary containing mrf and waterbalance, which are initialized to None.
meta (dict, optional) – Metadata dictionary for additional information. Default is None.

class pytRIBS.classes.Soil(input_file=None, meta=None)

Bases: SoilProcessor

pytRIBS Soil Class.

This class handles soil-related data and options for the tRIBS model. It manages attributes related to soil mapping, soil tables, and groundwater files.

Parameters:

input_file (str, optional) – Path to the input file containing the necessary options for initializing the Soil class attributes.
meta (dict, optional) – Metadata associated with the Soil instance.

Attributes:

soilmapname (str) – The name of the soil map file.
soiltablename (str) – The name of the soil table file.
scgrid (str) – The path to the SCGRID file.
optsoiltype (int) – Option for soil type.
optgroundwater (int) – Option for groundwater.
optgwfile (int) – Option for groundwater file.
optbedrock (int) – Option for bedrock.
bedrockfile (str) – The path to the bedrock file.
gwaterfile (str) – The path to the groundwater file.

pytRIBS Base Classes

class pytRIBS.soil.soil.SoilProcessor

Methods for pytRIBS Soil Class.

compute_ks_decay(grid_input, output=None)

Produces a raster for the conductivity decay parameter f, following Ivanov et al., 2004.

Parameters:

grid_input (list of dict or str) –
If a list of dictionaries, each dictionary should contain the keys “depth” and “path” for one depth. Depth should be provided in units of mm. The list should be ordered from shallowest to deepest and follow this structure:
```
[{'depth': 25, 'path': 'path/to/25_mm_ks'},
 {...},
 {'depth': 800, 'path': 'path/to/800_mm_ks'}]
```
If a string is provided, it is treated as the path to a configuration file. The configuration file must be written in JSON format.
output (str) – Location to save the raster with the conductivity decay parameter f.

Returns:

This function saves the generated raster to the specified output location.

Return type:

None
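
For intuition, the decay parameter f describes an exponential decline of conductivity with depth, Ks(z) = Ks0 * exp(-f * z), so f can be estimated as the negative slope of ln(Ks) against depth. The sketch below fits f for a single cell with ordinary least squares. It is an illustration only: whether pytRIBS fits f this way is an assumption, and `fit_decay_parameter` is a hypothetical helper, not part of the library.

```python
import math


def fit_decay_parameter(depths_mm, ks_values):
    """Least-squares estimate of f in Ks(z) = Ks0 * exp(-f * z) for one cell.

    depths_mm and ks_values are equal-length sequences; f is returned in 1/mm.
    """
    n = len(depths_mm)
    log_ks = [math.log(k) for k in ks_values]
    mean_z = sum(depths_mm) / n
    mean_log = sum(log_ks) / n
    sxx = sum((z - mean_z) ** 2 for z in depths_mm)
    sxy = sum((z - mean_z) * (lk - mean_log) for z, lk in zip(depths_mm, log_ks))
    return -sxy / sxx  # the slope of ln(Ks) versus z equals -f


# Synthetic profile generated with a known decay rate, then recovered by the fit.
depths = [25.0, 300.0, 800.0]
true_f = 0.002
ks_profile = [20.0 * math.exp(-true_f * z) for z in depths]
f_estimate = fit_decay_parameter(depths, ks_profile)
```

Because depths are given in mm, the fitted f has units of 1/mm.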

Examples

To generate a raster using a list of dictionaries for grid_input:

>>> grid_input = [{'depth': 25, 'path': 'path/to/25_mm_ks'},
...               {'depth': 800, 'path': 'path/to/800_mm_ks'}]
>>> output = "path/to/output_raster.tif"
>>> obj.compute_ks_decay(grid_input, output)

To generate a raster using a configuration file:

>>> grid_input = "path/to/config_file.json"
>>> output = "path/to/output_raster.tif"
>>> obj.compute_ks_decay(grid_input, output)

create_soil_map(grid_input, output=None)

Writes out an ASCII file with soil classes assigned by soil texture classification.

Parameters:

grid_input (list of dict or str) –
If a dictionary list, each dictionary should contain the keys “type” and “path” for each soil property. The format of the dictionary list is as follows:
```
[{'type': 'sand', 'path': 'path/to/sand_grid'},
 {'type': 'clay', 'path': 'path/to/clay_grid'}]
```
If a string is provided, it is treated as the path to a JSON configuration file containing grid types and output file paths.
output (str, optional) – The file path where the ASCII soil map will be saved. If not provided, the default output file will be used (‘soil_class.soi’).

Returns:

This function does not return any value. It writes an ASCII file with soil classifications to the specified output path.

Return type:

None

Examples

To create a soil map using a dictionary list:

>>> grid_input = [{'type': 'sand', 'path': 'path/to/sand_grid'},
...               {'type': 'clay', 'path': 'path/to/clay_grid'}]
>>> obj.create_soil_map(grid_input, output="path/to/soil_map.asc")

To create a soil map using a configuration file:

>>> grid_input = "path/to/config_file.json"
>>> obj.create_soil_map(grid_input)

static discrete_colormap(N, base_cmap=None)

generate_uniform_groundwater(watershed_boundary, value, filename=None)

Generates a uniform groundwater raster file within the specified watershed boundary.

This method creates a raster file with uniform groundwater values over the extent of the given watershed boundary. The raster file can be written to a specified filename or to a default filename from an attribute if no filename is provided.

Parameters:

watershed_boundary (GeoDataFrame) – A GeoDataFrame representing the watershed boundary. It should include a ‘bounds’ property to determine the raster extent.
value (float) – The uniform groundwater value to be written to the raster file.
filename (str, optional) – The path to the output file. If not provided, the filename will be retrieved from the gwaterfile attribute of the object.

Return type:

None

Notes

If filename is not provided, the method attempts to use the gwaterfile attribute from the object.
The raster file is written with a single cell covering the entire extent of the watershed boundary.
The raster format includes the number of columns, rows, and cell size, as well as the specified groundwater value.

Example

>>> obj.generate_uniform_groundwater(watershed_gdf, 10.0, 'output_file.txt')

Raises: ValueError – If the filename cannot be determined and gwaterfile is not set in the object.
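
The single-cell raster described in the notes can be pictured as an ESRI ASCII grid with one column and one row. The sketch below builds such a file as a string. It is an assumption-laden illustration: the helper name `single_cell_ascii`, the NODATA value, and the choice of a square cell sized to the larger side of the extent are all hypothetical, and the file written by pytRIBS may differ in detail.

```python
def single_cell_ascii(bounds, value, nodata=-9999):
    """Return ESRI ASCII grid text with one cell covering the given bounds.

    bounds is (minx, miny, maxx, maxy), as returned by a GeoDataFrame's
    total_bounds. Assumption: one square cell sized to the larger extent.
    """
    minx, miny, maxx, maxy = bounds
    cellsize = max(maxx - minx, maxy - miny)
    lines = [
        "ncols 1",
        "nrows 1",
        f"xllcorner {minx}",
        f"yllcorner {miny}",
        f"cellsize {cellsize}",
        f"NODATA_value {nodata}",
        f"{value}",  # the single uniform groundwater value
    ]
    return "\n".join(lines) + "\n"


raster_text = single_cell_ascii((387198.0, 3882394.0, 412385.0, 3901885.0), 10.0)
```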

get_polaris_grids(bbox, depths, variables, stats, replace=False)

Retrieves data from the POLARIS database (Duke University), saves it as GeoTIFF files, and returns a list of paths to the downloaded files.

Parameters:

bbox (list of float) – The bounding box coordinates in the format [x1, y1, x2, y2], where x1 and y1 are the minimum x-coordinate (longitude or easting) and minimum y-coordinate (latitude or northing), and x2 and y2 are the maximum x-coordinate and maximum y-coordinate.
depths (list of str) – List of soil depths to retrieve data for. Each depth should be specified as a string in the format ‘depth_min-depth_max’, e.g., ‘0-5cm’, ‘5-15cm’.
variables (list of str) – List of soil variables to retrieve from the HTTP site. Examples include ‘bd’ (bulk density), ‘clay’, ‘sand’, ‘silt’, etc. For a full list of variables, see the readme documentation at http://hydrology.cee.duke.edu/POLARIS/PROPERTIES/v1.0/Readme.
stats (list of str) – List of statistics to compute for each variable and depth. Typically includes ‘mean’, but other quantiles or statistics may be available. For more information on prediction quantiles, see the ISRIC SoilGrids FAQ: https://www.isric.org/explore/soilgrids/faq-soilgrids.

Returns:

A list of file paths to the downloaded GeoTIFF files.

Return type:

list of str

Examples

To retrieve soil data for specific depths and variables within a bounding box:

>>> bbox = [387198, 3882394, 412385, 3901885]  # x1, y1, x2, y2 (e.g., UTM coordinates)
>>> depths = ['0-5cm', '5-15cm', '15-30cm', '30-60cm', '60-100cm']
>>> soil_vars = ['bd', 'clay', 'sand', 'silt']
>>> stats = ['mean']
>>> file_paths = obj.get_polaris_grids(bbox, depths, soil_vars, stats)
>>> print(file_paths)
['path/to/downloaded_file_1.tif', 'path/to/downloaded_file_2.tif', ...]

get_soil_grids(bbox, depths, soil_vars, stats, replace=False)

process_polaris_parameters(grid_input, output_files, ks_only=False)

Writes ASCII grids for Ks, theta_s, theta_r, psib, and m by converting POLARIS gridded soil data into tRIBS-compatible formats and units.

Parameters:

grid_input (list of dict or str) –
If a dictionary list, each dictionary should contain the keys “type” and “path” for each soil property. The format of the dictionary list follows this structure:
```
[{'type': 'ksat', 'path': 'path/to/ksat_grid'},
 {'type': 'theta_s', 'path': 'path/to/theta_s_grid'},
 {'type': 'theta_r', 'path': 'path/to/theta_r_grid'},
 {'type': 'lambda', 'path': 'path/to/lambda_grid'},
 {'type': 'hb', 'path': 'path/to/hb_grid'}]
```
If a string is provided, it is treated as the path to a JSON configuration file.
output_files (list) – List of output file names for different soil properties. If ks_only=False, the list must have exactly 5 file names in this order: [‘Ks’, ‘theta_r’, ‘theta_s’, ‘psib’, ‘m’]. If ks_only=True, the list should contain only 1 file name for Ks.
ks_only (bool, optional) – If True, only write rasters for Ks. This is useful when processing multiple depths specifically for the compute_ks_decay function. Default is False.

Notes

This function performs specific physical conversions required to translate POLARIS probabilistic soil data to tRIBS inputs:

Ksat: Converted from log10(cm/hr) to arithmetic mm/hr.
Bubbling Pressure (hb/psib): Converted from log10(kPa) to arithmetic mm H2O.
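
The two conversions listed above amount to undoing the base-10 logarithm and then changing units. The sketch below shows the arithmetic for scalar values. The function names are hypothetical, and any sign convention applied to the bubbling pressure by pytRIBS is not covered here.

```python
MM_PER_CM = 10.0
MM_H2O_PER_KPA = 101.9716  # 1 kPa corresponds to about 101.97 mm of water column


def ksat_log10_cm_hr_to_mm_hr(log_ksat):
    """Convert log10(Ksat in cm/hr) to arithmetic Ksat in mm/hr."""
    return (10.0 ** log_ksat) * MM_PER_CM


def hb_log10_kpa_to_mm_h2o(log_hb):
    """Convert log10(bubbling pressure in kPa) to arithmetic mm of water."""
    return (10.0 ** log_hb) * MM_H2O_PER_KPA


ks_mm_hr = ksat_log10_cm_hr_to_mm_hr(0.0)   # 1 cm/hr expressed in mm/hr
psib_mm = hb_log10_kpa_to_mm_h2o(0.0)       # 1 kPa expressed in mm H2O
```

On raster data the same expressions apply element-wise to each grid cell.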

Examples

To write all soil property grids for tRIBS:

>>> grid_input = [{'type': 'ksat', 'path': 'polaris/ksat_0-5_mean.tif'},
...               {'type': 'theta_s', 'path': 'polaris/thetas_0-5_mean.tif'},
...               {'type': 'theta_r', 'path': 'polaris/thetar_0-5_mean.tif'},
...               {'type': 'lambda', 'path': 'polaris/lamda_0-5_mean.tif'},
...               {'type': 'hb', 'path': 'polaris/hb_0-5_mean.tif'}]
>>> output = ['Ks.asc', 'theta_r.asc', 'theta_s.asc', 'psib.asc', 'm.asc']
>>> obj.process_polaris_parameters(grid_input, output)

To write only Ks raster (e.g., for deep layers):

>>> grid_input = [{'type': 'ksat', 'path': 'polaris/ksat_60-100_mean.tif'}]
>>> output = ['Ks_60-100cm.asc']
>>> obj.process_polaris_parameters(grid_input, output, ks_only=True)

process_raw_soil(grid_input, output=None, ks_only=False)

Writes ASCII grids for Ks, theta_s, theta_r, psib, and m from gridded soil data for sand, silt, clay, bulk density, and volumetric water content at 33 and 1500 kPa.

Parameters:

grid_input (list of dict or str) –
If a dictionary list, each dictionary should contain the keys “type” and “path” for each soil property. The format of the dictionary list follows this structure:
```
[{'type': 'sand_fraction', 'path': 'path/to/grid'},
 {'type': 'silt_fraction', 'path': 'path/to/grid'},
 {'type': 'clay_fraction', 'path': 'path/to/grid'},
 {'type': 'bulk_density', 'path': 'path/to/grid'},
 {'type': 'vwc_33', 'path': 'path/to/grid'},
 {'type': 'vwc_1500', 'path': 'path/to/grid'}]
```
If a string is provided, it is treated as the path to a JSON configuration file.

output (list, optional) – List of output file names for different soil properties. The list should have exactly 5 file names corresponding to different soil properties.
ks_only (bool, optional) – If True, only write rasters for Ks. This is useful when using the compute_ks_decay function.

Notes

The grid_input argument should contain a list of dictionaries, each specifying a grid type and its corresponding file path.
The output list should contain exactly 5 output file names for different soil properties.
The file paths in the grid_input list should be valid, and the output list should have the correct number of file names.

Examples

To write all soil property grids:

>>> grid_input = [{'type': 'sand_fraction', 'path': 'path/to/sand_grid'},
...               {'type': 'silt_fraction', 'path': 'path/to/silt_grid'},
...               {'type': 'clay_fraction', 'path': 'path/to/clay_grid'},
...               {'type': 'bulk_density', 'path': 'path/to/bulk_density_grid'},
...               {'type': 'vwc_33', 'path': 'path/to/vwc_33_grid'},
...               {'type': 'vwc_1500', 'path': 'path/to/vwc_1500_grid'}]
>>> output = ['ks_output.asc', 'theta_s_output.asc', 'theta_r_output.asc', 'psib_output.asc', 'm_output.asc']
>>> obj.process_raw_soil(grid_input, output)

To write only the Ks raster:

>>> grid_input = "path/to/config.json"
>>> output = ['ks_output.asc']
>>> obj.process_raw_soil(grid_input, output, ks_only=True)

read_soil_table(textures=False, file_path=None)

Reads a Soil Reclassification Table Structure (*.sdt) file.

The .sdt file contains parameters such as ID, Ks, thetaS, thetaR, m, PsiB, f, As, Au, n, ks, Cs, and optionally soil texture.

The method reads the specified soil table file and returns a list of dictionaries representing the soil types and their associated parameters.

Parameters:

textures (bool, optional) – If True, the method will read and include texture classes in the returned data. Default is False.
file_path (str, optional) – The file path to the soil table (.sdt file). If not provided, it defaults to self.soiltablename["value"]. If self.soiltablename["value"] is also None, the method will print an error message and return None.

Returns:

A list of dictionaries, where each dictionary represents a soil type and its associated parameters. Each dictionary contains the following keys:
- “ID” : str, soil type ID
- “Ks” : float, saturated hydraulic conductivity
- “thetaS” : float, saturated water content
- “thetaR” : float, residual water content
- “m” : float, parameter related to soil pore size distribution
- “PsiB” : float, bubbling pressure
- “f” : float, hydraulic decay parameter
- “As” : float, saturated anisotropy ratio
- “Au” : float, unsaturated anisotropy ratio
- “n” : float, porosity of the soil
- “ks” : float, volumetric heat conductivity
- “Cs” : float, soil heat capacity
- “Texture” : str, texture class (only if textures=True is passed)

If the file does not conform to the standard .sdt format or the number of soil types doesn’t match the specified count, an error message will be printed, and the function returns None.

Return type:

list of dict or None
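
The parsing and validation rules described above can be sketched as a small stand-alone parser. This is a simplified illustration rather than the pytRIBS implementation: `parse_sdt` is a hypothetical helper, and it assumes a header line holding the number of types and parameters, one row per soil type, and an optional texture label appended after the twelve parameters.

```python
KEYS = ["ID", "Ks", "thetaS", "thetaR", "m", "PsiB", "f", "As", "Au", "n", "ks", "Cs"]


def parse_sdt(text, textures=False):
    """Parse .sdt-style text into a list of dicts, or return None if malformed."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or len(lines[0]) != 2:
        return None
    n_types, n_params = int(lines[0][0]), int(lines[0][1])
    rows = lines[1:]
    # Reject files whose row count or parameter count does not match the header.
    if len(rows) != n_types or n_params != len(KEYS):
        return None
    soils = []
    for row in rows:
        if len(row) < n_params:
            return None
        soil = {"ID": row[0]}
        for key, raw in zip(KEYS[1:], row[1:n_params]):
            soil[key] = float(raw)
        if textures:
            # Assumption: any trailing tokens form the texture label.
            soil["Texture"] = " ".join(row[n_params:]) or None
        soils.append(soil)
    return soils


sample = """2 12
1 10.0 0.45 0.05 0.5 -100.0 0.002 1.0 1.0 0.45 0.7 1400000.0 Sandy Loam
2 2.0 0.50 0.08 0.3 -300.0 0.001 1.0 1.0 0.50 0.7 1400000.0 Clay
"""
soil_list = parse_sdt(sample, textures=True)
mismatched = parse_sdt("3 12\n1 10.0 0.45 0.05 0.5 -100.0 0.002 1.0 1.0 0.45 0.7 1400000.0\n")
```

The numeric values in `sample` are placeholders, not recommended soil parameters.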

Examples

Reading a soil table without textures:

>>> soil_list = obj.read_soil_table(textures=False, file_path="path/to/soil_table.sdt")
>>> print(soil_list[0]["Ks"])
0.0001

Reading a soil table with textures:

>>> soil_list = obj.read_soil_table(textures=True, file_path="path/to/soil_table_with_textures.sdt")
>>> print(soil_list[0]["Texture"])
Sandy Loam

run_soil_workflow(watershed, output_dir, source='ISRIC')

Executes the soil processing workflow for the given watershed.

This method performs a series of operations to process soil data, including filling missing values, processing raw soil grids, computing soil parameters, and generating soil maps. It assumes specific file structures and parameters for soil processing and outputs the results to the specified directory.

Parameters:

watershed (GeoDataFrame) – A GeoDataFrame representing the watershed boundary. It must contain a ‘bounds’ property for determining the spatial extent of the data.
output_dir (str) – The directory where output files will be saved.
source (str, optional) – Source of the gridded soil data, either ‘ISRIC’ or ‘POLARIS’. Default is ‘ISRIC’.

Return type:

None

Notes

The method changes the current working directory to output_dir for processing and then restores the original directory.
Soil grids are processed for various depths and soil variables.
The method creates a soil map, writes a soil table file, and generates a configuration file (scgrid.gdf) with paths to the processed soil data.
Workflow steps:
1. Retrieves soil grid files based on the bounding box from the watershed GeoDataFrame.
2. Fills missing data in the soil grids.
3. Processes raw soil data for specified depths and variables.
4. Computes soil hydraulic conductivity decay parameters.
5. Creates a soil classification map.
6. Writes a soil table file with texture information.
7. Generates a configuration file for soil grid data (scgrid.gdf).
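
The change-and-restore behaviour for the working directory noted above is commonly written as a context manager, so the original directory is restored even if a step fails. The sketch below is a generic pattern, not the pytRIBS code; `working_directory` is a hypothetical helper and the file contents are a placeholder.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily switch the current working directory to path."""
    original = os.getcwd()
    os.chdir(path)
    try:
        yield path
    finally:
        os.chdir(original)  # always restore, even when an exception is raised


start_dir = os.getcwd()
output_dir = tempfile.mkdtemp()
with working_directory(output_dir):
    inside_dir = os.getcwd()
    # Relative paths now resolve inside output_dir.
    with open("scgrid.gdf", "w") as handle:
        handle.write("placeholder\n")
end_dir = os.getcwd()
```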

Examples

To run the soil processing workflow:

>>> obj.run_soil_workflow(watershed_gdf, '/path/to/output_dir')

Raises: FileNotFoundError – If any of the required input files cannot be found.

static write_soil_table(soil_list, file_path, textures=False)

Writes out a Soil Reclassification Table (*.sdt) file with the following format:
#Types #Params
ID Ks thetaS thetaR m PsiB f As Au n ks Cs

Parameters:

soil_list (list of dict) – List of dictionaries containing soil information specified by the .sdt structure above.
file_path (str) – Path to save the *.sdt file.
textures (bool, optional) – If True, write texture classes to the .sdt file. Default is False.

class pytRIBS.met.met.MetProcessor

Framework for Met Class. See classes.py

static clip_nldas_grid_mask_to_watershed(mask, watershed, epsg)

Clip a target GeoDataFrame (watershed) by each polygon in the pixel GeoDataFrame (mask), and reproject to the appropriate UTM zone.

This method clips the input watershed GeoDataFrame by the pixel polygons in the mask GeoDataFrame. It calculates the appropriate UTM zone for the watershed and reprojects the clipped geometries to UTM or Web Mercator based on location. The method then calculates the centroids and geographic coordinates (longitude, latitude) of the clipped geometries.

Parameters:

mask (geopandas.GeoDataFrame) – GeoDataFrame containing the pixel polygons from the NLDAS-2 grid.
watershed (geopandas.GeoDataFrame) – GeoDataFrame representing the watershed to be clipped by the pixel polygons.
epsg (int) – The EPSG code for the coordinate reference system used for geographic coordinates (longitude, latitude).

Returns:

clipped_watershed (geopandas.GeoDataFrame) – The watershed GeoDataFrame clipped by the pixel polygons, reprojected to UTM coordinates, and containing additional columns for centroids (x, y), geographic coordinates (longitude, latitude), and area.
utm_crs (str) – The EPSG code of the UTM zone or Web Mercator projection used for the clipped geometries.

Return type:

tuple

Notes

The method first checks the UTM zone of the watershed and determines if it spans multiple UTM zones or hemispheres.
If the watershed spans multiple UTM zones or hemispheres, the geometries are projected to Web Mercator (EPSG:3857).
The resulting clipped GeoDataFrame includes the centroid coordinates (x, y) in UTM and the geographic coordinates (longitude, latitude) after transforming from UTM to the given EPSG code.
The method also calculates the area of each clipped geometry, which can be used for thresholding in subsequent analysis.
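
The projection choice described in these notes can be sketched from a bounding box in geographic coordinates. The code below uses the standard 6-degree UTM zone formula and the EPSG 326xx (north) and 327xx (south) numbering, and falls back to EPSG:3857 when the extent spans more than one zone or crosses the equator. It is an illustration under those assumptions; the helper names are hypothetical, and the special zones around Norway and Svalbard are ignored.

```python
import math


def utm_zone(lon):
    """Return the UTM zone number (1 to 60) for a longitude in degrees."""
    return int(math.floor((lon + 180.0) / 6.0)) % 60 + 1


def choose_projected_epsg(min_lon, min_lat, max_lon, max_lat):
    """Pick a UTM EPSG code, or 3857 if the extent spans zones or hemispheres."""
    zones = {utm_zone(min_lon), utm_zone(max_lon)}
    crosses_equator = min_lat < 0.0 <= max_lat
    if len(zones) > 1 or crosses_equator:
        return 3857  # Web Mercator fallback
    base = 32600 if min_lat >= 0.0 else 32700
    return base + zones.pop()


single_zone = choose_projected_epsg(-111.5, 33.0, -111.0, 33.5)
two_zones = choose_projected_epsg(-114.5, 33.0, -113.5, 34.0)
southern = choose_projected_epsg(-70.5, -33.5, -70.0, -33.0)
```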

convert_and_write_nldas_timeseries(list_dfs, station_coords, gmt, prefix=None, met_path=None, precip_path=None, orig_begin=None, orig_end=None)

Convert NLDAS-2 timeseries data to UTM coordinates and prepare for tRIBS input.

This function processes NLDAS-2 timeseries data from multiple stations, converts the coordinates to UTM, and prepares the data for tRIBS model input. The processed data is saved to meteorological and precipitation files in the specified directories.

Parameters:

list_dfs (list of pandas.DataFrame) – A list of DataFrames, each containing NLDAS-2 timeseries data with columns such as ‘date’, ‘psurf’, ‘wind_u’, ‘wind_v’, ‘temp’, ‘humidity’, ‘rsds’, and ‘prcp’.
station_coords (list of tuples) – A list of tuples, each containing the (longitude, latitude, elevation) for each station.
prefix (str, optional) – Prefix for the output filenames.
met_path (str, optional) – Directory path where meteorological files will be saved.
precip_path (str, optional) – Directory path where precipitation files will be saved.
gmt (int) – GMT offset for the data.
utm_epsg (str) – EPSG code for the UTM coordinate system.

Returns:

This function does not return anything. The transformed timeseries data and station details are written to the specified output files.

Return type:

None

Notes

The function assumes that the input NLDAS-2 data is structured in a specific way and that the stations’ geographic coordinates (longitude, latitude) need to be converted to UTM coordinates using the provided EPSG code.

static create_nldas_grid_mask(ds, epsg=None)

Create polygons representing each pixel in a grid based on GeoTransform parameters.

This method generates a grid of polygons representing each pixel in the input xarray dataset, using the dataset’s spatial reference information. The resulting polygons are returned as a GeoDataFrame.

Parameters:

ds (xarray.Dataset) – The input dataset containing spatial reference information, including the GeoTransform parameters.
epsg (int, optional) – The EPSG code for the coordinate reference system. If provided, the resulting GeoDataFrame is set with this CRS.

Returns:

A GeoDataFrame containing polygons representing each pixel in the grid, with an optional CRS set if the epsg parameter is provided.

Return type:

geopandas.GeoDataFrame

Notes

The method extracts the GeoTransform parameters from the dataset to compute the coordinates of each pixel.
The polygons are created using the shapely.geometry.box function to define the bounding box of each pixel.
If the epsg parameter is provided, the resulting GeoDataFrame will be reprojected to the specified CRS.
The order of the top and bottom pixel coordinates is corrected if necessary, based on the GeoTransform.
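
The pixel geometry described in these notes follows directly from the six GeoTransform values. The sketch below computes the bounds of every pixel, including the top and bottom swap for the usual negative pixel height, using plain tuples so it runs without geospatial dependencies. Each tuple could be passed to shapely.geometry.box(minx, miny, maxx, maxy). The helper name `pixel_bounds` and the GDAL-style ordering of the GeoTransform are assumptions of this example.

```python
def pixel_bounds(geotransform, n_cols, n_rows):
    """Return (minx, miny, maxx, maxy) for each pixel, in row-major order.

    geotransform follows the GDAL ordering:
    (x_origin, pixel_width, row_rotation, y_origin, col_rotation, pixel_height).
    Rotation terms are ignored in this sketch.
    """
    x_origin, pixel_width, _, y_origin, _, pixel_height = geotransform
    boxes = []
    for row in range(n_rows):
        for col in range(n_cols):
            left = x_origin + col * pixel_width
            right = left + pixel_width
            top = y_origin + row * pixel_height
            bottom = top + pixel_height
            # Pixel height is usually negative; make sure min <= max on both axes.
            miny, maxy = min(top, bottom), max(top, bottom)
            minx, maxx = min(left, right), max(left, right)
            boxes.append((minx, miny, maxx, maxy))
    return boxes


# A 2 x 2 grid of 0.125-degree pixels, the nominal NLDAS-2 resolution.
boxes = pixel_bounds((-112.0, 0.125, 0.0, 34.0, 0.0, -0.125), n_cols=2, n_rows=2)
```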

static extract_nldas_timeseries(gridded_watershed, nldas_met_xarray, nldas_elev_xarray, threshold_area=0)

Extract NLDAS-2 timeseries data and station coordinates from a gridded watershed.

This function converts NLDAS-2 xarray datasets (meteorological and elevation) to pandas DataFrames for locations within a gridded watershed. The extracted timeseries data is filtered based on the specified threshold area, and the station coordinates (longitude, latitude, UTM x, UTM y, elevation) are returned along with the timeseries data.

Parameters:

gridded_watershed (geopandas.GeoDataFrame) – A GeoDataFrame representing the watershed with polygons for each sub-watershed.
nldas_met_xarray (xarray.Dataset) – The xarray dataset containing NLDAS-2 meteorological timeseries data.
nldas_elev_xarray (xarray.Dataset) – The xarray dataset containing NLDAS-2 elevation data.
threshold_area (float, optional) – The minimum area for a sub-watershed to be considered. Default is 0, meaning all sub-watersheds are included.

Returns:

The first element is a list of pandas DataFrames, each containing NLDAS-2 timeseries data for a sub-watershed.
The second element is a list of station coordinates, where each station is represented as [longitude, UTM x, latitude, UTM y, elevation].

Return type:

tuple of (list of pandas.DataFrame, list of list of float)

Notes

This function uses the ‘nearest’ method to select the closest data point in the NLDAS-2 xarray dataset for each sub-watershed.
The elevation data is extracted from nldas_elev_xarray and combined with the timeseries data.
The station coordinates include both geographic (longitude, latitude) and UTM coordinates (x, y).
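
The area threshold and nearest-point selection described above can be illustrated without xarray. The sketch below drops cells smaller than the threshold and finds the closest grid index along each axis, which is the same idea as selecting with method="nearest" in xarray. The data structures and helper names are hypothetical, and treating cells with an area equal to the threshold as included is an assumption.

```python
def nearest_index(coords, target):
    """Index of the coordinate closest to target."""
    return min(range(len(coords)), key=lambda i: abs(coords[i] - target))


def select_cells(cells, grid_lons, grid_lats, threshold_area=0.0):
    """Return (id, lon_index, lat_index) for cells at or above the area threshold."""
    selected = []
    for cell in cells:
        if cell["area"] < threshold_area:
            continue  # too small to be treated as its own station
        i = nearest_index(grid_lons, cell["lon"])
        j = nearest_index(grid_lats, cell["lat"])
        selected.append((cell["id"], i, j))
    return selected


cells = [
    {"id": "a", "lon": -111.9, "lat": 33.9, "area": 5.0e6},
    {"id": "b", "lon": -111.7, "lat": 33.8, "area": 1.0e4},
]
grid_lons = [-112.0, -111.875, -111.75]
grid_lats = [34.0, 33.875, 33.75]

all_cells = select_cells(cells, grid_lons, grid_lats)
large_cells = select_cells(cells, grid_lons, grid_lats, threshold_area=1.0e5)
```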

static get_nldas_elevation(watershed, epsg)

Download the NLDAS-2 elevation grid as a NetCDF file and return it as an xarray Dataset.

This method downloads the NLDAS-2 elevation data from a specified URL, clips it to the extent of the provided watershed, reprojects it to the specified EPSG code, and returns the processed data as an xarray Dataset.

Parameters:

watershed (geopandas.GeoDataFrame) – The watershed for which the elevation data should be clipped.
epsg (int) – The EPSG code for the desired projection of the output data.

Returns:

The processed elevation data clipped to the watershed extent and reprojected. Returns None if there is an error during the download or processing.

Return type:

xarray.Dataset or None

Raises:

requests.exceptions.RequestException – If there is an error downloading the NLDAS-2 elevation file.
Exception – If there is any other error during the processing of the elevation data.

Notes

The NLDAS-2 elevation data is downloaded from NASA’s LDAS repository as a NetCDF file.
The downloaded dataset is processed to drop unnecessary variables, and the CRS is assigned using the epsg parameter.
The EPSG code 32662 (Equidistant Cylindrical projection) is used by default if no EPSG code is specified.
Caching the dataset or passing it as a variable rather than downloading it every time is a potential improvement.

static get_nldas_geom(geom, begin, end, epsg, write_path=None, **hyriver_env_vars)

Fetch NLDAS-2 data for a given geometry and time period, with optional caching and environment variable configuration.

This method retrieves NLDAS-2 data for a specified geometry and time range, using the pynldas2 library. It supports environment variables for controlling caching and verbosity, and optionally saves the resulting xarray dataset to a NetCDF file.

Parameters:

geom (str) – The geometry (as a Polygon or MultiPolygon) for which the data is being requested.
begin (str) – The start date for the data request in ‘YYYY-MM-DD’ format.
end (str) – The end date for the data request in ‘YYYY-MM-DD’ format.
epsg (int) – The EPSG code for the coordinate reference system of the geometry.
write_path (str, optional) – The file path where the resulting xarray dataset should be saved as a NetCDF file. If not provided, the dataset is not saved to a file.
**hyriver_env_vars (dict, optional) – Additional keyword arguments representing environment variables to control request/response caching and verbosity. Supported variables include:
- HYRIVER_CACHE_NAME: Path to the caching SQLite database for asynchronous HTTP requests.
- HYRIVER_CACHE_NAME_HTTP: Path to the caching SQLite database for HTTP requests.
- HYRIVER_CACHE_EXPIRE: Expiration time for cached requests in seconds.
- HYRIVER_CACHE_DISABLE: Disable reading/writing from/to the cache.
- HYRIVER_SSL_CERT: Path to an SSL certificate file.

Returns:

The dataset containing the NLDAS-2 data for the specified geometry and time period.

Return type:

xarray.Dataset

Raises:

Exception – If an error occurs during the data retrieval or saving process.

Notes

This method uses the get_bygeom function from the pynldas2 library to fetch NLDAS-2 data for the specified geometry.
The geometry is automatically converted to a MultiPolygon if it is provided as a Polygon.
The HyRiver library should be cited as follows: Chegini T, Li H-Y, Leung LR. 2021. HyRiver: Hydroclimate Data Retriever. Journal of Open Source Software 6: 3175. DOI: 10.21105/joss.03175.
If the write_path is specified, the resulting dataset is saved as a NetCDF file at the given location.
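
Passing environment variables as keyword arguments usually means copying them into os.environ before the request is made. The sketch below shows one way to do that for the variables listed under Parameters. It is not the pytRIBS implementation: `apply_hyriver_env` is a hypothetical helper, and rejecting unknown names is a design choice made for this example.

```python
import os

SUPPORTED_VARS = {
    "HYRIVER_CACHE_NAME",
    "HYRIVER_CACHE_NAME_HTTP",
    "HYRIVER_CACHE_EXPIRE",
    "HYRIVER_CACHE_DISABLE",
    "HYRIVER_SSL_CERT",
}


def apply_hyriver_env(**env_vars):
    """Copy supported HyRiver settings into os.environ as strings."""
    unknown = set(env_vars) - SUPPORTED_VARS
    if unknown:
        raise ValueError(f"Unsupported variables: {sorted(unknown)}")
    for name, value in env_vars.items():
        os.environ[name] = str(value)  # environment values must be strings
    return sorted(env_vars)


applied = apply_hyriver_env(
    HYRIVER_CACHE_NAME="cache/aiohttp_cache.sqlite",  # example path only
    HYRIVER_CACHE_EXPIRE=3600,
)
```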

get_nldas_point(centroids, begin, end, epsg=None)

Fetch NLDAS-2 forcing data from NASA Giovanni for specific coordinates.

This method handles authentication via Earthdata (creating a .netrc file if needed), retrieves a session token, and downloads timeseries data directly from the Giovanni API.

Prerequisites:

1. An Earthdata Login account.
2. The ‘NASA GESDISC DATA ARCHIVE’ application must be authorized in your Earthdata profile.

Parameters:

centroids (list of tuples or list of lists) – Coordinates [(x, y), …] in the projection specified by ‘epsg’.
begin (str) – Start date ‘YYYY-MM-DD’.
end (str) – End date ‘YYYY-MM-DD’.
epsg (int, optional) – The EPSG code of the input centroids. Defaults to self.meta[‘EPSG’] if None.

Returns:

Combined dataframe of all variables.

Return type:

pandas.DataFrame

polygon_centroid_to_geographic(polygon, utm_crs=None, geographic_crs='EPSG:4326'): Helper function from Aux Class

run_met_workflow(watershed, begin, end, elev=None)

Execute the meteorological data workflow for a given watershed.

This method performs the following steps:

- Calculates the geographic centroid of the provided watershed.
- Retrieves meteorological data for the centroid from the NLDAS-2 dataset.
- Converts and writes the NLDAS-2 time series data to the specified format.

Parameters:

watershed (shapely.geometry.Polygon) – A Shapely polygon representing the watershed area. The geographic centroid of this polygon is used for data retrieval.
begin (str) – The start date for the meteorological data retrieval, in ‘YYYY-MM-DD’ format.
end (str) – The end date for the meteorological data retrieval, in ‘YYYY-MM-DD’ format.
elev (float, optional) – The elevation of the watershed centroid, used in the data processing. Defaults to None.

Returns:

A DataFrame containing the retrieved and processed NLDAS-2 meteorological data for the specified centroid.

Return type:

pandas.DataFrame

Notes

The geographic centroid of the watershed is calculated and used to retrieve the meteorological data from the NLDAS-2 dataset.
The NLDAS-2 time series data is retrieved for the specified time range, processed, and written to the appropriate format using the convert_and_write_nldas_timeseries method.
A cache is used for the NLDAS-2 data retrieval to improve efficiency.
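
The centroid step can be illustrated with a planar, area-weighted centroid. This is only a sketch under the assumption that the polygon is a simple ring of (x, y) vertices; `polygon_centroid` is a hypothetical name, and the workflow itself relies on Shapely geometry and a subsequent conversion to geographic coordinates rather than on this function.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    area2 = 0.0  # twice the signed area (shoelace formula)
    cx = cy = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon with zero area")
    return cx / (3.0 * area2), cy / (3.0 * area2)


square_centroid = polygon_centroid([(0, 0), (2, 0), (2, 2), (0, 2)])
```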

class pytRIBS.land.land.LandProcessor

static classify_vegetation_height(raster_path, thresholds, output_path, plot_result=True)

Classifies vegetation height raster based on user-defined thresholds.

Parameters:

raster_path (str) – Path to the input tree height raster.
thresholds (list of tuple) – Each tuple has the form (min, max, class): a height range and the class value assigned to it. For example, [(0, 5, 1), (5, 10, 2), (10, 15, 3)] classifies heights from 0-5 as class 1, 5-10 as class 2, and so on.
- The min and max values must be increasing.
- On the first iteration, min is allowed to equal max; on subsequent iterations, min must be greater than the previous max.
output_path (str) – Path to save the classified raster.
plot_result (bool, optional) – If True, the classified image will be plotted. Default is True.

Returns:

classified_image (np.ndarray) – The classified raster array.
class_list (list of dict) – List of class attributes, where each dictionary represents a class range and its value.

Raises:

ValueError – If the min value is not greater than the previous max value, or if min equals max on any iteration other than the first.

Examples

To classify vegetation height based on predefined thresholds:

>>> thresholds = [(0, 5, 1), (5, 10, 2), (10, 15, 3)]
>>> classified_data, class_list = LandProcessor.classify_vegetation_height(
...     raster_path="path/to/raster.tif",
...     thresholds=thresholds,
...     output_path="path/to/output.tif"
... )
>>> print(classified_data.shape)
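
The range-based classification can be sketched without any raster I/O. The version below is an illustration only: `validate_thresholds` and `classify_heights` are hypothetical names, ranges are assumed to be half-open (min inclusive, max exclusive), a range is assumed to be allowed to start where the previous one ended (as in the example above), and values outside every range receive an assumed no-data class of 0. The library's exact boundary handling may differ.

```python
def validate_thresholds(thresholds):
    """Check that (min, max, class) ranges are increasing and do not overlap."""
    previous_max = None
    for low, high, _ in thresholds:
        if low > high:
            raise ValueError(f"min {low} exceeds max {high}")
        # Assumption: a range may start exactly where the previous one ended.
        if previous_max is not None and low < previous_max:
            raise ValueError(f"range starting at {low} overlaps the previous range")
        previous_max = high


def classify_heights(heights, thresholds, nodata=0):
    """Map each height in a 2D list to a class; ranges are half-open [min, max)."""
    validate_thresholds(thresholds)

    def classify(value):
        for low, high, cls in thresholds:
            if low <= value < high:
                return cls
        return nodata  # value falls outside every range

    return [[classify(v) for v in row] for row in heights]


thresholds = [(0, 5, 1), (5, 10, 2), (10, 15, 3)]
classified = classify_heights([[0.5, 5.0], [12.3, 40.0]], thresholds)
```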

create_gdf_content(parameters, watershed)

Create a dictionary containing geographic and parameter information for a watershed.

This function computes the geographic centroid of the given watershed polygon, converts it to geographic coordinates (latitude and longitude), and then creates a dictionary containing the number of parameters, the centroid’s geographic location, the GMT time zone, and a list of parameters.

Parameters:

parameters (list of list) – A list where each element is a list containing:
- parameter name (str): The name of the parameter (e.g., ‘VH’).
- raster path (str): The file path to the specified raster.
- file extension (str): The extension of the raster file (e.g., ‘.tif’).
watershed (GeoDataFrame or shapely.geometry.Polygon) – The watershed boundary used to compute the centroid and derive geographic coordinates.

Returns:

A dictionary containing the following key-value pairs:

- ‘Number of Parameters’ (int): The number of parameters provided.
- ‘Latitude’ (float): The latitude of the watershed centroid in geographic coordinates.
- ‘Longitude’ (float): The longitude of the watershed centroid in geographic coordinates.
- ‘GMT Time Zone’ (int): The GMT time zone derived from the geographic coordinates.
- ‘Parameters’ (list): The original list of parameters provided.

Return type:

dict

Examples

To create a dictionary of geographic and parameter information for a watershed:

>>> parameters = [
...     ['VH', 'path/to/raster1.tif', '.tif'],
...     ['NDVI', 'path/to/raster2.tif', '.tif']
... ]
>>> watershed = geopandas.read_file('path/to/watershed.shp')
>>> gdf_content = create_gdf_content(parameters, watershed)
>>> print(gdf_content['Latitude'], gdf_content['Longitude'])
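
The shape of the returned dictionary can be sketched as follows. `build_gdf_content` is a hypothetical stand-in that takes the centroid coordinates directly, and the time zone rule used here (one hour per 15 degrees of longitude) is an assumption for illustration; the method may derive the GMT offset differently.

```python
def build_gdf_content(parameters, latitude, longitude):
    """Assemble the documented keys from a precomputed geographic centroid."""
    # Assumption: approximate the GMT offset as one hour per 15 degrees of longitude.
    gmt_offset = int(round(longitude / 15.0))
    return {
        "Number of Parameters": len(parameters),
        "Latitude": latitude,
        "Longitude": longitude,
        "GMT Time Zone": gmt_offset,
        "Parameters": parameters,
    }


parameters = [
    ["VH", "path/to/raster1.tif", ".tif"],
    ["NDVI", "path/to/raster2.tif", ".tif"],
]
content = build_gdf_content(parameters, latitude=33.4, longitude=-111.9)
```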

static discrete_colormap(N, base_cmap=None)

static unsupervised_classification_naip(image_path, output_file_path, method='NDVI', n_clusters=4, plot_result=True)

Perform unsupervised classification on a NAIP image using K-means clustering.

Parameters:

image_path (str) – Path to the NAIP image file.
output_file_path (str) – Path to save the classified image.
method (str, optional) – Method to use for classification, either ‘NDVI’ or ‘true_color’. Default is ‘NDVI’.
n_clusters (int, optional) – Number of clusters for K-means clustering. Default is 4.
plot_result (bool, optional) – If True, the classified image will be plotted. Default is True.

Returns:

The classified image with the same dimensions as the input image.

Return type:

np.ndarray

Examples

To classify an image using NDVI and the default number of clusters:

>>> classified_image = LandProcessor.unsupervised_classification_naip("path/to/naip_image.tif", "path/to/output.tif")
>>> print(classified_image.shape)

To classify an image using true color with 3 clusters and no plotting:

>>> classified_image = LandProcessor.unsupervised_classification_naip("path/to/naip_image.tif", "path/to/output.tif", method='true_color', n_clusters=3, plot_result=False)
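
To make the ‘NDVI’ option concrete, the sketch below computes NDVI from near-infrared and red values and groups the results with a small one-dimensional K-means. It is an illustration under stated assumptions, not the library's implementation: `ndvi` and `kmeans_1d` are hypothetical names, the pixel values are made up, and the initial centers are spread evenly over the value range so the result is deterministic.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def kmeans_1d(values, n_clusters, iterations=20):
    """Cluster scalar values; returns (labels, centers)."""
    lo, hi = min(values), max(values)
    # Deterministic start: centers evenly spread over the value range.
    centers = [lo + (hi - lo) * (k + 0.5) / n_clusters for k in range(n_clusters)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        labels = [min(range(n_clusters), key=lambda k: abs(v - centers[k])) for v in values]
        # Update step: move each center to the mean of its members.
        for k in range(n_clusters):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:  # keep the old center when a cluster is empty
                centers[k] = sum(members) / len(members)
    return labels, centers


pixels = [(200, 50), (210, 60), (60, 55), (50, 50)]  # made-up (NIR, red) pairs
ndvi_values = [ndvi(nir, red) for nir, red in pixels]
labels, centers = kmeans_1d(ndvi_values, n_clusters=2)
```

With these made-up pixels, the two vegetated samples (high NDVI) end up in one cluster and the two bare samples in the other.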

static update_landfiles_with_dates(file_path, date_str)

class pytRIBS.mesh.mesh.Preprocess(outlet, snap_distance, threshold_area, dem_path, verbose_mode, meta, dir_proccesed=None)

A class for preprocessing digital elevation models (DEMs) and extracting watershed and stream network data.

This class provides methods for preparing DEMs, setting up metadata, and processing elevation data to extract watershed and stream network features. It uses the WhiteboxTools library for various geospatial operations and supports setting verbose mode for logging.

Parameters:

outlet (tuple of float) – A tuple containing the outlet coordinates (longitude, latitude) used in the processing.
snap_distance (float) – The distance used for snapping outlet points to the nearest flow path.
threshold_area (float) – The area threshold used to identify streams in the flow accumulation raster.
dem_path (str) – The file path to the digital elevation model (DEM) to be processed.
verbose_mode (bool) – Flag to enable verbose mode for WhiteboxTools, controlling the amount of log output.
meta (dict) – A dictionary containing metadata for the project, including ‘Name’, ‘EPSG’, and ‘Scenario’.
dir_proccesed (str, optional) – Directory path for saving processed files. If not provided, defaults to ‘preprocessing’.

Raises:

OSError – If there is an issue creating directories or accessing files.
ValueError – If the EPSG code of the DEM does not match the project metadata and cannot be reconciled.

Attributes:

outlet (tuple of float) – The outlet coordinates used in the processing.
snap_distance (float) – The distance used for snapping outlet points to the nearest flow path.
threshold_area (float) – The area threshold used to identify streams.
dem_preprocessing (str) – The absolute path to the digital elevation model (DEM) file.
output_dir (str) – The directory where processed files are saved.
meta (dict) – Metadata for the project, including ‘Name’, ‘EPSG’, and ‘Scenario’.
wbt (WhiteboxTools) – Instance of WhiteboxTools for performing geospatial operations.

Examples

To initialize and preprocess a DEM:

>>> meta_data = {'Name': 'Watershed Project', 'EPSG': 4326, 'Scenario': 'Base'}
>>> dem_preprocessor = Preprocess(
...     outlet=(-120.5678, 45.1234),
...     snap_distance=100.0,
...     threshold_area=50.0,
...     dem_path="path/to/dem.tif",
...     verbose_mode=True,
...     meta=meta_data
... )
>>> dem_preprocessor.process()

clip_rasters(raster_list, watershed_boundary, method='boundary', output_dir=None)

Clip a list of rasters using the watershed boundary polygon or extent.

This method clips a list of rasters (e.g., filled DEM, flow direction raster, stream raster) using the provided watershed boundary polygon. The clipping can be performed based on the polygon boundary, the bounding box (extent), or both. The clipped rasters are saved to the specified output directory.

Parameters:

raster_list (list of str) – A list of file paths to the rasters that need to be clipped.
watershed_boundary (str) – The file path to the watershed boundary shapefile, which is used to clip the rasters.
method (str, optional) – The clipping method to use. Default is ‘boundary’. Options are:
- ‘boundary’: Clip rasters using the polygon boundary.
- ‘extent’: Clip rasters using the bounding box of the polygon.
- ‘both’: Perform both boundary and extent clipping.
output_dir (str, optional) – The directory where the clipped rasters will be saved. If not provided, the default directory {output_dir}/ will be used.

Returns:

This method saves the clipped rasters to the specified output directory.

Return type:

None

clip_streamline(stream_shapefile, watershed_boundary, output_path=None)

Clip the streamlines shapefile by the watershed boundary polygon.

This method clips the streamlines from the provided stream shapefile using the watershed boundary polygon. The clipped streamlines are saved to the specified output path, or a default path based on the project metadata if no path is provided.

Parameters:

stream_shapefile (str) – The file path to the streamlines shapefile to be clipped.
watershed_boundary (str) – The file path to the watershed boundary shapefile, which will be used to clip the streamlines.
output_path (str, optional) – The file path where the clipped streamlines shapefile will be saved. If not provided, the default path will be ‘{output_dir}/{project_name}_streams.shp’.

Returns:

The absolute path to the saved clipped streamlines shapefile.

Return type:

str

convert_stream_raster_to_vector(stream_raster, flow_direction_raster, output_path=None)

Create a shapefile of the stream network from the stream raster.

This method converts a stream raster into a vector format (shapefile) using the flow direction raster. The resulting stream network is saved to the specified output path, or a default path based on the project metadata if no path is provided.

Parameters:

stream_raster (str) – The file path to the stream raster, which contains the stream network data.
flow_direction_raster (str) – The file path to the flow direction raster (D8 method) used to convert the stream raster to vector format.
output_path (str, optional) – The file path where the stream network shapefile will be saved. If not provided, the default path will be ‘{output_dir}/{project_name}_streams.shp’.

Returns:

The absolute path to the saved stream network shapefile.

Return type:

str

create_outlet(x, y, flow_accumulation_raster, snap_distance, output_path=None)

Create a shapefile of the pour point (outlet) using specified coordinates, and snap it to the nearest flow path.

This method takes the pour point coordinates (x, y), creates a shapefile for the pour point, and snaps the point to the nearest flow path within the specified snapping distance using the flow accumulation raster.

Parameters:

x (float) – The x-coordinate (longitude or easting) of the pour point.
y (float) – The y-coordinate (latitude or northing) of the pour point.
flow_accumulation_raster (str) – The file path to the flow accumulation raster used for snapping the pour point.
snap_distance (float) – The distance used to snap the pour point to the nearest flow path.
output_path (str, optional) – The file path where the outlet shapefile will be saved. If not provided, the default path will be ‘{output_dir}/{project_name}_outlet.shp’.

Returns:

The absolute path to the saved shapefile of the snapped pour point.

Return type:

str
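
Snapping can be illustrated on a small grid. The sketch below moves a pour point to the cell with the highest flow accumulation within the snapping distance, which is one common strategy; it is an assumption that this matches the tool pytRIBS calls. `snap_pour_point`, the row/column indexing, and `cell_size` are hypothetical.

```python
import math


def snap_pour_point(flow_acc, row, col, snap_distance, cell_size=1.0):
    """Return (row, col) of the highest-accumulation cell within snap_distance."""
    radius = int(snap_distance // cell_size)  # search window half-width in cells
    best = (row, col)
    best_value = flow_acc[row][col]
    for r in range(max(0, row - radius), min(len(flow_acc), row + radius + 1)):
        for c in range(max(0, col - radius), min(len(flow_acc[0]), col + radius + 1)):
            distance = math.hypot(r - row, c - col) * cell_size
            if distance <= snap_distance and flow_acc[r][c] > best_value:
                best, best_value = (r, c), flow_acc[r][c]
    return best


flow_acc = [
    [1, 1, 1, 1],
    [1, 2, 9, 1],
    [1, 1, 3, 1],
    [1, 1, 1, 20],
]
near = snap_pour_point(flow_acc, 1, 1, snap_distance=1)
far = snap_pour_point(flow_acc, 1, 1, snap_distance=3)
```

A larger snapping distance can pull the outlet onto a different, larger channel, so the value is worth checking against the DEM resolution.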

extract_watershed_and_stream_network(outlet_path, boundary_path, output_streams_path, clean=True)

Extract the watershed and stream network from elevation data and save the results to specified paths.

This method processes elevation data to extract both the watershed and stream network. It performs a series of operations, including filling depressions, generating flow direction and accumulation rasters, identifying streams, and creating/clipping the watershed boundary and stream network. The results are saved to the provided file paths. Optionally, temporary files and directories can be cleaned up after processing.

Parameters:

outlet_path (str) – The file path where the outlet shapefile will be saved.
boundary_path (str) – The file path where the watershed boundary shapefile will be saved.
output_streams_path (str) – The file path where the clipped stream network shapefile will be saved.
clean (bool, optional) – If True, temporary files and directories will be cleaned up after processing. Defaults to True.

Return type:

None

Raises:

OSError – If there is an issue creating directories or accessing files.
ValueError – If the input parameters are not correctly specified.
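
The sequence of operations described for this method can be expressed with the individual methods documented for this class. The sketch below is an assumption about ordering inferred from that description, not the method's actual implementation. `run_preprocessing_steps` is a hypothetical name, `pre` stands for a Preprocess-like object, and each method is assumed to return the path of the file it writes (generate_watershed_boundary has no documented return value).

```python
def run_preprocessing_steps(pre, x, y, snap_distance, threshold_area):
    """Chain the documented Preprocess methods using their default output paths."""
    filled = pre.fill_depressions()
    d8 = pre.generate_flow_direction_raster(filled)
    flow_acc = pre.generate_flow_accumulation_raster(d8)
    streams = pre.generate_streams_raster(flow_acc, threshold_area)
    outlet = pre.create_outlet(x, y, flow_acc, snap_distance)
    mask = pre.generate_watershed_mask(d8, outlet)
    # Assumption: this call returns the path of the boundary shapefile.
    boundary = pre.generate_watershed_boundary(mask)
    stream_shp = pre.convert_stream_raster_to_vector(streams, d8)
    clipped = pre.clip_streamline(stream_shp, boundary)
    return {"outlet": outlet, "boundary": boundary, "streams": clipped}
```

Calling the steps individually like this is mainly useful for inspecting intermediate rasters; the single method above handles the same work, including optional cleanup.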

fill_depressions(output_path=None)

Fill depressions (sinks) in the DEM within the watershed and save the filled DEM to the specified location.

This method uses the WhiteboxTools library to fill depressions (sinks) in the digital elevation model (DEM). After the depressions are filled, the DEM is saved to the provided output path or a default path based on the project metadata.

Parameters:

output_path (str, optional) – The file path where the filled DEM will be saved. If not provided, the default path will be ‘{output_dir}/{project_name}_filled.tif’.

Returns:

The absolute path to the saved filled DEM file.

Return type:

str

Examples

To fill depressions and save the result to a specified file:

>>> filled_dem_path = dem_preprocessor.fill_depressions(output_path='path/to/filled_dem.tif')
>>> print(f"Filled DEM saved at: {filled_dem_path}")

To use the default output path for saving the filled DEM:

>>> filled_dem_path = dem_preprocessor.fill_depressions()
>>> print(f"Filled DEM saved at: {filled_dem_path}")

generate_flow_accumulation_raster(flow_direction_raster, output_path=None)

Generate a flow accumulation raster using the flow direction raster obtained from the D8 method.

This method creates a flow accumulation raster based on the flow direction raster, which was generated using the D8 method. The generated flow accumulation raster is saved to the provided output path, or a default path based on the project metadata if no path is specified.

Parameters:

flow_direction_raster (str) – The file path to the flow direction raster (D8 method) used to compute the flow accumulation.
output_path (str, optional) – The file path where the flow accumulation raster will be saved. If not provided, the default path will be ‘{output_dir}/{project_name}_flow_acc.tif’.

Returns:

The absolute path to the saved flow accumulation raster.

Return type:

str
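
Conceptually, flow accumulation counts how many cells drain through each cell. The sketch below works on an in-memory grid of flow directions rather than on raster files; `flow_accumulation` is a hypothetical name, directions are given as (d_row, d_col) offsets (None for cells with no downslope neighbour), and each cell is assumed to count itself, a convention that differs between tools.

```python
def flow_accumulation(directions):
    """Count, for every cell, the cells draining through it (itself included)."""
    rows, cols = len(directions), len(directions[0])
    acc = [[1] * cols for _ in range(rows)]
    inflow = [[0] * cols for _ in range(rows)]  # number of direct upstream cells
    for r in range(rows):
        for c in range(cols):
            if directions[r][c] is not None:
                dr, dc = directions[r][c]
                inflow[r + dr][c + dc] += 1
    # Start from cells with no inflow and push their totals downstream.
    pending = [(r, c) for r in range(rows) for c in range(cols) if inflow[r][c] == 0]
    while pending:
        r, c = pending.pop()
        if directions[r][c] is None:
            continue  # outlet or pit: nothing further downstream
        dr, dc = directions[r][c]
        nr, nc = r + dr, c + dc
        acc[nr][nc] += acc[r][c]
        inflow[nr][nc] -= 1
        if inflow[nr][nc] == 0:  # all upstream contributions received
            pending.append((nr, nc))
    return acc


line_acc = flow_accumulation([[(0, 1), (0, 1), None]])
basin_acc = flow_accumulation([[(1, 1), (1, 0)], [(0, 1), None]])
```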

generate_flow_direction_raster(filled_dem, output_path=None)

Generate a flow direction raster using the D8 method and save it to the specified location.

This method uses the WhiteboxTools library to create a flow direction raster based on the D8 flow algorithm. The generated raster is saved to the provided output path, or a default path based on the project metadata if no path is specified.

Parameters:

filled_dem (str) – The file path to the filled DEM (digital elevation model) that will be used to compute flow directions.
output_path (str, optional) – The file path where the flow direction raster will be saved. If not provided, the default path will be ‘{output_dir}/{project_name}_d8.tif’.

Returns:

The absolute path to the saved flow direction raster.

Return type:

str
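
The D8 rule assigns each cell a single flow direction toward its steepest downslope neighbour among the eight surrounding cells. A minimal in-memory sketch follows; `d8_flow_direction` is a hypothetical name, and directions are returned as (d_row, d_col) offsets rather than in the pointer encoding a particular tool writes to disk.

```python
import math

# The eight neighbour offsets as (d_row, d_col).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def d8_flow_direction(dem, cell_size=1.0):
    """For each cell return the offset of its steepest downslope neighbour, or None."""
    rows, cols = len(dem), len(dem[0])
    directions = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_slope = 0.0  # only strictly downhill neighbours qualify
            for dr, dc in NEIGHBOURS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    distance = cell_size * math.hypot(dr, dc)  # diagonals are longer
                    slope = (dem[r][c] - dem[nr][nc]) / distance
                    if slope > best_slope:
                        best_slope = slope
                        directions[r][c] = (dr, dc)
    return directions


dem = [
    [9, 8, 7],
    [8, 6, 4],
    [7, 4, 1],
]
directions = d8_flow_direction(dem)
```

Cells that have no lower neighbour keep None, which is why depressions are filled before this step.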

generate_streams_raster(flow_accumulation_raster, threshold_area, output_path=None)

Generate a stream raster from the flow accumulation raster based on the specified stream threshold.

This method creates a stream raster by extracting streams from the flow accumulation raster. The streams are identified using the provided threshold area. The generated stream raster is saved to the specified output path, or a default path based on the project metadata if no path is provided.

Parameters:

flow_accumulation_raster (str) – The file path to the flow accumulation raster used to extract streams.
threshold_area (float) – The area threshold (in map units) used to define streams in the flow accumulation raster.
output_path (str, optional) – The file path where the stream raster will be saved. If not provided, the default path will be ‘{output_dir}/{project_name}_stream.tif’.

Returns:

The absolute path to the saved stream raster.

Return type:

str
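
Stream extraction by threshold reduces to a comparison per cell. The sketch below assumes the accumulation grid holds cell counts and converts them to area with a hypothetical `cell_area`; `extract_streams` is also a hypothetical name, and the units the method expects for threshold_area should be checked against the DEM.

```python
def extract_streams(flow_acc, threshold_area, cell_area=1.0):
    """Mark cells whose contributing area meets the threshold (1 = stream, 0 = not)."""
    return [
        [1 if cells * cell_area >= threshold_area else 0 for cells in row]
        for row in flow_acc
    ]


streams = extract_streams([[1, 2], [3, 4]], threshold_area=3)
streams_scaled = extract_streams([[1, 2], [3, 4]], threshold_area=3, cell_area=2.0)
```

A lower threshold produces a denser stream network, which in turn adds more stream points to the mesh.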

generate_watershed_boundary(watershed_mask, output_path=None)

Generate the watershed boundary shapefile from the watershed raster.

This method extracts the watershed boundary from a given watershed mask (raster) and saves the boundary as a shapefile. The watershed boundary is created by converting raster values to vector shapes, and the result is saved to a specified output path or a default path based on the project metadata if no path is provided.

Parameters:

watershed_mask (str) – The file path to the watershed mask raster, which is used to delineate the watershed boundary.
output_path (str, optional) – The file path where the watershed boundary shapefile will be saved. If not provided, the default path will be ‘{output_dir}/{project_name}_watershed_bound.tif’.

generate_watershed_mask(flow_direction_raster, pour_point_shp, output_path=None)

Delineate the watershed in the form of a raster using the pour point shapefile and the flow direction raster.

This method uses the pour point shapefile and the flow direction raster (D8 flow direction) to delineate the watershed area. The result is saved as a raster, which can be stored in a specified location or a default location based on the project metadata if no path is provided.

Parameters:

flow_direction_raster (str) – The file path to the flow direction raster (D8 method) used to delineate the watershed.
pour_point_shp (str) – The file path to the shapefile containing the pour point(s) used to define the watershed outlet.
output_path (str, optional) – The file path where the watershed mask raster will be saved. If not provided, the default path will be ‘{output_dir}/{project_name}_watershed_msk.tif’.

Returns:

The absolute path to the saved watershed mask raster.

Return type:

str

class pytRIBS.mesh.mesh.GenerateMesh(path_to_raster, path_to_watershed, path_to_stream_network, path_to_outlet, maxlevel=None, feature_method='curvature')

A class for generating a locally refined triangular irregular network (TIN) mesh from raster data, watershed boundaries, and stream networks.

This class performs several operations to generate the TIN mesh, including extracting raster and wavelet information, loading watershed boundaries, stream networks, and outlet points, and utilizing wavelet decomposition to refine the mesh and incorporate significant details.

Parameters:

path_to_raster (str) – Path to the raster file used for extracting elevation data and wavelet analysis.
path_to_watershed (str) – Path to the shapefile containing watershed boundary data.
path_to_stream_network (str) – Path to the shapefile containing stream network data.
path_to_outlet (str) – Path to the shapefile containing outlet point data.
maxlevel (int, optional) – Maximum level for wavelet decomposition. If None, the maximum level is determined from the data.

Attributes:

normalizing_coeff (float, optional) – Coefficient used for normalizing wavelet coefficients, initially None.
raster (str) – Path to the raster file.
maxlevel (int, optional) – Maximum level for wavelet decomposition.
watershed (GeoDataFrame) – GeoDataFrame containing watershed boundaries.
stream_network (GeoDataFrame) – GeoDataFrame containing stream network data.
outlet (GeoDataFrame) – GeoDataFrame containing outlet points.

convert_coords_to_mesh(coords)

Converts a set of coordinates into a 2D mesh using Delaunay triangulation.

This method takes an array of coordinates and converts them into a 2D mesh. The input coordinates include x, y, and z values, along with an optional boundary code. While the boundary codes label points in the mesh, they currently do not influence the mesh creation.

Parameters:

coords (numpy.ndarray) – An array where each row represents a point with x, y, z coordinates and a boundary code.

Returns:

A 2D mesh generated from the input coordinates using Delaunay triangulation.

Return type:

pyvista.PolyData

Raises:

IndexError – If the input coords array does not have at least four columns.
ValueError – If the input coords array is empty or has incorrect dimensions.

Notes

The boundary codes provided in the coords array are stored with the mesh but do not currently affect the triangulation process. Elevation data (z-values) are included in the mesh as a scalar field, which could be used for subsequent analysis.

static convert_points_to_gdf(points): Helper function for converting points generated from extract_points_from_significant_details to a geopandas GeoDataFrame.

static distance_to_nearest_n(points, n=6)

Calculates the median and maximum distance to the nearest n points for the points in a list.

This function uses a KD-tree to efficiently compute the distances between points and determine the distances to the n nearest neighbors for each point. It returns the median and maximum distances to these neighbors.

Parameters:

points (list of tuples) – A list of (x, y) coordinates representing the points to analyze.
n (int, optional) – The number of nearest neighbors to consider for each point. Defaults to 6.

Returns:

float : The median distance to the nearest n points for each point.
float : The maximum distance to the nearest n points for each point.

Return type:

tuple of (float, float)

Raises:

ValueError – If the input points is empty or if n is less than 1.
TypeError – If points is not a list of tuples or if n is not an integer.

Notes

This method computes distances using a KD-tree, ensuring efficient calculations even with large datasets. It calculates the distance to the nearest n points for each point in the input list, then returns the median and maximum distances across all points.
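
A brute-force equivalent helps clarify what is being summarized. The sketch below replaces the KD-tree with an exhaustive search, so it only suits small inputs, and it assumes that each point is first reduced to the mean distance to its n nearest neighbours before the median and maximum are taken across points. That aggregation order and the name `nearest_n_distance_stats` are assumptions.

```python
import math
from statistics import median


def nearest_n_distance_stats(points, n=6):
    """Return (median, maximum) of each point's mean distance to its n nearest points."""
    if len(points) < 2 or n < 1:
        raise ValueError("need at least two points and n >= 1")
    per_point = []
    for i, (x0, y0) in enumerate(points):
        distances = sorted(
            math.hypot(x0 - x1, y0 - y1)
            for j, (x1, y1) in enumerate(points)
            if j != i  # skip the point itself
        )
        nearest = distances[:n]
        per_point.append(sum(nearest) / len(nearest))
    return median(per_point), max(per_point)


med, mx = nearest_n_distance_stats([(0, 0), (1, 0), (3, 0)], n=1)
```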

extract_points_from_significant_details(threshold)

Extracts significant points from wavelet decomposition details and processes them.

This method identifies significant details from wavelet decomposition at various levels, filters closely spaced points, and generates additional points along stream paths. It combines these points with boundary codes and interpolated elevations and returns the final set of points along with a buffered watershed geometry.

Parameters:

threshold (float) – The minimum threshold for identifying significant details in the wavelet decomposition.

Returns:

numpy.ndarray : Array of points with their x and y coordinates, elevations, and boundary codes.
geometry : Buffered watershed geometry, which can be used for further spatial analysis.

Return type:

tuple of (numpy.ndarray, geometry)

Raises:

AttributeError – If self.wavelet_packet, self.maxlevel, or other required attributes are not properly set.
ValueError – If there are issues with the processing or filtering of points.

Notes

This method first identifies the significant detail points in the wavelet decomposition at various levels, then removes any points closer than the raster resolution. It generates points along stream paths and combines all points with interpolated elevations and boundary codes. The method ensures that the points are unique and returns them along with a buffered watershed geometry.

Examples

>>> points, buffered_watershed = obj.extract_points_from_significant_details(threshold=0.5)
>>> print(points)
[[x1, y1, elevation1, boundary_code1], [x2, y2, elevation2, boundary_code2], ...]

>>> print(buffered_watershed)
<Polygon geometry>

extract_raster_and_wavelet_info()

Extracts information from a raster file and performs wavelet decomposition on the raster data.

This method reads the first band of the raster file, extracts its spatial information (coordinates, transformation, bounds, width, and height), and computes the corresponding meshgrids of x and y coordinates. It then performs a 2D wavelet decomposition on the raster data and stores the resulting wavelet packet. The wavelet decomposition level is set or updated.

Parameters:

None

Attributes:

data (numpy.ndarray) – The first band of the raster file as a 2D array.
transform (affine.Affine) – The affine transformation matrix for the raster file.
bounds (rasterio.coords.BoundingBox) – The bounding box of the raster file.
width (int) – The width of the raster in pixels.
height (int) – The height of the raster in pixels.
x_grid (numpy.ndarray) – The x-coordinates of the raster data as a meshgrid.
y_grid (numpy.ndarray) – The y-coordinates of the raster data as a meshgrid, adjusted based on the transformation.
wavelet_packet (pywt.WaveletPacket2D) – The wavelet packet object containing the decomposed data.
maxlevel (int) – The maximum level for wavelet decomposition, updated if necessary.
normalizing_coeff (float) – The coefficient used to normalize wavelet coefficients, computed by find_max_maximum_coeffs().

Raises:

None

Notes

The y-coordinates (y_grid) are flipped if the transformation requires it.

filter_coords_within_geometry(coords, buffer_distance)

Filters and categorizes coordinates based on their position relative to a buffered watershed geometry.

Parameters:

coords (numpy.ndarray) – An array of shape (N, 2) representing coordinates to be filtered.
buffer_distance (float) – The distance to buffer the watershed geometry.

Returns:

all_coords (numpy.ndarray) – An array of filtered coordinates that include original, boundary, and inner boundary points.
all_boundary_codes (numpy.ndarray) – An array of boundary codes corresponding to the filtered coordinates. Codes indicate whether a coordinate is part of the original watershed (0), the outer boundary (1), or an updated outlet point (2).
buffered_watershed (shapely.geometry.Polygon) – The buffered version of the original watershed geometry used for filtering.

Notes

The function processes the original watershed geometry by applying a specified buffer distance and generates boundary points. It finds coordinates within the original watershed and boundary points outside of it, adjusts coordinates based on proximity to the updated outlet, and categorizes coordinates with boundary codes.

Coordinates within the original watershed are assigned a boundary code of 0.
Coordinates on the outer boundary are assigned a boundary code of 1.
The outlet is adjusted to the nearest boundary point and given a boundary code of 2.

find_max_average_coeffs()

Calculates the maximum average coefficient from the wavelet packet decomposition.

This method computes the average of the absolute values of the vertical, horizontal, and diagonal wavelet coefficients at each level of the wavelet decomposition. It then finds and returns the maximum of these average coefficients across all decomposition levels.

The method uses the self.wavelet_packet attribute, which should be an instance of pywt.WaveletPacket2D containing the decomposed data, and self.maxlevel, which defines the maximum level of decomposition.

Returns:

The maximum average coefficient across all levels of the wavelet decomposition.

Return type:

float

Raises:

AttributeError – If self.wavelet_packet or self.maxlevel is not properly set.
ValueError – If there are issues with the wavelet coefficients or their dimensions.

Notes

The wavelet packet decomposition must be performed before calling this method. The method averages the absolute values of the coefficients for the vertical (‘v’), horizontal (‘h’), and diagonal (‘d’) details at each level of the decomposition.

find_max_maximum_coeffs()

Calculates the maximum coefficient from the wavelet packet decomposition.

This method computes the maximum of the absolute values of the vertical, horizontal, and diagonal wavelet coefficients at each level of the wavelet decomposition. It then finds and returns the maximum of these coefficients across all decomposition levels.

The method uses the self.wavelet_packet attribute, which should be an instance of pywt.WaveletPacket2D containing the decomposed data, and self.maxlevel, which defines the maximum level of decomposition.

Returns:

The maximum coefficient across all levels of the wavelet decomposition.

Return type:

float

Raises:

AttributeError – If self.wavelet_packet or self.maxlevel is not properly set.
ValueError – If there are issues with the wavelet coefficients or their dimensions.

Notes

The wavelet packet decomposition must be performed before calling this method. The method computes the maximum of the absolute values of the coefficients for the vertical (‘v’), horizontal (‘h’), and diagonal (‘d’) details at each level of the decomposition.
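
The two coefficient summaries (maximum of the per-level averages, and overall maximum) can be illustrated with a hand-written Haar transform. This is a sketch under stated assumptions, not the class's code: it uses a plain Haar step on the approximation band at each level instead of pywt.WaveletPacket2D, the labels horizontal, vertical and diagonal follow this sketch's own convention, and the grid dimensions are assumed to be divisible by 2 at every level.

```python
def haar_step(grid):
    """One level of a 2D Haar transform on a grid with even dimensions."""
    approx, details = [], []
    for r in range(0, len(grid) - 1, 2):
        approx_row = []
        for c in range(0, len(grid[0]) - 1, 2):
            a, b = grid[r][c], grid[r][c + 1]
            d, e = grid[r + 1][c], grid[r + 1][c + 1]
            approx_row.append((a + b + d + e) / 2.0)
            details.append((
                (a + b - d - e) / 2.0,  # horizontal detail
                (a - b + d - e) / 2.0,  # vertical detail
                (a - b - d + e) / 2.0,  # diagonal detail
            ))
        approx.append(approx_row)
    return approx, details


def coefficient_summaries(grid, maxlevel):
    """Return (max of per-level mean |coeff|, max |coeff|) over all levels."""
    level_means, level_maxima = [], []
    for _ in range(maxlevel):
        grid, details = haar_step(grid)  # recurse on the approximation band
        magnitudes = [abs(v) for triple in details for v in triple]
        level_means.append(sum(magnitudes) / len(magnitudes))
        level_maxima.append(max(magnitudes))
    return max(level_means), max(level_maxima)


surface = [[1, 1, 5, 5] for _ in range(4)]  # a step that only shows up at level 2
max_average, max_maximum = coefficient_summaries(surface, maxlevel=2)
```

In this made-up surface the step between columns is invisible at the first level and appears only at the second, which is the kind of scale-dependent detail the thresholding relies on.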

static generate_meshbuild_input_file(filename, base_name, point_filename=None, mesh_filename=None)

Generates the input file for the meshbuilder executable.

Handles two modes: generating a mesh from points (provide ‘point_filename’) or processing a pre-existing mesh (provide ‘mesh_filename’).

Parameters:

filename (str) – Path where the input file will be written.
base_name (str) – The base name for meshbuilder output files (OUTFILENAME).
params (dict) – Dictionary of meshbuilder parameters. Expected keys: ‘velocity_ratio’, ‘baseflow’, ‘velocity_coef’, ‘flow_exp’.
point_filename (str, optional) – Filename of the .points file for mesh generation.
mesh_filename (str, optional) – Filename of an existing mesh to be processed.

generate_points_along_stream(coords)

Generates points along a stream network ensuring they are sufficiently spaced from each other.

This method computes points along the stream network by interpolating positions at regular intervals based on the resolution of the DEM (Digital Elevation Model). It ensures that these points are spaced sufficiently from each other and from existing interior points by checking their distances from both the stream network and any interior points.

Parameters:

coords (numpy.ndarray) – An array of coordinates representing the existing points in the interior.

Returns:

A list of points along the stream network, ensuring they are not too close to existing interior points.

Return type:

list of lists

Raises:

ValueError – If the input coords is empty or has incorrect dimensions.

Notes

The stream network is traversed, and points are interpolated along each stream line at regular intervals based on the DEM resolution. A KDTree is used to check distances between stream points and existing interior points, ensuring the generated stream points are not too close to each other or to the interior points.
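
The two ideas in this method, regular spacing along a line and a minimum separation from existing points, can be sketched in plain Python. The functions below are hypothetical illustrations: they use an exhaustive distance check in place of a KDTree, and the spacing and minimum distance are passed explicitly instead of being derived from the DEM resolution.

```python
import math


def points_along_line(line, spacing):
    """Interpolate points every `spacing` units along a polyline [(x, y), ...]."""
    points = [line[0]]
    carried = 0.0  # distance travelled since the last emitted point
    for (x0, y0), (x1, y1) in zip(line, line[1:]):
        segment = math.hypot(x1 - x0, y1 - y0)
        position = spacing - carried  # where the next point falls on this segment
        while position <= segment:
            t = position / segment
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            position += spacing
        carried = segment - (position - spacing)
    return points


def drop_close_points(candidates, existing, min_distance):
    """Keep candidates at least min_distance from existing and already kept points."""
    kept = []
    for x, y in candidates:
        others = list(existing) + kept
        if all(math.hypot(x - ox, y - oy) >= min_distance for ox, oy in others):
            kept.append((x, y))
    return kept


stream_points = points_along_line([(0, 0), (10, 0), (10, 10)], spacing=4)
filtered = drop_close_points(stream_points, existing=[(4.0, 0.5)], min_distance=1.0)
```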

get_extent(): Returns extent (xmin, xmax, ymin, ymax) of dem data used in mesh generation.

interpolate_elevations(points)

Interpolates elevation values for a set of geographic coordinates.

This method uses a regular grid interpolator to estimate elevation values at specified geographic coordinates based on a given elevation data grid. It performs linear interpolation and handles coordinates that fall outside the bounds of the grid by returning NaN.

Parameters:

points (numpy.ndarray) – An array of points where each row contains x and y coordinates for which elevation values are to be interpolated.

Returns:

An array of interpolated elevation values corresponding to the input coordinates.

Return type:

numpy.ndarray

Raises:

ValueError – If points does not have exactly two columns (x and y coordinates).
AttributeError – If self.data (the elevation data grid) or self.transform (the affine transformation matrix) is not properly set.

Notes

The interpolation is performed using a regular grid interpolator with linear interpolation. Points outside the bounds of the elevation grid will return NaN as their elevation value. The input points array must contain two columns, representing the x and y coordinates, respectively.
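
As an illustration of the behaviour described in the notes, the sketch below performs bilinear (linear in x and y) interpolation on a small regular grid and returns NaN outside its bounds. It is a pure-Python stand-in, not the interpolator used by pytRIBS; the function name interpolate_grid and the explicit xs/ys coordinate lists are assumptions made for this example.

```python
import math
from bisect import bisect_right

def interpolate_grid(xs, ys, grid, points):
    """Bilinear interpolation on a regular grid; NaN outside the grid.

    xs, ys : ascending coordinates of the grid columns and rows
    grid   : grid[row][col] elevation values
    points : iterable of (x, y) pairs
    """
    out = []
    for p in points:
        if len(p) != 2:
            raise ValueError("each point must have exactly two values (x, y)")
        x, y = p
        if not (xs[0] <= x <= xs[-1] and ys[0] <= y <= ys[-1]):
            out.append(math.nan)  # outside the elevation grid
            continue
        # Lower-left corner of the enclosing cell, clamped at the upper edge
        i = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
        j = min(max(bisect_right(ys, y) - 1, 0), len(ys) - 2)
        tx = (x - xs[i]) / (xs[i + 1] - xs[i])
        ty = (y - ys[j]) / (ys[j + 1] - ys[j])
        z = (grid[j][i] * (1 - tx) * (1 - ty)
             + grid[j][i + 1] * tx * (1 - ty)
             + grid[j + 1][i] * (1 - tx) * ty
             + grid[j + 1][i + 1] * tx * ty)
        out.append(z)
    return out

# Made-up 2 x 3 elevation grid
elev = interpolate_grid([0, 1, 2], [0, 1],
                        [[0, 1, 2], [10, 11, 12]],
                        [(0.5, 0.5), (2, 1), (3, 0)])
```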

static partition_mesh(volume, partition_args)

Partitions a mesh and produces a .reach file for parallel execution with tRIBS.

This function handles the partitioning of a mesh by interacting with Docker to build and process the mesh based on the provided input files and partitioning parameters. It performs the following steps:

Changes the working directory to the specified volume containing the mesh files.
Initializes and runs a Docker container to perform the mesh partitioning.
Executes the mesh partitioning workflow with the provided arguments.
Cleans up the Docker container and restores the working directory.

Parameters:

volume (str) – Path to the directory containing the .in and .points files for the mesh.
partition_args (list) –
A list of arguments for partitioning:
- str : The name of the input file.
- int : The number of nodes for partitioning.
- int : The partition method (1-3).
- str : The basename for output files.

Return type:

None

Notes

This function interacts with Docker to automate the mesh partitioning process for parallel execution with tRIBS. It switches to the specified working directory, initializes a Docker container, runs the mesh partitioning workflow, and then restores the original working directory after cleaning up.

static plot_mesh(mesh, scalar=None, **kwargs): Helper function for plotting a PyVista mesh.

process_level(level, threshold)

Processes a specific level of wavelet decomposition to extract significant detail points using gradient or curvature.

This method examines the vertical, horizontal, and diagonal coefficients at the specified wavelet level, identifies significant points based on a given threshold, and processes the points using either gradient or curvature to find the most significant detail within each raster cell. If the cell is too small for gradient or curvature computation, the centroid of the cell is used as the significant point.

Parameters:

level (int) – The level of wavelet decomposition to process.
threshold (float) – The threshold value used to filter significant coefficients.

Returns:

An iterator of tuples containing the x and y coordinates of significant points.

Return type:

iterator of tuples

Raises:

AttributeError – If self.wavelet_packet, self.data, self.x_grid, or self.y_grid is not properly set.

Notes

The method processes each raster cell to find significant points by either gradient, curvature, or centroid, depending on the self.feature_method attribute. The self.feature_method can be set to ‘gradient’, ‘curvature’, or default to ‘centroid’ if the method is not specified. The function filters coefficients based on the normalized wavelet coefficients and a threshold value, ensuring only significant points are extracted.

static remove_close_points(points, threshold): Removes points that are closer than a specified threshold.
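
One simple way to remove points closer than a threshold is a greedy pass that keeps a point only if it is far enough from every point already kept. The sketch below shows that idea with a brute-force distance check; the name thin_points is hypothetical, and pytRIBS may use a different strategy (for example a spatial index).

```python
import math

def thin_points(points, threshold):
    """Greedily keep points at least `threshold` away from all kept points."""
    kept = []
    for x, y in points:
        if all(math.hypot(x - kx, y - ky) >= threshold for kx, ky in kept):
            kept.append((x, y))
    return kept

thinned = thin_points([(0, 0), (0.5, 0), (2, 0), (2.2, 0), (5, 5)], 1.0)
```

With a greedy pass the result depends on the input order: earlier points win over later ones that fall within the threshold.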

static write_point_file(gdf, output): Helper function for writing out points file.

class pytRIBS.mesh.run_docker.MeshBuilderDocker(volume_path)

A class to manage the execution of the MeshBuilder tool in a Docker container.

This class facilitates setting up and running the MeshBuilder tool within a Docker container. It allows for specifying the Docker image and volume path. The class manages the creation and lifecycle of Docker containers for executing the MeshBuilder tool.

Parameters:: volume_path (str) – Path to the directory that will be mounted as a volume inside the Docker container.
Attributes:: volume_path (str) – The path to the directory that is mounted as a volume inside the Docker container.

clean_directory()

Clean the directory by removing intermediate files.

This method removes all files in the specified volume directory, except those with extensions .in, .points, .reach, or .out. It ensures that only intermediate files are deleted, leaving essential files intact. If an error occurs while deleting a file, an error message is printed.

Parameters:: None
Return type:: None
Raises:: Exception – If an error occurs while deleting a file, the exception is caught and a message is displayed.

Notes

The method iterates through the files in the self.volume_path directory.
Files with the extensions .in, .points, .reach, and .out are preserved, while others are deleted.
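
The filtering rule above can be sketched as follows. This is a minimal illustration run in a throwaway directory, not the pytRIBS code: the function signature, the returned list, the choice to skip subdirectories, and the file names are all assumptions made for the example.

```python
import os
import tempfile

KEEP_EXTENSIONS = {".in", ".points", ".reach", ".out"}

def clean_directory(path, keep=KEEP_EXTENSIONS):
    """Delete regular files in `path` whose extension is not in `keep`."""
    removed = []
    for name in os.listdir(path):
        full = os.path.join(path, name)
        if not os.path.isfile(full):
            continue  # leave subdirectories alone (assumption)
        if os.path.splitext(name)[1] in keep:
            continue  # preserved extension
        try:
            os.remove(full)
            removed.append(name)
        except OSError as exc:
            # Report the failure and carry on with the remaining files
            print(f"Could not delete {full}: {exc}")
    return sorted(removed)

# Demonstration with hypothetical file names
with tempfile.TemporaryDirectory() as tmp:
    for name in ("basin.in", "basin.points", "basin.reach",
                 "run.out", "scratch.tmp", "graph.log"):
        open(os.path.join(tmp, name), "w").close()
    removed = clean_directory(tmp)
    remaining = sorted(os.listdir(tmp))
```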

cleanup_container()

Stop and remove the Docker container.

This method stops the currently running Docker container and then removes it. If an error occurs during stopping or removing the container, the exception is caught, and an error message is displayed.

Parameters:: None
Return type:: None
Raises:: Exception – If an error occurs while stopping or removing the Docker container, the exception is caught and re-raised.

Notes

The method assumes that self.container is a valid Docker container object that is running.

execute_command_in_container(command)

Execute a command in the running Docker container.

This method runs a specified shell command inside the currently running Docker container. The output of the command is streamed and printed line by line. If the command runs successfully, a success message is printed; otherwise, the exit code is displayed.

Parameters:: command (str) – The shell command to be executed inside the Docker container.
Return type:: None
Raises:: Exception – If there is an error while executing the command inside the Docker container, the exception is caught and re-raised.

Notes

The container must be running before executing this method.
The method uses the exec_run function from the Docker client to run the command and stream the output.
The command is executed in the /bin/bash shell of the container.
The exit code of the command is checked, and success or failure is reported accordingly.

execute_meshbuild_workflow(file_path, nn, OPT_Part, basename)

Execute the MeshBuilder workflow directly in the running Docker container.

This method executes a sequence of commands inside the running Docker container to perform the MeshBuilder workflow. The workflow involves copying necessary files, running the MeshBuilder tool, and executing partitioning using the METIS tool.

Parameters:

file_path (str) – Path to the .in file to be used by MeshBuilder.
nn (int) – Number of computer nodes for partitioning.
OPT_Part (int) – Partitioning method to be used.
basename (str) – Simulation basename for the output files.

Return type:

None

Raises:

Exception – If an error occurs during the execution of the commands inside the Docker container, the exception is caught and re-raised.

Notes

The container must be running before this method is executed.
The method runs several shell commands in sequence inside the Docker container to set up the environment, run MeshBuilder, and partition the mesh.
The commands are executed using the exec_run function from the Docker client with sh -c to ensure proper execution.

initialize_docker_client()

Initialize the Docker client.

This method attempts to connect to the Docker daemon using the Docker client. If the connection is successful, it assigns the client to the self.client attribute. If the connection fails, it prints an error message and raises an exception.

Parameters:: None
Return type:: None
Raises:: Exception – If the connection to the Docker daemon fails, the exception is caught and re-raised.

Notes

This method assumes that Docker is properly installed and running on the system.

pull_image()

Pull the Docker image.

This method pulls the Docker image specified by the self.image_name attribute from a Docker registry. If the image pull is successful, a success message is printed. If an error occurs during the process, it prints an error message and raises the exception.

Parameters:: None
Return type:: None
Raises:: Exception – If there is an error while pulling the Docker image, the exception is caught and re-raised.

Notes

The method expects that the Docker client (self.client) has already been initialized and connected to the Docker daemon, and that self.image_name contains the name of the Docker image to pull.

run_container()

Run the Docker container with the specified volume.

This method starts a Docker container using the specified Docker image and mounts a volume from the host to the container. The volume is mounted in read-write mode at the /meshbuild/data path inside the container. The container is run with interactive terminal options (tty and stdin_open), and it is detached from the terminal.

Parameters:: None
Return type:: None
Raises:: Exception – If an error occurs while starting the Docker container, the exception is caught and re-raised.

Notes

On Windows systems, the method replaces backslashes (\) in the volume path with forward slashes (/).
The Docker client (self.client) must be initialized and connected to the Docker daemon before running this method.
The volume path (self.volume_path) must be correctly set to a valid host directory.
The container runs with /bin/bash as the entrypoint.

start_docker_desktop(attempts=0)

Ensure Docker is running, and if not, attempt to start it or prompt for installation.

This method checks if Docker is running by attempting to ping the Docker client. If Docker is not running, it attempts to start Docker based on the current operating system. If Docker cannot be started after a specified number of attempts, it prompts for installation.

Parameters:: attempts (int, optional) – The current number of attempts made to start Docker, by default 0.
Returns:: Returns True if Docker is running or successfully started, otherwise False.
Return type:: bool
Raises:: docker.errors.DockerException – If there is an issue connecting to Docker or starting it.

Notes

On Windows, the method attempts to start Docker Desktop via PowerShell.
On macOS, the method uses the open command to launch Docker.
On Linux, the method uses systemctl to start the Docker service.
If the operating system is not recognized, the method prints an error and returns False.
The method will attempt to start Docker up to 5 times, waiting 15 seconds between each attempt.
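
The retry behaviour described in the notes (up to 5 attempts, 15 seconds apart) can be sketched independently of Docker. In this illustration the check and start actions are passed in as callables, and a loop replaces the attempts counter; the function name ensure_running and its parameters are hypothetical and do not reflect the pytRIBS signature.

```python
import time

def ensure_running(is_running, start, max_attempts=5, wait_seconds=15,
                   sleep=time.sleep):
    """Return True once `is_running()` succeeds, trying `start()` in between."""
    for _ in range(max_attempts):
        if is_running():
            return True
        start()              # e.g. launch the service for the current OS
        sleep(wait_seconds)  # give the service time to come up
    return is_running()

# Simulated service that becomes available after three start attempts
state = {"starts": 0, "sleeps": []}

def fake_is_running():
    return state["starts"] >= 3

def fake_start():
    state["starts"] += 1

ok = ensure_running(fake_is_running, fake_start, sleep=state["sleeps"].append)
```

Injecting the sleep function keeps the example fast to run; in real use the default time.sleep would apply.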

class pytRIBS.results.evaluate.Evaluate

A collection of static methods for evaluating the performance of simulated data against observed data.

static kling_gupta_efficiency(observed, simulated)

Calculate the Kling-Gupta efficiency (KGE).

The Kling-Gupta efficiency is a metric that evaluates model performance based on correlation, variability, and bias. It ranges from -∞ to 1, with 1 indicating perfect model performance.

Parameters:

observed (numpy.ndarray) – Array of observed data values.
simulated (numpy.ndarray) – Array of simulated data values.

Returns:

The Kling-Gupta efficiency coefficient.

Return type:

float
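
For reference, a minimal sketch of the commonly used KGE formulation, KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), where r is the linear correlation, alpha the ratio of standard deviations, and beta the ratio of means. It is assumed here that pytRIBS follows this form; the library may use a variant, and the function name kge is hypothetical.

```python
import math

def kge(observed, simulated):
    """Kling-Gupta efficiency from correlation, variability, and bias terms."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed) / n)
    std_s = math.sqrt(sum((s - mean_s) ** 2 for s in simulated) / n)
    cov = sum((o - mean_o) * (s - mean_s)
              for o, s in zip(observed, simulated)) / n
    r = cov / (std_o * std_s)   # correlation
    alpha = std_s / std_o       # variability ratio
    beta = mean_s / mean_o      # bias ratio
    return 1 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

obs = [1.0, 2.0, 3.0, 4.0]   # made-up values
kge_perfect = kge(obs, obs)
kge_doubled = kge(obs, [2 * o for o in obs])
```

Doubling every simulated value leaves the correlation at 1 but sets both alpha and beta to 2, so the score drops to 1 - sqrt(2).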

static nash_sutcliffe(observed, simulated)

Calculate the Nash-Sutcliffe efficiency coefficient.

The Nash-Sutcliffe efficiency (NSE) is a normalized statistic that determines the relative magnitude of the residual variance compared to the measured data variance. It ranges from -∞ to 1, with 1 indicating a perfect match between observed and simulated values.

Parameters:

observed (numpy.ndarray) – Array of observed data values.
simulated (numpy.ndarray) – Array of simulated data values.

Returns:

The Nash-Sutcliffe efficiency coefficient.

Return type:

float
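
A minimal sketch of the standard NSE definition, 1 minus the ratio of the residual sum of squares to the sum of squared deviations of the observations from their mean. The function name nse is hypothetical and the values are made up.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SS_residual / SS_observed."""
    mean_o = sum(observed) / len(observed)
    ss_res = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_obs = sum((o - mean_o) ** 2 for o in observed)
    return 1 - ss_res / ss_obs

obs = [1.0, 2.0, 3.0, 4.0]
nse_perfect = nse(obs, obs)
nse_mean = nse(obs, [2.5] * 4)  # simulating the observed mean everywhere
```

A simulation that always predicts the observed mean scores 0, which is why NSE values below 0 are read as worse than the mean of the observations.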

static percent_bias(observed, simulated)

Calculate the percent bias.

The percent bias (PBIAS) measures the average tendency of the simulated data to be larger or smaller than the observed data. Positive values indicate model underestimation, while negative values indicate model overestimation.

Parameters:

observed (numpy.ndarray) – Array of observed data values.
simulated (numpy.ndarray) – Array of simulated data values.

Returns:

The percent bias.

Return type:

float
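
The sign convention stated above corresponds to PBIAS = 100 * sum(observed - simulated) / sum(observed). The sketch below assumes that form; the function name pbias is hypothetical and the values are made up.

```python
def pbias(observed, simulated):
    """Percent bias; positive when the model underestimates on average."""
    return 100 * sum(o - s for o, s in zip(observed, simulated)) / sum(observed)

obs = [10.0, 20.0, 30.0]
under = pbias(obs, [8.0, 18.0, 28.0])   # simulated below observed
over = pbias(obs, [12.0, 22.0, 32.0])   # simulated above observed
```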

static root_mean_squared_error(observed, simulated)

Calculate the root mean squared error (RMSE).

RMSE measures the square root of the average squared differences between observed and simulated values. It provides a measure of how well the model predicts the observed data.

Parameters:

observed (numpy.ndarray) – Array of observed data values.
simulated (numpy.ndarray) – Array of simulated data values.

Returns:

The root mean squared error.

Return type:

float
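
A minimal sketch of the RMSE calculation, with a hypothetical function name and made-up values. The result has the same units as the input data.

```python
import math

def rmse(observed, simulated):
    """Square root of the mean squared difference."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)

rmse_offset = rmse([0.0, 0.0, 0.0, 0.0], [2.0, 2.0, 2.0, 2.0])
```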

class pytRIBS.results.read.Read

Framework class for Results class

get_element_results()

Reads and processes element result files, and assigns them to self.element_results.

This method performs the following steps:
1. Constructs the directory path from the “outfilename” option in self.options.
2. Searches for files in the directory that match a certain pattern (*.pixel).
3. For each matching file, extracts the node ID from the filename.
4. Reads the content of each element file and creates a DataFrame of results.
5. Stores the results in a dictionary, with keys representing node IDs and each value being another dictionary containing the pixel DataFrame and a placeholder for waterbalance.

The resulting dictionary is assigned to self.element_results. The key “invar” is used for the .invpixel file, while node-specific files use their respective node IDs as keys.

Returns:: This method updates the self.element_results attribute with the processed data.
Return type:: None

get_element_wb_dataframe(element_id)

Generates a DataFrame with water balance results for a specified element.

This method retrieves water balance data for the given element_id and calculates various metrics based on the pixel data. It uses attributes like porosity and element area from the spatial variables to compute values such as saturated water, canopy snow water equivalent, and surface runoff.

Parameters:: element_id (int) – Identifier for the element whose water balance data is to be retrieved.
Returns:: DataFrame containing water balance metrics with the following columns:
- Time: Time series of the data
- Unsat_mm: Unsaturated moisture in mm
- Sat_mm: Saturated moisture in mm, adjusted by porosity
- CanopySWE_mm: Canopy snow water equivalent in mm
- SWE_mm: Snow water equivalent in mm
- Canop_mm: Canopy storage in mm
- P_mm_h: Precipitation in mm/h
- ET_mm_h: Evapotranspiration in mm/h, adjusted for sublimation and evaporation
- Qsurf_mm_h: Surface runoff in mm/h
- Qunsat_mm_h: Unsaturated runoff in mm/h
- Qsat_mm_h: Saturated runoff in mm/h, adjusted by element area
Return type:: pd.DataFrame

get_mrf_results(mrf_file=None)

Reads and processes the .mrf file containing model results.

If mrf_file is not provided, constructs the filename using the value of the “outhydrofilename” option from self.options, combined with the runtime value, and appends “_00.mrf” to it.

This method performs the following steps:
1. Reads the column names and units from the first two rows of the .mrf file.
2. Loads the data into a DataFrame, skipping the first two rows, which contain metadata.
3. Assigns the read column names to the DataFrame and adds the units as metadata.
4. Converts the Time column from hours since the start date to actual timestamps.
5. Updates the self.mrf attribute with the results, excluding extra time steps that may be included in the file.

Parameters:: mrf_file (str, optional) – The path to the .mrf file. If not provided, the filename is constructed based on the “outhydrofilename” and “runtime” options from self.options.
Returns:: This method updates the self.mrf attribute with the processed results DataFrame.
Return type:: None

get_mrf_wb_dataframe()

Generates a DataFrame with water balance results based on the MRF data.

This method computes water balance metrics from the MRF data using attributes such as drainage area, porosity, and various MRF parameters. The resulting DataFrame includes calculated values for unsaturated moisture, saturated moisture, snow water equivalent, and other hydrological metrics.

Returns:: DataFrame containing water balance metrics with the following columns:
- Time: Time series of the data
- Unsat_mm: Unsaturated moisture in mm, calculated using the product of moisture storage, drainage weight, and porosity
- Sat_mm: Saturated moisture in mm, calculated using the product of drainage weight and porosity
- CanopySWE_mm: Canopy snow water equivalent in mm
- SWE_mm: Snow water equivalent in mm
- Canop_mm: Canopy storage in mm (currently set to 0 as it is not averaged)
- P_mm_h: Precipitation in mm/h
- ET_mm_h: Evapotranspiration in mm/h, adjusted for sublimation and evaporation
- Qsurf_mm_h: Surface runoff in mm/h, adjusted for drainage area
- Qunsat_mm_h: Unsaturated runoff in mm/h (currently set to 0; subject to validation)
- Qsat_mm_h: Saturated runoff in mm/h (currently set to 0; subject to validation)
Return type:: pd.DataFrame

get_qout_results()

Reads the outlet discharge and water level data from a specified .qout file.

This method reads a .qout file containing outlet discharge and water level data, parses it into a DataFrame, and converts the time information from hours since the start date to actual timestamps.

The .qout file is expected to be named by appending ‘_Outlet.qout’ to the value of the “outhydrofilename” option from the self.options dictionary.

The method performs the following steps:
1. Reads the .qout file into a DataFrame with columns [‘Time_hr’, ‘Qstrm_m3s’, ‘Hlev_m’].
2. Converts the Time_hr column from hours since the start date into actual timestamps.
3. Returns the DataFrame with an additional Time column representing the timestamps.

Returns:: A DataFrame containing columns [‘Time_hr’, ‘Qstrm_m3s’, ‘Hlev_m’, ‘Time’], where:
- Time_hr is the time in hours since the start date.
- Qstrm_m3s is the discharge in cubic meters per second.
- Hlev_m is the water level in meters.
- Time is the converted timestamp corresponding to each time step.
Return type:: pandas.DataFrame
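
The conversion from hours since the start date to timestamps can be sketched with the standard library alone. The start date, the sample rows, and the helper name hours_to_timestamps are made up for this illustration; pytRIBS performs the conversion on a pandas DataFrame.

```python
from datetime import datetime, timedelta

def hours_to_timestamps(hours, start):
    """Convert hours elapsed since `start` into datetime objects."""
    return [start + timedelta(hours=h) for h in hours]

start = datetime(2024, 1, 1, 0, 0)  # hypothetical simulation start
# Made-up rows in the order (Time_hr, Qstrm_m3s, Hlev_m)
rows = [(0.0, 0.12, 0.30), (0.5, 0.15, 0.32), (25.0, 0.90, 0.75)]
times = hours_to_timestamps([r[0] for r in rows], start)
```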

read_element_files(element_results_file)

Reads a .pixel file from tRIBS model results and converts hourly time steps to datetime.

This method performs the following steps:
1. Reads the content of the specified .pixel file into a pandas DataFrame.
2. Converts the Time_hr column from hourly timesteps into datetime objects based on the starting date.
3. Adds a Time column to the DataFrame that contains the converted datetime values.

Parameters:: element_results_file (str) – Path to the .pixel file containing the tRIBS model results.
Returns:: DataFrame containing the results with an updated Time column reflecting datetime values.
Return type:: pd.DataFrame

class pytRIBS.results.visualize.Viz

Framework class for Results Class

create_animation(outfile, df_dict, frames, var, fps=4, vlims=None, nan_color='gray', nan_edge_color='red', cmap='viridis')

Create and save an animation based on a dictionary of DataFrames of tRIBS dynamic files.

Parameters:

outfile (str) – The file path for saving the animation; the format is determined from the file extension (.mp4, .gif, .avi, .html).
df_dict (dict) – A dictionary where keys represent animation frames and values are DataFrames to be plotted.
frames (iterable) – Iterable containing keys from df_dict representing the frames to include in the animation.
var (str) – The column name in DataFrames to be plotted.
fps (int, optional) – Frames per second for the animation (default is 4).
vlims (tuple, optional) – Tuple containing minimum and maximum values for color normalization (default is None).
nan_color (str, optional) – Color for NaN values in the plot (default is ‘gray’).
nan_edge_color (str, optional) – Edge color for NaN values in the plot (default is ‘red’).
cmap (str, optional) – Name of the colormap used for plotting (default is ‘viridis’).

Returns:

None

Raises:

ValueError – If outfile is not a valid file path or frames is empty.

Notes

This method creates an animation by iterating over frames specified in the frames parameter.
Each frame corresponds to a key in the df_dict dictionary, and the corresponding DataFrame is plotted.
NaN values in the DataFrame are flagged with the specified nan_color and nan_edge_color.
The animation format is dependent on the outfile extension with the specified frames per second (fps).

Example

>>> # Assuming instance is an instance of the class that provides the create_animation method
>>> instance.create_animation("animation.gif", df_dict, frames=['0', '1', '2', '3'], var="ET", fps=10)

static discrete_colormap(N, base_cmap=None)

static plot_water_balance(waterbalance, saved_fig=None)

Plots water balance components and saves the figure if a filename is provided.

This function creates a bar plot of water balance components, including precipitation (nP), runoff (nQ), evapotranspiration (nET), and changes in storage (dS). It displays labels for the difference between precipitation and the sum of other components. The plot is saved to a file if saved_fig is provided.

Parameters:

waterbalance (pd.DataFrame) – DataFrame containing water balance components with columns:
- nP: Precipitation
- nQ: Runoff
- nET: Evapotranspiration
- dS: Change in storage
saved_fig (str, optional) – Filename to save the figure. If not provided, the figure is not saved.

Returns:

A tuple containing the matplotlib.figure.Figure and matplotlib.axes.Axes objects for the plot.

Return type:

tuple

Notes

The plot includes a stacked bar chart of nQ, nET, and dS with different colors.
Labels indicate the net difference between nP and the sum of dS, nQ, and nET.
The plot will automatically format the x-axis dates and display mean difference in the plot.
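
The labelled difference is the closure residual nP - (nQ + nET + dS). A minimal sketch of that arithmetic, using made-up values in mm for two periods and a hypothetical helper name:

```python
def water_balance_residual(nP, nQ, nET, dS):
    """Precipitation minus the sum of runoff, ET, and storage change."""
    return nP - (nQ + nET + dS)

# Hypothetical water balance components for two periods (mm)
periods = [
    {"nP": 500.0, "nQ": 120.0, "nET": 350.0, "dS": 25.0},
    {"nP": 420.0, "nQ": 90.0, "nET": 340.0, "dS": -13.0},
]
residuals = [water_balance_residual(**p) for p in periods]
mean_residual = sum(residuals) / len(residuals)
```

A residual near zero indicates that the simulated fluxes and storage change account for the precipitation input.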

class pytRIBS.results.waterbalance.WaterBalance

get_element_water_balance(method)

Calculate water balance for elements in the model.

This method iterates through element results and calculates the water balance for a specific node or key. The user can specify the method for calculating the time frames over which the water balance is computed.

Parameters:: method (str or list of str) – A string specifying the calculation method (‘full’ for the entire time period or ‘water_year’ for water year calculations), or a list of custom date ranges in ‘YYYY-MM-DD’ format (e.g., [‘2024-01-01’, ‘2024-01-31’]).
Return type:: None

Notes

The method reads node results from the node output list and calculates the water balance for each element.
The method parameter determines how the time period is defined for water balance calculations:
- ‘full’: The full simulation period.
- ‘water_year’: Calculations based on the water year.
- A custom date range can be provided as a list of strings in ‘YYYY-MM-DD’ format.
The water balance results are stored in the self.element dictionary with the key “waterbalance” for each node.

Examples

Calculate full water balance:
>>> get_element_water_balance('full')
Calculate custom water balance for specific dates:
>>> get_element_water_balance(['2024-01-01', '2024-01-31'])
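
To illustrate how the three forms of the method argument could map to time periods, the sketch below resolves each into a list of (start, end) dates. It rests on assumptions not confirmed by this reference: water years are taken to run from 1 October to 30 September, a custom list is read pairwise as start and end dates, and the helper names are hypothetical.

```python
from datetime import date

def water_year(d):
    """Water year label, assuming water years start on 1 October."""
    return d.year + 1 if d.month >= 10 else d.year

def resolve_periods(method, sim_start, sim_end):
    """Return a list of (start, end) dates for the requested method."""
    if method == "full":
        return [(sim_start, sim_end)]
    if method == "water_year":
        periods = []
        for wy in range(water_year(sim_start), water_year(sim_end) + 1):
            # Clip each water year to the simulation period
            start = max(sim_start, date(wy - 1, 10, 1))
            end = min(sim_end, date(wy, 9, 30))
            periods.append((start, end))
        return periods
    # Otherwise: 'YYYY-MM-DD' strings read pairwise as start/end (assumption)
    dates = [date.fromisoformat(s) for s in method]
    return list(zip(dates[0::2], dates[1::2]))

sim_start, sim_end = date(2023, 6, 1), date(2024, 12, 31)  # made-up period
by_year = resolve_periods("water_year", sim_start, sim_end)
custom = resolve_periods(["2024-01-01", "2024-01-31"], sim_start, sim_end)
```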

get_mrf_water_balance(method)

Calculate the water balance from watershed averaged results (*.mrf).

This method computes the water balance for the *.mrf file based on the specified method and stores the result in the obj.mrf dictionary.

Parameters:: method (str or list of str) – A string specifying the calculation method (‘full’ for the entire time period, ‘water_year’ for water year calculations), or a list of custom date ranges in ‘YYYY-MM-DD’ format (e.g., [‘2024-01-01’, ‘2024-01-31’]).
Return type:: None

Notes

The method uses the _run_mrf_water_balance internal function to compute the water balance based on the provided method.
The resulting water balance is stored in the self.mrf dictionary with the key “waterbalance”.

Examples

Calculate full MRF water balance:
>>> get_mrf_water_balance('full')
Calculate custom MRF water balance for specific dates:
>>> get_mrf_water_balance(['2024-01-01', '2024-01-31'])

pytRIBS Shared and Helper Classes

class pytRIBS.shared.aux.Aux

static build(source_file, build_directory, verbose=True, exe='tRIBS', parallel='ON', cxx_flags='-O2')

Run a tRIBS model simulation with optional arguments.

Run_simulation assumes that if relative paths are used, the binary and input file are located in the same directory. This means that any keywords that depend on a relative path must be specified relative to the directory from which the tRIBS binary is executed. You can pass the locations of the input file and executable as paths, in which case the function copies the binary and input file to the same directory and then deletes both after the model run is complete. Optional arguments can be passed to store

Parameters:

binary_path (str) – The path to the binary model executable.
control_file_path (str) – The path to the input control file for the binary.
optional_args (str) – Optional arguments to pass to the binary.

Returns:

The return code of the binary model simulation.

Return type:

int

clean()

static convert_to_datetime(starting_date)

Returns a pandas date-time object.

Parameters:: starting_date (str) – The start date of a given model simulation. Note that it must be in tRIBS format.
Return type:: pandas Timestamp

static discrete_cmap(N, base_cmap='viridis'): Create an N-bin discrete colormap from the specified input map.

static fillnodata(files, overwrite=False, resample_pixel_size=None, resample_method='nearest', **kwargs)

Fills nodata gaps in raster files based on a maximum search distance and optionally resamples the raster.

Parameters:

files (list) – List of paths to raster files.
overwrite (bool) – If True, the original files will be overwritten with filled data. If False, new files with a “_filled” suffix will be created.
resample_pixel_size (float, optional) – Target pixel size for resampling. If None, no resampling is performed.
resample_method (str, optional) – Method for resampling. Choices are ‘nearest’, ‘bilinear’, ‘cubic’, etc. Defaults to ‘nearest’.
**kwargs – Additional keyword arguments to be passed to rasterio.fill.fillnodata.

Note: This function essentially wraps rasterio.fill.fillnodata and includes optional resampling.

polygon_centroid_to_geographic(polygon, utm_crs=None, geographic_crs='EPSG:4326')

Converts the centroid of a polygon from UTM coordinates to geographic coordinates (latitude and longitude), and calculates the GMT offset of the local time zone at the centroid location.

Parameters:

polygon (shapely.geometry.Polygon) – A Shapely Polygon object for which the centroid’s geographic coordinates are to be calculated.
utm_crs (str, optional) – The EPSG code or CRS string of the UTM coordinate system. If not provided, it defaults to the CRS specified in the self.meta[‘EPSG’] attribute.
geographic_crs (str, optional) – The CRS string for the geographic coordinate system. Defaults to “EPSG:4326” for WGS84.

Returns:

A tuple containing:
- lat (float): Latitude of the centroid in decimal degrees.
- lon (float): Longitude of the centroid in decimal degrees.
- gmt_offset (int): GMT offset in hours based on the local time zone at the centroid location.

Return type:

tuple

Raises:

ValueError – If no UTM CRS is found and self.meta[‘EPSG’] is None, a ValueError is raised.

Notes

The function uses the Transformer class from the pyproj library to convert UTM coordinates to geographic coordinates.
The TimezoneFinder library is used to determine the local time zone based on latitude and longitude.
The GMT offset is calculated using the local time zone’s UTC offset.

Examples

>>> from shapely.geometry import Polygon
>>> polygon = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
>>> lat, lon, gmt_offset = self._polygon_centroid_to_geographic(polygon, utm_crs="EPSG:32633")
>>> print(lat, lon, gmt_offset)
(52.5167, 13.3833, 1)

print_tags(tag_name)

Prints .in options for a specified tag.

Parameters:: tag_name (str) – Tag to print. Currently supported: “io” (input/output), “physical” (physical model parameters), “time” (time parameters), “opts” (parameters for model options), “restart” (restart capabilities), “parallel” (parallel options).

Example

>>> m.print_tags("io")

static rename_file_with_date(file_path, date_str)

Renames a file by appending the provided date and ‘00’ for hours before the file extension.

Parameters:

file_path (str) – The full path of the file to be renamed.
date_str (str) – The date string in the format ‘YYYY-MM-DD’.

Returns:

The new file name after renaming.

Return type:

str

static run(executable, input_file, mpi_command=None, tribs_flags=None, log_path=None, store_input=None, timeit=True, verbose=True)

Run a tRIBS model simulation with optional arguments.

Run_simulation assumes that if relative paths are used, the binary and input file are located in the same directory. This means that any keywords that depend on a relative path must be specified relative to the directory from which the tRIBS binary is executed. You can pass the locations of the input file and executable as paths, in which case the function copies the binary and input file to the same directory and then deletes both after the model run is complete. Optional arguments can be passed to store

Parameters:

binary_path (str) – The path to the binary model executable.
control_file_path (str) – The path to the input control file for the binary.
optional_args (str) – Optional arguments to pass to the binary.

Returns:

The return code of the binary model simulation.

Return type:

int

utm_to_latlong(easting, northing, epsg=None)

Convert UTM coordinates to latitude and longitude using an EPSG code with pyproj.

Parameters:

easting (float) – UTM easting coordinate.
northing (float) – UTM northing coordinate.
epsg (int, optional) – EPSG code representing the UTM projection.

Returns:

A tuple containing latitude and longitude.

Return type:

tuple

class pytRIBS.shared.infile_mixin.Infile

Mixin for .in file parameters and definitions shared by both pytRIBS Classes Model & Results.

static create_input_file()

Creates a dictionary of tRIBS input options and assigns it to the attribute input_options.

This function loads a dictionary of the variables necessary for a tRIBS input file and is called upon initialization. The dictionary is assigned as the instance variable input_options of the Simulation class. Note that the templateVars file will need to be updated if additional keywords are added to the .in file.

Each subdictionary has a tags key, with the tag indicating the role of the given option or variable in the model simulation.

Tags:
- time: parameters related to model simulation times and time steps
- mesh: options for reading the mesh
- flow: flow routing parameters
- hydro: options and physical parameters for hydrological components of the model
- spatial: input raster files and tables for bedrock, groundwater, landuse, and soil properties
- meterological: options for meteorological data
- output: options and paths for model outputs
- forecast: suite of options for forecast mode
- stochastic: suite of options for stochastic mode
- restart: options for restart functionality
- parallel: options for parallel functionality

class pytRIBS.shared.inout.InOut

Shared Class for managing reading and writing tRIBS files

static read_ascii(file_path): Returns dictionary containing ‘data’, ‘profile’, and additional metadata. :param file_path: Path to ASCII (or other formats) raster. :return: Dict

read_grid_data_file(grid_type): Returns dictionary with content of a specified Grid Data File (.gdf) :param grid_type: string set to “weather”, “soil”, of “land”, with each corresponding to HYDROMETGRID, SCGRID, LUGRID :return: dictionary containg keys and content: “Number of Parameters”,”Latitude”, “Longitude”,”GMT Time Zone”, “Parameters” (a list of dicts)

static read_json(file_path)

read_landuse_table(file_path=None)

Returns list of dictionaries for each type of landuse specified in the .ldt file.

Land Use Reclassification Table Structure (*.ldt, see tRIBS documentation for more details):

#Types #Params
ID a b1 P S K b2 Al h Kt Rs V LAI theta*_s theta*_t
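
To make the "list of dictionaries" return shape concrete, here is a rough plain-Python sketch of parsing text laid out as described above. It assumes a first line holding two integers (#Types and #Params) followed by one whitespace-separated row per land use type. The function parse_ldt_text and the sample numbers are hypothetical placeholders; this is not the pytRIBS implementation.

```python
# Column names copied from the table structure described above.
LDT_COLUMNS = ["ID", "a", "b1", "P", "S", "K", "b2", "Al", "h", "Kt", "Rs",
               "V", "LAI", "theta*_s", "theta*_t"]

# Placeholder numbers only, not calibrated parameters; the header counts are
# illustrative and should follow the tRIBS documentation in real files.
sample = """2 14
1 0.2 0.1 1.0 0.5 0.12 3.7 0.2 0.5 0.9 50 0.6 2.0 0.3 0.4
2 0.3 0.2 1.5 0.6 0.10 3.9 0.3 1.0 0.8 60 0.7 3.0 0.3 0.4
"""


def parse_ldt_text(text):
    """Parse .ldt-style text into (header_counts, list of row dictionaries)."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    n_types, n_params = (int(v) for v in lines[0])
    rows = []
    for fields in lines[1:1 + n_types]:
        row = dict(zip(LDT_COLUMNS, (float(v) for v in fields)))
        row["ID"] = int(row["ID"])  # IDs are class labels, so keep them integral
        rows.append(row)
    return (n_types, n_params), rows


header, landuse_rows = parse_ldt_text(sample)
```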

read_met_sdf(file_path=None): Returns list of met stations, where information from each station is stored in a dictionary. :param file_path: Reads from options[“hydrometstations”][“value”], but can be separately specified. :return: List of dictionaries.

static read_met_station(file_path)

Reads a meteorological station data file and processes it into a pandas DataFrame with a datetime index.

Parameters:

file_path (str) – Path to the meteorological station data file. The file should be in a space-separated format with columns for year, month, day, and hour.

Returns:

A DataFrame containing the meteorological data with a single ‘date’ column as a datetime index, and the remaining columns from the input file.

Return type:

pandas.DataFrame

Notes

The function expects the input file to have columns ‘Y’, ‘M’, ‘D’, and ‘H’ for year, month, day, and hour, respectively.
The columns for year, month, day, and hour are converted into a single ‘date’ column of datetime type.
The original columns ‘Y’, ‘M’, ‘D’, and ‘H’ are dropped from the DataFrame after the datetime conversion.
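
The column handling described in these notes can be illustrated without pandas. The sketch below is a plain-Python stand-in, not the library's implementation. It assumes a header row, hours in the range 0 to 23, and a single made-up data column named PA. The function parse_station_text is hypothetical.

```python
from datetime import datetime

# Hypothetical file content: a header row, then year, month, day, hour and one
# made-up data column.
sample = """Y M D H PA
2002 6 1 0 1013.2
2002 6 1 1 1012.8
"""


def parse_station_text(text):
    """Return rows as dicts with a single 'date' key replacing Y, M, D and H."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    records = []
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        # Pop the four time columns so they are dropped after conversion.
        date = datetime(int(row.pop("Y")), int(row.pop("M")),
                        int(row.pop("D")), int(row.pop("H")))
        records.append({"date": date, **{k: float(v) for k, v in row.items()}})
    return records


records = parse_station_text(sample)
```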

read_point_files(): Returns Pandas dataframe of nodes or point used in tRIBS mesh.

read_precip_sdf(file_path=None): Returns list of precip stations, where information from each station is stored in a dictionary. :param file_path: Reads from options[“hydrometstations”][“value”], but can be separately specified. :return: List of dictionaries.

static read_precip_station(file_path): Returns pandas dataframe of precipitation from a station specified by file_path. :param file_path: Flat file with columns Y M D H R :return: Pandas dataframe

static write_ascii(raster_dict, output_file_path, dtype='float32', decimals=None): Writes raster data and metadata from a dictionary to an ASCII raster file. :param raster_dict: Dictionary containing ‘data’, ‘profile’, and additional metadata. :param output_file_path: Output ASCII raster file path. :param dtype: Data type for the output raster (default is ‘float32’). :param decimals: Optional integer specifying number of decimal places for raster values.

static write_geotiff(raster_dict, output_file_path, dtype='float32', compress=None)

Writes raster data and metadata from a dictionary to a GeoTIFF file.

This is a more efficient and robust alternative to ASCII for visualization and analysis purposes.

Parameters:

raster_dict – Dictionary containing ‘data’ and ‘profile’ keys. ‘data’ is the 2D numpy array of raster values. ‘profile’ is the rasterio metadata dictionary.
output_file_path – Path for the output GeoTIFF file (e.g., ‘output.tif’).
dtype – Data type for the output raster (default is ‘float32’).
compress – Optional compression method. Common choices are ‘lzw’, ‘deflate’, or ‘packbits’. Using compression is highly recommended to reduce file size.

static write_grid_data_file(grid_file, data): Writes the content of a dictionary to a specified Grid Data File (.gdf) :param grid_file: path to write out grid file to. :param data: dictionary containing keys and content: “Number of Parameters”, “Latitude”, “Longitude”, “GMT Time Zone”, “Parameters” (a list of dicts) :return: None

write_input_file(output_file_path, detailed=False): Writes .in file for tRIBS model simulation. :param self: :param output_file_path: Location to write input file to. :param detailed: Option to print input file with option descriptions and related info.

static write_landuse_table(landuse_list, file_path)

Writes out Land Use Reclassification Table (*.ldt) file with the following format:

#Types #Params
ID a b1 P S K b2 Al h Kt Rs V LAI theta*_s theta*_t

Parameters:

landuse_list – List of dictionaries containing land information specified by .ldt structure above.
file_path – Path to save *.ldt file.

static write_met_sdf(output_file_path, station_list)

Writes a list of meteorological stations to a flat file (i.e. *.sdf file).

Parameters:

station_list – List of dictionaries containing station information.
output_file_path – Output flat file path.

static write_met_station(df, output_file_path)

Converts a DataFrame with ‘date’ and ‘PA’, ‘TD’ or ‘RH’ or ‘VP’, ‘XC’, ‘US’, ‘TA’, ‘TS’, ‘NR’ columns to flat file format. See tRIBS documentation for more details on weather station data structure (i.e. *mdf files).

Parameters:

df – Pandas DataFrame with ‘date’ and the meteorological columns listed above.
output_file_path – Output flat file path.

static write_node_file(node_ids, file_path)

static write_point_file(nodes_gdf, output_file)

Write a points file from a GeoDataFrame of nodes.

Parameters:

nodes_gdf (GeoDataFrame) – GeoDataFrame containing nodes with ‘geometry’, ‘elevation’, and ‘bc’ columns.
output_file (str) – Path to the output points file.

Return type:

None

static write_precip_sdf(station_list, output_file_path)

Writes a list of precip stations to a flat file.

Parameters:

station_list – List of dictionaries containing station information.
output_file_path – Output flat file path.

static write_precip_station(df, output_file_path)

Converts a DataFrame with ‘date’ and ‘R’ columns to flat file format with columns Y M D H R.

Parameters:

df – Pandas DataFrame with ‘date’ and ‘R’ columns.
output_file_path – Output flat file path.
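
The date-to-columns conversion for precipitation can be sketched in plain Python. The example below is illustrative only: it works on (datetime, rainfall) pairs rather than a DataFrame, and it assumes the flat file starts with a "Y M D H R" header line. The function format_precip_rows is hypothetical.

```python
from datetime import datetime


def format_precip_rows(records):
    """Format (date, R) records as text lines with columns Y M D H R."""
    lines = ["Y M D H R"]  # header line is an assumption about the file layout
    for date, rain in records:
        lines.append(f"{date.year} {date.month} {date.day} {date.hour} {rain}")
    return "\n".join(lines) + "\n"


text = format_precip_rows([
    (datetime(2002, 6, 1, 0), 0.0),
    (datetime(2002, 6, 1, 1), 2.5),
])
print(text)
```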

class pytRIBS.shared.shared_mixin.Shared

Shared methods between the pytRIBS Classes.

static convert_to_datetime(starting_date)

Returns a pandas date-time object.

Parameters:

starting_date (str) – The start date of a given model simulation. It must be in tRIBS format.

Return type:

A pandas Timestamp object
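
As a rough illustration of the conversion, the sketch below parses a date string with the standard library instead of pandas. It assumes the tRIBS date layout is MM/DD/YYYY/HH/MM, which should be checked against the tRIBS documentation. The function parse_tribs_date is hypothetical and is not the pytRIBS method.

```python
from datetime import datetime

# Assumed tRIBS date layout: MM/DD/YYYY/HH/MM.
TRIBS_DATE_FORMAT = "%m/%d/%Y/%H/%M"


def parse_tribs_date(starting_date):
    """Parse an assumed tRIBS-style date string into a datetime object."""
    return datetime.strptime(starting_date, TRIBS_DATE_FORMAT)


start = parse_tribs_date("06/01/2002/00/00")
print(start.isoformat())
```

A string in any other layout raises ValueError, which makes format mistakes visible early.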

get_invariant_properties()

Reads and processes invariant spatial properties based on the parallel mode setting.

This method handles the integration of spatial variables and Voronoi files depending on the mode specified in the options. It merges parallel files or reads single files, computes weights, and loads Voronoi data.

The method does the following:

- Checks the parallelmode setting to determine if parallel processing is enabled.
- Merges parallel spatial files if in parallel mode, or reads a single spatial file if not.
- Computes weights based on the VAr column if in non-parallel mode.
- Loads Voronoi files based on the parallelmode setting.

Parameters:

None

Return type:

None

Notes

If parallelmode is set to 1, the method merges files with a _00i suffix and integrates spatial variables based on runtime values.
If parallelmode is set to 0, it reads a single file based on the outfilename and runtime values, and computes weights using the VAr column.
Voronoi files are read or merged based on the parallel mode setting.
If the parallelmode is not recognized, it prints an error message and sets the spatial variables and Voronoi data to None.

Example

>>> obj.get_invariant_properties()

Raises:

ValueError – If there are issues merging files or reading Voronoi data.
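
The exact weighting formula is not given here. One plausible reading, assuming VAr holds each node's Voronoi area, is that a node's weight is its share of the total area. The sketch below illustrates that reading only. The functions area_weights and weighted_mean are hypothetical.

```python
def area_weights(areas):
    """Return each area divided by the total, so the weights sum to one."""
    total = sum(areas)
    if total <= 0:
        raise ValueError("total area must be positive")
    return [area / total for area in areas]


def weighted_mean(values, weights):
    """Combine per-node values into a single area-weighted average."""
    return sum(value * weight for value, weight in zip(values, weights))


weights = area_weights([100.0, 300.0])      # assumed VAr values for two nodes
basin_mean = weighted_mean([2.0, 4.0], weights)
```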

get_spatial_files(suffix='_00d', dtime=0, write=True, header=True, colnames=None, single=True)

Reads and returns spatial output files (Dynamic or Integrated) for tRIBS models.

The method determines whether to look for Serial or Parallel output files based strictly on the ‘parallelmode’ setting in the tRIBS input file.

Parameters:

suffix (str) – Either _00d for dynamic outputs or _00i for time-integrated outputs.
dtime (int) – Option to specify time step at which to start merge of files.
write (bool) – Option to write combined dataframe to file (only applies if parallel).
header (bool) – Set to False if headers are not provided with spatial files.
colnames (list) – If header = False, column names can be provided here.
single (bool) – If single = True then only spatial files specified at dtime are read.

Returns:

Dictionary of pandas dataframes keyed by time string.

static grid_geodataframe(gdf, value_column, cell_size, nodata_value=-9999.0, fill_nodata_with_mean=False)

Rasterizes a GeoDataFrame using area-weighted averaging.

This method calculates the value of each raster cell from the proportional area of all Voronoi polygons that overlap it.

Parameters:

gdf (GeoDataFrame) – The GeoDataFrame that contains the voronoi polygons and outputs to rasterize. Must have a valid CRS.
value_column (str) – The name of the column in the gdf to use for the raster values.
cell_size (float) – The desired cell size (resolution) of the output raster.
nodata_value (float, optional) – The value for pixels that do not fall within any polygon. A value of -9999.0 is usually appropriate for tRIBS.
fill_nodata_with_mean (bool, optional) – If True, any remaining nodata cells in the final raster will be filled with the mean of all valid data cells. Defaults to False.

Returns:

A dictionary containing ‘data’ and ‘profile’ for write_ascii. Returns None if the input GeoDataFrame has no CRS defined.

Return type:

dict or None

Example

>>> dynamic_data_dict = results.merge_parallel_spatial_files(suffix="_00d", dtime=final_runtime, single=True)
>>> gdf_final_state = results.voronoi.merge(dynamic_data_dict, on='ID')
>>> final_gw_raster_dict = results.grid_geodataframe(gdf=gdf_final_state, value_column='Nwt', cell_size=30.0)

Raises:

Error – If there is not a valid CRS attached to the GeoDataFrame.
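
The idea of area-weighted averaging can be shown on a single raster cell. The sketch below is a simplified illustration, not the pytRIBS implementation: it uses axis-aligned boxes as stand-ins for Voronoi polygons and normalizes by the covered area, which is an assumption. The functions overlap_area and cell_value are hypothetical.

```python
def overlap_area(a, b):
    """Area shared by two axis-aligned boxes given as (xmin, ymin, xmax, ymax)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def cell_value(cell, polygons, nodata_value=-9999.0):
    """Area-weighted mean of polygon values over one raster cell.

    `polygons` is a list of (box, value) pairs; cells that no polygon
    touches receive `nodata_value`.
    """
    weighted_sum = 0.0
    covered = 0.0
    for box, value in polygons:
        area = overlap_area(cell, box)
        weighted_sum += area * value
        covered += area
    return weighted_sum / covered if covered > 0 else nodata_value


# One 30 x 30 cell split between two boxes covering 1/3 and 2/3 of it.
polygons = [((0.0, 0.0, 10.0, 30.0), 1.0), ((10.0, 0.0, 30.0, 30.0), 4.0)]
value = cell_value((0.0, 0.0, 30.0, 30.0), polygons)
```

Here the cell value works out to (300 × 1.0 + 600 × 4.0) / 900 = 3.0. Real Voronoi polygons would need a geometry library to compute the intersection areas.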

merge_parallel_voi(join=None, result_path=None, format=None, save=False)

Returns GeoDataFrame of merged Voronoi polygons from parallel tRIBS model run.

Parameters:

join – Data frame of dynamic or integrated tRIBS model output (optional).
save – Set to True to save GeoDataFrame (optional, default False).
result_path – Path to save GeoDataFrame (optional, default OUTFILENAME).
format – Driver options for writing GeoDataFrame (optional, default = ESRI Shapefile).

Returns:

GeoDataFrame

mesh2vtk(outfile)

Converts mesh data files into a VTK file format for visualization.

This function reads node, triangle, and elevation data from files and writes them to a VTK file. The VTK file will be an unstructured grid dataset containing points and cells, with associated scalar data.

Parameters:

outfile (str) – Path to the output VTK file where the mesh data will be written.

Return type:

None

Notes

The function expects the following files in the directory specified by the ‘outfilename’ option:
- A node file with a .nodes extension containing node coordinates and boundary codes.
- A triangle file with a .tri extension containing triangle vertex indices.
- A z-file with a .z extension containing elevation values.
The node file should contain columns for x, y coordinates, and a boundary code.
The triangle file should contain columns for vertex indices of triangles.
The z-file should contain elevation values for each node.
The output VTK file will include point data (coordinates and elevations) and cell data (triangles).
Boundary codes are used to set NaN values in the altitude scalars in the VTK file.

Example

>>> self.mesh2vtk('output_mesh.vtk')

Raises:

FileNotFoundError – If the required node, triangle, or z files cannot be found in the specified directory.
IndexError – If there is an issue reading data from the node, triangle, or z files, which may indicate file corruption.
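
For orientation, the sketch below shows what a legacy ASCII VTK unstructured grid with triangle cells and an elevation scalar looks like when built from in-memory lists. It is a simplified illustration under stated assumptions: it does not read .nodes, .tri, or .z files, it skips the boundary-code NaN handling, and the function legacy_vtk_text is hypothetical.

```python
def legacy_vtk_text(nodes, triangles, title="tRIBS mesh"):
    """Build legacy ASCII VTK text from nodes and triangles.

    nodes: list of (x, y, z); triangles: list of (i, j, k) zero-based indices.
    """
    lines = ["# vtk DataFile Version 3.0", title, "ASCII",
             "DATASET UNSTRUCTURED_GRID", f"POINTS {len(nodes)} float"]
    lines += [f"{x} {y} {z}" for x, y, z in nodes]
    # Each cell record stores a vertex count plus three indices: four integers.
    lines.append(f"CELLS {len(triangles)} {4 * len(triangles)}")
    lines += [f"3 {i} {j} {k}" for i, j, k in triangles]
    lines.append(f"CELL_TYPES {len(triangles)}")
    lines += ["5"] * len(triangles)  # 5 is the VTK cell type for a triangle
    lines += [f"POINT_DATA {len(nodes)}", "SCALARS Elevation float 1",
              "LOOKUP_TABLE default"]
    lines += [str(z) for _, _, z in nodes]
    return "\n".join(lines) + "\n"


vtk_text = legacy_vtk_text(
    nodes=[(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (0.0, 1.0, 3.0)],
    triangles=[(0, 1, 2)],
)
```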

static plot_mesh(mesh, scalar=None, **kwargs)

Plots a 3D mesh using PyVista with optional scalar data.

This method visualizes a mesh object, optionally using scalar data to color the mesh. It handles meshes from a file path or PyVista object and allows for customizing the plot with additional keyword arguments.

Parameters:

mesh (str or pv.PolyData) – If a string is provided, it should be a path to a mesh file that will be read using PyVista. If a PyVista PolyData object is provided, it will be used directly for plotting.
scalar (array-like, optional) – Scalar data to be used for coloring the mesh. If not provided, it defaults to the ‘Elevation’ array of the mesh. The scalar data must match the number of points or cells in the mesh.
**kwargs (keyword arguments) – Additional keyword arguments passed to pyvista.Plotter.add_mesh for further customization of the plot.

Returns:

A PyVista Plotter object configured to display the mesh.

Return type:

pv.Plotter

Notes

If scalar is provided, it will be used to color the mesh. Closed points or cells (where ‘BoundaryCode’ is 1) are set to NaN.
If the length of scalar matches the number of points, NaNs are assigned to closed points.
If the length of scalar matches the number of cells, NaNs are assigned to closed cells.
The plot camera is set to view from the top-down (xy plane) with north up.

Example

>>> mesh = pv.read('path_to_mesh_file.vtk')
>>> plotter = plot_mesh(mesh, scalar=my_scalar_data, cmap='viridis')
>>> plotter.show()

Raises:

ValueError – If the length of scalar does not match either the number of points or cells in the mesh.
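
The masking rule from the notes can be sketched with plain lists. This is an illustration only, assuming boundary codes are available per point and per cell. The function mask_closed is hypothetical and does not use PyVista.

```python
import math


def mask_closed(scalar, point_codes, cell_codes):
    """Replace values at closed points or cells (boundary code 1) with NaN."""
    if len(scalar) == len(point_codes):
        codes = point_codes
    elif len(scalar) == len(cell_codes):
        codes = cell_codes
    else:
        raise ValueError("scalar length must match the number of points or cells")
    return [math.nan if code == 1 else value
            for value, code in zip(scalar, codes)]


# Three points, the second of which is closed; one open cell.
masked = mask_closed([1.0, 2.0, 3.0], point_codes=[0, 1, 0], cell_codes=[0])
```

When the point and cell counts happen to be equal, this sketch treats the scalar as point data. That tie-break is an assumption.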

read_input_file(file_path): Reads .in file for tRIBS model simulation and assigns values to options attribute. :param file_path: Path to .in file.

static read_node_list(file_path)

Returns node list provided by .dat file.

The node list can be further modified or used for reading in element/pixel files and subsequent processing.

Parameters:

file_path (str) – Relative or absolute file path to .dat file.

Returns:

List of nodes specified by .dat file

Return type:

list

read_reach_file(filename=None): Returns GeoDataFrame containing reaches from tRIBS model domain. :param filename: Set to read _reach file specified from OUTFILENAME,but can be changed. :return: GeoDataFrame

read_voi_file(filename=None): Returns GeoDataFrame containing voronoi polygons from tRIBS model domain. :param filename: Set to read _reach file specified from OUTFILENAME,but can be changed. :return: GeoDataFrame

class pytRIBS.shared.shared_mixin.Meta

Class for project metadata.