RWQForecast

RWQForecast (Remote Water Quality Forecast) is a web application for short-term forecasting of water quality changes in reservoirs over a timeframe of days, for analyzing temporal changes in water quality within reservoirs, and for assessing the spatial distribution of water quality within a reservoir on a given date. The analysis combines Sentinel-2 satellite data with machine learning and deep learning methods.

The RWQForecast system consists of two parts: a web user interface (frontend) and a computational unit that downloads and processes the data.

RWQForecast offers the following options for processing and evaluating satellite data:

  • Estimation of water quality in reservoirs
  • Evaluation of temporal changes
  • Short-term forecast of water quality changes
  • Assessment of spatial distribution of water quality in reservoirs

RWQForecast allows evaluation of the following parameters:

  • ChlA – Chlorophyll-a concentration (µg.l⁻¹)

Upcoming models will include:

  • ChlB – Chlorophyll-b concentration (µg.l⁻¹)
  • TSS – Total suspended solids (mg.l⁻¹)
  • PC – Phycocyanin concentration (µg.l⁻¹)
  • APC – Allophycocyanin concentration (µg.l⁻¹)
  • PE – Phycoerythrin concentration (µg.l⁻¹)
  • CX – Carotenoids and xanthophylls concentration (µg.l⁻¹)
  • SD – Secchi disk transparency (m)

The system's functionality is designed to minimize user input requirements. The web application handles the user interface, where users simply select a reservoir from a map or list and specify the desired water quality parameters.

The source code is available at: https://github.com/JakubBrom/RWQForecast. The system is written in Python, the web service uses the Flask microframework, and the database is PostgreSQL/PostGIS.

Status

Testing version.

The testing web page is available at http://160.217.162.143:8080/ (it may be temporarily unavailable during development). A later version will be hosted at https://rwqforecast.com (upcoming).

Authors and Collaborators

Jakub Brom - libretto, drama, roles, actors, stage, light design, prompter, masks, coffeelings and other issues

Václav Nedbal - data acquisition, water field sampling, spectral measurement, laboratory preparation of samples

Blanka Tesařová - laboratory analyses

Jan Kuntzman - field work, laboratory preparation of samples

License

GNU GPL v. 3 or later

© 2024 Jakub Brom, University of South Bohemia in České Budějovice, Faculty of Agriculture and Technology

© 2024 AIHABs Consortium

E-mail: jbrom@fzt.jcu.cz

Documentation

The RWQForecast service has two parts. The first is the user interface (frontend); the second is the computational unit, which downloads and processes the data. The computational unit delivers its results to the user through the database, which is connected to the user interface. The computational unit is implemented in the RWQForecast services.

User interface

The RWQForecast system is designed to be user-friendly and intuitive. Some parts are still under development. The following tutorial provides a step-by-step guide to using the system:

  1. User account
    • The user registers using the "Sign Up" tab.
    • After successful registration, the user receives an email with a confirmation link.
    • After confirming the email, the user can log in to the system.
  2. Logging in
    • The user logs in to the system using the "Login" tab.
    • After successful login, the user is redirected to the "Home" tab.
  3. Analysis

    The user selects the desired reservoir, evaluation parameter, and prediction model either through the reservoir selection form or from the map window.

    After the selection is confirmed, the time series for the chosen parameter and reservoir is displayed with all available data. Missing parts of the time series can be filled in using the "Update dataset" button, which starts data retrieval and feature computation.

    To perform analyses, the user must set up OpenEO/CDSE access credentials. The system will request these credentials during or before the first analysis. The credentials can be obtained through a Copernicus Data Space Ecosystem account. If the user does not have an OpenEO account, they must register first.

    For parallel computing, multiple OpenEO credentials can be used, but a single key cannot be utilized for multiple analyses simultaneously.

    A table displays information about the reservoir and the dataset after confirmation. The time series line chart presents the average, median, and confidence intervals. The forecast chart shows a two-week prediction as an interactive line graph (upcoming). Time series and forecast data can be downloaded as a value table (upcoming for forecast).

    The user can display the spatial distribution of values for the selected reservoir. After the user selects a data acquisition date and confirms it, a table with statistics for the reservoir on that date is generated, and an interactive graph visualizing the spatial distribution of values across the reservoir is displayed.

    Data in the form of a point vector layer can be downloaded by clicking the "Download data" button.

    Statistical indicators are computed from interpolated data using the linear interpolation method.

    To add a new water reservoir, the user clicks the "Add new reservoir" button, which opens the "Select reservoir" window. On this page, the user selects the reservoir to be processed from the map. After the selection is confirmed, the system asks for confirmation once more and then starts the analysis.

    Once the analysis starts, the system notifies the user in a new application window.

    After the analysis is completed, the user receives an email notification with a link to the results.
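
The time series chart described above summarizes the per-point values of each date as an average, a median, and a confidence interval. The following is a minimal sketch of such a summary, not the application's actual code. The function name `summarize_day`, the sample values, and the choice of a normal-approximation interval for the mean are all assumptions made for illustration; the source does not state how the confidence intervals are computed.

```python
from math import sqrt
from statistics import mean, median, stdev


def summarize_day(values, z=1.96):
    """Summarize the per-point estimates of one date for one reservoir.

    Returns the mean, the median, and a normal-approximation confidence
    interval for the mean (z=1.96 corresponds to roughly 95 %).
    The interval type is an assumption, not the documented method.
    """
    if len(values) < 2:
        raise ValueError("at least two values are needed for an interval")
    m = mean(values)
    half_width = z * stdev(values) / sqrt(len(values))
    return {
        "mean": m,
        "median": median(values),
        "ci_low": m - half_width,
        "ci_high": m + half_width,
    }


# Hypothetical ChlA estimates (µg.l⁻¹) at five points on a single date.
chla = [12.0, 14.0, 13.0, 15.0, 11.0]
summary = summarize_day(chla)
print(summary)
```

Applying the function to every date of a reservoir gives the three series (mean, median, interval bounds) that a line chart of this kind would plot.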

Computation workflow

The computation unit (RWQForecast-engine) is the software that performs all data processing steps, including satellite data download, feature calculation/prediction, and forecasting of the water quality parameters. The RWQForecast engine is available at https://github.com/JakubBrom/RWQForecast-engine.

  1. Downloading the vector layer for the selected reservoir

    The user selects a reservoir in the application, which is retrieved from the OpenStreetMap database and stored in the system's database. Users can select any reservoir worldwide, with processing available for reservoirs larger than 1 hectare.

  2. Defining points within the reservoir

    The system generates spatial points inside the selected reservoir using OpenStreetMap as a reference. Each point serves as a spatial reference for processing time series data, allowing interpolation to reconstruct the spatial distribution of values for a given time interval. The approach considers the geometric complexity of reservoirs, ensuring that areas such as bays are included in the analysis. The point density is set to 100 points per km², with adjustments for small and large reservoirs. The system randomly selects and processes a maximum of 5,000 points per reservoir to optimise computational efficiency.

  3. Data retrieval

    The system automatically downloads data for the defined points from Sentinel-2 satellite imagery available in the ESA OpenEO archive. Meteorological data for each location are retrieved from OpenMeteo.

  4. Calculation of water quality parameters

    In this step, a pre-trained AI estimation model is used to compute the requested water quality parameters based on the corresponding data stored in the database.

  5. Imputation of missing values

    Satellite image data availability is irregular due to weather conditions, especially cloud cover, which varies geographically. The system handles missing data using the Support Vector Machine (SVM) method, which reconstructs a complete daily time series by utilizing the characteristics of each time series along with meteorological data as a coregressor.

  6. Forecasting

    The model estimates the probable evolution of time series using historical data and meteorological information as coregressors. The Long Short-Term Memory (LSTM) method is used for prediction, generating forecasts for all selected points within the reservoir (upcoming).

  7. Visualization of results and statistical analysis

    Spatial distribution of data for a given date is visualized using contour plots (ContourPlot).

    The web application provides visualization of results using the Plotly library.

    Time series are represented by interactive line charts displaying the mean value, median, and confidence intervals.

    Gaps in the plotted data are handled using Plotly's connectgaps option, and smoothing is applied for improved clarity.

    Statistical indicators are computed from interpolated data using the inverse distance weighting (IDW) interpolation method.

  8. User data export

    The application allows users to download time series data for a selected reservoir as a data table, as well as spatial data for individual dates in the form of a point vector layer in the WGS84 coordinate system.
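
Two parts of the workflow above can be illustrated with a short sketch: the point budget from the "Defining points within the reservoir" step and the inverse distance weighting (IDW) used for the statistical indicators. This is not the engine's code. The function names `target_point_count` and `idw`, the lower bound of one point, the weighting power of 2, and the use of planar (projected) coordinates are assumptions made for illustration; the source does not describe how the density is adjusted for small and large reservoirs beyond the 5,000-point cap.

```python
from math import hypot


def target_point_count(area_km2, density_per_km2=100, max_points=5000):
    """Number of sampling points for a reservoir of the given area.

    Uses the documented density (100 points per km²) and cap (5,000).
    The lower bound of one point is an assumption.
    """
    return max(1, min(max_points, round(area_km2 * density_per_km2)))


def idw(samples, x, y, power=2.0):
    """Inverse distance weighted estimate at location (x, y).

    samples: list of (x_i, y_i, value_i) tuples in planar coordinates.
    Each sample is weighted by 1 / distance**power.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, value in samples:
        distance = hypot(x - xi, y - yi)
        if distance == 0.0:
            return value  # the query location coincides with a sample
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no samples given")
    return weighted_sum / weight_total


# Hypothetical values: a 0.5 km² and an 80 km² reservoir.
small = target_point_count(0.5)
large = target_point_count(80.0)

# Two hypothetical sample points; the query lies halfway between them.
points = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_value = idw(points, 1.0, 0.0)
print(small, large, midpoint_value)
```

Evaluating `idw` on a regular grid of locations inside the reservoir outline would give the kind of surface that a contour plot displays.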

System limitations

The system has the following limitations:

  • Processing is available for reservoirs larger than 1 hectare.
  • Processing is limited to a maximum of 10,000 points per reservoir.
  • Processing is limited to a maximum of 16 days for forecasting.
  • Processing is limited to a maximum of 10 years of historical data. Sentinel-2 data are available from June 2015.
  • Water quality prediction accuracy is limited by the quality of the satellite data. The L2A Sen2Cor product is used for the data analysis.
  • The number of analyses is limited by the user's OpenEO data availability limits.
  • An analysis can take a very long time because of the OpenEO limits.
  • The RWQForecast system is not suitable for large reservoirs.
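
Some of the limits listed above are simple numeric bounds, so a request can be checked against them before any data are downloaded. The sketch below is only an illustration of that idea and is not part of RWQForecast. The function name `check_request`, its parameters, the use of 1 June 2015 as the first Sentinel-2 date, and the approximation of a year as 365.25 days are assumptions.

```python
from datetime import date

MIN_AREA_HA = 1.0                    # reservoirs must be larger than 1 ha
MAX_FORECAST_DAYS = 16               # maximum forecast horizon
MAX_HISTORY_YEARS = 10               # maximum span of historical data
SENTINEL2_START = date(2015, 6, 1)   # "June 2015"; the exact day is assumed


def check_request(area_ha, start, end, forecast_days):
    """Return a list of human-readable problems; empty if the request fits."""
    problems = []
    if area_ha <= MIN_AREA_HA:
        problems.append("reservoir must be larger than 1 hectare")
    if start < SENTINEL2_START:
        problems.append("Sentinel-2 data are available from June 2015")
    # Approximate a year as 365.25 days to avoid leap-day edge cases.
    if (end - start).days > MAX_HISTORY_YEARS * 365.25:
        problems.append("historical period exceeds 10 years")
    if forecast_days > MAX_FORECAST_DAYS:
        problems.append("forecast horizon exceeds 16 days")
    return problems


# Hypothetical requests.
ok = check_request(25.0, date(2020, 1, 1), date(2024, 1, 1), 14)
too_small = check_request(0.5, date(2020, 1, 1), date(2024, 1, 1), 14)
several = check_request(25.0, date(2010, 1, 1), date(2024, 1, 1), 20)
print(ok, too_small, several)
```

The per-reservoir point limit and the OpenEO quota are left out of this sketch because they depend on the generated points and on the user's account rather than on the request itself.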