### **Rüppel, Uwe ; Göbel, Peter**

Möller, Andreas ; Page, Bernd ; Schreiber, Martin (eds.):

*Pretreatment of Environmental Data for Forecasting Purposes.*

[Online edition: http://enviroinfo.eu/sites/default/files/pdfs/vol116/0569.pd...]

In: 22nd International Conference on Informatics for Environmental Protection (EnviroInfo), 10-12 September 2008, Lüneburg, Germany. EnviroInfo 2008: Environmental Informatics and Industrial Ecology. Shaker Verlag, Aachen [Conference publication], (2008)

## Abstract

To assess present actions on the environment, it is necessary to estimate their impact in the future. Niels Bohr recognized that “prediction is very difficult, especially about the future”. Fortunately, “the future is made of the same stuff as the present” (Simone Weil), which makes forecasting fundamentally possible. The present is described with data, and to draw the right conclusions about the future, the data need to be significant, correct and complete.

This work is part of a project for the active control of groundwater levels. In this project, expected groundwater levels are forecast for varying infiltration masses using Artificial Neural Networks (ANN), so that an adequate infiltration quantity can be identified to reach the desired groundwater level. Before the environmental data are suitable for the actual forecasting purpose, they need to undergo a wide range of pretreatments, which this paper describes.

In a first step, substitution methods are presented to impute missing data. These methods fall into two categories. The first comprises correlation and kriging methods, which use related measurement data sets, i.e. data sets from a nearby measuring station for, e.g., groundwater level, temperature or rainfall. The second uses only the data set under consideration and consists of purely statistical methods: spline interpolation, time-series forecasts and multiple imputation.

In a second step, the completed data sets need to be freed from gross errors. For this purpose, different test criteria such as bound checking, comparisons of spacings and different statistical methods are implemented. Furthermore, the original dynamic and time-variant data sets are compared with computed data sets generated with time series analysis models. Outliers are indicated if computed values strongly diverge from original values. In doubtful situations, the current curve can be compared with the curve of a correlated data set, if available.

In a third step, to reduce complexity, the number of relevant data that serve as input parameters for the ANN needs to be reduced without losing the information necessary to make predictions. This is important because, in the present case, the number of necessary input parameters is too high in comparison to the number of training sets available to train the ANN. Different statistical approaches are discussed, such as moving averages, time-weighted transformations and a method that combines sets of moving averages to reduce the number of input parameters of the ANN while keeping the information content consistent.
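
The abstract gives no implementation details, so the following is only a minimal sketch of the three pretreatment steps under stated assumptions. All function names, thresholds and sample values are hypothetical. Linear interpolation stands in for the spline interpolation named in the abstract, and a simple bound check plus step check stands in for the paper's test criteria, which are not specified here.

```python
from statistics import mean


def impute_linear(values):
    """Fill None gaps by linear interpolation between the nearest observed
    neighbours; gaps at either end take the nearest observed value.
    (Assumption: a simple stand-in for spline interpolation.)"""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def flag_gross_errors(values, lower, upper, max_step):
    """Return indices that fail a bound check or jump by more than max_step
    relative to the last accepted value. A genuine level shift would flag
    every later point, so flagged values still need manual review."""
    flagged = []
    last_good = None
    for i, v in enumerate(values):
        out_of_bounds = not (lower <= v <= upper)
        jump = last_good is not None and abs(v - last_good) > max_step
        if out_of_bounds or jump:
            flagged.append(i)
        else:
            last_good = v
    return flagged


def moving_average_features(values, windows):
    """Condense a series into one trailing mean per window length, so a few
    averages can replace many raw values as model inputs."""
    if min(windows) < 1 or max(windows) > len(values):
        raise ValueError("window lengths must be between 1 and len(values)")
    return [mean(values[-w:]) for w in windows]


# Hypothetical groundwater levels in metres, with gaps (None) and one spike.
levels = [10.0, 10.1, None, 10.3, 25.0, 10.4, 10.5, None]

filled = impute_linear(levels)                        # step 1: impute gaps
flagged = flag_gross_errors(filled, 5.0, 20.0, 1.0)   # step 2: find gross errors
cleaned = impute_linear(
    [None if i in flagged else v for i, v in enumerate(filled)]
)
features = moving_average_features(cleaned, (2, 4, 8))  # step 3: reduce inputs
print(flagged, features)
```

Here eight raw values are condensed into three trailing means. How many windows to combine, and of what lengths, is a design choice that the paper discusses and this sketch does not attempt to reproduce.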

Entry type: | Conference publication |
---|---|
Published: | 2008 |
Editors: | Möller, Andreas ; Page, Bernd ; Schreiber, Martin |
Author(s): | Rüppel, Uwe ; Göbel, Peter |
Title: | Pretreatment of Environmental Data for Forecasting Purposes |
Language: | English |
Book title: | EnviroInfo 2008: Environmental Informatics and Industrial Ecology |
Place: | Aachen |
Publisher: | Shaker Verlag |
Department(s)/field(s): | 13 Fachbereich Bau- und Umweltingenieurwissenschaften > Institut für Numerische Methoden und Informatik im Bauwesen; 13 Fachbereich Bau- und Umweltingenieurwissenschaften |
Event title: | 22nd International Conference on Informatics for Environmental Protection (EnviroInfo) |
Event location: | Lüneburg, Germany |
Event date: | 10-12 September 2008 |
Date deposited: | 21 Jan 2015 13:15 |
Official URL: | http://enviroinfo.eu/sites/default/files/pdfs/vol116/0569.pd... |
Additional information: | ISBN: 978-3-8322-7313-2 |
