Identifying nonlinear dynamical systems with multiple time scales and long-range dependencies

Bibliographic Details
Main Authors: Schmidt, Dominik (Author), Koppe, Georgia (Author), Monfared, Zahra (Author), Beutelspacher, Max (Author), Durstewitz, Daniel (Author)
Format: Chapter/Article; Conference Paper
Language: English
Published: 12 Mar 2021
Edition: Version v3
In: arXiv
Year: 2021, Pages: 1-29
DOI: 10.48550/arXiv.1910.03471
Online Access:Verlag, kostenfrei, Volltext: https://doi.org/10.48550/arXiv.1910.03471
Verlag, kostenfrei, Volltext: http://arxiv.org/abs/1910.03471
Author Notes: Dominik Schmidt, Georgia Koppe, Zahra Monfared, Max Beutelspacher, Daniel Durstewitz
Description
Summary: A main theoretical interest in biology and physics is to identify the nonlinear dynamical system (DS) that generated observed time series. Recurrent Neural Networks (RNNs) are, in principle, powerful enough to approximate any underlying DS, but in their vanilla form suffer from the exploding vs. vanishing gradients problem. Previous attempts to alleviate this problem either resulted in more complicated, mathematically less tractable RNN architectures, or strongly limited the dynamical expressiveness of the RNN. Here we address this issue by suggesting a simple regularization scheme for vanilla RNNs with ReLU activation which enables them to solve long-range dependency problems and express slow time scales, while retaining a simple mathematical structure which makes their DS properties partly analytically accessible. We prove two theorems that establish a tight connection between the regularized RNN dynamics and its gradients, illustrate on DS benchmarks that our regularization approach strongly eases the reconstruction of DS which harbor widely differing time scales, and show that our method is also on par with other long-range architectures like LSTMs on several tasks.
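The regularization scheme is only described at a high level in the abstract; below is a minimal PyTorch sketch of one plausible reading, not the authors' reference implementation. It assumes a vanilla ReLU RNN in which the first n_slow hidden units are L2-penalized toward identity recurrent weights and zero biases, biasing those units toward marginally stable, slow dynamics. The class name RegularizedReLURNN and the parameters n_slow and tau are illustrative assumptions.

import torch
import torch.nn as nn

class RegularizedReLURNN(nn.Module):
    """Vanilla ReLU RNN with an L2 penalty pulling a subset of units
    toward identity recurrent weights and zero bias (hypothetical sketch)."""

    def __init__(self, input_dim, hidden_dim, n_slow, tau=1.0):
        super().__init__()
        # Initialize recurrent weights near the identity.
        self.W = nn.Parameter(torch.eye(hidden_dim) + 0.01 * torch.randn(hidden_dim, hidden_dim))
        self.U = nn.Parameter(0.01 * torch.randn(hidden_dim, input_dim))
        self.b = nn.Parameter(torch.zeros(hidden_dim))
        self.n_slow = n_slow  # number of units regularized toward slow dynamics (assumption)
        self.tau = tau        # penalty strength (illustrative default)

    def forward(self, x, h):
        # x: (T, input_dim) input sequence; h: (hidden_dim,) initial state.
        states = []
        for x_t in x:
            h = torch.relu(self.W @ h + self.U @ x_t + self.b)
            states.append(h)
        return torch.stack(states), h

    def regularizer(self):
        # Penalize deviation of the first n_slow rows of W from the identity
        # and of the corresponding biases from zero.
        k = self.n_slow
        eye = torch.eye(k, device=self.W.device)
        reg_W = ((self.W[:k, :k] - eye) ** 2).sum() + (self.W[:k, k:] ** 2).sum()
        reg_b = (self.b[:k] ** 2).sum()
        return self.tau * (reg_W + reg_b)

Training would then minimize task_loss + model.regularizer(), i.e., the penalty is simply added to whatever prediction loss is used.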
Item Description: Published online on 8 October 2019; Version 2 on 19 June 2020; Version 3 on 12 March 2021
Viewed on 10 January 2024
Physical Description: Online Resource