Nal Kalchbrenner - Grid Long Short-Term Memory (2016)
Created: June 16, 2017 / Updated: November 2, 2024 / Status: in progress / 1 min read (~172 words)
- N-dimensional Grid LSTM (N-LSTM for short) networks can naturally be applied as feed-forward networks as well as recurrent ones
- One-dimensional Grid LSTM corresponds to a feed-forward network that uses LSTM cells in place of transfer functions such as tanh and ReLU
- These networks are related to Highway Networks where a gated transfer function is used to successfully train feed-forward networks with up to 900 layers of depth
- Grid LSTM with two dimensions is analogous to the Stacked LSTM, but it adds cells along the depth dimension too
- Grid LSTM with three or more dimensions is analogous to Multidimensional LSTM, but differs from it in two ways: it has cells along the depth dimension, and it uses the proposed mechanism for modulating the N-way interaction, which is not prone to the instability present in Multidimensional LSTM
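
To make the one-dimensional case concrete, here is a minimal sketch of a feed-forward network whose layers are LSTM cells instead of tanh or ReLU units. It is an illustration under assumptions, not the paper's implementation: it uses the common LSTM update without biases, starts the memory vector at zero, feeds the input directly as the first hidden vector, and all names (`lstm_layer`, `one_d_grid_lstm`, `random_layer`) are hypothetical. The paper's exact equations may differ in detail.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Plain matrix-vector product on nested lists.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def lstm_layer(h, m, params):
    """One step along the depth dimension.

    The LSTM cell takes the place of the usual transfer function:
    the hidden vector h and the memory vector m both flow to the next layer.
    """
    u = [sigmoid(a) for a in matvec(params["u"], h)]    # input gate
    f = [sigmoid(a) for a in matvec(params["f"], h)]    # forget gate
    o = [sigmoid(a) for a in matvec(params["o"], h)]    # output gate
    c = [math.tanh(a) for a in matvec(params["c"], h)]  # candidate content
    m_new = [fi * mi + ui * ci for fi, mi, ui, ci in zip(f, m, u, c)]
    h_new = [oi * math.tanh(mi) for oi, mi in zip(o, m_new)]
    return h_new, m_new


def one_d_grid_lstm(x, layers):
    # Assumption: input used as the initial hidden vector, memory starts at zero.
    h, m = list(x), [0.0] * len(x)
    for params in layers:
        h, m = lstm_layer(h, m, params)
    return h, m


def random_layer(d, rng):
    # One d x d weight matrix per gate (no biases, for brevity).
    return {k: [[rng.gauss(0.0, 0.1) for _ in range(d)] for _ in range(d)]
            for k in "ufoc"}


rng = random.Random(0)
d, depth = 4, 6
layers = [random_layer(d, rng) for _ in range(depth)]
h, m = one_d_grid_lstm([0.5, -1.0, 0.25, 2.0], layers)
print(len(h), len(m))
```

The memory vector is updated additively through the forget gate rather than being squashed at every layer, which is the same idea that links these networks to the gated transfer functions of Highway Networks.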
- Kalchbrenner, Nal, Ivo Danihelka, and Alex Graves. "Grid long short-term memory." arXiv preprint arXiv:1507.01526 (2015).