In many real-world problems, data is presented in sequential form: words in a sentence, stock prices over time, musical notes in a melody, or physiological signals measured at regular intervals. To address such problems, it is not enough to analyze data in isolation; models must be capable of capturing the dependencies between elements in a sequence.
Recurrent Neural Networks (RNNs) were specifically designed with this goal in mind: enabling a model to learn temporal dependencies in data by incorporating a form of internal memory.
A recurrent neural network is a network architecture that, unlike a traditional feedforward neural network, incorporates cycles in its structure. This allows it to maintain an internal state that is updated with each new input and influences the model's future decisions.
The key idea behind an RNN is that the model's output at a given time depends not only on the current input but also on the accumulated internal state up to that point.
In a recurrent neural network, the computation performed at each time step can be described, in simplified form, as follows:
x_t: the input at time t.
h_t: the hidden state at time t, updated as a function of x_t and the previous state h_{t-1}.
y_t: the model output at time t.

The state update and output generation can be formalized with these equations:

h_t = tanh(W_hh * h_{t-1} + W_xh * x_t + b_h)
y_t = W_hy * h_t + b_y
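To make the recurrence concrete, here is a minimal NumPy sketch of this forward pass over a sequence. The dimensions, random initialization, and the rnn_forward helper are illustrative assumptions for this sketch, not part of any library API.

import numpy as np

# Illustrative dimensions (assumptions for this sketch)
input_dim, hidden_dim, output_dim = 3, 4, 2
rng = np.random.default_rng(0)

# Weight matrices and biases, matching the equations above
W_xh = rng.normal(size=(hidden_dim, input_dim))   # input -> hidden
W_hh = rng.normal(size=(hidden_dim, hidden_dim))  # hidden -> hidden
W_hy = rng.normal(size=(output_dim, hidden_dim))  # hidden -> output
b_h = np.zeros(hidden_dim)
b_y = np.zeros(output_dim)

def rnn_forward(xs):
    """Apply h_t = tanh(W_hh h_{t-1} + W_xh x_t + b_h) across a sequence."""
    h = np.zeros(hidden_dim)  # initial hidden state
    outputs = []
    for x_t in xs:
        h = np.tanh(W_hh @ h + W_xh @ x_t + b_h)  # state update
        outputs.append(W_hy @ h + b_y)            # output y_t
    return np.array(outputs), h

# Example: run the recurrence over a sequence of 5 random input vectors
ys, h_final = rnn_forward(rng.normal(size=(5, input_dim)))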
The state h_t acts as a dynamic memory, which adjusts as the model progresses through the sequence. RNNs are particularly useful in tasks where context is essential. Some examples include:

Natural language processing, such as predicting the next word in a sentence.
Time series forecasting, such as stock prices.
Music generation from sequences of notes.
Analysis of physiological signals measured at regular intervals.
Although RNNs can theoretically learn long-term dependencies, in practice, they suffer from a phenomenon known as vanishing or exploding gradients during training, which hinders their ability to remember distant information in the sequence.
This limits their effectiveness in tasks where capturing relationships between events far apart in the sequence is crucial. To overcome the limitations of traditional RNNs, improved architectures were developed:

LSTM (Long Short-Term Memory): adds gates that control what information is stored, forgotten, and exposed, allowing the network to preserve information over long spans.
GRU (Gated Recurrent Unit): a simplified variant of the LSTM with fewer gates and parameters, often achieving comparable performance at lower computational cost.

Both variants are now the de facto standard when working with sequences.
The following example demonstrates how to build a simple RNN using the Keras library in Python:
from keras.models import Sequential
from keras.layers import SimpleRNN, Dense

timesteps, features = 30, 8  # example sequence length and number of features per step

model = Sequential()
model.add(SimpleRNN(64, input_shape=(timesteps, features)))  # 64 recurrent units
model.add(Dense(1, activation='sigmoid'))                    # binary output
model.compile(optimizer='adam', loss='binary_crossentropy')
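To check that the model runs end to end, it can be fitted on synthetic data. The array shapes and training settings below are arbitrary assumptions, meant only to exercise the pipeline:

import numpy as np

X = np.random.rand(100, timesteps, features)   # 100 synthetic sequences
y = np.random.randint(0, 2, size=(100, 1))     # synthetic binary labels

model.fit(X, y, epochs=3, batch_size=16)       # short illustrative training run
probs = model.predict(X[:5])                   # sigmoid outputs in [0, 1]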
This model can be adapted for tasks such as sentiment classification or binary event prediction from sequences.
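Since LSTM and GRU layers share the same interface as SimpleRNN in Keras, upgrading the model to one of the improved architectures described above is essentially a one-line change. A minimal sketch, reusing the shapes assumed earlier:

from keras.models import Sequential
from keras.layers import LSTM, Dense  # GRU can be imported and used the same way

lstm_model = Sequential()
lstm_model.add(LSTM(64, input_shape=(timesteps, features)))  # replaces SimpleRNN
lstm_model.add(Dense(1, activation='sigmoid'))
lstm_model.compile(optimizer='adam', loss='binary_crossentropy')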