This example builds on ../MLP/4_TextClassification by replacing the MLP with three LSTMs. The LSTMs process all the words on the input sequence recursively and output some features for each word. These features are weighted over time by a learnable linear layer and an output linear layer maps this average to the classification space.
- Modeling text via LSTM.
- Using
torch.nn.utils.rnn.PackedSequenceto omit the zero-padded tokens and still have vectorized mini-batching.
The exercise build on the example by asking to use Bidirectional LSTMs to incorporate future context into the processing of each word.
- Using BiLSTMs including different of combining their outputs.