This project is a complete NLP deep learning pipeline for emotion classification, combined with a Streamlit-based interactive frontend.
It demonstrates:
- Custom Attention mechanism
- Handling of class imbalance
- Modular ML pipeline (embedding → training → inference → deployment)
This system classifies user-entered text into one of the following emotions:
- 😢 sadness
- 😄 joy
- ❤️ love
- 😡 anger
- 😨 fear
- 😲 surprise
The backend model is a BiLSTM + Attention neural network trained on a large Twitter-based emotion dataset.
The frontend is built with Streamlit for real-time inference and visualization.
Source:
- Kaggle Emotions dataset: https://www.kaggle.com/datasets/nelgiriyewithana/emotions
Classes:
- 0 → sadness
- 1 → joy
- 2 → love
- 3 → anger
- 4 → fear
- 5 → surprise
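For reference in the code sketches below, this mapping can live in a small dict. The name `ID2EMOTION` is illustrative, not taken from the repo:

```python
# Maps the dataset's integer labels to emotion names (order per the list above).
ID2EMOTION = {0: "sadness", 1: "joy", 2: "love", 3: "anger", 4: "fear", 5: "surprise"}
```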
Key Properties:
- Large-scale text dataset
- Moderately class-imbalanced (minority classes such as love and surprise are underrepresented, hence the class weighting below)
- No emojis (pure textual patterns)
- Cleaned and deduplicated before training
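A minimal data-prep sketch under stated assumptions: the Kaggle CSV is read as a DataFrame with `text` and `label` columns, and the filename, vocabulary cap, and sequence length below are illustrative guesses, not values from the repo:

```python
import pandas as pd
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Load, normalize, and deduplicate the raw text (columns assumed: text, label).
df = pd.read_csv("text.csv")
df["text"] = df["text"].str.lower().str.strip()
df = df.dropna(subset=["text", "label"]).drop_duplicates(subset="text")

# Fit the tokenizer that later gets pickled to artifacts/tokenizer.pkl.
tokenizer = Tokenizer(num_words=20000, oov_token="<unk>")  # vocab cap is a guess
tokenizer.fit_on_texts(df["text"])
X = pad_sequences(tokenizer.texts_to_sequences(df["text"]), maxlen=60, padding="post")
y = df["label"].to_numpy()
```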
Architecture Flow:
Embedding → BiLSTM → Attention → Dense → Softmax
Details:
- Embedding Dimension: 200
- BiLSTM Hidden Units: 128
- Attention: Custom time-step weighted aggregation
- Dense Layer: 64 units (ReLU)
- Output: 6-class Softmax
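A minimal Keras sketch of this stack. The time-step weighted aggregation is written here in a common Bahdanau-style form; the layer name `AttentionPooling`, the builder `build_model`, and the exact scoring function are assumptions (the repo's `attention.ipynb` holds the real implementation):

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

class AttentionPooling(layers.Layer):
    """Scores each BiLSTM time step, softmax-normalizes the scores,
    and returns the attention-weighted sum of the step outputs."""

    def build(self, input_shape):  # input_shape: (batch, time, features)
        self.w = self.add_weight(name="att_weight", shape=(input_shape[-1], 1),
                                 initializer="glorot_uniform", trainable=True)
        self.b = self.add_weight(name="att_bias", shape=(input_shape[1], 1),
                                 initializer="zeros", trainable=True)

    def call(self, x):
        e = tf.tanh(tf.matmul(x, self.w) + self.b)  # (batch, time, 1) step scores
        a = tf.nn.softmax(e, axis=1)                # weights over time steps
        return tf.reduce_sum(x * a, axis=1)         # (batch, features)

def build_model(vocab_size: int, max_len: int) -> keras.Model:
    inputs = keras.Input(shape=(max_len,), dtype="int32")
    x = layers.Embedding(vocab_size, 200)(inputs)                         # dim 200
    x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(x)  # 128 units
    x = AttentionPooling()(x)                                             # aggregation
    x = layers.Dense(64, activation="relu")(x)
    return keras.Model(inputs, layers.Dense(6, activation="softmax")(x))  # 6 classes
```

Because the softmax runs over time steps, the weights in `a` can be read back per token, which is what makes the attention interpretable.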
Why Attention?
- Helps the model focus on the most emotionally relevant words
- Improves interpretability and robustness over plain LSTM
Loss Function:
- Sparse Categorical Crossentropy
Optimizer:
- Adam (LR = 1e-3)
Batch Size:
- 128
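Continuing the sketch above, compilation with the stated loss and optimizer looks like this (vocabulary size and sequence length remain illustrative; the batch size is passed to `fit`, shown with the callbacks below):

```python
model = build_model(vocab_size=20000, max_len=60)
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss="sparse_categorical_crossentropy",  # integer labels, no one-hot needed
    metrics=["accuracy"],
)
```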
Class Imbalance Handling:
- Dynamic `class_weight` computation using scikit-learn
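One standard way to derive the weights with scikit-learn, reusing the `y` labels from the data-prep sketch (variable names are illustrative):

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

classes = np.unique(y)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y)
class_weights = dict(zip(classes, weights))  # rarer classes get larger weights
```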
Callbacks:
- EarlyStopping
- ReduceLROnPlateau
- ModelCheckpoint (full `.keras` model)
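A sketch of the callback setup and training call; the monitor targets, patience values, and epoch count are assumptions, and `X_train`/`X_val` are presumed to come from a split of the padded sequences above:

```python
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint

callbacks = [
    EarlyStopping(monitor="val_loss", patience=3, restore_best_weights=True),
    ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=2),
    # Saves the full model (architecture + weights) in the native .keras format.
    ModelCheckpoint("artifacts/bilstm_attention_emotions.keras",
                    monitor="val_loss", save_best_only=True),
]

history = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    batch_size=128,
    epochs=20,                    # early stopping usually halts after 2-4 epochs
    class_weight=class_weights,   # from the scikit-learn sketch above
    callbacks=callbacks,
)
```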
Final Model Storage:
- Stored as a full model: artifacts/bilstm_attention_emotions.keras
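Loading the saved model for inference requires registering the custom layer; names continue from the sketches above (`AttentionPooling`, `ID2EMOTION`, and `max_len=60` are illustrative):

```python
import pickle
from tensorflow import keras
from tensorflow.keras.preprocessing.sequence import pad_sequences

model = keras.models.load_model(
    "artifacts/bilstm_attention_emotions.keras",
    custom_objects={"AttentionPooling": AttentionPooling},  # custom attention layer
)
with open("artifacts/tokenizer.pkl", "rb") as f:
    tokenizer = pickle.load(f)

def predict_emotion(text: str, max_len: int = 60) -> str:
    padded = pad_sequences(tokenizer.texts_to_sequences([text]),
                           maxlen=max_len, padding="post")
    probs = model.predict(padded, verbose=0)[0]
    return ID2EMOTION[int(probs.argmax())]

print(predict_emotion("i can't stop smiling today"))  # likely "joy"
```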
Results:
- Test Accuracy: ~96–97%
- High recall for minority classes after class-weight correction
- Stable training with early convergence (2–4 epochs)
The model performs very strongly on:
- sadness
- joy
- anger
- fear
- surprise
- The "love" and "joy" classes show semantic overlap.
- Highly positive affectionate sentences are sometimes classified as joy instead of love.
- This is a dataset + label boundary issue, not an engineering bug.
This limitation is common even in transformer-based emotion models.
Tech Stack:
- Python
- TensorFlow / Keras
- NumPy
- Pandas
- Scikit-learn
- Streamlit
Project Structure:
```
text-emotion-predictor/
├── streamlit_app.py
├── notebooks/
│   ├── embeddings.ipynb
│   ├── attention.ipynb
│   ├── training.ipynb
│   └── prediction.ipynb
├── artifacts/
│   ├── tokenizer.pkl
│   └── bilstm_attention_emotions.keras
├── requirements.txt
├── README.md
├── LICENSE
└── .gitignore
```
Setup:
- Install dependencies:
```bash
pip install -r requirements.txt
```
- Ensure artifacts exist:
```
artifacts/
├── bilstm_attention_emotions.keras
└── tokenizer.pkl
```
- Start the Streamlit app:
```bash
streamlit run streamlit_app.py
```
Developed by: Pradyumn Bisht
Focus Area: Machine Learning, Deep Learning, NLP, AI Deployment


