AppClassifier is a multilabel NLP project that uses more than 360 labels. The project is deployed and also rendered as a website.
| HuggingFace API | Rendered website |
|---|---|
| ![HuggingFace API screenshot]() | ![Rendered website screenshot]() |
I collected the data from https://sourceforge.net/.
About 33,634 records were collected.
Because this is a multilabel project, each entry can have more than one label. I kept only the most common labels, because rare labels can distract the model from predicting accurate labels.
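
As a minimal sketch of this kind of frequency-based label filtering (not the project's actual code), the snippet below counts how often each label occurs and keeps only the frequent ones. The function name `keep_common_labels`, the `min_count` threshold, and the toy records are all hypothetical and chosen for illustration.

```python
from collections import Counter


def keep_common_labels(records, min_count):
    """Keep labels seen at least min_count times; drop records left with no labels."""
    # Count every label occurrence across all records.
    counts = Counter(label for labels in records for label in labels)
    common = {label for label, n in counts.items() if n >= min_count}

    filtered = []
    for labels in records:
        kept = [label for label in labels if label in common]
        if kept:  # a record with no remaining labels cannot be used for training
            filtered.append(kept)
    return filtered, sorted(common)


# Toy data standing in for the scraped categories (hypothetical values).
records = [
    ["Games", "Simulation"],
    ["Games", "Education"],
    ["Security", "Games"],
    ["Education", "Security"],
    ["Obscure Category"],
]

filtered, vocab = keep_common_labels(records, min_count=2)
print(vocab)     # ['Education', 'Games', 'Security']
print(filtered)  # the last record is dropped, rare labels are removed
```

The threshold is a trade-off: a higher `min_count` gives the model more examples per label but shrinks the label set.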
More than 360 labels were chosen, and the string categories were then converted into numerical form.
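
One common way to turn string categories into numbers for a multilabel task is a multi-hot vector, with one position per label. The sketch below assumes that encoding; the helper names `build_label_index` and `encode_multi_hot` and the three-label vocabulary are hypothetical, and the real project would use its own list of 360+ labels.

```python
def build_label_index(vocab):
    """Map each label string to a fixed column position."""
    return {label: i for i, label in enumerate(vocab)}


def encode_multi_hot(labels, label_index):
    """Return a 0/1 vector with a 1 at the position of every label present."""
    vector = [0] * len(label_index)
    for label in labels:
        vector[label_index[label]] = 1
    return vector


# Hypothetical three-label vocabulary for illustration only.
vocab = ["Education", "Games", "Security"]
label_index = build_label_index(vocab)

encoded = encode_multi_hot(["Games", "Security"], label_index)
print(encoded)  # [0, 1, 1]
```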
For the rest of the work I used the PyTorch framework, along with the Blurr API for training the models. I chose two models from the HuggingFace model library:
`distilroberta-base` and `bertabaporu-large`
I used two models for comparison, and both performed the same: each reached 99% accuracy.
- `distilroberta-base`: Since the whole pipeline runs in PyTorch, I had to use `dataloaders` to transform the dataset for the model. I chose a batch size of `32`. This model is much faster than the other one, so I used it for this project.
- `bertabaporu-large`: For this model I had to choose a batch size of 2. Otherwise `CUDA` terminated the training process because the `CUDA` memory limit was exceeded. This was the slowest model.
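
Part of the speed difference may come from the batch size alone, since a smaller batch means many more optimizer steps per epoch. The arithmetic below is a rough sketch that assumes all 33,634 records are used in one epoch with no train/validation split, which may not match the actual training setup. The helper `steps_per_epoch` is hypothetical.

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once (last batch may be partial)."""
    return math.ceil(num_samples / batch_size)


NUM_SAMPLES = 33634  # record count reported above; the split is an assumption

for name, batch_size in [("distilroberta-base", 32), ("bertabaporu-large", 2)]:
    print(name, batch_size, steps_per_epoch(NUM_SAMPLES, batch_size))
# distilroberta-base 32 1052
# bertabaporu-large 2 16817
```

Under these assumptions the larger model needs roughly 16 times as many steps per epoch, in addition to each step being more expensive for a larger network.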
All of the models can be found here.
I used HuggingFace to deploy this project. It is very easy to use and free.
You can try the deployed project here: https://huggingface.co/spaces/Rimi98/AppClassifier
I used Flask to serve this project as a public website, with a very basic GUI.
I also used Render for integration.
Click this link
More than 50,000 data points are available on the website, of which I have collected 33,634 so far. My future work will be to collect more data and develop this project further. Anyone can join me. Feel free to open a pull request.

