AppClassifier

Motive

AppClassifier is a multi-label NLP project. It uses 360+ labels and has been deployed both as a HuggingFace Space and as a website hosted on Render.

HuggingFace API	Rendered website

Data Collection

I collected the data from https://sourceforge.net/
Around 33634 entries were collected.

Data Preprocessing

Because this is a multi-label project, each entry can have more than one label. I kept only the most common labels, because the least common labels can distract the model from predicting accurate labels.

I selected 360+ labels and then converted the string categories into numerical form.
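
The label selection and encoding described above can be sketched roughly as follows. This is only an illustration and not the project's actual preprocessing code: the function names, the `top_k` parameter, and the sample category names are hypothetical, and the sketch assumes each entry arrives as a list of category strings.

```python
from collections import Counter


def build_label_index(label_lists, top_k):
    """Keep the top_k most frequent labels and map each one to a column index."""
    counts = Counter(label for labels in label_lists for label in labels)
    most_common = [label for label, _ in counts.most_common(top_k)]
    # Sort so the column order is stable from run to run.
    return {label: i for i, label in enumerate(sorted(most_common))}


def multi_hot(labels, label_index):
    """Turn a list of string labels into a 0/1 vector; dropped labels are ignored."""
    vec = [0] * len(label_index)
    for label in labels:
        if label in label_index:
            vec[label_index[label]] = 1
    return vec


# Hypothetical entries; the real data has 360+ labels.
samples = [
    ["Games", "Education"],
    ["Games", "Audio"],
    ["Education", "Games"],
    ["Audio", "Security"],
]

label_index = build_label_index(samples, top_k=3)
encoded = [multi_hot(labels, label_index) for labels in samples]
```

Here the rare label "Security" is dropped, so the last entry keeps only its "Audio" bit. Each vector has one position per retained label, which is the usual numerical target format for multi-label training.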

Training

For the rest of the work I used the PyTorch framework, along with the Blurr API for training the models. I chose two models from the HuggingFace model library:

distilroberta-base and
bertabaporu-large

Training Results

I trained two models for comparison. Both performed the same, reaching 99% accuracy.
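
For reference, accuracy in a multi-label setting is often computed per label: every predicted probability is thresholded and compared with its target bit. The sketch below shows that calculation in plain Python. Whether the 99% figure was measured this way is an assumption; the function name, the threshold of 0.5, and the toy numbers are hypothetical.

```python
def per_label_accuracy(probs, targets, thresh=0.5):
    """Fraction of individual label decisions that match the targets."""
    correct = 0
    total = 0
    for prob_row, target_row in zip(probs, targets):
        for p, t in zip(prob_row, target_row):
            predicted = 1 if p >= thresh else 0
            correct += int(predicted == t)
            total += 1
    return correct / total


# Two hypothetical entries with four labels each.
probs = [[0.9, 0.2, 0.1, 0.7], [0.1, 0.8, 0.3, 0.2]]
targets = [[1, 0, 0, 0], [0, 1, 1, 0]]

score = per_label_accuracy(probs, targets)  # 6 of 8 decisions match
```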

distilroberta-base: Because the whole pipeline runs in PyTorch, I had to use dataloaders to transform the dataset for the model. I chose a batch size of 32. This model is much faster than the other one, so I used it for this project.
bertabaporu-large: For this model I had to use a batch size of 2. Otherwise CUDA terminated the training process for exceeding the CUDA memory limit. This model is the slower of the two.
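
To give a feel for what the two batch sizes mean in practice, the following sketch counts the batches per epoch for each model. It assumes, purely for illustration, that all 33634 collected entries are used for training and that the last partial batch is kept; the actual train/validation split is not stated here, and the function name is hypothetical.

```python
import math


def batches_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once (last batch may be partial)."""
    return math.ceil(num_samples / batch_size)


NUM_SAMPLES = 33634  # entries collected; assumed to all be used for training

small_model_batches = batches_per_epoch(NUM_SAMPLES, 32)  # distilroberta-base
large_model_batches = batches_per_epoch(NUM_SAMPLES, 2)   # bertabaporu-large
```

Under these assumptions the larger model needs 16817 steps per epoch against 1052 for the smaller one, on top of each step being more expensive.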

Models

All of the models can be found here.

Deployment

I deployed this project on HuggingFace Spaces, which is free and very easy to use. You can try the deployed project here: https://huggingface.co/spaces/Rimi98/AppClassifier

Integration/ Render

I used Flask to serve this project as a public website, with a very basic GUI. I also used Render for the integration.

Click this link

Future Work

More than 50000 data points are available on SourceForge, of which I have collected 33634. My future work is to collect more data and develop this project further. Anyone can join me; feel free to open a pull request.

AklimaRimi/AppClassifier
