Multilabel TV Series Classifier
- Tech Stack: Python, NLP, Multilabel Classification, Web Scraping, Rendering, HuggingFace.
- Github URL: Project Link
This project is a multilabel text classifier that predicts the genres of a TV series from a short description of its plot, covering up to 28 different genres. The data was scraped from the IMDb website using Selenium, yielding almost 25,000 descriptions across the 28 genres. A pre-trained DistilRoBERTa-base model was fine-tuned using HuggingFace Transformers alongside fastai and Blurr. The fine-tuned model reached an accuracy of 99.885% and a micro-averaged F1 score of 99.98%. For deployment, the model was converted to the ONNX format and then further compressed using ONNX quantization.
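To make the multilabel setup concrete, here is a minimal sketch of how per-genre scores could be turned into a list of predicted genres and how a micro-averaged F1 score is computed. It is illustrative only and is not taken from the project code: the four-genre list, the `decode_genres` and `micro_f1` helper names, and the 0.5 threshold are all assumptions.

```python
import math

# Hypothetical subset of labels; the actual model covers 28 genres.
GENRES = ["Action", "Comedy", "Drama", "Crime"]


def sigmoid(x: float) -> float:
    """Map a raw logit to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_genres(logits, labels=GENRES, threshold=0.5):
    """Keep every label whose sigmoid probability meets the threshold.

    Unlike softmax-based single-label classification, each genre is
    scored independently, so zero, one, or several genres may be returned.
    The 0.5 threshold is an assumed default, not the project's setting.
    """
    return [label for label, z in zip(labels, logits) if sigmoid(z) >= threshold]


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP, and FN over all samples and labels."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            tp += int(t == 1 and p == 1)
            fp += int(t == 0 and p == 1)
            fn += int(t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Made-up logits for one description, in the order of GENRES.
predicted = decode_genres([2.1, -1.3, 0.4, -3.0])

# Made-up binary indicator rows (one row per description).
score = micro_f1(
    y_true=[[1, 0, 1, 0], [0, 1, 0, 0]],
    y_pred=[[1, 0, 0, 0], [0, 1, 1, 0]],
)
```

Because micro-averaging pools counts across all labels, frequent genres influence the score more than rare ones.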
Deployed the compressed model on HuggingFace Spaces using a Gradio App.
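The compression step could look roughly like the sketch below, which uses dynamic quantization from ONNX Runtime. This is an assumption: the description does not say which quantization mode was used, and the `quantize_to_int8` helper and the file names in the usage comment are hypothetical.

```python
from pathlib import Path


def quantize_to_int8(fp32_path: str, int8_path: str) -> Path:
    """Write a dynamically quantized copy of an ONNX model (INT8 weights).

    Assumes the model has already been exported to ONNX at `fp32_path`.
    """
    # Imported lazily so this module can be loaded without onnxruntime installed.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    quantize_dynamic(
        model_input=fp32_path,
        model_output=int8_path,
        weight_type=QuantType.QInt8,  # store weights as signed 8-bit integers
    )
    return Path(int8_path)


# Example usage (hypothetical file names):
#   quantized = quantize_to_int8("model.onnx", "model-quantized.onnx")
#   print(quantized.stat().st_size)  # compare with the original file size
```

Dynamic quantization needs no calibration data, which makes it a common choice for transformer models served on CPU, such as in a hosted demo app.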