Trash Image Classification using Pre-trained Vision Transformer (ViT)
This project implements an image classification system using a pre-trained Vision Transformer (ViT) from Hugging Face, fine-tuned to classify waste images into six categories:
Cardboard, Glass, Metal, Paper, Plastic, and Trash
Dataset
We used the garythung/trashnet dataset with the following distribution:
- Cardboard: 806 images
- Glass: 1002 images
- Metal: 820 images
- Paper: 1188 images
- Plastic: 964 images
- Trash: 274 images
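Because the classes are imbalanced (Trash has 274 images, Paper has 1188), a WeightedRandomSampler was used during training so that under-represented classes are drawn about as often as the larger ones.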
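The weighting idea can be sketched as follows. This is a minimal illustration, not the project's actual training code: it assumes inverse-frequency weights (one common choice), and the helper names `per_sample_weights` and `make_sampler` and the synthetic `labels` list are hypothetical.

```python
from collections import Counter

# Class counts copied from the distribution listed above.
CLASS_COUNTS = {
    "cardboard": 806,
    "glass": 1002,
    "metal": 820,
    "paper": 1188,
    "plastic": 964,
    "trash": 274,
}


def per_sample_weights(labels):
    """Return one weight per sample, inversely proportional to its class frequency."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


def make_sampler(labels):
    """Build a WeightedRandomSampler that draws each class at a similar rate."""
    # Imported here so the weight arithmetic can be checked without PyTorch installed.
    from torch.utils.data import WeightedRandomSampler

    weights = per_sample_weights(labels)
    # replacement=True lets minority-class images be drawn more than once per epoch.
    return WeightedRandomSampler(weights, num_samples=len(weights), replacement=True)


# Hypothetical label list built from the counts above (one entry per image).
labels = [name for name, n in CLASS_COUNTS.items() for _ in range(n)]
weights = per_sample_weights(labels)

# Summed over a class, the weights come to 1.0, so every class has equal total mass.
trash_mass = sum(w for w, l in zip(weights, labels) if l == "trash")
paper_mass = sum(w for w, l in zip(weights, labels) if l == "paper")
```

The sampler returned by `make_sampler` would be passed to a `DataLoader` through its `sampler` argument. `shuffle=True` must not be set at the same time, since the two options are mutually exclusive.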
Model Overview
We fine-tuned the following pre-trained ViT checkpoint:
google/vit-base-patch16-224-in21k
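For illustration, loading this checkpoint with a six-class head might look like the sketch below. The label order (alphabetical) and the helper name `load_base_model` are assumptions, not taken from the project's code.

```python
# Assumed label order; the project's actual mapping may differ.
LABELS = ["cardboard", "glass", "metal", "paper", "plastic", "trash"]
id2label = dict(enumerate(LABELS))
label2id = {name: idx for idx, name in id2label.items()}


def load_base_model(checkpoint="google/vit-base-patch16-224-in21k"):
    """Load the pre-trained ViT backbone with a new six-class classification head."""
    # Imported lazily because loading transformers and downloading weights is slow.
    from transformers import ViTForImageClassification

    return ViTForImageClassification.from_pretrained(
        checkpoint,
        num_labels=len(LABELS),
        id2label=id2label,
        label2id=label2id,
    )
```

The in21k checkpoint is pre-trained without a task-specific head for these classes, so the classification layer is newly initialized and has to be trained during fine-tuning.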
The resulting model is published at:
tribber93/my-trash-classification
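A minimal sketch of running the published model on a single image, assuming it loads through the standard Transformers `image-classification` pipeline. The helper names `classify_image` and `top_label` are hypothetical, and the example predictions are made up to show the output shape; they are not real model output.

```python
def classify_image(image_path, model_id="tribber93/my-trash-classification"):
    """Run the published checkpoint on one image and return the raw predictions."""
    # Imported lazily because loading transformers and downloading weights is slow.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=model_id)
    # The pipeline returns a list of {"label": str, "score": float} dicts.
    return classifier(image_path)


def top_label(predictions):
    """Pick the label with the highest score from the pipeline output."""
    best = max(predictions, key=lambda p: p["score"])
    return best["label"], best["score"]


# Illustrative predictions in the pipeline's output shape (not real model output).
example = [
    {"label": "plastic", "score": 0.91},
    {"label": "glass", "score": 0.06},
    {"label": "metal", "score": 0.03},
]
```

Calling `top_label(classify_image("bottle.jpg"))` would then return the predicted category and its score, where `bottle.jpg` stands in for any local image path.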
Requirements
Install dependencies with:
pip install -r requirements.txt