Trash Image Classification using Pre-trained Vision Transformer (ViT)

Category: AI/Data/ML

Date: Jul 2023


This project implements an image classification system using a pre-trained Vision Transformer (ViT) from Hugging Face, fine-tuned to classify waste images into six categories:

♻️ Cardboard, Glass, Metal, Paper, Plastic, and Trash


📊 Dataset

We used the garythung/trashnet dataset, which has the following class distribution (5,054 images in total):

  • 📦 Cardboard: 806 images
  • 🍾 Glass: 1002 images
  • 🥫 Metal: 820 images
  • 📄 Paper: 1188 images
  • 🧴 Plastic: 964 images
  • 🚯 Trash: 274 images

⚠️ Because the classes are imbalanced (Trash has 274 images, Paper has 1188), a WeightedRandomSampler was used during training so that under-represented classes are drawn more often.
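
As an illustration of how such a sampler can be set up, the sketch below derives inverse-frequency weights from the class counts listed in the Dataset section and hands them to PyTorch's WeightedRandomSampler. The weighting scheme (1 / class size), the lowercase label names, and the helper names are assumptions made for this example; the project's actual training script may differ.

```python
# Class counts as listed in the Dataset section.
CLASS_COUNTS = {
    "cardboard": 806,
    "glass": 1002,
    "metal": 820,
    "paper": 1188,
    "plastic": 964,
    "trash": 274,
}


def inverse_frequency_weights(counts):
    """Return one weight per class, proportional to 1 / class size (assumed scheme)."""
    return {name: 1.0 / n for name, n in counts.items()}


def per_sample_weights(labels, class_weights):
    """Map every sample's label to the weight of its class."""
    return [class_weights[label] for label in labels]


def build_sampler(labels, class_weights):
    """Build a WeightedRandomSampler for the given labels (requires torch)."""
    from torch.utils.data import WeightedRandomSampler  # lazy import

    weights = per_sample_weights(labels, class_weights)
    # replacement=True lets minority-class images be drawn more than once per epoch.
    return WeightedRandomSampler(weights, num_samples=len(weights), replacement=True)
```

With these weights every class carries the same total sampling mass, so a batch is expected to contain roughly equal numbers of images per class. The returned sampler would be passed to a DataLoader through its sampler argument.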


🧠 Model Overview

We fine-tuned the following pre-trained ViT checkpoint:

🔗 google/vit-base-patch16-224-in21k
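
A minimal sketch of how this checkpoint could be loaded with a six-class classification head using the Hugging Face transformers library. The alphabetical label order and the helper name load_base_model are assumptions for illustration; the label order in the actual dataset or training code may differ.

```python
# Six target classes; alphabetical order is an assumption for this sketch.
LABELS = ["cardboard", "glass", "metal", "paper", "plastic", "trash"]
ID2LABEL = {i: name for i, name in enumerate(LABELS)}
LABEL2ID = {name: i for i, name in ID2LABEL.items()}


def load_base_model(checkpoint="google/vit-base-patch16-224-in21k"):
    """Load the pre-trained ViT backbone with a new 6-way classification head.

    Requires the transformers and torch packages and downloads the
    checkpoint on first use.
    """
    from transformers import ViTForImageClassification  # lazy import

    return ViTForImageClassification.from_pretrained(
        checkpoint,
        num_labels=len(LABELS),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
```

Passing id2label and label2id stores readable class names in the model config, so predictions come back as names rather than bare indices.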

The resulting model is published at:

🔗 tribber93/my-trash-classification
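
The published model can probably be tried out with the transformers image-classification pipeline. The sketch below assumes the model repository is a standard image-classification checkpoint that the pipeline can load; the helper names are hypothetical.

```python
MODEL_ID = "tribber93/my-trash-classification"  # model name from the text above


def load_classifier(model_id=MODEL_ID):
    """Create an image-classification pipeline (downloads the model on first use)."""
    from transformers import pipeline  # lazy import; requires transformers and torch

    return pipeline("image-classification", model=model_id)


def top_prediction(results):
    """Pick the highest-scoring entry from a list of {'label', 'score'} dicts."""
    best = max(results, key=lambda r: r["score"])
    return best["label"], best["score"]
```

For example, calling top_prediction(load_classifier()("bottle.jpg")) would return the predicted class and its score, where bottle.jpg stands in for any local image path.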


⚙️ Requirements

Install dependencies with:

pip install -r requirements.txt