This walkthrough uses an ML algorithm called an image classification model. These models learn to distinguish between different objects by observing many examples over many iterations. This post uses a technique called transfer learning to dramatically reduce the time and data required to train an image classification model. For more information about transfer learning with Amazon SageMaker built-in algorithms, see How Image Classification Works. With transfer learning, you only need a few hundred images of each type of trash. As you add more training samples and vary the viewing angle and lighting for each type of trash, the model takes longer to train but improves its accuracy during inference, when you ask the model to classify trash items it has never seen before.
Before going out and collecting images yourself, consider the many sources of images that are publicly available. We want images with clear labels (often applied by humans) describing what’s inside the image. Here are some sources you could explore for your use case:
A good practice when collecting images is to capture pictures from a variety of angles and under different lighting conditions, which makes the model more robust. The following image is an example of the type of image the model classifies as landfill, recycling, or compost.
When you have your images for each type of trash, separate the images into folders.
|-images
  |-Compost
  |-Landfill
  |-Recycle
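If your collected images encode their label in the filename, a short script can sort them into the folder layout above. This is a minimal sketch; the filename convention (`compost_001.jpg`, `landfill_002.jpg`, and so on) and the `sort_images` helper are assumptions for illustration, not part of the original walkthrough.

```python
from pathlib import Path
import shutil

# Hypothetical convention: each file is named "<class>_<number>.jpg",
# e.g. "compost_001.jpg" or "recycle_017.jpg".
CLASSES = ("Compost", "Landfill", "Recycle")

def sort_images(src: Path, dest: Path) -> int:
    """Move each .jpg in src into dest/<Class>/ based on its filename prefix.

    Returns the number of images moved.
    """
    moved = 0
    for cls in CLASSES:
        # Create images/Compost, images/Landfill, images/Recycle
        (dest / cls).mkdir(parents=True, exist_ok=True)
    for img in src.glob("*.jpg"):
        for cls in CLASSES:
            if img.name.lower().startswith(cls.lower()):
                shutil.move(str(img), str(dest / cls / img.name))
                moved += 1
                break
    return moved
```

Running `sort_images(Path("raw"), Path("images"))` produces the three-folder structure shown above, ready for upload.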
After you have the images you want to train your ML model on, upload them to Amazon S3. First, create an S3 bucket. For AWS DeepLens projects, the S3 bucket name must start with the prefix deeplens-.
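As a quick sanity check before creating the bucket, you can validate the name against the DeepLens prefix requirement and general S3 naming rules (3–63 characters; lowercase letters, digits, hyphens, and periods). The helper and the example bucket name `deeplens-trash-sorter` below are illustrative assumptions, not names from the original post.

```python
import re

def is_valid_deeplens_bucket(name: str) -> bool:
    """Return True if the name satisfies the DeepLens requirement
    (starts with 'deeplens-') and basic S3 bucket naming rules:
    3-63 characters, lowercase letters, digits, hyphens, periods,
    beginning and ending with a letter or digit."""
    if not name.startswith("deeplens-"):
        return False
    return bool(re.fullmatch(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]", name))

# Hypothetical bucket name for this walkthrough:
bucket = "deeplens-trash-sorter"
assert is_valid_deeplens_bucket(bucket)

# With the AWS CLI, the bucket creation and upload then look like:
#   aws s3 mb s3://deeplens-trash-sorter
#   aws s3 sync images/ s3://deeplens-trash-sorter/images/
```

The `aws s3 sync` command preserves the Compost/Landfill/Recycle folder structure in the bucket, so the labels carry over to S3.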
This recipe provides a dataset of images labeled under the categories of recycling, landfill, and compost.