Semi-Supervised Semantic Segmentation
Improving semantic segmentation using a multi-phase training strategy and enhanced generative models.
This project builds upon NVIDIA’s “Semantic Segmentation with Generative Models” by introducing a training approach that combines labeled and unlabeled data to improve semantic segmentation performance. Adding more generators and discriminators to the architecture is intended to make the model more flexible and accurate when labeled data is scarce, which is the central challenge in semi-supervised learning.
Key Contributions
The core innovation in this project lies in enhancing the original generative model architecture. Unlike the base design, which included one generator and two discriminators, the extended framework employs multiple generators and discriminators. This expansion allows the model to handle more complex data distributions, making it better equipped for semi-supervised tasks.
To further boost performance, the project implements a multi-phase training strategy. In the first phase, the initial generator is trained on labeled data to create a robust baseline. In the second phase, a new generator is introduced and trained on a combination of labeled and unlabeled data while the parameters of the first generator are frozen. In the final phase, the first generator is retrained on the entire dataset (both labeled and unlabeled) to refine segmentation accuracy. This staged process is designed to improve the model gradually and help it adapt to diverse data distributions.
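The three phases can be written down as a small schedule that records which generator is updated, which is frozen, and which data each phase sees. This is a minimal sketch rather than the project's actual training code: the names `G1`, `G2`, `Phase`, and `SCHEDULE` are hypothetical, and freezing the second generator during the final phase is an assumption, since the description above only says that the first generator is retrained.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Phase:
    name: str
    trainable: Tuple[str, ...]  # generators whose weights are updated
    frozen: Tuple[str, ...]     # generators whose weights are held fixed
    data: Tuple[str, ...]       # data splits used in this phase


# Hypothetical labels: "G1" is the initial generator, "G2" the added one.
SCHEDULE: List[Phase] = [
    Phase("phase_1", trainable=("G1",), frozen=(), data=("labeled",)),
    Phase("phase_2", trainable=("G2",), frozen=("G1",),
          data=("labeled", "unlabeled")),
    # Assumption: G2 stays fixed while G1 is retrained on all data.
    Phase("phase_3", trainable=("G1",), frozen=("G2",),
          data=("labeled", "unlabeled")),
]


def trainable_flags(phase: Phase) -> Dict[str, bool]:
    """Map each generator present in a phase to whether it is updated."""
    flags = {name: True for name in phase.trainable}
    flags.update({name: False for name in phase.frozen})
    return flags


if __name__ == "__main__":
    for phase in SCHEDULE:
        print(phase.name, phase.data, trainable_flags(phase))
```

In a PyTorch training loop, flags like these would typically be applied with `module.requires_grad_(flag)` on each generator before the optimizer for that phase is built.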
Method
This project integrates CycleGAN to enhance the training process by generating synthetic data that complements real-world datasets. CycleGAN helps diversify the training data, exposing the model to a broader range of variations and enabling it to generalize more effectively. The synthetic data generated by CycleGAN fills gaps in the labeled dataset, making semi-supervised learning more efficient.
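The description above does not say how CycleGAN is wired into this project, so the following only illustrates the cycle-consistency term from the standard CycleGAN formulation: an image mapped to the other domain and back should match the original under an L1 penalty. It is a toy sketch in plain Python. Images are flattened lists of floats, and `g`, `f`, and `cycle_consistency_loss` are illustrative names, not functions from the project.

```python
from typing import Callable, List, Sequence

Image = List[float]  # toy stand-in for a flattened image
Mapping = Callable[[Image], Image]


def l1_distance(a: Image, b: Image) -> float:
    """Mean absolute difference between two equally sized images."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def cycle_consistency_loss(
    xs: Sequence[Image],  # batch from domain X
    ys: Sequence[Image],  # batch from domain Y
    g: Mapping,           # maps X -> Y
    f: Mapping,           # maps Y -> X
) -> float:
    """Batch-averaged |f(g(x)) - x| plus batch-averaged |g(f(y)) - y|."""
    forward = sum(l1_distance(f(g(x)), x) for x in xs) / len(xs)
    backward = sum(l1_distance(g(f(y)), y) for y in ys) / len(ys)
    return forward + backward


if __name__ == "__main__":
    shift_up = lambda img: [v + 1.0 for v in img]
    shift_down = lambda img: [v - 1.0 for v in img]
    # Exact inverses reconstruct every image, so the loss vanishes.
    print(cycle_consistency_loss([[0.0, 1.0]], [[2.0, 3.0]],
                                 shift_up, shift_down))
```

When `g` and `f` are exact inverses the loss is zero; any reconstruction error in either direction increases it, which is what pushes the two translators to preserve image content.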
The iterative training method combines the outputs of the training phases, leveraging both labeled and unlabeled data. The model becomes more robust over successive phases as it learns to cope with incomplete or noisy datasets. By balancing labeled and unlabeled data and incorporating additional generators and discriminators, the project demonstrates how semi-supervised learning can achieve high segmentation accuracy without requiring extensive manual labeling.