SS-GAN-ViT: Advancing Multi-label Chest Image Annotation Through Self-Supervised Learning, Adversarial Networks, and Vision Transformers


JIITA, vol. 8, no. 1, pp. 918-932, 2024, DOI: 10.22664/ISITA.2024.8.1.918

Sang Suh*, Sobha Rani Ponduru
Computer Science Department, Texas A&M University-Commerce, Texas, U.S.A.

Abstract: Rapid advances in medical imaging highlight the need for robust automated multi-label annotation systems, particularly in chest imaging, which is crucial for diagnosing and monitoring thoracic diseases. Despite the adoption of deep learning models for image annotation, accurately annotating multiple conditions in a single chest image remains challenging. A notable prior attempt, the adversarial-based denoising autoencoder model, showed promise in multi-label classification but was limited in accuracy and robustness. Motivated by these limitations, we propose SS-GAN-ViT, a model that combines self-supervised learning, adversarial networks, and Vision Transformers to substantially improve multi-label annotation accuracy in chest imaging. This combination addresses the identified shortcomings of existing models and offers a robust solution for accurate multi-label annotation. Comparative evaluations against existing models are expected to demonstrate the superior performance of SS-GAN-ViT, advancing the field of medical image annotation and potentially supporting better diagnostic and treatment planning in healthcare.

Keywords: annotation, thoracic, adversarial
