Real-Time Lightweight Human Parsing Based on Class Relationship Knowledge Distillation

Author(s)
LANG YUQI
Alternative Author(s)
LANG YUQI
Advisor
황원준 (Wonjun Hwang)
Department
Department of Artificial Intelligence, Graduate School
Publisher
The Graduate School, Ajou University
Publication Year
2023-08
Language
eng
Keyword
Human Parsing; Knowledge Distillation; Lightweight Model
Alternative Abstract
In the field of computer vision, understanding human subjects is a crucial and challenging task, as it requires recognizing and interpreting human presence and behavior in images or videos. Within this domain, human parsing is especially demanding: it requires accurately locating the human region and dividing it into multiple semantic parts. This is a dense prediction task that calls for substantial computational capability and high-precision models. With the continuous development of computer vision technologies, human parsing has been widely applied to other human-centric tasks, such as pose estimation and human image generation, and these applications are expected to play an increasingly important role in future artificial intelligence research.

To achieve real-time human parsing on devices with limited computational resources, we design a lightweight human parsing model. We choose ResNet-18 as the backbone and simplify the traditional pyramid module used to obtain high-resolution contextual information, significantly reducing model complexity. To further enhance parsing accuracy, we integrate a spatial attention fusion strategy. Our lightweight model runs efficiently and achieves high segmentation accuracy on Look into Person (LIP), a dataset commonly used for human parsing. Although traditional models achieve excellent segmentation accuracy, their high complexity and large parameter counts restrict their use on resource-constrained devices.

To further improve the accuracy of our lightweight network, we also apply knowledge distillation. Conventional knowledge distillation uses the Kullback-Leibler (KL) divergence to match the predicted probability scores of the teacher and student models. However, this approach may fail to transfer useful knowledge when there is a large gap between the teacher and student networks. We therefore adopt a new distillation criterion based on inter-class and intra-class relationships in the prediction results, which significantly improves parsing accuracy. Experiments show that, while maintaining high segmentation accuracy, our lightweight model substantially reduces the number of parameters, achieving our intended goals.
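To make the contrast drawn in the abstract concrete, the sketch below (PyTorch assumed; not the thesis code) compares conventional per-pixel KL matching with an illustrative relation-based criterion that matches inter-class and pixel-level similarity structure between teacher and student. The function names, the MSE matching of relation matrices, and the pixel sub-sampling are assumptions made for this example; the exact formulation used in the thesis may differ.

import torch
import torch.nn.functional as F

def kl_distillation(student_logits, teacher_logits, T=4.0):
    # Conventional KD: match per-pixel class probabilities with KL divergence.
    # logits: (N, C) for N pixels and C semantic part classes.
    log_p_s = F.log_softmax(student_logits / T, dim=1)
    p_t = F.softmax(teacher_logits / T, dim=1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * (T * T)

def relation_distillation(student_logits, teacher_logits, max_pixels=512):
    # Illustrative relation-based KD: match similarity structure rather than raw scores.
    p_s = F.softmax(student_logits, dim=1)  # (N, C)
    p_t = F.softmax(teacher_logits, dim=1)  # (N, C)

    # Inter-class relation: C x C similarity between class-probability
    # "columns" taken across all pixels.
    cols_s = F.normalize(p_s.t(), dim=1)    # (C, N)
    cols_t = F.normalize(p_t.t(), dim=1)
    loss_inter = F.mse_loss(cols_s @ cols_s.t(), cols_t @ cols_t.t())

    # Intra-image (pixel-wise) relation: similarity between per-pixel
    # probability vectors, computed on a shared random subsample of pixels
    # to keep the N x N matrix small.
    idx = torch.randperm(p_s.size(0))[:max_pixels]
    rows_s = F.normalize(p_s[idx], dim=1)
    rows_t = F.normalize(p_t[idx], dim=1)
    loss_intra = F.mse_loss(rows_s @ rows_s.t(), rows_t @ rows_t.t())

    return loss_inter + loss_intra

The intuition is that relation matrices capture how classes and pixels relate to one another, which a small student can imitate even when its absolute confidence scores cannot match a much larger teacher's.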
URI
https://dspace.ajou.ac.kr/handle/2018.oak/24285
Appears in Collections:
Graduate School of Ajou University > Department of Artificial Intelligence > 3. Theses(Master)
Files in This Item:
There are no files associated with this item.