Real-Time Lightweight Human Parsing Based on Class Relationship Knowledge Distillation

DC Field | Value | Language
dc.contributor.advisor | 황원준 | -
dc.contributor.author | LANG YUQI | -
dc.date.accessioned | 2025-01-25T01:35:51Z | -
dc.date.available | 2025-01-25T01:35:51Z | -
dc.date.issued | 2023-08 | -
dc.identifier.other | 32930 | -
dc.identifier.uri | https://dspace.ajou.ac.kr/handle/2018.oak/24285 | -
dc.description | Thesis (Master's)--Ajou University Graduate School: Department of Artificial Intelligence, 2023. 8 | -
dc.description.tableofcontents | I Introduction; II Related Works; III Proposed Method; 3.1 Framework Overview; 3.2 Proposed Method; 3.2.1 Effective model light-weighting methods; 3.2.2 An Effective Lightweight Spatial Feature Fusion Attention Method for Human Parsing Models (LSFA); 3.2.3 Applying the intra-class and inter-class relationship approach to knowledge distillation; IV Experimental Results and Discussion; 4.1 Dataset; 4.2 Implementation Details; 4.3 Inference speed and performance; 4.4 Ablation experiment; V Conclusion; References | -
dc.language.iso | eng | -
dc.publisher | The Graduate School, Ajou University | -
dc.rights | Ajou University theses are protected by copyright. | -
dc.title | Real-Time Lightweight Human Parsing Based on Class Relationship Knowledge Distillation | -
dc.type | Thesis | -
dc.contributor.affiliation | Ajou University Graduate School | -
dc.contributor.alternativeName | LANG YUQI | -
dc.contributor.department | Graduate School, Department of Artificial Intelligence | -
dc.date.awarded | 2023-08 | -
dc.description.degree | Master | -
dc.identifier.localId | T000000032930 | -
dc.identifier.url | https://dcoll.ajou.ac.kr/dcollection/common/orgView/000000032930 | -
dc.subject.keyword | Human Parsing | -
dc.subject.keyword | Knowledge Distillation | -
dc.subject.keyword | Model Lightweight | -
dc.description.alternativeAbstract | In the field of computer vision, understanding human objectives is a crucial and challenging task, as it requires recognizing and comprehending human presence and behavior in images or videos. Within this domain, human parsing is an extremely challenging task, as it requires accurately locating the human region and dividing it into multiple semantic areas. This is a dense prediction task that demands powerful computational capabilities and high-precision models. Recently, with the continuous development of computer vision technologies, human parsing has been widely applied to other tasks related to human objectives, such as pose estimation and human image generation. These applications are expected to play an increasingly important role in future artificial intelligence research. To achieve real-time human parsing on devices with limited computational resources, we designed and introduced a lightweight human parsing model. We chose ResNet18 as the core network structure and simplified the traditional pyramid module used to obtain high-definition contextual information, thus significantly reducing the complexity of the model. Additionally, to enhance the parsing accuracy of the model, we integrated a spatial attention fusion strategy. Our lightweight model is efficient and achieves high segmentation accuracy on Look into Person (LIP), a commonly used dataset for human parsing tasks. Although traditional models perform excellently in terms of segmentation accuracy, their high complexity and large number of parameters restrict their use on devices with limited computational resources. To further improve the accuracy of our lightweight network, we also applied knowledge distillation. The traditional knowledge distillation method uses the Kullback-Leibler (KL) divergence to match the prediction probability scores of the teacher and student models. However, this approach may be ineffective at learning useful knowledge when there is a significant difference between the teacher and student networks. Therefore, we adopted a new distillation criterion, based on inter-class and intra-class relationships in the prediction results, which significantly improves parsing accuracy. Empirical evidence shows that, while maintaining high segmentation accuracy, our lightweight model substantially reduces the number of parameters, thereby achieving our expected goals. | -
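The contrast the abstract draws between KL-based distillation and relationship-based distillation can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not say how the inter-class and intra-class relationships are measured, so the use of Pearson correlation here is an assumption, and all function names (`kl_distillation_loss`, `relation_distillation_loss`, and the helpers) are hypothetical. Each inner list stands for the class logits predicted at one pixel.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert one pixel's class logits into probabilities."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def pearson(a, b, eps=1e-12):
    """Pearson correlation between two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da)) * math.sqrt(sum(y * y for y in db))
    return num / (den + eps)


def kl_distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """Classic distillation: mean KL(teacher || student) over pixels."""
    losses = [
        kl_divergence(softmax(t, temperature), softmax(s, temperature))
        for t, s in zip(teacher_logits, student_logits)
    ]
    return sum(losses) / len(losses)


def relation_distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """Relation-based distillation (assumed form): match correlations
    instead of exact probabilities."""
    t_probs = [softmax(t, temperature) for t in teacher_logits]
    s_probs = [softmax(s, temperature) for s in student_logits]
    # Inter-class relation: for each pixel, correlate scores across classes.
    inter = [1.0 - pearson(t, s) for t, s in zip(t_probs, s_probs)]
    # Intra-class relation: for each class, correlate scores across pixels
    # (zip(*rows) transposes the pixel-by-class matrix).
    intra = [1.0 - pearson(t, s) for t, s in zip(zip(*t_probs), zip(*s_probs))]
    return sum(inter) / len(inter) + sum(intra) / len(intra)


# Toy example: 3 pixels, 3 classes (values are made up for illustration).
teacher = [[2.0, 0.5, -1.0], [0.1, 1.5, 0.3], [-0.5, 0.2, 2.2]]
student = [[1.0, 0.8, -0.2], [0.4, 0.9, 0.1], [0.0, 0.3, 1.0]]
print("KL loss:", kl_distillation_loss(teacher, student))
print("Relation loss:", relation_distillation_loss(teacher, student))
```

Under this assumed form, the relation loss only asks the student to preserve how the teacher's scores vary across classes and across pixels, which is a looser target than matching every probability exactly and may be easier for a much smaller student to satisfy.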
Appears in Collections:
Graduate School of Ajou University > Department of Artificial Intelligence > 3. Theses(Master)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
