Action Segmentation using Bezier Curvature as Spatio-Temporal Feature by Triplet Learning
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 손경아 | - |
dc.contributor.author | 장승민 | - |
dc.date.accessioned | 2022-11-29T03:01:11Z | - |
dc.date.available | 2022-11-29T03:01:11Z | - |
dc.date.issued | 2022-02 | - |
dc.identifier.other | 31451 | - |
dc.identifier.uri | https://dspace.ajou.ac.kr/handle/2018.oak/20840 | - |
dc.description | Thesis (Master's) -- The Graduate School, Ajou University: Department of Artificial Intelligence, 2022. 2 | - |
dc.description.tableofcontents | 1 Introduction; 2 Related Works; 3 Method; 3.1 Framewise Embedding; 3.1.1 Triplet Network for Video; 3.1.2 Reorganization for Triplet Selection; 3.2 Curvature Synthesis; 3.2.1 Bezier Curve Principle; 3.2.2 Continuous Temporal Information; 3.2.3 Discrete Temporal Information; 3.3 Action Segmentation from Curvature; 4 Experiment; 4.1 Datasets; 4.2 Metrics; 4.3 Backbone Models; 4.4 Quantitative Results; 4.4.1 Comparison with the state-of-the-art on GTEA dataset; 4.4.2 Comparison with the state-of-the-art on 50Salads dataset; 4.4.3 Comparison with the state-of-the-art on Breakfast dataset; 4.5 Qualitative Results; 4.5.1 Curvature Effect on Backbone 1; 4.5.2 Curvature Effect on Backbone 2; 4.5.3 Curvature Effect on Backbone 3; 4.6 Effect of Reorganization; 4.6.1 Partition Selection; 4.6.2 Successive Selection; 4.6.3 Reorganization; 5 Conclusion | - |
dc.language.iso | eng | - |
dc.publisher | The Graduate School, Ajou University | - |
dc.rights | Ajou University theses are protected by copyright. | - |
dc.title | Action Segmentation using Bezier Curvature as Spatio-Temporal Feature by Triplet Learning | - |
dc.type | Thesis | - |
dc.contributor.affiliation | The Graduate School, Ajou University | - |
dc.contributor.department | Department of Artificial Intelligence, The Graduate School | - |
dc.date.awarded | 2022. 2 | - |
dc.description.degree | Master | - |
dc.identifier.localId | 1245033 | - |
dc.identifier.uci | I804:41038-000000031451 | - |
dc.identifier.url | https://dcoll.ajou.ac.kr/dcollection/common/orgView/000000031451 | - |
dc.subject.keyword | Action Segmentation | - |
dc.description.alternativeAbstract | With the development of recording technologies, the demand for video-based techniques is increasing. Despite the success of action recognition, which classifies short trimmed videos, handling long untrimmed videos remains a challenge. Action segmentation is the task of detecting and temporally locating action segments in such videos. Although previous approaches have shown outstanding architectural development, the feature extractor has remained largely unchanged. Recent approaches require additional temporal information, such as action boundary annotations, which is difficult to obtain under real-world conditions; this is because temporal features are not as well developed as spatial features. In this thesis, we propose a new feature synthesis framework called the Temporal Curvature Feature (TCF). The framework consists of two stages: (a) framewise embedding and (b) curvature synthesis. In the framewise embedding stage, we use a triplet network to map a video of T frames into T points, grouped according to the action label of each frame. In the curvature synthesis stage, we approximate a curve through these embedded points and synthesize curvatures along the curve. These curvatures enhance the temporal information of the data through a framewise residual operation; the outputs have the same shape as the original inputs and serve as new inputs that bring out the potential of various models. To validate the effectiveness of our approach, we synthesize curvatures on three action segmentation datasets, i.e., GTEA, 50Salads, and Breakfast, and use the new inputs to train previous state-of-the-art models: MS-TCN, MS-TCN2, ASRF, and ASFormer. The results show overall performance increases; in particular, the F1 scores demonstrate the effectiveness of the approach on the segmentation problem. Finally, qualitative figures demonstrate that the curvature helps the models better understand temporal information. | - |
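The abstract describes the curvature-synthesis stage concretely enough to sketch. Below is a minimal NumPy sketch: framewise triplet embeddings are treated as control points of a Bezier curve evaluated with the de Casteljau recursion, discrete curvature is computed along the sampled curve, and the curvature is folded back into the original features by a framewise residual operation. The function names (`bezier`, `curvature`, `temporal_curvature_feature`), the single full-sequence curve (the thesis may fit local or lower-degree curves), and the residual form `x + κ·x` are illustrative assumptions, not the thesis's exact implementation.

```python
import numpy as np

def bezier(ctrl: np.ndarray, ts: np.ndarray) -> np.ndarray:
    """Evaluate the Bezier curve with control points ctrl (T, D) at
    parameters ts in [0, 1] via the de Casteljau recursion."""
    out = np.empty((len(ts), ctrl.shape[1]))
    for i, t in enumerate(ts):
        pts = ctrl.copy()
        while len(pts) > 1:                  # repeated linear interpolation
            pts = (1.0 - t) * pts[:-1] + t * pts[1:]
        out[i] = pts[0]
    return out

def curvature(points: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Discrete curvature of a curve sampled as S points in R^D, using
    kappa = sqrt(|r'|^2 |r''|^2 - (r'.r'')^2) / |r'|^3."""
    d1 = np.gradient(points, axis=0)         # first derivative,  (S, D)
    d2 = np.gradient(d1, axis=0)             # second derivative, (S, D)
    n1 = np.sum(d1 * d1, axis=1)             # |r'|^2
    n2 = np.sum(d2 * d2, axis=1)             # |r''|^2
    dot = np.sum(d1 * d2, axis=1)            # r' . r''
    num = np.sqrt(np.clip(n1 * n2 - dot ** 2, 0.0, None))
    return num / (n1 ** 1.5 + eps)           # kappa, shape (S,)

def temporal_curvature_feature(features: np.ndarray,
                               embeddings: np.ndarray) -> np.ndarray:
    """Enhance framewise features (T, D) with curvatures synthesized from
    triplet embeddings (T, d); the output keeps the input shape (T, D)."""
    T = len(features)
    samples = bezier(embeddings, np.linspace(0.0, 1.0, T))  # one sample per frame
    kappa = curvature(samples)                              # (T,)
    return features + kappa[:, None] * features             # framewise residual
```

Because the output has the same shape as the input features, the enhanced features could feed directly into the listed backbones (MS-TCN, MS-TCN2, ASRF, ASFormer) in place of the original framewise features, which is consistent with the abstract's claim that no architectural changes are needed.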