The robustness of neural network-based deep learning models against adversarial attacks is closely related to the geometric structure of their decision boundaries. Prior empirical studies have shown that networks with fragmented decision boundaries, as quantified by a high proportion of the Populated Region Set (PRS), are particularly vulnerable to adversarial attacks. Geometric regularization methods based on the PRS concept have therefore been proposed. However, these methods are limited when the geometric structure of the decision boundary is insufficiently developed, such as during early training phases or when using large batch sizes. In this thesis, I first provide a theoretical analysis that establishes mathematically how training configurations, such as batch size, and regularization techniques, such as label smoothing, influence the geometric structure of decision boundaries. Based on this analysis, I propose a novel regularization technique, PRS-aware Label Smoothing (PaLS), that integrates label smoothing with the PRS framework to overcome the limitations of existing PRS-based regularization methods. PaLS uses smoothing to encourage alignment between the key centroids of decision regions and feature embeddings, thereby reducing the fragmentation of decision boundaries and enhancing adversarial robustness. To validate the proposed method, experiments were conducted on three different models. The results show that PaLS provides greater robustness than standard label smoothing and existing PRS-based regularization methods across various architectures and training scenarios. In conclusion, this study deepens the geometric understanding of deep learning models and offers a practical approach to improving robustness against adversarial attacks.
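The robustness of neural network-based deep learning models against adversarial attacks is closely related to the geometric structure of the models' decision regions. Prior empirical studies have shown that networks with more fragmented decision regions have a higher Populated Region Set (PRS) ratio and are consequently more vulnerable to adversarial attacks. Geometric regularization techniques based on the PRS concept have accordingly been proposed, but they are of limited applicability when the geometric structure of the decision regions is not yet sufficiently formed, as in early training or with large batch sizes. This thesis first verifies mathematically, through theoretical analysis, how differences in training setup, such as batch size, and regularization methods, such as label smoothing, affect the geometric structure of decision regions. Building on this, it proposes PRS-aware Label Smoothing (PaLS), a new regularization technique that combines label smoothing with the PRS concept to overcome the limitations of existing PRS-based regularization techniques. PaLS uses smoothing to encourage the key centroids of decision regions to align with feature embeddings, thereby resolving the fragmentation of decision regions and improving adversarial robustness. To validate the proposed technique, experiments were carried out by applying it to three models. The results show that PaLS achieves higher robustness than standard label smoothing and existing PRS-based regularization methods across various architectures and training scenarios. In conclusion, this study deepens the geometric understanding of deep learning models and presents a practical approach to improving robustness against adversarial attacks.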