Korea Advanced Institute of Science and Technology (KAIST) Library

Main bibliographic information
Relaxation methods of differential privacy on deep learning models and large language models
Title / Author	딥러닝 모델과 대형 언어 모델에서의 차분 프라이버시 완화 기법 = Relaxation methods of differential privacy on deep learning models and large language models / 서준석 (Jun Seok Seo).
Publication	[Daejeon : Korea Advanced Institute of Science and Technology, 2024].

Holdings

Registration number	8042201
Location / Call number	Academic Cultural Complex (Library), 2nd floor, Theses / MEE 24089
Status	Available (not for loan)

Abstract

In deep learning, privacy preservation is a central concern for trustworthy artificial intelligence. Differential privacy, a widely used notion for protecting privacy in AI models, typically applies a uniform level of protection to every data point, which leads to overprotection and reduced performance. This thesis introduces two techniques that rely on relaxed definitions of differential privacy to improve performance without imposing excessive privacy. First, instead of a uniform protection level, we study deep learning training that matches each user's own privacy requirement, which reduces unnecessary protection and improves performance. PDP-SGD extends the widely used differentially private training technique DP-SGD and satisfies individual privacy demands through personalized differential privacy. Second, when protecting personal data in the downstream tasks of large language models, personal information makes up only a small fraction of the overall text, so we study protecting only those pieces of personal information selectively. SDPPrompt achieves selective differential privacy in large language models through prompt tuning, reducing training cost while improving performance.
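
As background for the abstract, the following is a minimal sketch of the gradient step in standard DP-SGD, the technique the abstract says PDP-SGD extends: each example's gradient is clipped to a fixed L2 norm, the clipped gradients are summed, and Gaussian noise scaled to the clipping bound is added. It is not the thesis's PDP-SGD or SDPPrompt, whose details this record does not give; the function name `dp_sgd_gradient` and its parameters are illustrative assumptions.

```python
import math
import random


def dp_sgd_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Return a noisy average gradient in the style of standard DP-SGD.

    per_example_grads: list of gradients, one list of floats per example.
    clip_norm: maximum L2 norm allowed for any single example's gradient.
    noise_multiplier: ratio of the noise standard deviation to clip_norm.
    rng: a random.Random instance (hypothetical choice for this sketch).
    """
    batch_size = len(per_example_grads)
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        norm = math.sqrt(sum(g * g for g in grad))
        # Shrink the gradient only if its norm exceeds the clipping bound.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i, g in enumerate(grad):
            total[i] += g * scale
    # Gaussian noise is calibrated to the clipping bound, the same for every example.
    sigma = noise_multiplier * clip_norm
    return [(t + rng.gauss(0.0, sigma)) / batch_size for t in total]


if __name__ == "__main__":
    rng = random.Random(0)
    grads = [[3.0, 4.0], [0.0, 0.5]]
    print(dp_sgd_gradient(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng))
```

In this sketch every example shares the same `clip_norm` and `noise_multiplier`, which corresponds to the uniform protection level the abstract describes as the limitation of standard differential privacy.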

Additional bibliographic information
Call number	MEE 24089
Physical description	iv, 31 p. : illustrations ; 30 cm
Language	Korean
General note	Author's name in English: Jun Seok Seo; Advisor's name in Korean: 황의종; Advisor's name in English: Steven Euijong Whang; Includes appendix
Thesis	Thesis (Master's) - Korea Advanced Institute of Science and Technology : School of Electrical Engineering,
Bibliography note	References: p. 26-29
Subject	differential privacy; personalized differential privacy; selective differential privacy; large language model; prompt tuning
