KAIST Library

Main Bibliographic Information
Development of an anti-drone system based on a multi-agent reinforcement learning algorithm = 다중에이전트 강화학습 기반 안티드론 시스템 개발
Title / Author	Development of an anti-drone system based on a multi-agent reinforcement learning algorithm = 다중에이전트 강화학습 기반 안티드론 시스템 개발 / Donghwi Kim.
Publication	[Daejeon : Korea Advanced Institute of Science and Technology, 2021].

Holdings Information

Registration No.	8038076
Location	Academic Cultural Complex (Cultural Hall), Preservation Stacks
Call Number	MEE 21130
Status	Available (not for loan)

Abstract

Currently, the main anti-drone technology uses images to estimate the relative position and velocity of a target drone and follow it. However, there is no specific strategy for capturing a target drone with multiple agents. In response to the rapid rise in unauthorized drone operation and intrusion, this thesis proposes component techniques for subduing a target drone with multiple drones. Two or three slower pursuit drones were trained with reinforcement learning to capture one target drone in an environment with obstacles. The multi-agent deep deterministic policy gradient (MADDPG) algorithm is used to train the agents. To improve performance with recurrent networks, we apply single-agent off-policy recurrent reinforcement learning methods to the multi-agent algorithm so that recurrence is preserved, and we compare and analyze their performance. As a result, storing the hidden state of the recurrent network in the replay buffer and using it during training showed about 1.2 times better performance. To maximize performance, we designed the network so that the agents share and learn from each agent's experience, increasing the speed by about 2x.

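The main anti-drone technique today uses images to determine the three-dimensional relative position and velocity of a target drone and then follows it. However, there is no particular strategy for capturing a target drone with multiple agents. This thesis describes component techniques for subduing a target drone with multiple drones, in response to the unauthorized operation and intrusion of drones, which has recently become a rapidly growing concern. Using reinforcement learning, two or three slow pursuit drones were trained to capture one target drone in an environment with obstacles. Multi-agent deep deterministic policy gradient reinforcement learning was used to train the multiple agents. To improve performance with a recurrent network, a method that guarantees recurrence for off-policy single agents was incorporated into the multi-agent algorithm, and the resulting performance was compared and analyzed. As a result, storing the recurrent network's hidden state in the replay buffer and using it in the training stage performed about 1.2 times better. To maximize performance, the network was designed so that the agents learn while sharing each agent's experience, which increased the speed by about 2 times.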
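
The abstract's two reported ideas, storing the recurrent hidden state in the replay buffer and letting agents share experience, can be illustrated with a small Python sketch. This record does not include the thesis's implementation, so everything here is an assumption made for illustration: the names `Transition`, `SharedRecurrentReplayBuffer`, `toy_recurrent_step`, and `collect` are hypothetical, the recurrent cell is a toy stand-in for a GRU or LSTM, rewards are placeholders, and reading "sharing experience" as a single common buffer is only one possible interpretation.

```python
import random
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Tuple

Vector = Tuple[float, ...]


@dataclass(frozen=True)
class Transition:
    agent_id: int
    obs: Vector
    action: Vector
    reward: float
    next_obs: Vector
    done: bool
    hidden: Vector  # recurrent state the policy held BEFORE it saw `obs`


class SharedRecurrentReplayBuffer:
    """Hypothetical buffer filled by every pursuer (assumed reading of 'shared experience')."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        self._data: Deque[Transition] = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, transition: Transition) -> None:
        self._data.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        k = min(batch_size, len(self._data))
        return self._rng.sample(list(self._data), k)

    def __len__(self) -> int:
        return len(self._data)


def toy_recurrent_step(hidden: Vector, obs: Vector) -> Vector:
    """Toy stand-in for a GRU/LSTM cell: blends the old state with the observation."""
    return tuple(0.5 * h + 0.5 * o for h, o in zip(hidden, obs))


def collect(buffer: SharedRecurrentReplayBuffer, n_agents: int, n_steps: int, seed: int = 1) -> None:
    """Roll out each agent with random observations and record the pre-step hidden state."""
    rng = random.Random(seed)
    for agent_id in range(n_agents):
        hidden: Vector = (0.0, 0.0)  # every episode starts from a zero state
        obs: Vector = (rng.random(), rng.random())
        for t in range(n_steps):
            new_hidden = toy_recurrent_step(hidden, obs)
            action = new_hidden  # toy policy: act directly on the recurrent state
            next_obs = (rng.random(), rng.random())
            buffer.add(Transition(agent_id, obs, action, 0.0,  # reward is a placeholder
                                  next_obs, t == n_steps - 1, hidden))
            hidden, obs = new_hidden, next_obs


buffer = SharedRecurrentReplayBuffer(capacity=1000)
collect(buffer, n_agents=3, n_steps=4)
batch = buffer.sample(batch_size=5)

# During training, restart the recurrent cell from the stored state rather than from zeros,
# so the replayed step sees the same context the agent had when it acted.
replayed_actions = [toy_recurrent_step(t.hidden, t.obs) for t in batch]
```

Because each stored transition carries the hidden state from acting time, a randomly sampled step can be replayed without re-running the episode from its start. In this toy setup the replayed action equals the recorded one exactly. With a real network whose weights change during training, the stored state would gradually become stale, which is a known trade-off of this approach.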

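Other Bibliographic Information

Call Number	MEE 21130
Physical Description	iv, 41 p. : illustrations ; 30 cm
Language	English
General Note	Author's name in Korean : 김동휘 ; Advisor's name in English : Hyunchul Shim ; Advisor's name in Korean : 심현철
Thesis	Thesis (Master's) - Korea Advanced Institute of Science and Technology : School of Electrical Engineering,
Bibliography Note	References : p. 39-40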
