Bibliographic Information
Human visual system modeling and its application to visual communication over lossy packet networks = 인지시각의 모델링과 패킷망 영상전송에의 응용
Title / Author Human visual system modeling and its application to visual communication over lossy packet networks = 인지시각의 모델링과 패킷망 영상전송에의 응용 / Seong-Whan Kim.
Author Kim, Seong-Whan ; 김성환
Publication [Daejeon : Korea Advanced Institute of Science and Technology, 1999].

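Holdings

Registration number 8009911
Location Academic Cultural Complex (문화관), preservation stacks
Call number DCS 99008
Status Available ; loanable

Registration number 9006223
Location Seoul, thesis shelves
Call number DCS 99008 c.2
Status Available ; loanable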

Abstract

The main problem with transmitting visual information over packet networks is that they are lossy and do not support guaranteed service. There are four kinds of loss: packet corruption (part of a packet's information is lost or changed), packet loss (all of a packet's information is lost), packet delay (timing information is lost), and packet jitter or delay variation (synchronization information is lost). These losses are the main cause of image quality impairments, and they are inevitable in packet networks. Fortunately, in contrast to other data such as text, visual data can tolerate a small distortion as long as it is not perceivable to the human eye, by adopting appropriate masking built into the signal decoders. For this reason, we began our study by investigating the basic mechanisms and characteristics of information processing in the human visual system (HVS). To facilitate the discussion, we formulate a percepton model of visual perception, where a percepton is defined as a basic unit of human perception coming from the scene. The thesis addresses two problems: what is the maximum tolerance for a specific representation of a percepton, and how can perceptons be processed efficiently to maximize perceptual image quality over lossy packet networks?
In the first part, we identify the maximum tolerance level for coefficients in a transform-domain representation, using the discrete cosine transform (DCT) and the wavelet transform (WT) because they are widely used and efficient. There has been much research on imperceptible distortion. We extend the Watson model, which has been applied to the baseline JPEG coder. The model exploits three properties of the human visual system: frequency sensitivity, luminance masking, and contrast masking. In the human visual system, horizontal and amacrine cells transmit signals to the neighbouring bipolar and ganglion cells, which inhibits their responses (lateral inhibition). To increase the maximum tolerance for a percepton using this observation, we design a masking model based on the complexity of its neighbours, computed with entropy or, for simpler computation, variance. For video signals, we further extend the model to a motion entropy masking model. The basic idea is that the HVS decreases its sensitivity to a percepton as the motion complexity around it increases; this is also a kind of lateral inhibition, driven by the neighbours' motion information. A digital watermark, or watermark for short, is an invisible mark inserted in digital media such as digital images, audio, and video so that it can later be detected and used as evidence of copyright infringement. We use our entropy masking model to generate watermarks that distort each DCT coefficient. We compare our watermarking scheme with the P&Z and SS schemes in terms of imperceptibility, maximality, and robustness.
In the second part, we deal with the efficient processing of perceptons to maximize perceptual image quality over lossy packet networks. Our percepton model assumes a frameless rendering (FR) or asynchronous video coding (AVC) approach. An FR or AVC receiver displays image blocks asynchronously with the sender: if an image block arrives at the receiver neither too late nor too early for its display, it is displayed. First, we deal with the problem of balancing the number of blocks in each stream. This is important not only for traffic balancing at the sender, but also for maximizing the information obtained from received packets at the receiver. When most motion in a frame becomes slower, the eye decreases its retinal velocity and becomes more sensitive to low motion, which means that it has finer resolution for low motion. When most motion in a frame becomes faster, the eye increases its retinal velocity and becomes more sensitive to high motion, which means that it has finer resolution for high motion. We model this observation using motion entropy: we compute the ratio of the motion entropy of a motion region to the total motion entropy. If the ratio is large, we decrease the resolution of that motion class to reduce the number of blocks belonging to it, thereby passing blocks to the other motion classes. Second, we deal with the problem of scheduling substreams to maximize visual quality by minimizing jitter between substreams. This can be modeled as a real-time scheduling problem, which has been studied for a long time. We assign two requirements to each substream: a delay bound and a minimum quality. Low-motion streams can be delayed more than high-motion blocks, and high-motion blocks require less information for minimum quality. We apply this model using the DCT and WT as the representation of a percepton, and compare our scheme with two popular real-time scheduling algorithms: earliest deadline first (EDF) and (m,k). We use the REAL network simulator to simulate an overloaded network. We also implemented our scheme over a TCP/IP network to show the perceptual quality gain over the other two algorithms.
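The motion entropy ratio described in the abstract can be illustrated with a minimal sketch. The abstract does not define motion entropy, so this example assumes it is the Shannon entropy of the quantized motion vectors observed in a region, and that the total motion entropy is the sum over regions. The names `shannon_entropy` and `motion_entropy_ratios` are hypothetical and do not come from the thesis.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy (bits) of the empirical distribution of `symbols`."""
    n = len(symbols)
    if n == 0:
        return 0.0
    counts = Counter(symbols)
    # p * log2(1/p) summed over the distinct symbols
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def motion_entropy_ratios(regions):
    """Ratio of each region's motion entropy to the total (assumed: sum over regions).

    `regions` maps a motion class name to a list of quantized motion vectors.
    """
    entropies = {name: shannon_entropy(vectors) for name, vectors in regions.items()}
    total = sum(entropies.values())
    if total == 0:
        return {name: 0.0 for name in entropies}
    return {name: h / total for name, h in entropies.items()}


# Hypothetical data: a static region and a region with varied motion.
regions = {
    "low": [(0, 0), (0, 0), (0, 0), (0, 0)],
    "high": [(4, 1), (5, -2), (3, 0), (6, 2)],
}
ratios = motion_entropy_ratios(regions)
```

Under these assumptions, a class with a large ratio would be the candidate for reduced resolution, so that some of its blocks pass to other motion classes.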

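Research on modeling the human visual system includes functional models from cognitive science and masking models that measure the reduction in perceptual ability; recently, Watson97 suggested the potential of entropy masking. This thesis models the human visual system around entropy, based on the hypotheses that (1) each individual receptor responding to a stimulus has a fixed entropy capacity, and perceptual ability degrades when that capacity is exceeded, and (2) the receptors as a whole operate so as to maximize entropy. It also newly defines motion entropy to measure motion complexity. An entropy-based perceptual model has the advantage that mathematical theories such as maximum entropy theory can be applied to a variety of applications, and this thesis measures the gain through two applications. Watermarking is the application of embedding copyright information in an image or video by adding imperceptible information. Cox 97 and Podi 98 embed watermarks using the Watson perceptual model and the normal distribution N(0,1). However, these techniques degrade image quality for high-contrast images, the resulting watermark is not maximal, and robustness is poor. This thesis uses the entropy perceptual model and the bounded normal distribution BN(0,1) to increase the watermark embedded in an image (improving maximality and robustness) and eliminates the quality degradation around edges seen with the Podi 98 technique (improving image quality). For video, the watermark is increased at points where the motion entropy contribution is large, improving watermark maximality and robustness to MPEG coding. Packet networks require a video transmission technique that is robust to delay, delay variation, and loss. Mess 94 sends each image component over several motion channels and schedules them with EDF. For fast video, motion concentrates in a few channels, and the dominance of fast motion lowers overall image quality. This thesis splits the motion channels whose motion entropy ratio is high so that blocks are distributed evenly across all motion channels, and increases total entropy with MKW scheduling, which prioritizes the low-resolution components of slow motion over the high-resolution components of fast motion, thereby improving delay and image quality on a real TCP/IP packet network.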
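As a rough illustration of why a bounded distribution such as BN(0,1) helps imperceptibility, the sketch below draws watermark samples from a standard normal truncated by rejection sampling and scales each by a per-coefficient tolerance. The truncation bound of ±1, the tolerance values, and the names `bounded_normal` and `embed_watermark` are assumptions made for this example; the record does not state how the thesis defines BN(0,1) or computes the tolerances.

```python
import random


def bounded_normal(rng, bound=1.0):
    """Draw from N(0,1) restricted to [-bound, bound] by rejection sampling.

    The bound is an assumption; the source only names the distribution BN(0,1).
    """
    while True:
        x = rng.gauss(0.0, 1.0)
        if abs(x) <= bound:
            return x


def embed_watermark(coeffs, tolerances, seed=0, bound=1.0):
    """Add a bounded-normal watermark sample, scaled by each coefficient's tolerance.

    `tolerances` stands in for the output of a perceptual masking model
    (hypothetical values here). With bound=1.0 the change to a coefficient
    never exceeds its tolerance, unlike an unbounded N(0,1) sample.
    """
    rng = random.Random(seed)
    marks = [bounded_normal(rng, bound) for _ in coeffs]
    marked = [c + t * w for c, t, w in zip(coeffs, tolerances, marks)]
    return marked, marks


# Hypothetical DCT coefficients and tolerances for one block.
coeffs = [120.0, -35.5, 8.0, 0.0]
tolerances = [2.0, 1.0, 0.5, 0.25]
marked, marks = embed_watermark(coeffs, tolerances, seed=42)
```

Rejection sampling is used only for simplicity; any method of sampling a truncated normal would serve the same purpose.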

Other Bibliographic Information
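Call number DCS 99008
Physical description x, 95 p. : ill. ; 26 cm
Language English
General note Author's name in Korean : 김성환
Advisor's name in English : Heung-Kyu Lee
Advisor's name in Korean : 이흥규
Dissertation Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology : Department of Computer Science,
Bibliography note References : p. 89-95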
Subject Human visual system (인지시각)
Watermarking (워터마킹)
Packet video (패킷 비디오)
Entropy (엔트로피)