The Discrete Cosine Transform (DCT) is considered the most effective transform coding technique for image and video compression. In this work, blocks of image data are converted into the transform domain for more effective coding, using a fast DCT algorithm and a multiplier-accumulator based architecture. An Inverse Discrete Cosine Transform (IDCT) converts the transform-domain data back to the spatial domain. A commonly used block size is 8 x 8 pixels, since it represents a good compromise between coding efficiency and hardware complexity. Because of its effectiveness, many proposed standards include the 8 x 8 DCT in their algorithms, among them the CCITT H.261 recommended standard for p x 64 kb/s (p = 1, 2, ..., 30) visual telephony and the still-image compression standard developed by ISO JPEG.
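As an illustration of the block transform described above, the following is a minimal floating-point sketch of an 8 x 8 DCT and its inverse. It assumes the orthonormal DCT-II definition and evaluates it by direct matrix products; it is not the fast algorithm of this paper, and the function names (dct_matrix, dct2, idct2) are hypothetical.

```python
import math

N = 8  # block size discussed in the text


def dct_matrix(n=N):
    """Orthonormal DCT-II basis matrix C (assumed normalization)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def dct2(block):
    """Forward 2-D DCT: Y = C X C^T (spatial domain to transform domain)."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs):
    """Inverse 2-D DCT: X = C^T Y C (transform domain back to spatial domain)."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)
```

Under this normalization, a block of constant pixel values maps to a single nonzero (DC) coefficient, which is the energy compaction that makes the transform-domain representation easier to code.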
This paper presents the architecture and implementation of a flexible 8 x 8 DCT/IDCT core processor that uses multiplication arithmetic rather than distributed arithmetic. The chip is an experimental prototype implemented with standard cells. The new fast DCT and IDCT algorithms are both implemented on the same chip. The internal clock frequency is half the pixel rate. The chip achieves better accuracy than the CCITT IDCT specification requires.
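To make the idea of multiplication arithmetic concrete, the sketch below computes an 8 x 8 DCT with integer multiply-accumulate steps, applying a 1-D transform first to the rows and then to the columns. It is only a software model under stated assumptions: the 14-bit coefficient precision (FRAC_BITS), the rounding after each 1-D pass, and the names mac_dct_1d and mac_dct_2d are hypothetical and are not taken from the chip, and the direct 64-multiplication 1-D transform stands in for the fast algorithm.

```python
import math

N = 8
FRAC_BITS = 14  # hypothetical coefficient precision, not the chip's word length

# Orthonormal DCT-II coefficients quantized to integers (fixed point).
COEFF = [
    [round((math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N))
           * math.cos((2 * i + 1) * k * math.pi / (2 * N))
           * (1 << FRAC_BITS))
     for i in range(N)]
    for k in range(N)
]


def mac_dct_1d(samples):
    """8-point DCT computed as eight multiply-accumulate loops."""
    out = []
    for k in range(N):
        acc = 0
        for i in range(N):
            acc += COEFF[k][i] * samples[i]  # one multiply-accumulate step
        # Add half an LSB, then shift: round to nearest and drop fraction bits.
        out.append((acc + (1 << (FRAC_BITS - 1))) >> FRAC_BITS)
    return out


def mac_dct_2d(block):
    """Row-column 2-D DCT built from the 1-D multiply-accumulate transform."""
    rows = [mac_dct_1d(row) for row in block]             # transform each row
    cols = [mac_dct_1d(list(col)) for col in zip(*rows)]  # then each column
    return [list(row) for row in zip(*cols)]              # back to [u][v] order
```

Comparing the integer output of such a model against a floating-point reference is one way to see how coefficient word length and intermediate rounding affect accuracy, which is the kind of trade-off an accuracy specification constrains.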