This thesis describes a three-stage pipelined floating-point processing unit (FPU) for the Very Long Instruction Word (VLIW) processor [2], which is aimed at image processing and 3-D graphics. The FPU has three functional units: a floating-point arithmetic logic unit (FPALU), a floating-point multiplier (FPMUL), and a floating-point reciprocal (FPREC) unit.
The FPU has two operation modes, Twin mode and Normal mode. The FPALU and FPMUL are splittable to support Twin mode.
It can achieve a peak performance of 5 operations per clock in Twin mode and 3 operations per clock in Normal mode.
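One way to read these peak figures is sketched below. It assumes that each splittable unit (FPALU, FPMUL) issues two operations per clock in Twin mode and that the FPREC unit always issues one; this per-unit breakdown is an assumption and is not stated explicitly here. The function and constant names are hypothetical.

```python
# Hypothetical model of the peak issue rate. The per-unit breakdown
# (two operations per split unit in Twin mode) is an assumption.
ALL_UNITS = ("FPALU", "FPMUL", "FPREC")
SPLITTABLE_UNITS = ("FPALU", "FPMUL")  # the units described as splittable


def peak_ops_per_clock(mode: str) -> int:
    """Return the assumed peak number of operations per clock for a mode."""
    if mode not in ("twin", "normal"):
        raise ValueError(f"unknown mode: {mode!r}")
    total = 0
    for unit in ALL_UNITS:
        # A split unit is assumed to handle two operations at once.
        if mode == "twin" and unit in SPLITTABLE_UNITS:
            total += 2
        else:
            total += 1
    return total
```

Under these assumptions, the model gives 2 + 2 + 1 operations per clock in Twin mode and 1 + 1 + 1 in Normal mode, matching the stated peaks.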
The FPMUL uses a new multiplier architecture that requires less hardware than a conventional double-precision floating-point multiplier. Nevertheless, it needs only one additional cycle to perform a double-precision multiplication, and it achieves a speedup of two over a conventional multiplier in single precision.