A cascade design that optimizes a number of parameters, resulting in very low effective latency between operations with a sum dependence.
About
Abstract: Stanford researchers have patented an optimized floating-point unit (FPU) design for processors that are more latency sensitive than graphics processing unit (GPU) designs. They have developed a cascade design that optimizes a number of parameters, resulting in very low effective latency between operations with a sum dependence. The result is a floating-point fused multiply-add (FMA) unit that may be implemented in a central processing unit (CPU). The design significantly reduces the accumulation latency of traditional FMA designs and improves total system performance with no energy or area overhead.

Applications: Floating-point units for latency-sensitive CPU designs

Advantages: Improved total system performance, because the average latency of the unit is 20% smaller than that of a traditional design, with no additional area or power consumption
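
To make "operations with a sum dependence" concrete, here is a minimal sketch of an accumulation chain in which each multiply-add needs the previous sum before it can finish. It is not taken from the patented design: the function names, the toy cycle model, and the latency figures (5 and 4 cycles) are assumptions chosen only for illustration, and applying the 20% figure to the accumulate path is likewise an assumption.

```python
def dot_fma_chain(a, b):
    """Accumulate sum(a[i] * b[i]) as a chain of multiply-add steps.

    Each step reads the previous value of acc, so the steps cannot overlap.
    This is the kind of sum dependence the abstract describes.
    """
    acc = 0.0
    for x, y in zip(a, b):
        # A hardware FMA would compute x * y + acc with a single rounding.
        # Plain Python rounds twice; that is enough to show the dependence.
        acc = x * y + acc
    return acc


def chain_cycles(n_ops, accumulate_latency):
    """Toy model: cycles for n_ops back-to-back dependent accumulations."""
    return n_ops * accumulate_latency


# Hypothetical latencies in cycles, not measured values from the design.
BASELINE_LATENCY = 5
REDUCED_LATENCY = 4  # 20% lower than the assumed baseline

if __name__ == "__main__":
    print(dot_fma_chain([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
    print(chain_cycles(1000, BASELINE_LATENCY), chain_cycles(1000, REDUCED_LATENCY))
```

In this toy model the cost of a long dependent chain scales directly with the accumulate latency, which is why a lower latency on that path matters more for a latency-sensitive CPU than for a throughput-oriented design.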