Although floating-point arithmetic is costly in terms of silicon area, the joint design of hardware for floating-point and integer arithmetic is seldom considered. While components such as multipliers and adders can potentially be shared, floating-point and integer units in contemporary processors are practically disjoint. Sabrewing tightly integrates floating-point and integer arithmetic in a single datapath. Because the architecture is mainly intended for use in low-power embedded digital signal processors, the following design constraints were adhered to during its implementation: limited use of pipelining, for the convenience of the compiler; compatibility with existing technology, for easy integration; and minimal area and power consumption, for applicability in embedded systems. The architecture is tailored to digital signal processing by combining floating-point fused multiply-add and integer multiply-accumulate. It could be deployed in a multi-core system-on-chip designed to support applications both with and without a dominance of floating-point calculations.
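To illustrate why a fused multiply-add differs from a multiplication followed by an addition, the following Python sketch models both in IEEE-754 single precision. It is not derived from the Sabrewing sources; the helper names `to_f32`, `mul_then_add` and `fused_multiply_add` are made up for this example. The fused model relies on the assumption that `a * b + c` is exactly representable in double precision, which holds for the inputs chosen here but not for arbitrary operands.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def mul_then_add(a: float, b: float, c: float) -> float:
    """Unfused: the product is rounded to binary32 before the addition."""
    return to_f32(to_f32(a * b) + c)

def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Fused: a single rounding of a*b + c.

    Assumption: a*b + c is exact in binary64 for the operands used,
    so the only rounding that happens is the final one to binary32.
    """
    return to_f32(a * b + c)

# Hypothetical operands, all exactly representable in binary32.
a = b = 1.0 + 2.0 ** -12
c = -(1.0 + 2.0 ** -11)

# Exact product: 1 + 2^-11 + 2^-24. The 2^-24 term is half a unit in the
# last place of binary32, so rounding the product first discards it.
unfused = mul_then_add(a, b, c)
fused = fused_multiply_add(a, b, c)
print(unfused, fused)
```

With these operands the unfused result is zero, whereas the fused result keeps the low-order term of the product. This is the kind of difference that makes a fused datapath harder to check against a reference built from separate multiply and add operations.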

The VHDL structural description of this architecture is available for download under a BSD license. Besides being configurable at design time, it has been thoroughly checked for IEEE-754 compliance by means of a floating-point test suite originating from the IBM Research Labs. A proof-of-concept has also been implemented using STMicroelectronics 65 nm technology. This prototype supports 32-bit signed two’s complement integers, IEEE-754 single-precision floating-point numbers, and an extended-precision 41-bit floating-point format (8-bit exponent and 32-bit significand). Evaluations show area savings of 19% compared to a reference design in which floating-point and integer arithmetic are implemented separately. The area overhead caused by combining floating-point and integer arithmetic is less than 5%.
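The two floating-point formats can be compared with a small decoder, sketched below in Python. The field widths (1 sign bit, 8 exponent bits, and 23 or 32 significand bits) follow from the text; everything else about the 41-bit format is an assumption made for illustration, in particular that it uses an IEEE-754-style exponent bias and an implicit leading one, as single precision does. The function name `decode` is hypothetical and not part of the Sabrewing sources.

```python
from fractions import Fraction

def decode(bits: int, exp_bits: int, frac_bits: int) -> Fraction:
    """Decode a sign | exponent | fraction bit pattern into an exact value.

    Assumes IEEE-754 conventions (biased exponent, implicit leading one,
    subnormals at exponent zero). Whether the 41-bit Sabrewing format
    follows all of these conventions is an assumption of this sketch.
    """
    frac = bits & ((1 << frac_bits) - 1)
    exp = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    sign = -1 if (bits >> (frac_bits + exp_bits)) & 1 else 1
    bias = (1 << (exp_bits - 1)) - 1
    if exp == (1 << exp_bits) - 1:
        raise ValueError("infinity or NaN is not a finite value")
    if exp == 0:
        # Subnormal: no implicit leading one, minimum exponent.
        return sign * Fraction(frac, 1 << frac_bits) * Fraction(2) ** (1 - bias)
    return sign * (1 + Fraction(frac, 1 << frac_bits)) * Fraction(2) ** (exp - bias)

# IEEE-754 single precision: 1 + 8 + 23 = 32 bits.
single_one = decode(0x3F800000, exp_bits=8, frac_bits=23)
single_neg = decode(0xC0200000, exp_bits=8, frac_bits=23)

# 41-bit layout under the stated assumptions: 1 + 8 + 32 = 41 bits.
# Smallest value above 1.0: exponent field 127, lowest fraction bit set.
extended_next = decode((127 << 32) | 1, exp_bits=8, frac_bits=32)
print(single_one, single_neg, extended_next)
```

Under these assumptions the wider format keeps the exponent range of single precision and spends its nine extra bits on a finer significand.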

Implemented in STMicroelectronics general-purpose CMOS technology, the design can operate at a frequency of 1.35 GHz, while 667 MHz can be achieved in low-power CMOS. Considering that the entire datapath is partitioned into just three pipeline stages, and that the design is mainly intended for use in low-power applications, these frequencies are adequate. They are in fact competitive with current-technology low-power floating-point units. Post-layout estimates indicate that the required area of a low-power implementation can be as small as 0.04 mm². Power consumption is on the order of several milliwatts. Given that technology-specific clock gating could further reduce Sabrewing's power consumption, we think that shared floating-point and integer architectures are a good choice for signal processing and low-power systems in general.

The Sabrewing architecture was published in ACM Transactions on Architecture and Code Optimization. A more in-depth (although by now somewhat outdated) description of the architecture can be found here.

The Sabrewing source can be downloaded here.

IEEE-1076 Fused Multiply-Add verification

Verification of floating-point units is a complex task, even more so if the design is to be fully compliant with the IEEE-754 (2008) standard for floating-point arithmetic. During the development of Sabrewing this was a considerable obstacle, especially because Sabrewing supports a custom (non-IEEE) floating-point format and because its datapath is based on fused multiply-add. To overcome these difficulties, the recently standardized floating-point functionality for VHDL (i.e., IEEE-1076) was used as a reference. However, it turns out that this floating-point support package is currently not compliant with the IEEE-754 standard for floating-point arithmetic.

A test, based on IBM's FPGen floating-point test suite, that points out the inconsistencies between IEEE-1076 and IEEE-754 can be downloaded here.
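The general idea of checking an implementation against a reference can be sketched in Python as follows. This is not the downloadable test and it does not use FPGen's vector format; the test vectors, the function names, and the deliberately non-compliant `candidate_fma` are all hypothetical. The reference computes `a*b + c` exactly with rational arithmetic and rounds once to single precision (round to nearest, ties to even); signed zeros, NaN inputs and exception flags are ignored for brevity.

```python
from fractions import Fraction

F32_MAX = Fraction(2 ** 24 - 1) * Fraction(2) ** 104  # largest finite binary32

def round_to_f32(x: Fraction) -> float:
    """Round an exact rational to the nearest binary32 value, ties to even."""
    if x == 0:
        return 0.0
    sign, m = (-1, -x) if x < 0 else (1, x)
    # Find e with 2**e <= m < 2**(e + 1).
    e = m.numerator.bit_length() - m.denominator.bit_length()
    if Fraction(2) ** e > m:
        e -= 1
    e = max(e, -126)                    # subnormals share the minimum exponent
    quantum = Fraction(2) ** (e - 23)   # spacing of binary32 values at exponent e
    value = round(m / quantum) * quantum  # round() on a Fraction is half-to-even
    if value > F32_MAX:
        return sign * float("inf")
    return float(sign * value)          # exact: at most 25 significant bits

def reference_fma(a: float, b: float, c: float) -> float:
    """Exact a*b + c followed by a single rounding."""
    return round_to_f32(Fraction(a) * Fraction(b) + Fraction(c))

def candidate_fma(a: float, b: float, c: float) -> float:
    """Hypothetical faulty unit: rounds the product before adding."""
    product = round_to_f32(Fraction(a) * Fraction(b))
    return round_to_f32(Fraction(product) + Fraction(c))

def find_mismatches(candidate, vectors):
    """Return the vectors on which the candidate disagrees with the reference."""
    return [v for v in vectors if candidate(*v) != reference_fma(*v)]

# Hypothetical vectors; every operand is exactly representable in binary32.
vectors = [
    (1.0 + 2.0 ** -12, 1.0 + 2.0 ** -12, -(1.0 + 2.0 ** -11)),
    (1.5, 2.0, 0.25),
    (3.0, 0.5, -1.5),
]
mismatches = find_mismatches(candidate_fma, vectors)
print(mismatches)
```

Only the first vector is reported, because it is the only one whose exact product needs more than 24 significant bits. Targeted vectors of this kind expose rounding discrepancies that simple operands never trigger.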
