## Sabrewing

Although floating-point arithmetic is costly in terms of silicon area, the joint design of hardware for floating-point and integer arithmetic is seldom considered. While components such as multipliers and adders could be shared, the floating-point and integer units in contemporary processors are practically disjoint. Sabrewing tightly integrates floating-point and integer arithmetic in a single datapath. Because the architecture is mainly intended for low-power embedded digital signal processors, the following design constraints were adhered to during its implementation: limited use of pipelining, for the convenience of the compiler; compatibility with existing technology, for easy integration; and minimal area and power consumption, for applicability in embedded systems. The architecture is tailored to digital signal processing by combining floating-point fused multiply-add and integer multiply-accumulate. It could be deployed in a multi-core system-on-chip designed to support applications both with and without a dominance of floating-point calculations.

The VHDL structural description of this architecture is available for download under a BSD license. Besides being configurable at design time, it has been thoroughly checked for IEEE-754 compliance by means of a floating-point test suite originating from the IBM Research Labs. A proof-of-concept has also been implemented using STMicroelectronics 65 nm technology. This prototype supports 32-bit signed two's complement integers, IEEE-754 single-precision floating-point numbers, and an extended-precision 41-bit floating-point format (8-bit exponent and 32-bit significand). Evaluations show area savings of 19% compared to a reference design in which floating-point and integer arithmetic are implemented separately. The area overhead caused by combining floating-point and integer arithmetic is less than 5%.
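To make the 41-bit extended format more concrete, the following Python sketch decodes such a word and widens an IEEE-754 single-precision bit pattern into it. The field layout is an assumption made for illustration and is not taken from the Sabrewing source: one sign bit, the 8-bit exponent with a bias of 127 (as in single precision), and the 32 significand bits stored as a fraction behind an implicit leading one. The names `decode_ext41` and `single_to_ext41` are hypothetical.

```python
from fractions import Fraction

# Assumed layout (illustrative only): [sign:1][exponent:8][fraction:32]
EXP_BITS = 8
FRAC_BITS = 32
BIAS = 127  # assumption: same bias as IEEE-754 single precision


def decode_ext41(word: int) -> Fraction:
    """Return the exact value of a normal 41-bit number.

    Zeros, subnormals, infinities and NaNs are not handled in this sketch.
    """
    if not 0 <= word < (1 << (1 + EXP_BITS + FRAC_BITS)):
        raise ValueError("word does not fit in 41 bits")
    sign = word >> (EXP_BITS + FRAC_BITS)
    exponent = (word >> FRAC_BITS) & ((1 << EXP_BITS) - 1)
    fraction = word & ((1 << FRAC_BITS) - 1)
    if exponent in (0, (1 << EXP_BITS) - 1):
        raise ValueError("special values are outside the scope of this sketch")
    # Assumed implicit leading one in front of the stored fraction.
    significand = Fraction((1 << FRAC_BITS) | fraction, 1 << FRAC_BITS)
    value = significand * Fraction(2) ** (exponent - BIAS)
    return -value if sign else value


def single_to_ext41(bits32: int) -> int:
    """Widen a single-precision bit pattern by zero-padding its fraction."""
    sign = bits32 >> 31
    exponent = (bits32 >> 23) & 0xFF
    fraction = bits32 & 0x7FFFFF
    # The 23-bit fraction is left-aligned inside the 32-bit fraction field.
    return (sign << 40) | (exponent << 32) | (fraction << (FRAC_BITS - 23))


# 0x3FC00000 is the single-precision encoding of 1.5.
one_and_a_half = decode_ext41(single_to_ext41(0x3FC00000))
```

Under these assumptions the exponent range is identical to single precision, so widening a normal single-precision value is exact and only the extra nine fraction bits add precision.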

Implemented in STMicroelectronics general-purpose CMOS technology, the design can operate at a frequency of 1.35 GHz, while 667 MHz can be achieved in low-power CMOS. Considering that the entire datapath is partitioned into just three pipeline stages, and that the design is mainly intended for low-power applications, these frequencies are adequate. They are in fact competitive with current low-power floating-point units. Post-layout estimates indicate that the required area of a low-power implementation can be as small as 0.04 mm². Power consumption is on the order of several milliwatts. Given that technology-specific clock gating could further reduce Sabrewing's power consumption, we think that shared floating-point and integer architectures are a good choice for signal processing and low-power systems in general.

The Sabrewing architecture was published in ACM Transactions on Architecture and Code Optimization. A more in-depth (although by now somewhat outdated) description of the architecture can be found here.

*The Sabrewing source can be downloaded here.*

## IEEE-1076 Fused Multiply-Add verification

Verification of floating-point units is a complex task, even more so if the design is to be fully compliant with the IEEE-754 (2008) standard for floating-point arithmetic. During the development of Sabrewing this was a considerable obstacle, especially because Sabrewing supports a custom (non-IEEE) floating-point format and its datapath is based on fused multiply-add. To overcome these difficulties, the recently standardized floating-point functionality for VHDL (i.e., IEEE-1076) was used as a reference. However, it turns out that this floating-point support package is currently not compliant with the IEEE-754 standard for floating-point arithmetic.
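One reason a fused multiply-add datapath needs its own reference model is that IEEE-754 (2008) requires a fused operation to round only once, after the exact product and sum, whereas a separate multiply followed by an add rounds twice. The Python sketch below illustrates the difference with exact rational arithmetic. It is not the FPGen-based test or the VHDL package: `round_to_precision` is a hypothetical helper that models round-to-nearest-even for a 24-bit significand and, as a simplifying assumption, ignores the exponent range (no overflow or subnormals).

```python
from fractions import Fraction


def round_to_precision(x: Fraction, p: int = 24) -> Fraction:
    """Round x to the nearest value with a p-bit significand, ties to even.

    Assumption: unbounded exponent range, so overflow and subnormals
    are not modelled.
    """
    if x == 0:
        return Fraction(0)
    sign = -1 if x < 0 else 1
    m = abs(x)
    # First guess for the exponent e with 2**(p-1) <= m / 2**e < 2**p.
    e = m.numerator.bit_length() - m.denominator.bit_length() - p
    scaled = m / Fraction(2) ** e
    while scaled >= 2 ** p:
        e += 1
        scaled /= 2
    while scaled < 2 ** (p - 1):
        e -= 1
        scaled *= 2
    # q is the truncated significand, r / denominator the discarded part.
    q, r = divmod(scaled.numerator, scaled.denominator)
    if 2 * r > scaled.denominator or (2 * r == scaled.denominator and q & 1):
        q += 1
    return sign * q * Fraction(2) ** e


# Operands chosen so that the exact product lies halfway between two
# representable values: (1 + 2**-12)**2 = 1 + 2**-11 + 2**-24.
a = b = 1 + Fraction(1, 2 ** 12)
c = -(1 + Fraction(1, 2 ** 11))

separate = round_to_precision(round_to_precision(a * b) + c)  # two roundings
fused = round_to_precision(a * b + c)                         # one rounding
```

With these operands the separate multiply and add returns zero, because the tie in the product is rounded away before the addition, while the fused result keeps the residual 2^-24. A reference model that rounds the intermediate product therefore cannot be used to check a fused datapath.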

A test based on IBM's FPGen floating-point test suite, which points out the inconsistencies between IEEE-1076 and IEEE-754, can be downloaded here.