How I Nearly Had My Own “Subtract Bug” in a CPU Design

Filed under: Functional Verification, verification, debug, bugs, corner cases, subtract, divide, add, Cydrome, subtract bug

In a recent blog post, I talked about learning a public lesson on the importance of software verification while an intern at Digital Equipment Corporation (DEC). Since I spent most of my early career as a logic designer, not a programmer, I figure that an example of a corner-case condition from that part of my life would also be nice to share. This story will doubtless remind you of a well-known "divide bug" that appeared in a certain microprocessor in the mid-90s.

From 1985 to 1988, I worked at Cydrome, a mini-supercomputer startup whose very-long-instruction-word (VLIW) machine had quite a few novel aspects. I initially worked on the floating-point unit, with primary responsibility for the adder/subtractor. Anyone who has worked with floating-point numbers using the IEEE 754 standard knows that subtraction of two numbers that are close in value can result in a number that is "denormalized" with leading zeros in the mantissa. The usual way of handling this situation is to shift the result mantissa left to eliminate the leading zeros while decrementing the exponent correspondingly.
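To make the effect concrete, here is a small Python sketch (my own illustration, not Cydrome code) that subtracts two nearby doubles at the integer-mantissa level and counts the leading zeros the normalizer would have to shift away:

```python
import math

# Two doubles that agree in their top 20 mantissa bits (illustrative
# values; any pair this close in value behaves the same way).
a = 1.0 + 2**-20
b = 1.0

# Recover the integer mantissas, aligned to a common exponent --
# roughly what the adder/subtractor datapath sees.
P = 53                                  # double-precision mantissa width
ma, ea = math.frexp(a)                  # a == ma * 2**ea, 0.5 <= ma < 1
mb, eb = math.frexp(b)
ia = int(ma * (1 << P))                 # integer mantissa of a
ib = int(mb * (1 << P)) >> (ea - eb)    # b's mantissa, aligned to a
diff = ia - ib

leading_zeros = P - diff.bit_length()
print(leading_zeros)                    # 20
```

The raw difference has 20 leading zeros, so a conventional datapath would shift the result mantissa left 20 places and decrement the exponent by 20 to renormalize it.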

It's also necessary before an add or subtract operation to align the two operands, generally by shifting the smaller operand right while increasing its exponent. My colleague Craig Nelson had the clever idea of merging the post-operation normalization into the pre-operation alignment to reduce overall latency. He developed a slick algorithm to predict the denormalization shift, accurate to within one bit. Thus, we could replace the slow, complex result-mantissa shifter and exponent decrementer with a fast, simple multiplexer.

Craig developed a proof for his algorithm that seemed solid to all of us who reviewed it, but of course it was still important to verify my logic implementation. This verification mattered all the more because the algorithm's implementation was non-intuitive, involving what seemed like random logic operations on random bits of the two operands. This is not always the case in logic design; for example, the following well-known equations for a four-bit carry-look-ahead adder have a clear pattern that can be verified by inspection:

    C1 = G0 + P0 * C0
    C2 = G1 + G0 * P1 + C0 * P0 * P1
    C3 = G2 + G1 * P2 + G0 * P1 * P2 + C0 * P0 * P1 * P2
    C4 = G3 + G2 * P3 + G1 * P2 * P3 + G0 * P1 * P2 * P3 + C0 * P0 * P1 * P2 * P3
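That inspection can also be done mechanically. The following Python sketch (mine, for illustration) evaluates the four equations literally, with Gi = Ai AND Bi and Pi = Ai OR Bi per bit, and exhaustively compares each carry against an ordinary binary addition:

```python
def cla_carries(a, b, c0):
    """Evaluate C1..C4 from the carry-look-ahead equations above
    for 4-bit operands a, b and carry-in c0."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]    # generate:  Gi = Ai AND Bi
    p = [((a >> i) | (b >> i)) & 1 for i in range(4)]  # propagate: Pi = Ai OR Bi
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (g[0] & p[1]) | (c0 & p[0] & p[1])
    c3 = (g[2] | (g[1] & p[2]) | (g[0] & p[1] & p[2])
          | (c0 & p[0] & p[1] & p[2]))
    c4 = (g[3] | (g[2] & p[3]) | (g[1] & p[2] & p[3])
          | (g[0] & p[1] & p[2] & p[3])
          | (c0 & p[0] & p[1] & p[2] & p[3]))
    return (c1, c2, c3, c4)

# Compare each Ci against the carry out of the low i bits of a
# plain addition, over all 16 * 16 * 2 input combinations.
for a in range(16):
    for b in range(16):
        for c0 in (0, 1):
            ref = tuple(
                ((a & ((1 << i) - 1)) + (b & ((1 << i) - 1)) + c0) >> i
                for i in range(1, 5)
            )
            assert cla_carries(a, b, c0) == ref
print("all 512 cases match")
```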

In contrast, an actual fragment of my gate-level adder schematic (this was before commercial RTL synthesis) had no discernible pattern in terms of which bits of Bus A and Bus B were combined in the various gates.

We took a two-step approach to verifying this unusual design. First, Craig rigged up a program that generated random floating-point values with random add and subtract operations. The resulting calculations were performed on the Apple Macintosh, one of the few commercial implementations of the IEEE standard available at that time, and compared against the results from a C implementation of the algorithm. I then took a subset of these tests and ran them against my implementation in logic simulation, using a simple testbench that fed in the operands and operations and then checked the results.
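In the same spirit, here is a hedged sketch of that methodology in Python (the names and the toy adder are my own inventions; Python's native doubles stand in for the Macintosh golden model). A simplified align/add/normalize datapath is hammered with random operands and any miscompares are counted:

```python
import math
import random

def toy_fp_add(x, y):
    # A deliberately simplified align/add/normalize datapath, standing in
    # for the "device under test". Exact only when alignment discards no
    # mantissa bits, which the operand generator below guarantees.
    if x == 0.0 or y == 0.0:
        return x + y
    P = 60                                # internal mantissa width
    mx, ex = math.frexp(x)                # x == mx * 2**ex, 0.5 <= |mx| < 1
    my, ey = math.frexp(y)
    ix = int(mx * (1 << P))               # signed integer mantissas
    iy = int(my * (1 << P))
    # Alignment: shift the smaller-exponent operand right.
    if ex < ey:
        ix >>= min(ey - ex, P + 1)
        e = ey
    else:
        iy >>= min(ex - ey, P + 1)
        e = ex
    s = ix + iy                           # the add/subtract itself
    return math.ldexp(s, e - P)           # renormalize on the way out

# Random test generation: small integer mantissas and nearby exponents,
# with adds and subtracts mixed in via signed operands.
random.seed(1988)
miscompares = 0
for _ in range(10_000):
    x = math.ldexp(random.randint(-(1 << 20), 1 << 20), random.randint(-5, 5))
    y = math.ldexp(random.randint(-(1 << 20), 1 << 20), random.randint(-5, 5))
    if toy_fp_add(x, y) != x + y:         # native doubles as golden model
        miscompares += 1
print(miscompares)
```

Restricting the operands keeps the toy datapath exact, so every miscompare here would indicate a genuine bug in it, just as in the original flow.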

Quite late in the process, at the point where I had fairly high confidence in the correctness of my implementation, logic simulation reported a miscompare with one expected result. After spending a couple of hours tracking the problem down, I found the bug -- a single mis-numbered "ripper" on a single bit of one bus on one of the eighteen pages with logic similar to the fragment above. I have to admit: that bug shook me up. A simple typo that I had missed on repeated visual inspection of the schematics had also slipped through a lot of test cases. I was fortunate that the random tests happened to catch the bug, and that I had continued verification long enough for this catch to occur.

When the infamous microprocessor "divide bug" cropped up in the industry a few years later, I had a strong sense of déjà vu. As with my subtract bug, the vast majority of operands would work just fine, but every once in a while the answer would be wrong. We usually think of corner cases in terms of combinations of control signals, or of obvious data values such as min and max, but with some designs the corner cases are not at all intuitive. The only way to catch them, of course, is to verify, verify, and verify some more.

Tom A.

The truth is out there...sometimes it's in a blog.



