In a recent blog post,
I talked about learning a public lesson on the importance of software
verification while an intern at Digital Equipment Corporation (DEC). Since I
spent most of my early career as a logic designer, not a programmer, I thought
that an example of a corner-case condition from that part of my life would
be nice to share. This story will doubtless remind you of a well-known "divide
bug" that appeared in a certain microprocessor in the mid-90s.
From 1985 to 1988, I worked at
Cydrome, a mini-supercomputer startup whose very-long-instruction-word
architecture had quite a few novel aspects. I initially worked on the
floating-point unit, with primary responsibility for the adder/subtractor.
Anyone who has worked with floating-point numbers using the IEEE 754 standard
knows that subtraction of two numbers that are close in value can result in a
number that is "denormalized" with leading zeros in the mantissa. The usual
way of handling this situation is to shift the result mantissa left to
eliminate the leading zeros while decrementing the exponent correspondingly.
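The conventional post-subtraction normalization just described can be sketched in a few lines. This is only an illustrative model, not the Cydrome design: the 24-bit mantissa width, the integer representation, and the function name `normalize` are all assumptions, and rounding and exponent underflow are ignored.

```python
MANTISSA_BITS = 24  # assumed width: 23 fraction bits plus the leading 1


def normalize(mantissa, exponent):
    """Shift left until the top mantissa bit is set, decrementing the exponent."""
    if mantissa == 0:
        return 0, 0  # a true zero has no leading 1 to find
    top_bit = 1 << (MANTISSA_BITS - 1)
    while (mantissa & top_bit) == 0:
        mantissa <<= 1   # drop one leading zero
        exponent -= 1    # keep the represented value unchanged
    return mantissa, exponent


# Two operands that are close in value and share an exponent of 5:
a_mantissa = 0x800001
b_mantissa = 0x800000
difference = a_mantissa - b_mantissa  # only the lowest bit survives

result_mantissa, result_exponent = normalize(difference, 5)
```

In this model the difference has 23 leading zeros, so the loop runs 23 times. In hardware that serial loop becomes a wide shifter plus exponent arithmetic, which is why the step is costly.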
It's also necessary before an add or
subtract operation to align the two operands, generally by shifting the
smaller operand right while increasing its exponent. My colleague Craig Nelson
had the clever idea of merging the post-operation normalization into the
pre-operation alignment to reduce overall latency. He developed a slick
algorithm to predict when denormalization would occur, accurate to within one
bit. We could then replace the slow, complex result mantissa shifter and
exponent decrementer with a fast, simple multiplexer.
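For reference, the conventional pre-operation alignment step can be sketched as follows. This shows only the textbook alignment, not Craig's merged prediction scheme, whose details are not given here. The function name `align` and the integer mantissa model are assumptions, and bits shifted out on the right are simply discarded (no guard, round, or sticky bits).

```python
def align(mant_a, exp_a, mant_b, exp_b):
    """Shift the smaller-exponent operand right so both share the larger exponent.

    Returns (aligned_mant_a, aligned_mant_b, common_exponent).
    """
    if exp_a >= exp_b:
        shift = exp_a - exp_b
        return mant_a, mant_b >> shift, exp_a
    shift = exp_b - exp_a
    return mant_a >> shift, mant_b, exp_b


# Operand A has exponent 3, operand B has exponent 1, so B moves right by 2.
aligned = align(0xC00000, 3, 0x800000, 1)
```

Once both mantissas share an exponent, they can be added or subtracted as plain integers.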
Craig developed a proof for his
algorithm that seemed solid to all of us who reviewed it, but of course it was
still important to verify my logic implementation. This verification was even
more important because one of the interesting aspects of the algorithm was that
its implementation was non-intuitive, involving what seemed like random
operations on random bits of the two operands. This is not always the case in
logic design; for example, the following well-known equations for a
carry-look-ahead adder have a clear pattern that can be verified by inspection:
C1 = G0 + P0 * C0
C2 = G1 + G0 * P1 + C0 * P0 * P1
C3 = G2 + G1 * P2 + G0 * P1 * P2 + C0 * P0 * P1 * P2
C4 = G3 + G2 * P3 + G1 * P2 * P3 + G0 * P1 * P2 * P3 + C0 * P0 * P1 * P2 * P3
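Because the pattern is regular, these equations are also easy to check exhaustively. The sketch below assumes the usual definitions, which the text does not spell out: Gi = Ai AND Bi (generate) and Pi = Ai OR Bi (propagate). It compares the look-ahead carries against a simple ripple-carry model for every 4-bit input combination. The function names are illustrative.

```python
from itertools import product


def lookahead_carries(a, b, c0):
    """Return [C1, C2, C3, C4] for 4-bit a and b using the equations above."""
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(4)]  # generate terms
    p = [((a >> i) & 1) | ((b >> i) & 1) for i in range(4)]  # propagate terms
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (g[0] & p[1]) | (c0 & p[0] & p[1])
    c3 = g[2] | (g[1] & p[2]) | (g[0] & p[1] & p[2]) | (c0 & p[0] & p[1] & p[2])
    c4 = (g[3] | (g[2] & p[3]) | (g[1] & p[2] & p[3])
          | (g[0] & p[1] & p[2] & p[3]) | (c0 & p[0] & p[1] & p[2] & p[3]))
    return [c1, c2, c3, c4]


def ripple_carries(a, b, c0):
    """Reference model: propagate the carry one bit position at a time."""
    carries = []
    carry = c0
    for i in range(4):
        total = ((a >> i) & 1) + ((b >> i) & 1) + carry
        carry = total >> 1
        carries.append(carry)
    return carries


# All 16 * 16 * 2 = 512 input combinations.
mismatches = [
    (a, b, c0)
    for a, b, c0 in product(range(16), range(16), range(2))
    if lookahead_carries(a, b, c0) != ripple_carries(a, b, c0)
]
```

An empty `mismatches` list means the two models agree on every input. A 4-bit slice is small enough to cover completely, which is not true of a full floating-point datapath.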
In contrast, the following actual
fragment of my gate-level adder schematic (this was before commercial logic
synthesis) has no discernible pattern in terms of which bits of Bus A
and Bus B are combined in the various gates:
We took a two-step approach to
verifying this unusual design. First, Craig rigged up a program that generated
random floating-point values with random add and subtract operations. The
resulting calculations were performed on the Apple Macintosh, one of the
few commercial implementations of the IEEE standard available at that time,
and compared against the results from a C implementation of the algorithm.
I then took a subset of these test cases and ran them against my
implementation in logic simulation using a testbench that fed in the operands
and operations and then checked the results.
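The general shape of this kind of random differential testing can be sketched as below. It is not a reconstruction of Craig's program or of my testbench: the function names, the operand ranges, and the use of host arithmetic rounded to single precision as the trusted reference are all assumptions made for illustration.

```python
import random
import struct


def to_float32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def reference(op, a, b):
    """Trusted model: host arithmetic, rounded to single precision."""
    return to_float32(a + b if op == "add" else a - b)


def random_operand(rng):
    """Random sign, mantissa, and a modest range of exponents (assumed range)."""
    return to_float32(rng.uniform(-1.0, 1.0) * 2.0 ** rng.randint(-20, 20))


def run_random_tests(dut, count, seed=0):
    """Feed random operands and operations to `dut`; return every miscompare."""
    rng = random.Random(seed)  # fixed seed so a failure can be reproduced
    miscompares = []
    for _ in range(count):
        a = random_operand(rng)
        b = random_operand(rng)
        op = rng.choice(["add", "sub"])
        expected = reference(op, a, b)
        actual = dut(op, a, b)
        if actual != expected:
            miscompares.append((op, a, b, expected, actual))
    return miscompares


def always_add(op, a, b):
    """Deliberately broken stand-in for a design under test (hypothetical)."""
    return reference("add", a, b)


clean_run = run_random_tests(reference, 1000)
broken_run = run_random_tests(always_add, 1000)
```

Recording the operands and the expected result for each miscompare matters as much as detecting it, since that record is the starting point for tracking the failure down in simulation.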
Quite late in the process, at the
point where I had fairly high confidence in the correctness of my
implementation, logic simulation reported a miscompare with one expected
result. After a couple of hours tracking the problem down, I found the bug: a
single mis-numbered "ripper" on a single bit of one bus on one of the eighteen
pages with logic similar to the fragment above. I have to admit: that bug
shook me up. A simple typo that I had missed on repeated visual inspection of
the schematics had slipped through a lot of test cases. I was fortunate that
the random tests happened to catch the bug, and that I had continued
verification long enough for this catch to occur.
When the infamous microprocessor
"divide bug" cropped up in the industry a few years later, I had a sense
of déjà vu. As with my subtract bug,
the vast majority of operands would work just fine, but every once in a while
the answer would be wrong. We usually think of corner cases in terms of unusual
combinations of control signals, or of obvious data values such as min and max,
but with some designs the corner cases are not at all intuitive. The only way
to catch them, of course, is to verify, verify, and verify some more.
The truth is out there...sometimes
it's in a blog.