Everybody is looking to reduce their chip's power
consumption these days. Often a large reduction is needed in order to fit within
the desired power envelope. Until now, designers of chips for wireless
applications formed the majority of the power management community. These days,
power management applies to all kinds of chips, both wireless and wired.
Shutting down functional blocks is a great technique to reduce
power, but there are many blocks that cannot be shut down long enough to deliver
power savings that justify the associated overhead of this technique.
One under-utilized technique is reducing supply voltage on
domains that have performance margin. This can significantly reduce active
power - and leakage as well - and does not require any additional functional
verification. However, it is difficult to figure out which domains can have
their voltages reduced, and by how much, and still meet timing. There just
isn't enough time to run all the experiments necessary to figure it out,
especially if your synthesis methodology is bottom-up. This is why we developed
Design Explorer (aka "DEX") in RTL Compiler. We're starting to see some
results now:
USB chip
This design targets a 65nm multi-vt library. The default
synthesis flow used a single 1.08v domain. By creating an extra domain and
letting DEX explore both domains across 0.8v, 0.9v, and 1.08v (so, that's 3^2,
or 9 exploration runs), we found that this design could meet timing with one of
the domains at 1.08v and one at 0.8v. This resulted in a 51% reduction in
active power and a 60% reduction in leakage, with a 1% area increase and the
same performance. That's a pretty significant ROI.
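
The nine runs above are simply every way of assigning one of the three voltages to each of the two domains. As a rough illustration of how the scenario count grows, here is a minimal Python sketch. It is not how DEX is driven; the function name `enumerate_scenarios` and the domain names `PD1`, `PD2`, and so on are hypothetical placeholders, and only the voltage values and counts come from the examples in this post.

```python
from itertools import product


def enumerate_scenarios(domains, voltages):
    """Return every assignment of one supply voltage to each power domain."""
    return [
        dict(zip(domains, combo))
        for combo in product(voltages, repeat=len(domains))
    ]


# Two domains, three candidate voltages (domain names are made up).
usb_scenarios = enumerate_scenarios(["PD1", "PD2"], [0.8, 0.9, 1.08])
print(len(usb_scenarios))  # 3**2 scenarios

# The same helper reproduces the other counts quoted in this post.
three_by_four = enumerate_scenarios(["PD1", "PD2", "PD3"], [0.8, 0.9, 0.99, 1.08])
four_by_three = enumerate_scenarios(["PD1", "PD2", "PD3", "PD4"], [0.75, 0.9, 1.1])
print(len(three_by_four), len(four_by_three))  # 4**3 and 3**4 scenarios
```

The count is the number of voltages raised to the number of domains, so every extra domain multiplies the number of runs. That is why running these experiments by hand gets out of reach so quickly.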
Wired networking chip
This chip has an aggressive performance target and cannot
leverage shutting down blocks. It targets a 45nm multi-vt library, and had been
using only a single 1.08v domain. Partitioning the major functional blocks into
power domains resulted in a total of 3 domains. Letting DEX explore across 4
libraries - 0.8v, 0.9v, 0.99v, and 1.08v - means that it explored 64 different
scenarios in parallel. The scenario that best balanced performance, area,
and power savings was one in which one domain was at 1.08v, one at 0.9v, one
at 0.99v, and one at 0.8v. This delivered an active power savings of 17% and a
leakage power savings of 29%, at a cost of a 1% area increase. Timing slack was
-10ps in this scenario, but that was deemed to be close enough that it could be
closed in physical implementation.
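
Picking a winner from that many runs comes down to filtering on timing and area and then ranking on power. The sketch below shows one way such a selection could be written. It is an assumption for illustration only, not DEX's actual ranking logic: the `Result` record, the `pick_scenario` function, the thresholds, and every number in the sample data are hypothetical placeholders, not measurements from this design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    voltages: tuple   # supply voltage per domain, in volts
    slack_ps: float   # worst timing slack; negative means a violation
    active: float     # active power, arbitrary units
    leakage: float    # leakage power, arbitrary units
    area: float       # cell area, arbitrary units


def pick_scenario(results, baseline_area, min_slack_ps=-10.0, max_area_growth=0.02):
    """Keep scenarios within the slack and area limits, then minimize total power."""
    acceptable = [
        r for r in results
        if r.slack_ps >= min_slack_ps
        and r.area <= baseline_area * (1 + max_area_growth)
    ]
    # Returns None when no scenario passes the filters.
    return min(acceptable, key=lambda r: r.active + r.leakage, default=None)


# Placeholder data, invented purely to exercise the function.
results = [
    Result((1.0, 1.0), slack_ps=5.0, active=100.0, leakage=10.0, area=1000.0),
    Result((1.0, 0.9), slack_ps=-10.0, active=85.0, leakage=8.0, area=1010.0),
    Result((0.8, 0.8), slack_ps=-250.0, active=60.0, leakage=4.0, area=1100.0),
]
best = pick_scenario(results, baseline_area=1000.0)
print(best.voltages)
```

The slack tolerance is the interesting knob. Setting it slightly negative encodes the judgment call described above, that a small violation can be closed later in physical implementation.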
Storage chip
Another high-performance 65nm chip, this one was already
utilizing a multi-supply approach. However, there were only 2 power domains, one
at 1.1v and the other at 0.9v. This was a conservative approach to save some
power while ensuring that performance could still be met. For the
DEX explorations, we carved out two additional power domains, and explored
across 0.75v, 0.9v, and 1.1v. DEX explored 81 different scenarios (4
domains, 3 voltages, so 3^4)! The scenario that delivered the desired balance of
performance, power, and area had the 1.1v domain remaining at 1.1v, and the
other three were able to meet timing at 0.75v. Area increased by 2% in this
scenario, but active power decreased by 36%, and leakage decreased by 89%!
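
For a feel of why lowering the supply pays off so well, the standard first-order model of CMOS switching power is useful. This is textbook approximation rather than anything DEX reports, and it assumes the switching activity, capacitance, and clock frequency of the affected logic stay the same. It also says nothing about leakage, which depends on voltage in a different way.

```latex
\begin{align*}
P_{\text{active}} &\approx \alpha \, C \, V^{2} f
  && \text{(first-order switching power)} \\
\frac{P_{\text{new}}}{P_{\text{old}}} &\approx
  \left(\frac{V_{\text{new}}}{V_{\text{old}}}\right)^{2}
  && \text{(with } \alpha,\ C,\ f \text{ unchanged)} \\
\left(\frac{0.75}{0.9}\right)^{2} &\approx 0.69
  && \text{(logic moved from 0.9v to 0.75v)} \\
\left(\frac{0.75}{1.1}\right)^{2} &\approx 0.46
  && \text{(logic moved from 1.1v to 0.75v)}
\end{align*}
```

These ratios apply only to the logic whose supply actually changes, so the chip-level saving depends on how much of the switching power sits in the lowered domains.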
These are just some examples of the type of extensive automatic
exploration that can be performed with DEX. You can see that the number of
power domains in each case was still small, in order to minimize the physical
implementation overhead. Yet the power savings achieved were still great. So how
much power are you leaving on the table by not doing this type of exploration?
Jack Erickson