Jeroen Leijten is Chief Technology Officer for Silicon Hive, a Dutch company that has
quickly become one of the world's leading intellectual property (IP) providers
of imaging and video processing solutions for rapidly changing market segments
such as connected, interactive digital televisions and smart phones. Silicon Hive's programmable parallel system solutions are
licensed by semiconductor companies such as Samsung, LSI and Intel. Silicon Hive engineers recently implemented a
use model called "transaction-based acceleration" (TBA) with the Cadence
Palladium III accelerator/emulator to verify some of Silicon Hive's most
advanced multimedia system IP solutions (including hardware and software). Jeroen recently sat down to talk with us
about his team's work.
Q: Jeroen, can you tell us more about Silicon Hive?
A: Silicon Hive is a worldwide, independent supplier of semiconductor intellectual
property. Our company designs, builds and licenses application-specific system
solutions for imaging, video processing and communications using our programmable
HiveFlex parallel processor cores,
complete vertical HiveGo imaging and
video system solutions, and HiveLogic
platform. HiveLogic and HiveGo system solutions are supported by
HiveCC programming development tools,
and by application libraries supplied by Silicon Hive and its partners. These
products enable semiconductor and consumer electronics companies to create and
integrate fully programmable SoCs, thus improving time to market performance. As
we develop our products, we maintain full programmability and field-upgradeability
within the cost and power constraints required in our target markets. Our
patented technology originates from 10 years of research and development within
Philips Research Laboratories. Silicon Hive spun out of Philips in 2007.
Q: What markets is Silicon Hive going after today?
A: HiveGo VSS (Video System Solutions) aim at the digital television (HDTV) market,
while HiveGo CSS (Camera Sub-System Solutions) are focused on smart phones and multimedia
phones. We are one of the very few
companies who can deliver the complete image processing chain all the way from
sensor to codec, including all the software.
Our solutions deliver up to 20-megapixel resolution at up to 30 frames per second. We also
license our HiveFlex processors standalone together with all the necessary
software development tools. However, interestingly, customers today are asking
for this less often, preferring full HiveGo system solutions where Silicon Hive
and its customers customize and integrate the customer's proprietary software.
Our imaging solutions are used in wireless handsets. When you consider that more
than one billion mobile phones are sold each year, and a rapidly increasing
share of these now contain some sort of camera, you start to understand the market
potential here. The Digital HDTV market, Internet digital video and
connectivity are changing the functional requirements rapidly. The inherent flexibility of our architecture
is a big advantage in this market for us, because Internet video codec standards
continue to evolve rapidly, changing as frequently as every three months. This really
requires software programmability.
Q: What do you see as the critical requirements in order for Silicon Hive to win in the
digital TV market?
A: As with most consumer applications, cost is perhaps the most important, followed closely by
quality and performance. Our customers, including
OEMs like Samsung and Intel, mandate an extremely low failure rate for the
final consumer products. The warranty
cost otherwise would be prohibitive.
This is a totally different level of quality compared to, say, a PC. While a television has become a computer in
itself, unlike with a PC, consumers will not accept a TV "crashing," and
this kind of problem is difficult to prevent.
In order to achieve this, one must build in capabilities for error
resilience and error recovery so the TV gracefully recovers from any system
errors and does not crash. Obviously,
all of this requires a lot of stress testing and verification of the system
under a wide set of conditions.
Q: Can you tell us more about the verification challenges with HiveGo systems?
A: Our products are complex hardware and software system solutions with four or five
internal buses, multiple arbiters hooking up to multiple processors, and large
memories. The challenge is that you have
to verify all the blocks will work together properly in the system.
At first you verify blocks working separately, but that is only the starting
point. Then you must do system stress
testing. The most difficult bugs are
those resulting from interaction between blocks, such as corner cases, pipeline
stalls, and overflowing buffers, which are almost impossible to predict ahead
of time. You also need to verify proper synchronization between hardware and software
blocks. Synchronization can be highly
timing-dependent. You might never see
any problems when testing blocks separately, or running non-cycle-accurate
simulations. Then at some point you may find problems (such as in a video
codec) only after 100 or more video frames have been run through the
system. So the only way to find these
bugs is to do lengthy simulations with random testing, where you have the
different blocks running together and stressing each other. There is no other way to find these kinds of bugs.
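The block-interaction bugs Jeroen describes can be illustrated with a toy example. Below is a minimal C++ sketch (hypothetical, not Silicon Hive's actual testbench): a producer block and a consumer block share a bounded FIFO, and random activity patterns stand in for pipeline stalls and arbitration delays. The latent overflow bug surfaces only when random timing drives the fill level past capacity, which short, separate block-level tests rarely provoke but long randomized system runs do.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <random>

// A bounded FIFO between two "blocks". The overflow flag is the bug we hunt.
struct BoundedFifo {
    std::size_t capacity;
    std::deque<std::uint32_t> q;
    bool overflowed;

    void push(std::uint32_t v) {
        if (q.size() == capacity) { overflowed = true; return; }  // latent bug
        q.push_back(v);
    }
    void pop() {
        if (!q.empty()) q.pop_front();
    }
};

// One randomized trial: producer and consumer run together, each with an
// independent random activity pattern. Returns true if overflow occurred.
bool stress_trial(std::uint64_t seed, int cycles) {
    std::mt19937_64 rng(seed);
    std::bernoulli_distribution producer_active(0.5);  // pushes half the cycles
    std::bernoulli_distribution consumer_active(0.5);  // pops half the cycles
    BoundedFifo fifo{32, {}, false};
    for (int c = 0; c < cycles; ++c) {
        if (producer_active(rng)) fifo.push(static_cast<std::uint32_t>(c));
        if (consumer_active(rng)) fifo.pop();
    }
    return fifo.overflowed;
}
```

With balanced push/pop rates the fill level performs a random walk, so a short trial will usually pass while a sufficiently long one almost certainly trips the overflow, which is exactly why lengthy randomized runs are needed.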
Q: So, after looking at different verification solutions,
what options did you consider, and why did you ultimately decide on the
Cadence/Palladium III product?
A: In the
past, Silicon Hive was focused more on offering standalone processors, rather
than delivering full systems with application software. In this case RTL
simulation with instruction-set simulation (ISS) was quite sufficient. However, verifying multi-core solutions
requires much more capability than this. First,
we tried to make our high-level simulators cycle-accurate. The problem is that this requires every
building block in our IP portfolio to have its own abstract model, which you
have to maintain. Plus the simulation speed really drops dramatically. Then we tried working with FPGA prototype
boards. The problem there is that our
designs do not typically fit onto one FPGA.
So you must partition, modify the design, or map smaller cores. You end up modifying the design and not
verifying what is actually being delivered.
We then looked at hardware-based acceleration and emulation systems. We decided against FPGA-based emulation due
to long turnaround/iteration cycles and the limited observability. You also have a similar problem as before,
where all of the RTL does not fit into one FPGA. Ultimately, we chose Palladium because of
ease of integration, short iteration/turnaround cycles, full observability, and
of course speed. To verify complex video
processing systems like ours, you need lengthy simulations with high
visibility, and using a system like Palladium III is the only way to find real
synchronization bugs in hardware and/or software.
Q: Could you describe the transaction-based
acceleration (TBA) environment you created with Palladium III?
A: So far,
we have used Palladium III to verify our HiveGo VSS systems -- with video you
really need the highest performance possible. Our test bench consists of a bus,
connected to our hardware IP and system memory, running alongside a host. Everything except the host is synthesized to
Palladium III. The host is represented by using a SystemC model that runs on a
PC. The PC uses a transaction-based interface to communicate with
Palladium. The resulting speed is
several hundred times faster than RTL simulation, approaching that of full
in-circuit emulation. It works very well.
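As a rough illustration of the transaction-based split Jeroen describes, here is a hypothetical C++ sketch of the host side. The transaction type, register map, and function names are invented for illustration and are not the Cadence TBA API; the point is only the shape -- the host exchanges whole read/write transactions with the emulated design instead of driving individual signal wiggles, so one host round trip carries an entire bus operation.

```cpp
#include <cstdint>
#include <map>

// One untimed transaction: a whole bus operation, not per-cycle pin activity.
struct Transaction {
    enum Kind { Read, Write } kind;
    std::uint32_t addr;
    std::uint32_t data;  // payload for Write, result slot for Read
};

// Stand-in for the emulated side: a register file behind the transactor.
// In the real setup this would be RTL running in the accelerator.
class EmulatedDevice {
public:
    void service(Transaction& t) {
        if (t.kind == Transaction::Write) regs_[t.addr] = t.data;
        else t.data = regs_.count(t.addr) ? regs_[t.addr] : 0;
    }
private:
    std::map<std::uint32_t, std::uint32_t> regs_;
};

// Host-side driver: programs a hypothetical frame-DMA block, then reads back
// a status register -- the kind of software sequence that runs on the PC
// while the hardware runs in the accelerator.
std::uint32_t run_frame(EmulatedDevice& dev) {
    auto write = [&](std::uint32_t a, std::uint32_t d) {
        Transaction t{Transaction::Write, a, d};
        dev.service(t);
    };
    auto read = [&](std::uint32_t a) {
        Transaction t{Transaction::Read, a, 0};
        dev.service(t);
        return t.data;
    };
    write(0x00, 0x80000000u);  // frame buffer base (hypothetical register map)
    write(0x04, 1920u * 1080u);  // pixel count
    write(0x08, 1u);             // start bit
    return read(0x08);           // read back the control/status register
}
```

Because each call moves a full transaction rather than individual cycles, the host-to-accelerator traffic stays coarse-grained, which is what makes the several-hundred-fold speedup over RTL simulation possible.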
Q: How easy or difficult was it to get the whole
environment to work together?
A: Frankly, the first time around it took quite some effort to get the TBA
interface up and running, even with great support from Cadence. But now, in hindsight,
because we know what to do, it seems pretty easy and straightforward. One issue was that our designs are
VHDL-based, while the Palladium flow works better with Verilog. So if somebody were using Verilog exclusively,
they might not have the problems we encountered. Another issue was adapting the compilation
flow; molding the makefiles into our flow took some time.
Q: How would you compare the performance of
Palladium accelerators/emulators to FPGA-based systems?
A: Of course, FPGA-based systems are generally quite a bit faster, but that is only part of
the story. When you are verifying a
complex system, what you also need to look at is iteration cycle time, which
for us is even more important. For
example, the simulation speeds we are achieving with our Palladium III system
are about 100 frames of full HD video in about 30 minutes (1/500 of real-time).
It takes us about 20 minutes to compile the design, so total iteration cycle
time is about 1 hour. This lets our
engineers run several iterations per day.
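For readers who want to check the arithmetic, a quick sketch (assuming 30 frames per second counts as real time, a rate the interview does not actually state):

```cpp
// Back-of-the-envelope check of the quoted Palladium III numbers.
double palladium_slowdown() {
    const double frames       = 100.0;          // frames of full HD simulated
    const double sim_seconds  = 30.0 * 60.0;    // 30-minute run
    const double realtime_sec = frames / 30.0;  // ~3.3 s of real video at 30 fps
    return sim_seconds / realtime_sec;          // ~540x, i.e. roughly 1/500
}

double iteration_hours() {
    return (30.0 + 20.0) / 60.0;  // 30 min simulate + 20 min compile, ~1 hour
}
```

The ~540x slowdown matches the "1/500 of real-time" figure, and the 50-minute iteration cycle is consistent with several runs per engineer per day.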
On the other hand, with an FPGA-based system you could probably run video frames 20-25
times faster, which is great, but if you run into a bug you have two real
problems. First, it's very difficult to probe different signals, so finding the
cause of a bug can be very challenging. Second, the time to partition and
compile the design for the FPGA generally takes much longer. You also have the problem where partitioning
or adapting the design to fit or work in the FPGA might change the
functionality in non-obvious ways. So in
the end, it becomes much more difficult to uncover and fix any bugs.
The FPGA approach makes the most sense
when 95% of the bugs are out. If you
have any synchronization problems, corner-cases, or overflowing buffers, then
you likely won't find them using an FPGA approach.
Q: Would you recommend that other companies invest in a Palladium system, and why?
A: If a
company is making complex system designs including hardware and software, and
wants a way to run long simulations (which is very important in video and
graphics applications) to discover corner cases, then I would definitely
recommend a system like Palladium, certainly over any FPGA-based
solutions. The primary reason is that
FPGA solutions lack controllability and observability. If you have time in your schedule, then an FPGA
solution can work, but it will have longer iteration cycles. Palladium has an enormous advantage by enabling
shorter iteration cycles.
Q: How do you see Silicon Hive and Cadence
working together in the future...what's next?
A: Silicon Hive and Cadence have been working together for a long time, and we appreciate
the level of support Cadence provides.
In the near future, we are exploring how to use Palladium for power
exploration and power analysis in combination with TBA. This will allow us to perform power analysis
on complete software stacks. The reason
you need TBA there is because these stacks require a lot of rapid interaction
between Palladium and the host in order to function correctly. Another interest of ours is to combine
Specman constrained-random testing runs with Palladium, together with TBA.