Cadence and ARM have been working closely together for several years, and that relationship reached a new milestone Oct. 18 with the joint announcement of the first 20nm tapeout using the Cortex-A15 MPCore processor. The announcement also brought news of a multi-year technology collaboration that will optimize Cadence's design and verification flows for advanced ARM processors. This post will look at what some of those optimizations involve and how they will help SoC designers who use ARM Cortex processors.
In September 2010, Cadence announced an unusually deep and early collaboration with ARM to develop a reference methodology for customers seeking early access to the multi-core Cortex-A15 processor. I blogged about this collaboration last fall. The collaboration resulted in an optimized reference methodology that helped give Texas Instruments, an early Cortex-A15 licensee, a head start in its development work.
The new agreement includes the Cortex-A9 as well as the Cortex-A15. The multi-year agreement will provide ARM development teams with Cadence Silicon Realization products and access to Cadence design services. It will also result in an optimized Cadence Encounter Digital Implementation flow and Incisive verification flow for mutual ARM/Cadence customers, as described below.
To understand what the agreement means for Cadence customers who are ARM licensees, I talked to John Murphy, director of strategic business alliances, and Rob Lipsey, a distinguished Cadence engineer who has been working with ARM. The bottom line, said Murphy, is that "because of our knowledge and its codification into heuristics in our tools, they [customers] are going to be able to lower their design risk and speed their time to market."
Murphy said this knowledge was gained through months of close work with ARM, starting with early access to RTL code. This led to "fine tuning" that will help Cadence Encounter users optimize Cortex designs for power, performance, and area. Lipsey cited several examples, including optimizing certain kinds of case statements in synthesis, clustering registers more closely with clock gating cells, concurrent optimization of synthesis and physical implementation, and handling sub-optimal mux structures in synthesis.
The work with ARM also led to new Encounter Digital Implementation System capabilities that help designers avoid timing bottlenecks. CPU designs, Lipsey explained, have cells that fan out signals to many end points. The Encounter Digital Implementation System can help ARM Cortex users attack a potential bottleneck at the right location, making it possible to optimize many points in the design concurrently and shorten run time.
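To make the fan-out idea concrete, here is a minimal Python sketch of how one might rank driver cells by the number of failing end points they feed. It is an illustration only, not how the Encounter tools work internally. The cell names, end point names, slack values, and the `rank_bottlenecks` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical timing data: (driver cell, end point, slack in ns).
# Negative slack means the path to that end point fails timing.
PATHS = [
    ("u_decode/buf12", "rob/entry0", -0.08),
    ("u_decode/buf12", "rob/entry1", -0.05),
    ("u_decode/buf12", "rob/entry2", -0.11),
    ("u_decode/buf12", "rob/entry3", 0.02),
    ("u_lsu/mux7", "dcache/tag0", -0.03),
    ("u_alu/add3", "regfile/w0", 0.10),
]


def rank_bottlenecks(paths):
    """Rank drivers by failing end-point count, then by total negative slack."""
    stats = defaultdict(lambda: {"failing": 0, "tns": 0.0})
    for driver, _endpoint, slack in paths:
        if slack < 0:
            stats[driver]["failing"] += 1
            stats[driver]["tns"] += slack
    # Most failing end points first; ties broken by the worse (more negative) total.
    return sorted(stats.items(), key=lambda kv: (-kv[1]["failing"], kv[1]["tns"]))


if __name__ == "__main__":
    for driver, info in rank_bottlenecks(PATHS):
        print(f"{driver}: {info['failing']} failing end points, "
              f"total negative slack {info['tns']:.2f} ns")
```

In this toy data set, one buffer feeds three of the four failing end points, so fixing that single location could help several paths at once. Drivers with no failing end points are left out of the ranking.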
"The value add in terms of knowledge cannot be understated," Lipsey said. "When a new customer loads up a Cortex-A9 or a Cortex-A15 design and it's not meeting timing, we know where the design could get stuck. For example, maybe there are paths to memory that should not be critical. We have knowledge of where you need to crank on the design and where you don't. Some of that is built into our tools."
The collaboration also involved some development work in high-frequency clock tree optimization, and that's part of what led to the subsequent acquisition of Azuro by Cadence, which I blogged about recently. The clock concurrent optimization technology developed by Azuro combines clock tree synthesis and physical optimization into a single step. It has been shown to improve performance and to reduce power consumption and area for SoC designs with ARM Cortex processors. Clock concurrent optimization (ccopt) is now a technology add-on for Encounter, and the ccopt technology is well on its way to full integration within the Encounter Digital Implementation System.
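Why might treating the clock and the datapath together help? One commonly cited reason is "useful skew": letting clock arrival times differ on purpose so that a slow logic stage can borrow time from a fast one. The Python sketch below is a toy model of that idea, not a description of the ccopt algorithms. It assumes ideal registers with no setup or hold time, and the stage delays and function names are hypothetical.

```python
def min_period(d1, d2, skew):
    """Smallest clock period for a register chain A -> B -> C.

    d1 is the logic delay from A to B, d2 the delay from B to C (ns).
    skew is how much later the clock reaches B than A and C (ns).
    Setup and hold times are ignored in this toy model.
    """
    # A -> B gains `skew` of extra time; B -> C loses the same amount.
    return max(d1 - skew, d2 + skew)


def balancing_skew(d1, d2):
    """Skew at B that equalizes the two stages in this toy model."""
    return (d1 - d2) / 2.0


if __name__ == "__main__":
    d1, d2 = 1.2, 0.8  # hypothetical stage delays in ns
    zero_skew = min_period(d1, d2, 0.0)
    tuned = min_period(d1, d2, balancing_skew(d1, d2))
    print(f"zero-skew period:   {zero_skew:.2f} ns")
    print(f"useful-skew period: {tuned:.2f} ns")
```

With a perfectly balanced clock, the slower stage sets the period. Delaying the clock at the middle register lets the two stages share the available time, so the period drops toward the average of the two delays. A real flow also has to respect hold constraints and the cost of building the clock tree, which this sketch ignores.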
On the verification side, the collaboration resulted in a significant simulation speedup for advanced ARM processors. Cadence tuned its Incisive Enterprise Simulator to deliver performance improvements of more than 2X on the Cortex-A15, and more than 8X on the Cortex-A5. Cadence also worked with ARM on testbench optimization.
Into the Field
The "vast experience" gained from the ARM-Cadence collaboration extends beyond optimized tools and into the field organization and services. "We have a lot of people inside Cadence who are experts on the [ARM Cortex] design and are able to go out and leverage that knowledge across many different customers," Lipsey said. This knowledge includes Cortex-A15 debugging, which is always a challenge with advanced multi-core processors.
The deep collaboration with ARM is a sign of things to come, noted Y.T. Lin, vice president of research and development at Cadence. "Deep collaboration is needed for high-performance designs such as CPU cores," he said. "By working together we can understand what they are trying to achieve from a design perspective, and we can develop the latest [EDA] technology to accommodate their needs."
Further information about Cadence tool support for ARM Cortex processors, as well as 28nm and 20nm design flows, will be available at the ARM TechCon conference next week. A 20nm whitepaper from Cadence is available here.
A new ARM resource page is available at Cadence.com. It includes further information about the recent collaborative work between Cadence and ARM on tool support for advanced Cortex processors. This page also describes the Cadence Silicon Realization and System Realization flows for ARM processors, lists relevant webinars and events, cites Cadence news related to ARM, and provides links to information on the ARM web site.