CadenceLIVE Americas – OnDemand

5G RF

Silicon-Validated RFIC/Package Co-Design Using Virtuoso RF Solution in Tower Semiconductor’s CS18 RF SOI Technology

Established and emerging wireless and wireline applications such as 5G, Wi-Fi, and 400-800G optical networking are driving the demand for highly optimized RFIC solutions. Typical RF/mmWave design flows rely on multiple EDA tools, often sourced from multiple EDA vendors. This is inherently inefficient and often error-prone, leading to delays in getting a product to market. In addition, the many possible combinations of design tools and flows prevent a foundry from providing a golden reference flow that can be used by a large portion of the design community. In this paper, we present a silicon-validated, unified RFIC design flow using the Virtuoso RF Solution. The design flow is based on a high-power SP4T switch design in Tower Semiconductor’s CS18QT9x1 foundry technology. RF SOI switch designs offer a useful test case for the Virtuoso RF design flow because they require co-design and co-optimization of both the silicon and the package, which is a key strength of this design flow. The design flow will be used to present a consistent modeling and simulation methodology. A seamless hand-off between the PDK-provided models, metal interconnect extraction within the p-cell, metal interconnect modeling outside the p-cell using EMX and Clarity, and the flip-chip package will be presented, while maintaining a single unified database that is used for tapeout. Silicon validation of key small- and large-signal metrics will be presented, highlighting the importance of the tight interaction between the foundry Virtuoso PDK and package modeling using EMX and Clarity.

Chris Masse, Tower Semiconductor
Samir Chaudhry, Tower Semiconductor

Academic

Seakeeping and Maneuverability Analysis for Autonomous Aquatic Drone Using NUMECA CFD Simulation

Surface Autonomous Vehicle for Emergency Response (SAVER) is a design concept put forth by Micro-g Neutral Buoyancy Experiment Design Teams (Micro-g NExT) and NASA to assist in an astronaut recovery mission just after the crew capsule’s splashdown. In accordance with the competition guidelines, the UCB WaterBears team has proposed an unmanned surface vehicle (USV) capable of aerial deployment, triangulation of an astronaut’s distress beacon, and the subsequent delivery of emergency supplies. The USV is designed to be statically stable in open water due to internal righting moments developed by a low center of gravity; however, further simulation is required to robustly design for seakeeping in more dynamic scenarios. Using Numeca’s FINE™/Marine CFD solver, the USV’s seakeeping ability will be rigorously analyzed to understand the vehicle dynamics considering a free surface and variable wave conditions. Key simulation outcomes include determining range of stability along with evaluating dynamic stability under various theoretical quasi-static loading scenarios. The ultimate goal of this study is to arrive at a hull configuration for physical carbon-fiber composite prototyping that optimizes the seakeeping performance metrics described above.

Anish Seshadri, University of California, Berkeley

High-Efficiency Microwave Transmitters for Broadband High-PAR Signals

Achieving high efficiency and good linearity in microwave transmitter power amplifiers (PAs) is challenging when the amplified signals have wide instantaneous bandwidths (>100MHz) and high peak-to-average power ratios (PAPR > 10dB). Examples of such signals include multi-carrier concurrent signals, both closely and widely spaced, and band-limited noise-like signals, typical of 5G and other multi-carrier aggregated signal applications. This talk will give an overview of techniques for efficient amplification with some forms of linearization, including dual-band and broadband PA designs. Specifically, supply modulation of broadband signals amplified by different GaN PAs, including a 2-4GHz single-ended hybrid PA, an X-band MMIC PA, and a K-band MMIC PA, will be presented. An analog technique for improving PA linearity for high-instantaneous-bandwidth signals using gate bias modulation will be discussed, as well as a new type of diplexed broadband PA design.
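To give a feel for why multi-carrier signals exceed the 10dB PAPR mark mentioned in the abstract, the following is a minimal sketch and is not taken from the talk. It assumes an idealized complex baseband signal made of equal-amplitude, phase-aligned carriers on adjacent frequency bins, which is a worst case; the function names and the choice of 16 carriers are illustrative only.

```python
import cmath
import math


def papr_db(samples):
    """Peak-to-average power ratio of a complex baseband signal, in dB."""
    powers = [abs(s) ** 2 for s in samples]
    average = sum(powers) / len(powers)
    return 10 * math.log10(max(powers) / average)


def multicarrier(num_carriers, num_samples):
    """One period of equal-amplitude, phase-aligned tones on adjacent bins.

    Assumption: all carriers start in phase, so their peaks line up at n = 0.
    """
    return [
        sum(cmath.exp(2j * math.pi * k * n / num_samples) for k in range(num_carriers))
        for n in range(num_samples)
    ]


# 16 aligned carriers: peak power is 16**2, average power is 16,
# so the ratio is 16, i.e. 10*log10(16) dB.
signal = multicarrier(num_carriers=16, num_samples=256)
print(f"PAPR = {papr_db(signal):.2f} dB")
```

Under these assumptions the ratio grows as 10·log10(N) for N carriers, so roughly ten or more aligned carriers already pass 10dB. Real signals with random carrier phases reach such peaks only rarely.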

Professor Zoya Popovic, University of Colorado, Boulder

University of Notre Dame: A Preternship Project on Standardizing Netlist Graph Representation in EDA Synthesis Flows

Customized netlists representing the transistor layout for a synthesized circuit are often represented through data structures, specifically graphs. Modern synthesis flows use brute-force approaches to search for equivalent subgraphs within the netlist representation, resulting in high processing power consumption and degraded run time, which ultimately affects time to market. Meanwhile, attempting to run every possible circuit layout in simulation requires far too much time and effort, offsetting any marginal benefit gained from improving the PPA characteristics of the circuit. Standardizing the netlist graphs also complicates the design flows already in use by vendors who are accustomed to a different representation. In this project, a student “Preternship” team at the University of Notre Dame performed a proof-of-concept study to find isomorphic subgraphs within a graph representation of a circuit. The team applied several algorithms, such as the recursive backtracking procedure, to help identify whether identical subgraphs exist, with the aim of reducing the structural complexity of a given circuit design. The team used existing Python libraries for rapid testing on predetermined netlists and investigated common patterns in circuit design that can be leveraged to aid these generic graph algorithms. Their objective is to develop a proposed standardized system to represent circuit data as a graph in an efficient, reliable, and robust manner. In this presentation, the team will characterize the performance of the different algorithms and structures in order to create comparison standards that differentiate the methodologies from one another. Their work will facilitate development of visualizations for the process of detecting subgraphs, or further optimization of known approaches to exploit hypotheses about circuit-derived graphs.
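The recursive backtracking idea named in the abstract can be sketched in a few lines of plain Python. This is an illustrative example, not the team's implementation: it assumes undirected graphs stored as adjacency sets, it looks for a non-induced match (every pattern edge must exist in the target, extra target edges are allowed), and the function name and toy graphs are hypothetical.

```python
from typing import Dict, Hashable, Optional, Set

Graph = Dict[Hashable, Set[Hashable]]


def find_subgraph(pattern: Graph, target: Graph) -> Optional[Dict[Hashable, Hashable]]:
    """Return one pattern-node -> target-node mapping that preserves every
    pattern edge, or None if no such mapping exists."""
    # Place high-degree pattern nodes first so dead ends are found early.
    order = sorted(pattern, key=lambda n: -len(pattern[n]))

    def extend(mapping):
        if len(mapping) == len(order):
            return dict(mapping)
        p = order[len(mapping)]
        used = set(mapping.values())
        for t in target:
            # Prune: t is already taken, or has too few neighbours to host p.
            if t in used or len(target[t]) < len(pattern[p]):
                continue
            # Every already-mapped neighbour of p must land on a neighbour of t.
            if all(mapping[q] in target[t] for q in pattern[p] if q in mapping):
                mapping[p] = t
                result = extend(mapping)
                if result is not None:
                    return result
                del mapping[p]  # backtrack and try the next candidate
        return None

    return extend({})


# Toy target: a triangle a-b-c with a tail node d attached to c.
target = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
square = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}

print(find_subgraph(triangle, target))  # a mapping onto nodes a, b, c
print(find_subgraph(square, target))    # None: the target has no 4-cycle
```

The worst-case cost of this search is exponential in the pattern size, which is the run-time concern the abstract raises. Library implementations such as those the team tested add stronger pruning on top of the same backtracking skeleton.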

Matthew Morrison, University of Notre Dame
Alexander Clinton, student co-author, University of Notre Dame
Collin Finnan, student co-author, University of Notre Dame
Gerry Migwi, student co-author, University of Notre Dame
Lucas Pinheiro, student co-author, University of Notre Dame

Automotive

Fast and Accurate Method to Check the Impact of Aging in SRAM Memories

Speed=1 mode is better suited than APS or S3 mode for the post-age and stress simulations used to study the impact of aging on SRAM memories. Key benefits include gains in CPU runtime and memory usage.

Divyank Gupta, ARM
Azhar Ahmed
Sachin Gulyani

Tensilica ConnX B10 DSP Low-Power Design in GLOBALFOUNDRIES 22FDX-AG1 Technology

This presentation demonstrates the low-power design of the Tensilica ConnX B10 core in the GLOBALFOUNDRIES 22FDX® AG1 flow and the additional PPA benefits of the back-gate bias flow. The AG1 flow demands tighter sign-off conditions, such as 6 sigma for setup/hold, additional margining for aging of clock and data cells, stringent EMIR criteria, and robust DRC rules and verification, all of which impact both power and area. With the differentiated back-gate bias method, a significant reduction in both power and area is realized. GLOBALFOUNDRIES 22FDX® IP offerings are AG1 certified. This presentation also touches upon layout setup conditions, such as selecting optimal cells for the clock tree and optimal library selection from synthesis to PnR, as well as fine-tuning the constraints, selective flattening of modules, and the recommended flow setup in Genus/Innovus to realize best-in-class QoR.

Haraprasad Nanjundappa, Global Foundries
Herbert Preuthen, Global Foundries

Check Analog Fault Simulation with Actual ATE Program

Analog fault simulation is used to check the coverage of design defects (shorts and opens within the design). Design defects should cause the ATE final test program to fail instead of pass. Otherwise, the fault coverage is insufficient, which leads to potential quality issues. The analog fault simulation (AFS) tool is provided within the Cadence Virtuoso ADE Product Suite. AFS users can take different approaches to achieve different goals, but some questions have not yet been addressed. First, can we run an actual C++ ATE test program with AFS? Second, how do we manage a large number of fault injections, which would otherwise overwhelm the computing resources? Last, how do we collaborate across multiple disciplines to achieve the AFS simulation run and analysis? This paper addresses all of these issues. By using DPI-C in SystemVerilog, we are able to support the C++ test program. We then run sub-block-level simulations with C++ stimulus to optimize the use of computing resources. Next, the checker results are parsed from xrun.log using a function named “CCSgrepCurrLogFile()”, dumped into the Virtuoso ADE Product Suite outputs, and used as pass/fail criteria for the AFS simulation. Finally, the results are reviewed by design engineers, with a focus on undetected faults. This paper shows an innovative way to use the Cadence AFS tool to emulate ATE tests through a team effort across the DV, ATE, and design functions.

Jun Yan, Texas Instruments

Improvements in Cell-Aware ATPG to Target Automotive Quality

Automotive applications pose stringent constraints on SoCs to support very low failure/FIT rates and very high quality of ~0 DPPM (defective parts per million). Further, using nanoscale technologies for such high-quality markets poses an additional challenge in ensuring that new technology-related defect mechanisms, both intrinsic and extrinsic, are also well covered. Cell-aware ATPG has been projected to address this challenge by identifying special patterns needed to exercise defects within the cell, considering the cell layout and technology-related defect mechanisms. In this paper, we present novel enhancements across Cadence Spectre™ defect injection simulation, cell-aware defect extraction and characterization, Cadence Modus ATPG, and SoC-level pattern generation flows to improve the accuracy, robustness, and effectiveness of the generated cell-aware tests while ensuring minimal test volume overhead. Results from implementation on a large automotive SoC are also presented to illustrate the benefits.

Ben Niewenhuis, Texas Instruments Inc
Joe Swenton, Cadence Design Systems
Santosh Malagi, Cadence Design Systems

Architecture Trends for Sensing and Computing to Enable Autonomous Driving

The level of automation of vehicles is the key driver for E/E architectures, sensor architectures, and system-on-chip architectures. Radar, lidar, and camera are the key sensors to enable autonomous driving. However, these sensors still need to be significantly improved in terms of resolution, power consumption, safety, form factor, and cost, and they will also evolve to address new compute architectures. All these new technologies will dramatically increase the electronic content of a car, which requires integrating more functionality on a chip, rather than on a PCB, to provide the performance, safety, and reliability in a small-form-factor device. As a result, a new class of high-performance system-on-chip (SoC) and/or system-in-package (SiP) is needed to process all sensor data and fuse them together to enable vehicles to become “aware” of their surroundings. This talk provides an overview of automotive trends and the implications for SoC design for sensors and automated driving platforms.

Robert Schweiger, Cadence Design Systems

Cloud

Accelerating EDA Productivity: The Why, What and How of the Journey to the Cloud

AI and IoT, in the cloud and at the edge, are driving a need for more rapid innovation in semiconductor devices. This talk presents examples and best-practice architectures for cloud-based EDA, including use cases inside and outside of Amazon. The talk will include an overview of how the development of next-generation products is enhanced through the use of cloud technologies. We will present performance optimization for computing, storage, and EDA workload orchestration, and cover how cloud enables secure collaboration in the semiconductor and electronics supply chain. We will also present examples of specific EDA workloads optimized and deployed on AWS, both internally at Amazon and at external customers.

David Pellerin, Amazon Web Services

Cloud-Scale Productivity Without the Complexity—Have Your Cake and Eat It, Too!

Today, every design team is looking at cloud with great interest to solve their compute capacity gap and accelerate project turnaround time (TAT). However, transitioning EDA and CAD flows to cloud can be complex, requiring thoughtful decisions about cloud architecture, data management, IT retraining, infrastructure setup, and security, to name just a few. This session will discuss the Cadence platform for overcoming cloud complexity. We’ll also uncover the industry’s newest breed of cloud products, which allow designers to keep their familiar on-prem design environment and yet enjoy all the benefits of a secure, scalable, and agile cloud: all the goodness of cloud without the effort and delays involved in adopting and optimizing the right cloud environment. Have your cake and eat it too!

Ketan Joshi, Cadence Design Systems

Developing Scalable AI Inference Chip with Cadence Flow in Azure Cloud

d-Matrix, a cutting-edge startup, is building a first-of-its-kind in-memory computing platform targeted at AI inferencing workloads in the datacenter. A pioneer in digital in-memory computing for the datacenter, d-Matrix is focused on attacking the physics of memory-compute integration using innovative circuit techniques, an approach that offers a path to massive gains in compute efficiency. With its focus on AI innovation, the team chose a full Cadence flow and, rather than setting up on-prem compute infrastructure, ran the design flow on Azure Cloud. This session will discuss how d-Matrix set up a productive Azure cloud infrastructure running the Cadence flow, the lessons learned, and the key success factors that led to delivering its first AI chip within 14 months.

Farhad Shaker, Azure-dMatrix
Andy Chan, Microsoft
Sudeep Bhoja, Azure-dMatrix

Designing Planet-Scale Video Chips on Google Cloud

Recently, Google announced a custom video chip to perform video operations faster for YouTube videos. Google’s TPU series of chips are also well known for accelerating AI/ML workloads. These are just a couple of examples of chips designed by Google engineers. Google hardware teams have been leveraging Google Cloud for chip design for a few years, and this presentation highlights the benefits of doing so. Semiconductor customers can accelerate their time to market by leveraging the elasticity, scale, and innovative AI/ML solutions offered by Google Cloud. Many large enterprises are choosing Google Cloud for their digital transformation. Google Cloud provides a highly secure and reliable platform, infrastructure modernization solutions, and a best-in-class AI platform for the semiconductor industry. We will share relevant aspects of cloud (compute, storage, networking, workload management, reference architecture) that enable high-performance chip design. We will discuss how typical verification and implementation flows can benefit from migrating to GCP, with specific examples for RTL simulation and full-chip analysis. We will also detail how customers can get started on their cloud journey, since every cloud journey is unique. We will share how we leverage Google Cloud internally for designing chips such as the Argos video chip. The Google Hardware team will share their journey with GCP, including key learnings and the benefits of migrating to cloud. We will also share the challenges faced and discuss verification/design workload migration and best practices.

Sashi Obilisetty, Google
Peeyush Tugnawat, Google
Jon Heisterberg, Google

Computational Fluid Dynamics

How Computational Fluid Dynamics Extends Cadence’s Multiphysics System Analysis and Design

Charles Hirsch, Cadence Design Systems/Numeca

Omnis, from Meshing to Solving to Optimization, in One Single Multiphysics Environment

Engineers and designers today need many different tools for their CFD analyses. Omnis combines them all into one single environment, from meshing to solving to optimization. High-performing technology in a slick, easy-to-use interface streamlines the workflow of all its users. From the detailed analysis of a single component (e.g. an IC chip) all the way to simulating a full system (e.g. an entire car), Omnis users can combine the different physics, scale fidelity to need, and create as many designs as desired. It can be fully automated, driven by AI models and optimization algorithms, and is open to third-party software through powerful APIs.

Yannick Baux, Cadence Design Systems/Numeca

Pointwise - The Choice for CFD Meshing

Simulation during system design, before constraints are frozen, is the most opportune time to optimize and differentiate your product. High-quality mesh generation is a key enabler of simulation-led design, which requires robust, efficient, and accurate simulations. The Cadence Pointwise mesh generation tool provides the flexibility and features needed to achieve faster system simulation. Based on an aerospace heritage, the Pointwise meshing philosophy focuses on mesh quality and robustness while maintaining the flexibility to drop into a wide variety of workflows. The result is best-in-class geometry and meshing technologies, which combine to provide the choice for CFD meshing.

Nick Wyman, Cadence Design Systems

Custom / Mixed Signal

Fast and Accurate Method to Check the Impact of Aging in SRAM Memories

Speed=1 mode is better suited than APS or S3 mode for the post-age and stress simulations used to study the impact of aging on SRAM memories. Key benefits include gains in CPU runtime and memory usage.

Divyank Gupta, ARM
Azhar Ahmed
Sachin Gulyani

Solving Analog Mysteries Inside A Digital Cockpit

In top-level mixed-signal design verification, locating the source of abnormal current consumption is difficult and tedious. It is impossible to visualize and trace the text-based analog content in a complicated mixed-signal design with existing tools:

• Analog waveform viewer: no debug capability to link an individual waveform to its netlist or source code.
• Digital/mixed-signal debug tool, SimVision: unable to trace the text-based analog content.

SimVision MS, a new digital/mixed-signal simulation cockpit integrated in SimVision, is presented as a novel solution. With this solution, terminal currents can be viewed interactively and traced down to the leaf level of every node; Verilog-AMS/Spectre/SPICE text content is shown in the Schematic Tracer as schematics, which makes it easy to review connectivity and current distribution; and Spectre and SPICE source files can be cross-probed in the Source Browser. This new solution dramatically reduces top-level mixed-signal debug time, especially for testbenches related to current or to SPICE/Spectre netlists: a debug time reduction of about 4x relative to traditional SimVision is observed. SimVision MS provides a unified debug suite for analog/digital/mixed-signal design verification.

Jerry Chang, Texas Instruments

Spectre X: Speed with Accuracy to Meet Growing Circuit Simulation Demand

Achieving high throughput and coverage in design verification (DV) is recognized as one of the biggest challenges in a design cycle at Texas Instruments. Improved circuit simulator performance, with minimal accuracy loss, is a central enabler for meeting these challenges. In 2019, Cadence introduced its next-generation analog circuit simulator, Spectre X. TI Analog EDA has been collaborating with Cadence’s development team to characterize and qualify the simulator in preparation for adopting it in the TI analog/mixed-signal design community. This paper describes the qualification process, the span of test circuits, and the benchmarking results. We then analyze the results and show which types of circuits benefit most from adopting the Spectre X Simulator.

Ziyan Wang, Texas Instruments

Delivering Best-in-Class PPA and TAT for Arm Total Compute Using Cadence Digital Full Flow

As consumers expect richer, more interactive, and more intuitive user interfaces, the way compute systems are engineered must continually evolve to keep up. Ever increasing compute performance is one of the key principles of the Arm Total Compute strategy. Arm and Cadence have been collaborating for many years on delivering implementation flows for the various IP components of Total Compute. During this session both Arm and Cadence will demonstrate how the Cadence Digital full flow is being used to deliver the power and performance goals required for Total Compute, and how system on chip designs can benefit from this shared experience.

Pierre-Alexandre Bou-Ach, ARM
Paddy Mamtora, Cadence Design Systems

Accuracy, Performance, Capacity: Finding the Right Balance for Memory/SoC Verification

The complexity of circuitry created by the combination of mixed domains, advanced nodes, and impossible schedules has pushed the all-important verification stage of design to its breaking point. Engineers are forced to make so many compromises about which parts of the design can be tested, how extensive those tests can be, and when results can be returned in time to be useful that there is a real risk of critical parts “slipping through.” In this session, we will unlock the latest methodical secrets for reducing risk during your custom verification with a powerful combination of Virtuoso and Spectre platform tools and flows.

Steve Lewis, Cadence Design Systems

Custom Advanced Node

Useful Utilities and Helpful Hacks

Layout design complexity has increased with advanced nodes. Multiple SKILL utilities were developed to reduce steps and save time by focusing on repetitive tasks. This session includes a via enclosure solution, color-related checks, layout backup outside of design management, a layer palette procedure, and the principles for selecting palette colors. The initial challenges, along with hints and hacks on how to implement the utilities, are also included.

David Clark, Intel

Current Data-Driven Analog Routing Using Virtuoso SDR

Tight design specifications with increasing current density, the impact of technology shrinkage on metal dimensions and current density, cost, and time to market are the crucial considerations in a designer’s mind while working on a high-speed analog IP. Virtuoso EAD and simulation-driven routing (SDR) are steps toward correct-by-construction routing driven by electrical requirements. They provide an environment that considers current density and maximum resistance design rules during interactive routing, taking the current information into account and automatically sizing wires and vias. This requires knowing the current topology during the layout phase, in order to avoid EM-IR problems and design iterations: the final topology is needed to estimate the current going through all wires. The proposed methodology helps the layout designer achieve the intended result while taking current density information into account. As a result, we are able to reduce design and layout turnaround time and iterations by creating layout that is correct with respect to electrical reliability factors such as EM and R. Please refer to the attached document for more detail on the methodology, the flow, and how it benefited our design closures.

Devendra Gupta, STMicroelectronics PVT Ltd

Samsung Foundry AMS Design Reference Flow - Advanced Node

Seongkyun Shin, Samsung

Dragon Unleashed: Voice to Virtuoso Environment

Layout mask design requires many repetitive tasks and motions. Converting many of these tasks to bindkeys and/or macros makes layout design faster but does not reduce the repetitive motion, and reducing that motion is what prevents fatigue or injury. Using Dragon voice software to issue voice commands can reduce repetitive motions, increase productivity, and reduce repetitive stress injury (RSI). A database of over one hundred fifty commands, ranging from Fit to multi-step combinations, was created. The command database is distributed to eliminate duplicated setup tasks, and commands can optionally be renamed to the user’s preference. A hybrid usage model of voice and bindkey combinations was successfully used in layout creation. The session includes the implementation steps, a demo of a few voice-to-layout steps, and the rationale for creating the commands.

Scott Olsen, Intel

Digital Design and Signoff

Delivering Best-in-Class PPA and TAT for Arm Total Compute Using Cadence Digital Full Flow

As consumers expect richer, more interactive, and more intuitive user interfaces, the way compute systems are engineered must continually evolve to keep up. Ever increasing compute performance is one of the key principles of the Arm Total Compute strategy. Arm and Cadence have been collaborating for many years on delivering implementation flows for the various IP components of Total Compute. During this session both Arm and Cadence will demonstrate how the Cadence Digital full flow is being used to deliver the power and performance goals required for Total Compute, and how system on chip designs can benefit from this shared experience.

Pierre-Alexandre Bou-Ach, ARM
Paddy Mamtora, Cadence Design Systems

Methodology for Timing and Power Characterization of Embedded Memory Blocks

Intel’s NAND Flash ICs have an increasingly complex memory controller for running the NAND algorithms, such as write, read, and erase, with precision. This complex logic involves multiple custom-designed memory blocks such as SRAMs, ROMs, register files, and CAMs. Due to high density and switching activity, energy consumption has become an important metric, along with speed, for our latest ICs. As embedded memories occupy significant area, the timing and power estimation of the IC must include characterization of these embedded memories. The memory characterization solution from Cadence has timing, power, and other characterization capabilities in a single tool. This greatly amortizes the development effort required from designers for comprehensive characterization. In our observation, the best methodology is to first set up a robust flow for timing characterization and then extend it to other parameters. After setting up a flow for timing characterization of 20 memory blocks in a little over a quarter, we were able to extend this setup to power characterization for all of these blocks in under 6 weeks. This includes verification of the timing and power numbers generated by the tool against the numbers obtained in SPICE simulations. With the setup ready, characterization across multiple corners can be run by the tool in a day. In this presentation, we examine the use of the Cadence memory characterization tool for timing and power characterization of various types of memory blocks. We will also review interesting cases where the debug information generated by the tool was used to pinpoint and fix timing and power issues in the design.

Trupti Bemalkhedkar, Intel

Full Flow Power Reduction Using RTL Vectors

This presentation will address the benefits of an RTL simulation-driven dynamic power analysis and optimization flow:

- Adapt existing RTL verification investment into an RTL power analysis flow
- Leverage RTL vectors to drive synthesis and PnR power optimizations
- Utilize RTL vectors for functional dynamic power signoff and analysis

Jon Haldorson, Microchip

Intel 22FFL Process Technology-Based PPA on Tensilica ConnX B10 DSP

In this paper, we discuss novel PPA optimization methods developed using Intel 22FFL technology and the Cadence Stylus digital implementation flow. The PPA methodology described here involves automation that helps generate a PPA configuration for each round of APR (automated place and route) runs. As a starting point, random PPA recipes are generated during several APR runs to train on and learn the effectiveness of each PPA configuration. During these runs, the automation flow generates a group of PPA recipes to achieve optimal performance, power, and area.

Young Gwon, Intel

Multiplier Internal Truncation Usage and Strategies for Better Timing, Area and Power

When multiplying two numbers, the number of output bits grows to the sum of the sizes of the two multiplier inputs. To keep bit widths from increasing throughout the datapath, it is typical to truncate or round the lower bits of the result before sending it to the next computation. This truncation/rounding is possible because the LSBs of the result carry very little information. The issue, though, is that the full multiplier logic must be generated before the truncation is done. Cadence Genus provides two system functions, $inttrunc(X,pos) and $intround(X,pos), that create a smaller version of the multiplier that does not even implement the logic that multiplies the LSBs of the two operands with each other. When these system functions are used, the resulting logic does not correspond exactly to the multiply, so there will be some error, but this may be acceptable if the LSBs will be truncated away anyway. In this presentation, I will provide details on how to use these system functions, along with snippets of code that show where in a datapath they should be used relative to pipeline stages and later arithmetic operations. I will provide some diagrams of what happens during internal truncation or rounding, as well as examples of area, power, and timing improvements. I will also describe strategies to handle the fact that the generated gates do not LEC with the RTL.
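The trade-off described above can be explored with a small behavioral model. This is a sketch under stated assumptions, not a description of how Genus implements $inttrunc or $intround: it assumes unsigned operands and a textbook truncated multiplier that simply omits every partial-product bit a[i]·b[j] whose column i+j lies below `pos`. The function names, `WIDTH`, and `POS` are illustrative.

```python
def full_then_truncate(a: int, b: int, pos: int) -> int:
    """Reference: build the full product, then drop the `pos` lowest bits."""
    return (a * b) >> pos


def truncated_multiply(a: int, b: int, width: int, pos: int) -> int:
    """Assumed model: only partial-product bits in columns >= pos are summed,
    so carries out of the omitted low columns are lost."""
    total = 0
    for i in range(width):
        for j in range(width):
            if i + j >= pos and (a >> i) & 1 and (b >> j) & 1:
                total += 1 << (i + j)
    return total >> pos


WIDTH, POS = 6, 4

# Bit growth: two 8-bit operands need 16 bits for the largest product.
print((255 * 255).bit_length())

# Exhaustive comparison over all unsigned WIDTH-bit operand pairs.
worst = max(
    full_then_truncate(a, b, POS) - truncated_multiply(a, b, WIDTH, POS)
    for a in range(1 << WIDTH)
    for b in range(1 << WIDTH)
)
print("largest shortfall in output LSBs:", worst)
```

In this model the truncated result is never larger than the reference, and the shortfall is bounded by `pos - 1` output LSBs, because column k contributes at most k+1 omitted bits. A rounding variant would typically add a correction constant to center that error, and the mismatch against the exact product is also why such logic would not pass a plain equivalence check against an untruncated multiply.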

Venkata Gosula, Cisco Systems, Inc.

HLS Enables ML-Assisted Architectural Exploration

High-Level Synthesis (HLS) allows designers to separate the functional core of a design from the implementation details. Both of these are, of course, important for the quality of the final RTL. Segregating the functionality and implementation details does not come for free: it requires planning on the part of the designer and intuition about where such generality will pay off. The payback of this segregation is that it allows an HLS tool (Cadence Stratus) to easily explore many aspects of design quality, such as:

• Different pipelining schemes
• Memory implementation choices
• Function sharing tradeoffs

The HLS tool offers a number of “knobs”: switches that can be controlled to affect the Quality of Results (QoR) of the final RTL. The number of choices is large; the Stratus manual runs to more than 1000 pages. On top of the “CAD tool knobs” offered by the HLS tool, SystemC allows us to extend the generality of our design descriptions to include “design knobs”:

• Trading off numerical precision for area/delay/power
• Evaluation of different core algorithms in the context of an existing design
• Replacement of streaming interfaces by memory-based I/O

The combination of CAD tool knobs and design knobs leaves a large design space to be explored. ML frameworks are commonly used to explore the tuning of CAD tool knobs to get good settings. We extend that approach by using an ML toolkit (scikit-learn) to tune both “CAD tool knobs” and “design knobs.” In the detailed presentation, we will cover:

• Specific SystemC coding techniques to generalize designs and expose the control of that generality to an ML framework
• Harvesting non-HLS properties (accuracy, transactional latency) and HLS QoR properties (area, delay, power) in a way usable by an ML framework
• Embedding the results in a Jupyter framework, allowing for architect-in-the-loop exploration

Bob Condon, Intel
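
To give a feel for the size of the design space described in the HLS abstract above, the following is a minimal Python sketch, not the presenters' flow. It enumerates every combination of two knob groups and keeps only the non-dominated QoR results. The knob names and values are hypothetical, and `evaluate` stands in for whatever actually produces QoR numbers (an HLS run, or an ML surrogate trained on earlier runs).

```python
from itertools import product

# Hypothetical knob names and values, for illustration only.
CAD_TOOL_KNOBS = {"pipeline_ii": [1, 2], "memory_impl": ["regs", "sram"]}
DESIGN_KNOBS = {"data_width": [8, 16, 32]}


def enumerate_configs(*knob_groups):
    """Return every combination of knob settings as a list of dicts."""
    merged = {}
    for group in knob_groups:
        merged.update(group)
    names = list(merged)
    return [dict(zip(names, values))
            for values in product(*(merged[n] for n in names))]


def pareto_front(results):
    """Keep (config, qor) pairs whose QoR tuple no other result dominates.

    QoR tuples are 'lower is better' in every position (e.g. area, delay).
    """
    front = []
    for cfg, qor in results:
        dominated = any(
            all(o <= q for o, q in zip(other, qor)) and other != qor
            for _, other in results
        )
        if not dominated:
            front.append((cfg, qor))
    return front


def explore(configs, evaluate):
    """Evaluate each configuration and return the Pareto-optimal ones.

    `evaluate` is a caller-supplied function mapping a config dict to a
    QoR tuple; it is an assumption of this sketch, not a tool API.
    """
    return pareto_front([(cfg, evaluate(cfg)) for cfg in configs])
```

Even these three small knobs give 12 configurations; the number grows multiplicatively with each added knob, which is why exhaustive evaluation is usually replaced by a learned model that proposes the next configurations to run.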

Optimizing EDA Cloud Cycles Using Arm Servers

Virtually limitless scaling of compute cycles and preferential pricing using spot availability allow design teams to optimize throughput per cost with never-before-seen flexibility. EDA tools for simulation and characterization are prime targets to harness the parallelism of compute cycles available in the cloud. In addition to flexible virtual machine configurations offering different memory and processor performance combinations, design teams can now use alternative processors like the Arm architecture, delivering power and performance advantages for specific workloads. This presentation will provide an update on the Arm/AWS/Cadence partnership, which is porting critical simulation and characterization offerings that require significant compute cycles and can benefit from parallelization to Arm-based Graviton servers in the AWS Cloud. Specifically, we will discuss the optimization of verification throughput using Cadence Xcelium and JasperGold, taking full advantage of multicore parallelization on Arm-based servers. We will also introduce the latest performance results for the Spectre Simulation Platform and Liberate Library Characterization on Arm servers. Finally, we discuss use models, customer examples, and optimization strategies to minimize cost per throughput of Cadence EDA tools executing on Arm-based servers in the AWS cloud.

Frank Schirrmeister, Cadence Design Systems
Tim Thornton, ARM
Kushal Koolwal, ARM

Digital Design and Implementation 2021 Updates

Latest innovations from the Digital Design and Implementation group relating to power savings, advanced node coverage, machine learning and multi-chipset flows will be presented.

Vinay Patwardhan, Cadence Design Systems
Rob Knoth, Cadence Design Systems

A Comprehensive Low-Power Verification Signoff Solution with Conformal Equivalence Checker

Structural low-power verification is a proven approach for validating power domain crossings in a design prior to tape-in. Since it only analyzes a single design netlist and power intent description, it does not guarantee that the tape-in netlist and power intent are consistent with the golden RTL design and power intent as the design goes through the digital implementation and refinement process. In this paper, we will review Intel’s methodology, flow, and tool commands for signing off low-power designs with Conformal equivalence technologies.

Kundan Kumar, Intel

Hierarchical ATPG Scan Challenges and Successful Solutions

In designs with multiple instances of the same block, the use of a hierarchical scan technique can improve pattern generation efficiency and reduce runtime. In certain designs, a complex architecture in which scan is inserted on cores outside the main digital block brought new challenges to the pattern migration process. The Modus ATPG tool had challenges when migrating cores using hierarchical scan, specifically for non-identical cores and cores without an IEEE 1500 wrapper. Customized tweaks to the scripts for ATPG pattern generation, migration, and pre-tapeout simulation helped solve these challenges without any design changes. The new Modus ATPG pattern generation techniques were also used to find the root cause of the scan clock failures on silicon arrival. In this paper, we will discuss the successful solutions to these ATPG challenges and the post-silicon debug.

Aparna Tata, Analog Devices Inc
Sri Tummala, Analog Devices Inc
Joyce Kraley, Cadence Design Systems

Machine Learning Implementation in DFM Signoff and Auto-fixing Flow

We are presenting a new, fully integrated Machine Learning (ML) solution for finding and fixing design weak points. This methodology has been jointly developed by GLOBALFOUNDRIES and Cadence, and it integrates seamlessly into existing DFM tools used in design sign-off and Innovus router-based auto-fixing flows. The introduction of Machine Learning gives foundry customers greater accuracy in detecting process weak points compared to traditional methods. Moreover, previously unknown weak points can be predicted and fixed as early as the design phase. This gives designers the opportunity to create more stable designs, with the potential for better yield and faster time to market.

Janam Bakshi, GLOBALFOUNDRIES

Rapid RISC-V Automotive Processor IP Development Accelerated by Cadence Digital Full Flow

In commercial processor IP development, quick PPA optimization is an essential task for launching high-quality and high-performance IPs to the market. Cadence Digital Full Flow provides a seamless connection between tools, enabling designers to focus on architecture exploration and optimization. Thanks to the usability of the flow, front-end engineers can take part in floorplan exploration. This enables rapid front-end to back-end iterative optimization to tailor the IP design across process libraries and PPA targets. In this presentation, we will look at how the flow contributed to our RISC-V processor IP development, highlighting Joules RTL power estimation and Genus Synthesis from architecture exploration to implementation, and how adding Stratus High-Level Synthesis allowed us to develop a NN accelerator without modifying RTL code.

Naoaki Okubo, NSITEXE, Inc.

The Future of Timing Constraint Validation - A New and Smart Solution in Renesas to Create Precise SDC Using Conformal Litmus

The validation of timing constraints (SDC) had become a time-consuming, manual, and inefficient process due to increasing complexity in LSI designs and inaccurate SDC verification tools. The major problem was that these tools reported numerous insignificant occurrences as errors. To solve the problem, we have been partnering with Cadence on a new solution, Conformal Litmus, which addresses Renesas’ vision of timing verification. We have applied Conformal Litmus to several projects and have been able to identify the root cause of errors and reduce verification resources by 50%. In this presentation, we will introduce how we improved our design efficiency with the features developed in Conformal Litmus.

Hiroshi Ishiyama, Renesas

IP

The Open Verification Method Used by OpenHW for the CV32E40P RISC-V Core

This talk explores the background, development and implementation of the OpenHW verification environment for CV32E40P, known as “core-v-verif”. Since the goal of the project is to support adoption of an open-source core, the initial deliverable quality is not the only concern. One attractive aspect of an open-source core is the potential for adopters to modify, adapt, or extend the base core features. Thus, the verification plan needs to anticipate future use cases, with flexibility built in and clear documentation, so that the full test bench can be adopted and further adapted by end users.

Lee Moore, Imperas Software
Mike Thompson, OpenHW Group

Implementation of a Highly Parallel Complex FFT Architecture using TIE in Tensilica ConnX B20 DSP

This work presents the design and TIE implementation of a flexible yet fast 2^n-point complex FFT architecture using FIFO queues. Four parallel read-write queues are used instead of shared data memory to improve the overall performance and throughput. The proposed FFT architecture follows the basic concepts of the Radix-2 and Stockham FFT algorithms, but is unique in its data flow patterns. This approach requires only 768 cycles to perform a complex 4K-FFT, compared to the 2070 cycles required by the ConnX B20 library itself. This is achieved by reading and writing a total of 2048 bits per cycle (2 to 4 queues in parallel). The twiddle factors for the FFT can either be computed during runtime or pre-computed and stored in a separate reusable queue. The latter technique is followed in this work in order to reduce the computational complexity. Four hardware parallelization methods are proposed to achieve better performance based on the FFT sizes. In every cycle, each method reads data points from two different queues, performs the butterfly operation, and writes the results, not necessarily to the same two queues. The 1st method takes only 2 data points per cycle and performs the butterfly operation, whereas the 2nd, 3rd and 4th methods take 16, 32 and 64 data points per cycle respectively. Each data point is a fixed-point complex element with a 16-bit real part and a 16-bit imaginary part. The 4th method requires two sets of 32 data points from two respective queues, which corresponds to a data width of 1024 bits per queue. In each cycle, the 1st method reuses the four 32-bit multipliers and eight 16-bit adders required to perform one butterfly operation, while the 4th method requires 128 reusable multipliers and 256 reusable adders. The data flow pattern followed in this work ensures the reusability of twiddle factors. Thus, the twiddle factor computation or the twiddle queue read is not required in every cycle. With any fixed hardware parallelization method, the FFT size remains configurable. For instance, with the 1st method, 2^4- to 2^12-point FFTs or above can be realized. Similarly, with the 4th method, 2^9- to 2^12-point FFTs or above can be realized. With the 1st method, a 16-point complex FFT takes 32 cycles to perform the whole operation. The 4th method computes 512-, 1024-, 2048- and 4096-point FFTs within 72, 160, 352 and 768 cycles respectively. The application-specific SIMD and FLIX instructions are created using Tensilica Xtensa, and the FFT architecture is realized using the ConnX B20 DSP.

Prasath Kumaraveeran, Fraunhofer IIS/EAS
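
The cycle counts quoted in the FFT abstract above are consistent with a simple count of radix-2 butterflies: an N-point FFT has (N/2)·log2(N) butterflies, and a method that consumes P data points per cycle completes P/2 butterflies per cycle. The Python sketch below reproduces that arithmetic. It is an idealized model that ignores pipeline fill and queue latency, not a description of the actual TIE implementation; the function names are illustrative.

```python
MULTIPLIERS_PER_BUTTERFLY = 4   # per-butterfly figure quoted in the abstract
ADDERS_PER_BUTTERFLY = 8        # per-butterfly figure quoted in the abstract
BITS_PER_POINT = 32             # 16-bit real + 16-bit imaginary


def fft_cycles(n_points, points_per_cycle):
    """Ideal cycle count for a radix-2 FFT at a given parallelism."""
    stages = n_points.bit_length() - 1
    if n_points < 2 or (1 << stages) != n_points:
        raise ValueError("n_points must be a power of two >= 2")
    butterflies = (n_points // 2) * stages
    butterflies_per_cycle = points_per_cycle // 2
    # Ceiling division, in case the work does not divide evenly.
    return -(-butterflies // butterflies_per_cycle)


def resources(points_per_cycle):
    """Multipliers, adders, and bits moved per cycle at a given parallelism."""
    butterflies_per_cycle = points_per_cycle // 2
    return (
        butterflies_per_cycle * MULTIPLIERS_PER_BUTTERFLY,
        butterflies_per_cycle * ADDERS_PER_BUTTERFLY,
        points_per_cycle * BITS_PER_POINT,
    )
```

With these definitions, `fft_cycles(16, 2)` gives 32 and `fft_cycles(4096, 64)` gives 768, while `resources(64)` gives 128 multipliers, 256 adders, and 2048 bits per cycle, matching the figures in the abstract.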

Silicon Photonics

On the Silicon Photonics Physical Design’s Productivity and Reliability Enhancement

This presentation will discuss how a silicon photonic physical design can be enhanced in terms of productivity and reliability using the Virtuoso Electronic Photonic Design Automation (EPDA) framework. Schematic-driven layout (SDL) methodology is introduced for a photonic design, and its connection to a physical design’s representation will be explored. As ubiquitous photonic components, photonic waveguides are used to demonstrate algorithms, methodologies, and flows in this presentation. We will establish the waveguide template concept and its relationship with a technology file and explore the need for different types of primary waveguide devices. Use cases of different waveguides will be discussed, and the need for different waveguide connectors and waveguide routing options will be evaluated. We will show how an interactive waveguide router can be implemented and demonstrate its use cases. A few examples will show how to use the Virtuoso® CurvyCore engine to implement different types of waveguides, and will introduce new concepts such as facets, paths, and surfaces and how they are used in the implementation of parameterized cells (PCells). We will examine features provided by Virtuoso® CurvyCore's mathematical engine to enhance the reliability of our physical design and PCells.

Ahmadreza Farsaei, Intel

Voltus-Fi Custom Power Integrity Solution’s Electromigration Analysis and Self Heating Flow for FinFET and Silicon Photonics PDKs

The presentation will provide details about electromigration and IR drop analysis and the Voltus-Fi self-heating (SHE) flow. It will cover all the requirements for the SHE flow and provide a foundry perspective on what is required to generate SHE-enabled EM-ICT techfiles and mapping files. We will go into detail about SHE-enabled EM-ICT files, templates, and mapping files. Self-heating analysis is needed to avoid over-design and to close on EM while considering the thermal coupling, or thermally induced heating, in the surrounding wires/vias. In the traditional flow, self-heating is built into the EM limits, and this may lead to a costly or unreliable design because the higher EM limit is applied to all wires/vias in the design, regardless of whether a given wire experiences self-heating. Thermal coupling coefficients between devices and wires/vias are now added to the PDK, and the self-heating flow EDA tools can read these coefficients to calculate the self-heating in a nearby wire within a specified zone of influence. The thermal coupling coefficients are calculated from 3D TCAD simulations, where 3D models of a particular metal stack are translated into 3D models including realistic boundary conditions. Self-heating is supported for the 12LP FinFET and 45 Silicon Photonics PDKs. We will also provide details about enhanced features supported for Cadence Voltus-Fi EMIR analysis.

Amit Kumar, GLOBALFOUNDRIES

Streamlined Foundry-Compatible Custom Photonic IC Design with Ansys-Lumerical, Cadence Virtuoso Environment, and Tower Semiconductor’s Foundry PDK

Zeqin Lu, Ansys (Lumerical)
Amir Chaudhry, Tower Semiconductor

Application of Virtuoso CurvyCore Technology in the Development of Photonic Elements for a Foundry Process Design Kit (PDK)

The origins of the GLOBALFOUNDRIES photonic technology foundry offerings date back over 10 years. Originally there was little commercial CAD software dedicated to photonics design, and our initial PDKs adapted electronic design tools for use in photonics. Over time, commercial CAD tool availability became more widespread. Recently, GLOBALFOUNDRIES has adopted CurvyCore and other photonic flow enhancements of the Virtuoso® tool suite as a fundamental part of its 45nm technology PDK offering. This paper describes the customization of the Cadence® framework to encompass the specific features and requirements of our 45SPCLO technology. Earlier PDK development required extensive development and coding of custom functions to process the signals, generate the layouts required, and validate the physical design. Many of these custom functions have been replaced by functions provided by Cadence®. GLOBALFOUNDRIES created design elements for basic straight waveguides, circular bends, and taper elements that comply with design rules by creating higher-level functions which encapsulate basic Cadence® functions. Additionally, more sophisticated passive waveguide elements such as S-bends and clothoid curves have been newly provided. These primitive elements can be used directly by the designer as waveguide connectors, but they are also used hierarchically in other PDK elements such as couplers, splitters, and modulators. The elements can be created by the end designer through SKILL function calls in a scripting methodology. Alternatively, the same elements are provided as pcells for more traditional design in a graphical editor. The pcells are an encapsulation of the SKILL functions, ensuring the same result regardless of the designer’s preference. The mathematically intensive functionality of curve generation, manipulation, and discretization is provided by the platform. As a foundry, we completed the design environment through technology-specific customizations.

Bradley Orner, GLOBALFOUNDRIES

System Design and Analysis

Design and Performance Consideration of Silicon Interposer, Wafer-Level Fan-Out and Flip Chip BGA Packages

With the advent of various modern packaging technologies, product design teams are challenged with identifying and selecting the most suitable package. Certainly, device specification is one of the key parameters in selecting the right integration strategy and substrate. Pathfinding and design exploration are conducted extensively upfront to arrive at an optimal cost vs. performance solution. Recently, supply chain shortages and availability have come to play a critical role in packaging strategy. Package selection variables are now categorized as “Cost vs. Performance vs. Availability”. In this presentation, we will explore design and performance considerations, as well as supply chain availability, to aid in packaging strategy.

Farhang Yazdani, BroadPak Corporation

A Reference Flow for Chip-Package Co-Design for 5G/mmWave Using Assembly Design Kit (ADK)

The design effort for upcoming integrated circuit and package technologies is rising because of increasing complexity in production. To cope with that situation, it is essential to reuse pre-qualified elements to handle the complexity. For package technologies this is becoming more and more apparent. Given the requirements on package technologies for 5G applications at radio frequencies up to 60 GHz, it is no longer feasible to start from scratch, so it becomes increasingly important to use pre-qualified elements for a technology. This paper deals with the implementation of RF structures for manufacturing and characterization, and with how to cover the interaction in the system across the IC and different package levels with dedicated tooling. For upcoming package technologies it is becoming more and more important to include these devices in so-called Assembly Design Kits (ADKs) to enable designs by potential customers. In this paper, a package that includes two different levels of integration is presented. Package level one is an RDL technology for flip-chip assembly, and level two is based on eWLB and is integrated on level one in a package-on-package approach. Both are wafer-level package technologies. The paper deals with that technology, but the general approach is valid for other package technologies as well. The flow starts with design in Cadence Virtuoso, with multiple modifications in terms of target frequency and optimization for RF properties such as gain, loss, or target impedance. The designs are transferred into radio-frequency analysis tools like EMX or Clarity to investigate their behavior on a model basis. This makes it possible to optimize the elements against certain target parameters across technologies. After the design process is finished, these elements are produced on a 300 mm wafer. After production, the wafers are ready for characterization. Because the structures are distributed across a wafer and lots are run with multiple wafers, characterization also includes statistics within and across wafers, so the reproducibility of the structures can be assured. All the results are bundled into a construction kit with a Cadence-based ADK comprising symbols, schematics, and layouts for future designs, as well as models for running analyses based on these elements. A set of elements is available for the listed package and IC technologies for use by and transfer to customers.

Fabian Hopsch, Fraunhofer IIS/EAS

Cadence and Deca Chip(let) Solutions with Adaptive Patterning

Cadence and Deca have collaborated on a novel multi-chip(let), high-density RDL packaging solution built on top of Deca Adaptive Patterning™ technologies and Cadence Allegro® packaging solutions. A new design approach has been introduced that removes several existing design hurdles while providing a low-cost alternative to existing packaging technologies. The design solution leverages Deca’s Adaptive Alignment™ technique to translate and rotate polymer and copper routing layers to precisely align with individual chip(let) locations. Additionally, Deca’s Adaptive Routing™ allows the layout designer to create dynamic regions within Cadence Allegro® Package Designer, where the dynamic layout features will be executed in Adaptive Patterning™ software to create unit-specific patterns aligned with the actual location of each chip(let) bump. With the option to choose any combination of these Adaptive Patterning technologies within the Allegro package design tool, the designer can confidently lay out complex multi-chip(let) packages while being assured that all normal variations in manufacturing can be accommodated based on certified design rules from Deca. This presentation highlights ultra-high-density RDL layout of single-chip and multi-chip(let) designs using Deca Adaptive Patterning technology embedded in Cadence Allegro Package Designer Plus and the Silicon Layout Option, including the entire tool flow for importing user data and configuring the Cadence database for M-Series™ FX fan-out designs.

Edward Hudson, Deca Technologies

Priority on Power: How & Why to Start with “Power First” for PCB Implementation

The drive for faster throughput, increased mobility, and maximum efficiency of modern electronic devices has made power delivery a critical piece of design success. That said, meeting the power needs of modern designs is no easy task. Furthermore, if power requirements are not considered and accounted for upfront, finding and resolving power delivery problems later in the cycle can be incredibly difficult. This leads to schedule delays and often lots of time in the lab trying to debug. This paper will present a methodology for power-driven PCB layout that will set designs up for power delivery success. We will discuss the common challenges that arise when trying to meet modern power delivery requirements, best practices on how to set up your design for effective power delivery, as well as tools and tips to measure and validate power performance at all stages of your design.

Terry Jernberg, EMA Design Automation

Accelerating PCB Design Cycles with In-Design Analysis

Many PCB Design teams face a moment of truth when they hand over their design files to the revered signal and power integrity engineers. Despite following the physical constraints, you know the SI/PI team is going to throw it back over the wall with a list of things that must be corrected. What if there was a better way? What if the PCB Design tool contained in-design analysis? What if you could eliminate 10 percent of those ECOs? What about 50 percent … maybe even 75 percent? Your PCB would be off to production weeks earlier. Cadence now has in-design analysis for signal and power integrity that does not require any models. On the other hand, if you choose to store your IBIS models with the design, you can perform more advanced SI analysis while designing.

Michael Nopp, Cadence Design Systems

48V 250A FET PCB Thermal Profile - A Thermal Simulation Using Celsius Thermal Simulator's 2D and CFD Airflow Analysis

This case analysis evaluates the thermal profile of 10 FETs rated at 48V, 250A under continuous soak time with all the thermally enhanced mechanical attachments installed. The goal was for the MOSFETs to operate at 8W peak power output while maintaining a temperature no greater than 60°C, with all the thermal handling enhancements applied. The installed RSENSE is targeted to have the minimum allowable temperature after the MOSFETs' thermal analysis. The setup is a household thermal mechanism for handling average power level systems that involves mounting a heatsink on the bottom side, attaching the RSENSE, and employing filled vias. Thermal simulations were conducted using Celsius 2D for several power ramp cases and Celsius CFD (Computational Fluid Dynamics) to analyze the airflow application. Celsius 2D investigated the electrical and thermal performance over a sweep of conditions. Boundary conditions were met, with approximately 50°C as the highest temperature measured across the installed MOSFETs. Thermal performance improved further, with temperatures 10°C lower, after employing the airflow mechanism. The airflow design was modeled and simulated using Celsius CFD and yielded the best Thermal

Marcus Miguel V. Vicedo, Analog Devices
Richard Legaspino, Analog Devices
Kristian Joel S. Jabatan, Analog Devices

Reimagining 3D FEM Extraction with Clarity 3D Solver

Learn and apply the latest innovations in the full-wave Cadence Clarity™ 3D Solver to analyze your next-gen system design. Deep dive with us into the fully distributed architecture of the Clarity 3D Solver that enables you to extract large and complex packages and PCBs using hundreds of cores in the cloud or your on-premises farm, all while taking as little as 8GB memory per core.

Robert Myoung, Cadence Design Systems

A Dive into DDR5 – What's New, What Changed, and How Do We Model It All

There are some major physical layer updates to DDR5 which may cause challenges during simulation. Specifically, VDD is now reduced from the DDR4 level of 1.2V to 1.1V. Additionally, DIMMs are now specified to use local VRMs, which will benefit system power integrity but complicate design and analysis. DDR5 also implements many methods to improve data integrity at extreme data rates, utilizing decision feedback equalization (DFE), duty cycle adjustment (DCA) and ECC. The Command/Address bus has been simplified in its pinout but now uses an internal Vref, like DDR4's DQ Vref. This presentation will explore the updates in DDR5 and look at reasonable simulation methods to ensure accuracy without omitting the benefits of the new features.

Stephen Newberry, EMA - Shield Digital Design

Using Sigrity Technology to Address USB 3.1 Signal Integrity Compliance

Serial link simulations must meet stringent industry compliance standards in order to ensure that products can plug-and-play with other products successfully. 3D cameras are becoming more prevalent in robotics system designs, and as a result so is the need for full-speed, 5-gigabit USB 3.1 with good signal integrity. This paper covers how a product was successfully designed to meet all the USB 3.1 compliance requirements using Sigrity products. We will discuss how pre-route analysis, power-aware interconnect model extraction, circuit simulation, channel simulation, and power integrity analysis all come into play to establish USB 3.1 compliance. The presentation will also show some planned forward-looking work, in which the Cadence support team has guided us on how interconnect modeling can be upgraded to full 3D extraction using Clarity 3D Solver.

Jeff Comstock, EMA - Ologic, Inc.

A New Way to Tackle System-Level Power Integrity Analysis

With the massive increase in bandwidth coursing over the internet today, design challenges for applications like telecom, data centers, and cloud infrastructure have all increased as well. One of these key challenges is power integrity. Power integrity analysis has morphed over the last 20 years from back-of-the-envelope calculations to detailed extraction, simulation, and signoff of physical PCB and package layouts. But now the need pushes beyond a single physical fabric to include representations of the ICs, interposer/packages, PCBs, and VRMs all together in a single view, to be analyzed together as a complete power distribution network (PDN). In this session, a full PDN from a Cisco system design is extracted, modeled, and simulated in the new “SystemPI” environment, from the VRM all the way to the IC die. Frequency and time domain simulation is used to analyze the PDN and examine its current characteristics. This opens the door to new methodologies not only for system-level PDN verification, but also for pre-design feasibility and tradeoff studies that can help you optimize your next design or improve your current one.

Stephen Scearce, Cisco Systems
Quinn Gaumer, Cisco Systems

Verification

Audience-Specific Regression Data Distillation and Presentation with vManager Verification Management

During the design verification process, literally terabytes of data are produced. Distilling regression data into useful and consumable formats and levels of abstraction is critical to efficient verification closure. This presentation will explore the usage of vManager features beyond the standard Linux GUI for the purpose of collecting, formatting and publishing regression data to multiple consumers in the chip development team (and expanded team). We will cover capabilities such as customized logfile data extraction, CSV data export, and HTML report generation, and how these data sources can be utilized by Python scripts to create compelling, audience-specific reports. Usage of the vManager web interface, and the value it can bring to specific regression data consumers, will also be highlighted. Overall, attendees of this presentation can expect to gain an appreciation of some of the unique ways in which vManager features beyond the GUI can be leveraged.

Brad Mader, Renesas
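
As an illustration of the kind of Python post-processing mentioned in the vManager abstract above, the sketch below groups rows of an exported regression CSV and produces a short plain-text summary. It is a minimal example, not the presenter's script: the column names (`owner`, `status`) and the `passed` status value are assumptions, and a real export would need its own column mapping.

```python
import csv
import io
from collections import Counter, defaultdict


def summarize_regression(csv_text, group_by="owner", status_col="status"):
    """Count test outcomes per group from CSV text.

    Column names are hypothetical placeholders for whatever the
    exported file actually contains.
    """
    totals = defaultdict(Counter)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[group_by]][row[status_col].strip().lower()] += 1
    return {group: dict(counts) for group, counts in totals.items()}


def format_report(summary):
    """Render one 'group: passed/total passed' line per group, sorted by name."""
    lines = []
    for group in sorted(summary):
        counts = summary[group]
        total = sum(counts.values())
        passed = counts.get("passed", 0)
        lines.append(f"{group}: {passed}/{total} passed")
    return "\n".join(lines)
```

Changing `group_by` (for example to a block or testbench column, if the export has one) is one simple way to produce different views of the same data for different audiences.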

Arm/Cadence SoC Hardware/Software Development Solutions

Developing hardware in isolation and hoping software works when the silicon comes back has not been an option for decades. However, the use cases for SoC development have evolved with ever-growing system on chip (SoC) complexity, more complex software functionality and intricate dependencies at the hardware/software interface. This presentation will illustrate a set of use cases that challenge design teams for modern SoCs, including but not limited to architecture and IP selection through IP and system configuration, the development of virtual platforms, balancing different ways of pre-silicon software development on virtual and physical prototypes and emulation, hardware verification and system validation in the context of domain-specific software, SoC bring-up in the pre- and post-silicon development phases, and performance optimization. There is no one development tool, and no single set of models, that enables all use cases. In this presentation, Arm and Cadence will discuss the joint offerings of both companies, how they interface, and the flows and methodologies that can improve productivity for SoC development.

Frank Schirrmeister, Cadence Design Systems
Eric Sondhi, ARM

De-risking and Accelerating OS Boot for Arm SystemReady SoCs: A Morello Case Study

Booting a commercial OS on any SoC is a significant milestone, and if OS boot does not work “out of the box” it can require long debug time and software workarounds, with resulting delays in the delivery of programs. For complex OSes such as Linux, Android, and Windows the risk is even higher, as debug turnaround times are longer. Software patches are not always possible, meaning there is a risk of requiring a re-spin of silicon. Common issues include the OS failing to boot, crashing, or hanging. In devices that integrate PCIExpress interfaces, the hierarchy may be invalid or have no Root Port, or there may be End Point interoperability issues. Demonstrating pre-tapeout OS boot can be an important test for building confidence in the design; however, there are several problems in doing that. This presentation will detail some of the major challenges that hinder achieving “out of the box” OS boot. It will introduce new EDA tools and content which accelerate the process of PCIExpress integration and enable Arm SystemReady architecture compliance tests to be run on bare-metal RTL using HW acceleration, thus achieving pain-free OS boot ahead of SoC tapeout. The presentation is based on a project called Morello, developed at Arm, which involved creation of a prototype SoC deployed on a board and the associated firmware. The SoC was required to be compliant with Arm’s Server Base System Architecture (SBSA) standard. This presentation will describe the SBSA system standard, the benefits of complying with such architecture standards, and how compliance was demonstrated on Morello. The primary focus will be to describe the challenges faced in proving compliance and how a Cadence-based solution helped us to verify PCIExpress integration in an Arm SoC. This solution was based on the Perspec library solution from Cadence. The project has now demonstrated successful pre-silicon booting of the openSUSE Linux distribution and the WinPE operating system on an FPGA platform.

Pallavi Kulkarni, ARM
Peter Uttley, ARM
Paul McCloy, ARM

Achieving Code Coverage Signoff on a Configurable and Changing Design Using JasperGold Coverage Unreachability App

We developed a code coverage signoff flow using the Xcelium simulator’s constant propagation, Incisive Metrics Center’s coverage merge, and JasperGold’s unreachability app. Our goal was to have signoff criteria based on coverage reports that do not involve writing custom waivers. The merged code coverage database from regressions was handed over to JasperGold to analyse the coverage holes for unreachability using formal techniques. Unreachable lines of code were removed from the list of holes to create a list of real holes. Constant propagation features that avoid instrumenting tied-off code are helpful, but many places are missed by simulation constant propagation. Using formal analysis to eliminate the unreachable code is key to achieving code coverage closure and on-time delivery of a highly configurable design. An automated coverage signoff flow without text-based filters, waivers and exclusions enables replicating the success across EDA vendors.

Mayur Desai, Broadcom

The Step-and-Compare Methodology for High-Quality RISC-V Processor Verification

The open standard ISA of RISC-V has generated significant interest around custom processor design options and the associated design freedoms beyond the roadmap of the mainstream processor IP providers. Thus, RISC‑V has enabled any SoC developer to consider undertaking a custom processor design, which in turn has stimulated interest in adapting the established SoC design verification (DV) flows based on UVM and SystemVerilog to also address the complexities of processor verification. This talk introduces the various options for RISC-V processor verification, from simple trace analysis through to the latest techniques with test benches that support UVM SystemVerilog with Step-and-Compare for asynchronous events. It illustrates the various options and approaches, including details of bugs found on some popular open-source cores.
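
As a rough illustration of the general step-and-compare idea (not the speakers' test bench), the sketch below steps a device under test and a reference model one retired instruction at a time and compares architectural state after each step. An asynchronous event observed on the device is replayed on the reference at the same step. `ToyCore`, its two-field instruction format, and the `bug_at_pc` fault injection are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple

# Toy program: each instruction is (destination register, immediate to add).
PROGRAM: List[Tuple[int, int]] = [(0, 1), (1, 2), (2, 3)]


@dataclass
class ToyCore:
    """Toy stand-in for either the RTL core or the reference model."""
    pc: int = 0
    regs: List[int] = field(default_factory=lambda: [0] * 4)
    bug_at_pc: int = -1  # inject a wrong result when this pc retires

    def step(self, program: Sequence[Tuple[int, int]], irq: bool = False) -> None:
        """Retire one instruction, or take an interrupt if irq is set."""
        if irq:
            self.regs[3] = self.pc  # save the return address
            self.pc = 0             # jump to the toy trap vector
            return
        rd, imm = program[self.pc]
        self.regs[rd] += imm + (1 if self.pc == self.bug_at_pc else 0)
        self.pc = (self.pc + 1) % len(program)

    def state(self) -> Tuple[int, Tuple[int, ...]]:
        return (self.pc, tuple(self.regs))


def step_and_compare(dut: ToyCore, ref: ToyCore,
                     program: Sequence[Tuple[int, int]], steps: int,
                     irq_steps: Sequence[int] = ()) -> Optional[int]:
    """Return the index of the first mismatching step, or None if all match."""
    for i in range(steps):
        irq = i in irq_steps    # async event as observed on the DUT
        dut.step(program, irq)
        ref.step(program, irq)  # replay the same event on the reference
        if dut.state() != ref.state():
            return i
    return None


print(step_and_compare(ToyCore(bug_at_pc=1), ToyCore(), PROGRAM, steps=6))
```

Comparing after every step pinpoints the first divergent instruction, which a post-run trace comparison can only recover by searching the logs afterwards.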

Simon Davidmann, Imperas Software

Accelerating DFT Simulation Closure with Xcelium Advancements

DFT simulation execution is crucial to the SoC design process: rapid turnaround time allows for faster pattern verification for each netlist release and more iteration cycles during development, while efficient disk space and compute farm utilization enable a greater number of patterns to be simulated with each iteration cycle. Maintaining DFT simulation execution has proven challenging due to increasing design sizes and the higher test coverage and multiple fault models (e.g. stuck-at, transition fault, bridge, IDDQ, RAM sequential) needed to meet automotive quality requirements. To address these challenges, additional Xcelium features, namely Multiple Snapshot Incremental Elaborate (MSIE) and simulation save and restore, were incorporated into the DFT simulation flow. Results on two recent automotive designs include up to a 38% reduction in turnaround time from netlist release, up to a 96% reduction in the disk space required to store simulation builds, and 4,289 days of CPU time saved.

Rajith Radhakrishnan, Texas Instruments
Benjamin Niewenhuis, Texas Instruments

Automating Connection of Verification Plans Requirements - OpsHub Integration Manager and vManager Verification Management

The OpsHub Integration Manager platform helps enterprises synchronize data between tools used in the product (software and/or hardware) development ecosystem. With its latest release, OpsHub Integration Manager adds support for vManager, which helps Cadence customers synchronize requirements and associated information between vManager and other requirements management tools (e.g. Jira, Jama, IBM DOORS). Such bi-directional synchronization gives enterprises full traceability of requirements and test cases throughout their lifecycle in vManager and other tools. The vManager product pioneered verification planning and management with the industry’s first commercial solution to automate the end-to-end management of complex verification projects, from goal setting to closure. Now in its fourth generation, with the introduction of the High Availability feature, Cadence® vManager™ Verification Management provides capabilities, in partnership with OpsHub, for requirements traceability by synchronizing the vPlan in the database with third-party requirements management tools (e.g. Jira, Jama) to describe and trace the life of a requirement. In this session we will review the motivation for requirements traceability and the need to connect requirements management systems to vPlans in vManager. We will also explore the need for a system like OpsHub Integration Manager to manage this connection and its benefits to the overall development environment.

Nili Segal, Cadence Design Systems
Sandeep Jain, OpsHub

Verification of IP and Embedded Systems

The Open Verification Method Used by OpenHW for the CV32E40P RISC-V Core

This talk explores the background, development, and implementation of the OpenHW verification environment for CV32E40P, known as “core-v-verif”. Since the goal of the project is to support adoption of an open-source core, the initial deliverable quality is not the only concern. One attractive aspect of an open-source core is the potential for adopters to modify, adapt, or extend the base core features. Thus, the verification plan needs to anticipate future use cases, with flexibility built in and clear documentation so that the full test bench can be adopted and further adapted by end users.

Lee Moore, Imperas Software
Mike Thompson, OpenHW Group