CadenceLIVE China – OnDemand

Automotive, Digital Design Creation and Signoff

Cadence’s Automotive Strategy

The automotive world is evolving at a fast pace. Trends such as software-defined vehicles, vehicle electrification, zonal architectures, and advanced driver assistance are rapidly redefining how automobiles are built. Cadence offers several solutions to aid in this transformation, including:

• A comprehensive Functional Safety tool suite for silicon design, featuring our world-class Midas FMEDA solution, digital and analog verification solutions for fault injection, and digital implementation of safety mechanisms.

• New chiplet services and reference designs to kickstart the automotive ecosystem.

• A world-class IP portfolio.

• Our Helium digital twin virtual prototyping solution.

• Electrical and mechanical system tools for a variety of auto simulations.

This talk will provide an overview of the automotive space and demonstrate how Cadence’s rich portfolio can help customers across the supply chain achieve success.

Chuck Alpert, Cadence

Powering Intelligent System & Chip Design in Cloud with Cadence Cloud Solutions
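If you would like to learn more about Cloud Solution,
please contact Cadence_China_Marketing@cadence.com from your company email address.
Include your name, company, mobile phone number, and other details in the email, and we will contact you as soon as possible.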

Mahesh Turaga, Cadence

Using Joules RTL Design Studio for Evaluation and Application on Critical Blocks

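In current chip design, problems are typically discovered during back-end place-and-route (PR) implementation and then traced back to drive front-end code optimization, a process that takes a long time. Front-end tools for early risk identification tend to produce inaccurate data analysis, and each tool focuses on only one or a few aspects, which makes it hard to catch PPAC risks early. Joules RTL Design Studio can effectively identify potential problems in the code and predict timing, congestion, power, and other metrics with reasonable accuracy. Its interface shows developers where the risk points in the code lie, which helps avoid PPAC risks early and supports a successful tapeout.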

Linjie Sun, Sanechips

Logical and Physical Aware Automatic Solution for Scan Chain Insertion of Newly Added Registers in Conformal ECO

During Conformal ECO (C-ECO), whether pre-mask or post-mask, newly added scan flip-flops (SDFs) cannot be inserted into the existing scan chains automatically as desired. We present a comprehensive solution that lets C-ECO engineers overcome this limitation with both logical and physical awareness. The logical part covers selecting the correct clock domain and the chain segments of each instance's parent, and excluding OCC chains. The physical part addresses two concerns: the longest chain length, and minimum-distance SI-Q connections for DRV and timing. For the first, the solution provides a grouping strategy that either spreads the SDFs across chain segments according to the room available in each, so that the longest chain length does not change, or, if necessary, splits the SDFs evenly across all targeted chain segments for a reasonable increase in the longest chain length. For the second, we developed an “S Shape” connection algorithm on top of the Innovus APR tool that achieves shorter connections both between the new and old segments and within the new segment itself. As a result, the solution can automatically and comprehensively handle scan chain insertion for hundreds of new SDFs in C-ECO, taking both logical and physical concerns into account.
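
The grouping strategy in this abstract can be illustrated with a minimal Python sketch. It is an interpretation of the text, not the authors' implementation: the function name `distribute_sdfs`, the list-of-lengths input, and the fill order are assumptions made for illustration, and the logical filtering (clock domain, OCC exclusion) is presumed to have already chosen the candidate segments.

```python
def distribute_sdfs(num_new, segment_lengths):
    """Return how many new scan flops to add to each candidate chain segment.

    segment_lengths[i] is the current flop count of segment i (assumed input).
    """
    if not segment_lengths:
        raise ValueError("no candidate chain segments")
    longest = max(segment_lengths)
    # "Room" is how many flops a segment can absorb without exceeding
    # the current longest chain length.
    room = [longest - n for n in segment_lengths]
    if sum(room) >= num_new:
        # Case 1: fill the available room, so the longest chain is unchanged.
        assigned = []
        remaining = num_new
        for r in room:
            take = min(r, remaining)
            assigned.append(take)
            remaining -= take
        return assigned
    # Case 2: not enough room, so split evenly across all segments
    # and accept a moderate increase in the longest chain length.
    base, extra = divmod(num_new, len(segment_lengths))
    return [base + (1 if i < extra else 0) for i in range(len(segment_lengths))]


print(distribute_sdfs(5, [10, 8, 7]))  # fits in the free room
print(distribute_sdfs(7, [10, 8, 7]))  # falls back to an even split
```

A real flow would also order the flops inside each segment by placement, which is where the “S Shape” connection step would apply; that part is not modeled here.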

Zihao Zeng, Bestechnic

Using Conformal ECO Recipes and Cut Point Flows to Refine Patch Sizes

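Conformal ECO provides two ECO flows, flattened and hierarchical. The hierarchical flow runs a hierarchical compare on the given ECO modules and, with the help of module boundaries, can sometimes produce a smaller patch. In complex designs, however, if the given ECO modules involve port-signal changes that go unrecognized, boundary optimization in the resynthesized netlist makes logic outside the ECO modules differ while lying outside the ECO scope, and LEC fails for the whole design. The flattened ECO flow, by contrast, has simple inputs and broad applicability. Based on experiments in real projects, this paper therefore examines how the recipes flow and the ECO cut point (ECP) flow, both within the flattened flow, improve patch size relative to the traditional flattened flow. The results show that, compared with the traditional flattened ECO flow, the recipes flow generates multiple recipes at once at the cost of a small amount of runtime and selects the best patch according to a user-defined quality formula. The ECP flow builds a database from the RTL and the netlist and finds cut points automatically. Experiments show that it delivers the best patch-size reduction when cut points are found, but with a longer runtime, which makes it well suited to designs with physical-implementation bottlenecks. In summary, for ECO designs with stringent patch-size requirements, the Conformal ECO recipes and ECP flows can further reduce patch size.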

Qin Xu, Sanechips

PDN Debugging Panel: Powered by Voltus Reports Automated Extraction Flow

With the rapid development of power systems, the scale and complexity of power distribution networks (PDNs) are increasing, and so are the requirements for PDN analysis and optimization. Traditional PDN analysis and optimization methods rely mostly on manual calculation, which is time-consuming and inefficient. Voltus is a powerful PDN analysis and optimization tool that can automatically generate reports on many aspects of a PDN, including current and voltage distribution, component power consumption, and network efficiency. This paper introduces a Voltus debugging panel that assists with input file analysis and PG grid optimization.

For input files, an accurate PDN simulation environment must account for many factors and for input files (lib/lef/def/etc.) from diverse sources (APR/STA/DFT/Integration). Verifying input file quality, such as coverage and missing cells, is crucial. A deeper challenge lies in ensuring that instance toggle logic aligns with design intent. Power/current calibration issues (lib/pgv) or incorrect clock allocations (twf) can cause excessive toggles or power consumption. This paper presents a check summary that streamlines the verification process. The panel consolidates all necessary coverage information and highlights abnormal values for quick issue identification. It also visualizes toggle activity and current information with histograms to give an intuitive understanding of performance at each simulation moment. Finally, the summary combines the results of static, dynamic, and self-heat EM runs for comprehensive analysis. By centralizing all information in one place, the debugging panel eliminates the need to navigate through folders and check individual files.

For output results, the debugging panel helps optimize power distribution across instances and refine the power network structure. The panel reports the toggle count at each moment to ensure proportional toggling in each cycle. Where the panel shows large current in specific areas, the PG structure is strengthened accordingly by remapping the bump locations to balance the power source distribution.

In conclusion, the Voltus tool provides a rich command set and detailed reports. The panel gives an overview of all the data of concern, which avoids omissions, greatly improves efficiency, and reduces human error. The tool can also provide suggestions for improving current and voltage.

Shining Dong, NXP

Streamlined Signoff: Optimizing Timing and Power with Cadence Certus/Tempus/Quantus Solutions

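Timing signoff and closure before tapeout has always been the most stressful phase for our back-end design team. Any approach that predicts risk earlier and reduces iterations greatly improves the team's efficiency and confidence. Cadence's Quantus -> Tempus STA signoff engines are seamlessly integrated into the Innovus P&R tool, which lets us obtain accurate block-level signoff timing early, avoid over-fixing, and reduce the timing loss caused by switching between engines. Certus, the latest distributed final timing ECO tool, makes full use of available machine resources and, while ensuring global closure, effectively adds optimization of interface timing and power. Its fully automated operation also saves back-end engineers' labor and time. This paper describes in detail how to combine Tempus, Quantus, and Certus to secure and accelerate chip signoff efficiently.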

Zhaoke Xie, Anyka

Mitigation of Non-Critical Logic Timing Margin in Tempus Timing Solution

Timing closure is a hard criterion for tapeout, so EDA tools and engineers always give first priority to the timing-critical portions of a design. However, for power-sensitive chips such as those in embedded or portable devices, every bit of power saved helps lower chip power dissipation. Squeezing the timing margins of non-critical logic, to which even designers pay little attention, is therefore worth investigating and implementing at the physical design stage.

The goal of this work is to reduce excess timing margins as much as possible without sacrificing timing. To this end, the authors first explored the timing margins present in 40nm chips targeted at low-power applications. The results show that positive timing slack remains on non-critical circuit paths even after multiple runs of power optimization (opt_power) in Cadence Innovus and Tempus.

To mitigate these unneeded timing margins, the paper introduces an automated flow that downsizes cells with relatively large positive timing slack, saving further power incrementally. The proposed flow consists of three steps:

1) An auto-filter selects cells with large timing margins. It applies predefined timing-margin and transition criteria so that only cells with sufficient setup slack and small transition time are selected.

2) A topology-aware compactor shortens the list of cells selected in step (1). This minimizes the negative impact (e.g., timing and transition violations) of downsizing on the design. The second step excludes cells that lie on the same timing path, so that only one cell per path is adjusted in each optimization iteration.

3) All cells in the final list are downsized by one drive-strength step, and the timing analysis results are updated in Cadence Tempus. This completes one optimization loop, and the flow repeats these steps for subsequent iterations.
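
The three steps above can be sketched in Python on a toy data model. This is an illustrative sketch only: the `Cell` record, the single `path_id` per cell, and the choice to keep the largest-slack cell on each path are assumptions, not the authors' Tcl implementation, and the timing update that Tempus performs after each pass is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    name: str
    slack_ns: float       # worst setup slack through the cell
    transition_ns: float  # output transition time
    drive: int            # drive-strength index, 0 is the weakest
    path_id: str          # hypothetical: the timing path the cell sits on


def select_candidates(cells, min_slack_ns, max_transition_ns):
    """Step 1: keep cells with enough slack, a small transition, and room to shrink."""
    return [
        c for c in cells
        if c.slack_ns >= min_slack_ns
        and c.transition_ns <= max_transition_ns
        and c.drive > 0
    ]


def compact_by_path(candidates):
    """Step 2: allow one cell per timing path (here, the one with most slack)."""
    best = {}
    for c in candidates:
        if c.path_id not in best or c.slack_ns > best[c.path_id].slack_ns:
            best[c.path_id] = c
    return list(best.values())


def downsize_once(cells, min_slack_ns, max_transition_ns):
    """Step 3: downsize every surviving cell by one drive-strength step."""
    chosen = compact_by_path(select_candidates(cells, min_slack_ns, max_transition_ns))
    for c in chosen:
        c.drive -= 1
    # In a real flow, timing would be re-analyzed here before the next pass.
    return [c.name for c in chosen]


cells = [
    Cell("u1", 0.50, 0.05, 2, "p1"),
    Cell("u2", 0.80, 0.04, 3, "p1"),
    Cell("u3", 0.10, 0.05, 2, "p2"),
]
print(downsize_once(cells, min_slack_ns=0.3, max_transition_ns=0.1))
```

In practice a cell usually lies on many timing paths, so a real compactor would need path membership from the timer rather than a single `path_id`.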

The flow is implemented in Tcl using built-in Cadence Tempus commands. It has been applied successfully to optimize 40nm commercial MCUs that have already taped out. The results confirm that the proposed methodology can further reduce the redundant timing margins on non-critical portions of a design and ultimately save more chip power.

Linan Cao, NXP

Experience of Virtuoso In-Design Physical Verification - How to Improve Layout Efficiency by Using iPegasus DRC/FILL

Real-time DRC checking and metal fill.

Wei Hu, Sanechips

Custom/Analog Design

Application and Practice of Virtuoso Studio APR Flow for Automatic Placement and Routing on Custom Mixed-Signal SoCs in FinFET Advanced Process

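As process nodes continue to advance and chip operating speeds keep rising, ultra-high-speed analog layout design is constrained by increasingly complex DRC rules, and its difficulty grows exponentially. Layout engineers must invest enormous effort to produce layouts that meet design targets while satisfying DRC. Striking a better balance between DRC fixing and design targets has become a challenge for every analog layout designer. This paper discusses how to use Cadence's Virtuoso auto APR tool to automate placement and routing at the transistor and gate levels, greatly shortening the design cycle and reaching DRC-clean results quickly. Art Flow provides the capacity to handle large-scale designs and adapts to the requirements of advanced processes. Analog circuit designs often contain complex digital logic. Handling that part of the layout with Art Flow is more efficient and yields higher layout quality than traditional manual placement and routing, and layout engineers can flexibly optimize placement and routing according to project and iteration needs. Because Virtuoso provides layout editing and physical checking for analog circuits, embedding Art Flow in Virtuoso combines the strengths of both, so that analog and digital circuits are handled together in one environment, making design smoother and more efficient.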

Shaojie Yang, Sanechips

Spectre AMS Designer-Spectre FX Simulator with SimVision MS Debug and DSPFIM Flow, the Efficient Simulation and Debug Tool for SerDes Design

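As electronic products keep innovating, application demands on SerDes keep rising. In recent years SerDes circuits have trended toward higher complexity, faster data rates, more advanced process nodes, and more complex interaction between analog and digital signals. These trends make comprehensive verification of SerDes circuits more challenging and call for stricter, more precise verification of timing, function, and power. Spectre FX is the new-generation FastSPICE simulator in the Cadence Spectre simulation platform. Its scalable, innovative FastSPICE architecture delivers faster simulation, larger capacity, and broader flow support while preserving accuracy. AMS-SpectreFX is a mixed-signal simulation and verification tool that seamlessly integrates Spectre FX with Xcelium, giving design engineers a complete solution for fast simulation, debug, and verification that helps projects iterate quickly and improves design efficiency. SimVision Mixed-Signal (SimVision MS) is a new Xcelium capability for efficient mixed-signal simulation debug: in SimVision, users can not only view waveforms but also debug mixed-signal simulations. Users can display the hierarchy of a mixed-signal design with the Design Browser, trace and inspect analog circuitry with the Schematic Tracer, probe and display analog currents with the Current Browser, and inspect and debug mixed-signal interfaces with the Mixed-Net Browser. In the DSPF-in-the-Middle (DSPFIM) flow, users can set submodules as black boxes as needed when extracting the DSPF file for a top-level module. When that top-level DSPF file is later used in simulation, the black-boxed submodules can be flexibly configured as different cellview types, such as Verilog, Verilog-AMS, SystemVerilog, or schematic, which improves the flexibility and efficiency of top-level post-layout simulation debug and verification. This paper briefly shares our experience using SimVision MS and the DSPFIM flow for debug and design optimization in AMS-SpectreFX mixed-signal simulation.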

Gaolei Zhou, UNISOC

Applying Virtuoso RF Flow to Analyze Complex Passive Device in RFIC Designs

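Passive devices such as inductors, baluns, and transmission lines are widely used in CMOS RFIC and high-speed circuit design. In the final project iterations, the structure of on-chip passive devices, or the circuit hierarchy they span, is often extremely complex. Iterations triggered by changes to the passive device itself or by other changes in the circuit can be very time-consuming. Virtuoso RF Solution is a design flow that Cadence has offered in recent years, covering IC/package (PKG) co-design at the physical level together with electromagnetic simulation. Design engineers can design, check, and modify IC and PKG layouts together in the Virtuoso environment and run electromagnetic simulations across IC and PKG, and the final results integrate easily into a standard analog IC design flow. Virtuoso RF also provides an IC electromagnetic analysis flow (VEM) that lets design engineers conveniently choose which parts of the IC layout use electromagnetic models and which use ordinary transistor-level RC parasitics. Engineers hardly need to build highly complex passive-device hierarchies in the schematic and can complete post-layout simulation iterations efficiently. This paper focuses on the application of VEM to a high-speed clock network.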

Ruijia Yi, Sanechips

Rapid Design Iteration: A Post-Simulation Method for Early-Stage Incomplete Layouts

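As processes evolve, models in advanced nodes carry more and more parameters that increasingly affect circuit function and performance. Circuit engineers must repeatedly tune a large number of circuit parameters during design to bring pre-layout simulation results as close as possible to post-layout results, which is very inefficient. In addition, the large parasitic resistance of metal routing often makes it hard for circuit designers to account for everything in pre-layout simulation, so the final pre-layout results differ greatly from post-layout results. In the traditional design flow, post-layout parasitics can be extracted only after the layout is complete and has passed LVS physical verification. Circuit and layout modification iterations further lengthen the time from finishing pre-layout simulation to post-layout results that meet design requirements, so the overall design iteration cycle becomes too long. Analog designers therefore urgently need a tool that shortens the iteration cycle, accelerates convergence, and brings pre-layout simulation as close as possible to post-layout results. The Cadence EAE Partial Layout RESIM Flow is a tool that assists circuit design and optimizes layout floorplanning early in the design. The EAE flow can extract parasitics when the layout has only placement or partial routing, and early post-layout analysis based on those parasitics exposes placement problems promptly. The post-layout results can also guide further adjustment of model parameter settings, which avoids repeated layout changes, helps designers optimize quickly, and greatly shortens iteration time. In summary, analog engineers can use EAE for fast pre-layout simulation and early post-layout analysis to settle the layout floorplan and substantially improve productivity.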

Xin Shu, Sanechips

Application of AOP in Circuit Parameter Optimization

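Advanced Optimization Platform (AOP) is an advanced circuit optimization platform based on ADE Assembler in Virtuoso Studio IC23.1. It is a framework in which machine-learning-based optimization algorithms communicate with Virtuoso ADE Assembler, intended to help circuit designers optimize circuit parameters more efficiently. When the circuit topology is fixed and device sizes still need to be optimized to meet the design specification, the engineer parameterizes the devices in the schematic that affect circuit performance and defines the performance metrics of interest. With its optimization algorithms, AOP searches the huge space of parameter combinations efficiently and quickly finds combinations that meet the design targets. AOP has several built-in optimization algorithms to choose from and can also easily integrate in-house algorithms implemented in Python or C++. AOP improves designer efficiency in many scenarios, such as porting a circuit design to another process and optimizing the metrics of common IP reused across projects. This paper applies AOP to several real circuit blocks from our company and verifies its strong performance in circuit optimization. The results show that, within a given design space, AOP achieves optimization results comparable to an engineer's while saving much of the time engineers spend on manual circuit optimization. In summary, front-end engineers can accomplish more with AOP, greatly improving productivity and shortening project delivery cycles.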

Shengli Fang, Sanechips

DDR5’s IBIS-AMI Simulation Using SystemSI and Spectre

As DDR SDRAM evolves, the I/O data rate has reached 6.4Gbps per pin in DDR5. A higher data rate means more channel impairment, such as inter-symbol interference (ISI), reflections from the multi-drop channel structure, and crosstalk from neighboring lines. Equalization circuits are therefore needed to compensate for channel impairment. A 4-tap DFE in the DRAM RX is specified to remove the impact of ISI on the bits, so channel simulation with the DFE algorithm is very important for DDR5. As an advanced signal integrity simulation tool, SystemSI from Cadence provides an efficient and accurate environment for DDR5 channel simulation. SystemSI supports multi-bit channel simulation of the parallel DDR5 interface with equalization, using IBIS-AMI models from the DRAM manufacturer. Characterization in SystemSI includes the rising and falling ramp responses of the single-ended DDR5 signals, so the DQ eye results from SystemSI can accurately reflect DDR5 system performance.
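
To show how a decision feedback equalizer removes post-cursor ISI, here is a minimal textbook-style Python sketch. It is not the SystemSI or IBIS-AMI implementation. The channel taps, tap weights, and function names are made-up assumptions, and the signal is simplified to one ideal sample per unit interval.

```python
def apply_channel(bits, taps):
    """Convolve +/-1 symbols with a channel pulse response sampled once per UI."""
    out = []
    for n in range(len(bits)):
        out.append(sum(h * bits[n - k] for k, h in enumerate(taps) if n - k >= 0))
    return out


def dfe_equalize(samples, dfe_taps, threshold=0.0):
    """Slice each sample after subtracting ISI predicted from past decisions.

    dfe_taps[k] scales the decision made k + 1 unit intervals earlier.
    """
    decisions = []
    for x in samples:
        # zip() stops at the shorter input, so early samples use fewer taps.
        feedback = sum(w * d for w, d in zip(dfe_taps, reversed(decisions)))
        decisions.append(1 if x - feedback > threshold else -1)
    return decisions


bits = [1, -1, -1, 1, 1, -1, 1, -1]
rx = apply_channel(bits, [1.0, 0.7, 0.5])      # main cursor plus two post-cursors
no_dfe = dfe_equalize(rx, [0.0, 0.0, 0.0, 0.0])
with_dfe = dfe_equalize(rx, [0.7, 0.5, 0.0, 0.0])  # 4 taps, two of them unused
print(no_dfe == bits, with_dfe == bits)
```

With these example numbers the post-cursor ISI exceeds the main cursor, so plain slicing makes errors while the DFE recovers the transmitted pattern. A real receiver adapts its tap weights and also has to cope with noise, jitter, and error propagation.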

Jiajia Xia, Zhaoxin

Virtuoso Studio with iQuantus Insight and Quantus Insight Flow for Accelerated Post-Simulation Iteration in FINFET Advanced Process

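As processes evolve, the complex DRC (Design Rule Check) requirements of advanced nodes introduce many additional parasitic paths. Together with the large parasitic resistance of metal routing, this makes final pre-layout simulation results differ greatly from post-layout results. In the traditional design flow, parasitic resistance and capacitance can be analyzed somewhat intuitively only through the extracted view, which offers no quick calculation or display of P2P resistance or of per-layer parasitic resistance and capacitance. How to analyze the parasitic netlist better has become a major challenge for designers. The Cadence iQuantus Flow is a tool that assists layout parasitic iteration during design. iQuantus performs post-layout parasitic analysis on the parasitic netlist extracted from the layout, and its accurate parasitic resistance and capacitance analysis supports analog layout optimization. Designers can also modify the netlist and repeat the analysis to iterate post-layout results quickly, which gives layout iteration a direction and a reference, avoids the time cost of repeated layout changes, helps designers optimize quickly, and greatly shortens iteration time. In summary, analog engineers can use the iQuantus Flow to iterate post-layout simulation quickly, speed up analog layout iteration, and substantially improve productivity.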

Zhiyi Li, Sanechips

DSPF Interactive Output

i-dspf, an advanced Quantus feature, supports interactive analysis and can speed up design closure.

Lirui Zhu, UNISOC

Application on High-Sigma Yield Verification Based on Spectre Fast MonteCarlo and Spectre FX

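As IC technology nodes advance and application areas widen, circuit designs grow more complex and larger, and reliability requirements keep rising. Traditional Monte Carlo methods can no longer meet the industry's need for high-sigma yield verification, which gave rise to Fast Monte Carlo algorithms. Cadence's new-generation Spectre Fast Monte Carlo (FMC) algorithms fall into two types, worst sample and yield estimation. Built on machine learning and more advanced statistical methods, they construct a mathematical model relating sample-point parameter variation to the final simulation results, and obtain accurate statistical results from as few simulated sample points as possible. For 3-6 sigma applications, efficiency improves significantly over traditional Monte Carlo with accuracy preserved, and the method offers good debuggability, verifiability, accuracy, and generality. Pairing Spectre FMC with Spectre FX, Cadence's latest-generation FastSPICE simulator, shortens simulation time further. Spectre FX uses a completely new architecture and optimized circuit partitioning algorithms, simulates large circuits quickly, and scales efficiently across multiple cores. After three years in the market, it has been shown to deliver substantial simulation speedups across process nodes, circuit types, and both pre-layout and post-layout simulation. This paper introduces the principles and usage of the latest Spectre FMC and shares our experience combining Spectre FMC and Spectre FX for efficient high-sigma yield verification.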
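
A short calculation shows why brute-force Monte Carlo struggles at high sigma. The sketch below assumes a one-sided Gaussian tail and a rule of thumb of observing about ten failures for a usable estimate. Both are illustrative assumptions and say nothing about how Spectre FMC works internally.

```python
import math


def tail_probability(sigma):
    """One-sided probability that a standard normal variable exceeds `sigma`."""
    return 0.5 * math.erfc(sigma / math.sqrt(2.0))


def samples_for_failures(sigma, expected_failures=10):
    """Plain Monte Carlo runs needed to expect the given number of failures."""
    return math.ceil(expected_failures / tail_probability(sigma))


for s in (3, 4, 5, 6):
    print(f"{s} sigma: p_fail = {tail_probability(s):.3e}, "
          f"runs needed = {samples_for_failures(s):.3e}")
```

At 3 sigma a few thousand runs are enough, while at 6 sigma the count is on the order of ten billion, which is why model-based sampling methods are used instead.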

Changlin Tang, Sanechips

Digital Design Implementation

Generative AI-Driven Digital Full Flow Innovation and Roadmap Delivering Faster Design Closure

Miao Liu, Cadence

Exploration of the Physical Implementation Efficiency Improvement Based on the Cadence Methodology

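As process technology advances, chip integration keeps rising, and block size and complexity grow substantially, so physical implementation takes longer and becomes less efficient under the traditional flow. New flows and techniques are urgently needed to address these pain points. Targeting large blocks and subsystems, this paper explores new technologies in the Cadence flow to improve the efficiency of block physical implementation. These technologies are adaptive, distributed, and high-capacity, and they meet the runtime and efficiency needs of very large blocks. The paper first discusses the Smart-hierarchical flow, which splits a block into several sub-blocks by partitioning and implements them in a distributed manner, saving considerable runtime compared with the normal flow. It then introduces the new-generation Innovus command flows, including Podv2, Cod, Rod, and Sod, and the application of the distributed optimization tool Certus in real projects, each of which improves physical implementation efficiency to a different degree.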

Junting Xiao, Sanechips

Utilization-Driven Die Size Analysis and Optimization

Die size reduction is always critical in chip design. Instance-utilization is usually used for die size estimation and optimization, but it has three limitations: 1. Instance-utilization does not always match routing congestion well; 2. Instance-utilization does not help optimize channel area; 3. Instance-utilization says nothing about area efficiency inside hard macros. This paper therefore proposes track-utilization and device-utilization for chip analysis and optimization.

Track-utilization is the utilization of routing tracks. Compared with instance-utilization, track-utilization is the real limit on chip size. Several ways to increase track-utilization are proposed. “Route ingredient check” not only calculates track-utilization by region but also monitors the objects that occupy tracks, by type. Diesize Doctor 2.0 shows an SoG track-utilization density map in Innovus for congestion mitigation. “Channel/moat space check” reports and shows area optimization opportunities between hard macros, memories, and the padring. “Analog overhead check” checks the cost of analog-related special nets. Layer-based utilization is emphasized to find regions with uneven layer distribution for area reduction. “PG edge check” suggests placing PG edges on track for PG mesh efficiency. “M1 route check” recommends opening M1 for routing, as test results show that DRC improves with little decap loss.

Device-utilization aims to improve area efficiency across the whole chip, including inside hard macros. Several methods to increase device-utilization are proposed. “Padring check” targets padring area optimization: if the design is core limited, padring space can be used to place small hard macros; if it is pad limited, a selected inter-pad solution benefits chip size. “Transistor-utilization check” finds regions of low area efficiency across the chip, especially inside analog macros and pads. “Domain gap area check” checks the overhead from different power domains. “Memory efficiency check” checks memory area versus capacity efficiency, together with memory timing margin.

The utilization-driven die size reduction methods are integrated into Diesize Doctor 2.0. With utilization-driven physical design, multiple chips have been optimized for compact layout and smaller die size.

Summer Gao, NXP

Customized Routing Automatic Methodology in Innovus Implementation

Reducing schedule time and improving efficiency have always been goals for designers. The growing number of chip functions and analog IPs has made the time cost of customized routing a major concern. Traditional customized routing of special nets requires physical implementation designers to route wires manually, one by one, at the floorplan stage, which is time-consuming and inefficient.

This paper presents an efficient and novel methodology for automated customized routing. The proposed integrated flow consists of three main steps: automatic routing, resistance-driven width adjustment, and shielding insertion. First, we use the Innovus command “route_point_to_point” for simple path routing. For nets with complex routing requirements, we use the NanoRoute engine to plan the routing path. With proper setup, the “route_design” command can route the assigned special nets just like regular nets. Then we run “route_point_to_point” to reproduce the corresponding special nets along the path produced by “route_design”, with resistance-driven width adjustment to meet the resistance target. Finally, a procedure adds shielding to wires as required. A key point is to use a non-default rule in the first step to reserve enough space for the second and third steps. The advantage of the proposed methodology is that special nets are routed automatically and quickly.

Experimental results demonstrate the effectiveness of the proposed flow: it completed routing of 76 nets in 1 hour for a 90nm chip and 53 nets in two hours for a 40nm chip.
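
The resistance-driven width adjustment can be illustrated with the usual sheet-resistance approximation, R = Rs × L / W. The sketch below is a simplified assumption rather than the authors' procedure: it treats a net as a single uniform wire on one layer, ignores vias, and uses made-up numbers and function names.

```python
import math


def wire_resistance(length_um, sheet_res_ohm_sq, width_um):
    """Resistance of a uniform wire: sheet resistance times number of squares."""
    return sheet_res_ohm_sq * length_um / width_um


def width_for_resistance(length_um, sheet_res_ohm_sq, target_ohm,
                         min_width_um, grid_um):
    """Smallest on-grid width that meets the resistance target."""
    ideal = sheet_res_ohm_sq * length_um / target_ohm
    # Round before ceil so floating-point noise does not add a grid step.
    steps = math.ceil(round(ideal / grid_um, 9))
    return max(min_width_um, steps * grid_um)


w = width_for_resistance(length_um=200.0, sheet_res_ohm_sq=0.05,
                         target_ohm=10.0, min_width_um=0.1, grid_um=0.005)
print(w, wire_resistance(200.0, 0.05, w))
```

The wider wire this produces is also why the first routing pass needs to reserve space, for example through a non-default rule as the abstract notes.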

Mart Guo, NXP

Low Power Optimization in Innovus Implementation with “Low Power Advisor”

Chip power is as important as performance and area in physical implementation. Compared with timing and area, however, chip power is application dependent and not easy to evaluate or constrain. Moreover, low-power implementation know-how is experience based and scattered. “Low Power Advisor” is proposed to turn abstract targets into concrete, detailed indices and, based on them, to provide suggestions that support low-power implementation and optimization.

In “Low Power Advisor”, “Instance Distribution Check” and “Cell Distribution Check” report the ratios of different Vt types, the ratios of cells with different channel lengths, the cell size distribution, the timing buf/inv ratio, the hold-fix cell ratio, and so on. A reminder message is issued if a value is out of bounds.

“Clock Tree Check” reports clock tree efficiency. Clock buf/inv area per flop and clock latency are listed and checked to see whether the clock tree is well built, and suggestions are given for possible improvements. “Clock Tree Check” also reports the clock tree transition distribution and proposes ways to avoid overdesign.

“Wire Routing Check” inspects both signal wires and clock wires and reports total wire length and average wire length per gate count (for clock nets, average wire length per flop). Designs with wire detours are not low-power friendly and need careful attention. For clock nets, decap and frequency are also reported for optimization.

“Input Cap Check” reports the input capacitance of all library cells and hard macros and highlights pins whose capacitance exceeds the benchmark.

“Tool Option Check” reviews all tool setup in Innovus. Unlike timing and routing options, some power-friendly options need to be enabled and updated case by case in low-power design. “Tool Option Check” lists the power-friendly options and gives suggestions for tool setup, and the designer makes the decision with timing and routing in mind.

“Low Power Advisor” has further features such as “Timing Margin Check”, “Clock Gating Check”, and “MBFF Ratio Check”. Several MCU chips were checked with “Low Power Advisor”. The results show different power optimization opportunities on each, and detailed data is shared.
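
A distribution check of this kind reduces to counting cells by category and comparing each ratio with a bound. The Python sketch below shows the idea for Vt types. The function name, the input format, and the 20% LVT limit are hypothetical and are not taken from “Low Power Advisor”.

```python
from collections import Counter


def vt_ratio_check(instance_vt, max_ratio):
    """Return {vt: (ratio, out_of_bound)} for a list of per-instance Vt labels.

    max_ratio maps a Vt label to its allowed upper bound; labels without
    a bound are reported but never flagged.
    """
    counts = Counter(instance_vt)
    total = sum(counts.values())
    if total == 0:
        return {}
    report = {}
    for vt, n in counts.items():
        ratio = n / total
        limit = max_ratio.get(vt)
        report[vt] = (ratio, limit is not None and ratio > limit)
    return report


design = ["LVT"] * 3 + ["SVT"] * 5 + ["HVT"] * 2
for vt, (ratio, flagged) in vt_ratio_check(design, {"LVT": 0.2}).items():
    note = "  <-- out of bound" if flagged else ""
    print(f"{vt}: {ratio:.0%}{note}")
```

The same pattern would extend to channel-length, buf/inv, or hold-fix cell ratios by changing the label that is counted.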

Glen Ge, NXP

The Application of Cadence Cerebrus in the Physical Implementation of High-Performance Digital IPs

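As IC manufacturing processes advance and product performance requirements keep rising, chips keep growing in scale, which makes it harder for a whole chip to meet its design requirements within the expected schedule. Cerebrus is based on machine learning (ML) and works with the full-flow physical design platform. By building ML models from the large volume of data produced during synthesis and physical implementation, it generates many combinations of design parameters and analyzes and iterates on the results of all combinations, which both shortens the iteration cycle and improves chip performance and power. Using Cerebrus's efficient automated flow control, we physically implemented a digital IP with high-performance, low-power requirements. Compared with the results of the traditional manual approach, design closure was faster, frequency improved by about 13.4%, and power dropped by 5.9%.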

Xinyi Zou, VeriSilicon

Power and Area Optimization Based on Genus+Innovus Flow of Cadence Cerebrus

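The pursuit of performance, power, and area (PPA) has become a consensus in IC design. At advanced process nodes in particular, PPA is a key measure of overall design quality, and for blocks that are cloned many times in a large SoC the pursuit of PPA becomes even more extreme. This paper presents an optimization approach based on Cadence's Genus and Cerebrus tools that improves PPA jointly through optimization at synthesis and at each stage of back-end PR. The final results show that, with timing and DRC essentially closed, Cerebrus reduces power by 4.1% and area by 5% compared with Innovus, and the Genus+Cerebrus flow reduces power by 7% and area by 8.5%, a substantial reduction in chip area and power.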

Shuo Wang, Sanechips

Voltus InsightAI

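More advanced semiconductor processes bring higher chip integration, but they also keep raising current density, or power density, per unit area. Higher resistance in the chip's power network combined with simultaneous switching of densely packed transistors causes IR drop, which in turn leads to timing and functional problems. IR-EM has become one of the most pressing challenges in chip design. Quickly finding the cause of IR drop and providing a fast fix has become an important part of chip design. Based on the Voltus InsightAI feature, this paper found a fast method for fixing IR drop that fixed up to 95% of IR drop violations without degrading timing or DRC.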

Zhiyong Zeng, Enflame

Application of Efficient Analysis and Repair of IR Drop Based on Voltus InsightAI Technology

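With the adoption of advanced processes, power integrity in high-performance chips has become a major design challenge, and engineers face large numbers of EMIR violations at power integrity signoff. Voltus InsightAI uses breakthrough AI technology to complete incremental IR analysis quickly, which improves engineering efficiency, surfaces problems early, and delivers key productivity gains. This paper describes the back-end implementation of an ARM high-performance core in Innovus, in which Voltus InsightAI automatically fixes IR drop problems while taking timing and design rule checks (DRC) into account. At the postCTS and postRoute stages, the tool automatically applies different optimization strategies to reduce IR drop and accelerate power integrity closure. The method accurately identifies the real high-risk causes of IR drop and fixes them in a targeted way, which greatly reduces the timing and design rule violations that traditional fixes cause in IR drop regions and cuts manual effort.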

Tianyu Zhuang, Sanechips

Fixing IR Drop Based on Pegasus PG Fill

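As process nodes evolve, chip integration rises, metal line widths in advanced processes keep shrinking, and transistor density per unit area keeps increasing. IR drop problems become more and more pronounced, and fixing the resulting violations effectively has always been difficult. In this paper, on a low-voltage project, we applied the PG Fill flow of the Pegasus tool to strengthen the PG in large areas of IR drop violations. The fix was clearly effective: it significantly reduced both the number of IR drop violations and the worst violation value, helping the project close smoothly. The flow uses Cadence's PV signoff tool to add PG fill locally and fixes IR drop violations with little impact on PV and STA. This gives back-end design engineers an effective IR drop fixing method and is significant for improving the efficiency and quality of design closure.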

Siji Qu, Sanechips

PCB, Package Design, and System SI/PI Thermal Simulation

Intelligent Multiphysics System Analysis

Andrew Wang, Cadence

Accelerating PCB Design with Allegro X Platform

Steve Durrill, Cadence

Optimization of High-Speed SerDes ERL and COM Based on Optimality AI

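SerDes data rates keep rising, which poses a major challenge for passive channel design. High-speed SerDes (for example, 112Gbps SerDes) therefore introduce evaluation metrics such as ERL and COM. Unlike traditional metrics such as insertion loss, return loss, and crosstalk, they account for termination matching across the whole link and for the effect of TX/RX equalization, so they better reflect the real SI performance of the SerDes. An individual local region is often optimized to its best using traditional frequency-domain metrics, yet the cascaded full link may still fall short of the best performance. Optimizing the cascaded full link by traditional parameter sweeps is almost infeasible, because there can be more than ten sweep variables. This paper proposes using Optimality AI optimization together with Clarity3D to model the vertical via structure of the package automatically and parametrically, importing the model into Topology Workbench to cascade the circuit topology (adding the bump region, Tline region, and PCB region), and finally calling external COM/ERL scripts to compute the optimization objective, thereby optimizing a high-speed SerDes package design. This Optimality AI-driven flow iterates parameter updates without user intervention, computing the objective function and constraints until the convergence criteria are met, which lowers the difficulty of high-speed SerDes design.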

Shineng Ma, Sanechips

How Standards-Based Protocols Are Imperative for the AI Workloads of Tomorrow

Arno Li, Cadence

Electrical and Thermal Simulation Analysis With Celsius Studio

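With the arrival of ChatGPT, a wave of enthusiasm for artificial intelligence has followed, and GPUs have become the cornerstone of large-model AI training platforms, even the decisive computing foundation. Against this background, how to use existing SI/PI/TI software to evaluate whether a GPU server runs stably in a complex system is an important question. For designers, the traditional approach evaluates PCB performance through prototyping and later testing, which wastes time and design cost. Is there a way to control risk points early in the design? This paper shares simulation software based on Cadence Celsius that combines the SI, PI, and TI domains to simulate the reliability of the PCB itself. The simulation covers two parts: the effect of TI on power integrity on the PCB, and the effect of TI on signal integrity. The results are presented as data and charts, helping to ensure product quality at the design stage.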

Ji Zhu, Inventec

In-Depth Research About the Efficiency of Optimality Intelligent System Explorer In Time Domain Simulation of Multiple Design Cases

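As product data rates and complexity keep rising, simulation must not only be highly accurate but also highly efficient. For specific signal modules, such as DDR systems or high-speed serial links, rising data rates increase the desire for simulation to deliver an “optimal” configuration, for example the optimal ODT setting for DDR5 devices or the optimal emphasis and equalization settings for high-speed signal chips. How, then, can a relatively optimal set of parameters be chosen from hundreds or thousands of combinations? Traditional software can only screen them through extensive sweeps, which costs a great deal of both simulation time and engineering effort. Using the Optimality software, this paper shares several concrete simulation cases that show the software's intelligence: it helps us pick the optimal parameters faster, makes DDR and high-speed serial simulation easier, and demonstrates the efficiency of Optimality.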

Gang Huang, EDADOC

Cadence PFM Collaboration with PDM

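This talk shares PFM, the Cadence solution for data exchange between Cadence EDM and the PDM system PTC Windchill. It covers how Schneider Electric defines and manages the architecture of ECAD bills of materials, technical documents, and related data; how the data flow model between Cadence and Windchill is built; and how Cadence PFM supports the whole data architecture and keeps it running efficiently. It also shares experience from the PFM deployment, including the pitfalls encountered along the way.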

Chenchen Hu, Schneider Electric

Application of Electrical and Thermal Co-Simulation in System Products

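With the trend toward high density and high power in communication products, power voltage-drop simulation at a fixed temperature alone cannot reflect real conditions. Thermal behavior must be modeled accurately and electrothermal coupling taken into account to evaluate board-level voltage drop precisely.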

Nan Hou, ZTE

Verification

Cadence AI-Driven Verification Solution and Dynamic Duo New Features Update

Matt Graham, Cadence

Emulation Moves Into 4-State Logic and Real Number Modeling

Michael Young, Cadence

Maximizing Coverage and Accelerating RISC-V Verification with SimAI

Amit Dua, Cadence
Yu Zhi, Sanechips

Accelerating Verification Efficiency with Cadence AI Technology

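As hardware designs grow in scale and complexity, closing verification on schedule becomes increasingly difficult, and simply adding CPU cores and running more parallel tests treats the symptoms rather than the cause. Getting key verification metrics to converge on target under market pressure and before a strict tapeout date is a hard problem for verification engineers. To address it, Cadence introduced the Xcelium ML App, which uses machine learning (ML), and the AI-driven Verisium apps to improve verification efficiency. The former uses machine learning to optimize simulation regressions and produce a more compact, compressed regression. The optimized regression reproduces almost the same coverage as the original and finds design bugs quickly by exercising corner-case scenarios that the existing randomized testbench can produce. The latter include Verisium AutoTriage, Verisium SemanticDiff, Verisium WaveMiner, Verisium PinDown, and others. Verisium SemanticDiff helped Renesas identify failure causes quickly and is more efficient than traditional diff tools. SemanticDiff focuses on the design context and provides more relevant difference analysis. Checking the history files of diff commands one by one is tedious, and the SemanticDiff app greatly shortens debug time and significantly improves efficiency. Verisium WaveMiner identifies points of difference efficiently, and users can visualize the differences between PASS and FAIL runs and conveniently compare their waveforms and source code. This paper mainly describes the usage flow of the Verisium apps, presents the efficiency gains obtained on real projects, and looks ahead to future applications.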

Jiashan Xu, Sanechips

AI-Accelerated FPV Lib

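Compared with traditional dynamic simulation, formal verification can traverse the entire state space of the design under test and accurately covers corner cases even in structurally complex designs. This exhaustive stimulus ensures verification completeness and makes formal verification more reliable and efficient. Formal tools generally provide a variety of packaged apps, which fall broadly into function-agnostic apps and function-aware apps. Function-agnostic apps, such as Conn, UNR, and Xprop, need no knowledge of what the block does and give a high return for a small investment. Function-aware apps require the user to understand the specific functions of the block under test. They cost more than function-agnostic apps but can yield very large returns, as with FPV and C2RTL. Among function-aware apps, FPV (Formal Property Verification) is one of the most rewarding. It is a form of model checking: the user-written properties and the RTL code are abstracted together into a CNF (Conjunctive Normal Form) expression, which the different SAT (satisfiability) solvers in the formal tool then attempt to prove. FPV places high demands on the user, however: writing high-quality properties depends on deep understanding of the design and rich formal verification experience. FPV is also limited by design size, and for blocks with large state spaces, property proofs consume considerable time and server resources. To address these challenges, Sanechips (ZTE Microelectronics) proposed an AI-accelerated reusable FPV platform library. For designs with similar functions, a common set of property code and accompanying documentation is developed and reused both within a project and across projects. With the support of the Cadence JasperGold Proof Master tool, the SAT solvers used for the CNF abstracted from the platform library are recorded and stored as a database, which is reused on other blocks with similar functions. The FPV platform library plus the AI database greatly reduces property development time and runtime and improves FPV efficiency and quality. The flow has been used in a Sanechips automotive-grade project: for an access-permission control block, a reusable platform library was built with Cadence AXI ABVIP and Scoreboard, and Proof Master was used to train the corresponding AI database for reuse across blocks. Property reuse reached 75%, FPV runtime fell by 40.15%, and formal verification time was cut by more than half, greatly helping FPV work converge quickly.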

Sihang Shang, Sanechips

Deployment and Acceleration of High Performance Processor Co-Sim Verification Framework on Palladium

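In recent years, open-source hardware and agile hardware development have attracted growing attention from academia and industry. The XiangShan open-source high-performance RISC-V processor project aims to be an open-source platform for architecture innovation that serves the world. Since June 2020 it has pursued microarchitecture innovation and agile development practice, and it has already iterated through three generations of processor chips that meet the evaluation standards of industrial-grade high-performance processors in performance, area, and power. The latest Kunminghu architecture scores 15 points/GHz on SPEC CPU2006 and is currently the highest-performance open-source processor core in the world. In the agile development flow for high-performance processors, verification is an important means of ensuring functional correctness and also the most time-consuming step. The XiangShan team proposed DiffTest, a hardware/software co-simulation verification approach based on a reference model: it extracts the processor's running state in real time, drives the reference model, and compares the results, achieving instruction-accurate system-level verification. The DiffTest differential verification framework thoroughly verifies that the processor's behavior conforms to the instruction set specification and pinpoints the site of an error. Because hardware/software co-simulation requires deploying a software-side reference model and supporting interaction between the processor and the reference model, our team previously deployed the whole framework directly on software RTL simulators such as Verilator. As processor designs grew, however, the speed of software RTL simulation dropped sharply, and the agility gap between chip design and verification kept widening, forming a “verification wall” that hinders fast iteration in chip development. Improving both the simulation speed and the bug-detection and debug capability of the verification framework has become the key to accelerating the agile development flow for processor chips. Palladium greatly accelerates hardware simulation and supports hardware/software co-deployment with IB communication bandwidth of up to 200MB/s, giving new momentum to agile verification of high-performance processors. This paper describes how to deploy the DiffTest co-simulation verification framework on Palladium to locate bugs efficiently in the XiangShan high-performance processor core. The basic co-verification scheme requires the processor and the reference model to execute in turn and to synchronize in real time once every clock cycle, which means frequent communication, large data transfers, and frequent synchronization interrupts. Our team carried out a series of acceleration efforts tailored to the Palladium platform, using optimizations such as semantic compression of the checked state, packet merging, and asynchronous non-blocking operation to reduce the extra communication overhead. Without reducing bug-detection or debug capability, the optimized co-verification method runs more than 38 times faster than the unoptimized version and more than 100 times faster than pure software simulation. The framework has been applied successfully in the real development and verification flow of the XiangShan high-performance processor and has helped locate complex bugs, such as memory inconsistencies and processor page-fault exceptions, in scenarios including multi-core verification, H-extension debug, and full processor-core verification.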
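
The core idea of differential checking, and of merging packets to cut synchronization traffic, can be sketched in a few lines of Python. This is a conceptual illustration only: `CommitEvent`, `first_mismatch`, and `merge_packets` are hypothetical names and do not reflect the actual DiffTest interfaces or its Palladium transport.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommitEvent:
    pc: int     # program counter of the committed instruction
    rd: int     # destination register index
    value: int  # value written to the destination register


def first_mismatch(dut_events, ref_events):
    """Return the index of the first event where DUT and reference differ."""
    for i, (dut, ref) in enumerate(zip(dut_events, ref_events)):
        if dut != ref:
            return i
    return None


def merge_packets(events, batch_size):
    """Group events so that one transfer carries many commits."""
    return [events[i:i + batch_size] for i in range(0, len(events), batch_size)]


ref = [CommitEvent(0x80000000 + 4 * i, i % 32, i * 3) for i in range(100)]
dut = list(ref)
dut[57] = CommitEvent(ref[57].pc, ref[57].rd, ref[57].value + 1)  # injected bug

print("transfers without merging:", len(dut))
print("transfers with merging:   ", len(merge_packets(dut, 32)))
print("first mismatch at commit: ", first_mismatch(dut, ref))
```

Merging trades a little detection latency for far fewer synchronizations, while the comparison still pinpoints the exact commit that diverged.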

Luoshan Cai & Kunlin You, BOSC

Efficient SoC Performance Analysis with Cadence Verification Suite

With the rise of intelligent driving technology and AI applications, the concept of computing power is gaining traction in SoC chip design. Computing power, defined as the ability to complete computing tasks within a specified timeframe, plays a crucial role in maximizing the performance of SoC chips. Achieving high computing power entails optimizing the internal design of computing engine IPs (such as CPU, GPU, NPU) as well as the overall SoC architecture and performance of critical data paths.

As chip design technology advances, there is a growing demand for performance analysis of SoC chips. Optimizing SoC performance involves updating the top-level architecture, which can extend the development timeline. Failure to identify performance bottlenecks early on may result in SoC products losing their competitive edge in the market. Thus, efficient performance analysis in the pre-silicon design phase is essential for enhancing chip design quality and market competitiveness.

However, the increasing complexity of SoC design poses challenges to performance analysis. Critical performance paths within a chip involve various components like IPs, buses, and memory interfaces, where bottlenecks can stem from individual IPs or the coordination among different components. Moreover, different data paths in the SoC have diverse performance requirements: for instance, the CPU prioritizes low latency, the NPU focuses on bandwidth, and the image display path demands real-time performance.

To address these challenges, performance analysis in SoC design can be viewed as a complex data fusion process. It encompasses defining performance goals, collecting data samples, correlating and analyzing multiple data sets to pinpoint actual performance values and bottlenecks. While performance analysis is not a new concept, the complexity of modern chip design makes these tasks error-prone and time-consuming.

Fortunately, tools like Cadence Palladium Emulator, Xcelium, and System VIP can streamline the data collection process, reducing the time required to gather data samples. Additionally, tools like Cadence Indago Python API and other scripts simplify data extraction and analysis, presenting key performance metrics through visual charts. These tools aid in understanding chip performance under different scenarios, facilitating the identification of patterns and bottlenecks.

This paper aims to delve into efficient performance analysis methods in chip design with the support of Cadence tools. It outlines techniques for analyzing different data paths in SoC, explores strategies for enhancing performance during the design phase, and utilizes examples to illustrate the performance analysis and improvement process effectively.
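
As a rough illustration of the data-analysis step, the Python sketch below derives average latency and sustained bandwidth per bus master from a list of transaction records. The record layout and the function name are assumptions made for this example. It does not use the Indago Python API or any other Cadence interface.

```python
from collections import defaultdict


def summarize(transactions):
    """Compute per-master latency and bandwidth.

    transactions: iterable of (master, start_ns, end_ns, num_bytes).
    Returns {master: {"avg_latency_ns": ..., "bandwidth_gbps": ...}},
    where bandwidth is total bytes over the master's active window
    (bytes per nanosecond equals gigabytes per second).
    """
    grouped = defaultdict(list)
    for master, start, end, nbytes in transactions:
        grouped[master].append((start, end, nbytes))

    summary = {}
    for master, recs in grouped.items():
        window = max(e for _, e, _ in recs) - min(s for s, _, _ in recs)
        total_bytes = sum(b for _, _, b in recs)
        summary[master] = {
            "avg_latency_ns": sum(e - s for s, e, _ in recs) / len(recs),
            "bandwidth_gbps": total_bytes / window if window > 0 else 0.0,
        }
    return summary


log = [
    ("cpu", 0, 100, 64),
    ("cpu", 100, 300, 64),
    ("npu", 0, 1000, 4096),
]
for master, stats in summarize(log).items():
    print(master, stats)
```

Comparing these numbers against each path's goal (latency for the CPU, bandwidth for the NPU) is one simple way to spot which path misses its target.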

Yibin Jia, Innosilicon

GLST Acceleration with Xcelium Advanced Feature

Mingjie Wu, UNISOC

Functional Safety Solution and Practice Based on Verisium Manager Safety

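Autonomous driving and new-energy vehicles have developed rapidly in recent years, and demand for automotive electronics has grown. Their special application scenarios place higher requirements on electronic devices and systems, and functional safety has become a hot research topic. Developing semiconductor chips that comply with safety standards is now critical, and functional safety verification faces major challenges. Cadence provides a comprehensive Safety Solution that supports failure mode diagnostic analysis and fault campaign management, greatly improving verification efficiency and speeding compliance with safety standards. This paper presents the successful use of the Cadence Safety Solution in a project. Fault injection simulation is run with the serial or concurrent engine of Xcelium Fault Simulator, while the Jasper FSV App performs formal fault analysis to identify equivalent faults and unobservable points, accelerating functional safety verification. Verisium Manager Safety links the simulator and the formal tool chain, manages the whole fault campaign, and automatically invokes the various safety engines to run complex fault injection simulations. Finally, fault reports are generated and the simulation results are back-annotated to Midas, reaching the ASIL D safety level of the ISO 26262 standard.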

Xue Zheng, Sanechips

Jenkins Pipeline Technology and Cadence Verisium Manager Plug-In for Automated Simulation Verification

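Chip designs keep growing as product requirements become more complex, and daily regressions run by verification engineers with traditional scripts are time-consuming and make poor use of server resources. Since Jenkins is widely used in automated integration and operations, Cadence released a Jenkins-based Verisium Manager plug-in that brings automated integration to the chip simulation flow. The plug-in supports both freestyle and pipeline usage, supports static or dynamic vAPI calls, and supports logging in to Verisium Manager with user credentials, and it presents simulation results clearly. This paper is the first to combine the latest Jenkins pipeline and shared-library technology with the Verisium Manager plug-in. Using the flexible trigger mechanism of Jenkins jobs and the credential security of shared libraries, we quickly deployed an automated integration environment based on Cadence Verisium Manager, sparing verification engineers tedious environment setup and the writing of Verisium Manager invocation scripts. The environment was rolled out and tested in real projects, achieving continuous integration and simulation delivery for chip verification and improving verification delivery efficiency.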

Kai Liu, Sanechips