6nm Design Implementation with Cadence iSpatial Full Flow
At advanced process nodes (e.g., 6nm), design implementation schedule and timing convergence remain challenging during the physical design phase. The placement stage can be one of the most important and most time-consuming steps in the physical implementation flow, since both circuit timing and routing congestion must be considered at this stage. To account for physical layout effects during synthesis, it is therefore preferable to adopt a full flow that bridges the quality gap between physically-aware synthesis and physical placement, reducing design iterations. iSpatial integrates the Innovus Implementation System’s GigaPlace placement engine and GigaOpt optimizer into Genus. With these unified core engines, design performance can be accurately predicted at the synthesis phase, which enables faster turnaround time for RTL regression. Moreover, only incremental optimization steps are needed after iSpatial, so the full flow reduces total design time. In this presentation, we share our iSpatial-based design flow, relevant tool settings, and results comparing the traditional design flow with the iSpatial-based full flow on our designs.
Accurate Timing Analysis for Hierarchical Design
For hierarchical designs with low-power schemes, each block has its own power state table (PST), and EDA tools integrate all block PSTs by expanding them into an SoC-level PST according to IEEE standard definitions. This method causes two problems. First, the SoC-level PST result becomes hard for a human to predict when integrating a large enough design, e.g., over 10 blocks. Second, nothing tells the user why the SoC-level PST comes out empty, incomplete, or 100X larger than expected. Incomplete PSTs can degrade low-power implementation and verification quality, whereas integration with 100X power states can make CLP (Conformal LowPower) run time 2X to 10X longer. In our experience, manually debugging such an issue for a design with 10+ blocks and 100+ power states usually takes 2~3 weeks. Hence, we propose a two-part solution. First, a CLP-based debugging flow finds the root cause of an incomplete or oversized PST. Second, MTK PST integration results make the user aware of an unexpected PST size. By adopting our solution, integrators can complete SoC-level PST integration efficiently instead of by trial and error, reducing not only the trial period from 2~3 weeks to 2~3 days but also the size of the SoC-level PST to what the user expects.
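To make the PST size problem concrete, the following is a minimal sketch of how merging block-level PSTs can yield an empty or very large SoC-level table. It is a simplified model and not the IEEE-defined merge semantics or the CLP algorithm: the function name `merge_psts`, the supply names, and the state values are all hypothetical, and a state is modeled as a plain mapping from supply name to value.

```python
from itertools import product


def merge_psts(block_psts):
    """Merge per-block PSTs into one SoC-level table (simplified model).

    Each PST is a list of states; each state maps a supply name to a value
    such as "ON" or "OFF". Two states are compatible when they agree on
    every supply they share. All names here are hypothetical.
    """
    merged = [{}]  # one empty state: nothing constrained yet
    for pst in block_psts:
        next_merged = []
        for soc_state, block_state in product(merged, pst):
            shared = soc_state.keys() & block_state.keys()
            if all(soc_state[s] == block_state[s] for s in shared):
                combined = {**soc_state, **block_state}
                if combined not in next_merged:
                    next_merged.append(combined)
        merged = next_merged
    return merged


# Ten blocks with fully independent supplies: the state count multiplies.
independent = [[{f"VDD_{i}": "ON"}, {f"VDD_{i}": "OFF"}] for i in range(10)]
print(len(merge_psts(independent)))

# Two blocks that disagree on a shared supply: no legal SoC state survives.
conflict = [[{"VDD_SHARED": "0.8V"}], [{"VDD_SHARED": "0.9V"}]]
print(len(merge_psts(conflict)))
```

In this model, blocks with no supplies in common multiply their state counts, which is one way a table can grow far beyond expectation, while a single disagreement on a shared supply removes every combination and leaves an empty table.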
Hierarchical IR Drop Analysis with Power Calculation with Distributed Processing (Power-DP) and Extreme Power Grid View (XPGV) Flow
MediaTek is the largest fabless IC design house and has different product lines. As chips become larger and more complex, power integrity analysis has become a dominant stage in time-to-market. To meet tape-out schedules, we focus on the critical items of capacity, run time, and an efficient working model. In this session, we show Cadence Voltus technologies for hierarchical IR-drop analysis with the XPGV model and Power-DP. Hierarchical IR-drop analysis is a new technology that simplifies the power/ground mesh and saves it as an XPGV model without accuracy loss. We present XPGV model generation, XPGV model accuracy, and runtime reduction. The results show that the IR-drop difference between fully flattened and hierarchical IR-drop analysis is smaller than 1%, while hierarchical analysis reduces rail analysis runtime by around 50%, depending on the number of XPGV models in the setup. Furthermore, Cadence Voltus provides a distributed platform for power calculation called Power-DP. It reduces peak memory by about 50% and power analysis run time by 30~70%. Thus, we are able to use multiple small machines to handle large designs. At the end of our session, we propose an efficient working flow and demonstrate how to handle large designs to meet a tight tape-out schedule.
Physically-Aware Synthesis. Antidote or Placebo?
Over a decade ago, physically-aware synthesis technology was announced, heralding a new era that moved logic synthesis toward a physical solution. Before this technology, wire R/C (resistance and capacitance) was modeled by a WLM (Wire Load Model) generated from foundry statistical data. With physically-aware synthesis, wire R/C accuracy can improve dramatically because the tool performs real placement and extracts wire R/C based on specific physical locations. Synthesis results can therefore be close to those at the PD (Physical Design) placement stage, and the over-constraining and over-optimization required by WLM synthesis become unnecessary. However, at advanced process nodes, process design rules have become much more complicated than before, and this seriously impacts physically-aware synthesis. In real cases, we see very different results between synthesis and post-placement. We therefore realize that considering placement and physical location alone is not enough at advanced process nodes. Even with physically-aware synthesis, over-constraining and over-optimization are still necessary, as in WLM synthesis, and the advantage of physically-aware synthesis disappears. In this work, we identify the key items that impact physically-aware synthesis accuracy. Based on these findings, we compare the engine and flow differences between Cadence’s older synthesis solution and the latest one, Genus iSpatial. Furthermore, real project cases show that Genus iSpatial significantly improves PPA (Performance, Power, Area) correlation. With this improvement, we also see benefits to timing closure and PPA. Finally, we offer advice on Genus iSpatial for further quality enhancement.
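The difference between a WLM estimate and a placement-based estimate can be illustrated with a small sketch. It is only an illustration under stated assumptions and not how Genus or any foundry model computes wire parasitics: the fanout-to-length table, the capacitance per micron, and both function names are hypothetical, and half-perimeter wirelength stands in for real routing and extraction.

```python
# Hypothetical WLM: estimated wire length (um) looked up by fanout only.
WLM_LENGTH_UM = {1: 10.0, 2: 20.0, 3: 35.0, 4: 50.0}
# Hypothetical wire capacitance per micron, in fF.
CAP_PER_UM_FF = 0.2


def wlm_wire_cap(fanout):
    """Wire cap from the fanout table; pin locations are never consulted."""
    clipped = min(max(fanout, 1), max(WLM_LENGTH_UM))
    return WLM_LENGTH_UM[clipped] * CAP_PER_UM_FF


def placed_wire_cap(pins):
    """Wire cap from the half-perimeter of the bounding box of placed pins."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    hpwl = (max(xs) - min(xs)) + (max(ys) - min(ys))
    return hpwl * CAP_PER_UM_FF


# Two nets with the same fanout (one driver, two sinks), placed differently.
near_net = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0)]
far_net = [(0.0, 0.0), (100.0, 0.0), (0.0, 80.0)]

print(wlm_wire_cap(2))           # same estimate for both nets
print(placed_wire_cap(near_net))
print(placed_wire_cap(far_net))
```

Because the table sees only fanout, both nets receive the same WLM estimate, while the location-based estimate separates the short net from the long one. The sketch does not capture the design-rule effects at advanced nodes that the abstract identifies as a further source of miscorrelation.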
Saving Next-Gen 112G LR and Extremely Short Reach (XSR) SerDes Power
The 112G LR (Long Reach) and XSR (eXtreme Short Reach) SerDes (Serializer/Deserializer) are designed for next-generation data center ASICs. Because of their high-bandwidth, high-speed data transmission behavior, the power consumption of every part of the design has to be reviewed carefully. In this presentation, we show how we implement designs with Innovus pattern-based power optimization to save both switching and internal power at each stage. Moreover, we reclaim as much power as possible under PBA sign-off timing with Tempus total power ECO, and finally meet our goal.
Timing Closure by Machine Learning and SOD
Machine learning is a new technique for improving postRoute timing, and SOD is very powerful for AOCV timing closure in Innovus. We applied these two techniques to our designs in the UMC 28nm process and obtained good results.