Energy-Efficient On-Device AI Platform to Accelerate AI Everywhere for Everyone

  • Domain-specific, extensible, and configurable AI platform built upon mature, volume production-proven Tensilica architecture
  • Industry-leading performance and energy efficiency for on-device AI applications
  • Comprehensive, common AI software addresses all target markets
  • Low-, mid-, and high-end AI product families for the full spectrum of PPA and cost points
  • Scalable from 8 GOPS to 32 TOPS currently and 100s of TOPS for future AI requirements

As demand has increased for AI-based tasks in a wide range of applications and vertical segments, on-device and edge AI processing has become more and more prevalent. As these solutions are deployed in SoCs with varying computational and power requirements, meeting the market needs for a wide range of automotive, consumer, industrial, and mobile applications can be a challenging task for both silicon IP providers and SoC companies.

The comprehensive Cadence® Tensilica™ AI platform helps SoC developers design and deliver their solutions faster, based on their KPIs and requirements. The platform includes three AI product families optimized for varying data and on-device AI requirements to provide optimal power, performance, and area (PPA) coupled with a common software platform. These deliver the scalable and energy-efficient on-device to edge AI processing that is key to today’s increasingly ubiquitous AI SoCs. Cadence’s comprehensive AI platform spanning the low-, mid-, and high-end classes is built upon the highly successful, application-specific Tensilica DSP architecture.

These three AI product families are AI Base, AI Boost, and AI Max:

  • AI Base includes Tensilica HiFi DSPs for audio/voice, Vision DSPs, and ConnX DSPs for radar and communications, combined with AI instruction-set architecture (ISA) extensions.
  • AI Boost adds a companion neural network engine, initially the Tensilica AI NNE 110 engine, which scales from 64 to 256 GOPS and provides concurrent signal processing and efficient inferencing.
  • AI Max includes the turnkey Tensilica AI NNA 1xx accelerator family (currently the Tensilica AI NNA 110 accelerator and the NNA 120, NNA 140, and NNA 180 multi-core accelerator options), which integrates the AI Base and AI Boost products. The multi-core NNA accelerators can scale up to 32 TOPS, while future NNA products are targeted to scale to 100s of TOPS.

In addition, common AI software designed to accommodate all workloads and markets streamlines product development and enables easy migration as design requirements evolve. Tensilica AI products can run all neural network layers, including but not limited to convolution, fully connected, LSTM, LRN, and pooling.

[Figure: Tensilica AI Platform Graphic]

Low-, Mid-, and High-End AI Platforms for the Full Spectrum of Performance, Power, and Cost Points

AI platforms with extensibility, configurability, and sparse compute engine

Scalable Design to Adapt to Various AI Workloads

AI Base is built on successful, power-efficient domain-specific DSPs. Scalable AI Boost and AI Max products range from low (<1 tera operations per second (TOPS)) to very high (100s of TOPS) compute needs.

Efficient in Mapping State-of-the-Art DL/AI Workloads

Best-in-class performance for inferences per second with low latency and high throughput

End-to-End Software Toolchain for All Markets and Large Number of Frameworks

GLOW-based Xtensa® Neural Network Compiler (XNNC), interpreter, and delegate-based AI software tools

True Random Sparsity Gain

Sparse compute AI engine exploits tensor sparsity (both weights and activations)
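
As a rough illustration of why sparsity in both weights and activations matters, the sketch below counts the multiply-accumulates a dot product needs when every pair with a zero operand is skipped. It is a software model for intuition only: the function name and the sample vectors are hypothetical, and it does not describe how the Tensilica engine is implemented.

```python
def sparse_dot(weights, activations):
    """Dot product that skips any pair with a zero operand.

    Returns (result, macs_performed). Illustrative only; not a model
    of any specific hardware.
    """
    if len(weights) != len(activations):
        raise ValueError("weights and activations must have equal length")
    total = 0
    macs = 0
    for w, a in zip(weights, activations):
        if w == 0 or a == 0:
            continue  # the product is zero, so this MAC can be skipped
        total += w * a
        macs += 1
    return total, macs


# Hypothetical sample data with zeros in both operands.
weights = [3, 0, -2, 0, 5, 0, 0, 1]
activations = [1, 4, 0, 0, 2, 7, 0, 3]

result, macs = sparse_dot(weights, activations)
dense_macs = len(weights)
print(f"result={result}, MACs performed={macs} of {dense_macs}")
```

In this toy case only the pairs where both operands are nonzero contribute, so the work drops well below the dense count while the result is unchanged.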

Industry-Leading Performance and Power Efficiency

High MAC utilization and TOPS/Watt combined with low energy consumption

Answering the Needs of a Wide Range of End Applications

  • Hearables and Wearables
  • True Wireless Stereo
  • Smart Speakers
  • AR/VR Headsets
  • Automotive
  • Mobile
  • Drones and Robots
  • Intelligent Cameras
  • Private On-Premises Compute

Comprehensive Offerings for AI Hardware and Software

AI Max

  • Turnkey AI product family offers fast time to market
  • Built using AI Base and AI Boost products
  • Integrated DMA support and necessary system logic for multiple cores
  • Tensilica NNA 1xx family of products offers up to 32 TOPS

AI Boost

  • Built on top of AI Base by adding highly scalable and energy-efficient AI engines
  • First AI engine is the Tensilica NNE 110, which offers 32 to 128 8x8 MACs for 64 to 256 GOPS
  • AI engines offer sparse compute to achieve the best performance and energy efficiency
  • Tensor compression for higher memory bandwidth and lower energy
  • Up to 4X higher performance compared to GPU
  • 80% less energy compared to Tensilica DSPs
  • >4X TOPS/W compared to AI Base
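
The MAC and GOPS figures quoted for the NNE 110 are consistent with the common convention of counting each multiply-accumulate as two operations. The sketch below works through that arithmetic. The 1 GHz clock is an assumption inferred from the quoted numbers, not a published specification, and `peak_gops` is a hypothetical helper.

```python
def peak_gops(num_macs: int, clock_ghz: float = 1.0, ops_per_mac: int = 2) -> float:
    """Peak throughput in GOPS = MAC units x ops per MAC x clock in GHz.

    clock_ghz=1.0 is an assumed value chosen to match the quoted range.
    """
    return num_macs * ops_per_mac * clock_ghz


# Smallest and largest MAC configurations quoted for the NNE 110.
for macs in (32, 128):
    print(f"{macs} MACs -> {peak_gops(macs):.0f} GOPS")
```

Under these assumptions, 32 MACs give 64 GOPS and 128 MACs give 256 GOPS, matching the range in the list above.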

AI Base

  • Based on the highly successful Vision and HiFi DSPs, offering from 8 GOPS up to 2 TOPS of AI performance
  • These highly successful DSPs offer an instruction set optimized for the AI workload, the necessary 8x8, 8x16, and 16x16 multipliers, and other instruction improvements for various NN layers
  • All DSPs are VLIW and SIMD architectures that scale from 128-bit SIMD to 1024-bit SIMD
  • Support for fixed, float, and complex data types
  • Configurable and extensible with the Tensilica Instruction Extension (TIE) language
  • >30X higher performance compared to CPU
  • Up to 10X better energy efficiency compared to CPU
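
The SIMD widths listed above determine how many elements a single vector register can hold for a given data type. The sketch below shows that relationship for 8-bit and 16-bit elements at the two endpoint widths. `simd_lanes` is a hypothetical helper, and the lane count alone does not imply any particular per-cycle MAC rate, which depends on the DSP configuration.

```python
def simd_lanes(simd_bits: int, element_bits: int) -> int:
    """Number of elements that fit in one SIMD register."""
    if simd_bits % element_bits != 0:
        raise ValueError("SIMD width must be a multiple of the element width")
    return simd_bits // element_bits


# Endpoint SIMD widths from the list above, with 8- and 16-bit elements.
for width in (128, 1024):
    print(f"{width}-bit SIMD: {simd_lanes(width, 8)} x 8-bit lanes, "
          f"{simd_lanes(width, 16)} x 16-bit lanes")
```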

AI Software

  • Turnkey solution: Automated end-to-end tool chain
  • Support for NN compiler, Android Neural Network (ANN), TFLite delegates, TFLite Micro, interpreters, and a comprehensive NNLib
  • Lightly pruned, quantized models exploit the sparse AI engine
  • Adaptable for various workloads and frameworks
  • Fixed-point quantization approaches floating-point model accuracy
  • Pruning and clustering reduce model size by up to 8X
  • Tensor compression for higher memory bandwidth and lower energy
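
To make the fixed-point quantization point concrete, the sketch below applies a generic symmetric, per-tensor int8 scheme to a handful of weights and measures the round-trip error. This is a textbook technique shown for illustration; it is an assumption, not a description of the quantizer in the Cadence tools, and the function names and sample weights are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization to the int8 range [-127, 127].

    Returns (quantized_values, scale). Generic illustration only.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map int8 values back to approximate floating-point values."""
    return [q * scale for q in quantized]


# Hypothetical floating-point weights.
weights = [0.52, -1.27, 0.03, 0.88, -0.41]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(f"int8 values: {q}")
print(f"scale: {scale:.4f}, max round-trip error: {max_err:.6f}")
```

With round-to-nearest, the per-element error is bounded by half the scale, which is why a well-chosen scale lets fixed-point results stay close to the floating-point model.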