High-level synthesis (HLS) is already in production use, and complementary new technologies and capabilities are on the way. Recently I talked with Michael “Mac” McNamara and Luciano Lavagno – both of whom are speaking at an April 24 DATE workshop on the Future of ESL Synthesis – about their upcoming presentations and their views on the future of HLS.
Luciano Lavagno is a Cadence architect and a professor at Politecnico di Torino, Italy. His DATE presentation is entitled “An ESL-to-GDSII design flow for intellectual property creation, optimization and reuse.” This presentation will discuss a rapidly emerging transaction-level modeling (TLM) based design flow that can co-exist with RTL. The initial focus is silicon IP development, and the key benefit is automated IP reuse. “I’ll be focusing on the productivity gains that come from faster design space exploration,” Luciano said.
In this new TLM-driven flow, Mac said, designers develop algorithms at the transaction level and defer mapping those algorithms onto protocols. “The TLM-driven approach lets you focus on getting the functions correct and abstract away transactions,” Mac said. “Then, you use a tool like C-to-Silicon Compiler to take algorithms that have been proven correct, and combine them together using the best protocols you can use. You then have high confidence that you’ll map to transactions correctly.”
What else is coming up in HLS? What are researchers in academia and industry talking about? In our discussion, Mac and Luciano noted that HLS or adjacent tools may take on capabilities such as the following:
Optimizing communications networks
“In the future, HLS opens up the opportunity to optimize communications networks,” Mac said. He predicted that new technology will allow designers to focus more on interconnect, look at issues like system-wide stalls, bursting, and throughput, and evaluate the need for FIFOs. This technology will better enable designers to balance communications networks. For example, if one task takes 8 cycles and another takes 2 cycles, maybe it’s better to have two tasks that each take 4 cycles.
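The 8-cycle versus 2-cycle example can be illustrated with a toy model. The sketch below assumes the two tasks run as back-to-back pipeline stages, so the slowest stage sets how often a result comes out; that pipelining assumption and the function names are illustrative and do not come from the interview. The cycle counts are taken from the example as given.

```python
def pipeline_interval(stage_cycles):
    """Cycles between successive results in steady state.

    Assumption: stages run concurrently as a pipeline, so the
    slowest stage limits the rate of the whole chain.
    """
    return max(stage_cycles)


def idle_fraction(stage_cycles):
    """Fraction of each interval that every stage spends waiting."""
    interval = pipeline_interval(stage_cycles)
    return [(interval - cycles) / interval for cycles in stage_cycles]


unbalanced = [8, 2]  # one slow task feeding one fast task
balanced = [4, 4]    # the rebalanced split from the example

for name, stages in (("unbalanced", unbalanced), ("balanced", balanced)):
    print(name, pipeline_interval(stages), idle_fraction(stages))
```

Under this model the unbalanced pair produces one result every 8 cycles while the fast task sits idle most of the time, and the balanced pair produces one result every 4 cycles with no idle stage. A real interconnect analysis would also have to account for FIFO depths, bursting, and stalls, which this sketch ignores.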
Cadence once had a tool called Block Oriented Network Simulator (“BONeS”) that focused on network design. That was in the pre-SystemC era. Now, with SystemC as a standard, and production-ready HLS providing the implementation link to silicon, “we’ve created a much better environment to use such a tool,” Mac said.
“Communications is just the first step,” Luciano said. “What we are really looking at is the balancing and cost of both communications and computation.”
Optimizing across multiple blocks
Luciano went on to note that HLS today mainly focuses on optimizing individual blocks. “So now we have two blocks. How do we optimize and synthesize them so both work together, without spending too much hardware to get unnecessary performance out of one block while another becomes a bottleneck?”
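One way to picture the two-block problem is as a joint search over implementation options. The sketch below is a brute-force illustration under stated assumptions: the block names, the area and cycle numbers, and the idea that the slower block sets the system rate are all hypothetical, and this is not a description of how any HLS tool performs the search.

```python
from itertools import product

# Hypothetical implementation options per block: (area, cycles per item).
OPTIONS = {
    "block_a": [(10, 8), (18, 4), (34, 2)],
    "block_b": [(6, 8), (11, 4), (20, 2)],
}


def cheapest_system(options, max_interval):
    """Return (area, interval, choice) for the smallest-area combination
    whose slowest block still meets max_interval, or None if none does."""
    names = list(options)
    best = None
    for combo in product(*(options[name] for name in names)):
        interval = max(cycles for _, cycles in combo)  # bottleneck block
        area = sum(area for area, _ in combo)
        if interval <= max_interval and (best is None or area < best[0]):
            best = (area, interval, dict(zip(names, combo)))
    return best


print(cheapest_system(OPTIONS, max_interval=4))
```

With these made-up numbers, pairing the fastest option for block_a (area 34) with the 4-cycle option for block_b costs 45 units of area but still delivers one item every 4 cycles, the same rate as the 29-unit combination the search returns. That is the "unnecessary performance" the quote warns about.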
Low-power optimization using HLS is another area that is “already more present than future,” he said, “but to get the best out of it we need to go beyond just individual blocks and analyze early in a whole-system context.”
Considering embedded software
HLS is not designed to optimize and generate embedded software. However, as Mac noted, a lot of the input to HLS would otherwise be embedded software. “I think there is an opportunity to more directly accept embedded software input without as much massaging,” he said.
To evaluate the cost-performance of implementing a function in hardware vs. software, C-to-Silicon Compiler today can already perform tasks like loop unrolling and function inlining. This enables designers to trade off hardware cost (i.e., area) against latency and throughput. Going further, Mac said, future tools could potentially recognize pure algorithms within a C program and automatically evaluate the optimal implementation for those. “This starts to get into the ‘reverse synthesis’ space – very difficult but quite interesting,” he said.
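The area-versus-latency tradeoff behind loop unrolling can be sketched with a toy cost model. Everything here is an assumption made for illustration: iterations are taken to be independent, each unrolled copy of the loop body is assumed to need its own hardware unit, and the function name and parameters are hypothetical. It does not reflect how C-to-Silicon Compiler estimates cost.

```python
import math


def unroll_estimate(trip_count, unroll_factor, unit_area=1.0, cycles_per_iter=1):
    """Rough (latency_cycles, area) for a loop unrolled by unroll_factor.

    Assumptions: iterations are independent, each unrolled copy of the
    body gets its own unit of hardware, and memory ports and control
    overhead are ignored.
    """
    passes = math.ceil(trip_count / unroll_factor)
    latency = passes * cycles_per_iter
    area = unroll_factor * unit_area
    return latency, area


# Sweep a few unroll factors for a hypothetical 64-iteration loop.
for factor in (1, 2, 4, 8):
    latency, area = unroll_estimate(64, factor)
    print(f"unroll x{factor}: {latency} cycles, area {area}")
```

In this model each doubling of the unroll factor halves the latency and doubles the area. Real designs usually fall short of that ideal because of data dependencies and limited memory bandwidth, which is part of why exploring these options automatically is useful.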
One other current area of academic research, Luciano said, involves parallelization at a high level of abstraction. This takes on particular importance with the recent trend towards multicore SoCs. Consider, he noted, the need to split and balance the workload in a multi-CPU system and to optimize both hardware and software. This requires new TLM-driven techniques that can collect, analyze and present massive amounts of data at high speed. By working at a higher level of abstraction, designers can run much faster and more realistic simulations, allowing them to analyze the behavior of these multicore systems more accurately and reliably.
The presence of HLS in production environments is no longer speculative, but definitive and growing, as noted in a recent Industry Insights posting. Engineers don’t have to wait for future capabilities to adopt it, as Mac said – “adopt it now and be part of that future.”
Meanwhile, additions to this list of future directions are welcome!