Triton Tool Suite Enables Designers to Trade off Performance, Power and Cost for SoCs, Platform ASICs, Structured Arrays and FPGAs
Poseidon Design Systems, Inc. today announced an Electronic System Level (ESL) tool suite - Triton Tuner(TM) and Triton Builder(TM) - that automates the process of optimizing and substantially accelerating processor-based designs. Based on a SystemC software and hardware co-simulation environment, transaction-level modeling (TLM) technology, and Poseidon's innovative HW/SW partitioning technology, the Triton tool suite enables SoC designers to co-simulate hardware and software at the architectural level, then tune and accelerate the embedded system for optimal performance, power and cost.
Triton Tuner is a simulation and analysis environment based on SystemC that analyzes the performance of an embedded system, including software performance (using performance counters, code profiling, and bottleneck analysis) and hardware performance (checking memory bandwidth, pipeline stalls, and cache hits and misses). It helps designers fine-tune a system architecture by determining the optimal HW/SW partition for a given end-use application, and by generating more efficient code based on the new partition.
Key Functions of Triton Tuner
-- Increases system performance by creating an efficient memory hierarchy
-- Optimizes memory hierarchy to create designs with lower power dissipation
-- Tunes software algorithms to run faster with fewer execution cycles
-- Identifies hot spots in algorithms through detailed profiling and reduces power by optimizing critical code
-- Identifies and eliminates bottlenecks between the hardware and the software
Triton Builder is a synthesis tool that automatically generates algorithm-specific hardware accelerator blocks in RTL. These new blocks offload the math-intensive algorithms from the host processor, as determined by Tuner's new partitioning. Besides accelerating the processing performance for a given algorithm, Builder creates highly efficient communication interfaces to get the data into and out of the custom accelerator hardware.
Key Functions of Triton Builder
-- Profiles application to identify candidates for hardware implementation
-- Synthesizes application-specific hardware accelerators directly from standard ANSI C
-- Generates efficient RTL for new hardware accelerators in either Verilog or VHDL
-- Explores multiple accelerator communication templates to meet system requirements
-- Complements Triton Tuner as an integrated environment to verify the accelerated system
-- Automatically generates test benches, drivers, and modified ANSI C application code
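The benefit of offloading a hot spot to an accelerator block can be estimated with Amdahl's law. The sketch below is illustrative only and is not part of the Triton tool suite: the function name `amdahl_speedup` and the figures used (90% of cycles in the hot spot, a 50X accelerator) are hypothetical, and the estimate ignores the cost of moving data into and out of the accelerator.

```python
def amdahl_speedup(offloaded_fraction: float, accelerator_gain: float) -> float:
    """Estimate overall speedup when part of a workload moves to hardware.

    offloaded_fraction: share of the original execution cycles spent in the
        code moved to the accelerator (0.0 to 1.0).
    accelerator_gain: how many times faster the accelerator runs that code.
    Communication overhead is ignored (a simplifying assumption).
    """
    if not 0.0 <= offloaded_fraction <= 1.0:
        raise ValueError("offloaded_fraction must be between 0 and 1")
    if accelerator_gain <= 0:
        raise ValueError("accelerator_gain must be positive")
    remaining = 1.0 - offloaded_fraction              # still runs on the processor
    accelerated = offloaded_fraction / accelerator_gain  # runs on the accelerator
    return 1.0 / (remaining + accelerated)


if __name__ == "__main__":
    # Hypothetical figures, not taken from the announcement.
    for gain in (10, 50, 1000):
        overall = amdahl_speedup(0.90, gain)
        print(f"accelerator {gain:>4}X -> system {overall:.2f}X")
```

With 90% of the cycles offloaded, the system speedup can never exceed 10X however fast the accelerator is, which is why the profiling step that selects what to offload matters as much as the generated hardware.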
"Simply relying on Moore's Law to provide the additional processing power needed to accommodate the math-intensive algorithms can only lead to untenable architectural and economic efficiencies. Poseidon's Triton tool suite reveals the hidden inefficiencies in any processor-based design, and automates the optimization and acceleration process. In so doing, these ESL tools pave the way for convergence system designers to fulfill the seemingly insatiable demand for higher performance, lower power, and lower development costs," said Ravi Janak, CEO & President of Poseidon Design Systems.
"As the inevitable convergence of video, audio and data communication draws near in both the enterprise and consumer markets, the need to create more efficient systems becomes paramount," said Farzad Zarrinfar, vice president of worldwide sales and marketing at Poseidon Design Systems.
Poseidon has implemented a wavelet encoder for a JPEG 2000 application to demonstrate the degree to which the Triton tool suite can accelerate a system. Beginning with a public-domain design, we used Triton Tuner to determine the number of execution cycles needed to process a given image: 81.13 million cycles. By analyzing the system with Tuner, we were able to identify where and how to optimize the code. We used Triton Builder to partition the design, to generate RTL code for the selected hardware accelerator blocks, and to automatically generate the necessary drivers, test benches and transactional models. Finally, using Tuner once again, we performed functional and performance verification before implementing the accelerated design on a Xilinx(R) Virtex-II(TM) FPGA. The design employs a MicroBlaze(TM) processor supported by instruction and data caches, several peripheral cores, and DDR-DRAM for main memory. The combined optimization and acceleration reduced the execution cycle count by 23X, to 3.54 million cycles.
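The arithmetic behind the 23X figure can be checked from the two cycle counts quoted above. This is a minimal sketch; the helper names `speedup` and `percent_reduction` are illustrative and do not come from the Triton tools.

```python
def speedup(cycles_before: float, cycles_after: float) -> float:
    """Ratio of original to accelerated execution cycles."""
    if cycles_before <= 0 or cycles_after <= 0:
        raise ValueError("cycle counts must be positive")
    return cycles_before / cycles_after


def percent_reduction(cycles_before: float, cycles_after: float) -> float:
    """Share of the original cycles eliminated, in percent."""
    if cycles_before <= 0 or cycles_after < 0:
        raise ValueError("cycle counts must be positive")
    return 100.0 * (cycles_before - cycles_after) / cycles_before


# Cycle counts reported for the JPEG 2000 wavelet encoder.
BASELINE_CYCLES = 81.13e6
ACCELERATED_CYCLES = 3.54e6

if __name__ == "__main__":
    print(f"{speedup(BASELINE_CYCLES, ACCELERATED_CYCLES):.1f}X")            # 22.9X
    print(f"{percent_reduction(BASELINE_CYCLES, ACCELERATED_CYCLES):.1f}%")  # 95.6%
```

The ratio works out to about 22.9, which the announcement rounds to 23X; equivalently, roughly 95.6% of the original execution cycles were eliminated.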
"Our customers are constantly pushing the performance limits," said Steve Lass, Director of Software Product Marketing for Xilinx. "Today's real-time computing applications require smarter designs with better partitioning, and better use of hardware resources to off-load embedded processors. By pinpointing the bottlenecks in a design, then automating the path to more efficient silicon, system level design tools like those from Poseidon can help our users optimize their designs to achieve the best performance/area tradeoffs."
Poseidon Design Systems, Inc.