New-Tech Europe Magazine | June 2016

What Is RocketSim? Why Did Cadence Acquire Rocketick?

Paul McLellan, Cadence

I talked to Uri Tal last week, who has just joined Cadence as a result of the Rocketick acquisition. Prior to the acquisition, he was Rocketick's CEO. He gave me a little history.

Rocketick started development eight years ago. They have a product called RocketSim that accelerates logic simulation. They started by using GPUs to do this, but then switched to multi-core CPUs. They can run on all the cores in a socket, in practice up to 32 today, although like a surfer they will ride that wave as the number of cores per socket increases. I call this Core's Law: the number of cores on a processor doubles every two years.

The speedups are:

6X for Verilog and SystemVerilog at RTL level
10X for gate-level functional simulation
30X for gate-level DFT simulation

Another part of the value proposition is that you don't need to change any code, neither the RTL (or netlist) nor the testbench. As a result, RocketSim is in production use at many leading system and semiconductor companies. Intel Capital and NVIDIA were investors, and NVIDIA has endorsed them (Intel doesn't endorse EDA suppliers).

The sweet spot seems to be large SoCs that are highly active, such as stress tests. The speedups are smaller for circuits that are not very active, which you would expect. No simulator needs to spend any time simulating inactive parts of a circuit, and it is obviously not possible to take less time than none. The focus of RocketSim is running complex simulations as fast as possible, so the priority is latency rather than throughput.

[Figure: RocketSim block diagram]

Under the hood, a simulation with Incisive and RocketSim runs the testbench on Incisive, and the design simulation uses fine-grained parallelism to run on multiple threads. During compilation, the RocketSim engine's compiler runs on the host machine, separating the design from the testbench. During runtime, a large portion of the logic is offloaded to the RocketSim parallel simulation engine, running on standard multi-core servers, while the testbench runs on the host simulator. The PLI is used during simulation execution to maintain state synchronization between the compiled-code simulator running the testbench and the RocketSim engine running the design.

Simulation is challenging to parallelize compared to something like a DRC, since there is at least a global clock and potentially other high-activity signals that don't have well-behaved locality. When I was at Virtutech, we
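The informal "Core's Law" in this article (cores per processor doubling every two years) can be sketched as a simple projection. This is only an illustration of the rule of thumb, not a measured trend: the function name `projected_cores` and its parameters are hypothetical, the baseline of 32 cores is the "in practice up to 32 today" figure from the text, and the base year of 2016 is assumed from the issue date.

```python
def projected_cores(year, base_year=2016, base_cores=32, doubling_period_years=2.0):
    """Project cores per socket under the article's informal "Core's Law".

    Assumptions (not from any vendor roadmap):
      - base_cores=32 is the "up to 32 today" figure quoted in the article
      - base_year=2016 is taken from the magazine issue date
      - the count doubles every doubling_period_years years
    """
    elapsed_years = year - base_year
    return base_cores * 2 ** (elapsed_years / doubling_period_years)


if __name__ == "__main__":
    # Print the projection for a few years after the assumed baseline.
    for year in (2016, 2018, 2020):
        print(year, round(projected_cores(year)))
```

Under these assumptions, each two-year step doubles the core count available to a simulator that can use every core in a socket.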
