We consider the problem of synthesizing circuits (from C to Verilog) that are optimized to handle unpredictable latencies of memory operations. Unpredictable memory latencies can arise from the use of on-chip caches, DRAM memory modules, buffers/queues, or multiport memories. Typically, high-level synthesis compilers assume fixed and known memory latencies, and are therefore able to schedule the code's operations efficiently. The operations in the source code are scheduled into the states of a state machine, which is then synthesized to Verilog. The goal is to minimize the schedule length by maximizing the number of operations (in particular, memory operations) that execute in parallel in the same state. With unpredictable latencies, however, there can be an exponential number of possible orders in which these parallel memory operations terminate, so a minimal-length schedule would require a different schedule for each such order. Because this is not practical, we present a technique for synthesizing a compact state machine that schedules only a small subset of the possible termination orders. Our results show that this compact state machine can improve execution time compared with a regular schedule that waits in every state for all active memory references to terminate.
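As a rough illustration of the argument above, the following Python sketch counts the strict termination orders of a few parallel loads and compares a wait-for-all state with an idealized bound in which each dependent chain starts as soon as its own load returns. It is not the paper's synthesis technique: the load names, latencies, chain lengths, and all function names are hypothetical, and ties between loads that finish in the same cycle are ignored.

```python
from itertools import permutations
from math import factorial


def termination_orders(ops):
    """Return every strict order in which the parallel memory ops can finish."""
    return list(permutations(ops))


def wait_for_all_cycles(latency, chain):
    """Baseline: the state advances only after every load has returned;
    the dependent chains then run in parallel with each other."""
    return max(latency.values()) + max(chain.values())


def order_aware_cycles(latency, chain):
    """Idealized bound: each dependent chain starts as soon as its own
    load returns, independently of the other loads."""
    return max(latency[op] + chain[op] for op in latency)


if __name__ == "__main__":
    # Hypothetical loads issued in the same state.
    loads = ["ld_a", "ld_b", "ld_c"]
    orders = termination_orders(loads)
    # The number of strict orders grows as n! with the number of loads.
    print(len(orders), factorial(len(loads)))

    # Assumed latencies (cycles) and lengths of the dependent chains (cycles).
    latency = {"ld_a": 2, "ld_b": 10, "ld_c": 4}
    chain = {"ld_a": 6, "ld_b": 1, "ld_c": 3}
    print("wait-for-all:", wait_for_all_cycles(latency, chain))
    print("order-aware bound:", order_aware_cycles(latency, chain))
```

With these assumed numbers the wait-for-all state takes 16 cycles, while the order-aware bound is 11 cycles. Reaching that bound for every order would need a schedule per termination order, which is the blow-up the abstract describes.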
ACM Transactions on Reconfigurable Technology and Systems
Published - Dec 2013
ASJC Scopus subject areas
- General Computer Science