TY - JOUR
T1 - Optimizing wait states in the synthesis of memory references with unpredictable latencies
AU - Ben-Asher, Yosi
AU - Meldiner, Ron
AU - Rotem, Nadav
PY - 2013/12
Y1 - 2013/12
N2 - We consider the problem of synthesizing circuits (from C to Verilog) that are optimized to handle unpredictable latencies of memory operations. Unpredictable memory latencies can occur due to the use of on chip caches, DRAM memory modules, buffers/queues, or multiport memories. Typically, high-level synthesis compilers assume fixed and known memory latencies, and thus are able to schedule the code's operations efficiently. The operations in the source code are scheduled into states of a state machine whose states will be synthesized to Verilog. The goal is to minimize scheduling length by maximizing the number of operations (and in particular memory operations) that are executed in parallel at the same state. However, with unpredictable latencies, there can be an exponential number of possible orders in which these parallel memory operations can terminate. Thus, in order to minimize the scheduling, we need a different schedule for any such order. This is not practical, and we show a technique of synthesizing a compact state machine that schedules only a small subset of these possible termination orders. Our results show that this compact state machine can improve the execution time compared to a regular scheduling that waits for the termination of all the active memory references in every state.
AB - We consider the problem of synthesizing circuits (from C to Verilog) that are optimized to handle unpredictable latencies of memory operations. Unpredictable memory latencies can occur due to the use of on chip caches, DRAM memory modules, buffers/queues, or multiport memories. Typically, high-level synthesis compilers assume fixed and known memory latencies, and thus are able to schedule the code's operations efficiently. The operations in the source code are scheduled into states of a state machine whose states will be synthesized to Verilog. The goal is to minimize scheduling length by maximizing the number of operations (and in particular memory operations) that are executed in parallel at the same state. However, with unpredictable latencies, there can be an exponential number of possible orders in which these parallel memory operations can terminate. Thus, in order to minimize the scheduling, we need a different schedule for any such order. This is not practical, and we show a technique of synthesizing a compact state machine that schedules only a small subset of these possible termination orders. Our results show that this compact state machine can improve the execution time compared to a regular scheduling that waits for the termination of all the active memory references in every state.
UR - http://www.scopus.com/inward/record.url?scp=84891809791&partnerID=8YFLogxK
U2 - 10.1145/2535936
DO - 10.1145/2535936
M3 - Article
AN - SCOPUS:84891809791
SN - 1936-7406
VL - 6
JO - ACM Transactions on Reconfigurable Technology and Systems
JF - ACM Transactions on Reconfigurable Technology and Systems
IS - 4
M1 - 19
ER -