TY - GEN
T1 - Optimizing wait-states in the synthesis of memory references with unpredictable latencies
AU - Ben Asher, Yosi
AU - Meldiner, Ron
AU - Rotem, Nadav
PY - 2011
Y1 - 2011
N2 - We consider the problem of automatically synthesizing, from C to Verilog, circuits that are optimized to handle unpredictable latencies of memory operations. Unpredictable memory latencies can occur due to the use of on-chip caches, DRAM memory modules, buffers/queues, or multi-port memories. High-level synthesis (HLS) compilers typically assume fixed and known memory latencies, so the technique presented here expands the use of HLS. Assume that in the current state there are k active memory references, each of which can terminate in this clock cycle or continue for an unknown number of clock cycles. The scheduler must then emit 2^k new states, one for each possible combination of memory requests that can continue in the next clock cycle. Synthesizing a state machine with an exponential number of states is not practical, so we show a simple technique for synthesizing a compact state machine that is a compromise between the fast, full exponential state machine and the linear state machine that would be generated by waiting for all active memory references to terminate in every state. Our results show that the compact state machine achieves performance similar to that of the full state machine while using significantly fewer resources.
AB - We consider the problem of automatically synthesizing, from C to Verilog, circuits that are optimized to handle unpredictable latencies of memory operations. Unpredictable memory latencies can occur due to the use of on-chip caches, DRAM memory modules, buffers/queues, or multi-port memories. High-level synthesis (HLS) compilers typically assume fixed and known memory latencies, so the technique presented here expands the use of HLS. Assume that in the current state there are k active memory references, each of which can terminate in this clock cycle or continue for an unknown number of clock cycles. The scheduler must then emit 2^k new states, one for each possible combination of memory requests that can continue in the next clock cycle. Synthesizing a state machine with an exponential number of states is not practical, so we show a simple technique for synthesizing a compact state machine that is a compromise between the fast, full exponential state machine and the linear state machine that would be generated by waiting for all active memory references to terminate in every state. Our results show that the compact state machine achieves performance similar to that of the full state machine while using significantly fewer resources.
UR - http://www.scopus.com/inward/record.url?scp=80155186173&partnerID=8YFLogxK
U2 - 10.1109/SAMOS.2011.6045471
DO - 10.1109/SAMOS.2011.6045471
M3 - Conference contribution
AN - SCOPUS:80155186173
SN - 9781457708008
T3 - Proceedings - 2011 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation, IC-SAMOS 2011
SP - 270
EP - 273
BT - Proceedings - 2011 International Conference on Embedded Computer Systems
T2 - 2011 11th International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation, IC-SAMOS 2011
Y2 - 18 July 2011 through 21 July 2011
ER -