block and the number of MAC units in the D-type Schur and the M-type Schur blocks, denoted $n_d$ and $n_m$, respectively.

Problem Formulation

The task of hardware generation is expressed as a constrained optimization:

$$\min_{n_d,\, n_m,\, s} \; \mathrm{Power}(n_d, n_m, s) \quad \text{s.t.} \quad \mathrm{Lat}(n_d, n_m, s) \le L^{*}, \;\; \mathrm{Res}(n_d, n_m, s) \le R^{*}, \tag{11}$$

where $\mathrm{Power}(\cdot)$, $\mathrm{Lat}(\cdot)$, and $\mathrm{Res}(\cdot)$ denote the total power, latency, and resource utilization, respectively; they are functions of $n_d$, $n_m$, and $s$. $L^{*}$ is the latency constraint specified by the designer, and $R^{*}$ is the resource constraint imposed by a particular FPGA system.

It is worth noting that other optimization formulations are possible. For instance, the following formulation could be used for scenarios where performance, rather than power, is the main design objective:

$$\min_{n_d,\, n_m,\, s} \; \mathrm{Lat}(n_d, n_m, s) \quad \text{s.t.} \quad \mathrm{Res}(n_d, n_m, s) \le R^{*}. \tag{12}$$

Latency Model

Archytas derives the latency model by calculating the critical-path latency of the M-DFG given the analytical latency models of each of the primitive nodes:

$$\mathrm{Lat}(n_d, n_m, s) = \mathit{Iter} \times L_{NLS}(n_d, s) + L_{Marg}(n_d, n_m, s), \tag{13}$$

where $L_{NLS}$ denotes the latency of one iteration of the (iterative) NLS solver, $\mathit{Iter}$ denotes the total number of iterations in the NLS solver (a parameter set by the application), and $L_{Marg}$ denotes the marginalization latency. The critical-path latency of an NLS iteration (the blocks along the solid arrows in Fig. 4) is expressed as follows:

$$L_{NLS}(n_d, s) = \sum_{i=1}^{a} \max\{L_{Jac},\, L_{DSchur}(n_d)\} + L_{Cholesky}(s) + L_{Sub}. \tag{14}$$

Mixed-Integer Convex Programming
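To make the structure of Eqs. (11), (13), and (14) concrete, the following is a minimal Python sketch under stated assumptions. The per-block latency, power, and resource functions (`l_dschur`, `l_cholesky`, `l_marg`, `power`, `res`) and all constants are hypothetical placeholders, not the analytical models used by Archytas, and the sum in Eq. (14) is assumed to cover only the max term. A plain exhaustive search over a small candidate grid stands in for a real solver.

```python
from itertools import product

# Hypothetical primitive-node models (illustrative constants, in cycles).
L_JAC = 40.0   # assumed fixed latency of the Jacobian block
L_SUB = 25.0   # assumed fixed latency of the back-substitution block

def l_dschur(nd):
    """Assumed D-type Schur latency: work divided across nd MAC units."""
    return 960.0 / nd

def l_cholesky(s):
    """Assumed Cholesky latency as a function of the parameter s."""
    return 1200.0 / s

def l_marg(nd, nm, s):
    """Assumed marginalization latency, depending on all three knobs."""
    return 2000.0 / nm + 500.0 / nd + 300.0 / s

def l_nls(nd, s, a):
    """Eq. (14): a stages bounded by the slower of Jacobian and D-type Schur."""
    stages = sum(max(L_JAC, l_dschur(nd)) for _ in range(a))
    return stages + l_cholesky(s) + L_SUB

def lat(nd, nm, s, iters, a):
    """Eq. (13): Iter NLS iterations followed by marginalization."""
    return iters * l_nls(nd, s, a) + l_marg(nd, nm, s)

def power(nd, nm, s):
    """Hypothetical power model: static term plus per-unit dynamic cost."""
    return 1.0 + 0.02 * nd + 0.02 * nm + 0.05 * s

def res(nd, nm, s):
    """Hypothetical resource model, counted in abstract DSP-like units."""
    return nd + nm + 4 * s

def solve(l_star, r_star, candidates, iters=5, a=10):
    """Eq. (11) by exhaustive search: minimize power under both constraints.

    Returns (power, (nd, nm, s)) for the best feasible point, or None.
    """
    best = None
    for nd, nm, s in product(*candidates):
        if lat(nd, nm, s, iters, a) > l_star or res(nd, nm, s) > r_star:
            continue  # violates the latency or resource constraint
        p = power(nd, nm, s)
        if best is None or p < best[0]:
            best = (p, (nd, nm, s))
    return best

if __name__ == "__main__":
    grid = [(4, 8, 16, 32), (4, 8, 16, 32), (1, 2, 4, 8)]
    print(solve(l_star=5000.0, r_star=64, candidates=grid))
```

Under these placeholder models, the max term in `l_nls` saturates once `l_dschur(nd)` drops below `L_JAC`, so adding further D-type MAC units no longer shortens an NLS iteration. Swapping the objective in `solve` from `power` to `lat` and dropping the latency check gives the performance-oriented variant of Eq. (12).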