# A Statistical Static Timing Analysis Considering Correlations Between Delays

Shuji Tsukiyama<sup>+</sup>

Masakazu Tanaka\*

Masahiro Fukui\*

+ Dept. of EECE Chuo University Tokyo, JAPAN 112-8551 tsuki@elect.chuo-u.ac.jp \* Advanced LSI Tech. Development Center Matsushita Electric Industrial Co., Ltd. Nagaokakyo, JAPAN 617-8520 {tanack, fukui}@ngk.csdd.mei.co.jp

Abstract: In this paper, present new algorithm timing analysis for the statistical static of a CMOS combinatorial circuit, which can correlations of arrival times of input signals to a logic gate and correlations of switching delays in a logic gate. We model each switching delay by a normal distribution, and use normal distribution two variables with coefficient correlation computing distribution of output delay of a logic Since the algorithm takes correlation into account, the time where O(n•m) in the complexity is worst-case. and m are the numbers of vertices and edges of the graph acvelie combinatorial representing a given circuit.

## I. Introduction

The importance of statistical static timing analysis[1-5] is increasing in designing high density, high speed and low power VLSIs in deep sub-micron era. Because, designers often set excessive margins derived from the worst-case analysis in order to avoid the effect of the delay time uncertainty, and such excessive margins usually bring overdesign of circuits. If designers can estimate the distribution of critical path delay caused by all local uncertainties, overdesign of circuits may be eliminated so that high density and high performance VLSIs are produced with high yield[6-8].

Although several researches have been done on this topic[1-7], all of them except [5] assume that the distributions of all signal delays are independent each other. However, such an assumption is not realistic in



Fig. 1 Arrival times of signals x and y have a correlation

combinatorial circuits with re-convergent paths. For example, if the delays of signals x and y of the circuit shown in Fig. 1 are heavily depend on the delay of signal v, then we cannot ignore the correlation between delays of signals x and y in computing the maximum delay of signal z.

In this paper, we present a new algorithm for the statistical static timing analysis of a CMOS combinatorial circuit, which can treat not only correlations of delays of signals denoted in Fig. 1 but also correlations between delays of transistors contained in a logic gate. This algorithm assumes that the delay of each logic gate is modeled by a normal distribution (Gaussian distribution), as usually done in the previous researches. In order to compute the distribution of the output delay of a logic gate, a maximum operation on the stochastic variables is necessary, and the proposed algorithm uses a normal distribution of two stochastic variables with a coefficient of correlation[9] for the maximum operation.

If delays of all signals are independent, the distribution of the maximum delay can be computed in O(n+m) time[3], where n and m are the numbers of vertices and edges of the graph representing a given combinatorial circuit. But, since the proposed algorithm take the correlation into account, the time complexity of the algorithm becomes  $O(n \cdot m)$  in the worst-case. However, for real combinatorial circuits, we can expect that the time complexity is much less than  $O(n \cdot m)$ .

### II. PRELIMINARIES

In order to find the distribution of the maximum delay of a CMOS combinatorial circuit, we represent the circuit by an acyclic graph G=(V,E), as shown in Fig. 2. In the figure, each box denotes a logic gate, which is drawn to show the correspondence between the circuit and the graph. Each vertex contained in a box represents a terminal of the corresponding logic gate, and a vertex in the left or right side in a box corresponds to an input or an output terminal, respectively. Each sink, from which no edge goes out, corresponds to a primary output, and T denotes the set of



Fig. 2 Acyclic graph G=(V,E) representing a circuit

sinks. Vertex  $v_0$  is the unique source into which no edge comes, and each edge going out from  $v_0$  comes into a vertex corresponding to a primary input. These edges are used to introduce the differences and distributions of arrival times of primary inputs. Each edge in a box goes out from the vertex representing an input of the corresponding logic gate, and comes into the vertex representing the output of the gate. Edges exclusive of the ones going out from  $v_0$  represent interconnects, if they are not contained in a box.

Each edge e=(v,w) contained in a box has delays  $t_0(e)$  and  $t_1(e)$ , such that  $t_0(e)$  and  $t_1(e)$  denote the switching times of pMOS and nMOS transistors, respectively. Henceforth, we use "b" to indicate 0 or 1, and  $t_b(e)$  denotes  $t_0(e)$  or  $t_1(e)$ . Delay  $t_b(e)$  has a certain uncertainty, and is a stochastic variable. It is determined by saturated current  $I_{dsat}$ , load capacitance  $C_{load}$  of the transistor, and slew rate  $t_{slew}$  of the gate voltage. The uncertainty of the delay comes from the distribution of  $I_{dsat}$ , which mostly depends on the distribution of gate length  $L_g$  of the transistor. Since the distribution of  $L_g$  can be modeled by a normal distribution like the distribution of threshold voltage  $V_t$  [10], we model the distribution of delay  $t_b(e)$  of edge e in a box by a normal distribution  $N(\mu_b(e), \ ^2_b(e))$ , where  $\mu_b(e)$  and  $\ ^2_b(e)$  are the mean and the variance of the distribution of  $t_b(e)$ .

The distribution of  $L_g$  depends on the distributions of space  $S_{poly}$  between adjacent polysilicon gates, and gate width  $W_g$  and length  $L_{diff}$  of diffusion area of the transistor. Therefore, by extracting these quantities from the mask pattern, the distribution of  $L_g$  can be estimated, and hence the distribution of the switching delay  $t_b(e)$  of a transistor. With the use of the distribution data and the proposed algorithm, we can execute post-layout timing analysis, gate size optimization[6,7], and layout optimization of a macrocell[8].

Since the distribution of  $L_g$  depends on the distribution of space  $S_{poly}$  between adjacent gates, delay  $t_b(e)$  has a correlation with the delay of the edge corresponding to the adjacent transistor. Therefore, we introduce a correlation between delays  $t_b(e')$  and  $t_b(e'')$  of edges e' and e'' which correspond to transistors with the same type in a logic gate, and denote the coefficient of correlation between these delays

by <sub>b</sub>(e',e"). Such correlations are introduced only for edges contained in a single logic gate.

For an edge e representing an interconnect, delay  $t_b(e)$  designates the interconnect delay, which is assumed to be constant, that is, b(e)=0. This delay is evaluated by a method such as PRIMO[11]. If interconnect delay is also distributed by a normal distribution, then we may introduce a variance b(e) 0. But, in such a case, the proposed algorithm must be modified, because the distributions of  $b_0(e)$  and  $b_1(e)$  may have a correlation.

Now, let us show that the maximum delay to a vertex w in G can be estimated by using the edge-delays introduced above.

Let d(v,0) and d(v,1) be the maximum delays spent for transmitting signal 0 and 1 from source  $v_0$  to a vertex v V, respectively. These are stochastic variables. In the following, the mean and variance of d(v,b) are denoted by  $m_b(v) = \text{Exp}[d(v,b)]$  and  $s_b(v) = \text{Var}[d(v,b)]$ , respectively.

Firstly, we consider the case where vertex w corresponds to a sink or an input node of a gate. In this case, only one edge e = (v,w) corresponding to an interconnect comes into w, and hence d(w,0) and d(w,1) can be calculated by the following equations:

$$d(w,0) = d(v,0) + t_0(e),$$
  $d(w,1) = d(v,1) + t_1(e).$ 

Since  $t_0(e)$  and  $t_1(e)$  are constant,  $m_b(w)$  and  $s_b(w)$  are obtained as follows:

$$\begin{split} m_0(w) &= m_0(v) + t_0(e), & m_1(w) = m_1(v) + t_1(e), \\ s_0(w) &= s_0(v), & s_1(w) = s_1(v). \end{split}$$

Moreover, correlation coefficient  $r_{bb}(w,u) = R[d(w,b), d(u,b')]$  between d(w,b) and d(u,b') of a vertex u-w can be obtained by the following equations. Henceforth, we use b' together with b to indicate 0 or 1, and b' is not necessarily equal to b.

$$\begin{split} r_{00}(w,u) &= r_{00}(v,u), & r_{01}(w,u) &= r_{01}(v,u), \\ r_{10}(w,u) &= r_{10}(v,u), & r_{11}(w,u) &= r_{11}(v,u). \end{split}$$

Obviously, we have

$$\begin{split} r_{00}(w,w) &= r_{11}(w,w) = 1, \\ r_{01}(w,w) &= r_{10}(w,w) = r_{01}(v,v) \; (= r_{10}(v,v) \; ). \end{split}$$

Next, let us consider the case where w corresponds to the output node of a logic gate with k inputs. Let  $v_i$  (i=1,2,...,k) be a vertex corresponding to an input node of the gate, and let  $e_i = (v_i, w)$  be incoming edges of w. (See Fig. 3)

If the logic gate is a NAND gate, then d(w,0) is calculated by the following equation:

$$d(w,0) = max[d(v_i,1) + t_1(e_i) | 1 i k],$$

since w becomes 0 when all inputs become 1. On the other hand, since w becomes 1 when any input becomes 0, d(w,1) seems to be determined by the equation



Fig. 3 A logic gate with k inputs

$$d(w,1) = \min[d(v_i,0) + t_0(e_i) | 1 i k].$$

However, when all inputs other than j are 1, w becomes 1 after the delay  $d(v_j,0) + t_0(e_j)$ . Since d(w,1) is the maximum among these delays  $d(v_j,0) + t_0(e_j)$ , we can calculate d(w,1) by the following equation:

$$d(w,1) = max[d(v_i,0) + t_0(e_i) | 1 i k].$$

Similarly, if the logic gate is NOR gate, we can calculate d(w,0) and d(w,1) by the same maximum operations. If the logic gate is an invertor, we set k=1 in the above equations and need not to take the maximum.

If the logic gate is a complex gate, for example,  $v_1 \cdot (v_2 + v_3)$ , w becomes 0 if  $v_1 = 1$  and  $v_2 = 1$  or  $v_3 = 1$ . Therefore, d(w,0) may be determined the largest value of  $d(v_i,1) + t_1(e_i)$  (1 i 3). On the other hand, d(w,1) may also be determined by the largest value of  $d(v_i,0) + t_0(e_i)$  (1 i 3). Thus, if a complex gate consists of a series-parallel connection of MOS transistors, then we can determine d(w,b) by the same manner as NAND or NOR gate. It is not hard to see that for an AND gate or an OR gate a similar discussion holds, and d(w,b) can be computed by

$$d(w,b) = max[d(v_i,b) + t_b(e_i) | 1 i k]$$

although the definition of t<sub>b</sub>(e) must be modified slightly.

For the logic gates discussed above, we only need the values of d(v,b) and  $t_b(e)$ , and not d(w,b) and  $t_b(e)$  simultaneously, where  $\overline{b}$  denotes the complement of b. Therefore, we need correlations between  $d(v_i,b)$  (1 i k), but not correlation between d(v,b) and  $d(v,\overline{b})$ . However, if the logic gate is an XOR gate, then output w becomes 1 by input signal 1 or 0. Hence, d(w,1) may be equal to  $d(v_i,0) + t_{01}(v_i)$  or  $d(v_i,1) + t_{11}(v_i)$  (1 i k), where  $t_{01}(v_i)$  (or  $t_{11}(v_i)$ ) is the switching delay between the arrival time of signal 0 (signal 1) to input  $v_i$  and the time when output w turned to 1. Therefore, d(w,1) is computed by the following equation.

$$\begin{split} d(w,1) = max[ \ d(v_i,0) + t_{01}(v_i), \\ d(v_i,1) + t_{11}(v_i) \mid 1 \quad i \quad k \ ]. \end{split}$$

Through a similar discussion, we see that d(w,0) is computed by

$$d(w,0) = max[ \ d(v_i,0) + t_{00}(v_i),$$
  
$$d(v_i,1) + t_{10}(v_i) \mid 1 \quad i \quad k \ ].$$

Thus, in the case of XOR gate, we need new edge-delays different from  $t_b(e)$  and a different maximum operation. In order to simplify the descriptions, we assume in the following that a given combinatorial circuit does not contain XOR gates. Therefore, the maximum operation is conducted only among  $d(\bullet,0)+t_0(\bullet)$  or among  $d(\bullet,1)+t_1(\bullet)$ .

### III. DISTRIBUTION OF THE MAXIMUM

In this section, we show a method to compute mean Exp[ x(b) ] and variance  $Var[\,x(b)\,]$  of  $x(b)=max[\,d(v_i,b)\,+\,t_b(e_i)\,|\, 1\,$  i  $\,k\,$  ]. This is done by procedure DISTMAX(  $In(w),\,U\,$ ), where  $In(w)=\{\,e_i=(v_i,w)\,|\, 1\,$  i  $\,k\,$  }, and  $\,U\,$  is the set of vertices to which there is no directed path from  $\,w.\,$  The procedure first computes  $Exp[\,x(b)\,]$  and  $Var[\,x(b)\,]$  for b=0, and then for b=1. Moreover, it computes correlation coefficient  $R[\,x(b),\,d(u,b')\,]$  between x(b) and d(u,b') of vertex u U and correlation coefficient  $R[\,x(0),\,x(1)\,]$ . To do this, we assume that correlation coefficients  $r_{bb}(x,y)$  are known for any vertices x,y  $\,U.\,$ 

According to Ref.[9], given two stochastic variables x and y with correlation coefficient R[x,y] = w whose distributions are  $N(\mu_x, x^2)$  and  $N(\mu_y, y^2)$ , respectively, mean Exp[t] and variance Var[t] of t = max[x, y] are obtained by the following equations, unless x = -1 = 0;

$$\begin{split} & \text{Exp[t]} = \mu_{x^{\bullet}} \text{ ()} + \mu_{y^{\bullet}} \text{ (-)} + \text{ • ()}, \\ & \text{Var[t]} = (\mu_{1}^{2} + \ _{1}^{2}) \text{ • ()} + (\mu_{2}^{2} + \ _{2}^{2}) \text{ • (-)} \\ & + (\mu_{1} + \mu_{2}) \text{ • ()} - \{\text{Exp[t]}\}^{2}, \end{split}$$

where

$$= \sqrt{\frac{2}{x} + \frac{2}{y} - 2}, \qquad = (\mu_1 - \mu_2)/\sqrt{2}$$

$$(x) = \frac{1}{\sqrt{2}} \exp\left[-\frac{x^2}{2}\right],$$

$$(x) = \frac{1}{\sqrt{2}} \left[-\frac{x}{2}\right] dy.$$

Moreover, correlation coefficient R[t,z] between t = max[x, y] and a stochastic variable z with a normal distribution can be obtained by the following equation, if correlation coefficient between x and z is R[x,z] = x and that between y and z is R[y,z] = y.

$$R[t,z] = R[\max[x,y], z]$$

$$= \frac{\begin{bmatrix} \bullet & \bullet & (\ ) + & \bullet & \bullet & (\ -\ )\end{bmatrix}}{\sqrt{Var[t]}}.$$

We use these equations in procedure DISTMAX( In(w), U ) as follows.

Let  $x(b)_i^* = d(v_i,b) + t_b(e_i)$ . Then, since  $d(v_i,b)$  and  $t_b(e_i)$  are independent, we have

Exp[ 
$$x(b)_i^*$$
 ] =  $m_b(v_i) + \mu_b(e_i)$ ,  
Var[  $x(b)_i^*$  ] =  $s_b(v_i) + {}^2_b(e_i)$ .

Moreover, for  $x(b)_i^*$  and stochastic variable y, the following equation holds

$$\begin{split} \sqrt{Var[\ x(b)_{_{i}}\ ]}\ \bullet R[\ x(b)_{i}^{*},\ y\ ] = & \sqrt{s_{_{b}}(v_{_{i}})}\ \bullet R[\ d(v_{i},b),\ y\ ] \\ & + \ _{b}(e_{i}) \bullet R[\ t_{b}(e_{i}),\ y\ ]\ . \end{split}$$

If y = d(u,b') of vertex u U, then  $R[t_b(e_i), y] = 0$ . Therefore, for u U, we have

$$R[\;x(b)_i^*,\,d(u,b')\;]\;=\;\frac{\sqrt{s\;(v_i)}\stackrel{\bullet}{v}_{bb'}^*(v_i,u)}{\sqrt{Var[\;x(b)_i^{\phantom{\dagger}}]}}\;\;,$$

where  $r_{bb}(v_i,u) = R[\ d(v_i,b),\ d(u,b')\ ]$ . If  $y = t_b(e_j)$  of edge  $e_j$  In(w), then R[  $x(0)_i^*,\ t_1(e_j)$  ] = R[  $x(1)_i^*,\ t_0(e_j)$  ] = 0, and we have

$$R[\ x(b)_i^{}*,\ t_b(e_j^{})\ ] = \ _b(e_i^{})^{\bullet}\ _b(e_i^{},e_j^{})/\sqrt{Var[\ x(b)_{_i}^{}\ ]}\ \ .$$

We consider correlation coefficients R[  $x(b)_i^*$ ,  $t_b(e_j)$  ] only for edges  $e_j$  In(w) such that i<j, since we need these coefficients only.

Now, let  $x(b)_i = max[d(v_j,b) + t_b(e_j) \mid 1 \quad j \quad i]$ . Then, since  $x(b)_l = x(b)_l^*$ , for  $i \quad 2$ , we can compute  $Exp[x(b)_i]$ ,  $Var[x(b)_i]$ ,  $R[x(b)_i, d(u,b')]$ , and  $R[x(b)_i, t_b(e_i)]$  by using

$$x(b)_i = \max[x(b)_{i-1}, x(b)_i^*]$$

and the equations in Ref.[9]. Namely, let

$$= \sqrt{Var[x(b)_{i-1}] + Var[x(b)_{i}] - 2} ,$$

$$= \sqrt{Var[x(b)_{i-1}]} \cdot \sqrt{Var[x(b)_{i}]} \cdot R[x(b)_{i-1}, x(b)_{i}]$$

and

= 
$$(Exp[x(b)_{i-1}] - Exp[x(b)_{i}^*])/$$

where

$$\frac{R[\;x(b)_{i-1},x(b)_{i}^{*}\;]=}{\sqrt{s\underset{b}{(v_{i})}} \bullet R[\;x(b)_{i-1},\;d(v_{i},b)\;] + \underset{b}{(e)} \bullet \;R[\;x(b)_{i-1},\;t_{b}(e)\;]}}{\sqrt{Var[\;x(b)_{i}\;]}}\;.$$

Then, we can compute  $\text{Exp}[x(b)_i]$ ,  $\text{Var}[x(b)_i]$ , and  $\text{R}[x(b)_i]$ , d(u,b') for u U as follows.

$$\begin{split} & Exp[\ x(b)_i\ ] = Exp[\ x(b)_{i-1}\ ]^{\bullet} \quad (\ ) \\ & \quad + Exp[\ x(b)_i^*\ ]^{\bullet} \quad (\ ) + \quad \bullet \quad (\ )\ , \\ & Var[\ x(b)_i\ ] = \{(Exp[\ x(b)_{i-1}\ ])^2 + Var[\ x(b)_{i-1}\ ]\}^{\bullet} \quad (\ ) \\ & \quad + \{(Exp[\ x(b)_i^*\ ])^2 + Var[\ x(b)_i^*\ ]\}^{\bullet} \quad (\ ) - \{Exp[\ x(b)_i\ ]\}^2, \\ & \quad + \{Exp[\ x(b)_{i-1}\ ] + Exp[\ x(b)_i'\ ]\}^{\bullet} \quad \bullet \quad (\ ) - \{Exp[\ x(b)_i\ ]\}^2, \\ & R[\ x(b)_i,\ d(u,b')\ ] = \frac{X_u + Y_u}{\sqrt{Var[\ x(b)_u^{}\ ]}} \quad , \\ & X_u = \sqrt{Var[\ x(b)_{i-1}\ ]} \quad \bullet R[\ x(b)_{i-1},\ d(u,b')\ ]^{\bullet} \quad (\ )\ , \\ & Y_u = \sqrt{Var[\ x(b)_i^{}\ ]} \quad \bullet R[\ x(b)_i^{}\ ,\ d(u,b')\ ]^{\bullet} \quad (\ )\ . \end{split}$$

Correlation coefficient R[  $x(b)_i$ ,  $t_b(e_j)$  ] (j > i) can be computed as follows.

$$\begin{split} R[\;x(b)_{i},\,t_{b}(e_{j})\;] &= \frac{X_{e} + Y_{e}}{\sqrt{Var[\;x(b)_{i}\;]}}\;\;,\\ X_{e} &= \sqrt{Var[\;x(b)_{i-1}\;]}\;\bullet R[\;x(b)_{i-1},\,t_{b}(e_{j})\;]\bullet \;\;(\;)\;,\\ Y_{e} &= \sqrt{Var[\;x(b)_{i}\;]}\;\bullet R[\;x(b)_{i},\,t_{b}(e_{j})\;]\bullet \;\;(\;-\;)\;. \end{split}$$

Repeating the computations stated above for i=2,3,...,k, we have the distribution of x(b), since  $x(b)=x(b)_k$ . Hence, by conducting this repetition for b=0, we have  $Exp[\ x(0)\ ]$ ,  $Var[\ x(0)\ ]$ , and  $R[\ x(0),\ d(u,b)\ ]$  for u U. Then, conducting the repetition for b=1, we obtain  $Exp[\ x(1)\ ]$ ,  $Var[\ x(1)\ ]$ , and  $R[\ x(1),\ d(u,b)\ ]$ . In this case, however, we need not to compute  $R[\ x(1)_i,\ d(v_j,0)\ ]$  for  $v_j$  such that  $e_j=(v_j,w)$  In(w). Because, we do not use these values. But, we compute  $R[\ x(1)_i,\ x(0)\ ]$  as follows. Since  $R[\ t_1(e_j),\ x(0)\ ]=0$ , we have

$$R[\ x(1)_i^*,\ x(0)\ ]\ =\ \frac{\sqrt{s_1(v_i)}\, {}^\bullet\!R[\ d(v_i^{},1),\ x(0)\ ]}{\sqrt{Var[\ x(1)_i^{}\ ]}}\ ,$$

and

$$\begin{split} R[\;x(1)_i,x(0)] &= \frac{X_{01} + Y_{01}}{\sqrt{Var[\;x(1)_{_i}\;]}}\;\;,\\ X_{01} &= \sqrt{Var[\;x(1)_{_{i-1}}\;]}\;\bullet R[\;x(1)_{_{i-1}},x(0)\;]^\bullet\;\;(\;)\;,\\ Y_{01} &= \sqrt{Var[\;x(1)_{_i}\;]}\;\bullet R[\;x(1)_{_i},x(0)\;]^\bullet\;\;(\;-\;)\;. \end{split}$$

Hence, we have R[ x(1), x(0) ] = R[ x(1), x(0) ] = R[  $x(1)_k$ , x(0) ].

### IV. ALGORITHM

In this section, we describe the proposed algorithm by using procedure DISTMAX( In(w), U ) shown in the previous section.

Our problem is to find mean  $m_b(v)$  and variance  $s_b(v)$  of d(v,b) for each sink v — T and correlation coefficient  $r_{bb}(v,w)$  for each pair of sinks v, w — T. Once these values are obtained, we can compute the probability for the maximum delay  $M = max[\ d(v,b) \mid v \ T,\ b \ \{0,1\}\ ]$  to be not greater than D, from the following equation

Pro[ M D] = 
$$_{-}^{D}$$
  $_{-}^{D}$  ...  $_{-}^{D}$   $f(x_{1},x_{2},...,x_{n})dx_{1}dx_{2}...dx_{n}$ 

where n=2|T|, each  $x_i$  corresponds to d(v,b), and  $f(x_1,x_2,...,x_n)$  is the probability density function of normal distribution with 2|T| variables. If we need its probability density function g(D) of distribution  $G(D)=Pro[\ M\ D\ ]$ , it can be obtained by differentiating G(D). Although mean  $Exp[\ g(D)\ ]$  and variance  $Var[\ g(D)\ ]$  of the maximum delay M are computed numerically, they can be computed by a method similar to procedure DISTMAX by assuming g(D) as

a normal distribution. The outline of the proposed algorithm is as follows.

We first reduce a given circuit graph G=(V,E) by repeating the reduction of two series edges e'=(u,v) and e''=(v,w) into one edge  $e^*=(u,w)$ . Since the delays of two series edges are independent, the mean and variance of  $t_b(e^*)$  are given by

$$\begin{split} \mu_b(e^*) &= \mu_b(e') + \mu_b(e''), \\ {_b}^2(e^*) &= {_b}^2(e') + {_b}^2(e''), \end{split}$$

respectively, and correlation coefficient  $_b(e^*,e)$  of  $t_b(e^*)$  with  $t_b(e)$  of edge e coming into w is given by

$${}_{b}(e,e) = \frac{{}_{b}(e',e) + {}_{b}(e'',e) + {}_{b}(e'',e)}{\sqrt{{}_{b}^{2}(e') + {}_{b}^{2}(e'')}},$$

in these equations, if  $b(\bullet) = 0$ , then let  $b(\bullet, \bullet) = 0$ .

Then, we determine  $m_b(v)$  and  $s_b(v)$  of each vertex v of the reduced graph  $G^*=(V^*,E^*)$  in the topological order with the use of procedure DISTMAX as follows. Let *Front* be a set of vertices satisfying the following conditions.

- (A) For any vertex v *Front*,  $m_0(v)$  and  $s_0(v)$  of d(v,0) and  $m_1(v)$  and  $s_1(v)$  of d(v,1) are known.
- (B) For any pair of vertices u, v *Front*,  $r_{00}(u,v)$ ,  $r_{01}(u,v)$ ,  $r_{10}(u,v)$ , and  $r_{11}(u,v)$  are known.

Initially, set *Front* as the set of vertices corresponding to primary inputs, and we see that conditions (A) and (B) can be satisfied for this *Front*. Then, repeat the following procedure, until *Front* becomes T.

- 1°: select a vertex w *Front* such that all incoming edges of w come from vertices in *Front*;
- 2°: let *Del*(w) be the set of vertices v *Front* such that all outgoing edges (v,u) from v come into vertices u *Front* {w};
- $3^{\circ}$ : set Front := Front Del(w);
- 4°: conduct procedure DISTMAX( In(w), *Front* ), and determine m<sub>b</sub>(w), s<sub>b</sub>(w), and r<sub>bb</sub>(w,u) for u *Front*, according to the type of the gate containing w as its output terminal. Namely, for example, if the gate is AND gate, then set m<sub>b</sub>(w)=Exp[x(b)] and so on. Moreover, since we have R[x(0), x(1)], we can determine r<sub>bb</sub>(w,w) as follows:

$$\begin{split} r_{00}(w,w) &= r_{11}(w,w) = 1, \\ r_{01}(w,w) &= r_{10}(w,w) = R[\ x(1),\ x(0)\ ]. \end{split}$$

### $5^{\circ}$ : add w to *Front*;

If Front becomes T, we compute Exp[  $M_b$  ] = Exp[ max[  $d(v,b) \mid v \mid T$  ], Var[  $M_b$  ], and R[ $M_0$ ,  $M_1$  ], with the use of procedure DISTMAX( { (u,t) | u | T | }, T ), under the assumption that the delay of all edges (u,t) are constant and equal to 0. Then, we compute Exp[  $M_1$  ] = Exp[  $M_0$ ,  $M_1$  ]

and Var[M] by the equations in [9].

The time complexity of the proposed algorithm can be analyzed as follows.

In the worst-case, the maximum size  $h_{max}$  of *Front* is O(n), where n = |V| is the number of vertices. However, in real circuits,  $h_{max}$  is smaller than n, so that the time complexity of the algorithm is less than  $O(n \cdot m)$ .

### V. EXPERIMENTAL RESULTS

In order to see the performance of the proposed algorithm, we first applied our algorithm to the graph in Fig. 4, which corresponds to the circuit in Fig. 1. The mean  $\mu$  and standard deviation of delays of edges (a,x) and (c,y) are 20 and 1.4, respectively, and those of edges (v,x), (v,y), (x,z), and (y,z) are 10 and 0.7, respectively. We changed the delay of edge (u,v), and computed the maximum delay to vertex z, as shown in Table 1. In the experiments, we assumed that all edge-delays are independent.

In the table, "Monte Carlo," "Ours," and "No Correlation" show the results obtained by Monte Carlo simulation (20,000 iterations), by our algorithm, and by ignoring correlation of delays d(x,b) and d(y,b), respectively, and "error" is the relative error to the result of Monte Carlo simulation. We showed in the table correlation coefficient r(x,y) and cpu-time spent in ULTRA-SPARC IIi 400MHz workstation, too. Fig. 5 shows the distributions of the maximum delay d(z) in the case of the top row in the table. Although the difference between "Ours" and "No Correlation" may not be large, the difference increases if becomes large.



Fig. 4 Graph representing the circuit of Fig. 1

TABLE 1 Delays of Edge (u,v) and the Maximum Delay d(z)

|           | delay            | maximum delay d(z) |                                |                                | r(x,y) |
|-----------|------------------|--------------------|--------------------------------|--------------------------------|--------|
|           | of edge<br>(u,v) | Monte<br>Carlo     | Ours (error)                   | No Correlation<br>(error)      | 1(A,y) |
| μ         | 80<br>5.6        | 100.0<br>5.67      | 100.6 (0.54%)<br>5.66 (-0.22%) | 103.2 (3.19%)<br>4.70 (-17.2%) | 0.980  |
| cpu [sec] |                  | 4.48               | 0.67                           | 0.27                           |        |
| μ         | 40<br>2.8        | 60.27<br>2.90      | 60.56 (0.48%)<br>2.92 (0.54%)  | 61.68 (2.34%)<br>4.70 (-15.5%) | 0.941  |
| cpu [sec] |                  | 4.10               | 0.52                           | 0.21                           |        |
| μ         | 20<br>1.4        | 40.36<br>1.63      | 40.56 (0.48%)<br>1.62 (-0.27%) | 40.97 (1.49%)<br>1.42 (-12.9%) | 0.800  |
| cpu [sec] |                  | 3.91               | 0.29                           | 0.18                           |        |
| μ         | 10<br>0.7        | 31.21<br>1.07      | 31.34 (0.42%)<br>1.03 (-4.26%) | 31.37 (0.52%)<br>1.01 (-6.05%) | 0.122  |
| cpu [sec] |                  | 3.92               | 0.19                           | 0.16                           |        |



Fig. 5 The distributions of maximum delays

TABLE 2 RESULTS OF ISCAS85 BENCHMARK DATA

|                                |          | Ours                    | No Correlation (error)                    |
|--------------------------------|----------|-------------------------|-------------------------------------------|
| c432<br>n = 541<br>m = 722     | μ        | 6.64 nsec<br>0.29 nsec  | 6.70 nsec (0.92%)<br>0.28 nsec (-6.23%)   |
| III = 722                      | cpu time | 107.0 sec               | 3.7 sec                                   |
| c499<br>n = 685                | μ        | 9.98 nsec<br>0.28 nsec  | 10.10 nsec (1.18%)<br>0.205 nsec (-28.4%) |
| m = 921                        | cpu time | 208.4 sec               | 6.5 sec                                   |
| c880<br>n = 1,200<br>m = 1,570 | μ        | 14.64 nsec<br>0.58 nsec | 14.73 nsec (0.67%)<br>0.55 nsec (-5.07%)  |
| m = 1,570                      | cpu time | 371.7 sec               | 6.8 sec                                   |

Table 2 shows the results of a few circuits in ISCAS 85 benchmark, where n and m are the numbers of vertices and edges of the graph representing the circuits, respectively, and error is the percentage of the difference between "Ours" and "No Correlation." In the experiments, the circuits are laid out by a macrocell layout system[8], and the mean values of delays are extracted from the layout. The standard deviation

is set as  $=0.15\mu$ , and the edge delays are assumed to be independent again. The cpu-time is mainly depend on the number of numerical computations for normal distribution, and hence it can be reduced by using table look-up technique. Thus, further investigation about correlation and reduction of CPU time are future work.

### VI. CONCLUSION

In this paper, we proposed a new algorithm for the statistical static timing analysis which can treat correlations of delays of re-convergent paths and correlations of delays of adjacent transistors. Since the algorithm takes the correlations into account, the worst-case time complexity is O(n•m), where n and m are the numbers of vertices and edges of a given acyclic graph representing a CMOS combinatorial circuit, respectively.

In order to see effects of correlation, we applied our algorithm to some benchmark circuits, and found that the difference between the results obtained by our algorithm and those obtained by ignoring correlations is not large enough, comparing to the increase of the computational effort. However, since the difference increases, if the distribution becomes large, further research is necessary.

There still remain a few problems on the statistical static timing analysis, among which one of the most important issues is the deletion of false paths[12]. We are tackling this problem by using the proposed technique.

### ACKNOWLEDGEMENT

The authors express their gratitude to Prof. T. Nakano, Dept. of Physics, Chuo University, for his valuable discussions about calculations of normal distributions, and Mr. Shuji Nishimoto, Dept. of EECE, Chuo University, for his programming of the proposed algorithm and collection of experimental results.

### REFERENCES

- H.-F. Jyu, S. Malik, S. Devadas, K.W. Keutzer, "Statistical timing analysis of combinatorial logic circuits," IEEE Trans. VLSI Systems,
- L. Rung-Bin and W. Meng-Chiou, "A new statistical approach to timing analysis of VLSI circuits," Proc. 11th Int. Conf. on VLSI Design, pp.507-513, 1997
- M.Berkelaar, "Statistical delay calculation, a linear time method,"
- Proc. Int. Workshop on Timing Issues in the Specification and Synthesis of Digital Systems (TAU97), pp.15-24, 1997.

  H. Matsunaga, D. Mizoguchi, H. Yasuura, "Estimation of combinatorial circuit delays using the distribution of CMOS gate delays," Proc. DA Symp. '99, pp.77-82, 1999. (in Japanese)
- S. Tsukiyama, M. Tanaka, M. Fukui, "An estimation algorithm of the critical path delay for a combinatorial circuits," Proc. the 13th Workshop on Circuits and Systems in Karuizawa, pp.131-136, 2000. (in Japanese)
- M.Hashimoto and H.Onodera, "A performance optimization method M.Hashimoto and H.Onodera, "A performance optimization metition by gate resizing based on statistical static timing analysis," Proc. Workshop on Synthesis And System Integration of MIxed Technology (SASIMI 2000), pp.77-82, 1999. E.T.A.F.Jacobs and M.R.C.M.Berkelaar, "Gate sizing using a statistical delay model," Proc. Design Automation and Test in Europe
- (DATE2000), pp.283-290, 2000.
  T. Tamai, S. Nishimoto, S. Tsukiyama, M. Tanaka, M. Fukui, "A
- layout optimization system of macrocells," Tech. Report of IEICE, VLD 99-128, pp.85-91, 2000. (in Japanese)
  C.E. Clark, "The greatest of a finite set of random variables"
- C.E. Clark,
- Operations Research, vol.9, pp.145-152, 1961.

  [10] M. Kondo, H. Onodera, K. Tamaru, "MOSFET statistical modeling method using an intermediate model," Trans. of IEICE A, vol.J81-A, no.11, pp.1555-1563, 1998. (in Japanese)
- [11] R. Kay and L. Pilleggi, "PRIMO: Probability interpretation of moment for delay calculation", Proc. Design Automation Conf., pp.463--468, 1998.
- [12] H.C. Chen and D.H. Du, "Path sensitization in critical path problem," IEEE Trans. Computer-Aided Design of ICs and Systems, vol.12, no.2, pp.196-207, 1993.