# Transmission line design of Clock Trees

Rafael Escovar Mentor Graphics rafael\_escovar@mentor.com Robert Suaya Mentor Graphics roberto\_suaya@mentor.com

# ABSTRACT

We investigate appropriate regimes for transmission line propagation of signals on digital integrated circuits. We start from exact solutions to the transmission line equations proposed by Davis and Meindl. We make appropriate modifications due to finite rise time. They affect the delay calculation and the hypotheses pertaining to the constancy of the electromagnetic parameters. We study these effects in detail. To find the domain of physical variables where transmission line behavior is feasible, we pose the problem as a nonlinear minimization problem in a space spanned by two continuous variables, with four parameters. From the resulting solutions, and employing monotonicity properties of the functionals, we extract regimes of validity. These regimes are commensurate with what is reachable with today's leading technologies. We complete this study with a qualitative analysis of driver insertion in the presence of transmission lines. The resulting configurations are suitable for the development of an improved clock design discipline.

#### 1. INTRODUCTION

Inductive coupling effects on interconnects are an emerging concern in high performance digital integrated circuits. Inductance effects become appreciable for global signal wires driven by low impedance drivers. It is a high frequency concern. The serial impedance of a line is given by

$$Z(\omega) = rL + j\omega lL, \tag{1}$$

with *r* the resistance per unit length of the line, *l* the loop inductance per unit length, and *L* the length of the line. At low frequencies *Z* is dominated by the resistance; its reactive part starts playing a major role at frequencies in the neighborhood of  $\omega$  where  $\omega l \approx r$ . The inductance effect first manifests on low resistivity signals: clock signals and data buses. Clock signals run on global upper metal layers, with reverse-scaled thickness. Typical technology parameters for clock *Cu* wires are:  $r = O(5000)\,\Omega/m$ ,  $l = O(10^{-7})\,H/m$  and  $c = O(10^{-10})\,F/m$ . Attempting to simulate inductance effects with single *rlc*, *T* or  $\pi$  networks is justifiable only for distances  $L < \frac{\lambda}{10}$ , with  $\lambda$  the wavelength of the electromagnetic signal. Since  $\lambda/10 \approx 1.5\,mm$  at 10 GHz for propagation in homogeneous  $SiO_2$ , a lumped description is not applicable to global clocks and buses, which run distances that can be significantly longer. A transmission line representation is required. This is possible in the time domain, provided that the electromagnetic parameters r, l, c can be treated as constant. If, in addition, the rise time is negligible, the time domain response can be expressed in terms of known analytical expressions. Circuit designers are inclined to develop techniques that minimize the influence of inductance, so as not to perturb well developed and understood design and verification flows that treat parasitics purely in terms of rc networks [1, 2].
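The two orders of magnitude quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the relative permittivity of 3.9 for the oxide is an assumed typical value that is not stated in the text.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def crossover_frequency_hz(r, l):
    """Frequency (Hz) at which omega * l equals r."""
    return r / (2.0 * math.pi * l)


def lumped_length_limit_m(freq_hz, eps_r):
    """lambda / 10 for a wave in a homogeneous dielectric."""
    wavelength = C0 / math.sqrt(eps_r) / freq_hz
    return wavelength / 10.0


# Order-of-magnitude values quoted in the text.
r, l = 5000.0, 1e-7   # ohm/m, H/m
EPS_R_OXIDE = 3.9     # ASSUMPTION: typical relative permittivity of SiO2

f_cross = crossover_frequency_hz(r, l)
limit = lumped_length_limit_m(10e9, EPS_R_OXIDE)
print(f"omega*l = r at about {f_cross / 1e9:.1f} GHz")
print(f"lambda/10 at 10 GHz is about {limit * 1e3:.2f} mm")
```

With these inputs the reactance overtakes the resistance slightly below 10 GHz, and the lumped-model limit comes out close to the 1.5 mm figure given above.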

In this paper we instead exploit inductance effects to minimize wire delays. By carefully balancing the electric and magnetic components of the energy stored in the medium, a voltage perturbation can be made to propagate along the line as fast as possible: at the speed of light in the surrounding dielectric, the maximum possible propagation speed of a signal.

To make the voltage timing behavior satisfy safe margins of reliable operation we impose a subsidiary constraint: we demand the absence of overshoots. Finally, if one can guarantee maximum speed propagation with minimum attenuation, one can economize on the number and size of the repeaters to be included, and as a consequence diminish the power consumption on the clock signal path.

The approach we describe is a combination of global minimization with efficient and accurate parameter estimation. The relevant electromagnetic parameters are resistance, inductance and capacitance. The independent physical variables are the layout parameters, the rise time of the signals and the lithography dimensions.

The motivation is to provide a signal wire layout discipline with parameter values for the layout (lengths, widths, spacing, thickness, driving strength) that ensure transmission line behavior.

We find that the range of layout controllable parameters that satisfy the requirement of minimum propagation delay, without overshoot, falls within what is reasonable and doable with current and projected scaled technologies for long wires.

# 2. SIGNAL PROPAGATION IN THE RLC DOMAIN

The differential equation that describes a distributed rlc line in the presence of a ground plane is the telegrapher's equation. Its parameters r, l and c are the distributed resistance, inductance and capacitance per unit length, respectively. It assumes quasi-TEM propagation, which is satisfied provided that the transverse dimensions are less than  $0.1\lambda$ .

Davis and Meindl (DM) [3] recently derived a series expansion of the exact analytical solution in the time domain for a step function driving source, for a finite line with open termination and constant *r*,*l*,*c* parameters. Time domain analysis is important for timing simulation. DM further identify the domain of electromagnetic parameters that characterize transmission line behavior, that is to say linear delay with distance *x*,  $delay = t_f = \sqrt{lc} x$ .

The region (Region I) in which the delay is linear with distance is characterized by the conditions:

$$\frac{rL}{Z_0} \le 2\ln\left(\frac{4Z_0}{R_{tr} + Z_0}\right) \tag{2}$$

and

$$\frac{R_{tr}}{3} < Z_0, \tag{3}$$

with *L* the length of the line,  $R_{tr}$  the driver's impedance and  $Z_0 = \sqrt{l/c}$  the characteristic impedance of a lossless transmission line.
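Conditions (2) and (3) are simple enough to test directly for a given set of line parameters. The following is a minimal sketch; the function names are hypothetical and the numerical values (in particular  $l = 4\times10^{-7}\,H/m$ ) are illustrative assumptions, not values taken from the tables of this paper.

```python
import math


def characteristic_impedance(l, c):
    """Z0 = sqrt(l / c) of the lossless line."""
    return math.sqrt(l / c)


def time_of_flight(l, c, x):
    """Linear delay t_f = sqrt(l * c) * x valid in Region I."""
    return math.sqrt(l * c) * x


def in_region_one(r, l, c, length, r_tr):
    """True when both inequality (2) and inequality (3) hold."""
    z0 = characteristic_impedance(l, c)
    if r_tr / 3.0 >= z0:          # condition (3) violated
        return False
    # Condition (3) guarantees 4*z0 > r_tr + z0, so the bound is positive.
    bound = 2.0 * math.log(4.0 * z0 / (r_tr + z0))
    return r * length / z0 <= bound   # condition (2)


# ASSUMED illustrative parameters: r in ohm/m, l in H/m, c in F/m.
r, l, c, r_tr = 5000.0, 4e-7, 1e-10, 100.0
for length in (2.5e-3, 10e-3, 20e-3):
    ok = in_region_one(r, l, c, length, r_tr)
    t_f = time_of_flight(l, c, length)
    print(f"L = {length * 1e3:4.1f} mm  Region I: {ok}  t_f = {t_f * 1e12:.1f} ps")
```

For these assumed values the two shorter lines satisfy both conditions, while the 20 mm line violates (2) because the attenuation term  $rL/Z_0$  has grown too large.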

A second region, characteristic of diffusion-like behavior, is present when either of the conditions is not satisfied. This case corresponds to the validity of the Sakurai expression [4].

#### 2.1 Clock Routing Problem

The most critical signal in a clocked digital system is that of the clock itself. The task is to minimize the maximum delay, and the skew.

We adopt the Balanced H-Tree (BHT) style, which ensures equal length of the different paths, making the skew negligible (in the absence of drivers). This approach is often used in computer architecture [5]. At frequencies where inductance effects become relevant, suitably modified BHTs become advantageous. The modification amounts to sandwiching the signal wire between ground wires. We call this configuration SBHT. In figure 1 we display an SBHT of depth four. Its key advantage is a controllable loop inductance, because the local ground wires that make up the sandwich serve as return paths for the current. Moreover, the loop inductance of the entire configuration  $\mathcal{L}_{loop}$  satisfies the cascade rule [6]:

Given a path  $P(I, D_i) = \{I = N_0, N_1, \dots, N_n = D_i\}$ ,

$$\mathcal{L}_{loop}(P(I,D_i)) = \sum_{j=1}^{n} \mathcal{L}_{loop}(\{N_{j-1},N_j\}). \tag{4}$$



Figure 1: A Sandwich Balanced H-Tree of depth four

### 3. PARAMETER EXTRACTION

We need to calculate the r, l, c parameters for the equivalent transmission line that represents an SBHT tree. Each section can be characterized by the following physical parameters: signal wire width w, metal layer thickness h, ground wires width g, common spacing between the signal wire and one ground wire s (we have the freedom to make symmetrical sandwich sections), and wire length *L*. In figure 2 a section of a SBHT with its respective physical variables is displayed.



#### **3.1** *r*,*l*,*c* **Parameters**

The equivalent serial circuit of a line with a return ground plane can be obtained from that of a sandwich configuration by straightforward circuit algebra. In figure 3 we display the two circuits.

The equivalent loop resistance per unit length is given by:

$$r_{loop} = r_s + \frac{r_g}{2}. \tag{5}$$

The total equivalent resistance of a path from source to a destination  $R = \sum_{i=0}^{i=n} r_{loop} L_i$ , with  $L_i$  the length of the *i*-th partition, is clearly separable and given by the sum of the total equivalent resistances for each partition.



Figure 3: Equivalent circuit for a SBHT

To compute the equivalent loop inductance per unit length l for one partition of the SBHT tree we refer again to figure 3. The resulting  $l_{loop}$  can be immediately inferred:

$$l_{loop} = l_s - 2l_{s,g} + \frac{1}{2}l_{g,g} + \frac{1}{2}l_g. \tag{6}$$

The values  $l_{s,g}$  and  $l_{g,g}$  refer to the signal to ground and the ground to ground partial mutual inductances, respectively.

The equivalent capacitance per unit length of a sandwich  $c_{tot}$  is given by:

$$c_{tot} = 2c_{s,g} + c_r,\tag{7}$$

with  $c_{s,g}$  the mutual capacitance between one ground wire and the signal wire, and  $c_r$  the sum of the capacitances of the signal wire to substrate and to other wires not participating in the return path. All of these quantities are per unit length. The total equivalent capacitance of the signal path from source to destination is separable and given by the sum of the total equivalent capacitances of each partition. Errors due to mutual coupling between partitions are negligible.
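Equations (5), (6) and (7) combine the per-wire quantities of a symmetric sandwich into the three parameters of the equivalent line. A minimal sketch of that bookkeeping follows; the function name and all numerical inputs are hypothetical placeholders chosen only to exercise the formulas.

```python
def loop_parameters(r_s, r_g, l_s, l_g, l_sg, l_gg, c_sg, c_r):
    """Equivalent per-unit-length r, l, c of a symmetric sandwich section.

    r_s, r_g   : resistance of the signal wire and of one ground wire
    l_s, l_g   : partial self inductances of signal and ground wires
    l_sg, l_gg : signal-to-ground and ground-to-ground partial mutuals
    c_sg       : capacitance between the signal wire and one ground wire
    c_r        : capacitance to substrate and to non-return wires
    """
    r_loop = r_s + r_g / 2.0                             # equation (5)
    l_loop = l_s - 2.0 * l_sg + 0.5 * l_gg + 0.5 * l_g   # equation (6)
    c_tot = 2.0 * c_sg + c_r                             # equation (7)
    return r_loop, l_loop, c_tot


# HYPOTHETICAL per-unit-length inputs (ohm/m, H/m, F/m).
r_loop, l_loop, c_tot = loop_parameters(
    r_s=5000.0, r_g=5000.0,
    l_s=1.2e-6, l_g=1.2e-6, l_sg=1.0e-6, l_gg=0.9e-6,
    c_sg=4e-11, c_r=2e-11,
)
print(r_loop, l_loop, c_tot)
```

Note how the loop inductance is much smaller than any of the partial inductances that enter it: the large partial terms nearly cancel, which is why each of them has to be computed accurately.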

#### 4. COMPUTATION OF PARAMETERS

We need to compute accurately  $r_{loop}$ ,  $l_{loop}$  and  $c_{tot}$ , the parameters that correspond to the r, l, c values that enter the transmission line equation. The computation is made as a function of the physical parameters L, w, h, g and s (see figure 2). The resistance per unit length of a straight wire is  $r = \frac{1}{\sigma wh}$ , with  $\sigma$  the conductivity of the metal and w and h the width and thickness of the particular wire. The results are inserted in equation (5).
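As a quick numerical illustration of the resistance formula, the sketch below evaluates  $r = 1/(\sigma w h)$  for a wire of width 5.3 µm and thickness 0.65 µm, the values that appear in Table 1. The conductivity is an assumption (a commonly quoted bulk copper value), and the function name is hypothetical.

```python
SIGMA_CU = 5.8e7  # ASSUMPTION: bulk copper conductivity, S/m


def wire_resistance_per_length(width, thickness, sigma=SIGMA_CU):
    """DC resistance per unit length, r = 1 / (sigma * w * h), in ohm/m."""
    return 1.0 / (sigma * width * thickness)


r_wire = wire_resistance_per_length(5.3e-6, 0.65e-6)
print(f"r = {r_wire:.0f} ohm/m = {r_wire / 100.0:.1f} ohm/cm")
```

Under this assumed conductivity the result is close to 50 Ω/cm, consistent with the width bound quoted in the caption of Table 1.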

#### 4.1 Inductance

The calculation of the loop inductance for the configuration of interest, the sandwich configuration, can be obtained from the general expression for the partial mutual inductance of two filaments i, j:

$$\mathcal{L}_{i,j} = \frac{\mu_0}{4\pi} \int \int \frac{\mathbf{l}_i \cdot \mathbf{l}_j}{\|\vec{x} - \vec{x}'\|} ds' ds, \tag{8}$$

with  $\mu_0$  the permeability of the vacuum and **l** the unit vector in the direction of the current; the denominator of the integrand is the distance between two points on the wires. The integrals extend over the lengths of the filaments. Furthermore, integrating the above expression over the transverse areas occupied by the two conductors, and dividing by the product of the cross sectional areas, leads to the following closed-form solution for the partial mutual inductance of two wires of rectangular cross-section of the same length *L* [7]:

$$\mathcal{L}_{mutual} = \frac{\mu_0 L}{2\pi} \left[ \ln \left( \frac{L}{d} + \sqrt{1 + \frac{L^2}{d^2}} \right) - \sqrt{1 + \frac{d^2}{L^2}} + \frac{d}{L} \right], \tag{9}$$

with d the geometric mean distance (GMD) between the two cross sections.

The same expression (9) applies to the partial self inductance of a wire of rectangular cross-section. The GMD of a single wire of area  $w \times h$  is, to a very good approximation, given by [7]:

$$\ln(d) = \ln(w+h) - 3/2. \tag{10}$$

The result of using (9) and (10) for computing partial self inductance of rectangular wires is accurate to within 1% when compared to 3D simulations with FastHenry for isolated lines in the frequency domain of interest. Some corrections apply when the wires are very close.
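Expressions (9) and (10) translate directly into code. The sketch below is a plain transcription with hypothetical function names; the wire dimensions in the example are illustrative, and no attempt is made to reproduce the close-wire corrections mentioned above.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of vacuum, H/m


def partial_inductance(length, d):
    """Equation (9): partial inductance for wires of equal length at GMD d."""
    ratio = length / d
    bracket = (math.log(ratio + math.sqrt(1.0 + ratio ** 2))
               - math.sqrt(1.0 + 1.0 / ratio ** 2)
               + 1.0 / ratio)
    return MU0 * length / (2.0 * math.pi) * bracket


def self_gmd(width, thickness):
    """Equation (10): ln(d) = ln(w + h) - 3/2 for a rectangular section."""
    return math.exp(math.log(width + thickness) - 1.5)


def partial_self_inductance(length, width, thickness):
    """Partial self inductance: equation (9) evaluated at the self GMD."""
    return partial_inductance(length, self_gmd(width, thickness))


# Illustrative wire: 2.5 mm long, 6 um wide, 0.65 um thick.
l_self = partial_self_inductance(2.5e-3, 6.0e-6, 0.65e-6)
print(f"partial self inductance = {l_self * 1e9:.2f} nH")
```

The partial self inductance obtained this way is of order  $10^{-6}\,H/m$ , an order of magnitude above the loop value, since the return-path terms of (6) have not yet been subtracted.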

For partial mutual inductance we use the following approximation to the GMD of two wires with identical rectangular cross-section and equal length:

$$\ln(GMD) = \ln(a) + \ln(k), \tag{11}$$

with *a* the center to center distance between the wires and  $\ln(k)$  a tabulated correction value [7].

To compute the GMD of two wires A and B of the same length and thickness but with different (rectangular) cross-sections, we partition the wire with the larger width, wire B, into  $n = \lfloor B/A \rfloor$  pieces of area  $B_i = A$  and a remaining piece of area  $B_{n+1} < A$ . This permits us to approximate the original GMD by applying the formula for equal cross-sections recursively. We continue the recursion until the last remaining piece is of negligible or zero size, at which point we discard it. A recursion depth of 4 suffices to achieve an upper bound of 1% of error in the computation of the partial mutual inductance for realistic configurations. Errors are in reference to FastHenry simulations in the frequency domain of interest. The final step is to combine the different terms in (6).

#### 4.2 Capacitance

The capacitance value c that enters our optimization problem is the total capacitance of the signal line. This is the sum of the mutual capacitances with the ground neighbors,  $c_{s,g1}$  and  $c_{s,g2}$ , plus the sum of the cross coupling capacitances to the wires in lower layer(s) plus the capacitance to the substrate, which together we call  $c_r$ . Given the symmetry of the problem,  $c_{s,g} = c_{s,g1} = c_{s,g2}$ . We need an efficient and accurate method to calculate  $c_{tot}$  for different geometrical configurations. Iterative evaluation using field solvers is out of the question since the task is CPU intensive. We use instead function approximation techniques that perform data fitting to 3D simulation results. The choice of functional form is guided by experience. The parameters are obtained from a nonlinear least squares fit. We consider separately the capacitance of the signal wire with respect to its neighbor ( $c_{s,g}$ ), and its capacitance with respect to the layer beneath plus the capacitance to the substrate ( $c_r$ ). For fitting purposes we use as data 3D simulations performed with ICARE [9]. We propose the following functional form for the mutual capacitance between the signal wire and one of its ground neighbors:

$$c_{s,g}(s,w) \approx \left(\beta_1 e^{-\alpha_1 \frac{s}{s_{min}}} + \beta_2 \left(\frac{s}{s_{min}}\right)^{-\alpha_2}\right) \left(\frac{w}{w_{min}}\right)^{\alpha_3}, \tag{12}$$

with  $s_{min}$  and  $w_{min}$  the minimum separation and width allowed by the technology. Fitting (12) to a given set of observations is handled as a nonlinear least square problem. The variables separate in the following fashion:

$$c_{s,g}(s,w) \approx c_1(s) \cdot c_2(w), \tag{13}$$

thus permitting two independent least square problems.

We used the program VARPRO [10]. In figures 4 and 5 we display the fits.
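The separable form (12) and (13) can be sketched as follows. This is not the VARPRO procedure used above: the coefficient values are invented for illustration, the data are synthetic, and the width exponent is fitted by ordinary least squares in log-log space, which is a simplification of the nonlinear fit described in the text.

```python
import math


def c1(s, s_min, beta1, alpha1, beta2, alpha2):
    """Spacing factor of equation (12)."""
    x = s / s_min
    return beta1 * math.exp(-alpha1 * x) + beta2 * x ** (-alpha2)


def c2(w, w_min, alpha3):
    """Width factor of equation (12)."""
    return (w / w_min) ** alpha3


def c_sg(s, w, s_min, w_min, beta1, alpha1, beta2, alpha2, alpha3):
    """Equation (13): product of the two independent factors."""
    return c1(s, s_min, beta1, alpha1, beta2, alpha2) * c2(w, w_min, alpha3)


def fit_alpha3(widths, c2_values, w_min):
    """Least-squares slope through the origin of ln(c2) versus ln(w / w_min)."""
    xs = [math.log(w / w_min) for w in widths]
    ys = [math.log(v) for v in c2_values]
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# HYPOTHETICAL coefficients and synthetic observations.
S_MIN, W_MIN = 0.2e-6, 0.2e-6
COEFFS = dict(beta1=3e-11, alpha1=0.8, beta2=6e-11, alpha2=0.9, alpha3=0.3)
widths = [6e-6, 8e-6, 10e-6]
observed = [c2(w, W_MIN, COEFFS["alpha3"]) for w in widths]
alpha3_fit = fit_alpha3(widths, observed, W_MIN)
print(f"recovered alpha3 = {alpha3_fit:.3f}")
```

Because the variables separate, the ratio  $c_{s,g}(s,w)/c_{s,g}(s,w_{min})$  does not depend on *s*, which is what allows the width dependence to be fitted on its own.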

The equivalent capacitance to other layers not participating in the return path of the currents but contributing to the total capacitance of the signal line is mimicked with that of a single line running orthogonal to the signal and a ground plane below that. Its magnitude is only weakly dependent on *s* and *g* for fixed *w*. We are thus allowed, with little loss in precision, to perform an average in *s* and *g* to represent  $c_r$ .

# 5. TRANSMISSION LINE PROPAGATION

Our main goal is to find domains in the physical variables where the configurations belong to region I. Inserting r, l and c for a given configuration and testing if inequalities (2) and (3) are satisfied is extremely inefficient. We use instead a method for solving systems of nonlinear equations to identify the feasible domains. We take some variables as user parameters: w, h,  $R_{tr}$  and L. We are left with s and g as independent variables.

We define the functionals F and T:

$$F = \frac{rL}{Z_0} - 2\ln\left(\frac{4Z_0}{R_{tr} + Z_0}\right), \tag{14}$$

$$T = Z_0 - \frac{R_{tr}}{3}.\tag{15}$$

We search domains where  $F \le 0$  and  $T > 0$  are simultaneously satisfied. To ensure safe signals we add a new restriction,  $Z_0 \le R_{tr}$ , that guarantees the absence of overshoots [3], and the corresponding functional:

$$P = Z_0 - R_{tr},\tag{16}$$

i.e., the solution domains must satisfy  $P \leq 0$ .

**Figure 4:** Fit of  $c_1$  and chosen observation values.

**Figure 5:** Fit of  $c_2$  and chosen observation values.

It is straightforward to verify by simple differentiation that Functional *F* is monotonically decreasing with respect to  $Z_0$  when  $Z_0 > R_{tr}/3$  (Region I). Functional  $Z_0$  is monotonically increasing with respect to *s*. Therefore *F* is monotonically decreasing with *s* in the region of interest. Functionals *T* and *P* are monotonically increasing with  $Z_0$  and thereby with *s*.

To find the solution interval for variable *s*, for a fixed value of *g*, we search for the three values of *s*,  $s = s_F, s_P, s_T$ , where the monotonic functionals *F*, *P* and *T* respectively vanish. The solution interval for *s* where the restrictions  $F(s) \le 0$ ,  $T(s) > 0$  and  $P(s) \le 0$  are simultaneously satisfied, if it exists, is the intersection of the following three intervals:  $[s_F, \lambda/10)$ ,  $[s_{min}, s_P]$  and  $(s_T, \lambda/10)$ . The upper bounds are the limits of validity of the transmission line representation.
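The interval intersection just described can be sketched with a plain bisection search on each monotonic functional. Everything about the model below is an assumption made for illustration: `z0_toy` is an invented, monotonically increasing stand-in for the extracted  $Z_0(s)$ , the parameter values are arbitrary, and the distinction between open and closed interval ends is ignored.

```python
import math


def bisect_root(f, lo, hi, tol=1e-12):
    """Root of a monotonic f on [lo, hi], or None if f does not change sign."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0.0:
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def solution_interval(F, T, P, s_min, s_max):
    """Intersect {F <= 0}, {T > 0}, {P <= 0} for decreasing F, increasing T, P."""
    if F(s_max) > 0.0 or T(s_max) <= 0.0 or P(s_min) > 0.0:
        return None
    s_f = s_min if F(s_min) <= 0.0 else bisect_root(F, s_min, s_max)
    s_t = s_min if T(s_min) > 0.0 else bisect_root(T, s_min, s_max)
    s_p = s_max if P(s_max) <= 0.0 else bisect_root(P, s_min, s_max)
    lower = max(s_f, s_t)
    return (lower, s_p) if lower <= s_p else None


# HYPOTHETICAL model and parameters (SI units).
R, LENGTH, R_TR = 5000.0, 10e-3, 100.0


def z0_toy(s):
    """Invented monotonic Z0(s); NOT the extraction model of this paper."""
    return 40.0 + 15.0 * math.log(1.0 + s / 1e-6)


def F(s):
    z0 = z0_toy(s)
    return R * LENGTH / z0 - 2.0 * math.log(4.0 * z0 / (R_TR + z0))


def T(s):
    return z0_toy(s) - R_TR / 3.0


def P(s):
    return z0_toy(s) - R_TR


interval = solution_interval(F, T, P, s_min=0.2e-6, s_max=1.5e-3)
print(interval)
```

In this toy case the lower end of the interval is set by *F* and the upper end by the overshoot functional *P*, the same pattern that the monotonicity argument predicts.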

Simulations show that  $Z_0$  as a function of g displays a single minimum (figure 6.) It is verifiable that F, as function of g, has at most one point where it vanishes. The interval for g, if it exists, where  $F \leq 0$  is  $[g_F, \lambda/10)$  with  $g_F$  the value of g where F vanishes.

From figure 6 it is clear that *T* and *P*, as functions of *g*, have at most two vanishing points. Our approach consists of finding these zeros, if they exist, and then obtaining the intervals where  $T(g) > 0$  and  $P(g) \le 0$  by inspection.



#### 5.1 Results

We solve the system of nonlinear equations (F, T, P) = 0 using Newton's method. We approximate the derivatives using finite differences, an approach that is quite tractable since the evaluation of the functionals is computationally inexpensive. To ensure convergence from any initial point we include the method of bisection, at a small extra computational cost. Solution intervals in *s* and *g* for some configurations are displayed in table 1. We used  $h = 0.65 \mu m$  and *Cu* as a conductor. The intervals shown in the table represent valid configurations for signal propagation at the speed of light. Notice that they are continuous and sufficiently rich to permit valid layout representations.
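One common way of combining Newton's method with bisection is to keep a bracketing interval and fall back to its midpoint whenever the Newton step leaves the bracket. The sketch below shows that idea for a single functional; it is an illustration under assumptions, not the solver used for Table 1. The routine name, step sizes and the example values ( $rL = 50\,\Omega$ ,  $R_{tr} = 100\,\Omega$ ) are hypothetical, and the unknown is taken to be  $Z_0$  for simplicity.

```python
import math


def safeguarded_newton(f, lo, hi, tol=1e-12, max_iter=100):
    """Newton iteration with forward-difference slope and bisection fallback."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0.0:
        raise ValueError("root is not bracketed")
    x = 0.5 * (lo + hi)
    for _ in range(max_iter):
        fx = f(x)
        if fx == 0.0:
            return x
        # Shrink the bracket so that it always contains the root.
        if f_lo * fx < 0.0:
            hi = x
        else:
            lo, f_lo = x, fx
        step = 1e-7 * (1.0 + abs(x))            # finite-difference step
        slope = (f(x + step) - fx) / step
        candidate = x - fx / slope if slope != 0.0 else None
        if candidate is None or not (lo < candidate < hi):
            candidate = 0.5 * (lo + hi)         # bisection fallback
        if abs(candidate - x) <= tol * (1.0 + abs(x)):
            return candidate
        x = candidate
    return x


# ASSUMED values: r * L = 50 ohm, R_tr = 100 ohm.
R_TIMES_L, R_TR = 50.0, 100.0


def F_of_z0(z0):
    """Functional (14) viewed as a function of Z0 alone."""
    return R_TIMES_L / z0 - 2.0 * math.log(4.0 * z0 / (R_TR + z0))


z0_root = safeguarded_newton(F_of_z0, 40.0, 100.0)
print(f"F vanishes at Z0 = {z0_root:.2f} ohm")
```

Since each functional evaluation is a handful of arithmetic operations, the extra evaluation spent on the finite-difference slope costs little, which matches the remark above about tractability.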

#### 6. RISE TIME CORRECTIONS

The function  $V_{DM}(t) := V_{fin}(L, t)$ , given in equation (42) of DM, is the output response to a Heaviside step input. From general theorems on linear partial differential equations it is well known that the output  $V_{out}(t)$  for a general input signal I(t) is given by:

$$V_{out}(t) = \frac{d}{dt} (V_{DM}(t) * I(t)), \tag{17}$$

where the operator "\*" is the convolution product.

Consider for I(t) a pulse of rise time  $T_{rise}$ . The response  $V_{out}$  to I becomes (from equation (17)):

$$V_{out}(t) = \int_{-\infty}^{\infty} V_{DM}(\tau) \frac{dI(t-\tau)}{dt} d\tau. \tag{18}$$

The derivative of  $I(t - \tau)$  is equal to  $1/T_{rise}$  for  $t - T_{rise} < \tau < t$ and is zero elsewhere. Therefore

$$V_{out}(t) = \frac{1}{T_{rise}} \int_{t-T_{rise}}^{t} V_{DM}(\tau) d\tau. \tag{19}$$
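Equation (19) says that the finite rise time response is a moving average of the step response over a window of width  $T_{rise}$ . A minimal numerical sketch follows. The step response used here, `v_dm_toy`, is an invented stand-in (a jump at  $t_f$  followed by an exponential relaxation, normalized to  $V_{dd} = 1$ ); it is not the DM series, and all parameter values are assumptions.

```python
import math

T_F, T_RISE = 15.8e-12, 30e-12   # ASSUMED time of flight and rise time, s
V0, TAU = 0.6, 40e-12            # ASSUMED jump height and relaxation time


def v_dm_toy(t):
    """Invented step response: zero before t_f, then a jump and relaxation."""
    if t < T_F:
        return 0.0
    return 1.0 - (1.0 - V0) * math.exp(-(t - T_F) / TAU)


def v_out(t, t_rise, t_f, v_dm, n=2000):
    """Equation (19) by the trapezoid rule.

    The step response vanishes before t_f, so the integration starts at
    max(t - t_rise, t_f) and never straddles the discontinuity.
    """
    lo = max(t - t_rise, t_f)
    if t <= lo:
        return 0.0
    h = (t - lo) / n
    total = 0.5 * (v_dm(lo) + v_dm(t))
    for k in range(1, n):
        total += v_dm(lo + k * h)
    return total * h / t_rise


for t in (T_F, T_F + 0.5 * T_RISE, T_F + T_RISE, T_F + 2.0 * T_RISE):
    print(f"t = {t * 1e12:5.1f} ps  V_out = {v_out(t, T_RISE, T_F, v_dm_toy):.4f}")
```

The averaged response starts from zero at  $t_f$  and rises continuously, whereas the step response itself jumps there; this is the smoothing effect visible in figure 7.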

We take the parameters such that the configuration is in Region I, and perform numerical integration with (19). Results are shown in figure 7. Notice that  $V_{out}(t)$  has a discontinuity in its first derivative at a point *p* that we shall identify. For this purpose, we examine the first derivative of  $V_{out}$  by differentiation of (19):

$$\frac{dV_{out}(t)}{dt} = \frac{1}{T_{rise}} \left[ V_{DM}(t) - V_{DM}(t - T_{rise}) \right]. \tag{20}$$

| ID | $L$ (µm) | $R_{tr}$ (Ω) | $w$ (µm) | $g$ (µm) at $s = 2\mu m$ | $g$ (µm) at $s = 5\mu m$ | $s$ (µm) at $g = 6\mu m$ | $s$ (µm) at $g = 8\mu m$ |
|----|----------|--------------|----------|--------------------------|--------------------------|--------------------------|--------------------------|
| 1  | 2500     | 100          | 6.0      | [5.3, 59.0]              | (*, *)                   | [0.28, 4.0]              | [0.26, 4.0]              |
| 2  | 2500     | 100          | 8.0      | [5.3, 83.0]              | (*, *)                   | [0.27, 4.5]              | [0.26, 4.5]              |
| 3  | 2500     | 100          | 10.0     | [5.3, 756.0]             | [5.3, 314.0]             | [0.31, 20.0]             | [0.29, 22.0]             |
| 4  | 2500     | 150          | 6.0      | [5.3, -)                 | [5.3, 1105.0]            | [0.5, 34.0]              | [0.45, 37.0]             |
| 5  | 2500     | 150          | 8.0      | [5.3, -)                 | [5.3, -)                 | [0.49, 41.0]             | [0.45, 44.0]             |
| 6  | 2500     | 150          | 10.0     | [5.3, -)                 | [5.3, -)                 | [0.91, 560.0]            | [0.77, 621.0]            |
| 7  | 10000    | 100          | 6.0      | [5.3, 57.0]              | (*, *)                   | [0.71, 4.0]              | [0.8, 4.0]               |
| 8  | 10000    | 100          | 8.0      | [5.3, 80.0]              | (*, *)                   | [0.6, 4.5]               | [0.5, 4.5]               |
| 9  | 10000    | 100          | 10.0     | [5.3, 643.0]             | [5.3, 288.0]             | [1.2, 20.0]              | [0.81, 21.0]             |
| 10 | 10000    | 150          | 6.0      | [7.6, -)                 | [5.3, 874.0]             | [2.3, 33.0]              | [1.9, 36.0]              |
| 11 | 10000    | 150          | 8.0      | [5.7, -)                 | [5.3, -)                 | [2.0, 40.0]              | [1.6, 43.0]              |
| 12 | 10000    | 150          | 10.0     | [23.0, -)                | [5.6, -)                 | [4.8, 497.0]             | [4.0, 544.0]             |

Table 1: We take  $s_{min} = 0.2\mu m$  and  $w \ge 5.3\mu m$  (to upper bound the DC wire resistance by  $50\,\Omega/cm$ ). '-' means that the corresponding variable is bounded by  $g < 0.1\lambda$ , and (\*, \*) means not feasible.



**Figure 7:** Response for  $T_{rise} = 0$  and for  $T_{rise} = 30 ps$ .

The derivative (20) has its first two discontinuities at  $t = t_f$  and  $t = t_f + T_{rise}$ , since  $V_{DM}$  has a discontinuity at  $t = t_f$ . We conclude that  $p = t_f + T_{rise}$ . From equation (19):

$$V_{out}(t_f + T_{rise}) = \frac{1}{T_{rise}} \int_{t_f}^{t_f + T_{rise}} V_{DM}(\tau) d\tau. \tag{21}$$

We take the linear approximation for  $V_{DM}(t)$  in  $t \in (t_f, t_f + T_{rise})$ . From the mean value theorem, it follows that:

$$V_{out}(t_f + T_{rise}) \approx V_{DM}\left(t_f + \frac{T_{rise}}{2}\right). \tag{22}$$

The result is exact in the linear approximation.

Consider now the linear approximation to  $V_{out}(t)$  in the same interval:

$$V_{out}(t) \approx \frac{V_{DM}(t_f + T_{rise}/2)}{T_{rise}}(t - t_f)$$

Impose  $V_{out}(t) = 0.5V_{dd}$ , which is feasible since we took solutions in region I, and obtain:

$$t_{50\%} = t_f + \frac{T_{rise}}{2} \left( \frac{V_{dd}}{V_{DM}(t_f + T_{rise}/2)} \right), \tag{23}$$

which is the new expression for the 50% time delay. It improves significantly over the naive shift in the delay computation,  $t = t_f + T_{rise}/2$ . The expression has been derived for  $T_{rise} < 4t_f$ , since at this point new discontinuities appear in  $V_{DM}(t)$ . The linear approximations are not necessarily as good over the whole interval.
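The difference between (23) and the naive shift is easy to see numerically. In the sketch below the step response is again an invented stand-in, this time linear after the jump at  $t_f$  so that the linear approximation is exact; the function name and all numbers are assumptions chosen for illustration, with voltages normalized to  $V_{dd} = 1$ .

```python
T_F, T_RISE = 15.8e-12, 30e-12      # ASSUMED time of flight and rise time, s
V0 = 0.6                            # ASSUMED jump of the step response at t_f
SLOPE = 0.1 / (0.5 * T_RISE)        # chosen so that V_DM(t_f + T_rise/2) = 0.7


def v_dm_linear(t):
    """Invented step response, linear after the jump at t_f."""
    return 0.0 if t < T_F else V0 + SLOPE * (t - T_F)


def t50_finite_rise(t_f, t_rise, v_dm, v_dd=1.0):
    """Equation (23): 50% delay corrected for finite rise time."""
    v_mid = v_dm(t_f + 0.5 * t_rise)
    if v_mid < 0.5 * v_dd:
        # The 50% crossing would fall outside (t_f, t_f + T_rise).
        raise ValueError("V_DM too small for the linear approximation")
    return t_f + 0.5 * t_rise * v_dd / v_mid


t50 = t50_finite_rise(T_F, T_RISE, v_dm_linear)
t50_naive = T_F + 0.5 * T_RISE
print(f"equation (23): {t50 * 1e12:.2f} ps, naive shift: {t50_naive * 1e12:.2f} ps")
```

Because the step response at  $t_f + T_{rise}/2$  is below  $V_{dd}$  on a lossy line, (23) always gives a delay at least as long as the naive shift; the two coincide only when the line delivers the full swing.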

We have verified numerically that the relative error incurred in the delay calculation, resulting from the linear approximations presented above, is very small for  $T_{rise} \leq 2t_f$ . The error increases with  $T_{rise}$ , and for  $T_{rise} = 2t_f$  is a reasonable 3%. The error becomes large by the time we reach the upper limit  $T_{rise} = 4t_f$ . Transmission line behavior, which is by now to be identified with (23) demands:

$$T_{rise} \le 2\sqrt{lc}\,L. \tag{24}$$

Expression (23) is a solution to the delay estimate for a signal in Region I for  $T_{rise} < 2t_f$ .

In fact, for fixed  $T_{rise}$  the *L* dependence of (23) displays both linear and quadratic behavior, at variance with the zero rise time solution. A second bound for a signal in Region I, inequality (2), is the one that makes the quadratic term negligible. The expressions (2) and (23) are consistent with the bounds given in the literature [11] for *rlc* behavior. A subsidiary consequence of finite  $T_{rise}$  is the relaxation of the overshoot constraint  $Z_0 < R_{tr}$ .

Corrections due to finite clock period in (23) are negligible for clock period of reasonable bandwidth (clock period larger than  $6T_{rise}$ .)

# 7. VALIDATION OF THE RESULTS

We reexamine our previous conclusions in the presence of frequency dependent effects. In general, r, l, c parameters are frequency dependent, due to well understood phenomena: proximity, skin effect and dielectric relaxation. The range of frequencies we are interested in is determined by the rise time of the signal. The frequency spectrum will contain appreciable content up to a frequency given by:

$$f_{max} = \frac{1}{\pi T_{rise}}. \tag{25}$$

This is the view of the  $T_{rise}$  related phenomena in the Laplace domain. The rise time  $T_{rise}$  is determined by technology and circuit considerations. Technology determines the transistor delay,  $\tau_{tr} \approx 2ps$  at 130nm, while the  $T_{rise}$  of the signal is determined by the underlying logic feeding the line,  $T_{rise} \approx 30 ps$  at 130nm. The signal spectrum will be appreciable up to O(10) GHz. We examine first possible corrections to our constant r and l assumption. The skin effect is controlled by the skin depth  $\delta = \sqrt{\frac{1}{\pi \mu_0 f \sigma}}$ . We take  $h \approx \delta$  at the maximum frequency of 10 GHz, making the skin effect correction negligible. With regard to c, the frequency regime where dielectric response times are comparable to the rise times of signals is above the region of interest.
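The two numbers used in this argument, the spectral limit (25) and the skin depth at 10 GHz, can be reproduced as follows. The sketch uses the standard skin depth expression  $\delta = \sqrt{1/(\pi f \mu_0 \sigma)}$ ; the copper conductivity is an assumed bulk value and the function names are hypothetical.

```python
import math

MU0 = 4e-7 * math.pi   # permeability of vacuum, H/m
SIGMA_CU = 5.8e7       # ASSUMPTION: bulk copper conductivity, S/m


def f_max(t_rise):
    """Equation (25): upper edge of the appreciable signal spectrum, Hz."""
    return 1.0 / (math.pi * t_rise)


def skin_depth(freq_hz, sigma=SIGMA_CU):
    """Standard skin depth sqrt(1 / (pi * f * mu0 * sigma)), in metres."""
    return math.sqrt(1.0 / (math.pi * freq_hz * MU0 * sigma))


f_limit = f_max(30e-12)
delta = skin_depth(10e9)
print(f"f_max = {f_limit / 1e9:.1f} GHz, skin depth at 10 GHz = {delta * 1e6:.2f} um")
```

Under the assumed conductivity the skin depth at 10 GHz is close to the metal thickness  $h = 0.65\,\mu m$  used for Table 1, which is the condition  $h \approx \delta$  invoked above.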

Regarding the proximity effect: its main effects are to increase r and decrease l as a function of f. The modifications to the constant parameter assumption can be significant for wide wires separated by short distances. The partial self inductance contribution is the most sensitive to proximity effects, since the current on each wire tends to redistribute towards the surfaces closer to the neighboring wires. The classic quasi-static treatment of section 3.1 is replaced by FastHenry simulations. The partial self inductance for the wires described in Table 1 can decrease by up to 4% from the quasi-static values. The variation in mutual inductance is less than 1% for all the configurations represented in Table 1. Consequently the loop inductance can decrease by at most 6% for frequencies up to 10 GHz.

The r variation due to the same physical effect is larger than the corresponding reactance variation. See table 2.

| g (in μm) | % increase in r |
|-----------|-----------------|
| 10        | 25%             |
| 15        | 26%             |
| 30        | 26%             |
| 50        | 25%             |

 
Table 2: % increase in r from static value to 10 GHz.

The net result on our solution space is that an increase of  $r(\omega)$ and a decrease of  $l(\omega)$  makes inequality (2) more restrictive.

We focus our attention on the solution intervals for *s*, the most suitable running variable. We display a couple of configurations from table 1 and their respective solution intervals of s. The new intervals are those at 10 GHz. The net modification for a given interval in s,  $(s_1, s_2)$ , and given w and g, is to increase  $s_1$  so as to compensate for the increase in r and the decrease in l. Naturally the effect is more pronounced for longer wires. See table 3.

| ID | Old $s_1$ (in $\mu m$) | New $s_1$ (in $\mu m$) |
|----|------------------------|------------------------|
| 1  | 0.28                   | 0.33                   |
| 7  | 0.71                   | 1.8                    |

Table 3: Renormalization of  $s_1$  (table 1) when proximity effects are considered.  $g = 6\mu m$ .

Solutions continue to exist in the presence of proximity effects for lengths on the *cm* scale. The upper limits in *L* depend on specific details of the technology and on the particular wires under consideration. The lower limits, on the other hand, are dictated exclusively by (24), which only depends on the technology:  $L \geq 1mm$  for 130nm.

#### 8. REFLECTIONS

In the previous sections we showed how to find solution intervals for each branch of the SBHT. We have used separability to obtain the solution for the whole tree. We have, on the other hand, neglected the effect of reflections on the solution space. Reflections occur at each physical discontinuity, such as the T's of the tree. Reflections are rarely accounted for in the timing behavior of digital systems, an approximation that is no longer sustainable in the electromagnetic domain. Reflections are an unavoidable consequence of wave propagation. Given an SBHT of depth n, the magnitude of the reflection at each T is determined by the impedance mismatch at the boundary. Since at the T we have two branches in parallel, to equalize the impedance, and thus eliminate reflections, each one of the branches must have:

$$Z_i = 2Z_{i-1} = \dots = 2^i Z_0 \text{ for } i = 1, \dots, n, \tag{26}$$

where the subindex *i* characterizes the depth of the tree. Now,

$$Z_i = Z_{i,0} \sqrt{\frac{p + \frac{r_i}{l_i}}{p}},\tag{27}$$

with  $Z_{i,0} = \sqrt{\frac{l_i}{c_i}}$ , and *p* the Laplace complex variable. We make the high frequency approximation  $Z_i \approx Z_{i,0}$ . This is an accurate approximation for the high frequency part of the impedance. Since  $p = j2\pi f_{max}$  with  $f_{max} \approx 10\,GHz$ , the error in the approximation is small. We match the impedance based on its high frequency component, since it is the high frequency regime that is responsible for the linear time of flight behavior of the delay (in other words, the small t behavior). We keep, for routing purposes, common w and g on all branches, i.e. r is the same for all branches. Notice that at the T's we do not need to satisfy (26) for resistance matching.

To minimize reflections, for fixed w and g, we vary  $s_i$  from one level of the tree to the next in such a way as to satisfy equation (26).<sup>1</sup> The modification demanded by impedance matching on our previous analysis translates into an iterative process. The first step consists of choosing a driver's strength  $R_{tr}$  feeding the main branch, and finding its solution interval  $(s_{0,1}, s_{0,2})$ . Choose a value  $s_0$  belonging to this interval, then find the values of  $s_i$  for the remaining levels using relation (26). Notice that the appropriate  $L_n$  that enters into the equation is the sum of the L's corresponding to the father plus those corresponding to the sons up to the leaf. We return to table 1. We consider the SBHT embedded in a square of side 2L, with L the length of the main branch. We found that for the parameters considered, the maximum tree depth is n = 2 (two T's). Narrowing the signal line permits the exploration of greater depths. This result is to be expected, since the rate of increase of  $Z_{i,0}$  with s is slow. For deeper levels of the tree the resulting interwire separation then becomes too large to be acceptable, because equation (26) demands an exponential growth of the impedances  $Z_i$ .
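The exponential growth demanded by (26) is what limits the tree depth. The sketch below lists the matched impedances level by level and counts how many levels fit under a ceiling on the achievable characteristic impedance. The ceiling `z_max` and the trunk impedance are hypothetical inputs; in practice the ceiling would follow from the largest acceptable spacing *s*.

```python
def matched_impedances(z0, depth):
    """Equation (26): Z_i = 2**i * Z_0 for levels i = 0 .. depth."""
    return [z0 * 2 ** i for i in range(depth + 1)]


def max_matched_depth(z0, z_max):
    """Deepest level whose matched impedance does not exceed z_max."""
    depth = 0
    while z0 * 2 ** (depth + 1) <= z_max:
        depth += 1
    return depth


# HYPOTHETICAL numbers: 35 ohm trunk, 150 ohm largest reachable impedance.
depth = max_matched_depth(35.0, 150.0)
levels = matched_impedances(35.0, depth)
print(depth, levels)
```

With these illustrative numbers only two T's can be matched before the required impedance exceeds the ceiling, which mirrors the qualitative conclusion reached above.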

#### 8.1 **Repeater Insertion**

Since trees with multiple branches run out of steam after two or three T's for transmission line propagation, repeaters are needed. The criterion for repeater insertion is qualitatively similar in the r, c and r, l, c domains: ensuring that the quadratic term in L in the time delay expression is negligible. Banerjee and Mehrotra have done a careful analysis [12] taking into account load capacitance and parasitic repeater output capacitance. In the regime where transmission line effects are important, the line in fact decouples the receiver from the driver within the time delay; therefore, to zeroth order, it is reasonable to consider the open ended line configuration. We have found configurations that behave as *l*, *c* transmission lines for values of inductance that are small  $(l = O(10^{-7})H/m)$ .

<sup>1</sup>We neglect effects due to vias present at the T's, their dimension being negligible vis-à-vis the wavelength.

The criteria for selecting repeater size are entirely different from those in the r, c domain. The maximum length permissible for time of flight computation, for the same values of the r, c parameters, is somewhat smaller than the optimal length for repeater insertion in the r, c domain. Both of these lengths are computed with the same criterion in mind, that of ensuring that the term linear in L in the time delay dominates the quadratic term. The controlling variable is:

$$\Psi = \frac{rL}{2Z_0}. \tag{28}$$
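As a rough numerical check of equation (28), the sketch below evaluates $\Psi$ for the order-of-magnitude clock wire parameters quoted in the introduction ($r \approx 5000\,\Omega/m$, $l \approx 10^{-7}\,H/m$, $c \approx 10^{-10}\,F/m$). It assumes the lossless form $Z_0 = \sqrt{l/c}$; the function names and the chosen lengths are illustrative only and are not taken from table 1.

```python
import math


def characteristic_impedance(l_per_m, c_per_m):
    """Lossless characteristic impedance Z0 = sqrt(l/c), in ohms (assumed form)."""
    return math.sqrt(l_per_m / c_per_m)


def psi(r_per_m, length_m, z0_ohm):
    """Controlling variable of equation (28): Psi = r * L / (2 * Z0)."""
    return r_per_m * length_m / (2.0 * z0_ohm)


# Order-of-magnitude parameters from the introduction (not a specific design).
r = 5000.0   # ohm / m
l = 1e-7     # H / m
c = 1e-10    # F / m

z0 = characteristic_impedance(l, c)

# Illustrative line lengths in millimeters (assumed values).
for length_mm in (1.5, 3.0, 6.0):
    value = psi(r, length_mm * 1e-3, z0)
    print(f"L = {length_mm} mm: Psi = {value:.3f}")
```

For these inputs $Z_0$ is about 32 ohm, and $\Psi$ grows linearly with the length, from roughly 0.12 at 1.5 mm to roughly 0.47 at 6 mm.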

Our representative values for $\Psi$ range from 0.05 to 0.48. The size requirements for repeaters differ between the two regimes. Repeater size determination, in the presence of transmission lines, is governed by impedance matching. The repeater's input capacitance is dictated by (26), indicating smaller drivers downstream. In fact, the load capacitance associated with these repeaters is in general significantly smaller than the line capacitances, justifying a posteriori our initial assumption of neglecting load capacitance effects. This result differs from that of [12] for the same l; in [12], l, c behavior is expected for large l. The mechanism for repeater insertion that emerges from our analysis is one in which repeaters are placed, from source to destinations, at half of the T's following the method of Section 8. Placing a repeater decouples the downstream analysis from the preceding one. The repeater input impedance is matched to the incoming branch $Z_i$; the repeaters decrease in size downstream, as do the lengths of the branches. The procedure continues until the overall length to be considered violates the lower bound (24) (L = 1.5mm at 130nm).

A full clock, of course, needs to be laid out in such a way as to arrive at multiple non-equidistant final destinations. This is achieved by the introduction of grid-like interconnect in the vicinity of the receivers [5, 13]. The longer paths of the clock layout are routed with the SBHT, which is followed in the neighborhood of the receivers by grid-like interconnect structures. The presence of the grid interconnect does not affect the results of this work, since the grids extend over a length scale of the order of a millimeter, a domain in which inductance effects are negligible. The delay contribution arising from these short branches can be adjusted by more traditional means such as resistance matching and buffer insertion near the destinations, always necessary ingredients to minimize skew.

# 9. CONCLUSIONS

We have examined in detail the emergence of a regime in wire delay where transmission line behavior can describe the propagation of signals in wires. We apply a general method of transmission line description to find wire configurations that, for particular driver strengths and signal rise times, can behave as transmission lines. For illustration purposes we have chosen the clock configuration, the clock being the most critical signal in the digital domain. We have carefully examined most known factors that could affect our analysis. We have derived a new expression for the time delay calculation in the $l, c$ domain in the presence of finite $T_{rise}$. It is well known that, with scaling accompanied by growth in overall chip dimensions, full chip synchronization is at a crossroads. Our approach of finding configurations that minimize the wire delay has a natural set of applications in local and global clock design. We propose that there is a natural way to exploit inductance, in terms of sandwich configurations with controllable return paths that, for sensible values of the technology parameters, can ensure both maximum speed propagation and the absence of undesirable overshoots. The representative values of the impedance of the lines are reasonable. The methodology is extendable to the treatment of other important signal lines such as buses. Some care in terms of noise effects needs to be added to the present treatment when describing closely spaced bus signal wires, even in sandwich configurations. An alternative to our approach is to pose the problem in the frequency domain, with parameters that are frequency dependent, and to perform the inverse Laplace transform numerically. This alternative approach is prone to numerical instabilities, which we chose to avoid. Given the monotonicity properties of the functionals that determine the allowed intervals, the use of such an alternative approach is unnecessary.

# 10. ACKNOWLEDGEMENT

We thank V. Pereyra for discussions and for providing us with the program VARPRO.

# 11. REFERENCES

- [1] Y. LU, K. BANERJEE AND M. CELIK, *A Fast Analytical Technique for Estimating the Bounds of On-Chip Clock Wire Inductance*. CIS, Stanford University, 2001.
- [2] B. A. GIESEKE ET AL., *A 600 MHz superscalar RISC microprocessor with out-of-order execution*. Digest Tech. Papers 1997 Int. Solid-State Circuits Conf., pp. 176-177.
- [3] J. DAVIS AND J. MEINDL, *Compact Distributed RLC Interconnect Models*. IEEE Trans. Electron Devices, vol. 47, no. 11, pp. 2068-2087, Nov. 2000.
- [4] T. SAKURAI, *Closed-form expressions for interconnect delay, coupling, and crosstalk in VLSIs*. IEEE Trans. Electron Devices, vol. 40, pp. 118-124, Jan. 1993.
- [5] R. M. AVERILL III ET AL., *Chip integration methodology for the IBM S/390 G5 and G6 custom microprocessors*. IBM, 1999.
- [6] N. CHANG ET AL., *Clocktree RLC extraction with efficient inductance modeling*. IEEE Design Automation Conference, 2000.
- [7] F. GROVER, *Inductance Calculations: Working Formulas and Tables*. Instrument Society of America, 1945.
- [8] M. KAMON, M. J. TSUK, AND J. WHITE, *FastHenry: A multipole-accelerated 3-D inductance extraction program*. IEEE Trans. MTT, vol. 42, no. 9, Sept. 1994.
- [9] F. CHARLET ET AL., *ICARE: A 3-D capacitance simulator*. LETI, Grenoble, France.
- [10] G. GOLUB AND V. PEREYRA, *The Differentiation of Pseudo-Inverses and Nonlinear Least Squares Problems whose Variables Separate*. Tech. Rep. STAN-CS-72-261, Stanford Univ., Stanford, Calif., 1972.
- [11] Y. I. ISMAIL AND E. G. FRIEDMAN, *Effects of inductance on the propagation delay and repeater insertion in VLSI circuits*. IEEE Trans. VLSI Systems, vol. 8, pp. 195-206, Apr. 2000.
- [12] K. BANERJEE AND A. MEHROTRA, *Accurate Analysis of On-Chip Inductance Effects and Implications for Optimal Repeater Insertion and Technology Scaling*. IEEE Symposium on VLSI Circuits, Kyoto, Japan, June 14-16, 2001, pp. 195-198.
- [13] P. RESTLE ET AL., *A Clock Distribution Network for Microprocessors*. IEEE Journal of Solid-State Circuits, vol. 36, no. 5, pp. 792-799, May 2001.