# HSpeedEx: A High-Speed Extractor for Substrate Noise Analysis in Complex Mixed-Signal SOC

Adil Koukab, Catherine Dehollain, and Michel Declercq

Swiss Federal Institute of Technology (EPFL), Electronics Laboratory CH-1015 Lausanne, Switzerland adil.koukab@epfl.ch

**ABSTRACT:** The unprecedented impact of noise coupling on the functionality of Mixed-Signal Systems-on-a-Chip (MS-SOC) brings a new set of challenges for Electronic Design Automation (EDA) tool developers. In this paper, we propose a new approach that combines a thorough physical understanding of noise coupling effects with an improved Boundary-Element Method (BEM) to accelerate substrate model extraction and to avoid dense matrix storage. The low computational effort required, together with the speed and accuracy reached, makes this method a highly promising alternative for verifying complex MS-SOCs.

#### 1. INTRODUCTION

The relentless move to single-chip integration of such heterogeneous designs as digital, analog, and RF front-end blocks has opened the door to a host of challenging substrate noise coupling effects. In fact, the large crosstalk and power supply noise generated by digital blocks can be transmitted to sensitive analog sections through the substrate, resulting in malfunction of the system and even in its instability.

The most precise solution for substrate modeling certainly comes from solvers based on the finite difference method (FDM) [1-4]. However, despite recent improvements in this domain [1-4], the computational effort required by field solvers is still considerable. In addition, even if the size of the circuit allows the substrate to be modeled, storing and manipulating the enormous N×N matrix of coupling resistances is impractical [5]. Several approaches have been proposed to deal with the complexity encountered when simulating large, dense coupling networks [6-9]. Recently, an approach inspired by wavelets was proposed to sparsify the dense conductance matrix [6]. Other approaches, which merit further investigation, are presented in [5]. Boundary-Element Methods (BEM) have been successfully adapted to the substrate-modeling problem [10-12].

This work has been supported by the Swiss Office Fédérale de l'Education et de la Science (N OFES 99-0336-2)

PERMISSION TO MAKE DIGITAL OR HARD COPIES OF ALL OR PART OF THIS WORK FOR PERSONAL OR CLASSROOM USE IS GRANTED WITHOUT FEE PROVIDED THAT COPIES ARE NOT MADE OR DISTRIBUTED FOR PROFIT OR COMMERCIAL ADVANTAGE AND THAT COPIES BEAR THIS NOTICE AND THE FULL CITATION ON THE FIRST PAGE. TO COPY OTHERWISE, OR REPUBLISH, TO POST ON SERVERS OR TO REDISTRIBUTE TO LISTS, REQUIRES PRIOR SPECIFIC PERMISSION AND/OR A FEE.

DAC 2002, JUNE 10-14, 2002, NEW ORLEANS, LOUISIANA, USA. COPYRIGHT 2002 ACM 1-58113-461-4/02/0006...\$5.00.

Storage and inversion of the resulting impedance matrix, however, make the required computational effort often prohibitive and full-chip MS-SOC analysis intractable.

In this paper, we propose a new approach that combines a thorough physical understanding of noise coupling effects with an improved Boundary-Element Method (BEM) to accelerate substrate model extraction and to avoid dense matrix storage. Our substrate analysis package, HSpeedEx, applies the algorithms of the method in an efficient, design-oriented methodology. The low computational effort required, together with the speed and accuracy reached, makes this method one of the most promising alternatives for verifying complex MS-SOCs.

#### 2. BACKGROUND

Consider a system of M planar contacts, decomposed into N panels, on top of a substrate. For such a system, the relation between the N panel potentials ($\Phi_p \in \mathbb{R}^N$) and the N currents through the panels ($I_p \in \mathbb{R}^N$) is given by $\Phi_p = Z_p I_p$, where $Z_p \in \mathbb{R}^{N \times N}$ is the impedance matrix. Obtaining each entry $z_{ij}$ of this impedance matrix requires computing an integral involving the Green's function over the appropriate panel surfaces, as shown by Eq. 1.

$$z_{ij} = \frac{\overline{\varphi_i}}{I_j} = \frac{1}{S_i S_j} \iint_{S_i S_j} G(s_i, s_j)\, ds_j\, ds_i ; \quad G = \sum_{m,n=0}^{\infty} f_{mn} \cos(\alpha x) \cos(\alpha x') \cos(\beta y) \cos(\beta y') \tag{1}$$

where $\alpha = m\pi/a$, $\beta = n\pi/b$, $a$ and $b$ are the substrate lateral dimensions, $S_i$ and $S_j$ are the surfaces of panels $i$ and $j$, and $(x, y)$ and $(x', y')$ are the coordinates of the points $s_i$ and $s_j$. $f_{mn}$ is computed using a recursion formula as shown in [10]. It is also demonstrated in [10] that $z_{ij}$ can be expressed as a function of 64 2-D discrete cosine transform (DCT) coefficients $K(p,q)$, with

$$K(p,q) = \sum_{m=0}^{P-1} \sum_{n=0}^{Q-1} k_{mn} \cos \left( m\pi \frac{p}{P} \right) \cos \left( n\pi \frac{q}{Q} \right) \tag{2}$$

$k_{mn}$ is a function of $f_{mn}$, and the 64 $(p, q)$ terms are determined from the ratios of the contact coordinates to the substrate dimensions. For more details see [10] and [11]. These coefficients can be computed at high speed using the fast Fourier transform (FFT) [10]. Once the impedance matrix $Z_p$ is computed, it must be inverted in order to generate the admittance matrix $Y_p$.
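The role of the FFT in evaluating Eq. 2 can be illustrated with a minimal sketch. It assumes NumPy is available, uses random placeholder values for $k_{mn}$ (not real substrate data), and the function names `dct_direct` and `dct_fft` are hypothetical. Because the input is real, the real part of a zero-padded FFT of length 2P reproduces the sum over $\cos(m\pi p/P)$, so the 2-D coefficients follow from one FFT per axis.

```python
import numpy as np

def dct_direct(k):
    """Evaluate Eq. 2 literally: K(p,q) = sum_mn k_mn cos(m*pi*p/P) cos(n*pi*q/Q)."""
    P, Q = k.shape
    cp = np.cos(np.pi * np.outer(np.arange(P), np.arange(P)) / P)  # cp[p, m]
    cq = np.cos(np.pi * np.outer(np.arange(Q), np.arange(Q)) / Q)  # cq[q, n]
    return cp @ k @ cq.T

def dct_fft(k):
    """Same sums via zero-padded FFTs, applied one axis at a time."""
    P, Q = k.shape
    # For real input, Re{FFT of length 2P}[p] = sum_m k_m cos(pi*m*p/P).
    step = np.fft.fft(k, n=2 * P, axis=0).real[:P, :]
    return np.fft.fft(step, n=2 * Q, axis=1).real[:, :Q]

rng = np.random.default_rng(0)
k_mn = rng.standard_normal((8, 16))   # placeholder coefficients, not real f_mn data
K_direct = dct_direct(k_mn)
K_fast = dct_fft(k_mn)
print("max difference:", np.abs(K_direct - K_fast).max())
```

The direct evaluation costs on the order of P·Q operations per coefficient, whereas the FFT route produces the whole table of coefficients at once, which is why a lookup of the 64 required $(p, q)$ terms becomes cheap.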

The inversion of the impedance matrix and its storage form the most time- and memory-consuming stage of BEM. Iterative algorithms such as the Generalized Minimal Residual algorithm (GMRES) [13] can be used to speed up the computation. Another approach uses sparsification via eigendecomposition [12] to extend the applicability of the technique to larger circuits. Despite these improvements, the extraction process remains too slow to perform a practical full-chip analysis accurately [12]. It is worth noting that the substrate model generated by BEM, like any electromagnetic model, often introduces physical and numerical modeling errors. Some precautions can be taken to limit these errors. For instance, GMRES can be forced to converge to very tight tolerances. In addition, numerical instability problems can be avoided by using a particular Green's function [11]. Despite all these precautions, BEM still has difficulty handling large circuits.

#### 3. SUBSTRATE MODELING FLOW

In this section, an efficient hierarchical modeling methodology is proposed. The substrate coupling verification flow is broken into a set of independent modeling stages as shown in Fig. 1.



Fig. 1 Substrate Modeling Flow

The crucial result we demonstrate here is that the accuracy parameters governing the global substrate coupling (i.e., the inter-block coupling) are different from those governing the local transistor-level coupling (i.e., the intra-block coupling). As a result, we can accurately analyze the inter-block substrate coupling with a coarse representation of the local intra-block coupling, and conversely.

#### 3.1 Inter-block Coupling fundamentals

Several investigations of the substrate noise coupling process were performed in order to capture its fundamental characteristics. The substrate noise evaluation chip designed for this purpose (Fig. 2) includes N inverters (N varies from 12 to 1200). Vin1, Vin2, and Vout are, respectively, the on-chip ground of the circuit, the on-chip Vcc of the circuit, and an on-chip ground (GND) node representing a sensitive node. The transfer function Vout/Vin1 is simulated for chips with 12 to 1200 inverters and shown in Fig. 3. For the Simplified substrate Model (SM), only the ground contacts of the layout are considered. For the Full substrate Model (FM), the whole layout (PMOS, NMOS, wells, Vcc and ground contacts) is used. In both cases, the netlist of the circuit is added to the substrate model generated by FDM to simulate the transfer functions.



Fig. 2 Overview of substrate coupling evaluation chip with a typical package parasitic (wire inductance L = 5nH).



Fig. 3 The transfer function Vout/Vin1 and Vout/Vin2 for 12 and 1200 inverters: using simplified (SM) and full substrate model (FM)

As shown in Fig. 3, the simplified and full substrate models are in excellent agreement for all frequencies and numbers of inverters considered. The reason is that the coupling path from Vin1 to Vout can be decomposed into N parallel paths, and each of these can in turn be split into an indirect path through the NMOS ($R_t + 1/j\omega C_t$) in parallel with a direct path through the substrate ($R_s$). The term $R_t + 1/j\omega C_t$ is much larger than $R_s$ for two reasons: the low value of $C_t$ (~fF) and the high value of $R_t$ (indirect path). The direct path therefore dominates, and the transistor-level paths contribute little to the coupling. At high frequency, however, the Vcc to Vcc and Vcc to ground coupling should also be considered. In fact, in the high-frequency range the ground to ground isolation (Vout/Vin1) is of the same order of magnitude as the Vcc to ground isolation (Vout/Vin2) (see Fig. 3).
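The order-of-magnitude argument about the two parallel paths can be checked numerically. The sketch below is only illustrative: the values of $R_t$, $C_t$ and $R_s$ are hypothetical (the paper states only that $C_t$ is of the order of fF), and the function name `indirect_path_impedance` is an assumption.

```python
import math

def indirect_path_impedance(r_t, c_t, freq_hz):
    """Complex impedance R_t + 1/(j*omega*C_t) of the path through the NMOS."""
    omega = 2.0 * math.pi * freq_hz
    return r_t + 1.0 / (1j * omega * c_t)

# Hypothetical values, chosen only to illustrate the orders of magnitude.
R_T = 1.0e3      # ohm, resistance of the indirect path
C_T = 1.0e-15    # farad, transistor-to-substrate capacitance (~fF)
R_S = 100.0      # ohm, direct substrate resistance

ratios = {}
for f in (1e6, 1e8, 1e10):
    z_indirect = indirect_path_impedance(R_T, C_T, f)
    ratios[f] = abs(z_indirect) / R_S
    print(f"f = {f:.0e} Hz: |Z_indirect| / R_s = {ratios[f]:.3g}")
```

With these assumed values the indirect path stays far more resistive than the direct one across the whole range, and the ratio shrinks as the frequency rises, in line with the remark that additional coupling paths matter at high frequency.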

#### 3.3 Inter-block Coupling: Numerical analysis

In this section, we describe the algorithm proposed to extract the substrate model. As demonstrated in the last section, a layout with wells, Vcc contacts, and ground contacts is sufficient for an accurate representation of the inter-block substrate coupling. Despite this simplification of the layout, the network to be generated remains too dense to enable the targeted full-chip MS-SOC analysis. Further progress in numerical methods is therefore needed to reduce the computational effort. The crucial observation we make here is that the ground substrate contacts (respectively the Vcc contacts) of each block of the chip are linked by metal lines, and thus all substrate coupling paths between them are shorted. We can therefore treat the ground contacts (respectively the Vcc contacts) of each block (supposed to have its own on-chip ground bond-path) as a single contact when we analyze the inter-block substrate coupling.

Consequently, the discretization explained in Fig. 4 is sufficient for an accurate inter-block coupling representation. Since we focus on the coupling paths between the various blocks of the chip, we can consider that the current at the contacts positioned at the edges of each block is very high compared to the current at the contacts situated in its center. The edge contacts therefore form the dominant paths between blocks, which is why a fine partition should be used in the edge region. The closer we get to the center of the block, the weaker the role of the contacts in the inter-block coupling becomes, and the coarser the partition can be. The currents at the ground contacts (respectively the Vcc contacts) of each partition are considered to be constant.

The question that now emerges is how to exploit this partition to speed up the numerical computation. Let us consider two partitions of the chip, i and j, having L and M contacts, respectively (Fig. 4).
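A graded partition of this kind, fine at the block edges and coarser toward the center, can be sketched in one dimension. This is only an illustration under stated assumptions: the geometric growth factor, the block width, the finest cell size and the function name `graded_edges` are all hypothetical, and the paper's actual partitioning is two-dimensional.

```python
def graded_edges(width, finest, growth=2.0):
    """Return cell boundaries across [0, width]: fine at both edges, coarser toward the center."""
    half = width / 2.0
    sizes = []
    size, covered = finest, 0.0
    # Grow the cells geometrically while moving from the edge toward the center.
    while covered + size < half:
        sizes.append(size)
        covered += size
        size *= growth
    sizes.append(half - covered)          # last cell closes the gap up to the center
    left = [0.0]
    for s in sizes:
        left.append(left[-1] + s)
    right = [width - x for x in reversed(left[:-1])]   # mirror image of the left half
    return left + right

edges = graded_edges(width=1000.0, finest=10.0)
cell_sizes = [b - a for a, b in zip(edges, edges[1:])]
print(cell_sizes)
```

All contacts that fall inside one cell would then be merged into a single partition, so the number of unknowns is set by the number of cells rather than by the number of contacts.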



Fig. 4 MS-SOC partitioning example for inter-block coupling analysis

Since we assume that the currents of the L contacts (respectively the M contacts) are constant, the impedance representing the substrate coupling between the two partitions can be defined as

$$z_{ij} = \frac{\overline{\phi_i}}{I_j} = \frac{\overline{\phi_j}}{I_i} = \frac{1}{\sum_{l=1}^{L} S_l \sum_{m=1}^{M} S_m} \iint_{\sum_{l=1}^{L} S_l,\ \sum_{m=1}^{M} S_m} G(s, s')\, ds\, ds' \tag{3}$$

$S_l$ and $S_m$ are the surfaces of the individual contacts in partitions $i$ and $j$, so that

$$z_{ij} = \left(\sum_{l=1}^{L} S_{l} \sum_{m=1}^{M} S_{m}\right)^{-1} \sum_{l=1}^{L} \sum_{m=1}^{M} \iint_{S_{l} S_{m}} G(s, s')\, ds\, ds' \tag{4}$$

The double integrals in this equation represent the impedance between contact $l$ of partition $i$ and contact $m$ of partition $j$, weighted by the contact surfaces. Thus $z_{ij}$ can be represented by a weighted sum of the impedances between the contacts of each partition as

$$z_{ij} = \frac{\sum_{l=1}^{L} \sum_{m=1}^{M} z_{lm} S_{l} S_{m}}{\sum_{l=1}^{L} \sum_{m=1}^{M} S_{l} S_{m}} \tag{5}$$

Using the same procedure with some algebra, we can relate $z_{ii}$ to the impedances between the contacts of partition $i$. Thus, we obtain

$$z_{ii} = \left(\sum_{l=1}^{L} S_l\right)^{-2} \left(2 \sum_{l=1}^{L} \sum_{j=l+1}^{L} z_{lj} S_l S_j + \sum_{l=1}^{L} z_{ll} S_l^2\right) \tag{6}$$
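The reduction from contact-level to partition-level impedances can be sketched as follows. This is a minimal illustration assuming NumPy and a small toy matrix of hypothetical contact impedances and areas; the function names are not from the paper. Off-diagonal entries follow the area-weighted average of Eq. 5, and diagonal entries use the same average with $i = j$, normalized by the squared total area of the partition, which is what Eq. 3 gives when both integrals run over the same partition.

```python
import numpy as np

def partition_impedance(z_contacts, areas, part_i, part_j):
    """Area-weighted average of the contact impedances between two partitions."""
    s_i = areas[part_i]
    s_j = areas[part_j]
    block = z_contacts[np.ix_(part_i, part_j)]   # z_lm for l in i, m in j
    return (s_i @ block @ s_j) / (s_i.sum() * s_j.sum())

def reduce_to_partitions(z_contacts, areas, partitions):
    """Collapse an N x N contact matrix into a P x P partition matrix."""
    p = len(partitions)
    z_part = np.empty((p, p))
    for a in range(p):
        for b in range(p):
            z_part[a, b] = partition_impedance(
                z_contacts, areas, partitions[a], partitions[b])
    return z_part

# Toy data: 5 contacts with hypothetical symmetric impedances and areas.
rng = np.random.default_rng(1)
raw = rng.uniform(1.0, 2.0, size=(5, 5))
z_contacts = (raw + raw.T) / 2 + 5.0 * np.eye(5)
areas = np.array([1.0, 2.0, 1.5, 1.0, 0.5])
partitions = [np.array([0, 1]), np.array([2, 3, 4])]
z_part = reduce_to_partitions(z_contacts, areas, partitions)
print(z_part)
```

In a real extraction the contact-level entries would not be stored at all; each $z_{lm}$ would be accumulated into its partition entry as soon as it is computed, which is where the memory saving comes from.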

The gain in computational cost is evident, since we transform an impedance matrix of $M^2$ elements, with M the number of contacts, into an impedance matrix of $P^2$ elements, with P the number of partitions. Suppose that the chip of Fig. 4 is composed of one million contacts. Storing the impedance matrix entries then requires $O(10^{12})$ space. If each block is decomposed into 1000 partitions, with a chip of 7 blocks, the memory requirement is $O((7 \times 10^3)^2)$. The memory gain is thus considerable. The cost of inverting the full matrix by LU factorization, for example, is $O(10^{18})$, so the gain in execution time with our procedure is even larger.
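The arithmetic behind these estimates can be reproduced in a few lines. The sketch uses the figures quoted in the text (one million contacts, 7 blocks, 1000 partitions per block) and assumes the usual cubic operation count for a dense LU factorization; constant factors are ignored, so the numbers are orders of magnitude only.

```python
n_contacts = 10**6
n_blocks = 7
partitions_per_block = 1000
n_partitions = n_blocks * partitions_per_block

# Dense matrix storage grows with the square of the matrix dimension.
entries_full = n_contacts**2
entries_reduced = n_partitions**2

# Dense LU factorization grows with the cube of the matrix dimension.
lu_full = n_contacts**3
lu_reduced = n_partitions**3

memory_gain = entries_full / entries_reduced
time_gain = lu_full / lu_reduced

print(f"matrix entries: {entries_full:.1e} -> {entries_reduced:.1e} (gain x{memory_gain:.1e})")
print(f"LU operations:  {lu_full:.1e} -> {lu_reduced:.1e} (gain x{time_gain:.1e})")
```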

#### 3.4 Intra-block transistor-level coupling

For the intra-block coupling, we will distinguish between the noisy blocks modeling and the sensitive blocks modeling.

#### 3.4.1 Noisy blocks

In addition to power supply noise, the switching activity of digital blocks injects spurious signals into the substrate through the reverse-biased junction capacitances of the transistors or by impact ionization (hot carriers). The substrate contacts neighboring each transistor pick up most of these signals. In addition, each transistor is linked to the substrate ground by ($R_s + 1/j\omega C$). As C is only a few fF, the substrate resistances play a negligible role in this coupling. In conclusion, for all noisy circuits, we can connect the bulk of the NMOS transistors directly to ground (respectively the PMOS to Vcc) without any significant loss of accuracy.

#### 3.4.2 Sensitive analogue blocks

The sensitive analogue blocks may require a more accurate substrate model, since the substrate can have a direct effect on their performance. For instance, the noise figure of an LNA can be affected by the thermal noise of the substrate resistance. The substrate resistances can also change the input/output matching of the LNA, reducing its gain and its reverse isolation. The methodology for accurate transistor-level substrate modeling in analogue blocks is the following: once the inter-block coupling extraction is finished, we eliminate all the Vcc to GND resistances of the targeted sensitive circuit generated by this process. We keep only the resistances representing the inter-block coupling. The partitioning process used for transistor-level coupling analysis is exactly the inverse of the one used for inter-block coupling; that is, we use a partition that becomes increasingly coarse as we move away from the handled analogue block. An example is shown in Fig. 5.



Fig. 5 MS-SOC Partitioning example for fast intra-block coupling analysis: (A) is the targeted sensitive block

The partitions around the handled block are used only for modeling the effects of distant contacts (i.e., the inter-block coupling resistances) on the intra-block substrate coupling. As in the inter-block strategy, the currents in the partitions around the handled circuit are assumed to be constant, and the same analytical formulas (Eqs. 5 and 6) are used to speed up the computation and avoid a large ill-conditioned matrix.

#### 4. EXPERIMENTAL RESULTS

In this section, we present examples that show the accuracy and the efficiency of the methodology proposed in this paper. We focus on the inter-block substrate modeling, since it is the crucial process that makes it possible to evaluate the efficiency of the adopted isolation strategy, such as the floorplanning, the block placement, the guard ring distribution, the multiple bond-path assignment, etc. [14-15].



Fig. 6 Overview of substrate coupling evaluation chip: Test structure [ 4 blocks(I,II,III,VI) - 864 Inverters ]

The test layout used for preliminary verifications is presented in Fig. 6. Several experiments, using three substrate doping profiles (P1, P2, P3 in Fig. 7), have been conducted. The small thickness of P3 enables a fine mesh discretization for the FDM analyses.



Fig. 7 Substrate profiles used for extraction: P<sub>1</sub> low resistivity; P<sub>2</sub> high resistivity; P<sub>3</sub> high resistivity and low-thickness substrate

For each of the substrate profiles indicated, extraction was performed between the GND and the Vcc contacts of the blocks. Tables I(a) and I(b) show the inter-block resistances computed with BEM and with the proposed method (FastBEM) for the substrate profiles $P_1$ and $P_2$ of Fig. 7, respectively. R I-0, for instance, is the computed resistance between the ground contacts of block I and the backplane of the substrate, and R I-II is the resistance between the ground contacts of block I and those of block II. Table II shows a comparison between the results of BEM, FastBEM, and FDM in the case of the $P_3$ profile (Fig. 7). The agreement between the three methods is evident for all considered profiles.
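How resistances such as R I-0 and R I-II relate to an impedance matrix can be illustrated with a small sketch. It assumes NumPy and the standard reading of a nodal admittance matrix with the backplane as reference node (off-diagonal entries give the resistances between nodes, row sums give the resistances to the reference); the paper does not spell out this step, and the resistance values below are hypothetical.

```python
import numpy as np

def resistances_from_z(z):
    """Invert Z and read off an equivalent resistor network (reference = backplane)."""
    y = np.linalg.inv(z)
    n = y.shape[0]
    r_to_ref = 1.0 / y.sum(axis=1)                 # R_i-0 = 1 / sum_j Y_ij
    r_between = {(i, j): -1.0 / y[i, j]            # R_i-j = -1 / Y_ij
                 for i in range(n) for j in range(i + 1, n)}
    return r_to_ref, r_between

# Build a toy 3-block network from hypothetical resistances, then recover it from Z.
r0 = np.array([95.0, 76.0, 75.0])
rij = {(0, 1): 9000.0, (0, 2): 10000.0, (1, 2): 7000.0}
y = np.diag(1.0 / r0)
for (i, j), r in rij.items():
    g = 1.0 / r
    y[i, i] += g
    y[j, j] += g
    y[i, j] -= g
    y[j, i] -= g
z = np.linalg.inv(y)

r_to_ref, r_between = resistances_from_z(z)
print(r_to_ref)
print(r_between)
```

For a full-chip matrix an explicit inverse would be avoided in favor of a factorization or an iterative solver; the explicit inverse is used here only because the toy matrix is 3 × 3.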

| R      | BEM   | FastBEM |
|--------|-------|---------|
| I-0    | 95    | 95      |
| II-0   | 76    | 76      |
| III-0  | 75    | 75      |
| VI-0   | 94    | 94      |
| I-II   | 9119  | 9180    |
| I-III  | 10642 | 10694   |
| I-VI   | 12325 | 12743   |
| II-III | 6829  | 6930    |
| II-IV  | 10670 | 10740   |
| III-IV | 7932  | 7919    |

(a) low-resistivity profile $P_1$

| R      | BEM  | FastBEM |
|--------|------|---------|
| I-0    | 791  | 801     |
| II-0   | 721  | 717     |
| III-0  | 746  | 746     |
| VI-0   | 835  | 836     |
| I-II   | 1587 | 1712    |
| I-III  | 7412 | 8113    |
| I-VI   | 1763 | 1822    |
| II-III | 825  | 856     |
| II-IV  | 7330 | 7836    |
| III-IV | 956  | 1133    |

(b) high-resistivity profile $P_2$

Table I: Inter-block substrate resistances of the test chip

| R      | FDM  | BEM   | FastBEM |
|--------|------|-------|---------|
| I-0    | 117  | 115   | 115     |
| II-0   | 104  | 104   | 104     |
| III-0  | 105  | 107   | 108     |
| VI-0   | 121  | 121   | 121     |
| I-II   | 1172 | 1036  | 1067    |
| I-III  | -    | 17069 | 20050   |
| I-VI   | 1183 | 1123  | 1204    |
| II-III | 352  | 328   | 349     |
| II-IV  | -    | 14087 | 15615   |
| III-IV | 430  | 407   | 412     |

Table II: Inter-block substrate resistances of the test chip using P<sub>3</sub>
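The agreement between the methods can be quantified directly from the Table II entries. The sketch below copies the tabulated values, with row labels as printed, and computes relative deviations; the choice of relative deviation as the metric and the helper name `rel_dev` are assumptions made for illustration, not part of the paper's evaluation.

```python
# (FDM, BEM, FastBEM) values copied from Table II; None marks rows with no FDM value.
table_ii = {
    "I-0": (117, 115, 115),
    "II-0": (104, 104, 104),
    "III-0": (105, 107, 108),
    "VI-0": (121, 121, 121),
    "I-II": (1172, 1036, 1067),
    "I-III": (None, 17069, 20050),
    "I-VI": (1183, 1123, 1204),
    "II-III": (352, 328, 349),
    "II-IV": (None, 14087, 15615),
    "III-IV": (430, 407, 412),
}

def rel_dev(value, reference):
    """Relative deviation of value with respect to reference."""
    return abs(value - reference) / reference

fast_vs_bem = {k: rel_dev(v[2], v[1]) for k, v in table_ii.items()}
fast_vs_fdm = {k: rel_dev(v[2], v[0]) for k, v in table_ii.items() if v[0] is not None}

for name in table_ii:
    fdm = f"{100 * fast_vs_fdm[name]:5.1f}%" if name in fast_vs_fdm else "   n/a"
    print(f"{name:7s} FastBEM vs BEM: {100 * fast_vs_bem[name]:5.1f}%   vs FDM: {fdm}")
```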

| Profile ($P_3$) | K(p,q)  | BEM         | FastBEM | FDM    |
|-----------------|---------|-------------|---------|--------|
| Runtime         | 1mn 14s | 4mn 30s     | 1mn 18s | 1h 3mn |
| user+sys        |         | 3mn 16s (*) | 4s (*)  |        |
| MaxMem          | 9328K   | 17M         | 9328K   | 1200M  |

Table III: Performance comparison between the extraction methods in the case of the $P_3$ profile. (\*) i.e., without the K(p,q) computation time

Table III summarizes the computational cost of the three methods. The "Runtime" row represents both the User time (the time to execute the command run\_extraction\_job) and the System time (the additional system time to complete the job) of the extraction, including post-processing. The "MaxMem" row represents the peak memory usage over the total runtime, including input data processing, parasitic extraction, and writing of the outputs. For run time as well as for peak memory usage, the advantage of FastBEM is evident. It should also be noted that this advantage becomes increasingly important as the test cases include larger and more complex designs.
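The ratios implied by Table III can be computed as a quick check. The sketch copies the tabulated run times and peak memory figures; it assumes that 1M equals 1024K when comparing memory, which the table does not state, and the helper name `seconds` is hypothetical.

```python
def seconds(hours=0, minutes=0, secs=0):
    """Convert a duration to seconds."""
    return 3600 * hours + 60 * minutes + secs

# Total run times from Table III.
runtime = {
    "BEM": seconds(minutes=4, secs=30),
    "FastBEM": seconds(minutes=1, secs=18),
    "FDM": seconds(hours=1, minutes=3),
}
# user+sys times without the K(p,q) computation (the starred entries).
solve_only = {"BEM": seconds(minutes=3, secs=16), "FastBEM": seconds(secs=4)}
# Peak memory in K, assuming 1M = 1024K.
max_mem_k = {"BEM": 17 * 1024, "FastBEM": 9328, "FDM": 1200 * 1024}

speedup_vs_bem = runtime["BEM"] / runtime["FastBEM"]
speedup_vs_fdm = runtime["FDM"] / runtime["FastBEM"]
solve_speedup = solve_only["BEM"] / solve_only["FastBEM"]
mem_gain_vs_fdm = max_mem_k["FDM"] / max_mem_k["FastBEM"]

print(f"total runtime: x{speedup_vs_bem:.1f} vs BEM, x{speedup_vs_fdm:.1f} vs FDM")
print(f"without K(p,q): x{solve_speedup:.0f} vs BEM")
print(f"peak memory: x{mem_gain_vs_fdm:.0f} less than FDM")
```

The total FastBEM run time is dominated by the K(p,q) computation, which both BEM variants share, so the gain of the partitioning itself shows most clearly in the starred entries.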

#### 5. CONCLUSION

In this paper, we have presented a new approach, based on an improvement of the Boundary-Element Method (BEM), to accelerate substrate model extraction and to avoid dense matrix storage. We have shown that the accuracy parameters governing the global substrate coupling are different from those governing the local transistor-level coupling. This has enabled us to build an efficient methodology for inter-block coupling as well as for transistor-level intra-block coupling. The advantage of the method in terms of extraction time, memory usage and accuracy has been demonstrated. The weak dependence of its efficiency on the complexity of the chip makes it one of the most attractive methods for verifying MS-SOCs.

#### References

- [1] S. Kapur and D. Long "Large scale capacitance extraction" IEEE/ACM Design Automation Conference 2000, pp. 744-749.
- [2] K. Nabors and J. White "Fast capacitance extraction of general three-dimensional structures" IEEE Transactions on Microwave Theory and Techniques, June 1992, vol. 40, pp. 1496-1507.
- [3] S. Kapur and D. E. Long "IES3: A fast integral equation solver for efficient 3-dimensional extraction" IEEE/ACM Int. Conf. on Computer-Aided Design, 1997, pp. 448-455
- [4] F. Clément "Computer Aided Analysis of Parasitic Substrate Coupling in Mixed Digital-Analog CMOS Integrated Circuits" Ph.D. Dissertation No. 1449, Swiss Federal Institute of Technology in Lausanne, 1996. ("SubstrateStorm User's Guide" Simplex).
- [5] J. R. Phillips and L. Miguel Silveira "Simulation Approaches for Strongly Coupled Interconnect Systems" IEEE/ACM Computer Aided Design International Conf. 2001 pp. 430 -437
- [6] J. Kanapka, J. Phillips, and J. White "Fast Methods for Extraction and Sparsification of Substrate Coupling" IEEE/ACM Design Automation Conference 2000, pp. 738-743
- [7] M. Chou "Fast Algorithms for Ill-Conditioned Dense-Matrix Problems in VLSI Interconnect and Substrate Modelling" Ph.D. Dissertation, Massachusetts Institute of Technology, 1998.
- [8] R. Gharpurey and S. Hosur "Transform domain techniques for efficient extraction of substrate parasitics" IEEE/ACM Int. Conf. On Computer-Aided Design,1997, pp. 461-467
- [9] J. P. Costa, M. Chou, and L. M. Silveira "Precorrected-DCT techniques for modeling and simulation of substrate coupling in mixed-signal IC's" IEEE Int. Symposium on Circuits and Systems, 1998, pp. 358-362
- [10] R. Gharpurey and R. G. Meyer "Modeling and Analysis of Substrate Coupling in Integrated Circuits" IEEE J. Solid-State Circuits, March.1996, vol.31, pp.344-353
- [11] A. M. Niknejad, R. Gharpurey, and R. Meyer "Numerically Stable Green Function for Modeling and Analysis of Substrate Coupling in Integrated Circuits" IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, 1998, pp. 305-315
- [12] J. P. Costa, M. Chou, and L. M. Silveira "Efficient Techniques for Accurate Modeling and Simulation of Substrate Coupling in Mixed-Signal IC's" IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, 1999, pp. 597-607
- [13] Y. Saad and M. H. Schultz "GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems" SIAM J. Sci. Stat. Comput., vol. 7, July 1986, pp. 856-869.
- [14] A. Koukab, M. Declercq, and C. Dehollain "Analysis and Improvement of the Noise Immunity in a Single-Chip Super-Regenerative Transceiver" IEE Proc.-Circuits Devices Syst., October 2001, vol. 148, pp. 250-254
- [15] S. Mitra, R. A. Rutenbar, L. R. Carley, and D. J. Allstot "A Methodology for Rapid Estimation of Substrate-Coupling Switching Noise" Custom Integrated Circuit Conference 1995 pp. 129-132

#### Categories and Subject Descriptors

J. [Computer Applications]: J.6 Computer-Aided Engineering – Computer-aided design (CAD).

#### General Terms

Algorithms, Design, Theory, Verification.

#### Keywords

Substrate coupling, noise, substrate noise, Mixed-Signal noise, Boundary-Element-Method, Numerical analysis, switching circuits, supply noise.