# A Clock Delayed Sleep Mode Domino Logic for Wide Dynamic OR Gate

Kwang-Il Oh

Lee-Sup Kim

Department of EECS, KAIST, 373-1 Guseong-dong, Yuseong-gu, Daejeon, Republic of Korea

olight@mvlsi.kaist.ac.kr

lskim@ee.kaist.ac.kr

### **ABSTRACT**

A high performance and low power clock delayed sleep mode (CDSM) domino logic is proposed for wide fan-in domino logic. The CDSM-domino logic not only improves the robustness but also reduces the active and stand-by power. Compared to the typical wide fan-in domino logic in 0.18 µm CMOS technology, the proposed scheme reduces delay by 21%, dynamic power by 16%, and leakage power by 91%. In addition, the sleep mode entrance power is reduced to $10^{-5}$ of that of the HS-domino logic [3].

### **Categories and Subject Descriptors**

B.6 [Logic design]: General.

### **General Terms**

Performance, Reliability

### **Keywords**

Dynamic circuits, sleep mode, clock delay, leakage, low power.

### 1. INTRODUCTION

Wide fan-in domino logics are widely used in high performance microprocessors and VLSI circuits. To maintain the robustness of the dynamic node of a wide fan-in domino logic, a small PMOS keeper is needed. However, as the number of inputs becomes large, a large PMOS keeper is inevitable. Thus, the fighting current between the PMOS keeper and the NMOS pull down network becomes large. This degrades the performance and increases the active power. Another issue in recent digital logic design is the impact of the sub-threshold leakage current in deep sub-micron technologies. To reduce dynamic power consumption, the supply voltage has been reduced. At the same time, the threshold voltage must also be reduced to maintain the performance. As a result, the sub-threshold current increases exponentially with each reduction of the threshold voltage as the technology scales [1].
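The exponential dependence of the sub-threshold current on the threshold voltage can be illustrated with a first-order model. The sketch below is not taken from this paper: the sub-threshold swing of 90 mV/decade and the example threshold voltages are assumed values chosen only to show the trend.

```python
# Illustrative first-order model only. The swing and the threshold voltages
# below are assumptions for the example, not data from this paper.

def relative_subthreshold_current(v_th, swing_v_per_decade=0.09):
    """Off-current at V_GS = 0, relative to a device with v_th = 0 V."""
    return 10.0 ** (-v_th / swing_v_per_decade)


reference = relative_subthreshold_current(0.45)
for v_th in (0.45, 0.36, 0.27):
    ratio = relative_subthreshold_current(v_th) / reference
    print(f"V_th = {v_th:.2f} V: leakage is {ratio:.1f}x the 0.45 V case")
```

Under these assumptions, each reduction of the threshold voltage by one swing (90 mV here) multiplies the off-current by ten.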

Two domino logic design techniques have been proposed to reduce the active power consumption caused by the fighting current [2],[3]. With the clock delayed keeper control scheme, the fighting current between the PMOS keeper and the NMOS pull down logic at the beginning of the evaluation phase is minimized. Therefore, a large keeper can be implemented without significant performance degradation or increase in power dissipation. Also, the reliability of the domino logic is maintained because the noise margin stays in a reasonable range. However, the area and power overheads of the clock delayed keeper control circuit remain. The HS-domino logic modified the clock delayed keeper control scheme to bring the internal dynamic node to the minimum leakage state [3]. This reduces both the active power and the sleep mode leakage power. However, if all data inputs are LOW, the dynamic node can sit at an intermediate voltage when the circuit begins to enter the sleep mode. This leads to short circuit power dissipation at the output inverter.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

ISLPED'03, August 25–27, 2003, Seoul, Korea. Copyright 2003 ACM 1-58113-682-X/03/0008...\$5.00.

To solve these problems, a new high performance and low power domino logic style is proposed. The proposed scheme is clock delayed sleep mode (CDSM) domino logic. This scheme makes the keeper control circuit simple. Therefore, CDSM-domino logic consumes less power than the conventional clock delayed keeper control schemes. The proposed sleep control circuit enables the CDSM-domino to enter the sleep mode with minimum power overhead.

This paper is organized as follows. Section 2 describes the typical scheme, the clock delayed keeper control schemes, and their problems. Section 3 explains the operation of the proposed CDSM-domino logic. Section 4 presents the simulation results and the comparisons. Section 5 concludes the paper.

### 2. THE CONVENTIONAL WIDE FAN-IN DOMINO LOGICS

Fig.1 (a) shows the typical wide fan-in domino logic. To maintain the robustness of the dynamic node and to overcome the sub-threshold leakage current, an appropriately sized PMOS keeper is needed. However, as the PMOS keeper size increases, the fighting current between the PMOS keeper and the NMOS pull down network becomes large. Therefore, the delay and the power consumption caused by the fighting current increase as the number of fan-in increases. Fig.1 (b) shows the CKP-domino logic using the clock delayed keeper control scheme [2]. The scheme eliminates the fighting current between the PMOS keeper and the NMOS pull down network. However, the area and power overheads of the NAND gate remain even if it is implemented with minimum-size transistors. Fig.1 (c) shows the HS-domino logic using the modified clock delayed keeper control scheme and sleep control circuit [3]. The HS-domino logic enters the minimum leakage state by discharging the dynamic node to save leakage power. However, an intermediate voltage exists at the dynamic node, which results in short circuit power dissipation at the output inverter. Hence, the 'sleep mode entrance power' is large. In addition, because $P_S$ and $P_K$ are connected in series, both must be widened enough to maintain the noise margin.

Fig.1: Conventional wide fan-in domino logics

The 'sleep mode entrance power' is defined as the power dissipated in discharging the dynamic node when the sleep signal is asserted. It also includes the short circuit power consumption of the output inverter while the dynamic node discharges. The noise margin is defined as the input voltage change that causes a drop of 10% of $V_{\rm DD}$ at the dynamic node [4]. In other words, the appropriate noise margin is set to 10% of $V_{\rm DD}$ throughout this work. To guarantee correct operation, the PMOS keeper width of the typical 4-input domino logic is sized to 10% of the worst-case effective pull down width. The keeper widths of the 8bit, 16bit, and 32bit fan-in domino logics are sized according to the noise margin definition.
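The two sizing rules above can be written down as a short sketch. The 1.8 V supply is the value used in the simulations of this work; the function names and the example pull down width are hypothetical and serve only to illustrate the arithmetic.

```python
# Sketch of the sizing rules stated in the text. Function names and the
# example width are hypothetical; only the 10% figures and the 1.8 V supply
# come from the paper.

VDD = 1.8                    # supply voltage (V)
NOISE_DROP_FRACTION = 0.10   # allowed dynamic-node drop, as a fraction of VDD
KEEPER_RATIO_4_INPUT = 0.10  # keeper width / worst-case effective pull down width


def min_dynamic_node_voltage(vdd=VDD):
    """Lowest dynamic-node voltage still inside the noise margin criterion."""
    return vdd * (1.0 - NOISE_DROP_FRACTION)


def exceeds_allowed_drop(v_dynamic, vdd=VDD):
    """True if the dynamic node has dropped by more than 10% of VDD."""
    return v_dynamic < min_dynamic_node_voltage(vdd)


def keeper_width_4_input(effective_pull_down_width_um):
    """Keeper width (um) for the typical 4-input gate, per the 10% rule."""
    return KEEPER_RATIO_4_INPUT * effective_pull_down_width_um


print(f"dynamic node must stay above {min_dynamic_node_voltage():.2f} V")
print(f"keeper width for a 2.0 um pull down: {keeper_width_4_input(2.0):.2f} um")
```

For the wider gates the keeper width is not given by a fixed ratio; it is found by simulation against the same 10% drop criterion.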

### 3. THE PROPOSED CDSM WIDE FAN-IN DOMINO LOGIC

Fig.2 (a) shows the proposed clock delayed sleep mode wide fan-in domino (CDSM-domino) logic. Its configuration is similar to the previous approaches except for the simplified clock delayed keeper control circuit and the simple sleep control circuit. It also requires an odd number of inverters in the clock delay chain.

Fig.2 (b) briefly shows the active mode operation of the proposed CDSM-domino logic. The basic operation is similar to the previous schemes. The active mode operation is as follows. While the clock is LOW, the dynamic node is precharged to $V_{\rm DD}$. N1 is turned ON and the delayed clock is HIGH, so $V_{\rm K}$ is HIGH and the PMOS keeper ($P_{\rm K}$) is turned OFF during the precharge phase. When the clock goes HIGH, the circuit enters the evaluation phase. Because the dynamic node is still HIGH before it discharges, N1 remains ON. Because the delayed clock stays HIGH for the short interval set by the clock delay element, $P_{\rm K}$ remains OFF at the beginning of the evaluation phase. In this way, the fighting current between the PMOS keeper and the NMOS pull down network is eliminated. Therefore, the evaluation delay and the active power consumption are minimized.

(a) Circuit structure of N input wide fan-in CDSM-domino logic

(b) Timing diagram of CDSM-domino circuit

Fig.2: The proposed N input wide fan-in CDSM-domino logic and its operation

After the interval of the clock delay element, the delayed clock changes from HIGH to LOW. If the dynamic node remains at $V_{DD}$, N1 is ON. Therefore, the PMOS keeper gate input ($V_K$) becomes LOW. This turns the keeper ON, which holds the dynamic node at $V_{DD}$ and compensates for leakage current and input noise. If the dynamic node is discharged to GND, P1 is turned ON. Therefore, $V_K$ becomes HIGH, which keeps the keeper OFF during the rest of the evaluation phase. Like the other clock delayed keeper control schemes, the CDSM-domino solves the fighting current problem by turning the keeper OFF at the start of the evaluation phase, but it needs fewer transistors. Consequently, the keeper can be sized for the appropriate noise margin and leakage compensation without significant speed degradation or dynamic power increase. The proposed CDSM-domino keeper control circuit consists of only one PMOS and one NMOS transistor, which reduces the size and the power dissipation of the control circuit.

Fig.3: The CDSM-domino logic with sleep mode control circuit
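The active mode behaviour of the keeper control described above can be summarized in a small behavioural model. This is only a logic-level sketch inferred from the text: it assumes that N1 passes the delayed clock level to $V_K$ while the dynamic node is HIGH and that P1 pulls $V_K$ HIGH once the dynamic node is discharged. It ignores voltage levels, timing, and the sleep control circuit, and the function names are hypothetical.

```python
# Logic-level sketch of the CDSM keeper control in active mode.
# Assumption (inferred from the text, not a netlist): N1 passes the delayed
# clock to V_K while the dynamic node is HIGH; P1 pulls V_K HIGH otherwise.

def keeper_gate_is_high(dynamic_node_high: bool, delayed_clock_high: bool) -> bool:
    """Logic level of V_K, the gate input of the PMOS keeper."""
    if dynamic_node_high:
        # N1 is ON: V_K follows the delayed clock.
        return delayed_clock_high
    # Dynamic node discharged: P1 is ON and pulls V_K HIGH.
    return True


def keeper_is_on(dynamic_node_high: bool, delayed_clock_high: bool) -> bool:
    """The PMOS keeper conducts only when its gate V_K is LOW."""
    return not keeper_gate_is_high(dynamic_node_high, delayed_clock_high)


# (phase, dynamic node HIGH?, delayed clock HIGH?)
PHASES = [
    ("precharge", True, True),
    ("start of evaluation", True, True),
    ("late evaluation, node stays HIGH", True, False),
    ("late evaluation, node discharged", False, False),
]

for name, dyn_high, dclk_high in PHASES:
    state = "ON" if keeper_is_on(dyn_high, dclk_high) else "OFF"
    print(f"{name}: keeper {state}")
```

In this model the keeper conducts in exactly one case, late in the evaluation phase when the dynamic node has not been discharged, which is the only case in which it is needed.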

The sleep control circuit is merged to reduce the stand-by sub-threshold leakage current. Fig.3 shows the CDSM-domino logic with the sleep control circuit. In the typical wide fan-in domino logic, the dynamic node remains HIGH when all data inputs are LOW because the keeper constantly supplies charge to the dynamic node in the evaluation phase. This results in constant leakage power consumption, and the leakage current increases as the number of fan-in becomes larger. However, the CDSM-domino logic can save a large amount of leakage power by using the sleep mode. The sleep mode entrance operation is as follows. If the /sleep signal is HIGH, the circuit operates in the active mode. If the /sleep signal becomes LOW while the clock is HIGH, the circuit operates in the sleep mode. Since $V_K$ is HIGH, $P_S$ is turned OFF. Hence, no additional charge is supplied to the dynamic node. The precharged dynamic node discharges slowly due to the leakage current. This brings the dynamic node to the minimum leakage state. In the HS-domino logic shown in Fig.1 (c), the series connection of $P_S$ and $P_K$ requires both to be widened to maintain the appropriate noise margin. However, in the CDSM-domino logic, the sleep PMOS ($P_S$) controls the keeper PMOS gate ($V_K$). Therefore, upsizing of the keeper and the sleep control circuit is not required.

To eliminate the short circuit current caused by the slow discharge of the dynamic node, an NMOS transistor is inserted between the output inverter NMOS and GND [5]. Since the inserted NMOS cuts off the discharge path to GND, only the output inverter PMOS is turned ON slowly when entering the sleep mode. Hence, the short circuit current is eliminated. Because the virtual ground node ($V_{VG}$) can be shared by the output inverter and the clock delay chain, only one NMOS transistor is inserted. Since the NMOS transistors of the output inverter and the clock delay chain are not on the critical path, the performance degradation caused by this insertion is minimal. Therefore, the short circuit power consumption is eliminated, and the sleep mode entrance power is much smaller than that of the HS-domino logic.

(a) Normalized average power vs. number of fan-in

(b) Normalized delay vs. number of fan-in

Fig.4: The power and delay comparison of the CDSM-domino logic with the previous approaches
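The role of the inserted NMOS transistor can be illustrated with a simple switch-level check of the output inverter. The sketch below is an assumption-laden illustration, not a circuit simulation: the 0.45 V threshold voltages are assumed values, and each transistor is treated as an ideal switch that conducts once its threshold is exceeded.

```python
# Switch-level illustration of inverter short circuit current.
# The threshold voltages are assumed values, not taken from this paper.

def short_circuit_path_exists(v_in, vdd=1.8, v_tn=0.45, v_tp=0.45, footer_on=True):
    """True if a conducting path from VDD to GND exists through the inverter.

    v_in      : inverter input, i.e. the dynamic node voltage (V)
    footer_on : state of the NMOS inserted between the inverter NMOS and GND
    """
    nmos_on = v_in > v_tn        # pull down device conducts
    pmos_on = v_in < vdd - v_tp  # pull up device conducts
    return pmos_on and nmos_on and footer_on


# A slowly discharging dynamic node passes through intermediate voltages.
for v_dyn in (1.8, 1.2, 0.9, 0.6, 0.0):
    active = short_circuit_path_exists(v_dyn, footer_on=True)
    asleep = short_circuit_path_exists(v_dyn, footer_on=False)
    print(f"V_dyn = {v_dyn:.1f} V: path without cut-off = {active}, with cut-off = {asleep}")
```

With the inserted transistor turned OFF in the sleep mode, no path to GND exists at any intermediate dynamic node voltage, which is the condition the text relies on.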

### 4. SIMULATION RESULTS

We compare the performance and the power consumption of the CDSM-domino logic with those of the typical domino logic, the CKP-domino logic, and the HS-domino logic. The 4bit, 8bit, 16bit, and 32bit fan-in domino logics are simulated in 0.18 $\mu$m CMOS technology with a 1.8 V supply voltage.

Fig.4 (a) and (b) show the active power consumption and the delay versus the number of fan-in for the CDSM-domino logic. The delay and power curves of the CKP-domino logic and the HS-domino logic are plotted on the same graphs to illustrate the speed and power advantage of the CDSM-domino logic. Note that the results for the CDSM-domino, the CKP-domino, and the HS-domino logic are normalized to those of the typical domino logic for clear comparison. The active power consumption is defined as the average power consumption of the wide fan-in domino logic at a 500 MHz clock frequency when the worst case vector (only one pull down NMOS is ON) is applied to the inputs. The delay is defined as the time between the input assertion and the inverter output transition during the evaluation phase.



Fig.5: The normalized leakage power comparison vs. number of fan-in

Since the keeper must be widened for the appropriate noise margin, the delay and power consumption of the typical domino logic increase sharply as the number of fan-in increases. However, the CDSM-domino logic consumes less active power even as the number of fan-in increases. The 32bit fan-in CDSM-domino logic reduces delay by 21% and dynamic power by 16% compared to the typical 32bit fan-in domino logic. Moreover, the CDSM-domino logic reduces delay by 1.3% and dynamic power by 2.2% compared to the 32bit fan-in CKP-domino logic, and reduces delay by 4.8% and dynamic power by 7% compared to the 32bit fan-in HS-domino logic. Further, the dynamic power saving is larger when the fan-in number is small. The CDSM-domino logic dissipates 5.9% and 7.9% less power than the 8bit fan-in CKP-domino and the 8bit fan-in HS-domino logic, respectively. This is attributed to two factors. First, a simpler keeper control circuit is used. Second, keeper upsizing is not required to maintain the noise margin. Therefore, the CDSM-domino logic achieves lower active power dissipation in wide fan-in domino logic than the CKP-domino logic and the HS-domino logic.
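Because the curves are normalized to the typical domino logic, a reported percentage reduction maps directly to a normalized value. The following sketch shows the conversion for the 32bit fan-in figures quoted above; the function names are hypothetical, and the normalized values are computed from the quoted percentages rather than read from the plots.

```python
# Conversion between percentage reduction and value normalized to a baseline.
# The percentages are the 32bit fan-in figures quoted in the text (CDSM-domino
# versus typical domino); the function names are hypothetical.

REPORTED_REDUCTION_PERCENT = {"delay": 21.0, "dynamic power": 16.0}


def normalized_value(reduction_percent):
    """Value relative to the baseline (baseline = 1.0)."""
    return 1.0 - reduction_percent / 100.0


def reduction_percent(value, baseline):
    """Percentage by which value is lower than baseline."""
    return 100.0 * (baseline - value) / baseline


for metric, percent in REPORTED_REDUCTION_PERCENT.items():
    print(f"{metric}: normalized value {normalized_value(percent):.2f}")
```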

Fig.5 shows the leakage current reduction achieved by the CDSM-domino logic. In Fig.5, as the number of fan-in increases, the stand-by leakage power of the typical wide fan-in domino logic and the CKP-domino logic increases sharply. They consume considerable constant leakage power even when the circuit does not operate. However, the CDSM-domino logic consumes much less leakage power in addition to less active mode power. The 32bit fan-in CDSM-domino logic saves 91% and 31.2% of the leakage power dissipation compared to the typical 32bit fan-in domino logic and the 32bit fan-in HS-domino logic, respectively.

In Fig.6, the sleep mode entrance power versus the number of fan-in is shown for the HS-domino logic and the CDSM-domino logic. The curves show that the sleep mode entrance power of the CDSM-domino logic is much less than that of the HS-domino logic. The sleep mode entrance power is reduced to $10^{-5}$ of that of the HS-domino logic. Since the HS-domino logic has an intermediate voltage at the dynamic node, short circuit power is dissipated in the output inverter. However, the sleep control circuit of the CDSM-domino logic eliminates the short circuit current by inserting an NMOS transistor between the output inverter NMOS and GND. Therefore, no short circuit power is dissipated while the dynamic node discharges, and the sleep mode entrance power is minimized.



Fig.6: Normalized sleep mode entrance power comparison

### 5. CONCLUSION

A high performance and low power clock delayed sleep mode (CDSM) domino logic is proposed for wide fan-in domino logic. The CDSM-domino logic not only improves the robustness but also reduces the active and stand-by power. This circuit enables the use of a large keeper in wide fan-in domino logic at negligible load cost. Compared to the typical wide fan-in domino logic in 0.18 µm CMOS technology, the proposed scheme reduces delay by 21%, dynamic power by 16%, and leakage power by 91%. In addition, the CDSM-domino logic consumes less power than the conventional clock delayed keeper control schemes: it dissipates 5.9% and 7.9% less power than the 8bit fan-in CKP-domino logic and the 8bit fan-in HS-domino logic, respectively. Also, the sleep mode entrance power is reduced to $10^{-5}$ of that of the HS-domino logic owing to the modified embedded sleep control circuit.

### 6. ACKNOWLEDGMENTS

This work was supported by KOSEF through the MICROS at KAIST, Korea.

### 7. REFERENCES

- [1] A. Chandrakasan, W. J. Bowhill, and F. Fox, "Design of High Performance Microprocessor Circuits", IEEE Press, 2000.
- [2] A. Alvandpour, P. Larsson-Edefors, and C. Svensson, "A leakage-tolerant multi-phase keeper for wide domino circuits", IEEE Intl. Conference on Electronics, Circuits and Systems, Vol. 1, pp. 209-212, 1999.
- [3] Mohamed W. Allam, Mohab H. Anis, and Mohammed I. Elmasry, "High-Speed Dynamic Logic Style for Scaled-Down CMOS and MTCMOS Technologies", IEEE Intl. Symp. on Low Power Electronics Design, pp. 155-160, 2000.
- [4] S. Thompson et al., "Dual Threshold Voltage and Substrate Bias: Keys to High Performance, Low Power, 0.1um Logic Designs", IEEE Symp. on VLSI Tech., pp. 69-70, 1997.
- [5] Seongmoo Heo and Krste Asanovic, "Leakage-Biased Domino Circuits for Dynamic Fine-Grain Leakage Reduction", IEEE Symp. on VLSI Tech., pp. 316-319, 2002.