# **Adaptive Supply Voltage Technique for Low Swing Interconnects**

Woopyo Jeong, Bipul Chandra Paul, and Kaushik Roy

Department of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907, USA {jeongw,paulb, kaushik}@ecn.purdue.edu

Abstract - The increase in power consumption due to interconnects and the variation in delays (delay spread) among long interconnects are becoming important issues for design of high performance and low power circuits in scaled technologies. In this paper we propose an adaptive supply voltage technique for low swing interconnects. The proposed technique assigns different supply voltages to drive interconnects based on their delay. The voltage assignment is done only once during the initialization period of the circuit. Hence, there is no extra power consumption in the active mode. We also show that there is an optimum number of supply voltages required to achieve maximum power saving. Simulation results show that for a set of 64 buses we can achieve 42.8% and 55.9% reductions in the power consumption and delay spread, respectively.

# I. Introduction

As process technology is scaled down, the fraction of total energy consumption due to long interconnects such as global buses and clock is increasing [1,2,3]. Liu demonstrated that the power consumption due to interconnects can be up to 40% of the total power consumption in gate array and cell-library-based designs [4]. The power consumption due to an interconnect can be represented as [2]

$$p_{dyn} = \alpha \cdot f \cdot C_{w} \cdot V_{swing} \cdot V_{dd}$$
 (1)

where  $\alpha$  is the switching probability, f is the clock frequency,  $C_w$  is the wire capacitance of interconnect, and  $V_{swing}$  is the voltage swing on the interconnect. As shown in equation (1), the power consumption is proportional to the voltage swing of interconnects. Hence, to reduce the power consumption due to interconnect, low swing buses have been used [5, 6, 7].
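As a rough numerical illustration of equation (1), the sketch below evaluates the dynamic power of one wire at full swing and at a reduced swing. The function name `dynamic_power` and all parameter values (2 pF wire, switching probability 0.5, 1.5 V swing) are assumptions chosen for illustration only; they are not taken from the experiments in this paper.

```python
def dynamic_power(alpha, f_hz, c_wire_f, v_swing, v_dd):
    """Dynamic power of one interconnect per equation (1), in watts."""
    return alpha * f_hz * c_wire_f * v_swing * v_dd


# Assumed example values, not measured data.
ALPHA = 0.5        # switching probability
F_HZ = 100e6       # clock frequency
C_WIRE_F = 2e-12   # wire capacitance
V_DD = 2.5         # supply voltage

full_swing = dynamic_power(ALPHA, F_HZ, C_WIRE_F, V_DD, V_DD)
low_swing = dynamic_power(ALPHA, F_HZ, C_WIRE_F, 1.5, V_DD)

print(f"full swing: {full_swing * 1e3:.3f} mW")
print(f"low swing:  {low_swing * 1e3:.3f} mW")
print(f"saving:     {100 * (1 - low_swing / full_swing):.1f} %")
```

Because the expression is linear in $V_{swing}$, the relative saving equals the relative reduction in swing, independent of the other parameters.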

Furthermore, as the complexity of the circuit increases, the difference in the wire capacitances among global buses is also increasing substantially. This in turn increases the delay spread (differences in delay) among these buses. The low swing bus design in [5, 6, 7] uses a single supply voltage ( $V_{\text{swing}} < V_{\text{dd}}$ ) for all interconnects irrespective of their individual capacitances, and hence the delay spread cannot be reduced.

In this paper we propose an adaptive supply voltage design technique for low swing interconnects, which assigns different supply voltages to interconnects depending on their capacitances, e.g., a lower supply voltage is assigned to the interconnect having a small wire capacitance and a higher supply voltage is assigned to the interconnect with a large capacitance. The proposed technique reduces both the delay spread and power consumption due to interconnects. Furthermore, the assignment is done only once during the initialization period of the circuit and initialization circuitry is turned off during normal operations. Hence, there is no extra power consumption in the active mode.

We also find the optimum number of supply voltages required to achieve maximum power saving, which strongly depends on the number of buses.

The rest of the paper is organized as follows. Section II discusses the principle of low swing techniques for interconnects. Section III explains the implementation and operation of the adaptive supply voltage design technique for low swing interconnects. Section IV discusses the experimental results on different sets of buses. Section V concludes the paper.

# II. Low Swing Techniques for Interconnects

If the maximum delay among a set of long interconnects (e.g., global buses) is less than the permissible delay then a lower voltage swing ( $V_{swing} < V_{dd}$ ) can be used to drive interconnects to reduce the power consumption.  $V_{swing}$  is chosen depending on the maximum delay among all buses and the available slack. For example, if the maximum delay among all buses is x and the permissible delay is y (y > x), then  $V_{swing}$  is chosen in such a way that x increases to y. The same  $V_{swing}$  is assigned to all other buses and hence, their power consumption reduces accordingly.



Fig. 1 Multiple Low Swing Voltages for Interconnects

The total power consumption of n interconnects with single and adaptive supply voltages can be written as

This research was funded in part by Semiconductor Research Corporation under contract (#98-HJ-638) and in part by DARPA.

$$p_{dyn\_single} = \alpha \bullet f \bullet V_{dd} \bullet \sum_{i=0}^{n} (C_{wi}) \bullet V_{swing}$$
 (2)

$$p_{dyn\_multi} = \alpha \bullet f \bullet V_{dd} \bullet \sum_{i=0}^{n} (C_{wi} \bullet V_{swi})$$
 (3)

where V<sub>swing</sub> is the supply voltage used to drive all interconnects (single voltage swing case) and  $C_{wi}$  is the wire capacitance of the ith interconnect with corresponding voltage swing V<sub>swi</sub> (adaptive supply voltage case). Note that  $\max\{V_{swi}\}$  is the same as  $V_{swing}$. The power improvement of the adaptive supply voltage design methodology over the single voltage swing design is the difference between these two expressions:

$$p_{dyn\_improv.} = \alpha \bullet f \bullet V_{dd} \bullet \sum_{i=0}^{n} (C_{wi} \bullet (V_{swing} - V_{swi}))$$
 (4)
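The relation between the single-swing power, the adaptive power, and the improvement expression above can be checked numerically. The following is a minimal sketch; the function names and the three-bus capacitance and swing values are hypothetical and serve only to show that the improvement equals the single-swing power minus the adaptive power when $V_{swing} = \max\{V_{swi}\}$.

```python
import math


def power_single(alpha, f_hz, v_dd, caps, v_swing):
    """Total power when every bus uses the same swing."""
    return alpha * f_hz * v_dd * sum(caps) * v_swing


def power_multi(alpha, f_hz, v_dd, caps, swings):
    """Total power when bus i uses its own swing swings[i]."""
    return alpha * f_hz * v_dd * sum(c * v for c, v in zip(caps, swings))


def power_improvement(alpha, f_hz, v_dd, caps, swings):
    """Improvement expression, with V_swing taken as max(swings)."""
    v_swing = max(swings)
    return alpha * f_hz * v_dd * sum(c * (v_swing - v) for c, v in zip(caps, swings))


# Assumed example: three buses, larger capacitance gets the larger swing.
caps = [1e-12, 2e-12, 3e-12]
swings = [1.5, 2.1, 2.5]
args = (0.5, 100e6, 2.5)

p_single = power_single(*args, caps, max(swings))
p_multi = power_multi(*args, caps, swings)
p_improv = power_improvement(*args, caps, swings)

print(f"single: {p_single * 1e3:.4f} mW, adaptive: {p_multi * 1e3:.4f} mW")
print("difference matches improvement:", math.isclose(p_single - p_multi, p_improv))
```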

# III. Adaptive Supply Voltage Design Methodology for Low Swing Interconnects



Fig. 2 Block Diagram of Adaptive Supply Voltage Design Technique for Low Swing Interconnects

In this section we propose an adaptive supply voltage design methodology, which assigns different supply voltages to drive global buses depending on their load capacitances. The proposed technique determines the bus having maximum delay and equalizes the delays of other buses by reducing their voltage swings.

Fig. 2 shows the block diagram of the proposed technique. It consists of a control unit and a multiple output DC-DC converter. The control unit compares the delay of a bus with the maximum delay and generates a select signal. The DC-DC converter generates several supply voltage levels, and depending on the select signal, an appropriate supply voltage is chosen to drive a particular bus. In our experiment, all buses are represented by a distributed RC circuit model.

#### A. Implementation

Fig. 3 shows the RC circuit representing global buses modelled by resistors and capacitors. A buffer is inserted every 2000 µm to maintain the slopes of signals. Level shifters are used at the end of all buses to convert the voltage swing to V<sub>dd</sub>. Since each global bus is located between power supply lines acting as shields, there is no noise disturbance due to the coupling capacitance between buses.



Fig. 3 RC Model for Global Buses

Control Logic consists of MAX Delay Detector, Pulse Generator, Counter and Pulse Selector, Converter based on PWM (Pulse Width Modulation), and Registers & Accumulator. MAX Delay Detector detects the maximum delay among the global buses when a signal is applied to their inputs simultaneously.



Fig. 4 Timing Diagram of Output of Pulse Generator.

Pulse Generator compares the maximum delay detected by the MAX Delay Detector with the other bus delays and generates pulses with different widths depending on the delay differences. Fig. 4 shows an example of how the Pulse Generator generates pulse signals. The input signal, IN, is applied simultaneously to all buses (Bus  $0 \sim n$ ). In this particular example, Bus 2 has the maximum delay, and therefore the Pulse Generator does not generate any pulse for this bus. For each of the other buses, it generates a pulse with a width equal to that bus's delay difference from Bus 2.
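The behaviour of the Pulse Generator can be sketched as a simple computation on bus delays. The function `pulse_widths` and the four delay values below are hypothetical; delays are given in picoseconds as integers to avoid floating-point rounding.

```python
def pulse_widths(delays_ps):
    """Pulse width per bus: its delay difference from the slowest bus."""
    t_max = max(delays_ps)  # value found by the MAX Delay Detector
    return [t_max - d for d in delays_ps]


# Assumed delays for Bus 0..3; Bus 2 is the slowest, as in the Fig. 4 example.
delays_ps = [1100, 1250, 1400, 1050]
widths = pulse_widths(delays_ps)

for bus, width in enumerate(widths):
    label = "no pulse" if width == 0 else f"{width} ps pulse"
    print(f"Bus {bus}: {label}")
```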

The counter selects one bus at a time and Pulse Selector selects the corresponding pulse generated by the Pulse Generator. Pulse Selector is basically an n:1 multiplexer, where n is the number of buses.

Converter based on PWM converts the selected pulse to a 3-bit binary signal depending on its pulse width. A 3-bit binary number is required to select one of the 8 supply voltages from the DC-DC converter. A smaller binary number is assigned to a smaller pulse width and vice versa. If the pulse width is less than a minimum value (t<sub>margin</sub>), the pulse is converted to the binary number 000.
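One possible reading of the Converter is a uniform quantizer of the pulse width. The text only states that the mapping is monotonic and that widths below t<sub>margin</sub> map to 000, so the uniform step of one t<sub>margin</sub> per code and the saturation at 111 in the sketch below are assumptions, as is the function name `pwm_convert`.

```python
def pwm_convert(pulse_width_ps, t_margin_ps, bits=3):
    """Map a pulse width to a binary code (assumed uniform quantization)."""
    if pulse_width_ps < t_margin_ps:
        return 0  # pulse narrower than t_margin: code 000
    steps = pulse_width_ps // t_margin_ps   # whole t_margin steps that fit
    return min(steps, (1 << bits) - 1)      # saturate at the largest code


# Assumed t_margin of 100 ps and a few example pulse widths.
for width_ps in (50, 100, 250, 1200):
    code = pwm_convert(width_ps, 100)
    print(f"{width_ps:4d} ps -> {format(code, '03b')}")
```

Integer picoseconds are used so that floor division behaves predictably at exact multiples of t<sub>margin</sub>.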

Register & Accumulator adds the binary number generated by Converter to the previously stored binary data and stores the result. This is required to select the appropriate supply voltage. For example, in the  $k^{th}$  cycle, if the binary number generated by Converter is 001 and the previously stored number is 100, Register & Accumulator adds these two numbers and changes the supply voltage to the one corresponding to 101 (i.e.,  $V_{sw5}$  from the DC-DC converter).

The Multiple Output DC-DC Converter generates 8 local supply voltages (2.5V, 2.3V, 2.1V, 1.9V, ..., 1.1V) including  $V_{dd}$  (2.5V).
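The accumulate-and-select behaviour can be sketched as follows. The class `RegisterAccumulator` is a hypothetical behavioural model; the assumption that code k selects the k-th voltage in descending order (so that 000 selects $V_{dd}$) follows the listed order of the supplies, and the saturation at the lowest supply is an added assumption not stated in the text.

```python
# Eight local supplies, 2.5 V down to 1.1 V in 0.2 V steps.
# Assumption: code k selects the k-th entry of this list (000 -> V_dd).
SUPPLY_LEVELS_V = [round(2.5 - 0.2 * k, 1) for k in range(8)]


class RegisterAccumulator:
    """Hypothetical behavioural model of the Register & Accumulator."""

    def __init__(self):
        self.code = 0  # 000 selects V_dd

    def accumulate(self, converter_code):
        # Saturating at the lowest supply is an assumption, not from the text.
        self.code = min(self.code + converter_code, len(SUPPLY_LEVELS_V) - 1)
        return self.code

    def supply_voltage(self):
        return SUPPLY_LEVELS_V[self.code]


acc = RegisterAccumulator()
acc.code = 0b100                   # previously stored number
new_code = acc.accumulate(0b001)   # Converter output in the k-th cycle
print(format(new_code, "03b"), acc.supply_voltage(), "V")
```

Under this assumed mapping, the example from the text (100 plus 001) selects code 101, which corresponds to the sixth entry of the list.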

#### B. Operation

In this section we explain the operation of the Adaptive Supply Voltage design technique for low swing interconnects. The sequence of the operation is as follows.

Step 1: In the initialization period, an input (voltage swing equal to  $V_{\text{dd}}$ ) is applied to all buses at the same time.

Step 2: The outputs of global buses are fed to the inputs of the MAX Delay Detector and Pulse Generator. The MAX Delay Detector detects the maximum delay among all outputs. The Pulse Generator generates pulses with different widths depending on the delay differences.

Step 3: The Pulse Selector selects one of the pulses depending on the counter. At the beginning of the initialization period, the counter is set to 0000 (for 16 buses), which means the Pulse Selector selects Bus 0.



Fig. 5 Timing Diagram with Different Supply Voltages.

Step 4: Converter based on PWM now converts the pulse signal selected by Pulse Selector to a 3 bit binary number based on its pulse width. If the pulse width is less than a predefined value (t<sub>margin</sub>), then the output of the Converter becomes 000. t<sub>margin</sub> is the increase in delay due to the reduction in supply voltage by one level (Fig. 5). This is to ensure that the delay of the bus does not exceed the maximum delay.

Step 5: If the output of the Converter is not 000, the Register & Accumulator uses this output to assign an appropriate voltage swing to the bus. It adds the nonzero binary number from the Converter output to its previously stored number and assigns an appropriate supply voltage to the bus. At the beginning of the process, the stored number in Register & Accumulator is initialized to 000, which means a supply voltage equal to  $V_{\rm dd}$  is assigned to all buses. On the other hand, a 000 at the output of Converter indicates that the supply voltage assigned to the bus is appropriate. The counter is then increased by 1 to select the next bus and the stored number in Register & Accumulator is reset to 000.

Step 6: After assigning a supply voltage to the particular bus based on the output of the Register & Accumulator, we repeat Step 1 to Step 5 until the Converter output becomes 000.

We repeat Step 1 to Step 6 until all buses are assigned to appropriate supply voltages. The initialization circuitry is subsequently turned off.
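The six steps can be summarized as a small behavioural simulation. This is only a sketch under strong assumptions: the delay of a bus is modelled as growing by exactly t<sub>margin</sub> for every one-level reduction in supply voltage, the Converter is modelled as a uniform quantizer, and the function name `assign_levels` and the example delays are hypothetical. The actual circuit measures delays rather than computing them.

```python
def assign_levels(base_delays_ps, t_margin_ps, n_levels=8):
    """Return the supply-level code chosen for each bus (0 means V_dd)."""
    t_max = max(base_delays_ps)           # Step 2: MAX Delay Detector
    levels = []
    for base in base_delays_ps:           # Step 3: counter selects one bus
        acc = 0                           # Register & Accumulator starts at 000
        while True:
            # Assumed model: each level adds t_margin to the bus delay.
            delay = base + acc * t_margin_ps
            pulse = t_max - delay         # Pulse Generator output
            # Step 4: Converter (assumed uniform quantization, 000 below t_margin)
            code = 0 if pulse < t_margin_ps else min(pulse // t_margin_ps, n_levels - 1)
            if code == 0:                 # Step 5: current supply is appropriate
                break
            acc = min(acc + code, n_levels - 1)
            if acc == n_levels - 1:       # lowest supply reached, stop
                break
        levels.append(acc)                # Step 6 ends; move to the next bus
    return levels


base_delays_ps = [1000, 1250, 1400, 1050]   # assumed delays at V_dd
levels = assign_levels(base_delays_ps, t_margin_ps=100)
final_delays = [d + k * 100 for d, k in zip(base_delays_ps, levels)]

print("levels:", levels)
print("spread before:", max(base_delays_ps) - min(base_delays_ps), "ps")
print("spread after: ", max(final_delays) - min(final_delays), "ps")
```

With this linear delay model each bus settles after a single adjustment; the inner loop is kept to mirror the repeat-until-000 structure of Step 6.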

# IV. Results and Discussion

We implemented a distributed RC circuit, which represents global buses with different bus lengths chosen arbitrarily between 3 mm and 10 mm. We used a 0.25 µm technology for our experiment. The supply voltage,  $V_{\rm swing}$ , used for the single voltage design was 2.5 V.

We first analyze the proposed technique on a set of 16 buses. The supply voltages in our experiment varied from 2.5 V to 1.1 V in steps of 0.2 V. Fig. 6 shows the supply voltages assigned to different buses. It is observed that the proposed technique takes 520 ns (initialization period) to assign suitable swing voltages to all buses. The supply voltages for buses 0, 2, 6, 10, 13, 14, and 15 remained at 2.5 V, whereas lower supply voltages were assigned to the other buses.



Fig. 6 Waveforms of swing voltages, v<sub>sw</sub>'s, adapted to the power lines of global buses



Fig. 7 Waveforms of outputs from RC model of global buses: (a) Before the proposed technique is activated, (b) After it is activated.

TABLE 1 Power consumption and delay spread due to global buses (@100MHz).

| Parameter    | Single voltage swing | Multiple voltage swing | Improvement |
|--------------|----------------------|------------------------|-------------|
| Power        | 8.27 mW              | 5.26 mW                | 36%         |
| Delay spread | 0.47 ns              | 0.18 ns                | 62%         |

Fig. 7 shows the waveforms of the outputs of global buses. The waveforms in Fig. 7 (a) are the outputs of buses before the proposed technique was activated, i.e., a single supply voltage was applied to all buses. It shows that the delay spread was 0.47 ns. Fig. 7 (b) shows the waveforms with adaptive supply voltages, where the delay spread is reduced to 0.18 ns (Table 1).

Table 1 shows the power consumption and delay spread of global buses. A 100 MHz clock was used for this experiment. The maximum reduction in power consumption with this technique is achieved when all buses except one are driven by the lowest supply voltage. In this case the power consumption reduces to 2.01 mW, which is more than a 4 times improvement over the single voltage swing design. Furthermore, since the proposed technique operates only during the initialization period, there is no dynamic power dissipation due to the control logic in the active mode.
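The percentages in Table 1 and the best-case ratio quoted above follow directly from the reported numbers. The short check below uses only values stated in the text; the helper name `reduction_percent` is introduced here for illustration.

```python
def reduction_percent(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


# Values reported in Table 1 and in the paragraph above.
power_red = reduction_percent(8.27, 5.26)     # mW, single vs. adaptive
spread_red = reduction_percent(0.47, 0.18)    # ns, single vs. adaptive
best_case_ratio = 8.27 / 2.01                 # single vs. best-case adaptive

print(f"power reduction:        {power_red:.1f} %")
print(f"delay spread reduction: {spread_red:.1f} %")
print(f"best-case improvement:  {best_case_ratio:.2f}x")
```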

We also analyze the optimum number of supply voltages required for maximum power saving. Fig. 8 shows the power dissipation vs. the number of supply voltages for different sets of global buses. The lengths of all buses are chosen arbitrarily between 3 mm and 10 mm. Consequently, the set of 16 buses used here is different from that of the previous experiment. Fig. 8 also shows the complexity of the control circuitry in terms of the number of transistors. It can be seen from the figure that for each set of buses the power dissipation initially reduces with the increase in the number of supply voltages and then saturates.



Fig. 8 Power and complexity vs. number of supply voltages for different sets of buses.

It is observed that the optimum number of supply voltages varies from 2 to 6 depending on the number of buses. It is also observed that the complexity increases almost linearly with the number of supply voltages. It can also be noted that the complexity increases linearly with the number of buses. Hence, the proposed technique can be applied to reduce the power consumption of a large number of global buses in the circuit.

Fig. 9 shows the optimum number of supply voltages required based on the number of buses. It is observed that the optimum number of supply voltages initially increases with the number of buses and saturates for a large number of buses. Table 2 summarizes the improvement in both power and delay spread for different sets of buses. It can be seen that the power saving is greater when the proposed technique is applied to a larger number of buses.



Fig. 9 Optimum number of supply voltages vs. number of buses to achieve maximum power saving.

TABLE 2. Experimental results on different sets of buses.

| No. of buses | No. of V<sub>DD</sub> | Power improvement (%) | Delay spread improvement (%) | Complexity (no. of transistors) |
|--------------|-----------------------|-----------------------|------------------------------|---------------------------------|
| 8            | 2                     | 34.3                  | 11.7                         | 892                             |
| 16           | 4                     | 38.2                  | 50.2                         | 3084                            |
| 32           | 6                     | 40.7                  | 59.7                         | 6406                            |
| 64           | 6                     | 42.8                  | 55.9                         | 12098                           |
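The scaling of complexity with the number of buses can be examined directly from the Table 2 entries. The sketch below only divides the reported transistor counts by the number of buses; note that the number of supply voltages also changes between rows, so the per-bus figures are comparable mainly for the 32-bus and 64-bus rows, which both use 6 supplies.

```python
# (number of buses, number of supply voltages, transistor count) from Table 2
rows = [(8, 2, 892), (16, 4, 3084), (32, 6, 6406), (64, 6, 12098)]

per_bus = [transistors / n_buses for n_buses, _, transistors in rows]
for (n_buses, n_vdd, _), value in zip(rows, per_bus):
    print(f"{n_buses:2d} buses, {n_vdd} supplies: {value:.1f} transistors per bus")

# Doubling the bus count at a fixed number of supplies (32 -> 64 buses).
ratio_32_to_64 = rows[3][2] / rows[2][2]
print(f"complexity ratio, 64 vs. 32 buses: {ratio_32_to_64:.2f}")
```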

# V. Conclusions

In this paper we proposed an adaptive supply voltage design technique for long interconnects having different wire capacitances. Our technique assigns different supply voltages to interconnects based on their delay and reduces both delay spread and power consumption. The assignment is done only once during the initialization period of the circuit and initialization circuitry is turned off during normal operations. Hence, there is no extra power consumption in the active mode. We also showed that we should use an optimum number of supply voltages depending on the number of buses to achieve maximum power saving. The simulation results show that the proposed technique reduces power consumption and delay spread in a set of 64 buses by 42.8% and 55.9%, respectively, compared to the single voltage swing design.

# References

- [1] H. B. Bakoglu, Circuits, Interconnect and Packaging for VLSI, Addison-Wesley, 1990.
- [2] A. Chandrakasan, et al., Design of High-Performance Microprocessor Circuits, IEEE Press, 1999.
- [3] D. Sylvester and K. Keutzer, "Getting to the bottom of deep submicron II: A global wiring paradigm," Proc. DAC, pp. 726-731, 1998.
- [4] D. Liu, et al., "Power consumption estimation in CMOS VLSI chips," IEEE J. Solid-State Circuits, vol. 29, pp. 663-670, June 1994.
- [5] Hui Zhang, et al., "Low-Swing On-Chip Signaling Techniques: Effectiveness and Robustness," IEEE Trans. on VLSI Systems, vol. 8, no. 3, June 2000.
- [6] Yoshinobu Nakagome, et al., "Sub-1-V Swing Internal Bus Architecture for Future Low-Power ULSI's," IEEE J. Solid-State Circuits, vol. 28, no. 4, Apr. 1993.
- [7] Mitsuru Hiraki, et al., "Data-Dependent Logic Swing Internal Bus Architecture for Ultralow-Power LSI's," IEEE J. Solid-State Circuits, vol. 30, no. 4, Apr. 1995.