# Exact and Approximate Methods for Calculating Signal and Transition Probabilities in FSMs* 

Chi-Ying Tsui, Massoud Pedram, Alvin M. Despain<br>Department of Electrical Engineering - Systems<br>University of Southern California, Los Angeles, CA 90089


#### Abstract

In this paper, we consider the problem of calculating the signal and transition probabilities of the internal nodes of the combinational logic part of a finite state machine (FSM). Given the state transition graph (STG) of the FSM, we first calculate the state probabilities by iteratively solving the Chapman-Kolmogorov equations. Using these probabilities, we then calculate the exact signal and transition probabilities by an implicit state enumeration procedure. For large sequential machines where the STG cannot be explicitly built, we unroll the next state logic $k$ times and estimate the signal probability of the state bits using an OBDD-based approach. The basic computation step consists of solving a system of nonlinear equations. We then use these estimates to approximately calculate signal and transition probabilities of the internal nodes. Our experimental results indicate that the average errors of transition probabilities and power estimation(compared to the exact method) are only $5 \%$ and $0.6 \%$ respectively when $k=3$. This is an order of magnitude improvement in computation accuracy compared to the existing approaches.


## 1 Introduction

One of the primary objectives in the design of portable systems - which are becoming widespread - is power reduction needed to minimize the size and weight allocated to batteries. Another driver of the progress in the area of low power design is the increasing need to reduce active and/or standby power consumption in all electronic systems. Essential elements of a low power design environment include means of analyzing the dissipation of a design and mechanisms for minimizing the power consumption when needed. This paper is concerned with the power estimation in finite state machines. Various approaches for power minimization at the sequential logic synthesis level [9] [6] can benefit from the techniques presented here.

In CMOS circuits, power is consumed during charging and discharging of the load capacitances. In order to estimate the power consumption, we have to calculate the signal and transition probabilities of the internal nodes of the circuit. These probabilities depend on the input patterns, the delay model and the circuit structure.

[^0]Several signal and transition probabilities estimation algorithms have been developed for combinational circuit. Burch et al. [2] introduced the concept of a probability waveform. Given such waveforms at the primary inputs and with some convenient partitioning of the circuit, they examined every sub-circuit and derive corresponding waveforms at the internal circuit nodes. Najm [7] described an efficient technique called probabilistic simulation to propagate the transition densities at the circuit primary inputs into the circuit to give transition densities at internal and output nodes. Both methods assume inputs to sub-circuits are independent and thus did not account for the reconvergent fanout and input correlations. Stamoulis et al. [10] improved the probabilistic simulation approach by calculating the statistics of the waveforms and delays more accurately and by considering signal correlation. Ghosh et al. [4] proposed symbolic simulation in order to produce a set of Boolean functions which represent conditions for switching at each gate in the circuit. Tsui et al. [11] described a tagged probabilistic simulation approach which employs a real delay model to account for glitchings and also handles reconvergent fanout. This approach requires much less memory and rums much faster than symbolic simulation, yet achieves a very high accuracy.

The above methods assume the primary inputs to the circuit are both spatially and temporally independent, i.e. the signal value $x_{i}$ of a primary input $i$ is independent of any other primary input, and $x_{i}$ at time instance $t$ is independent of $x_{i}$ at time instance $t+1$. While this assumption holds for most combinational circuits, it does not hold for finite state machines where the present state bit inputs are spatially correlated by the state encoding and temporally correlated by the state transition behavior. Figure 1 shows the STG and the gate implementation of a 4 -state finite state machine. If the state bits are assumed spatially independent, the signal probability of $n(P(n))$ is equal to $0.5^{*} 0.5^{*} 0.5=0.125$. However, $n$ will evaluate to 1 only if $s_{1} s_{2}=10$ and input $i$ is 0 , hence $P(n)=$ state probability of state- $3^{*} 0.5=0.0721$. Furthermore, if the state input bits are assumed temporally independent, the transition probability of $n\left(P_{1->0}(n)\right)$ is equal to $0.125^{*}(1-0.125)=$ 0.1093 . However, when the present state is state- 3 and the input is 0 , the next state is state- 4 which always forces $n$ to 0 . Therefore, the actual transition probability of $n$ $\left(P_{1->0}(n)\right)$ is equal to 0.0721 .

Let $f_{t}(n)$ denotes the Boolean value of $n$ at time $t$. The output of $n$ will switch exactly if

$$
\begin{equation*}
g(n)=f_{0}(n) \oplus f_{t}(n) \tag{1}
\end{equation*}
$$


 bilities given the STG and then present an exac
for calculating the signal and transition probabili
 gOHLAN LOVXG GHL $\boldsymbol{Z}$ results and conclusions are presented in Sections




 network, we present an approximate transition p
calculation procedure for large sequential machi and cascading the next state logic of the FSM
note this as a k-unrolled network. Based on the
network, we present an approximate transition meration procedure. We also demonstrate that
state bit signal probabilities can be obtained by
and cascading the next state logic of the FSM signal and transition probabilities by an implicit
meration procedure. We also demonstrate that sion. Using these probabilities, we then calculate
signal and transition probabilities by an implicit


[^1]$$
\sum_{j} P_{S_{j}}=1
$$
where $M$ is the number of states.
The state probabilities can then be obtained by solving the Chapman-Kolmogorov equations [8] as follows:
\[

$$
\begin{array}{rlr}
P_{S_{i}} & =\sum_{j \in I N_{-S T A T E(i)}} p_{j i} P_{S_{j}} \quad i=1,2, \ldots, M-1 \\
1 & =\sum_{j} P_{S_{j}} \tag{3}
\end{array}
$$
\]

where IN_STATE( $i$ ) is the set of fanin states of $i$ in the STG.

Alternatively, we could solve for $S_{i}$ 's as follows. Let $P_{S_{i}}(n)$ be the probability of $S_{i}$ after $n$ cycles. For a discrete-state discrete-transition Markov process, $P_{S_{i}}(n)$ and $P_{S_{i}}(n+1)$ are related by:

$$
\begin{align*}
P_{S_{i}}(n+1)= & \sum_{j \in I N_{-} S T A T E(i)} p_{j i} P_{S_{j}}(n) \\
& i=1,2, \ldots, M-1 \\
1= & \sum_{j} P_{S_{j}}(n+1) . \tag{4}
\end{align*}
$$

Given a set of initial condition $P_{S_{1}}(0), P_{S_{2}}(0), \ldots, P_{S_{M}}(0)$, these equations can be solved iteratively for $n=0,1,2, \ldots$ to determine the state probabilities as a function of $n$.

This process is continued until the state probabilities converge, that is, the difference between $P_{S_{i}}(n+1)$ and $P_{S_{i}}(n)$ for all states is within a user defined tolerance value.

### 2.2 Signal Probability Calculation

Let $S=S_{1}, S_{2}, \ldots, S_{M}$ be the set of states of the FSM, $s=s_{1}, s_{2}, \ldots, s_{N}\left(N \geq\left\lceil\log _{2} M\right\rceil\right)$ be the set of state bits, $P I=i_{1}, i_{2}, \ldots, i_{K}$ be the set of primary inputs and $n$ be an internal node in the combinational circuit of the FSM.

Suppose $f$ is a disjoint cover of the function computed by $n$, i.e.,

$$
\begin{equation*}
f=\sum_{m \in D i s j o i n t+C o v e r(n)} C_{m} \tag{5}
\end{equation*}
$$

where $C_{m}$ is a cube of the disjoint cover. $C_{m}$ is a function of $s$ and $P I$. We partition the inputs to $C_{m}$ into two groups: the symbolic state support $S S_{m}$ which includes all states $S_{i}$ that have set the appropriate state bits, and the primary input support $I_{m}$ which includes the PI inputs of $C_{m}$. Hence $\mathrm{C}_{m}=S S_{m} I_{m}$.

Given a disjoint cover of node $n$, the signal probability of $n$ is given by:

$$
\begin{equation*}
P(n)=\sum_{m \in D i s j o i n t-C o v e r(n)} P\left(C_{m}\right) . \tag{6}
\end{equation*}
$$

Since the primary inputs are independent of the state that the machine is currently in and states of the FSM are distinct, we can write

$$
\begin{align*}
P\left(C_{m}\right) & =P\left(I_{m}\right) P\left(S S_{m}\right) \\
& =P\left(I_{m}\right) \sum_{S_{i} \in S S_{m}} P_{S_{i}} . \tag{7}
\end{align*}
$$

From equations (6) and (7), we have:

$$
\begin{equation*}
P(n)=\sum_{m \in D i s j o i n t-C o v e r(n)} P\left(I_{m}\right) \sum_{S_{i} \in S S_{m}} P_{S_{i}} . \tag{8}
\end{equation*}
$$

Let $S S(n)$ denote the union of the symbolic state supports of all the cubes in the disjoint cover of $n$, equation (8) can be written as:

$$
\begin{equation*}
P(n)=\sum_{S_{i} \in S S(n)} P\left(S_{i}\right) P\left(A U X \_I\left(n \mid S_{i}\right)\right) \tag{9}
\end{equation*}
$$

where $A U X_{-} I\left(n \mid S_{i}\right)$ is a Boolean function (of the primary inputs) which forces $n$ to 1 given that the present state of the machine is $S_{i}$. Equation (9) requires explicit enumeration of the states in $S S(n)$ and is very costly. In [12], a method which employs an implicit enumeration of states using OBDDs is described.

### 2.3 Transition Probability Calculation

The transition probability calculation can be reduced to a signal probability calculation based on equation (1) as shown in Figure 2. $g$ is a function of $P I_{0}, P I_{t}$ and $P S_{0}$. As these input vectors are assumed to be uncorrelated, equation (8) can be used to obtain the exact signal probability of $g$, thus, the exact transition probability of $n$.

## 3 THE APPROXIMATE METHOD

As the number of states is exponential in the number of flip flops, for sequential machines having large number of flip flops, we cannot explicitly build the STG, and thus the exact method cannot be applied. To calculate the signal and transition probabilities of the internal nodes, we have to use the signal and transition probabilities of the state bits. The state bits are correlated and hence their signal probabilities are not 0.5 (which is the probability of a random input). In the following, we describe an approximate method for calculating the signal and transition probabilities using finite network unrolling followed by iterative solution of a system of non-linear equations.

### 3.1 Signal Probability Calculation

Given the state probabilities and the state encoding, the signal probability of state bit $s_{i}$ is given by:

$$
\begin{equation*}
P\left(s_{i}\right)=\sum_{S_{m} \in S_{-} E N(i)} P_{S_{m}} \tag{10}
\end{equation*}
$$

where $S_{-} E N(i)$ is the set of states whose encodings have the $i^{\text {th }}$ bit equal to 1 .

The state bit signal probabilities can also be derived without explicitly calculating the state probabilities which is very costly for sequential machines with a large number of flip flops as described next.

The transition behavior of the STG is implicitly captured by the next state logic of the FSM. Assuming the present state bits are uncorrelated and their signal probabilities are given, the characteristics (i.e. signal probabilities and correlations) of the next state bits can be obtained from the Ordered Binary Decision Diagrams (obdDs) [1] representation for the next state logic. If we unroll and cascade the next state logic $k$ times to form a $k$-unrolled
tical. We thus approximate the signal
stateprobabilities obtained by solving
a specific state encoding, a set of state

> 3.2 Transition Probability
> network with signal probability feedback
$\begin{aligned} & \text { work without any signal probability feed } \\ & \text { to this method as } 1 \text {-unrolled, } O \text {-feedback }\end{aligned}$
$\begin{aligned} & \text { signal probability calculation for the } k \text { - } \\ & \text { guaranteed to converge. }\end{aligned}$

$$
\begin{aligned}
& \begin{array}{l}
\text { Proof As } f_{i}=p s_{j} P\left(\mathcal{F}_{i}\left(\mathcal{P}_{j}=1\right)\right)+(1 \\
0) \text {, we can write }
\end{array}
\end{aligned}
$$

$\begin{gathered}8 \\ 3 \\ 3 \\ 3 \\ 3 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ \vdots \\ \vdots\end{gathered}$
$\left.\cdots \cdot \tau_{s d} \cdot \operatorname{tsd}^{\prime} \cdot \tau d\right) \operatorname{tf}^{\prime}=\operatorname{Is} d$
to find the steady state bit probabilities,
bit at the input of the $k$-unrolled net
unrolled network, respectively and $\mathcal{F}_{i}$ 's
$\begin{aligned} & \text { where } \mathcal{N}_{i}^{k} \text { and } \mathcal{P}_{j}^{0} \text { denote the } i^{t h} \text { next } \\ & \text { output and the } j^{t h} \text { present state bit at }\end{aligned}$
$\begin{aligned} & 3 \\ & 3 \\ & 3 \\ & 3 \\ & 3 \\ & 3 \\ & 3 \\ & 3 \\ & 0.0 \\ & 0.0 \\ & 0\end{aligned}$







 К7!!!qeqoud [еияі! ory. Power consumption measurement was based on the
following model:



 -svosi pue [6-onON әц jo słasqns Su!̣n squәuụədxa [e,ə






[8] A. Papoulis. Probability, Random Variables and Stochastic Processes. McGraw-Hill, 1984.
[9] K. Roy and S. Prasad. SYCLOP: Synthesis of CMOS logic for low power application. In Proceedings of the International Conference on Computer Design, pages 464-467, October 1992.
[10] G. I. Stamoulis and I. N. Hajj. Improved techniques for probabilistic simulation including signal correction effects. In Proceedings of the 30 th Design Automation Conference, pages 379-383, June 1993.
[11] C. Y. Tsui, M. Pedram, and A. Despain. Efficient estimation of dynamic power dissipation with an application. In Proceedings of the IEEE International Conference on Computer Aided Design, pages 224-228, Nov 1993.
[12] C. Y. Tsui, M. Pedram, and A. Despain. Exact and approximate methods for calculating signal and transition probabilities in FSMs. Technical Report CNEG 93-42, Electrical Engineering-System Department, University of Southern California, October 1993.

| circuit | $\begin{aligned} & \text { Method } \\ & 1 u_{-} o f \end{aligned}$ | $k$-unrolled with $k$-feedback |  |  |
| :---: | :---: | :---: | :---: | :---: |
|  |  | $\mathrm{k}=1$ | k=2 | k=3 |
| bbsse | 20.81 | 4.93 | 1.11 | 0.24 |
| beecount | 6.01 | 2.23 | 0.93 | 0.28 |
| cse | 3.03 | 0.84 | 0.20 | 0.04 |
| dk14 | 2.19 | 0.22 | 0.16 | 0.03 |
| dk15 | 3.97 | 2.38 | 0.15 | 0.02 |
| dk16 | 4.20 | 2.07 | 0.80 | 0.73 |
| dk17 | 5.31 | 3.00 | 0.25 | 0.26 |
| donfile | 10.28 | 8.44 | 2.83 | 1.42 |
| ex2 | 6.92 | 3.58 | 2.80 | 1.00 |
| ex5 | 8.81 | 1.31 | 0.55 | 0.27 |
| ex6 | 12.12 | 8.55 | 1.60 | 0.56 |
| keyb | 12.89 | 1.46 | 0.36 | 0.05 |
| opus | 20.13 | 15.35 | 5.05 | 3.20 |
| planet | 17.35 | 14.80 | 18.13 | 14.89 |
| s 1488 | 10.54 | 8.15 | 2.69 | 1.01 |
| s386 | 24.20 | 1.81 | 0.44 | 0.04 |
| s510 | 11.37 | 6.87 | 5.32 | 4.81 |
| s820 | 13.78 | 2.07 | 0.89 | 0.47 |
| scf | 19.00 | 13.70 | 3.12 | 1.99 |
| styr | 18.70 | 4.16 | 1.32 | 0.72 |
| tbk | 5.17 | 3.87 | 1.45 | 0.93 |
| $\begin{aligned} & \text { average } \\ & \text { error }(\%) \end{aligned}$ | 11.28 | 5.23 | 2.39 | 1.57 |

Table 1: State bit signal probabilities.

| circuit | Exact trans. prob. | $\begin{gathered} \text { Method } \\ 1 u \_o f \end{gathered}$ | k-unrolled with k-feedback |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: |
|  |  |  | k=1 | $\mathrm{k}=2$ | $\mathrm{k}=3$ |
| bbsse | 20.67 | 71.23 | 15.18 | 5.27 | 1.13 |
| beecount | 12.04 | 29.25 | 7.39 | 2.42 | 0.64 |
| cse | 16.09 | 98.55 | 8.10 | 2.31 | 0.53 |
| dk14 | 23.20 | 19.88 | 2.88 | 0.30 | 0.05 |
| dk15 | 19.61 | 11.47 | 5.99 | 1.04 | 0.04 |
| dk16 | 38.43 | 23.18 | 9.52 | 3.79 | 1.45 |
| dk17 | 16.42 | 19.66 | 8.28 | 2.24 | 1.16 |
| donfile | 23.56 | 18.63 | 8.83 | 5.10 | 2.90 |
| ex2 | 29.65 | 41.74 | 34.88 | 11.43 | 4.96 |
| ex5 | 13.61 | 51.20 | 9.17 | 3.97 | 1.25 |
| ex6 | 21.66 | 25.64 | 11.33 | 2.99 | 1.59 |
| keyb | 26.53 | 50.56 | 6.11 | 1.55 | 0.43 |
| opus | 16.59 | 42.79 | 40.23 | 24.11 | 5.11 |
| planet | 69.98 | 42.93 | 40.76 | 28.10 | 26.80 |
| s1488 | 69.58 | 241.14 | 119.05 | 34.79 | 14.60 |
| s386 | 25.57 | 377.13 | 12.20 | 5.80 | 1.33 |
| s510 | 42.82 | 39.31 | 14.56 | 1.48 | 9.04 |
| s820 | 43.55 | 201.45 | 14.22 | 7.80 | 3.88 |
| scf | 59.22 | 102.57 | 79.31 | 38.51 | 18.64 |
| styr | 35.47 | 166.25 | 43.16 | 14.83 | 7.62 |
| tbk | 79.46 | 39.86 | 17.98 | 3.74 | 2.58 |
| average error (\%) |  | 81.64 | 24.24 | 9.60 | 5.03 |

Table 2: Transition probabilities.

| circuit | CPU time in seconds |  |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: |
|  | $\begin{aligned} & \text { Exact } \\ & \text { Method } \end{aligned}$ | $\begin{gathered} \text { Method } \\ 1 u_{-0 f} \end{gathered}$ | k -unrolled with  <br> $\mathrm{k}=1$  <br> k  |  | $k$-feedback |
|  |  |  |  |  | $\mathrm{k}=3$ |
| bbsse |  | 3.76 | 3.28 | 5.16 | 8.86 |
| beecount | 2.40 | 1.43 | 1.05 | 1.45 | 2.08 |
| cse | 11.86 | 6.29 | 5.31 | 10.71 | 19.80 |
| dk14 | 6.30 | 3.62 | 2.71 | 4.98 | 8.13 |
| dk15 | 3.86 | 2.31 | 1.63 | 2.36 | 3.20 |
| dk16 | 18.20 | 9.73 | 8.43 | 17.68 | 30.65 |
| dk17 | 3.60 | 2.00 | 1.55 | 2.24 | 3.28 |
| donfile | 26.30 | 6.83 | 5.40 | 11.20 | 21.13 |
| ex2 | 12.41 | 4.75 | 4.22 | 8.00 | 15.40 |
| ex5 | 3.28 | 1.82 | 1.45 | 2.11 | 3.18 |
| ex6 | 5.46 | 3.12 | 2.51 | 4.30 | 6.82 |
| keyb | 34.48 | 11.34 | 8.98 | 20.24 | 36.04 |
| opus | 6.48 | 2.48 | 2.26 | 3.90 | 5.96 |
| planet | 144.54 | 18.00 | 15.51 | 22.68 | 31.80 |
| s1488 | 89.89 | 32.54 | 35.23 | 66.44 | 109.48 |
| s386 | 26.30 | 5.80 | 5.16 | 10.00 | 16.23 |
| s510 | 127.94 | 9.08 | 7.95 | 12.55 | 18.33 |
| s820 | 33.51 | 22.15 | 16.06 | 42.50 | 84.62 |
| scf | 294.50 | 31.30 | 26.48 | 40.09 | 56.40 |
| styr | 38.78 | 23.03 | 19.19 | 43.23 | 84.60 |
| tbk | 127.14 | 74.56 | 60.34 | 209.60 | 905.00 |

Table 3: CPU times for transition probability calculation.

| circuit | Power consumption |  |  |  |  |
| :--- | ---: | ---: | ---: | ---: | ---: |
|  | Exact | Method | $k$-unrolled with $k$-feedback |  |  |
|  | Method | $1 u-0 f$ | $\mathrm{k}=1$ | $\mathrm{k}=2$ | $\mathrm{k}=3$ |
| bbsse | 5072 | 6326 | 5035 | 5037 | 5061 |
| beecount | 2204 | 2422 | 2179 | 2172 | 2197 |
| cse | 4144 | 5842 | 4141 | 4144 | 4147 |
| dk14 | 4913 | 5186 | 4939 | 4916 | 4913 |
| dk15 | 3755 | 3793 | 3751 | 3748 | 3755 |
| dk16 | 11947 | 12487 | 11928 | 11962 | 11972 |
| dk17 | 3934 | 4151 | 4008 | 3945 | 3936 |
| donfile | 7787 | 7872 | 7773 | 7806 | 7812 |
| ex2 | 8695 | 7963 | 7816 | 8377 | 8539 |
| ex5 | 3051 | 3500 | 3037 | 3012 | 3001 |
| ex6 | 4518 | 4866 | 4655 | 4567 | 4529 |
| keyb | 7712 | 9176 | 7638 | 7704 | 7728 |
| opus | 4080 | 4277 | 4238 | 4049 | 4095 |
| planet | 19497 | 19905 | 19778 | 19483 | 19489 |
| s1488 | 22250 | 52481 | 31707 | 26735 | 23679 |
| s386 | 7058 | 14749 | 6763 | 7037 | 7076 |
| s510 | 8709 | 10816 | $9261)$ | 8785 | 8745 |
| s820 | 18078 | 31557 | 18001 | 17878 | 18116 |
| scf | 14309 | 18323 | 15434 | 14351 | 14342 |
| styr | 13020 | 16106 | 13367 | 12953 | 13045 |
| tbk | 74528 | 87964 | 73328 | 73337 | 74505 |
| average |  | 26.83 | 4.30 | 1.51 | 0.63 |
| error (\%) |  |  |  |  |  |

Table 4: Power estimates.

| circuit | No. of latches | Power consumption |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: |
|  |  | $\begin{aligned} & \text { Method } \\ & 1 u_{-} \text {of } \end{aligned}$ | k-unrolled network method |  |  |
|  |  |  | $\mathrm{k}=1$ | $\mathrm{k}=2$ | $\mathrm{k}=3$ |
| s344 | 15 | 7256 | 5791 | 5750 | 5707 |
| s382 | 21 | 10099 | 3980 | 3992 | 3976 |
| s400 | 21 | 10640 | 4063 | 4075 | 4060 |
| S444 | 21 | 10174 | 3890 | 3897 | 3881 |
| s526 | 21 | 14202 | 5778 | 5798 | 5762 |
| S641 | 19 | 9500 | 8149 | 8111 | 8098 |
| s713 | 19 | 9848 | 8566 | 8539 | 8522 |
| s838 | 32 | 5934 | 5934 | 5934 | 5934 |
| s953 | 29 | 16440 | 12381 | 11856 | 12367 |
| s1196 | 18 | 28935 | 27889 | 27890 | 27890 |
| s1238 | 18 | 31946 | 30911 | 30912 | 30912 |

Table 5: Power estimates for machines with large number of flip flops (the estimation accuracy increases as $k$ increases).


[^0]:    *This research was supported in part by the NSF's Research Initiation Award under contract No. MIP92/11668 and by APRA under contract No. J-FBI-91-194.

[^1]:    Figure 2: Capture temporal correlation of
    by next state logic.

