# There is Life Left in ASICs

Leon Stok IBM TJ Watson Research Center Yorktown Heights, NY 10520 USA leonstok@us.ibm.com

John Cohn IBM Micro-Electronics Essex-Junction, VT USA johncohn@us.ibm.com

## **Categories and Subject Descriptors**

B.7.1 [Hardware]: Integrated Circuits—*Types and Design Styles* 

## **General Terms**

Algorithms, Design

## **Keywords**

ASIC, design cost, design tools

## 1. INTRODUCTION

Standard cells have long been an excellent abstraction of technology. ASIC design styles allowed logic designers to very rapidly take advantage of major advances in silicon technology. For the last few years, however, many people have been predicting the death of ASICs. They argue that ASICs are too difficult to design, that the gap between the process technology and the ASIC is growing, and that the cost will make them economically infeasible. In this presentation we will argue that there is a lot of life left in your ASIC. Specifically, there is still plenty of room for improvement in design tools to ensure we stay close to Moore's curve in design performance despite a slowdown in base process technology. Similarly, cost can be lowered significantly by better optimization, analysis and verification.

Alternatives like FPGAs or fully programmable standard products have been proposed. However, both are significantly less power-efficient than ASICs, and do not meet the performance requirements of many applications under a power budget. Since few attractive silicon implementation alternatives to standard cells are available, it is important that the design tools step up to the task.

## 2. MANAGING THE ASIC TECHNOLOGY GAP

It has been widely observed that the gap between ASICs and full-custom design is growing. Full-custom designs take full advantage of all features a technology offers, while ASICs are basically limited by what features design tools gracefully support. There is a gap in the basic layout features that one can take advantage of, in the performance of the circuits, and in the features one uses to minimize power. We will look at each of these and argue why this gap might actually start shrinking.

#### 2.1 Performance

Closing on timing in ASICs has indeed become more and more challenging. In earlier technology generations, most of the technology effects could be encapsulated inside the boundaries of the standard cells. No matter how the cells were connected, they were guaranteed to fabricate correctly and meet the timing constraints. However, in the last few process technologies many of the deep-submicron (DSM) effects have moved outside the standard cell boundary into the interconnect. In response, a whole new suite of design automation tools has emerged to address these problems. Virtual prototyping and interconnect planning tools are becoming more prevalent in the industry. Detailed extraction, cross-coupling, and other analysis tools that address the effects of the wiring on performance are becoming available and are being included in design closure tool suites.

There is no reason to believe that there is a significant limitation in ASIC performance due to the use of standard cell libraries. IBM is designing 2-6 GHz microprocessors using standard cell based control logic [5]. A critical element is the design of the standard cell libraries themselves. Modern fine-grained libraries provide many more cell sizes to ensure that we do not overdrive nets and reflect unnecessary additional load back to the inputs. Libraries developed for IBM microprocessor design contain additional cells with multiple beta ratios to optimize for rise and fall times, and tapered cells to optimize for staggered arrival times at the inputs. Optimization algorithms have been developed that can effectively deal with these multi-dimensional libraries [4]. Well-designed libraries coupled with strong synthesis algorithms have allowed for synthesis of very high-frequency Random Logic Macros (RLMs). Techniques like automated RLM tuning using a tool like EinsTuner allow one to push the performance even further within a standard cell design framework.
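The benefit of a fine-grained library can be illustrated with a toy sizing step. The sketch below is not the algorithm of [4]; it assumes a simple linear delay model, and every name and number in it (`delay`, `input_cap`, `pick_size`, the drive strengths, the 30 fF load) is hypothetical. It picks the smallest drive strength that meets a delay target and reports the input capacitance that choice presents to the previous stage.

```python
def delay(drive, load_ff, intrinsic_ps=20.0, unit_res_ps_per_ff=2.0):
    """Toy linear delay model (assumption): intrinsic delay plus load over drive."""
    return intrinsic_ps + unit_res_ps_per_ff * load_ff / drive


def input_cap(drive, unit_cap_ff=1.5):
    """Input capacitance grows with drive strength (assumption)."""
    return unit_cap_ff * drive


def pick_size(drives, load_ff, target_ps):
    """Return the smallest drive strength that meets the target, or None."""
    for d in sorted(drives):
        if delay(d, load_ff) <= target_ps:
            return d
    return None


coarse_library = [1, 4]        # hypothetical: only two sizes available
fine_library = [1, 2, 3, 4]    # hypothetical: intermediate sizes added

coarse_choice = pick_size(coarse_library, load_ff=30.0, target_ps=55.0)
fine_choice = pick_size(fine_library, load_ff=30.0, target_ps=55.0)

print("coarse:", coarse_choice, "input cap", input_cap(coarse_choice))
print("fine:  ", fine_choice, "input cap", input_cap(fine_choice))
```

With the coarse library the only size that meets the target is the largest one, which overdrives the net and doubles the load seen by the driving stage compared with the intermediate size the fine-grained library offers.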

Global interconnections will indeed become more of a bottleneck for larger ASICs running at higher frequencies. Interconnect pipelining and retiming will become as essential as buffering and wire sizing are now. Networks-on-chip will provide system-level paradigms that can cope with the additional latency introduced in these interconnections.
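As a rough illustration of why pipelining becomes unavoidable, the following sketch estimates how many clock cycles a global wire needs once its delay exceeds the clock period. It is a back-of-the-envelope model only: it ignores register setup and clock-to-output overhead, and the function names and delay values are assumptions made for this example.

```python
import math


def cycles_to_cross(wire_delay_ps, clock_period_ps):
    """Clock cycles needed for a signal to traverse the wire (overheads ignored)."""
    if wire_delay_ps <= 0 or clock_period_ps <= 0:
        raise ValueError("delays must be positive")
    return math.ceil(wire_delay_ps / clock_period_ps)


def pipeline_registers(wire_delay_ps, clock_period_ps):
    """Registers to insert along the wire so each segment fits in one cycle."""
    return cycles_to_cross(wire_delay_ps, clock_period_ps) - 1


# Hypothetical numbers: a 900 ps cross-chip wire in a design with a 400 ps clock.
print(cycles_to_cross(900, 400), pipeline_registers(900, 400))
```

The extra cycles are latency that the system architecture has to tolerate, which is where retiming and network-on-chip style protocols come in.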

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

ISPD'03, April 6-9, 2003, Monterey, California, USA.

Copyright 2003 ACM 1-58113-650-1/03/0004 ...\$5.00.

A significant piece of the performance gap is caused by the guardbanding of the process that is done in ASICs, compared to full-custom designs. Despite the efforts of the process teams to keep process technology under control, and the increased use of software-assisted resolution enhancement technique (RET) tools, variability such as across-chip linewidth variation (ACLV) will increase substantially in future process generations. Traditionally, in ASIC design we have taken a worst-case approach in characterizing all steps of the process. This is achieved by relatively conservative characterization of the libraries and interconnect. However, a worst-case analysis across the many different process corners of the front end of the line and the back end of the line will become overly time-consuming and pessimistic. The many analysis runs will lengthen the turn-around time of ASIC design, and overly pessimistic analysis will leave too much on the table, which will affect the cost.

Statistical analysis techniques, such as statistical timing analysis [2], are emerging and will become very important to reliably and rapidly sign off on many ASICs. More importantly, they will reduce the cost of current overdesign. The ASIC standard cell abstraction is an important level at which we can build appropriate models for statistical analysis tools. But a tremendous amount of work needs to be done on what variations need to be modeled, and how to efficiently incorporate these in large-scale analysis. However, once statistical design becomes acceptable, significant guardbanding can be removed from the ASIC methodology.
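The pessimism of corner-based analysis can be seen on a single path. The sketch below is not the method of [2]; it assumes that gate delays vary independently and follow a Gaussian distribution, which real variation (with its spatial and process correlations) does not. Function names and delay values are hypothetical.

```python
import math


def corner_delay(means_ps, sigmas_ps, k=3.0):
    """Worst-case corner: every gate is assumed to sit at its k-sigma delay at once."""
    return sum(m + k * s for m, s in zip(means_ps, sigmas_ps))


def statistical_delay(means_ps, sigmas_ps, k=3.0):
    """k-sigma path delay when gate variations are independent (assumption)."""
    total_mean = sum(means_ps)
    total_sigma = math.sqrt(sum(s * s for s in sigmas_ps))
    return total_mean + k * total_sigma


# Hypothetical path: 16 gates, each 100 ps nominal with 5 ps standard deviation.
means = [100.0] * 16
sigmas = [5.0] * 16
print(corner_delay(means, sigmas), statistical_delay(means, sigmas))
```

Under these assumptions the corner analysis reserves 240 ps of margin on the path while the statistical bound needs only 60 ps, because independent variations partially cancel. Correlated variation narrows that difference, which is part of the modeling work the text refers to.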

#### 2.2 Power

Power is a major concern in nearly all ASIC designs. Consumer products need to operate without fans, hand-held products need to run on batteries, and network products need to fit into closets in utility buildings with limited power delivery. One of the major advantages ASICs have over programmable fabrics like FPGAs is their power consumption: only the gates needed for the particular function are included on the chip. It is essential that ASICs take advantage of all features the technology offers to minimize active power consumption and leakage. Modern technologies offer devices with multiple thresholds, operation at multiple voltages, and facilities to group devices in voltage or power islands. Finding the right combination of these to minimize power consumption while meeting other design targets is a daunting task.

To minimize power, more dimensions are added to the libraries described above. Cells with multiple threshold levels and cells at multiple voltages are included. There is enormous leverage in selecting the right combination of voltages, thresholds and cell sizes. These multiple dimensions are nearly impossible for a circuit designer to handle, and design automation tools are an absolute requirement. However, a structured design of the library along orthogonal axes is needed to develop effective optimization algorithms.
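To give a feel for one of these dimensions, here is a minimal sketch of threshold selection along a single timing path. It is a simple greedy heuristic, not the optimization used in the case study, and the `Cell` fields, cell names, and delay and leakage numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    name: str
    delay_low_vt: float    # ps, fast but leaky variant
    delay_high_vt: float   # ps, slow but low-leakage variant
    leak_low_vt: float     # nW
    leak_high_vt: float    # nW


def assign_thresholds(path, required_ps):
    """Greedy heuristic: start all cells at low Vt, then move cells to high Vt
    in order of leakage saved per picosecond of added delay while timing holds."""
    choice = {c.name: "low" for c in path}
    delay = sum(c.delay_low_vt for c in path)
    if delay > required_ps:
        raise ValueError("path misses timing even with all low-Vt cells")

    def benefit(c):
        saved = c.leak_low_vt - c.leak_high_vt
        penalty = c.delay_high_vt - c.delay_low_vt
        return saved / penalty if penalty > 0 else float("inf")

    for c in sorted(path, key=benefit, reverse=True):
        saved = c.leak_low_vt - c.leak_high_vt
        penalty = c.delay_high_vt - c.delay_low_vt
        if saved > 0 and delay + penalty <= required_ps:
            choice[c.name] = "high"
            delay += penalty
    return choice, delay


path = [
    Cell("a", 10, 14, 50, 5),
    Cell("b", 20, 30, 80, 10),
    Cell("c", 15, 18, 30, 4),
]
print(assign_thresholds(path, required_ps=55))
```

A real flow has to make this choice across all paths simultaneously and together with supply voltage and cell size, which is why an orthogonally structured library matters for the optimizer.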

We have done a case study in which we developed a library and optimization tools along the dimensions outlined above. Compared to a baseline ASIC design, a factor of 4 improvement was realized in terms of the delay-area-power product metric [6]. Active power dissipation itself was reduced from 0.13W to 0.052W.
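For scale, the active power figures quoted above work out as follows (plain arithmetic on the two reported numbers):

$$
\frac{0.13\,\mathrm{W}}{0.052\,\mathrm{W}} = 2.5,
\qquad
1 - \frac{0.052\,\mathrm{W}}{0.13\,\mathrm{W}} = 0.60
$$

That is, active power alone improved by a factor of 2.5, a 60% reduction.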

More recently, the concept of voltage islands has been developed to manage power by allowing different sections of the chip to run at different operating voltages. Technology provides coarse-grained and fine-grained island options. In the coarser versions, whole IP blocks are run on different voltages, while in the finer-grained versions only circuit rows or parts of circuit rows run at different voltages. Lackey et al. [3] illustrate the use of coarse-grained voltage islands in an ASIC for high-speed serial links, where the power consumption can be reduced from 36.6W to 30W. Voltage and power islands lead to interesting design partitioning problems that cross the timing, power-grid design and placement domains. Requiring rigid partitioning up front certainly simplifies the problem, but drives towards suboptimal designs; more refined techniques will lead to larger power savings.
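The leverage of voltage islands follows from the standard first-order CMOS relation that dynamic power scales with the square of the supply voltage. The sketch below applies that relation to a two-block chip. The relation itself is textbook material rather than something taken from [3], and the block names, capacitances, frequencies and voltages are hypothetical.

```python
def dynamic_power(c_eff_farads, vdd_volts, freq_hz, activity=1.0):
    """First-order CMOS dynamic power: activity * C * Vdd^2 * f."""
    return activity * c_eff_farads * vdd_volts ** 2 * freq_hz


def chip_power(blocks, voltages):
    """Sum dynamic power over blocks, each running at its own island voltage."""
    return sum(
        dynamic_power(c_eff, voltages[name], freq)
        for name, (c_eff, freq) in blocks.items()
    )


# Hypothetical blocks: (effective switched capacitance in F, clock frequency in Hz).
blocks = {"serial_link": (2e-9, 1e9), "control": (4e-9, 2e8)}

single_supply = chip_power(blocks, {"serial_link": 1.2, "control": 1.2})
with_islands = chip_power(blocks, {"serial_link": 1.2, "control": 1.0})
print(single_supply, with_islands)
```

Only the block that needs the performance stays at the higher voltage. In practice the saving has to be weighed against level shifters, the extra power grid, and the placement constraints that the islands introduce.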

Power minimization also leads to interesting new optimization problems when coupled to cost. For example, how does one minimize the number of threshold voltages to be used to minimize the number of extra masks while still meeting all other design constraints? Optimizing for power is probably one of the biggest levers in reducing the ASIC cost, when sufficient reduction can be obtained such that the design can go in a cheaper package.

#### 2.3 Process complexity

Until recently, most of the advances in process technology came from linear scaling driven by lithography improvements. Full-custom designers knew which topologies worked and which ones did not, and pretty much stuck to the good ones and scaled them to new technology generations. This situation is changing rapidly. New limitations on lithography will require strong resolution enhancement techniques (RET) to be used. These result in many non-linear scalings of the design rules.

Standard cell abstractions are well positioned to cope with these developments. Standard cell design images can be perfectly tuned with respect to the range of interaction of litho effects. Many of the effects can be contained and analyzed within the cell boundaries. The few effects that extend beyond the cell boundary can be handled by providing multiple copies of particular cells that satisfy the correct boundary conditions. Alternate Phase Shift Masking is one example of the techniques being considered. By providing multiple phase-colored copies of standard cells, the placer can automatically ensure that cells are legally abutted to each other.
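How a placer might choose among phase-colored copies can be sketched as a small dynamic program over one cell row. The model is deliberately simplified and hypothetical: each variant is described only by the phase on its left and right edge, and the legality rule used here (abutting edges must carry the same phase) is a stand-in for whatever the actual mask rules require.

```python
from typing import Dict, List, Optional, Tuple

# A variant is (left_edge_phase, right_edge_phase), with phases in {0, 180}.
Variant = Tuple[int, int]


def legal(left: Variant, right: Variant) -> bool:
    """Hypothetical abutment rule: the touching edges must carry the same phase."""
    return left[1] == right[0]


def color_row(row: List[str], library: Dict[str, List[Variant]]) -> Optional[List[Variant]]:
    """Pick one variant per cell, left to right, so that every abutment is legal.
    Returns None when no legal combination exists."""
    if not row:
        return []
    # layers[i] maps each reachable variant of cell i to a legal predecessor variant.
    layers = [{v: None for v in library[row[0]]}]
    for name in row[1:]:
        previous = layers[-1]
        layer = {}
        for v in library[name]:
            for p in previous:
                if legal(p, v):
                    layer[v] = p
                    break
        if not layer:
            return None
        layers.append(layer)
    # Walk back from any reachable variant of the last cell.
    v = next(iter(layers[-1]))
    result = [v]
    for i in range(len(layers) - 1, 0, -1):
        v = layers[i][v]
        result.append(v)
    result.reverse()
    return result


library = {
    "NAND2": [(0, 0), (180, 180)],
    "INV": [(0, 180), (180, 0)],
    "NOR2": [(0, 0)],
}
print(color_row(["NAND2", "INV", "NOR2"], library))
```

Because the interaction is confined to neighboring cell edges, the choice stays local to a row, which is what makes it practical to fold into placement legalization.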

Ever stricter design rules might force layouts to adhere to much tighter pitches and layout grids. In the standard cell footprints, different design grids can be introduced for the various process layers, such that the features on these layers will print with the highest resolution. There is a wide-open area of research in effectively linking the best designs for manufacturing to chip images and standard cell layout topologies, and in providing cell layout generation tools for these. Interestingly, these gridded approaches might make very high quality automated cell layout generation more feasible than ever before.

## 3. MANAGING ASIC COST

The drastic rise in mask costs has spawned many predictions that ASICs will disappear. However, there are many avenues to pursue to keep the cost of fabricating ASICs economically viable. Sharing masks among multiple part numbers will become an increasingly viable option for lower volume parts. Increased emphasis on yield learning for masks will drive down costs more rapidly when technologies mature. Maskless lithography techniques based on optical micro-mirrors are emerging that are viable for lower volume ASICs, for customization of mostly predefined chip images, or for ECOs by mask repair. In addition, there are many things that can be done in the design methodology and design tools to drive down mask costs.

#### 3.1 Minimize number of design spins

Anecdotal evidence suggests that an industry average of 2-3 spins is required to get an ASIC right. A robust "first-time right" methodology with "first-time right" tools will significantly reduce the average amount to be spent on mask sets for ASICs. A solid logical verification strategy will be able to minimize the engineering changes required due to logical bugs. Robust timing and signal analysis tools will be able to avoid noise problems. Design of robust power delivery grids and robust clocking strategies will further avoid many signal integrity problems. Key to this is sufficiently detailed electrical analysis. The current bottleneck is the availability of analysis algorithms that can provide detailed analysis across multiple domains on very large networks.
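The effect of spins on mask spending is simple to put into numbers. The sketch below uses a hypothetical mask set price and a hypothetical `respin_fraction` parameter for the share of the mask set that must be remade on each respin. Neither value comes from the text, which only reports the 2-3 spin average.

```python
def mask_spend(mask_set_cost, spins, respin_fraction=1.0):
    """Total mask cost: one full set, plus a fraction of a set for each respin."""
    if spins < 1:
        raise ValueError("at least one spin is needed")
    return mask_set_cost * (1 + (spins - 1) * respin_fraction)


cost = 1_000_000  # hypothetical price of one full mask set
print(mask_spend(cost, spins=1))
print(mask_spend(cost, spins=3))
print(mask_spend(cost, spins=3, respin_fraction=0.5))
```

Even if a respin touches only half of the layers, three spins double the mask bill relative to a first-time-right design in this model.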

#### 3.2 Reuse and repair

New and interesting technology options can be provided in an ASIC framework to repair defects and allow reuse of the same design in multiple applications. Autonomic devices are starting to make their way into DRAM designs. Current IBM ASICs use a type of embedded DRAM that we test and fix, if required, with a laser editing technique before shipping. Once sold and installed, malfunctions can only be fixed by ripping-and-replacing. The next ASIC generation will use new autonomic technology, in which the chip logic will query its DRAM cache and fix problems by electronically blowing pre-set fuses to correct glitches.

Embedded FPGAs [7] are being added to the ASIC roadmap. eFPGAs allow for ECOs without refabrication. eFPGAs can extend the lifetime of a chip by allowing upgrades for future product lines or evolving standards. As pointed out by Zuchowski et al., including eFPGAs in ASICs leads to many new challenges. The physical integration of large metal-intensive FPGA cores is challenging for floorplanning and chip physical design. Wide differences in power and performance specifications for the two technologies create unique challenges for design planning, logic partitioning, synthesis, and timing. As an example, eFPGA partitioning needs to be done such that it increases the chances that a particular ECO can be handled by the programmable fabric. Embedded FPGAs are technically possible, but their economic feasibility will heavily depend on design tools that can capitalize on their advantages.

#### 3.3 Design Size, Time to market

The cost advantage of future ASICs relies heavily on the continued integration of more function on a single die. Design sizes will therefore continue to increase drastically for next-generation ASICs to remain economically feasible. One of the major advantages of ASIC design styles over other fabrics is the enormous number of usable gates that can be placed on a chip.

Great opportunities exist for improved DA tools to effectively support these growing design sizes. Tools like PDS (IBM's Placement Driven Synthesis system) have given large productivity boosts and significantly shortened the time to market. However, many of the design tools currently in use produce results far from optimal, especially for larger designs. For example, in IBM we have access to a variety of placement algorithms and see vastly different results on a particular design. Looking at many different designs, no algorithm is a clear winner: in some designs one algorithm produces the best results, in other designs another does. The differences in quality of results, be it wire length, routability or delay, are remarkable. The study in [1] confirms that even from a wire-length perspective alone, many of the algorithms in practical use leave much on the table.
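Comparing placements by wire length is commonly done with the half-perimeter wire length (HPWL) estimate. The sketch below computes it for two candidate placements of the same netlist and keeps the better one. HPWL is used here as a familiar stand-in metric, not necessarily the one used in [1], and the cell names, coordinates and nets are hypothetical.

```python
def hpwl(pins):
    """Half-perimeter of the bounding box around a net's pin locations."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_hpwl(placement, nets):
    """Sum of HPWL over all nets; placement maps cell name to an (x, y) location."""
    return sum(hpwl([placement[cell] for cell in net]) for net in nets)


nets = [["u1", "u2"], ["u2", "u3"], ["u1", "u2", "u3"]]

# Two hypothetical results, as if produced by two different placement algorithms.
candidates = {
    "placer_a": {"u1": (0, 0), "u2": (3, 4), "u3": (1, 1)},
    "placer_b": {"u1": (0, 0), "u2": (1, 0), "u3": (0, 1)},
}

scores = {name: total_hpwl(p, nets) for name, p in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

Since no single algorithm wins on every design, a flow that scores several candidates this way (and likewise for routability and delay) can simply keep the best result per design.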

Design tools have failed to capitalize on the vast compute power that modern system architectures like SMPs, or modern network architectures like GRIDs, make available to us. On-demand computing will make these resources available in very large quantities, and we need to capitalize on this to obtain better turn-around times on very large designs.

## 4. CONCLUSIONS

The introduction of powerful design automation tools made the standard cell design paradigm an economically thriving one. A next generation of design tools will lift this paradigm to the next level, and ensure that ASICs will keep up with Moore's law.

- Increased coupling between design, design automation and process teams is needed for leadership. This will be difficult for stand-alone foundries to achieve. A tight coupling between technology definition, library and chip image design, design tools and designers is essential to create next-generation standard cell designs.
- Efficient use of all options process technology provides us can only be obtained by very effective use of sufficiently smart design tools. An ASIC design style has been and will continue to be a useful abstraction to base these design tools on.
- The ASIC sign-off business models will be changing. New design service and COT models will emerge. But ASICs as a standard cell design methodology will also be the cornerstone of these new offerings.

## 5. REFERENCES

- [1] C.-C. Chang, J. Cong and M. Xie, "Optimal Scalability Study of Existing Placement Algorithms," Asia South Pacific Design Automation Conference, Kitakyushu, Japan, pp 621-627, January 2003.
- [2] J. A. G. Jess and K. Kalafala and S. R. Naidu and R. H. J. M. Otten and C. Visweswariah, "Statistical timing for parametric yield prediction of digital integrated circuits", Proc. 2003 Design Automation Conference, June, 2003, Anaheim, CA.
- [3] Lackey, D.E.; Zuchowski, P.S.; Bednar, T.R.; Stout, D.W.; Gould, S.W.; Cohn, J.M.; "Managing power and performance for system-on-chip designs using voltage islands", IEEE/ACM International Conference on Computer Aided Design, 2002 Page(s): 195 -202
- [4] F. Beeftink and P. Kudva and D. Kung and L. Stok, "Combinatorial cell design for CMOS libraries", Integration, the VLSI Journal, vol 29, no 4, pp 67-93, 2000.
- [5] Northrop, G.A.; Pong-Fei Lu; "A semi-custom design flow in high-performance microprocessor design", in: Design Automation Conference, 2001, Page(s): 426 -431
- [6] Sunderland, David, Cohn John, Stok, Leon, "Multi-Mission, Multi-Kernel Processor (MKP) Study" DARPA Contract MDA972-01-C-0015, November 2002.
- [7] Zuchowski, P.S.; Reynolds, C.B.; Grupp, R.J.; Davis, S.G.; Cremen, B.; Troxel, B. "A hybrid ASIC and FPGA architecture" IEEE/ACM International Conference on Computer Aided Design, 2002, Page(s): 187-194