EECS 141: Digital Integrated Circuits-Spring 2005
Report Cover Sheet

Term Project: 16-bit Arithmetic Adder
Due: Thursday May 5, 2005

|Names |
|Loi Phat Dinh | |

|Parameter |Pre-design estimate |Actual |Units |
|Adder topology |Modified Brent-Kung |
|Circuit style |Static CMOS |
|Critical path delay |1.77 |1.33 |ns |
|Worst case energy per addition |41 |68 |pJ |
|Layout Area |25380 |27495 |μm² |

Executive Summary – Adder Design
Overall Design Decisions: The objective of the project is to design a 16-bit adder that minimizes delay while keeping energy within the constraints and keeping the area small. The constraints are as follows: energy per operation must be less than 0.5 nJ, trise and tfall must be below 200 ps, and the design must use a standard-cell layout that obeys all the design rules. Several adder topologies with different advantages were introduced during the course, and we chose the Brent-Kung architecture because it is simpler to implement than a radix-4 Kogge-Stone tree, which makes the layout less complicated. Even though it is somewhat slower than Kogge-Stone, it saved us time on the layout. After choosing the topology, we had to choose a logic style for the logic blocks. We first implemented the blocks with transmission gates to gain speed. However, we settled on static CMOS because it makes the blocks easy to size with the logical effort approach introduced in lecture.
Since we wanted the adder to operate as fast as possible, we modified the Brent-Kung tree to minimize the propagation delay: we added 3 more dot products and cut the number of stages on the critical path from 7 to 5. In the original Brent-Kung tree, the carry C14 has to wait for the product of C7 and C11, then C11 and C13, and finally C13 and P14. In the modified tree, C7 is combined directly with C13 at the fourth stage, so C14 is available at the fifth stage. The modified Brent-Kung tree is shown in the figure, with the critical path highlighted in red.
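For reference, the 7-stage baseline can be sketched in Python. This is the textbook 16-bit Brent-Kung prefix tree, not the modified tree used in this project, and the names `dot` and `brent_kung_add` are illustrative assumptions rather than anything from the design files. The sketch counts the dot-product stages and checks the result against ordinary integer addition.

```python
def dot(hi, lo):
    """Prefix 'dot' operator on (generate, propagate) pairs.

    hi covers the more significant bit group, lo the less significant one.
    """
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return (g_hi | (p_hi & g_lo), p_hi & p_lo)


def brent_kung_add(a, b, n=16):
    """Add two n-bit numbers with a textbook Brent-Kung tree (carry-in = 0).

    Returns (sum, carry_out, number_of_dot_product_stages).
    n is assumed to be a power of two.
    """
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]   # bit propagate
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]   # bit generate
    x = list(zip(g, p))
    stages = 0

    # Up-sweep: build group signals over blocks of 2, 4, 8, ... bits.
    d = 1
    while d < n:
        for i in range(2 * d - 1, n, 2 * d):
            x[i] = dot(x[i], x[i - d])
        stages += 1
        d *= 2

    # Down-sweep: fill in the remaining prefixes (e.g. bit 11 from 7, 13 from 11).
    d = n // 4
    while d >= 1:
        for i in range(3 * d - 1, n, 2 * d):
            x[i] = dot(x[i], x[i - d])
        stages += 1
        d //= 2

    carries = [0] + [gp[0] for gp in x]   # carries[i + 1] = group generate of bits 0..i
    total = sum((p[i] ^ carries[i]) << i for i in range(n))
    return total, carries[n], stages


print(brent_kung_add(0xFFFF, 0x0001))   # (0, 1, 7)
```

For n = 16 the up-sweep contributes 4 stages and the down-sweep 3, which is the 7-stage path that the modification shortens.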

|Final size for the critical path after the optimization process: |
|s1 |s2 |s3 |s4 |s5 |
|1.00 |3.07 |5.29 |5.45 |10.24 |
|s6 |s7 |s8 |s9 |s10 |
|5.87 |9.99 |6.90 |14.83 |5.50 |
|s11 |s12 |s13 |s14 |s15 |
|11.50 |6.50 |14.50 |7.50 |16.37 |
|s16 |s17 |s18 | | |
|11.9 |14.9 |28.7 | | |

The original sizes for the critical path were obtained from the formulas above. We used these sizes as a starting point to optimize the propagation delay of the critical path. Decreasing the resistances results in a better propagation delay, so we decreased the resistances by increasing the size of the transistors along the critical path. As a result, the propagation delay improved. However, there is a limit to increasing the size of the transistors. Past a certain point, the intrinsic capacitance of the transistors dominates the load capacitance, which has a negative impact on the delay. Besides the critical path, we also optimized the other paths to prevent any of them from becoming the new critical path.
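The limit on upsizing can be illustrated with a toy logical effort model. The Python sketch below is not the adder's critical path: it assumes a unit inverter driving a second inverter of size s, which drives a fixed load, with the load value of 64 unit input capacitances and the parasitic delay of 1 chosen purely for illustration. Sweeping s shows the delay falling and then rising again once the larger stage loads its driver too heavily.

```python
def two_stage_delay(s, load, parasitic=1.0):
    """Normalized delay of a unit inverter driving an inverter of size s.

    Each stage contributes parasitic + electrical effort (inverter logical effort = 1).
    `load` is expressed in units of the unit inverter's input capacitance.
    """
    first_stage = parasitic + s          # unit inverter drives an input cap of s
    second_stage = parasitic + load / s  # size-s inverter drives the fixed load
    return first_stage + second_stage


LOAD = 64.0                                       # hypothetical load
sizes = [1.0 + 0.5 * k for k in range(127)]       # sweep s from 1.0 to 64.0
best = min(sizes, key=lambda s: two_stage_delay(s, LOAD))

print(best, two_stage_delay(best, LOAD))          # 8.0 18.0
print(two_stage_delay(32.0, LOAD))                # 36.0, larger is not faster
```

In this model the best size is the square root of the load, and sizing beyond it increases the total delay.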

Transistor Diagrams and Sizing

|XOR |E |E' |F |F' |AND |P |P' |
|NMOS |720 |720 |720 |720 |NMOS |720 |720 |
|PMOS |1440 |1440 |1440 |1440 |PMOS |720 |720 |
|DOT Product |P |G |G' |Co |Ci |G |P |
|NMOS |720 |360 |720 |NMOS |720 |360 |720 |
|PMOS |1440 |1440 |1440 |PMOS |1440 |1440 |1440 |
Figure 3: Gate sizes
For our original transistor sizing, since the gates are built in static CMOS logic, we sized each gate to have the same driving capability as a minimum-sized inverter with a 2-to-1 ratio of PMOS to NMOS width. For the transistors on the critical path, we simply multiplied every transistor's width by the sizing factor determined using the logical effort approach. These sizing factors were given on the previous page.

Timing and Energy Simulations

Test Vectors:
Figure 4: Test Vectors to verify the functionality of the adder
From Figure 4, we can see that the adder functions correctly. The glitches visible in the graph are due to switching from 0 to 1.

Critical Delay for Different Test Vectors
Figure 5: Propagation Delays for Different Test Cases
Figure 5 above shows that a second set of test vectors also produces the worst propagation delay. Therefore, two sets of test vectors produce the same worst-case delay:
* A = 0xffff & B = 0x1
* A = 0x0101010101010101 & B = 0x1010101010101010




Static CMOS Co

Figure 2

The test input is A = 0xffff, B = 0x1. The worst delay is 1.33 ns, which matches what we claimed on the previous page. After testing different sets of inputs, this vector proved to be the worst case.

Since the period of the input pulse B is 80 ns, the energy was calculated over the entire period.
Hspice Deck:
* energy = supply current integrated over one 80 ns period, times the 2.5 V supply
.measure tran max integ i(vdd) from=0n to=80n
.measure tran maxcurrent max i(vdd)
.measure energy param='max*2.5'
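The same energy calculation can be sketched outside the simulator. The Python snippet below assumes the supply current is available as sampled points and integrates it with the trapezoidal rule. The function name `supply_energy` and the flat 0.34 mA waveform are illustrative assumptions, not simulation output; 0.34 mA is simply the average current that corresponds to 68 pJ at 2.5 V over 80 ns.

```python
VDD = 2.5  # supply voltage in volts, the factor used in the deck above


def supply_energy(times, currents, vdd=VDD):
    """Energy in joules: vdd times the trapezoidal integral of current over time."""
    charge = 0.0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        charge += 0.5 * (currents[k] + currents[k - 1]) * dt
    return vdd * charge


# Hypothetical waveform: a constant 0.34 mA drawn over the 80 ns input period.
times = [0.0, 40e-9, 80e-9]
currents = [0.34e-3, 0.34e-3, 0.34e-3]

energy = supply_energy(times, currents)
print(f"{energy * 1e12:.1f} pJ")   # 68.0 pJ
```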

Loading estimate:
Rwire = Rsq*L/W = 0.075*1.8mm/4λ = 281.2 Ohm
Cin = 24.7fF (from the sum of all the gate input capacitances of the very first stage)
The adder drives a 1.8mm long bus with 6 loads evenly distributed. Each capacitive load is equal to the adder input capacitance.
CL = 6Cin + Cwire = 195.9fF
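The arithmetic behind the loading estimate can be checked with a few lines of Python. This sketch assumes λ = 0.12 μm (half of the 0.24 μm length used in the area estimate) and a sheet resistance of 0.075 Ohm per square; these are the values that reproduce the reported 281.2 Ohm, and they are assumptions rather than figures stated explicitly in the report. The wire capacitance is not given directly, so it is backed out from the reported CL.

```python
LAMBDA_UM = 0.12          # assumed lambda in micrometres
R_SHEET = 0.075           # assumed sheet resistance in ohms per square
WIRE_LENGTH_UM = 1800.0   # 1.8 mm bus
WIRE_WIDTH_UM = 4 * LAMBDA_UM

C_IN_FF = 24.7            # adder input capacitance from the report
N_LOADS = 6               # loads distributed along the bus
C_LOAD_FF = 195.9         # total load reported

r_wire = R_SHEET * WIRE_LENGTH_UM / WIRE_WIDTH_UM   # squares of wire times ohms per square
c_gate_loads_ff = N_LOADS * C_IN_FF                 # 6 * Cin
c_wire_ff = C_LOAD_FF - c_gate_loads_ff             # implied wire capacitance

print(f"Rwire = {r_wire:.2f} Ohm")
print(f"6*Cin = {c_gate_loads_ff:.1f} fF, implied Cwire = {c_wire_ff:.1f} fF")
```

Under these assumptions the six gate loads account for 148.2 fF, leaving about 47.7 fF for the wire itself.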

Results from Hspice: tp = 1.33ns
Energy = 68pJ

Layout of Dot Product

Layout of Co generator block

Layout of Inv

Layout of AND gate

Layout of XOR gate


Remark: We designed schematics for both the original Brent-Kung adder and the modified one and tested them with Hspice. As expected, the modified adder was indeed faster than the original Brent-Kung.

Layout of 16-bit Adder


and, xor, dot product

Estimated Area:
A = (#NMOS*0.36 + #PMOS*0.72)*L*(total width along the critical path)
= (0.36*546 + 0.72*546)*0.24*180 ≈ 25380 μm²

