Hi Weng,
Weng Tianxiang said:
> NAND comprises 2 P-type pass gates serial coupled to VC and 2 N-type
> pass gate parallel coupled to ground and control input of each P-type
> pass gate and N-type pass gate is coupled to one of input.
>
> One 2-input multiplexer has a P-type pass gate and a N-type pass gate.
> Their select input terminal are tied together and their 2 input
> terminal are coupled to 2 inputs and their outputs are connected
> together.
>
> Which one is faster? Can you give some data on them. I don't have ASIC
> experiences and don't have its related time estimate software
> experiences. I want someone to provide some tips on its speed level.

It's been a long time since I did full-custom layout and SPICE
simulations on the layout, so I can't give you any hard data. But the
delay of a NAND2 gate in a commercial 130nm process is less than 100 ps
(depending on output drive etc.).
In the standard 2-input NAND implementation each input will see a
p-mos and an n-mos gate, and the output drive is directly from
VDD/VSS.
Time for some ASCII art (please use a fixed-pitch font when viewing this):
           VDD
      -------------
        |      |
       |-     |-
    A-o|   B-o|
       |-     |-
        |      |
        +------+--- Y
        |
       |-
    A--|
       |-
        |
       |-
    B--|
       |-
        |
      -------------
           VSS
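
A switch-level reading of the schematic above, as a minimal sketch: each PMOS is treated as a switch that closes when its gate is 0, and each NMOS as one that closes when its gate is 1. The function name `cmos_nand2` and the 0/1 encoding are made up for illustration. The model says nothing about timing.

```python
from itertools import product

def cmos_nand2(a: int, b: int) -> int:
    """Switch-level model of the static CMOS NAND2 drawn above (illustrative)."""
    # Two PMOS in parallel between VDD and Y: either one closing pulls Y high.
    pull_up = (a == 0) or (b == 0)
    # Two NMOS in series between Y and VSS: both must close to pull Y low.
    pull_down = (a == 1) and (b == 1)
    # Exactly one network conducts, so Y is always driven straight from a rail.
    assert pull_up != pull_down
    return 1 if pull_up else 0

# Print the truth table as "A B Y".
for a, b in product((0, 1), repeat=2):
    print(a, b, cmos_nand2(a, b))
```

The internal assertion is the point of the sketch: for every input combination exactly one of the two networks conducts, which is why the output drive always comes directly from VDD or VSS.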
Pass-gate/mux implementation:
           B
           |
           o
          ---
          | |
    VSS---   -----+
          | |     |
          ---     |
           |      |
           +--!B  +---Y
           |      |
           o      |
          ---     |
          | |     |
      A---   -----+
          | |
          ---
           |
           |
           B
(please bear with my laziness - !B is input B inverted)
The first implementation requires 4 transistors and is quite symmetric
wrt rise/fall time and delay from input to output.
The second will have an inferior drive to VDD/VSS when B=1, since Y is
then driven through the pass gate by whatever drives A, and a good
drive to VSS when B=0. Input-to-output delay will obviously be
different for the two inputs, and it's going to take up more area (6
transistors, counting the inverter for !B).
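
The drive argument can be made concrete with a small switch-level sketch of the pass-gate circuit. It assumes the reading of the drawing implied by the text: the upper transmission gate passes VSS when B=0 and the lower one passes A when B=1. The names `passgate_mux` and `TRANSISTORS` are hypothetical. Read this way, the output follows A AND B, i.e. the complement of the NAND2; that is an observation about the sketch, not something stated in the post.

```python
from itertools import product

def passgate_mux(a: int, b: int):
    """Switch-level reading of the pass-gate sketch (illustrative names).

    Returns (y, source), where source names what actually drives Y.
    """
    not_b = 1 - b  # output of the inverter that generates !B
    # Upper transmission gate: PMOS gated by B, NMOS gated by !B -> on when B=0.
    upper_on = (b == 0) and (not_b == 1)
    # Lower transmission gate: PMOS gated by !B, NMOS gated by B -> on when B=1.
    lower_on = (not_b == 0) and (b == 1)
    # The two gates are never on at the same time, so Y has a single driver.
    assert upper_on != lower_on
    if upper_on:
        return 0, "VSS through the upper pass gate"
    return a, "driver of A through the lower pass gate"

# Two transmission gates (2 transistors each) plus the !B inverter.
TRANSISTORS = 2 + 2 + 2

# Print "A B Y source" for every input combination.
for a, b in product((0, 1), repeat=2):
    y, source = passgate_mux(a, b)
    print(a, b, y, source)
```

Whenever B=1 the output is only as strong as whatever drives A, with the pass gate's resistance in series. The 4-transistor NAND2 never has that dependency.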
I seriously doubt that the second implementation can be better than
the first implementation on any parameter (area, timing, power) -
assuming they are designed for driving the same output load, and
having /reasonably/ identical delay timing from A and B to Y.
Using pass-gates to implement complex logic can be very efficient
when you are designing a well-controlled full-custom block (e.g.
ALU/MUL/RAM/CAM blocks, where the signal will be 'refreshed' before it
is sent outside the block), but for fundamental gates like a NAND2
gate I would say that it's not their day anymore.
For 180nm and below, most of the delay is not in the gates themselves,
but normally in the wires between the gates. Hence, you need a good
clean drive out of your gate onto the next, in order to get the best
timing.
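
To see why drive strength matters once the wire dominates, here is a minimal first-order sketch using the Elmore delay of a driver, a distributed RC wire and a lumped load. The function name and the numbers in the usage lines are made up and normalized; they are not data for any real process.

```python
def elmore_delay(r_drive: float, r_wire: float, c_wire: float, c_load: float) -> float:
    """First-order (Elmore) delay of driver + distributed RC wire + load.

    Units are arbitrary but must be consistent (e.g. ohms and farads).
    """
    # The driver resistance has to charge all of the wire and load capacitance.
    driver_term = r_drive * (c_wire + c_load)
    # A distributed wire sees half its own capacitance through its
    # resistance, plus the full load sitting at the far end.
    wire_term = r_wire * (c_wire / 2 + c_load)
    return driver_term + wire_term

# Normalized, made-up values: the wire capacitance is 4x the load capacitance.
strong = elmore_delay(r_drive=1.0, r_wire=1.0, c_wire=4.0, c_load=1.0)
weak = elmore_delay(r_drive=2.0, r_wire=1.0, c_wire=4.0, c_load=1.0)
print(strong, weak)
```

In this model the driver resistance multiplies the whole wire capacitance, so anything that adds series resistance at the source, such as a pass gate between the real driver and the wire, costs delay in proportion to the wire it has to charge.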
I hope this has answered your questions.
Kai