# Tools and Devices Supporting the Pseudo-Exhaustive Test

Sybille Hellebrand, Hans-Joachim Wunderlich

Institute of Computer Design and Fault-Tolerance

(Prof. Dr. D. Schmid) University of Karlsruhe P. O. Box 6980 D-7500 Karlsruhe Federal Republic of Germany

## Abstract:

In this paper logical cells and algorithms are presented supporting the design of pseudo-exhaustively testable circuits. The approach is based on real hardware segmentation, instead of path-sensitizing. The developed cells segment the entire circuits into exhaustively testable parts, and the presented algorithms place these cells, under the objective to minimize the hardware overhead.

The approach is completely compatible with the usual LSSDrules. The analysis of the well-known benchmark circuits shows only little additional hardware costs.

*Keywords:* Pseudo-exhaustive test, automatic design for testability.

### 1. Introduction

The pseudo-exhaustive test, sometimes called verification test, retains almost all benefits of an exhaustive test, whereas the complexity is reduced in many cases [McBo81].

For a primary output o of a combinational circuit, the cone  $C_0$  is the minimal subcircuit containing all nodes, which are on a path to o (fig. 1).



the number of primary outputs. This test strategy is also applicable to sequential circuits, if a scan path is integrated. An obvious advantage is the high fault coverage, within a cone all combinationally faulty functions are detected. A faulty sequential behavior induced by stuck-open faults can be detected by applying special pattern sequences as described in [WuHe88]. The high fault coverage is obtained without an expensive test pattern generation, optionally the number of necessary patterns can be reduced by some compaction techniques [McCl84a], [Aker85]. Unfortunately a pseudo-exhaustive test is not applicable, if a primary output depends on a very large number of primary inputs. This problem is solved by circuit segmentation, where essentially two techniques are proposed: Sensitizing techniques assign fixed values to some primary inputs, in order to sensitize paths to some regions of the circuit which are exhaustively tested [Udel86], [McBo81]. Hardware segmentation cuts some circuit lines logically during a test mode by some additional circuitry [McBo81].

In the presented approach, the hardware segmentation is supported. A uniform technique is presented integrating the scan design and the segmentation of the combinational network, in a similar way as proposed in [Bhat86]. The authors of this paper propose to integrate LSSD-latches into the combinational network, such that the circuit function is not altered. But the system operation of the circuit is slowed down, and they have to integrate additional latches only for timing reasons. This leads to a considerable hardware-overhead.

In the presented approach, we use special segmentation cells, which are compatible with the usual LSSD design styles, as e.g. the L1/L2\*-technique, and which do not alter the system mode of the circuit in any way. Benefits are a less complex partitioning problem, lower hardware-overhead, and minimal drawbacks concerning the circuit speed. The cells are placed into the combinational network and can be used as pseudo-primary inputs or outputs. In order to minimize the hardware overhead, several cell placement algorithms based on hill-climbing and simulated annealing techniques have been implemented.

In the second section of the paper, we discuss the classical hardware segmentation techniques, sketch some rules of the LSSDtechnique and state some requirements the segmentation cells have to meet. The new cells are presented in more detail in section 3. Section 4 gives a graph-theoretic formulation of the cell placement problem. Since this problem is np-complete, we use some heuristics to obtain suboptimal solutions by hill-climbing algorithms. The algorithms and results are discussed in section 5. In order to validate the quality of the obtained results a simulated annealing procedure has been implemented additionally.

## 2. Classical Segmentation Techniques

As the pseudo-exhaustive test in the classical sense is only applicable to combinational networks, a scan design is mandatory. The classical LSSD-approach puts some restrictions on the network, which are sketched in this section [EiWi77]. Finally, we investigate some drawbacks of the classical multiplexer partitioning technique to these LSSD-requirements.

### 2.1. LSSD-techniques

A detailed description of LSSD-techniques is found in survey papers [McCl84b] or textbooks [Fuji85], [McCl86]. Therefore we only discuss the principles, which are of importance to the segmentation cells proposed in section 3. For the sake of simplicity, we only describe the 2-phase clock technique. But the presented approach is compatible with a general LSSD-architecture. The basic storage device is a hazard-free polarity-hold latch, which may be implemented by transmission gates as shown in figure 2.



Figure 2: Implementation of a polarity-hold latch.

This latch has the data input D clocked by system-clock CLK, and the shift data input SDI clocked by A.

Design rules have to guarantee a stable behavior of the circuit function. These rules are fulfilled if for example the circuit is organized in a so-called L1/L2\*-configuration. Two non-overlapping system-clocks CLK1 and CLK2 control two disjoint sets of latches. The latches controlled by CLK1 are called L1-, and the latches controlled by CLK2 are called L2\*-latches. The combinational network is partitioned into two disjoint sets N1 and N2, such that the L1-latches receive their data from the N1network, but they do not give any data to N1. Analogously, the L2\*-latches receive their data from N2, but do not give data to N2. Figure 3 shows the structure of a circuit designed in L1/L2\*-technique.



Figure 3: L1/L2\*-technique.

During test mode the latches are combined to shift register latches (SRLs) as shown in figure 4. All SRLs are interconnected into a scan path, which is controlled by the two nonoverlapping shift-clocks A and B.



Figure 4: Shift register latch within an L1/L2\*-configuration.

There are programs reported for checking a partitioning of the circuit according to these rules, and supporting the partitioning [GoFB77]. The rules are easily extended according to schemes of multiple clocks, and any additional "design for testability"-features have to be compatible with these LSSD-rules.

### 2.2. Multiplexer partitioning

One of the first papers describing the design of pseudo-exhaustively testable circuits proposed multiplexer partitioning [McBo81]. Figure 5 shows a partitioning into two parts by multiplexers.



Figure 5: Multiplexer partitioning.

During system operation, the 1-inputs of the multiplexers are sensitized, and the subcircuits G1 and G2 are connected. During the test of G2, at M1 and M3 the 0-inputs are sensitized, and at M2 and M4 the 1-inputs are activated. The inverse controls test subcircuit G1. This original technique has some disadvantages:

- 1) There is a large overhead due to wiring, since there are multiple control lines for the multiplexers, and additional lines are necessary to control and observe the node itself.
- 2) For every cut node, there are two multiplexers within the data path.
- 3) If several segments have to be tested, a complex control of the multiplexers is required. Moreover the segments can not be tested in parallel, and the multiplexers are not tested exhaustively.
- Some layout-dependent faults are not tested. These faults can only be detected during the system mode.
- 5) The inputs A, B, and the outputs C, D are latches in the LSSD design, where new dependences are introduced. This may destroy a given partitioning according to the LSSD rules.

#### 3. Segmentation cells

The concept of segmentation cells integrates network partitioning and scan design. The cells are placed into the combinational network, where during system operation they are a simple connecting line, but during the test mode they are a latch within the scan path (fig. 6).



Figure 6: Integrating segmentation and scan design.

The additional test circuitry should fulfill the following requirements:

a) All faults within the segmentation cells are detectable during the pseudo-exhaustive test.

- b) The additional silicon area is minimized.
- c) There are no serious impacts on the system behavior of the circuit.

These requirements are met by the segmentation cell of fig. 7 to a wide extent. Essentially the cell is a L1-latch, and two of them form a  $L1/L2^*$ -configuration. They may be part of a usual scan path.



Figure 7: Segmentation cell.

Signal S determines system and test modes. For S = 1, data-input D and output Q are directly connected. During the test mode, if S = 0, the cell works like a usual latch within the scan path. The cell satisfies the segmentation function like the multiplexer solution, but avoids the mentioned disadvantages:

- 1) *Wiring:* The segmentation cells are members of the scan path, and they can be wired by a suitable program, automatically after the design, regardless of their order. This very high degree of freedom minimizes the overhead (see [AgJS84]).
- 2) **Delay:** Only the transmission gate T1 is member of the data path, in contrast to the multiplexer approach.
- 3) *Control:* For segmentation, only S := 0 is set, and all cells are controlled by the same signal.
- 4) *Fault coverage:* Every part of the data path D-Q is used during the test mode, and a pseudo-exhaustive test is an exhaustive test of most parts of the segmentation cell. The system signal S is easily tested, since CLK and S are symmetric and directly accessible.
- 5) LSSD-compatibility: The placement of the segmentation cells does not introduce any new dependences of state variables within the original circuit. For this reason a LSSDpartitioning of a circuit remains valid after segmentation. But additionally the segmentation cells have to be partitioned according to the system clocks, too. In few cases, this may require a double-latch configuration of the segmentation cells.



Let C be a combinational circuit, and let  $\in \mathbb{N}$  made a minimal set of some more ion cells, such that every node depends on at most prime or pseudo-primary inputs.

This problem can regarded as a graph-theoretic problem. In the following a combinational circuit is represented by an acyclic directed graph G = (V,E) with vertices V and edges  $E \subset V^2$ . The vertices of G correspond to the nodes of the circuit, and there is an edge  $(v,w) \in E$ , if and only if node v is input of a gate with output w. If a segmentation cell is built in a circuit node, this can be modeled as a "cut" of the corresponding vertex in G. Figure 8 shows a circuit with segmentation cells and the corresponding circuit graph.



Figure 8: Circuit with segmentation cells and its representation as directed graph.

The cut of a vertex  $v \in V$  transforms G into a graph  $G_{\{v\}}$ ; v is replaced by two new vertices  $v_o$  and  $v_i$ , such that  $v_o$  represents an additional output and  $v_i$  an additional input of the circuit. As illustrated in figure 9,  $v_o$  has no successors, and its predecessors are the predecessors of v, whereas  $v_i$  has no predecessors and its successors are the successors of v.



Figure 9: Cut of a vertex  $v \in V$ .

One easily verifies that for  $v, w \in V$  the cuts are independent of the cuts are inde dent of their order. Thus for a set  $W:=\{w_1,\ldots,w_m\}{\subset} V$ the cut of G along W as  $G_W := (V_W, E_W) := G_{\{W\}}$ For a vertex  $v \in V$  the natural numb primary inputs which are conne this we can state our segmentatio **Problem OCS** (Optimal Circui Let G := (V, E) be an acyclic dire gra set  $W \subset V$  of size  $k \leq |V|$  such all v (V<sub>W</sub>,E<sub>W</sub>) have dependence level mo For any cut along W,  $d(v) \leq$ checked in nearly in linear time. Generating and checking all cuts would take exponential effort. Unfortunately we can spect an algorithm of a better worst case complexity:

### *Theorem 1:* OCS is np-complete for

A complex proof of this theorem is found in [Bhat86], a shorter reduction in [Hell90, Wu90].

Since OCS is np-complete, we refrain from looking for minimal solutions, but present some efficient heuristics in the next section.

### 5. Hill-Climbing procedures for OCS



Z∈ d(v) denote th For OCS the set states is fv $\in$  V be th determines a cut. e admissible states are  $\{x \in p(fv) | |sd(x)|$  $\forall v \in V_Z : d(v) \leq in$  The cost function k:  $\{y \in p(fv) | (d(y) >$ |Z|, corresponds to the necessary number of segmenta In addition to that we d e a heuristic function h: minimal length, such that evaluate states:  $\forall \ i{<}k{:} (|sd(\omega_i)|=1) \land (v^* \in X \cup pd(fv) \cup \bigcup_{y \in Y} pd(y)).$  $h(Z) := \sum \ln(d($  $\{v \in V_Z \mid d(v) > v \in$ in v\* is called the push-forward of v. the number of vertices which assume an enumeration A global ootained by th procedure glob( mal ., too. Left to the reader. is the se of all ordered tupels instead of sets. e T<sub>G</sub>, where equent stat root  $Z_0$  is the empty tupel, and the direct successors of  $Z \in$  $sd(Z) := \{ <\!\!v_{i_1}, \ldots, \!v_{i_k}, \!v_{i_{k+1}}\!\!> \mid Z = <\!\!v_{i_1}, \ldots, \!v_{i_k}\!\!>, i_k < \!i_{k+1} \le n \}.$ another state U according to the following rule: We have to look for an admissible state having shortest distance

to the root. Of course  $W := \langle v_1, ..., v_n \rangle$  is admissible, and we set **current** := n. We carry out a depth-first search within this tree, but never go deeper than current. If we find a new admissible state at depth k $\leq$ n, we update **current** := k.

## end-glob.

The

We d

b)

The procedure glob provides all admissible states of minimal costs, but it has an exponential worst case complexity. More efficiently problems of CO can be dealt with by hill-climbi procedures [Rich83] <u>A hill-climbing procedure for OCS ba</u> on a "divide-and-cor principle can be built using the fo wing concepts.

vell-known benchma Definition 1: I .E) be a ci p(v) deno DC CMP DC predecesso is the sograph ( 27 20 56 8 20 0 tion 19 34 14 8 20 9 18 37 51 21  $\min\{j \mid d(v_i)\}$ 108 45 87 138 63 96 nstruc search graph 52 42 93 65 115 c6288 (Vne cuts, and an edge in)∈ exis: 0.0 79 a) fv  $\in$  V\Z<sub>1</sub> is the first viola , and Table lls by DC n lution for OCS of the  $W \subset p(fv)$  is an optimal

C(fv) in  $G_{Z_1}$  and  $Z_2$ The search is started at Z = Ø, an<u>d **us**til</u> an admissible state Zis th  $(Z,Z_1) \in$ reached, one branches from Z to a  $z_1$  v and L  $= \min\{ h(\tilde{Z}) \mid (Z,\tilde{Z}) \in$ 

Searching  $W \subset p(fv)$  in b) is done by the procedure glob. Let pd(v) be the set of direct successors of a vertex  $v \in V$ . Then W := pd(fv) and **current** := |pd(fv)| form a (sub-) optimal solution. Therefore the search space can be reduced drastically.

A further reduction of the search space is possible if not all optimal solutions for OCS in C(fv) are taken into account.

Let G := (V ) an acyclic directed graph, set of direct successors for  $v \in$ t viola<u>tio</u>n in G, and  $v \in p(fv)$ .

 $-|p_{m(x)}|+1) \wedge (|pd(y)| > 1)\}.$ 

e is a uniquely determined path (v =  $\omega_0$ , ...,  $\omega_k$  = :v\*) of

**Theorem 2:** Let  $fv \in V$  be the first violation in G and W :=  $w_1,...,w_{\blacksquare}$  ⊂ pf(fv) an optimal solution of OCS in C(fv). Then the busic forward W\* :=  $\{w_1^*, \dots, w_m^*\}$  of W is an opti-

hill-climbing procedure is augmented by backand by an additional rule. If a certain number of subansitions did not reduce the heuristic h, one reearlier state Z and selects another successor of Z. If this back-tracking also fails in an improvement, one jumps to

Let  $Z \subset V$  define a cut  $G_Z := (V_Z, E_Z)$ , let  $f_V \in V_Z$  be the first violation in G<sub>Z</sub>. Let  $v \in pd(fv)$  be selected such that d(v) =max d(w). Jump to the state  $\underline{U} := Z \cup \{v\}$ .

ill-c

pd(fv

 $w \in pd(fv)$ 

This defines the complete worst-case complexity of O algorithm named DC (divide in PASCAL under the UN comparison, an algorithm has been implemented, to of segmentation cells for the

nbing proce ving a  $\cdot O(|V|^2)$ . Th itation segm nd onquer) has bee im temented peration system. reasons of MP based on the p er [RoLa84] able 1 shows the ired number circui CMP



- a) fv  $\in V \setminus Z_1$  is the first violation in  $G_{Z_1}$ , and
- b)  $Z_2 := Z_1 \cup \{v^*\}$ , where  $v \in p(fv)$  is a node of  $V_{Z_1}$  with minimal  $h(Z_1 \cup \{v\})$ .

This proceeding uses the fact, that surely one of the predecessors of the first violation must be cut, and this may be a

| pushed-forward vertex. This restricts the search back and a vertex is cut minimize the global heuristic. T in Igorithm called GL has a worst is complexity of $O( pd(f   ) \cdot O( \nabla ^2))$ . Table 2 shows for GL is a for DC the necessary is ber of cells and the computing time reasure ion a SUN 3/280 imputer |                                                                                                                                |                                                                                                                   |                                                      |                                                      |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------|------------------------------------------------------|------------------------------------------------------|
|                                                                                                                                                                                                                                                                                                                          | =1                                                                                                                             |                                                                                                                   | = 1                                                  |                                                      |
| circuit                                                                                                                                                                                                                                                                                                                  | DC                                                                                                                             | GL                                                                                                                | DC                                                   | GL                                                   |
| c432<br>c499<br>c880<br>c1355<br>c1908<br>c2670<br>c3540<br>c5315<br>c6288<br>c7552                                                                                                                                                                                                                                      | 27 (4013)<br>8 (104)<br>19 (8)<br>8 (7455)<br>21 (6326)<br>45 (55267)<br>87 (22438)<br>52 (19637)<br>93 (15390)<br>100 (74316) | 27 (25)<br>8 (10)<br>16 (14)<br>8 (85)<br>22 (140)<br>33 (123)<br>90 (858)<br>62 (411)<br>98 (4647)<br>117 (1667) | $\begin{array}{cccccccccccccccccccccccccccccccccccc$ | $\begin{array}{cccccccccccccccccccccccccccccccccccc$ |

n cells and computing time (in hill-climbing procedures DC Table 2: Number of segme brackets) in sec or th and GL.

An analysis of table 2 sho c5315 the faster global he For the c2670-circuit even Table 3 gives an idea of t mentation. Column 1 der

is that only for the circuits c7552 and ic results in a distinct deterioration. ar better results are obtained.

hardw e-overhead required for segtes the necessary minimal number of segmentation cells for = **Equation** = **Eq** terms of primary inputs, and the last column is the percentage of nodes to be cut.



A common provide n of hill-climbing procedures is the occurence of local minimater apart from the global optimum. Such local, suboptimal solutions have been reached by GL at circuits c6288

and c7552 for = 2 **...** herefore a simulated annealing algorithm has been implemented in order to validate the obtained solutions. It tries to make use of the specific problem structure as far as possible while preserving the necessary conditions for convergence. A detailed description is found in [HeWu88].

Applying simulated annealing to the benchmark circuits we obtained the following results. Starting with an initial state of relatively high costs the algorithm came to results close to those of DC, but in no case there was an improvement of the best results received by DC or GL respectively. For the c880-circuit for example simulated annealing was started with the result produced by CMP with costs 34. The final state of the annealing procedure had costs 22, whereas DC resulted in costs 19. A detailed analysis of the annealing process for this circuit showed that for small values of c still about 1% of the generated transitions where accepted. But although transitions increasing the cost function were no longer accepted, a further improvement of the result could not be achieved.

The behavior of the annealing algorithm validates the fact, that the results obtained by the hill-climbing algorithm are very close to a global optimal solution. On the other hand experiments with simulated annealing have shown, that good tailored heuristics often are superior to simulated annealing methods [NaSS86] This statement seems to be confirmed by our experiments with the OCS problem.

### Conclusions

Hardware and software problems have been solved in order to implement a pseudo-exhaustive test. Segmentation cells have been developed compatible to the usual LSSD-rules. Efficient segmentation algorithms have been proposed, resulting in a minimal number of nodes to be cut. Combining the hardware and the algorithms, a pseudo-exhaustive test can be implemented automatically at low additional cost.

#### References

- Agraval, V. D.; Jain, S. K.; Singer, D. M.: Automation in AgJS84 Design for Testability, Proc. IEEE Custom Integrated Circuits Conference, pp. 159 - 163, 1984
- Akers, S. B.: On the Use of Linear Sums in Exhaustive Testing, Proc. 15th International Symposium on Fault-Tolerant Aker85 Computing, 1985
- Bhatt, S. N.; Chung, F. R. K.; Rosenberg, A. L.: Partitioning Bhat86 Circuits for Improved Testability, Advanced Research in VLSI: proc. 4-th MIT conference, pp. 91 -106, April 7 - 9, 1986
- EiWi77 Eichelberger, E.B.; Williams, T.W.: A logic design structure for LSI testability, Proc. 14th Design Automation Conference, pp. 462-468. June1977
- Fujiwara, H.: Logic Testing and Design for Testability, MIT Fuji85 Press, 1985
- GoFB77 Godoy, H. C.; Franklin, G. B.; Bottroff, P. S.: Automatic checking of logic design structure for compliance with testability groundrules, Proc. 14th Design Automation Conference, June 1977, pp.469 - 478
- D'I', pp. 476 Hellebrand, S: Synthese vollständig testbarer Schaltungen, Ph. D. Thesis, University of Karlsruhe, 1990 Hell90
- Hellebrand, S.; Wunderlich, H.-J.: Automatisierung des Entwurfs HeWu88 vollständig testbarer Schaltungen, Proc.18. Jahrestagung der Gesellschaft für Informatik, Hamburg 1988, Informatik-Fachberichte 188, Springer-Verlag
- McCluskey, E. J.; Bozorgui-Nesbat, S.: Design for Autonomous McBo81 Test, IEEE Transactions on Circuits and Systems, Vol. Cas-28, No. 11, November 1981
- McCl84a McCluskey, E. J.: Verification Testing - A Pseudoexhaustive Test Technique, IEEE Transactions on Computers, Vol. c-33, No.6, June 1984
- McCl84b McCluskey, E.J.: A Survey of Design for Testability Scan Techniques, VLSI Design, Dec. 1984
- McCluskey, E.J.: Logic Design Principles: With Emphasis on McCl86 Testable Semicustom Circuits, Prentice-Hall, 1986
- NaSS86 Nahar, S.; Sahni, S.; Shragowitz, E.: Simulated Annealing and Combinatorial Optimization, Proc. 23rd Design Automation Conference, 1986
- Rich83 Rich, E.: Artificial Intelligence, McGraw-Hill International Editions, 1983
- Roberts, M. W.; Lala, M. Sc.: An Algorithm for the Parti-RoLa84 tioning of logic circuits, IEE Proceedings, Vol. 131, Pt.E., No.4, July 1984
- Udell, J.G., Jr.: Test Set Generation for Pseudo-Exhaustive BIST, Udel86 Proc. International Conference on Computer Aided Design, 1986
- Wu90 Wunderlich, H.-J.: Rechnergestützte Verfahren für den prüfgerechten Entwurf hochintegrierter Schaltungen, University of Karlsruhe, 1990
- Wunderlich, H.-J.; Hellebrand, S.: Generating Pattern Sequences WuHe88 for the Pseudo-Exhaustive Test of MOS-Circuits, Proc. 18th International Symposium on Fault-Tolerant Computing, Tokyo 1988