# 3D IC Design-for-Test with Implementation of the IEEE 1149.1-2013, the IEEE 1500-2005, and the IEEE 1838-2019 Standard

Konstantinos Leventos, Department of Computer Science and Engineering, University of Ioannina, Ioannina, Greece, k.leventos@uoi.gr

Chrysovalantis Kavousianos, Department of Computer Science and Engineering, University of Ioannina, Ioannina, Greece, kabousia@cse.uoi.gr

Abstract: The paper presents a design for a testable 3D integrated circuit (IC), and more specifically the Design-for-Test (DfT) method of its stacked layers. It describes how the design implements the IEEE 1149.1-2013 Standard, the IEEE 1500-2005 Standard, and the IEEE 1838-2019 Standard. Finally, it presents the instructions that allow for the combined implementation of these Standards, and an analysis of the additional footprint the design requires.

Keywords: 3D IC Testing, IEEE Standards, DfT scheme

## I. INTRODUCTION

To maximize the benefit of circuit integration, 3D ICs and their microscale Through-Silicon Vias (TSVs) were introduced as a new inter-die connection. Even though TSVs suffer from both area and electrical coupling overhead, they are a key technological advancement for die connectivity. Testing is of paramount importance when it comes to TSVs, as their high integration density and their manufacturing process make them especially vulnerable to various defects. Therefore, effective defect screening and quality assurance are not only necessary but also a prerequisite for 3D ICs.

The emergence of 3D stacking technology offers high functionality at a reduced die footprint by enabling the integration of multiple silicon dies in a vertical stack. Separately manufactured dies are integrated into the same package, and TSVs are used to connect dies to each other [1]. 3D stacking removes the scalability barriers of nanometre technologies by offering reduced wire length, reduced interconnect delays, lower power consumption, higher interconnect bandwidth, and true heterogeneous integration [2, 3].

TSVs constitute a key technological advantage for die connectivity but come at a cost: the significant area of silicon that they occupy. The bonding process of the dies for the stack requires alignment of the dies with a precision of $0.5\,\mu$m, thus imposing strict limits on the minimum allowed TSV diameter [4, 5]. Moreover, every TSV is surrounded by a Keep-Out Zone with a minimum size of $3\,\mu$m for ICs manufactured at 20 nm, which forms a microcrack shield between the TSV and the active logic of the die [6].

In addition to their large area overhead, TSVs also suffer from several manufacturing defects, such as voids and cracks, incomplete fillings, pinholes on the insulator boundary, missing landing pads, improper connections between pads and TSVs, and electromigration. All these defects adversely affect chip yield and further increase manufacturing cost [7]. Even a single defective TSV leads to the disposal of the whole stack, hence wasting the good dies in the rest of the stack. In addition, several concerns have been raised about defects that may appear in the bottom layer when additional layers are processed on top of it [8]. Moreover, non-bottom layers are susceptible to process variations and electrostatic coupling, while the vias themselves are prone to shorts, opens, and delay defects. Effective defect screening and quality assurance are therefore a prerequisite for 3D ICs, especially for 3D processors. Even more important is the need for 3D IC DfT solutions that enable defect isolation and yield enhancement.

In this paper, we present a novel way to utilise three IEEE Standards in order to achieve full observability of a 3D stack. That is to say, the problem this paper solves is the lack of a DfT solution for an entire 3D stack that is supported by IEEE Standards. The cornerstone of our approach is a new way of using the Flexible Parallel Port (FPP) of the IEEE 1838 Standard to create vertical "Π"-shaped testing connections through the TSVs. These connections are used along with horizontal daisy chain testing paths through the cores of each die and allow for parallel transfer of test data into and out of the 3D stack.

## II. BACKGROUND

Historically, the semiconductor industry has been able to meet the demand for high-performance integrated circuits with added functionality by relentlessly scaling device sizes. It has become clear, though, that it is increasingly difficult to sustain device scaling in an economically viable manner, with escalating costs which are mainly due to the challenges associated with the lithography of small features, interconnect scaling, and reducing and mitigating process variations. This is exactly where 3D stacking comes into play, and it is the reason for trying to solve its testing issues. Furthermore, 3D technologies enable the integration of heterogeneous fabrication processes, which opens the way for more complex systems such as memory-on-logic [9]. Although TSVs could be tested together with logic and memory, a DfT method is still required, especially for defect isolation and yield learning.

This work was supported by the project "Dioni: Computing Infrastructure for Big-Data Processing and Analysis." (MIS No. 5047222) which is implemented under the Action "Reinforcement of the Research and Innovation Infrastructure", funded by the Operational Programme "Competitiveness, Entrepreneurship and Innovation" (NSRF 2014-2020) and co-financed by Greece and the European Union (European Regional Development Fund).

Fig. 1 A DfT solution with separate Test Layers between the Functional Layers of a 3D stack.

Moreover, 3D integration is proving to be a promising way to achieve high-performance ICs with more functionality and a reduced die footprint. The basic idea lies in die or wafer stacking, as it does not require substantial changes to the existing fabrication process. Thus, separately manufactured dies or wafers are integrated into the same package, and TSVs are used to connect the dies to each other. Considerable research efforts have been directed towards the development of TSV-based 3D stacking technology. However, the keep-out zone required around each TSV and the limited precision of die alignment impose limits on the achievable device integration density.

With all these issues, research must focus on finding test solutions for 3D ICs. A well-known one is based on dedicated test layers [10], which are inserted between the functional layers of the 3D stack (Fig. 1). That is, one layer holds the functional components while the next layer holds the test scan chains, and so on in an alternating fashion. More specifically, the test layer includes an interface register, which controls signals from a testing module to one of the test scan chains, and an instruction register connected to the interface register. The instruction register processes testing instructions from the testing module, which is connected through TSVs to the functional components and to the test module throughout the test layer.

On the other hand, built-in self-test (BIST) methods are always an option. 3D stacking involves many possible test insertions, due to multiple yield and test cost parameters corresponding to different dies and tests, such as pre-bond, post-bond, and partial-stack tests. As an exponentially large number of test flows must be evaluated, analysis methods and tools are needed for test-cost optimization and automated test-flow selection. BIST is a promising solution because it simplifies test application. In 3D ICs especially, since tests can be applied at many possible test stages or test insertions, there is a need for a distributed BIST framework that enables BIST-based testing at multiple test insertions [11]. Specifically, there are methods to locate defects in a passive interposer before and after stacking, such as a technique for contactless pre-bond TSV testing and a DfT architecture for post-bond die access; an optimization approach that selects an effective test flow by systematically exploring an exponentially large number of candidate test flows; and even an end-to-end design of a BIST infrastructure.

Furthermore, the very placement of the TSVs may turn known-good chips into faulty ones, which contributes to the low yield of 3D integration. Therefore, there is a need for DfT solutions that enable defect isolation and yield enhancement. However, to understand the DfT solution this paper presents, the three IEEE Standards it uses must first be understood.

The IEEE 1149.1 Std. [12] defines test logic that can be included in an integrated circuit to provide standardized test solutions. Firstly, it is used to test the interconnections between integrated circuits once they have been assembled onto a printed circuit board or other substrate. Secondly, it is used to test the integrated circuit itself. Lastly, it is used to observe or to modify circuit activity during the component's normal operation. The test logic consists of a boundary-scan register and other building blocks and is accessed through a Test Access Port (TAP), whose signals drive the TAP controller, a Finite State Machine (FSM) that is the heart of the test logic.
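As an illustration of the FSM mentioned above, the following is a minimal Python sketch of the 16-state TAP controller of IEEE 1149.1, which advances on the TMS value sampled at each rising edge of TCK. The table layout and the helper name `walk` are choices made for this sketch only; it models state transitions and none of the register logic.

```python
# Next-state table of the IEEE 1149.1 TAP controller.
# Each entry maps a state to (next state if TMS=0, next state if TMS=1).
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle":    ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan":   ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR":       ("Shift-DR", "Exit1-DR"),
    "Shift-DR":         ("Shift-DR", "Exit1-DR"),
    "Exit1-DR":         ("Pause-DR", "Update-DR"),
    "Pause-DR":         ("Pause-DR", "Exit2-DR"),
    "Exit2-DR":         ("Shift-DR", "Update-DR"),
    "Update-DR":        ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan":   ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR":       ("Shift-IR", "Exit1-IR"),
    "Shift-IR":         ("Shift-IR", "Exit1-IR"),
    "Exit1-IR":         ("Pause-IR", "Update-IR"),
    "Pause-IR":         ("Pause-IR", "Exit2-IR"),
    "Exit2-IR":         ("Shift-IR", "Update-IR"),
    "Update-IR":        ("Run-Test/Idle", "Select-DR-Scan"),
}


def walk(state, tms_bits):
    """Apply a sequence of TMS values (one per TCK edge) and return the final state."""
    for tms in tms_bits:
        state = TAP_NEXT[state][tms]
    return state


if __name__ == "__main__":
    # From reset, TMS = 0,1,0,0 reaches Shift-DR; TMS = 0,1,1,0,0 reaches Shift-IR.
    print(walk("Test-Logic-Reset", [0, 1, 0, 0]))
    print(walk("Test-Logic-Reset", [0, 1, 1, 0, 0]))
```

A useful property visible in the table is that holding TMS high for five clock cycles returns the controller to Test-Logic-Reset from any state.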

The IEEE 1500 Std. [13] defines a scalable architecture for independent, modular test development and test application for embedded design blocks and enables testing of the external logic surrounding these cores. In addition, its architecture can also be used to partition large design blocks into smaller blocks of more manageable size and to facilitate test reuse for blocks that are reused from one System-on-Chip (SoC) design to the next. The standard defines a design-for-testability method for ICs containing embedded non-mergeable cores. Its method is independent of the underlying functionality of the IC or its individual embedded cores. The method creates the necessary requirements for the test of such ICs, while allowing for ease of interoperability of cores that may have originated from different sources. Its aim was to provide a consistent scalable solution to the test reuse challenges specific to the reuse of non-mergeable cores, while preserving all aspects that are often associated with these cores. This objective was achieved with a core-centric methodology that enables successful integration of cores into SoCs.

Fig. 2 A top view of the architecture proposed, illustrating 6 cores, wrapped with IEEE 1500 Std., and using TSVs along with the IEEE 1838 Std. FPP switchboard. The solid lines exist in every layer of the stack, while the dotted lines exist only on the last layer of the stack.

The IEEE 1838 Std. [14] is die-centric, applying to a die that is intended to be part of a multi-die stack. It defines die-level features that, when compliant dies are brought together in a stack, comprise a stack-level architecture. This architecture enables transportation of control and data signals for the test of intra-die circuitry and of inter-die interconnects, in both pre-stacking and post-stacking situations. It supports testing of both partial and complete stacks in pre-packaging, post-packaging, and board-level situations. The primary inter-die interconnect technology addressed by this standard is TSVs. Being die-centric, compliance with the standard pertains to a die (and not to a stack of dies). Standardized die-level DfT features comprise a stack-level test access architecture. In this way, the standard enables interoperability between die makers and stack makers. The standard does not address stack-level challenges and solutions.

Its general architecture is based on two interfaces, the Primary and the Secondary one. They are designed so that they can be brought together in a stack. Part of each interface is used for test data, while the rest is assumed to be used for functional data. This means that the standard can accommodate power TSVs, data elevators, and various other stack-based solutions without alterations. For a basic understanding of its usage, it is sufficient to know that the Primary Interface TAP (PTAP) is test logic surrounding an IEEE 1149.1 Std. FSM. It contains the signals and internal die logic connections that are associated with the Primary Interface. Similarly, the Secondary Interface includes one or more Secondary Interface TAPs (STAPs), which contain the signals and internal die logic connections that are associated with the Secondary Interface. The interfaces may also include the Flexible Parallel Port (FPP), which is described further on.

In total, this paper combines the core-centric IEEE 1500 Std. and the die-centric IEEE 1838 Std. to create something greater than the sum of their parts: a DfT solution for an entire 3D stack.

## III. METHODOLOGY

Implementing the three Standards together required meticulously combining their specifications, so that the resulting design follows the right composition of the hundreds of rules, recommendations, and permissions they define.

We present the general architecture of the 3D design used in this paper (Fig. 2). Let us assume a theoretical die consisting of 6 cores, which are first wrapped with IEEE 1500 Std. compatible wrappers. These give the cores the ability to be tested serially through the connections between them, following the 1500 Std. testing protocols. In addition, an FPP switchboard is added to each core and works in tandem with its 1500 Std. compatible wrapper. This enables parallel testing through the TSVs, adhering to the IEEE 1838 testing protocols. For instance, all control signals can be changed serially through the connections of the 1500 Std. Likewise, one can prepare all scan chains in parallel but extract the results serially. Thus, the two testing schemes can work seamlessly together because they use the same wrapper cells around each core.

Moreover, the two testing schemes can be used for different types of testing. For example, in the case of EXTEST, the connections between the cores can be accessed through the 1500 Std. scheme, while the TSV connections between the clusters can be accessed through the 1838 Std. scheme. The architecture is flexible in this regard, especially when it comes to the usage of the various bypass signals. One can skip over cores in the die, or even entire dies of the stack, so that testing can proceed in whatever way the test process requires.

At this stage of the design flow, the cores can be considered test-ready. Therefore, the next step in the flow is to generate the Test Access Mechanism (TAM) of each die. The TAM makes every die test-ready and offers the means a) to connect the test-in ports of all cores to the source of test data at the die level, in order to transfer the test data into the cores, and b) to connect the test-out ports to the test sink at the die level, in order to transfer the test responses out of the cores. Although many different TAM architectures have been proposed in the literature, in this paper we utilize the daisy chain architecture. With it, the cores are connected in a sequential manner forming a virtual daisy chain: the test-out port of every core drives the test-in port of the next core in the chain, while the first core is driven by the test source of the die, and the last one drives the test sink of the die.

In the architecture shown (Fig. 2), the TAM connects the testing ports of the cores in sequence, that is, the test-in port of each core is directly connected to the test-out port of the preceding core in the daisy chain. In this example, the TAM consists of one daisy chain at every die, but additional daisy chains can be used in a similar manner. The input of each of these chains is driven by the PTAP module, while their output drives the STAP module (Fig. 2).
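The daisy chain described above can be sketched behaviourally in Python. This is a simplified model under stated assumptions, not the implemented design: each wrapped core is reduced to a serial shift register of a hypothetical length, a bypassed core is reduced to a single-bit bypass register, and the names `WrappedCore` and `shift_die` are illustrative only.

```python
from collections import deque


class WrappedCore:
    """A core seen from the TAM: a serial path from its test-in to its test-out port."""

    def __init__(self, name, chain_length, bypassed=False):
        self.name = name
        self.bypassed = bypassed
        self.chain = deque([0] * chain_length)  # wrapper/scan cells (assumed length)
        self.bypass_bit = 0                     # single-bit bypass register

    def shift(self, bit_in):
        """Clock one bit in at test-in and return the bit leaving at test-out."""
        if self.bypassed:
            bit_out, self.bypass_bit = self.bypass_bit, bit_in
            return bit_out
        self.chain.appendleft(bit_in)
        return self.chain.pop()


def shift_die(cores, bits_in):
    """Shift a bit stream from the die's test source to its test sink.

    In each cycle, the test-out bit of every core drives the test-in port
    of the next core in the chain, as in the daisy chain TAM.
    """
    bits_out = []
    for bit in bits_in:
        for core in cores:
            bit = core.shift(bit)
        bits_out.append(bit)
    return bits_out


if __name__ == "__main__":
    die = [WrappedCore("c1", 3), WrappedCore("c2", 2, bypassed=True), WrappedCore("c3", 1)]
    # Effective chain length is 3 + 1 (bypass) + 1 = 5 cycles of latency.
    print(shift_die(die, [1, 0, 1, 1, 0, 0, 0, 0, 0]))
```

Bypassing a core shortens the path through it to one bit, which is why skipping cores reduces the number of shift cycles needed to reach the remaining ones.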

We present the complete 3D test scheme developed in this paper, which is a fully testable 3D stack, using the combination of the IEEE 1149.1-2013 Standard, the IEEE 1500-2005 Standard, and the IEEE 1838-2019 Standard. We must note that the complete 3D scheme for serial testing implemented in this paper assumes vertical connections among the PTAP and STAP modules of the dies of the stack, which are realised by using TSVs. In addition, the TSVs reached through the IEEE 1838 Std. FPPs accompanying every core can be used for parallel testing.

Fig. 3 A side view of the architecture, showing the dotted lines which connect the TSVs through the IEEE 1838 Std. FPP switchboards, and allow for parallel access of the test data of the cores.

By its definition, the FPP can be used to establish a connection between the primary interface, the secondary interface, and the core of functional logic on the die. The optional FPP is intended to carry arbitrary test data, clock, and control signals up and down the die stack independently of the die wrapper. The configuration of the FPP can vary depending on the application. The FPP is composed of a set of lanes. Each lane implements a one-bit-wide path. Lanes with identical properties and control may be grouped into channels. Lanes can be unidirectional or bidirectional, registered (including pipelined pathways between terminals of the lane) or unregistered, and include several connection points between the bottom and top of the die. The collection of connection points is selectable but may include terminals from the lane to the core and back, from one lane to another, and between the primary and secondary interfaces. Multiplexing functions within the lane can select how these terminals interconnect within the lane. Controls for these multiplexing functions are derived from test data register bits distributed throughout the scan-accessible network.

The purpose of the FPP is to enable parallel data flow into each die and between dies in the stack. In addition to the inter-die connection, it is also possible to connect the lanes with the core logic of the current die. This is done by using the six terminals of a lane: there exist terminals for connecting to the TSVs, for connecting to the logic core within the die, and for connecting to other FPP lanes. The FPP may be composed of registered and non-registered lanes; the latter category can be further subdivided into clock lanes and non-clock lanes. Each lane element is configured with PTAP-accessible register bits that hold their state while the FPP is used to apply tests. The lanes can be controlled individually to connect the signals from the primary interface, secondary interface, and core to each other.

These pathways might be registered to enable higher frequency data communications through them. If registered, a clock can be provided from various sources. However, it is envisioned that a clock might be easily connected through a non-registered lane, so special clock terminal names were selected, as the latching of a lane can be bypassed if so desired. IEEE 1149.1 Std. is leveraged for the serial control and data path. The stack-level and die-level interfaces are specified with respect to the TAP interface signals. The PTAP controller and register architecture is based on an IEEE 1149.1 Std. TAP controller and register architecture. The Instruction Register directs the device-level registers to be placed in the testing pathway. Finally, the Die Wrapper Register (DWR) is quite flexible in its construction. As such, an IEEE 1500 Std. core wrapper can become part of its content, as allowed by the IEEE 1838 Std. and its specifications.

To complete the connections, the last layer of the stack (i.e., the layer furthest from the I/Os) has some additional connections between the FPP switchboards of the cores, allowing the TSVs to connect and complete a "Π"-shaped circuit, as shown (Fig. 3). This means that clusters with an odd number within the daisy chain use their TSVs to shift data away from the I/Os of the stack, while clusters with an even number use theirs to shift data towards the I/Os.
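The up-and-down test path described above can be traced with a short Python sketch. It assumes, for illustration only, that dies are numbered from 1 (nearest the I/Os) to `num_dies` (the last layer) and that one odd-numbered cluster is paired with one even-numbered cluster; the function name `pi_path` and this pairing are hypothetical.

```python
def pi_path(num_dies, odd_cluster, even_cluster):
    """Return the (die, cluster) hops of one up-and-down TSV path.

    Data climbs away from the I/Os through the odd-numbered cluster of every
    die, crosses over on the last layer, and descends towards the I/Os
    through the even-numbered cluster.
    """
    if odd_cluster % 2 != 1 or even_cluster % 2 != 0:
        raise ValueError("expected an odd upward cluster and an even downward cluster")
    upward = [(die, odd_cluster) for die in range(1, num_dies + 1)]
    downward = [(die, even_cluster) for die in range(num_dies, 0, -1)]
    return upward + downward


if __name__ == "__main__":
    for hop in pi_path(3, odd_cluster=1, even_cluster=2):
        print(hop)
```

For a three-die stack the trace visits dies 1, 2, 3 through cluster 1 and then dies 3, 2, 1 through cluster 2, so the crossover happens only on the die furthest from the I/Os.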

## IV. METHODOLOGY DETAILS

As mentioned earlier, the PTAP and STAP modules of each die are controlled by the known TAP signals of IEEE 1149.1 Std., as described in the IEEE 1838 Std. The details of the Instruction Register of the IEEE 1838 Std. are presented here (Table 1). Most of the options have to do with the DWR, except the 3D Configuration Register and the Identification Register, which are both defined within the IEEE 1838 Std. As required by the IEEE 1838 Std., the two edge codes are for the bypass instruction, which allows a layer of the stack to remain untested, giving further freedom to the DfT of the 3D stack.

The DWR enables controllability and observability of the logic to and from die terminals for both INTEST and EXTEST modes. The IEEE 1838 Std. allows multiple configurations of the DWR; in particular, it can reuse one or more segments of an IEEE 1500 Std. Wrapper Boundary Register as the DWR. The DWR cell operation relies on the Shift, Capture, and Update events. It can be selected for testing or turned transparent with the corresponding instruction code, so that the 3D stack operates as it was functionally intended.
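To make the Shift, Capture, and Update events concrete, here is a minimal behavioural sketch of a single wrapper cell in Python. It assumes a common two-stage cell (a shift/capture stage followed by an update stage); the class and attribute names are hypothetical, and the cell types actually permitted are defined by the standards themselves.

```python
class WrapperCell:
    """Behavioural model of one wrapper cell with Capture, Shift and Update events."""

    def __init__(self):
        self.shift_stage = 0   # loaded by Capture, moved by Shift
        self.update_stage = 0  # loaded by Update, drives the output in test mode

    def capture(self, functional_in):
        """Capture: sample the functional value into the shift stage."""
        self.shift_stage = functional_in

    def shift(self, serial_in):
        """Shift: move one bit along the scan path and return the bit shifted out."""
        serial_out = self.shift_stage
        self.shift_stage = serial_in
        return serial_out

    def update(self):
        """Update: transfer the shifted-in value to the stage that drives the output."""
        self.update_stage = self.shift_stage

    def output(self, functional_in, test_mode):
        """In test mode drive the update stage; otherwise the cell is transparent."""
        return self.update_stage if test_mode else functional_in


if __name__ == "__main__":
    cell = WrapperCell()
    cell.capture(1)              # observe the functional value
    print(cell.shift(0))         # captured bit leaves, stimulus bit 0 enters
    cell.update()
    print(cell.output(1, test_mode=True), cell.output(1, test_mode=False))
```

Keeping the update stage separate means the value applied to the logic stays stable while new data is being shifted through the cell.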

Table 1 Instruction Register codes (IEEE 1838 Std.)

| Code   | Instruction            |
|--------|------------------------|
| 3'b000 | Bypass                 |
| 3'b001 | Select DWR Transparent |
| 3'b010 | Select DWR Extest      |
| 3'b011 | Select DWR Intest      |
| 3'b100 | Select 3DCR            |
| 3'b101 | Select IDCODE          |
| 3'b110 | Select DWR             |
| 3'b111 | Bypass                 |
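The 3-bit instruction codes listed above can be captured in a small Python lookup. The grouping into bypass, DWR-related, and other IEEE 1838 registers follows the description in the text; the names `IR_CODES` and `instruction_group` are illustrative and not part of any standard.

```python
# 3-bit Instruction Register codes as listed in Table 1.
IR_CODES = {
    0b000: "Bypass",
    0b001: "Select DWR Transparent",
    0b010: "Select DWR Extest",
    0b011: "Select DWR Intest",
    0b100: "Select 3DCR",
    0b101: "Select IDCODE",
    0b110: "Select DWR",
    0b111: "Bypass",
}


def instruction_group(code):
    """Classify a 3-bit code as 'bypass', 'DWR' or '1838 register'."""
    name = IR_CODES[code]  # raises KeyError for values outside 0..7
    if name == "Bypass":
        return "bypass"
    if "DWR" in name:
        return "DWR"
    return "1838 register"


if __name__ == "__main__":
    for code, name in IR_CODES.items():
        print(f"3'b{code:03b}  {name:<24} {instruction_group(code)}")
```

Both edge codes (all zeros and all ones) fall into the bypass group, matching the statement that the two edge codes are reserved for the bypass instruction.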

The Wrapper Instruction Register (WIR) of the IEEE 1500 Std. can be kept simpler (Table 2): it only needs to allow serial or parallel access to each of the cores within the die, while also allowing for a full bypass of each core. Once again, as required by the IEEE 1500 Std., the two edge codes are for the bypass instruction, effectively allowing parts of the die to go untested. This is useful for controlling the total temperature of the die while testing is ongoing, so as not to exceed the thermal limit of the stack.

Table 2 WIR Serial codes (IEEE 1500 Std.)

| Code  | Instruction           |
|-------|-----------------------|
| 2'b00 | Bypass                |
| 2'b01 | Serial Testing Mode   |
| 2'b10 | Parallel Testing Mode |
| 2'b11 | Bypass                |
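As a sketch of how the bypass codes could serve the thermal argument above, the following Python fragment decodes a per-core WIR assignment and checks it against a limit on simultaneously tested cores. The limit `max_active` is a hypothetical stand-in for a thermal budget, and the helper names are illustrative; the paper does not specify how such a budget is derived.

```python
# 2-bit WIR codes as listed in Table 2.
WIR_CODES = {
    0b00: "Bypass",
    0b01: "Serial Testing Mode",
    0b10: "Parallel Testing Mode",
    0b11: "Bypass",
}


def active_cores(wir_by_core):
    """Return the cores whose WIR code selects a testing mode (not bypass)."""
    return [core for core, code in wir_by_core.items() if WIR_CODES[code] != "Bypass"]


def within_budget(wir_by_core, max_active):
    """Check that no more than max_active cores are tested at the same time."""
    return len(active_cores(wir_by_core)) <= max_active


if __name__ == "__main__":
    session = {"core0": 0b01, "core1": 0b00, "core2": 0b10, "core3": 0b11}
    print(active_cores(session))
    print(within_budget(session, max_active=2))
```

A test scheduler could use such a check to split the cores of a die into several sessions, bypassing the rest in each one.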

Finally, the most important detail of this paper is the actual usage of the control signals of the FPP. Consider the concatenation, in this exact sequence, of the IEEE 1838 Std. signals which select the Primary Interface, the Core under test, and the Secondary Interface, followed by the signals which bypass their respective registers, and finally the signals which enable the Primary Interface and the Secondary Interface. This concatenated control word is used in four distinct ways (Table 3), which are needed for the connection of the TSVs as described in the previous section.

Table 3 FPP Control Signals values

| Value       | Connection                        |
|-------------|-----------------------------------|
| 8'b11111000 | Connection from Side to Side      |
| 8'b01111000 | Connection from Primary to Side   |
| 8'b11011000 | Connection from Secondary to Side |
| 8'b11111011 | Connection from Side to All       |
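The four control words of Table 3 can be compared bit by bit with a short Python sketch. It deliberately does not assign a meaning to individual bit positions, since the exact field boundaries are not spelled out here; it only reports which positions (bit 7 being the leftmost) differ from the "Side to Side" word. The names `FPP_CONTROL` and `differing_bits` are illustrative.

```python
# 8-bit FPP control words as listed in Table 3.
FPP_CONTROL = {
    "Side to Side":      0b11111000,
    "Primary to Side":   0b01111000,
    "Secondary to Side": 0b11011000,
    "Side to All":       0b11111011,
}


def differing_bits(word_a, word_b):
    """Return the bit positions (7 = leftmost) in which two control words differ."""
    diff = word_a ^ word_b
    return [pos for pos in range(7, -1, -1) if (diff >> pos) & 1]


if __name__ == "__main__":
    base = FPP_CONTROL["Side to Side"]
    for name, word in FPP_CONTROL.items():
        print(f"{name:<18} 8'b{word:08b}  differs from Side to Side at {differing_bits(base, word)}")
```

Each of the other three configurations differs from "Side to Side" in only one or two positions, which suggests that switching between connection types requires rewriting very few configuration bits.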

To clarify, the "Side to All" connection includes the three inverse directions of the "Side to Side", "Primary to Side", and "Secondary to Side" connections, but only one of those should be used at a time. These are used to connect the FPP to the TSVs of the die. A "Side to Primary" connection is used at the FPP at the end of the sequence of cores, to push the test data closer to the I/Os of the 3D stack. Conversely, a "Side to Secondary" connection is used to push the test data further away from the I/Os of the 3D stack (Fig. 4).



Fig. 4 The FPP connections between Pri(mary), Sec(ondary), and Side. The "Side to All" is the combined inverse of the other three.

When this is not required, and only two FPPs need to be connected, the "Side to Side" option is used. Lastly, the connections "Primary to Side" and "Secondary to Side" are for the FPPs that take test data directly from the TSVs of the die. A "Primary to Side" connection is the first connection between the I/Os of the 3D stack and the first FPP of the sequence of cores and, in each following die, the first connection for TSVs with an "upward" (away from the I/Os) direction. In contrast, a "Secondary to Side" connection is used for the TSVs with a "downward" (towards the I/Os) direction, which take test data from a previous die. As defined in IEEE 1838 Std., "*Primary*" is the side of the die closer to the I/Os of the stack, while "*Secondary*" is the opposite side.

## V. REQUIRED TESTING FOOTPRINT

We used the following IWLS benchmark [15] cores to design and simulate three layers of a 3D stack, as separate floorplans, with the SAED 14nm (tt0p8v25c) library: *des3\_perf*, *eth\_top*, *vga\_enh\_top*, and *wb\_conmax*. In Fig. 5, we present the percentage of additional floorplan area required by each of these cores to follow our DfT scheme, with a fixed *create\_floorplan -core\_utilization 0.7* value in Synopsys ICC.



Fig. 5 Additional Floorplan Area required for the IWLS Benchmark cores to follow our DfT scheme.

Each floorplan contains 6 cores, of which 2 are always *wb\_conmax* cores, and the other 4 are either 4 *des3\_perf* cores, 4 *eth\_top* cores, or 4 *vga\_enh\_top* cores. In Fig. 6, we present the percentage of additional floorplan area required by each of these layers of our experimental 3D stack, which follows our DfT scheme.



Fig. 6 Additional Floorplan Area required for the 3D stack layers to follow our DfT scheme.

While the 2D floorplan software used does not allow for a calculation including the TSVs required between the layers, the total area cost of our DfT scheme is calculated on average to be slightly above 5% for the entire 3D stack, which is considered average for a DfT scheme.
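One way such a percentage could be defined is given below as a hedged sketch; the symbols $A_{\text{base}}$ and $A_{\text{DfT}}$ (floorplan area without and with the DfT logic), $A_{\text{cell}}$ (total cell area), and $u$ (core utilization) are introduced here for illustration and do not appear in the original analysis:

$$\text{overhead} = \frac{A_{\text{DfT}} - A_{\text{base}}}{A_{\text{base}}} \times 100\%$$

Under the assumption that the floorplan area scales as $A = A_{\text{cell}} / u$ with the same utilization $u = 0.7$ in both cases, the factor $u$ cancels in the ratio, so the floorplan overhead equals the relative increase in cell area.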

## VI. CONCLUSION

In summary, this paper describes a minimal intersection of instructions between IEEE Standards 1149.1, 1500, and 1838, to achieve a DfT scheme for a 3D IC. The presented scheme provides complete observability of the 3D stack, with an average area cost of about 5%. The paper focuses on the use of the FPP presented in [14] to achieve test data access for each core, of each die, for each stack following the presented DfT scheme.

## REFERENCES

- [1] J. Lau, "Evolution, challenge, and outlook of TSV, 3D IC integration and 3D silicon integration," in Proc. Int. Symp. Adv. Packag. Mater. (APM), pp. 462–488, Oct. 2011.
- [2] R. S. Patti, "Three-dimensional integrated circuits and the future of system-on-chip designs," Proceedings of the IEEE, vol. 94, no. 6, pp. 1214–1224, 2006.
- [3] M. Motoyoshi, "Through-silicon via (TSV)," Proceedings of the IEEE, vol. 97, no. 1, pp. 43–48, 2009.
- [4] Jae-Seok Yang et al., "TSV Stress Aware Timing Analysis with Applications to 3D-IC Layout Optimization," In Proceedings ACM/IEEE Design Automation Conference (DAC), pp. 803–806, June 2010.
- [5] Jae-Seok Yang, "Nanometer VLSI Design-Manufacturing Interface for Large Scale Integration", PhD thesis, The University of Texas at Austin, 2011.
- [6] A. Koneru, S. Kannan, and K. Chakrabarty, "Impact of electrostatic coupling and wafer-bonding defects on delay testing of monolithic 3D integrated circuits," ACM Journal on Emerging Technologies in Computing Systems (JETC), 13(4), pp. 1-23, 2017.
- [7] D. H. Jung, Y. Kim, J. J. Kim, H. Kim, S. Choi, Y. H. Song, and A. Orlandi, "Through silicon via (TSV) defect modelling, measurement, and analysis," IEEE transactions on components, packaging, and manufacturing technology, 7(1), pp. 138-152, 2016.
- [8] Gyujei Lee et al, "Mechanical characterization of residual stress around TSV through instrumented indentation algorithm," In Proceedings IEEE International Conference on 3D System Integration (3DIC), 2012.
- [9] A. Koneru, S. Kannan, and K. Chakrabarty, "Impact of Electrostatic Coupling and Wafer-Bonding Defects on Delay Testing of Monolithic 3D Integrated Circuits," ACM Journal on Emerging Technologies in Computing Systems (JETC), 13(4), pp. 54, 2017.
- [10] S. Kannan, A. Koneru, and K. Chakrabarty, "Testing monolithic three-dimensional integrated circuits," U.S. Patent No. 10,775,429, Sep. 2020.
- [11] R. Wang, S. Deutsch, M. Agrawal, and K. Chakrabarty, "The hype, myths, and realities of testing 3D integrated circuits," In Proceedings of the 35th International Conference on Computer-Aided Design (p. 58), ACM, Nov. 2016.
- [12] "IEEE Standard for Test Access Port and Boundary-Scan Architecture," in IEEE Std 1149.1-2013 (Revision of IEEE Std 1149.1-2001), pp. 1-444, 13 May 2013. doi:10.1109/IEEESTD.2013.6515989
- [13] "IEEE Standard Testability Method for Embedded Core-based Integrated Circuits," in IEEE Std 1500-2005, pp. 1-136, 29 Aug. 2005. doi:10.1109/IEEESTD.2005.96465
- [14] "IEEE Standard for Test Access Architecture for Three-Dimensional Stacked Integrated Circuits," in IEEE Std 1838-2019, pp. 1-73, 13 March 2020. doi:10.1109/IEEESTD.2020.9036129
- [15] C. Albrecht, "IWLS 2005 benchmarks." In International Workshop for Logic Synthesis (IWLS), 2005