# Simulation Based Analysis of Temperature Effect on the Faulty Behavior of Embedded DRAMs

Zaid Al-Ars, Ad J. van de Goor, Jens Braun, Detlev Richter

Faculty of Information Technology and Systems, Section of Computer Engineering, Delft University of Technology, Mekelweg 4, 2628 CD Delft, The Netherlands

Infineon Technologies AG, Product Engineering Group 2, Balanstr. 73, 81541 Munich, Germany

E-mail: z.e.al-ars@its.tudelft.nl

**Abstract:** Temperature has proven to be an effective stress condition, commonly used to stress memory devices and to detect special types of failure mechanisms. In this paper, a new approach is presented where temperature is used as a test parameter to increase the fault coverage of specific tests. This is done using defect injection and simulation of a memory model at different temperatures. The analysis presents new types of detection conditions for memories and evaluates the impact of temperature on these conditions.

**Key words:** Embedded DRAM, fault primitives, temperature effect, defect simulation, opens, memory testing

#### 1 Introduction

In the past decade, memory testing has significantly grown in complexity as a result of the continued increase in memory density and of the associated faulty behavior. This has led memory manufacturers to increase the fault coverage of their tests by modifying some operational parameters in order to stress the devices under test, thereby putting them closer to failure. The operational parameters (called *stress conditions*) usually used in testing are temperature, supply voltage and timing [Vollrath00].

Temperature (T) has become a standard stress condition, commonly used in memory testing. Experimental studies show that raising T increases the fault coverage of a given set of tests [vdGoor99].
Currently, tests are usually performed at different temperatures to serve two main goals: 1) to guarantee device functionality in the defined T range according to the specifications [Falter00], and 2) to target specific failure mechanisms by making them easier to detect. These two goals result in applying memory tests at the high end and low end temperatures as defined in the specifications.

In this paper, T is used to achieve the new goal of optimizing the fault coverage of memory tests, depending on the effectiveness of a given T for a specific subset of memory tests for embedded DRAMs (eDRAMs). Although a lot has been published about memory testing at elevated T, in terms of cost effectiveness and impact on reliability, not much effort has been spent on analyzing the way T affects the faulty behavior of memories. This paper presents new types of detection conditions for memories and evaluates the impact of T on these conditions. The paper also presents the fault coverage of applying the new tests in practice.

Section 2 starts by describing the electrical memory model used to perform simulations. Next, Section 3 defines the functional fault models (FFMs) to be used in this paper. Then, Section 4 identifies the open defects injected into the electrical model in order to induce the faulty behavior. Section 5 gives the methodology to be used for performing the simulations, while Section 6 discusses the simulation results. Section 7 uses these results to derive detection conditions and analyzes the effect of elevated T on the resulting tests. Finally, Section 8 ends with the conclusions.

#### 2 *e*DRAM simulation model

The simulation model is based on a design-validation model of an actual eDRAM, produced by Infineon Technologies. A general description of this eDRAM and its test concept can be found in the literature [McConnell98].
Since the time needed for simulating a complete memory device is excessively long, the simulation model used is simplified, taking two factors into consideration in order to preserve the model accuracy. First, removed components should be electrically compensated, and second, the resulting simplified circuit should describe enough of the memory to enable injecting the defects of interest.

Figure 1 shows a block diagram of the cell array column of the simulated eDRAM. (The blocks labeled OB1s, OB1c, OB2s, etc., are locations of opens on bit lines as discussed in Section 4.2. In a defect free model, these blocks represent no resistance on the bit lines.) This simplified simulation model contains a $2\times2$ cell array, in addition to two reference cells, precharge circuits and a sense amplifier. The removed memory cells are compensated by resistances and capacitances along the bit line. In addition to the shown cell array column, the simulation model contains one data output buffer, needed to examine data on the output, and a write driver, needed to perform write operations.

Figure 1. Cell array column of the eDRAM, where the possible locations of opens on BT and BC are indicated.

All simulations have been done using the simulator 'Pstar' (a commercial Spice based simulator), and using a transistor model compatible with the Spice Level 3 model.

Figure 2 shows the simulation results of the properly functioning memory while performing a write 0 operation followed by a read operation on Cell 0. The figure is divided into three panels, each with time as the horizontal axis and the voltage as the vertical axis. The first panel shows voltages on the bit lines (BT and BC), named VN (BT) and VN (BC), respectively. They show the shared effect of any defect in the cell array column on other parts of the column.
The second panel shows the voltage stored across the storage capacitor of Cell 0 (referred to in the figure as V(C\_S0)), which reveals the short and long term effects of a defect on the stored logic value. Finally, the third panel shows the voltage on the T and F nodes of the data output buffer (referred to as VN (DATA\_T) and VN (DATA\_F), respectively), which indicate whether the defect in the array column causes a fault to be detected on the output.

Figure 2. Simulation of 1w0r0 performed on Cell 0.

At the beginning of each simulation run, cell capacitances $(C_s)$ are initialized to the voltage level corresponding to the logic value they are supposed to store; bit line capacitances $(C_b)$ are set to the precharge voltage (equal to $V_{DD}$); reference cell capacitances are set to the mid-point voltage $V_{mp}$, which is the threshold voltage between logic 0 and logic 1; and the data output buffer is forced to contain a logic 1 at the true side.

Although the timing and basic operations of a design-validation model are known to correspond to those on real silicon, silicon measurements at different temperatures have been made and compared with model simulations to analyze the way the model reflects behavioral change with T. The results show that, although the model does not reflect the exact silicon behavior for a given failure, the simulated T related behavioral change approximates that measured on silicon. Measurements are made using two memory tests: a $V_{DD}$ bump test and a margin test.

**$V_{DD}$ bump test:** The $V_{DD}$ bump test is used to examine the ability of the memory to charge up the cell capacitor to the high voltage level required when a w1 or w0 is performed to cells on BT or BC, respectively. This is done by changing the supply voltage $(V_{DD})$ in an attempt to induce a memory failure.
This test can be represented by the following march test $\{ \updownarrow(w0); V_{DD} = V_{nom} - V_{bump}; \updownarrow(w1); V_{DD} = V_{nom}; \updownarrow(r1) \}$, where $V_{nom}$ is the nominal voltage and $V_{bump}$ is the bump voltage. This test has been simulated and performed on silicon on a 256 Kbit eDRAM at different temperatures. At each T, $V_{bump}$ has been gradually increased from 0 until it resulted in at least one fault. The T related change in $V_{bump}$ $(dV_{bump})$ for the simulated and measured results of this test is shown in Figure 3. T is listed in the figure on the x-axis, while $dV_{bump}$ is listed on the y-axis, so that each point in the figure represents a $(T, dV_{bump})$ pair. Points on a line in the figure are calculated as $dV_{bump} = V_{bump}(T) - V_{bump}(300\text{K})$, where $V_{bump}(T)$ is the bump voltage that results in at least one cell fail for a given T. Points above a line in the figure give $(T, dV_{bump})$ pairs where more than one cell fail is detected, while the memory functions properly for points below a line. The figure shows that measured and simulated results are rather close.

Figure 3. Simulated and measured results for the $V_{DD}$ bump test.

**Margin test:** The margin test is used to examine the ability of the memory to discharge the cell capacitor to the low voltage level, required when a w0 or a w1 is performed to cells on BC or BT, respectively. This test is the complement of the $V_{DD}$ bump test since it examines the opposite functionality of the memory (discharging instead of charging up). Note that access to cells in this memory is done by pulling down the voltage on the word line, because this enables the PMOS pass transistors of the cells. Therefore, the margin test can be performed by not allowing the word line to reach its minimum voltage, thereby limiting access to the cell capacitor.
This test can be represented by the following march test $\{ \updownarrow(w1); WL = WL_{min}; \updownarrow(w0); WL = WL_{nom}; \updownarrow(r0) \}$, where $WL_{nom}$ is the nominal WL voltage and $WL_{min}$ is the minimum voltage the word line reaches during cell access. This test has been simulated and performed on silicon on a 256 Kbit eDRAM at different temperatures. At each T, $WL_{min}$ has been gradually increased from 0 until it results in at least one cell fail. The T related change in $WL_{min}$ $(dWL_{min})$ for the simulated and measured results of this test is shown in Figure 4. T is listed in the figure on the x-axis, while $dWL_{min}$ is listed on the y-axis, so that each point in the figure represents a $(T, dWL_{min})$ pair. Points on a line in the figure are calculated as $dWL_{min} = WL_{min}(T) - WL_{min}(300\text{K})$, where $WL_{min}(T)$ is the word line voltage that results in at least one cell fail for a given T. Points above a line in the figure give $(T, dWL_{min})$ pairs where more than one cell fail is detected, while the memory functions properly for points below a line. The figure shows that measured and simulated results are close.

Figure 4. Simulated and measured results for the margin test.

#### 3 Definition of FFMs

In this section, the FFMs used in this paper are defined. Two basic ingredients are needed to define any fault model: a list of performed memory operations and a list of corresponding deviations in the observed behavior from the expected one. The only functional deviations considered relevant to the faulty behavior are the stored logic value in the cell and the output value of a read operation. Any difference between the observed and expected memory behavior can be denoted by the following notation $\langle S/F/R \rangle$, referred to as a fault primitive (FP).
S describes the sensitizing operation sequence (SOS) that sensitizes the fault; F describes the value of the faulty cell, $F \in \{0,1\}$; and R describes the logic output level of a read operation, $R \in \{0,1,-\}$. The '-' is used in case a write, and not a read, is the operation that sensitizes the fault.

FPs can be classified according to #C, the number of different cells accessed during an SOS, and according to #O, the number of different operations performed in an SOS [vdGoor00]. In this paper, we are only interested in SOS's performed on one memory cell (#C = 1), because we assume that fault effects of opens are localized to a single cell. In addition, we will restrict ourselves to FPs with $\#O \le 1$ (static) and #O = 2 (dynamic).

The notion of FPs makes it possible to give a precise definition of an FFM as understood for memory devices. This definition is presented next.

A functional fault model (FFM) is a non-empty set of fault primitives (FPs).

#### 3.1 Single-cell static FFMs

Single-cell static FFMs describe faults sensitized by performing at most one operation on the faulty cell.

Table 1. All combinations of single-cell static FPs.

| # | S | F | R | FP | Fault model |
|----|-----|---|---|-----------|-------------------|
| 1 | 0 | 1 | - | <0/1/-> | SF<sub>0</sub> |
| 2 | 1 | 0 | - | <1/0/-> | SF<sub>1</sub> |
| 3 | 0w0 | 1 | - | <0w0/1/-> | WDF<sub>0</sub> |
| 4 | 0w1 | 0 | - | <0w1/0/-> | TF↑ |
| 5 | 1w0 | 1 | - | <1w0/1/-> | TF↓ |
| 6 | 1w1 | 0 | - | <1w1/0/-> | WDF<sub>1</sub> |
| 7 | 0r0 | 0 | 1 | <0r0/0/1> | IRF<sub>0</sub> |
| 8 | 0r0 | 1 | 0 | <0r0/1/0> | DRDF<sub>0</sub> |
| 9 | 0r0 | 1 | 1 | <0r0/1/1> | RDF<sub>0</sub> |
| 10 | 1r1 | 0 | 0 | <1r1/0/0> | RDF<sub>1</sub> |
| 11 | 1r1 | 0 | 1 | <1r1/0/1> | DRDF<sub>1</sub> |
| 12 | 1r1 | 1 | 0 | <1r1/1/0> | IRF<sub>1</sub> |

As mentioned earlier, a particular FP is denoted by $\langle S/F/R \rangle$.
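The $\langle S/F/R \rangle$ notation and the #O classification can be made concrete with a small parser. The following is only an illustrative sketch: the names `FaultPrimitive`, `parse_fp`, `num_operations` and `classify` are hypothetical and are not part of the paper or of any tool it uses.

```python
import re
from typing import NamedTuple


class FaultPrimitive(NamedTuple):
    sos: str     # S: sensitizing operation sequence, e.g. "0w1" or "1w0r0"
    faulty: str  # F: value left in the faulty cell, "0" or "1"
    read: str    # R: read output, "0", "1", or "-" when no read sensitizes the fault


def parse_fp(text: str) -> FaultPrimitive:
    """Parse a string such as '<0w1/0/->' into its S, F and R parts."""
    body = text.strip().strip("<>").replace(" ", "")
    parts = body.split("/")
    if len(parts) != 3:
        raise ValueError(f"expected <S/F/R>, got {text!r}")
    sos, faulty, read = parts
    if faulty not in {"0", "1"} or read not in {"0", "1", "-"}:
        raise ValueError(f"invalid F or R value in {text!r}")
    return FaultPrimitive(sos, faulty, read)


def num_operations(sos: str) -> int:
    """#O: count the write (w) and read (r) operations in an SOS."""
    return len(re.findall(r"[wr][01]", sos))


def classify(fp: FaultPrimitive) -> str:
    """Static if #O <= 1, dynamic if #O == 2, otherwise outside the paper's scope."""
    n_ops = num_operations(fp.sos)
    if n_ops <= 1:
        return "static"
    if n_ops == 2:
        return "dynamic"
    return "out of scope"
```

For instance, `classify(parse_fp("<0/1/->"))` yields `"static"` because the SOS contains no operation, while `classify(parse_fp("<1w0r0/1/1>"))` yields `"dynamic"`.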
S describes the value or operation that sensitizes the fault, $S \in \{0, 1, 0w0, 0w1, 1w0, 1w1, 0r0, 1r1\}$ for static FPs. Now that the possible values for S, F and R are known for single-cell static FPs, it is possible to list all detectable FPs using this notation. Table 1 lists all 12 possible combinations of the values, in the $\langle S/F/R \rangle$ notation, that result in FPs. The column 'Fault model' states the FFM defined by the corresponding FP. All FPs listed in Table 1 are targeted in this paper. Below, they are used to define 6 FFMs described in terms of non-empty sets of FPs.

1. **State faults (SF<sub>x</sub>)**: A cell is said to have an SF if the logic value of the cell flips before it is accessed, even if no operation is performed on it<sup>1</sup>. Two types of SF exist: SF<sub>0</sub> = $\{<0/1/->\}$, with FP #1, and SF<sub>1</sub> = $\{<1/0/->\}$, with FP #2.
2. **Transition faults (TF<sub>x</sub>)**: A cell is said to have a TF if it fails to undergo a transition $(0 \rightarrow 1 \text{ or } 1 \rightarrow 0)$ when it is written. Two types of TF exist: TF↑ = $\{<0w1/0/->\}$, with FP #4, and TF↓ = $\{<1w0/1/->\}$, with FP #5.
3. **Read disturb faults (RDF<sub>x</sub>)** [Adams96]: A cell is said to have an RDF if a read operation performed on the cell changes the data in the cell and returns an incorrect value on the output. Two types of RDF exist: RDF<sub>0</sub> = $\{<0r0/1/1>\}$, with FP #9, and RDF<sub>1</sub> = $\{<1r1/0/0>\}$, with FP #10.
4. **Write disturb faults (WDF<sub>x</sub>)**: A cell is said to have a WDF if a non-transition write operation (0w0 or 1w1) causes a transition in the cell. Two types of WDF exist: WDF<sub>0</sub> = $\{<0w0/1/->\}$, with FP #3, and WDF<sub>1</sub> = $\{<1w1/0/->\}$, with FP #6.
5. **Incorrect read faults (IRF<sub>x</sub>)**: A cell is said to have an IRF if a read operation performed on the cell returns the incorrect logic value, while keeping the correct stored value in the cell.
Two types of IRF exist: IRF<sub>0</sub> = $\{<0r0/0/1>\}$, with FP #7, and IRF<sub>1</sub> = $\{<1r1/1/0>\}$, with FP #12.
6. **Deceptive read disturb faults (DRDF<sub>x</sub>)** [Adams96]: A cell is said to have a DRDF if a read operation performed on the cell returns the correct logic value, while it results in changing the contents of the cell. Two types of DRDF exist: DRDF<sub>0</sub> = $\{<0r0/1/0>\}$, with FP #8, and DRDF<sub>1</sub> = $\{<1r1/0/1>\}$, with FP #11.

The 6 FFMs defined above cover the space of all 12 single-cell static FPs of Table 1. Any single-cell static FFM can be represented as the union set of two or more of these 12 FPs. For example, if a defect results in a faulty behavior represented by an incorrect read-1 fault (IRF<sub>1</sub>) and a read-0 disturb fault (RDF<sub>0</sub>), then the corresponding behavior is described as $\{<1r1/1/0>\} \cup \{<0r0/1/1>\}$ = IRF<sub>1</sub> $\cup$ RDF<sub>0</sub>.

<sup>1</sup> It should be noted that the state fault should be understood in the static sense. That is, the cell should flip in the short time period after initialization and before accessing the cell.

#### 3.2 Single-cell dynamic FFMs

FFMs sensitized by performing more than one operation on the faulty memory cell are called *dynamic fault models*. There are 2-operation, 3-operation, ..., dynamic fault models, depending on #O. Here, we restrict ourselves to the analysis of 2-operation dynamic FFMs. There are 30 different single-cell 2-operation dynamic FPs possible [vdGoor00], but in order to reduce simulation time, not all 30 FPs are considered. We choose only to target the 4 dynamic SOS's 0w0r0, 0w1r1, 1w0r0 and 1w1r1 (in short xwyry), because in memory devices, an isolated write operation may not be sufficient to detect a fault since, externally, a cell needs to be read to detect the stored value set during the write.

The 4 targeted SOS's are capable of sensitizing 12 single-cell 2-operation FPs, which are used to define the following 3 FFMs. The names of these FFMs are chosen in such a way that they represent an extension of the single-cell static FFMs defined in Section 3.1.

1. **Dynamic read disturb fault (RDF<sub>xy</sub>)** is a fault whereby an xwyry SOS changes the stored logic value to $\overline{y}$ and gives an incorrect output. Four types of dynamic RDF exist: RDF<sub>00</sub> = $\{<0w0r0/1/1>\}$, RDF<sub>11</sub> = $\{<1w1r1/0/0>\}$, RDF<sub>01</sub> = $\{<0w1r1/0/0>\}$, and RDF<sub>10</sub> = $\{<1w0r0/1/1>\}$.
2. **Dynamic incorrect read fault (IRF<sub>xy</sub>)** is a fault whereby an xwyry SOS returns the logic value $\overline{y}$ while keeping the correct state of the cell. Four types of dynamic IRF exist: IRF<sub>00</sub> = $\{<0w0r0/0/1>\}$, IRF<sub>11</sub> = $\{<1w1r1/1/0>\}$, IRF<sub>01</sub> = $\{<0w1r1/1/0>\}$, and IRF<sub>10</sub> = $\{<1w0r0/0/1>\}$.
3. **Dynamic deceptive read disturb fault (DRDF<sub>xy</sub>)** is a fault whereby an xwyry SOS returns the correct logic value y while destroying the state of the cell. Four types of dynamic DRDF exist: DRDF<sub>00</sub> = $\{<0w0r0/1/0>\}$, DRDF<sub>11</sub> = $\{<1w1r1/0/1>\}$, DRDF<sub>01</sub> = $\{<0w1r1/0/1>\}$, and DRDF<sub>10</sub> = $\{<1w0r0/1/0>\}$.

## 4 Simulated opens

In this section, the opens to be considered for injection and simulation in the eDRAM model are first classified, then the location of each of them is shown on the simulated eDRAM model.

#### 4.1 Definition of opens

Opens represent unwanted impedances on a signal line, which is otherwise supposed to conduct perfectly. For an open, the impedance value is given by $Z_{op}$ and is predominantly resistive (i.e., $C_{op} \approx 0$, making $Z_{op} \approx R_{op}$). The open resistance may take any value in the resistance domain, which gives $0 \leq Z_{op} \leq \infty\ \Omega$.
The fact that opens result in negligible capacitive coupling between the broken nodes has been substantiated in [Henderson91].

By analyzing the electrical circuits of the cell array column, we notice some symmetry in the topology of these circuits. This results in a symmetry in the faulty behavior, which can be used to reduce the number of opens to be simulated and analyzed. The faulty behavior of one open can help deduce the faulty behavior of another symmetrically related open. An open O1 at a given position shows the complementary faulty behavior to an open O2 at another position, if the faulty behavior of O1 is the same as that of O2, with the only difference that all 1s are replaced by 0s, and vice versa. For example, if O1 affects cell x and O2 affects cell y, and O1 forces a 0r0 operation to cause an up transition in cell x, then O2 forces a 1r1 operation to cause a down transition in cell y. Table 2 lists the single-cell FFMs defined in Section 3 and their complementary counterparts.

Table 2. Single-cell FFMs and complementary FFMs.

| Fault model | Complementary | Fault model | Complementary |
|-------------------|-------------------|--------------------|--------------------|
| SF<sub>0</sub> | SF<sub>1</sub> | IRF<sub>00</sub> | IRF<sub>11</sub> |
| IRF<sub>0</sub> | IRF<sub>1</sub> | DRDF<sub>00</sub> | DRDF<sub>11</sub> |
| DRDF<sub>0</sub> | DRDF<sub>1</sub> | RDF<sub>00</sub> | RDF<sub>11</sub> |
| RDF<sub>0</sub> | RDF<sub>1</sub> | IRF<sub>01</sub> | IRF<sub>10</sub> |
| WDF<sub>0</sub> | WDF<sub>1</sub> | DRDF<sub>01</sub> | DRDF<sub>10</sub> |
| TF↑ | TF↓ | RDF<sub>01</sub> | RDF<sub>10</sub> |

Table 3. Simulated and complementary opens within a cell.

| OC on BT | OC on BC | Description |
|----------|----------|--------------------------------------------------------|
| OC1s | OC1c | Pass transistor connection to bit line broken |
| OC2s | OC2c | Pass transistor connection to storage capacitor broken |
| OC3s | OC3c | Cell connection to ground broken |

#### 4.2 Locations of opens

The possible locations of opens within memory cells (OC), along bit lines (OB), on word lines (OW), and within the sense amplifier (OS) are enumerated and provided with a label for future reference.

**Opens within a memory cell (OC)** can occur at any node within the storage cell (see Table 3). The choice has been made to simulate the opens within a cell on the true bit line (BT), and these defects are therefore labeled as OCxs ('s' for simulated). Consequently, the faulty behavior of an open in a cell on the complement bit line (BC), which is labeled as OCxc ('c' for complementary), may be derived from the corresponding simulated one because it shows the complementary faulty behavior.

**Opens along a bit line (OB)** can occur anywhere on the bit line. Figure 1 shows a complete cell array column with BT and BC together with the bit line opens. The bit lines are divided into 10 regions, each of which may contain an open. Every open on BT has its complementary open on BC and vice versa. Thus, only opens present on BT are simulated. Every open on BT is given the name OBxs, while its counterpart on BC is given the name OBxc.

**Opens on a word line (OW)** can only be at one position between the row decoder and the gate of the pass transistor of a memory cell. The behavior of the cell with an open on its word line is the same for every cell on BT and complementary to that on BC. Therefore, only one open is simulated, namely that on WL0, which is called OW1s. The open located on WL1 is called OW1c.

**Opens within the sense amplifier (OS)** can occur at any node within the sense amplifier.
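The complementary relation between FFMs (Table 2), which is what allows only the opens on BT to be simulated, amounts to swapping every 0 with a 1 and every up transition with a down transition. A minimal sketch of that mapping, assuming FFM names are written as plain strings such as `"RDF01"` or `"TF↑"` (the function name `complement_ffm` is hypothetical):

```python
# Translation table: 0 <-> 1 and up arrow <-> down arrow.
_SWAP = str.maketrans("01↑↓", "10↓↑")


def complement_ffm(name: str) -> str:
    """Return the complementary FFM name, e.g. 'RDF01' -> 'RDF10'."""
    return name.translate(_SWAP)


# The pairs of Table 2, written as plain strings.
TABLE_2 = {
    "SF0": "SF1", "IRF0": "IRF1", "DRDF0": "DRDF1",
    "RDF0": "RDF1", "WDF0": "WDF1", "TF↑": "TF↓",
    "IRF00": "IRF11", "DRDF00": "DRDF11", "RDF00": "RDF11",
    "IRF01": "IRF10", "DRDF01": "DRDF10", "RDF01": "RDF10",
}

# The mapping reproduces Table 2 in both directions.
for ffm, comp in TABLE_2.items():
    assert complement_ffm(ffm) == comp
    assert complement_ffm(comp) == ffm
```

Applying the function twice returns the original name, which reflects the symmetry between an open on BT and its counterpart on BC.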
Figure 5 shows the internal structure of the sense amplifier with four MOS devices (M1, M2, M3 and M4), a power supply $(V_{DD})$ and an activation node (Active). The figure also shows the 8 possible opens classified as simulated (s), and 6 other opens classified as complementary (c). Opens OS1s through OS6s all have their complementary opens OS1c through OS6c, while the two remaining opens, OS7s and OS8s, have no complementary counterparts. The possible sense amplifier opens located on the bit line are not included here because they have already been treated as a part of the bit line opens.

Figure 5. Sense amplifier with all possible locations of opens.

## 5 Simulation methodology

The behavior of the eDRAM is studied after injecting and simulating each of the opens defined in Section 4.2. For each defect, simulations are performed at three different temperatures (300 K, 360 K and 420 K) to evaluate the gradual development of T related effects. The analysis considers open resistances within the range $10\ \Omega \leq R_{op} \leq 10\ \text{M}\Omega$ on a logarithmic scale using 5 points per decade, in addition to $R_{op} = \infty\ \Omega$. Each injected open in the memory model creates floating nodes, the voltage of which is varied between $V_{DD}$ and GND on a linear scale using 10 points. When an interesting faulty behavior is observed, more detailed simulations are performed.

Determining the floating node resulting from each injected open depends on the type of the open. For opens along bit lines, the floating node is the node connected to the column access devices, not the one connected to the precharge devices, since the latter is precharged to a known voltage at the beginning of each operation. The floating node for opens within memory cells is taken to be the node connected to the cell capacitor. For opens on word lines, the floating node is the node connected to the memory cell.
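The parameter sweep described in this section can be sketched as a pair of grids. This is only an illustration under stated assumptions: both endpoints of each range are taken to be included, the supply voltage is left as a parameter because the paper does not give its value here, and the names `rop_grid`, `uinit_grid` and `TEMPERATURES_K` are hypothetical.

```python
import math

TEMPERATURES_K = (300, 360, 420)  # simulation temperatures from the text


def rop_grid(lo_ohm: float = 10.0, hi_ohm: float = 10e6, per_decade: int = 5) -> list:
    """Logarithmic R_op grid from lo_ohm to hi_ohm, followed by an infinite open.

    Assumption: both endpoints are included, so 6 decades at 5 points
    per decade give 31 finite values.
    """
    decades = round(math.log10(hi_ohm / lo_ohm))
    steps = decades * per_decade
    points = [lo_ohm * 10 ** (k / per_decade) for k in range(steps + 1)]
    points.append(math.inf)  # the R_op = infinity case
    return points


def uinit_grid(vdd: float, n_points: int = 10) -> list:
    """Linear U_init grid from GND (0 V) to vdd, endpoints included (assumption)."""
    return [vdd * k / (n_points - 1) for k in range(n_points)]
```

Under these assumptions each open is simulated at 3 temperatures, 32 resistance values and 10 initial voltages, that is 960 grid points, before any refinement around interesting behavior.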
Finally, the floating node for opens within the sense amplifier is the one between the open and the MOS devices of the sense amplifier (M1, M2, M3 or M4).

For each value of the open resistance $(R_{op})$ and of the initial floating node voltage $(U_{init})$, all the SOS's associated with the targeted FPs defined in Section 3 are performed and inspected for proper functionality. As a result, the faulty behavior resulting from the analysis of opens is represented as regions in the $(U_{init}, R_{op})$ plane. Each region contains a number of sensitized FPs that describe the FFM of the memory in this region.

#### 5.1 Dirty operations

The analysis shows that performing some SOS's results in a special type of faulty behavior where the SOS leaves behind faulty voltage levels, not only within the cell and on the output, but also on bit lines, word lines, data lines, etc. Since the FP description only considers voltage deviations in the cell (F) and on the output (R), it is not possible to distinguish between an SOS that results in a faulty bit line voltage, for example, and an SOS that leaves behind a proper voltage on the bit lines. In order to identify these SOS's, we give them the title *dirty SOS's*, since they leave behind a 'dirty' trail of faulty voltages that cannot be described by regular FPs. Dirty SOS's have important test consequences, since they may require new types of detection conditions that are specific to each observed failure (depending on the type of faulty voltage trail the dirty SOS's leave behind).

Dirty SOS's can be identified in the analysis by inspecting the consistency of the faulty behavior associated with a given fault region of a defect. For example, TF↑ is a static fault that is sensitized by a failing 0w1 operation. This fault can be detected by performing the sequence 0w1r1, where the r1 operation is performed to detect the failing 0w1 operation. At the same time, the sequence 0w1r1 sensitizes and detects the dynamic fault RDF<sub>01</sub>.
As a result, to be able to detect a sensitized TF↑ in a given fault region, the region should also contain a sensitized RDF<sub>01</sub>. Regions with failing static SOS's, but no failing dynamic SOS's, are called strict static fault regions. These regions need to be identified since they are important from a testing point of view and indicate the presence of a dirty SOS.

#### 5.2 Simulation example

As an example, the results of the analysis performed on OS4s (which describes an open between BT and the drain of M3, see Figure 5) at 300 K and 420 K are given in Figure 6. The figure shows the observed faulty behavior in terms of fault regions in the $(U_{init}, R_{op})$ plane, with a number of different fault regions for different combinations of $U_{init}$ and $R_{op}$. Each fault region contains a number of FPs, each of which describes a failing SOS with the associated faulty behavior. If a region contains more than one FP, it means that more than one SOS is failing at the same time. As a result, if a test detects *any one* of the failing SOS's in a given fault region, then the test covers that region.

Figure 6. Fault analysis of the open OS4s in the $(U_{init},R_{op})$ plane at (a) T=300 K, and (b) T=420 K.

It is clear in the figure that, for this defect, $U_{init}$ does not have any impact on the faulty behavior. This is expected since the floating nodes for opens within the sense amplifier have a very small capacitance (1 to 2 fF) relative to the capacitance of the bit lines (150 fF).

At 300 K, three fault regions are present:

- A. Fault region TF↑ ∪ RDF<sub>01</sub> ∪ DRDF<sub>11</sub> ∪ DRDF<sub>1</sub> ∪ WDF<sub>1</sub>
- B. Fault region TF↑ ∪ RDF<sub>01</sub> ∪ DRDF<sub>11</sub>
- C. Fault region TF↑

Region A has five FPs, all of which represent problems with reading or writing a logic 1. This is caused by the inability of the sense amplifier to properly set a logic 1 on BT since its connection to $V_{DD}$ is broken.
When $R_{op}$ decreases below 1.2 M$\Omega$, Region B starts where it is possible to restore a 1 into the cell by performing either 1w1 or 1r1. Still, the 1w1 degrades the stored 1, thereby resulting in the failure of the dynamic 1w1r1 operation of DRDF<sub>11</sub>.

When $R_{op}$ decreases below about 260 k$\Omega$, Region C starts where only TF↑ = $\{<0w1/0/->\}$ remains. It is interesting to note that although the static operation sequence 0w1 fails, the dynamic operation sequence 0w1r1 succeeds in reading a 1 from a cell containing a 0. Simulations show that the open results not only in the failure to write a 1 into the cell, but also in the failure to set a high enough voltage in the reference cell. As a result, the level that is considered as 1 at sense time is also reduced, which leads to the success of a subsequent read operation to detect a 1 from the cell. This fault region remains until $R_{op}$ decreases below 20 k$\Omega$.

Region C is an important region from a testing point of view, since the open results in the failure of the *static* 0w1 of TF↑, but not in the failure of the *dynamic* 0w1r1 of RDF<sub>01</sub>. The analysis shows that the static 0w1 operation is a dirty SOS since, in addition to the incorrect setting of the cell voltage, it results in an incorrect setting of the reference cell voltage. This incorrect setting of the reference cell voltage makes Region C particularly difficult to detect. For example, the test $\{\uparrow(w0, w1); \uparrow(r1)\}$, which detects TF↑ sensitized by a clean 0w1 operation, fails to detect the failure in Region C. In Section 7, a new detection condition is presented to detect Region C.

At 420 K, only two fault regions are present:

- A. Fault region TF↑ ∪ RDF<sub>01</sub> ∪ RDF<sub>11</sub>
- B. Fault region TF↑ ∪ RDF<sub>01</sub>

At 420 K, we are only interested in the *change* in the faulty behavior with respect to that at 300 K.
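The notion of a strict static fault region from Section 5.1 can be checked mechanically against the regions listed above for OS4s. The sketch below is illustrative only: the region contents are copied from the two lists above, `TF_up` stands for TF↑, a dynamic FFM is recognized purely by its two-digit subscript (an assumption that holds for the FFM names used in this paper), and the function names are hypothetical.

```python
import re

# Fault regions of OS4s as listed in the text ("TF_up" stands for TF with an up arrow).
REGIONS_300K = {
    "A": {"TF_up", "RDF01", "DRDF11", "DRDF1", "WDF1"},
    "B": {"TF_up", "RDF01", "DRDF11"},
    "C": {"TF_up"},
}
REGIONS_420K = {
    "A": {"TF_up", "RDF01", "RDF11"},
    "B": {"TF_up", "RDF01"},
}


def is_dynamic(ffm: str) -> bool:
    """Dynamic FFMs in this paper carry a two-digit subscript, e.g. RDF01."""
    return re.search(r"[01]{2}$", ffm) is not None


def strict_static_regions(regions: dict) -> list:
    """Names of non-empty regions that contain static FFMs only."""
    return [
        name
        for name, ffms in regions.items()
        if ffms and not any(is_dynamic(f) for f in ffms)
    ]
```

With these inputs, `strict_static_regions(REGIONS_300K)` returns `["C"]`, while the same call on `REGIONS_420K` returns an empty list, since both regions at 420 K contain RDF<sub>01</sub>.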
Two main differences can be seen:

- The first difference is the absence of the FPs WDF<sub>1</sub> = $\{<1w1/0/->\}$ and DRDF<sub>1</sub> = $\{<1r1/0/1>\}$ from the behavior at higher T. At 300 K, these two FPs fail due to the inability of the sense amplifier to restore a logic 1 into the cell. Yet, the resulting stored 0 is weak, which means that the two FPs fail by a very narrow voltage margin. Their absence at higher T is due to the increased leakage current to the cell, which pulls up the weak 0 to a weak 1, thereby preventing their failure, again by a narrow margin.
- Another, more important, change (from a testing point of view) in the faulty behavior at 420 K is the absence of a strict static behavior (i.e., Region C observed at 300 K is absent). This change in behavior is due to the fact that, at 420 K, RDF<sub>01</sub> extends across the whole faulty space of R<sub>op</sub>. In contrast, at 300 K, RDF<sub>01</sub> is not observed in Region C due to the low voltage present in the reference cell. The voltage in the reference cells is the mid-point voltage that specifies the voltage level between a 0 and a 1. At 420 K, RDF<sub>01</sub> fails because of the increased leakage current to the reference cell, which makes it more difficult to sense a 1.

The presence of RDF<sub>01</sub> along with TF↑ at 420 K makes it possible to detect the faulty behavior of Region B using the dynamic sequence 0w1r1. This is in contrast to Region C at 300 K, where more complex detection conditions need to be introduced to detect this open, thereby increasing the complexity of the needed memory test.

Table 4. Observed faults for opens within the sense amplifier.

| Open | Simulated | Complementary |
|----------|-----------|---------------|
| OS1 | RDF<sub>0</sub>, RDF<sub>00</sub>, RDF<sub>01</sub> | RDF<sub>1</sub>, RDF<sub>11</sub>, RDF<sub>10</sub> |
| OS2 | TF↓, WDF<sub>0</sub>, RDF<sub>0</sub>, RDF<sub>00</sub>, RDF<sub>10</sub> | TF↑, WDF<sub>1</sub>, RDF<sub>1</sub>, RDF<sub>11</sub>, RDF<sub>01</sub> |
| OS3 | RDF<sub>1</sub> | RDF<sub>0</sub> |
| OS4, OS6 | TF↑, WDF<sub>1</sub>, DRDF<sub>1</sub>, RDF<sub>01</sub>, DRDF<sub>11</sub> | TF↓, WDF<sub>0</sub>, DRDF<sub>0</sub>, RDF<sub>10</sub>, DRDF<sub>00</sub> |
| OS5 | RDF<sub>0</sub>, RDF<sub>00</sub>, DRDF<sub>10</sub> | RDF<sub>1</sub>, RDF<sub>11</sub>, DRDF<sub>01</sub> |
| OS7 | RDF<sub>0</sub>, IRF<sub>0</sub>, DRDF<sub>00</sub>, DRDF<sub>10</sub> | RDF<sub>1</sub>, IRF<sub>1</sub>, DRDF<sub>11</sub>, DRDF<sub>01</sub> |
| OS8 | TF↑, WDF<sub>1</sub>, DRDF<sub>1</sub>, DRDF<sub>01</sub>, DRDF<sub>11</sub> | TF↓, WDF<sub>0</sub>, DRDF<sub>0</sub>, DRDF<sub>10</sub>, DRDF<sub>00</sub> |

## 6 Discussing simulation results

All opens defined in Section 4.2 have been injected, simulated and analyzed, using the $(U_{init}, R_{op})$ plane. The results of opens within memory cells, along bit lines and on word lines have already been published [Al-Ars01]. The results of opens within the sense amplifier are listed in Table 4. The first column in the table specifies the analyzed defects (in case a number of defects sensitize the same FFMs they are listed together), while the second and third columns list the FFMs detected for the simulated and complementary instances of these defects, respectively. The table shows that the opens cause both static and dynamic FFMs.

Table 5 lists all simulated opens and identifies the change in the faulty behavior as T increases.
The first column gives the name of the open; the second and third columns list the change in faulty behavior at 360 K and 420 K, respectively, as compared to the behavior at 300 K. An FFM with a minus sign (-) means that the FFM is removed from the behavior, while a plus sign (+) means that the FFM is added to the behavior. An equal sign entry (=) means that no change takes place. The table shows that, for opens within cells and on bit lines, the faulty behavior of the memory is mostly the same at 300 K and 360 K, while changes take place at 420 K. On the other hand, for opens in the sense amplifier, the faulty behavior is mostly the same at 360 K and 420 K but different at 300 K. Also note that, as T increases, most of the *removed* FFMs in the table describe a fault in the F (stored voltage in the faulty cell) component of an FP = $\langle S/F/R \rangle$, while most of the *added* FFMs describe a fault in the R (read value) component of an FP that is *detected* on the output. This result is in agreement with experimental studies that show an increase in *detected* faults with rising T [vdGoor99].

**Table 5.** Changes in the faulty behavior with increasing T.

| Open | 360 K | 420 K |
|------|-------|-------|
| OC1s | = | +RDF<sub>01</sub>, -DRDF<sub>01</sub>, -IRF<sub>00</sub> |
| OC2-3s | = | = |
| OB1s | = | +RDF<sub>01</sub>, +RDF<sub>11</sub> |
| OB2-3s | = | = |
| OB4s | = | -WDF<sub>1</sub>, -DRDF<sub>1</sub> |
| OB5-6s | = | -RDF<sub>0</sub>, +DRDF<sub>0</sub> |
| OB7-8s | = | |
| OB9s | = | +DRDF<sub>1</sub>, +DRDF<sub>01</sub>, +DRDF<sub>11</sub>, -DRDF<sub>00</sub>, -DRDF<sub>10</sub> |
| OB10s | +TF↑ | +TF↑, +RDF<sub>01</sub>, +DRDF<sub>01</sub> |
| OW1s | = | = |
| OS1s | = | = |
| OS2s | +RDF<sub>01</sub>, +RDF<sub>11</sub> | +RDF<sub>01</sub>, +RDF<sub>11</sub> |
| OS3s | = | = |
| OS4s | -DRDF<sub>1</sub>, -WDF<sub>1</sub> | -DRDF<sub>1</sub>, -WDF<sub>1</sub> |
| OS5s | = | = |
| OS6s | -DRDF<sub>1</sub>, -WDF<sub>1</sub> | -DRDF<sub>1</sub>, -WDF<sub>1</sub> |
| OS7s | = | |
| OS8s | -DRDF<sub>1</sub>, -WDF<sub>1</sub> | +RDF<sub>01</sub>, -DRDF<sub>1</sub>, -WDF<sub>1</sub>, -DRDF<sub>11</sub> |

Generally, simulations have shown that T has the largest impact on the faulty behavior when $R_{op}$ is relatively high. This means that the T related changes in the faulty behavior, as described in Table 5, are not distributed uniformly across simulated $R_{op}$ values; they become more common as $R_{op}$ increases. The reason is that it is more difficult for the memory to correct voltage imperfections induced by T when the open resistance is high than when it is low. Therefore, the faulty behavior resulting from the combination of high $R_{op}$ and high T is particularly easy to detect. At the same time, it is observed that for most simulated opens, the region of proper operation shrinks as T increases, making the faulty behavior more prominent. This is, however, not the case for the analysis of OS4 shown in Figure 6, where the region of proper operation does not change with T.

Summarizing, the following four mechanisms explain the effects of increased T on the observed change in the faulty behavior:

1. Increased leakage current into the memory cells weakens the voltage level of a stored 0.
2. Increased leakage current into the reference cells makes it more difficult to read a stored 1.
3. Decreased drain current in the pass transistor of the memory cell in particular, and in other transistors in general. This reduces the speed of writing logic 0 and logic 1 into the cell.
4. General T related effects distributed over more than one component in the memory. This mechanism is said to take effect if increasing T of an isolated component in the memory does not account for the observed behavioral change.

**Table 6.** Strictly static fault regions resulting from opens.

| Open | Simulated | Complementary | Temperatures |
|------|-----------|---------------|--------------|
| OC1-3 | TF↑ | TF↓ | 300 K, 360 K and 420 K |
| OB1 | RDF<sub>1</sub> | RDF<sub>0</sub> | 300 K, 360 K and 420 K |
| OB9 | RDF<sub>0</sub> | RDF<sub>1</sub> | 300 K, 360 K and 420 K |
| OS4 | TF↑ | TF↓ | 300 K, 360 K |

From a testing point of view, it is important to indicate which FFMs exist in strict static or strict dynamic fault regions in the $(U_{init}, R_{op})$ plane. All dynamic FFMs defined in Section 3.2 take place in strict dynamic regions [Al-Ars01]. In addition, some of the static FFMs defined in Section 3.1 take place in strict static regions. These static FFMs are listed in Table 6. The table shows that four static FFMs exist in strict static fault regions: TF $\uparrow$, TF $\downarrow$, RDF<sub>0</sub> and RDF<sub>1</sub>. Table 6 also shows that the strict static fault regions (TF $\uparrow$ and TF $\downarrow$) resulting from OS4 are only observed at 300 K and 360 K and not at 420 K, where dynamic faulty behavior is present in all regions (see also Figure 6).

## 7 Test implications

The fault analysis performed on the cell array column of the *e*DRAM shows that all defined static and targeted dynamic FFMs do take place.
Moreover, when certain SOS's are performed on a memory cell, some defects result in a faulty behavior that contains only dynamic or only static fault models. In order to ensure that a particular memory cell array is not faulty, tests should be developed to sensitize and detect all static and dynamic FFMs resulting from the analysis. First, detection conditions are derived to detect all types of observed faulty behavior. Then, the impact of T on these detection conditions is discussed.

From a testing point of view, the observed FFMs can be classified into three *categories*, depending on the fault regions that contain them. Each category has its own testing requirement.

1. Fault regions that contain both static and dynamic FFMs. Here, the FFMs are observed when static as well as dynamic sequences are performed. For these regions, it is enough to sensitize and detect the static FFMs. Many tests have been proposed to detect this type of behavior, such as MATS+, March C- [vdGoor98] and March LA [vdGoor97].
2. Fault regions with strict dynamic FFMs (no static FFMs are observed). In order to detect these FFMs, it is not enough to detect the static FFMs, and therefore dynamic sequences should be used. A test to detect the observed strict dynamic FFMs has been published in [Al-Ars01].
3. Fault regions with strict static FFMs (no dynamic FFMs are observed). Here, static FFMs are sensitized; however, subsequent operations hide their fault effect (for example, it is possible to sensitize TF $\uparrow$ by 0w1 but not to detect it by 0w1r1). For these regions, the SOS's causing the faulty behavior are dirty SOS's, which need special detection conditions to detect the faulty behavior.

The observed strict static fault regions as described by Category 3 are listed in Table 6. For the FFMs contained in each of these regions, special detection conditions should be generated to detect them.
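
The three categories above amount to a simple decision on which kinds of FFMs a fault region contains. The following is a minimal Python sketch of that decision, assuming each fault region is described by two collections of FFM names. The names `Category`, `REQUIREMENT` and `classify_region`, as well as the FFM label strings, are illustrative and do not come from the paper.

```python
from enum import Enum


class Category(Enum):
    """Testing categories of a fault region (numbering follows the list above)."""
    MIXED = 1           # both static and dynamic FFMs are observed
    STRICT_DYNAMIC = 2  # only dynamic FFMs are observed
    STRICT_STATIC = 3   # only static FFMs are observed


# Testing requirement per category, paraphrased from the text.
REQUIREMENT = {
    Category.MIXED: "sensitize and detect the static FFMs",
    Category.STRICT_DYNAMIC: "apply dynamic sequences",
    Category.STRICT_STATIC: "apply special detection conditions",
}


def classify_region(static_ffms, dynamic_ffms):
    """Return the testing category of a fault region.

    static_ffms / dynamic_ffms are iterables of FFM names observed in the
    region. How an FFM is recognized as static or dynamic is left to the
    caller (hypothetical interface).
    """
    static_ffms, dynamic_ffms = set(static_ffms), set(dynamic_ffms)
    if not static_ffms and not dynamic_ffms:
        raise ValueError("no FFMs observed: region of proper operation")
    if static_ffms and dynamic_ffms:
        return Category.MIXED
    if dynamic_ffms:
        return Category.STRICT_DYNAMIC
    return Category.STRICT_STATIC


if __name__ == "__main__":
    # A region that only shows an up-transition fault is strictly static.
    region = classify_region({"TF_up"}, set())
    print(region.name, "->", REQUIREMENT[region])
```

Keeping the requirement text in a separate table makes it easy to attach a concrete march test to each category later on.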
In the following, the four static regions listed in Table 6 are analyzed in order to provide each with an appropriate detection condition.

1. Fault region caused by defects OC1-OC3, resulting in TF $\uparrow$ = $\{<0w1/0/->\}$ (simulated) and TF $\downarrow$ (complementary). A w1 operation in this region fails to set a 1 into the cell. However, the read in the sequence 0w1r1 detects the expected logic 1 and charges the cell up to 1. Simulations have been performed in an attempt to detect this fault region. They showed that the simulated fault region can be detected using $\updownarrow(...0, w1, w1, w0, r0, ...)$. This detection condition starts with two w1 operations that charge up the cell to a logic 1. Then the sequence w0r0 fails and detects a 1 on the output. The complementary fault region can be detected using the detection condition $\updownarrow(...1, w0, w0, w1, r1, ...)$.
2. Fault region caused by defect OB1 (see Figure 1), resulting in RDF<sub>1</sub> = $\{<1r1/0/0>\}$ (simulated) and RDF<sub>0</sub> (complementary). This fault region only has a static read disturb fault, which requires the cell to be initialized to a 1 by performing w1. Yet, the sequence w1r1 does not result in a fault, because the defect (OB1) prevents precharging the bit lines to $V_{DD}$ during the precharge cycle. Therefore, the read in the sequence w1r1 does not result in any fault, since the w1 operation preconditions the bit lines to a voltage that ensures a properly functional read operation. In order to detect this fault region, it is necessary to change the preconditioned state of the bit lines after the first write operation, such that a subsequent read operation can detect the fault. This can be done by performing a 0w0 or a 1w0 operation to a different cell on the same bit line part, or by performing a 1w1 or a 0w1 operation to a cell on the complementary bit line; this has to be done between the w1 and the r1 of the sequence w1r1.
   Because of address scrambling, consecutive fast-y addresses alternate between cells on BT and cells on BC. A march test that detects the simulated RDF<sub>1</sub> should perform $\updownarrow(..., w1)$; $\updownarrow(r1, w1, ...)$, and $\updownarrow(..., w0)$; $\updownarrow(r0, w0, ...)$ for RDF<sub>0</sub>, where the addressing order should be in the fast-y direction.
3. Fault region caused by defect OB9 (see Figure 1), resulting in RDF<sub>0</sub> = $\{<0r0/1/1>\}$ (simulated) and RDF<sub>1</sub> (complementary). In this fault region, the defect prevents the sense amplifier from amplifying the read signal, which results in a weak signal that is unable to modify the value of the output buffer. The simulated sequence w0r0 succeeds because the w0 operation sets the output buffer to 0, which is kept when the r0 is performed. Therefore, in order to detect this fault region it is important to set the output buffer to 1 using a w1 (to a different cell on the same bit line part) or a w0 (to a cell on the complementary bit line part) after performing the w0 operation to the cell and before performing the r0 operation. A march test that detects this fault region should perform $\updownarrow(..., w0)$; $\updownarrow(r0, w0, ...)$ (simulated fault) and $\updownarrow(..., w1)$; $\updownarrow(r1, w1, ...)$ (complementary fault), where the addressing order should be in the fast-y direction.
4. Region C caused by defect OS4 at 300 K (see Figure 6), resulting in TF $\uparrow$ (simulated) and TF $\downarrow$ (complementary). This fault region is the result of an incorrect setting of the voltage level within the reference cells. By the end of the transition write (0w1) operation, the memory cell contains 1.3 V, which represents a logic 0; however, it is mistaken for a logic 1 because the reference cells contain an even lower voltage of 0.9 V.
   The reduced reference cell voltage results from the failure of the sense amplifier to amplify the written 1, which is caused by the open at the source of the pull-up transistor M3 (see Figure 5). This can be corrected by inserting a 0w0 or a 0w1 (performed on a different cell on the same bit line part), or a 0w1 or a 1w1 (performed on a cell on the complementary bit line part) after performing the 0w1 operation to the cell and before performing the r1 operation. A march test that detects this fault region should perform $\uparrow(..., w0, w1)$ ; w0, ...) (complementary fault), where the addressing order should be in the fast-y direction.

In order to detect all strict static fault regions, a march test has to be performed on the memory that satisfies all the specific detection conditions described in cases 1–4 above. However, since the static region caused by OS4 is not observed at $T = 420$ K, it is not necessary to perform the associated detection condition at this T. r0), which has a complexity of 14n. This means that the reduction in test time due to elevated T is $\frac{16n-14n}{16n} \approx 13\%$.

### 8 Conclusions

In this paper, a simulation based analysis has been described to study the impact of T on the faulty behavior of *e*DRAMs for open defects. The analysis identified a number of T related mechanisms that increase the effectiveness of testing at high T. The paper also presented new types of detection conditions for memories, derived new memory tests, and evaluated the impact of T on the efficiency of the derived tests.

#### References

- [Adams96] R.D. Adams and E.S. Cooley, "Analysis of a Deceptive Destructive Read Memory Fault Model and Recommended Testing," in Proc. IEEE North Atlantic Test Workshop, 1996.
- [Al-Ars01] Z. Al-Ars and A.J. van de Goor, "Static and Dynamic Behavior of Memory Cell Array Opens and Shorts in Embedded DRAMs," in Proc. Design, Automation and Test in Europe, 2001, pp. 496–503.
- [Falter00] T. Falter and D. Richter, "Overview of Status and Challenges of System Testing on Chip with Embedded DRAMs," in Solid-State Electronics, no. 44, 2000, pp. 761–766.
- [Henderson91] C.L. Henderson, J.M. Soden and C.F. Hawkins, "The Behavior and Testing Implications of CMOS IC Logic Gate Open Circuits," in Proc. IEEE Int'l Test Conf., 1991, pp. 302–310.
- [McConnell98] R. McConnell, U. Möller and D. Richter, "How we test Siemens' Embedded DRAM Cores," in Proc. IEEE Int'l Test Conf., 1998, pp. 1120–1125.
- [vdGoor97] A.J. van de Goor et al., "March LA: A Test for Linked Memory Faults," in Proc. European Design and Test Conf., 1999, p. 627.
- [vdGoor98] A.J. van de Goor, Testing Semiconductor Memories, Theory and Practice, ComTex Publishing, Gouda, The Netherlands, 1998, http://ce.et.tudelft.nl/~vdgoor/
- [vdGoor99] A.J. van de Goor and J. de Neef, "Industrial Evaluation of DRAM Tests," in Proc. Design, Automation and Test in Europe, 1999, pp. 623–630.
- [vdGoor00] A.J. van de Goor and Z. Al-Ars, "Functional Memory Faults: A Formal Notation and a Taxonomy," in Proc. IEEE VLSI Test Symp., 2000, pp. 281–289.
- [Vollrath00] J. Vollrath, "Tutorial: Synchronous Dynamic Memory Test Construction, A Field Approach," in Proc. IEEE Int'l Workshop Memory Technology, Design and Testing, 2000.