# **CHAPTER 4**

# HYBRID MULTI-STORAGE SRAM CELL WITH WRITE ASSIST CIRCUIT

#### 4.1 Introduction

The rapid increase in the number of connected devices in the Internet of Things (IoT) has resulted in the generation of enormous amounts of data [1]-[6]. However, collecting, storing, and analyzing this growing volume of data remains one of the biggest challenges of IoT [7]. The collected raw sensor data needs to be modeled and reasoned over to convert it into useful information [8]. In addition, IoT requires the integration of data from a large number of heterogeneous sensors [9]. In this regard, context-aware computing has proven successful in making sense of large volumes of data collected from different sources. Context-aware computing identifies the different contexts, where a context refers to the information linked to a single sensor [8]. Since IoT consists of a large number of sensors that interact autonomously, a device frequently needs to switch contexts to handle data from different sensors.

The traditional memory hierarchy suffers from long latencies during multi-context computing, since each context switch requires reloading the on-chip memory with the instructions and data of the new context [10]. In a conventional memory hierarchy, data and code often reside in external non-volatile memory and are brought into on-chip memory during processing to speed up execution. However, the energy cost of transporting one bit of data from an off-chip memory such as DRAM/Flash is 1000x higher than that of transferring one bit from a local SRAM cache or register file [11]. Therefore, frequent context switches become costly for energy-limited IoT devices, which are mainly powered by a battery or by energy harvested from ambient sources.

Furthermore, the devices used in IoT applications have a short active period followed by a long idle period [12]. The most efficient way to conserve power is to turn OFF the blocks when not in use.
This way of computing is referred to as 'Normally-OFF' [13]. Although normally-off computing yields significant power savings during the idle period, it results in the loss of the current system state. Consequently, an energy-intensive boot process is required for system initialization, which becomes a bottleneck for energy-constrained IoT devices [14]. Hence, the state of the system must be preserved before powering down to ensure a quick wake-up.

To address the aforementioned issues, an emerging non-volatile device, the Magnetic Tunnel Junction (MTJ), is a promising candidate for on-chip memory. It offers high integration density, scalability, unlimited endurance, and non-volatility [15]-[17]. Therefore, in this work, we design a hybrid multi-storage memory cell by embedding MTJ devices into the SRAM cell. The proposed cell has a 6T SRAM core with N pairs of MTJ devices to store N bits of data, where each bit corresponds to a different context. Although the memory cell can simultaneously store the data for multiple contexts, only one context can be accessed at a time. The data is accessed through the SRAM cell; therefore, the proposed design hides the long latency of the MTJ.

Despite these advantages, the MTJ consumes a large amount of energy when data is stored into it [17]–[20]. The main reason for the high store energy is the stochastic behavior of MTJ switching, which results in a variable switching time [21]–[24]. To avoid store failures due to fluctuations in switching time, the pulse width for the store operation is kept at 4x the average switching time [25]. This reduces the probability of error, but at the expense of increased power consumption. To address this issue, we also propose an assist circuit, added to every cell, that asynchronously terminates the store operation once it is complete.
# 4.2 Hybrid Multi-Storage SRAM Cell

# 4.2.1 Structure of hybrid multi-storage SRAM cell

The proposed hybrid multi-storage SRAM cell modifies the conventional 6T SRAM cell to incorporate non-volatility using MTJs, as shown in Figure 4.1. In this work, we design a multi-storage SRAM cell capable of storing four bits in four MTJ pairs (MTJ1a-MTJ4a and MTJ1b-MTJ4b).

**Figure 4.1:** Proposed hybrid multi-storage SRAM cell

The proposed memory cell is constructed such that the value stored by the SRAM cell can be copied to any one of the MTJ pairs within the cell. The MTJ pairs are independent of each other, and each is separated from the main cell by an isolation transistor (XE1-XE4). The data stored in one MTJ pair corresponds to a single context. Since the proposed cell has multiple MTJ pairs, it can store instructions and configuration for different contexts. In this regard, the data stored by MTJ1a and MTJ1b corresponds to the content of context 1. Similarly, MTJ2a and MTJ2b correspond to the content of context 2, and so on.

The proposed cell has only one context active at a time, while the rest of the contexts remain cached in the MTJ pairs. Therefore, an application with N contexts can be handled efficiently using this hybrid memory cell architecture, with one active context in the SRAM cell and N-1 cached contexts in the MTJ pairs. To access the active context, read/write operations identical to those of a traditional SRAM are performed. To manage cached contexts, two new operations, load and store, are introduced.

- Load context: Copy a context from the MTJ pair into the cell internal nodes q and qc.
- Store context: Save the active context to the MTJ pair.

However, before any cached context is loaded, the content of the active context is first stored so that it can be restored and executed from the same point later. The process of storing one context and loading another is referred to as a context switch.
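
The load, store, and switch operations defined above can be summarised in a small behavioural model. What follows is a minimal logic-level sketch, not a description of the circuit: the class name `MultiContextCell`, its method names, and the 0-based context index are hypothetical, and the encoding of a stored bit (logic '1' as the a-side MTJ in LRS and the b-side MTJ in HRS) is assumed from the store example in sub-section 4.2.2.2. Timing, currents, and energy are ignored.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

LRS, HRS = "LRS", "HRS"  # low / high resistance state of an MTJ


@dataclass
class MultiContextCell:
    """Hypothetical logic-level model of one cell with N cached contexts."""

    n_contexts: int = 4
    q: int = 0        # SRAM storage node; qc is always its complement
    active: int = 0   # 0-based index of the active context (assumption)
    mtj: List[Tuple[str, str]] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Assumed initial condition: every pair holds logic 0,
        # i.e. (MTJxa, MTJxb) = (HRS, LRS).
        if not self.mtj:
            self.mtj = [(HRS, LRS)] * self.n_contexts

    def write(self, bit: int) -> None:
        """Normal SRAM write to the active context."""
        self.q = bit

    def read(self) -> int:
        """Normal SRAM read of the active context."""
        return self.q

    def store(self) -> None:
        """Copy node q into the MTJ pair of the active context."""
        self.mtj[self.active] = (LRS, HRS) if self.q else (HRS, LRS)

    def load(self, ctx: int) -> None:
        """Copy the MTJ pair of context ctx into node q and make it active."""
        a_side, _ = self.mtj[ctx]
        self.q = 1 if a_side == LRS else 0
        self.active = ctx

    def switch(self, ctx: int) -> None:
        """Context switch: store the active context, then load the next one."""
        self.store()
        self.load(ctx)


cell = MultiContextCell()
cell.write(1)    # active context (index 0) now holds 1
cell.switch(1)   # context 0 is stored, context 1 is loaded
after_switch = cell.read()
cell.switch(0)   # context 1 is stored, context 0 is reloaded
restored = cell.read()
print(after_switch, restored)
```

The model makes the ordering constraint explicit: `switch` always stores before it loads, so the value written to the first context is still available when that context is reloaded.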
The real-time switching is realized by using an equalization transistor (XE5), shown in Figure 4.1, which allows context switching without turning off the power supply.

# 4.2.2 Operational modes of multi-storage SRAM cell

The proposed cell works in four operational modes: normal read/write operation, load operation, store operation, and switch operation.

# 4.2.2.1 Normal operation

In this mode of operation, all the isolation transistors and the equalization transistor are turned off to isolate the MTJs from the main cell. The active context resides in the cell storage nodes q and qc. The read and write operations on the active context data are performed as in a conventional SRAM cell. Table 4.1 lists the status of the control signals while performing read and write operations on the proposed multi-storage SRAM cell.

TABLE 4.1: STATUS OF CONTROL SIGNAL DURING NORMAL OPERATIONAL MODE

| Operation | BL | BL_bar | WL | RE | WRE1-WRE4 |
|-------------|---------|---------|-----|-----|-----------|
| Write (1/0) | vdd/gnd | gnd/vdd | vdd | gnd | gnd |
| Read | vdd | vdd | vdd | gnd | gnd |

# 4.2.2.2 Store operation

In this mode of operation, the value present at the nodes q and qc is stored into the MTJ pair of the active context. Figure 4.2 shows the store operation of the proposed multi-storage SRAM cell. Let us assume that context 1 is active. Hence, q=1 and qc=0 must be stored into MTJ1a and MTJ1b, respectively.

Figure 4.2: Store operation in proposed multi-storage SRAM cell

The store operation is initiated by asserting the control signal WRE1 high, which turns ON the transistor XE1 to connect the MTJ pair to the cell. Let us assume the MTJ pair (MTJ1a and MTJ1b) initially stores logic '0' (MTJ1a in HRS and MTJ1b in LRS), so that storing logic '1' requires flipping both states. Enabling the XE1 transistor allows current to flow from node q to qc through MTJ1a and MTJ1b, as indicated by I<sub>Store</sub> in Figure 4.2.
The magnitude of the current is kept higher than the switching threshold of the MTJ to switch MTJ1a from HRS to LRS and MTJ1b from LRS to HRS.

# 4.2.2.3 Load operation

In this mode of operation, data stored in the MTJ pair is copied into the cell internal nodes q and qc. Let us consider the case of loading context 1 shown in Figure 4.3, where the value stored by MTJ1a and MTJ1b is loaded into the nodes q and qc. Table 4.2 lists the status of all the control signals while performing a load operation.

The load operation begins with the equalization of the storage nodes q and qc using the equalization transistor XE5. The control signal RE is asserted high for a short period, as shown in Figure 4.3, to bring both nodes q and qc to the same voltage level. Meanwhile, the control signal pch is asserted low to enable the control signal WRE1. With WRE1 and RE both high, the path shown in Figure 4.3 is formed. As soon as the control signal RE goes low, the difference in resistance between MTJ1a (HRS) and MTJ1b (LRS) causes an imbalance between the node voltages q and qc, which is then translated to full voltage levels (q=0 and qc=1 in this case) by the cross-coupled inverter action.

Figure 4.3: Load operation of proposed multi-storage SRAM cell

TABLE 4.2: STATUS OF CONTROL SIGNAL DURING LOAD OPERATION

| Operation | cs0 | cs1 | cx1 | WL | se | pch | RE | da | WRE1 |
|---|---|---|---|---|---|---|---|---|---|
| Load operation performed on context 1 | low | low | low | low | low | high then low | high then low | low | high |

# 4.2.2.4 Switch operation

In this mode of operation, contexts are switched. Before switching to another context, the values present at nodes q and qc are stored in the corresponding MTJ pair of the active context by performing the store operation described in sub-section 4.2.2.2.
After the store operation is complete, the cell nodes q and qc are loaded with the MTJ values of the new context by performing the load operation described in sub-section 4.2.2.3. Therefore, switching contexts involves a store operation on the present context followed by a load operation on the next active context.

# 4.2.3 Circuit-level analysis

Case 1: Activating a context. Let us consider the case of activating context 1. Figure 4.4 plots the simulation waveform and the circuit diagram (the circuit diagram is explained in sub-section 4.2.2.3).

**Figure 4.4:** Simulation waveform of activating a context in proposed multi-storage SRAM cell

To access any cached context, the first step is to perform the load operation. Figure 4.4 first plots the load operation, in which the value stored by MTJ1a and MTJ1b is copied into the nodes q and qc, respectively. The initial value of nodes q/qc is 0/1, whereas the initial state of MTJ1a is low resistance (LRS) and that of MTJ1b is high resistance (HRS). Therefore, the load operation overwrites the 0/1 present at the nodes q/qc with 1/0. The simulation begins by asserting the control signal pch low and RE high. As a result, the path shown in the circuit diagram in Figure 4.4 is formed, which creates a differential voltage across nodes q and qc. As soon as the control signal RE goes low, nodes q and qc take the values stored by the MTJ pair. Figure 4.4 then plots the normal operation. Write '0' and Read '0' operations on context 1 are performed in the same way as in a conventional SRAM cell.

Case 2: Switching context. The process of switching context involves the store operation followed by the load operation. Let us consider the case in which the active context 1 is switched to context 2. Figure 4.5 plots the simulation waveform for this case. The simulation begins with the store operation, in which the values present at nodes q=0 and qc=1 are stored into MTJ1a and MTJ1b, respectively.
Since the previous state of MTJ1a is LRS and that of MTJ1b is HRS, storing q/qc = 0/1 involves flipping both MTJ states. The control signal pch is asserted low to initiate the store operation. Since switching from HRS to LRS takes less time than switching from LRS to HRS, MTJ1b switches earlier than MTJ1a. Next, the load operation is performed, in which the values of MTJ2a and MTJ2b are copied into the cell nodes q and qc, respectively. MTJ2a is in LRS and MTJ2b is in HRS. Therefore, node q makes a transition from 0 to 1 and qc from 1 to 0.

**Figure 4.5:** Simulation waveform performing context switch in proposed multi-storage SRAM cell

# 4.3 Write Assist Circuit for Hybrid SRAM Cell

The proposed hybrid SRAM is designed to meet the requirement of data retention in normally-off applications. However, the write energy of the proposed hybrid SRAM is very high. Three main factors contribute to the high write energy.

First, a large write current must flow through the MTJ for a long time to switch its state. The write current of the MTJ is 10x higher than that of a traditional SRAM, which implies high power consumption [17].

Second, the MTJ write is asymmetric and stochastic. The write operation is asymmetric in that writing '1' takes more time than writing '0'. The duration of the write operation must be set for the worst-case scenario, so the write pulse width is determined by the time needed to write '1' [18]. As a consequence, unnecessary current flows through a cell undergoing a write '0' operation even after the switching is complete. Moreover, due to random thermal fluctuations, the switching time of the MTJ is inherently stochastic: it varies randomly even under the same operating conditions [19]. This variation in switching time increases the probability of write errors.
Therefore, to account for both the stochastic and the asymmetric behavior of the MTJ, the write pulse duration is lengthened at the expense of increased power consumption.

The third reason for the high write energy is the redundant write operation, which refers to the case in which the data to be written matches the current state of the MTJ. During a redundant write, the same switching current (in the µA range) flows through the MTJ as is required to flip its state. Consequently, the total power consumption of the memory increases. On average, in a typical application, 88% of the writes in the L2 cache are redundant [26]. Therefore, substantial power can be saved if redundant writes are avoided. Furthermore, the unnecessary current flow through the MTJ during a redundant write imposes severe stress on the device. As a result, the MTJ resistance, switching current, and switching time degrade over time.

To address these challenges, we present a novel write assist (write termination) circuit, which is integrated with the proposed single-bit and multi-storage hybrid SRAM cells to detect the successful switching of the MTJ states and thereafter terminate the write current.

# 4.3.1 Proposed write assist/termination circuit for single-bit hybrid SRAM cell

Figure 4.6 illustrates the schematic diagram of the proposed single-bit 8T hybrid SRAM cell (refer to section 3.3.2) integrated with the write assist/termination circuit.

Figure 4.6: Proposed single-bit hybrid 8T SRAM cell with write termination circuit

The proposed write assist/termination circuit is added to every cell for independent bit control. It allows bit-wise monitoring and terminates the write current after detecting the successful switching of the MTJ states. The switching of the MTJ state is detected by constantly monitoring the voltage variation on nodes 1a and 1b, which is then used to asynchronously terminate the operation to save power.
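
The three contributions to write energy discussed in Section 4.3 can be put into a rough back-of-the-envelope comparison. The sketch below is only illustrative and does not reproduce the simulated results of this chapter: it assumes a constant write current (in reality the current changes with the MTJ resistance), and the supply voltage, current, mean switching time, and detection delay are hypothetical values chosen for the example. Only the 4x pulse-width factor [25] and the 88% redundant-write fraction [26] come from the text, and the latter was reported for L2 cache writes rather than for this cell.

```python
def store_energy_fj(v_dd: float, i_write_ua: float, t_on_ns: float) -> float:
    """Energy of a constant-current pulse: V * uA * ns gives femtojoules."""
    return v_dd * i_write_ua * t_on_ns


# Hypothetical operating point (assumptions, not values from this chapter).
V_DD = 1.0           # supply voltage in volts
I_WRITE_UA = 50.0    # write current in microamperes
T_SW_NS = 2.0        # mean MTJ switching time in nanoseconds
T_DETECT_NS = 0.2    # delay of the termination circuit in nanoseconds

# Values taken from the text.
PULSE_FACTOR = 4.0          # fixed pulse width = 4x average switching time [25]
REDUNDANT_FRACTION = 0.88   # share of redundant writes reported for L2 cache [26]

# Fixed-width pulse: every write, redundant or not, draws current for the
# whole worst-case pulse.
fixed_pulse_fj = store_energy_fj(V_DD, I_WRITE_UA, PULSE_FACTOR * T_SW_NS)

# Self-terminated write: a flipping write stops shortly after the MTJ has
# switched, and a redundant write stops after the detection delay alone.
flip_fj = store_energy_fj(V_DD, I_WRITE_UA, T_SW_NS + T_DETECT_NS)
redundant_fj = store_energy_fj(V_DD, I_WRITE_UA, T_DETECT_NS)
terminated_avg_fj = (
    REDUNDANT_FRACTION * redundant_fj + (1.0 - REDUNDANT_FRACTION) * flip_fj
)

print(f"fixed pulse:          {fixed_pulse_fj:.1f} fJ per write")
print(f"terminated (flip):    {flip_fj:.1f} fJ per write")
print(f"terminated (average): {terminated_avg_fj:.1f} fJ per write")
```

Under these assumptions the saving comes from two places: shortening the pulse for writes that do flip the MTJ, and cutting off redundant writes almost immediately. The actual figures depend on the device model and are reported in the simulation results later in the chapter.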
# 4.3.1.1 Operational modes of self-terminating SRAM cell

The proposed hybrid memory works in three operational modes: write mode, restore mode, and normal memory operation mode.

#### 4.3.1.1.1 Write operation

In this mode of operation, the values at the nodes q and qc are stored into MTJ1 and MTJ2, respectively. As soon as the MTJs are switched to the required state, the write current path is disabled by the write termination circuit. The write termination circuit consists of two inverters, one AND gate, and one OR gate, as shown in Figure 4.7.

Figure 4.7: Write operation of proposed hybrid 8T SRAM cell

The Write Detection signal (WD) used in the proposed write termination circuit is asserted low to indicate completion of switching, whereas the Write Path Enable signal (WPE) is asserted low to terminate the write current. Figure 4.8 plots the transient waveform for the proposed hybrid cell with the write assist circuit. In Figure 4.8, the first region, highlighted in blue, plots the write operation for the hybrid 8T SRAM cell. Let us assume the initial states of MTJ1 and MTJ2 are LRS and HRS, respectively. To store q=0/qc=1 in the MTJs, the states of both MTJs have to be flipped. The WE signal is asserted high for the complete duration of the write operation. To start the write current, a pulse is applied to the Write & Restore Enable (WRE) signal, which turns ON the X1 transistor. As a result, I<sub>write</sub> flows through the MTJs in a direction determined by the voltages present at the nodes q and qc. The current flows along qc→MTJ2→MTJ1→q, switching MTJ1 to the high resistance state (HRS) and MTJ2 to the low resistance state (LRS), as shown in Figure 4.8. The voltages at nodes 1a and 2a remain at a low level (below the inverter threshold) until the switching happens. Therefore, the initial value of the write detection signal (WD) is high. As the resistance of the MTJs changes, the voltages at nodes 1a and 2a vary.
Once the switching is complete, the voltages at both nodes 1a and 2a rise above the threshold of the inverters. This change in voltage is amplified by the CMOS inverters. The amplified voltages, now present at nodes 1b and 2b, respectively, are then ANDed to assert WD low. With the write detection signal WD low, WPE is also pulled to gnd, which finally terminates the write current by turning OFF the X1 transistor.

Figure 4.8: Simulation waveform of the proposed single-bit 8T SRAM cell

However, for the case q=1/qc=0, the MTJ states are not flipped, as the data to be stored is the same as the current MTJ state. Therefore, the detection signal WD is low, and as soon as WRE is asserted low, the control signal WPE is pulled to gnd, immediately terminating the write current.

# **4.3.1.1.2** Restore operation

In the restore mode of operation, the values stored in the MTJs are written into the SRAM cell nodes q and qc. In Figure 4.8, the region highlighted in red plots the restore operation for the proposed 8T SRAM cell. A short pulse applied to the restore enable signal (RE) turns ON the equalization transistor X2 to bring the voltages at nodes q and qc to the same value. The pulse width of RE is significantly shorter than the restore period, to reduce the power dissipated by the short-circuit current that flows while the nodes are equalized. After equalization, the control signal WRE is asserted high to provide a complete path $[X2\rightarrow q\rightarrow MTJ1\rightarrow X1\rightarrow MTJ2\rightarrow qc]$ so that the nodes develop a differential voltage. As soon as WRE is asserted low, this differential voltage is translated to full voltage levels by the regenerative action of the two cross-coupled inverters.

# 4.3.1.1.3 Normal memory operation

In normal operating mode, both the isolation transistor (X1) and the equalization transistor (X2) are OFF to disconnect the MTJ devices and the write termination circuit from the cell.
As a result, the proposed cell functions as a typical SRAM cell with similar read and write operations.

# 4.3.2 Proposed write assist/termination circuit for multi-storage hybrid SRAM cell

To reduce the write/store energy in the multi-storage hybrid SRAM cell, we propose a write assist circuit that senses the completion of the store operation and terminates it immediately. The write assist/termination circuit for the multi-storage SRAM cell consists of a detection circuit and a store enable circuit, as shown in Figure 4.9 (a). The function of the store assist circuit is to terminate the current I<sub>Store</sub> in the main cell as soon as successful switching of the MTJ state is detected.

As shown in Figure 4.9 (b), the detection circuit continuously monitors the MTJ states and generates a detection acknowledgement (da) signal to indicate successful switching of the MTJ states. It has a multiplexer that selects one of the nodes (1a-4a) using two control signals, cs0 and cs1. The values of cs0 and cs1 depend on the active context; for example, if context 1 is active, then cs0/cs1 is 0/0 and node 1a is selected. The voltage variation at the selected node is then used to determine the status of MTJ switching. The selected node voltage is amplified by a buffer and ANDed with the store enable (se) signal to generate the detection acknowledgement (da) signal. The store enable circuit, also shown in Figure 4.9 (b), initiates the store operation and terminates it upon receiving the detection acknowledgement (da) signal.

# 4.3.2.1 Operational modes of self-terminating multi-storage SRAM cell

The proposed multi-storage hybrid memory with the write assist circuit has the same operational modes: normal memory operation, load operation, store operation, and switch operation. The normal and load operations remain unchanged with the addition of the assist circuit. The modified functionality of the store operation is presented in the following section.
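
The detection path described in Section 4.3.2 (multiplexer, buffer, AND with se) can be expressed as a few lines of logic. The sketch below is a hypothetical discrete-time model rather than the circuit: the function names, the threshold value, and the sampled node voltages are invented for illustration. The text only states that cs0/cs1 = 0/0 selects node 1a, so treating cs0 as the low-order select bit for the remaining nodes is an assumption.

```python
from typing import Optional, Sequence


def select_node(nodes: Sequence[float], cs0: int, cs1: int) -> float:
    """4:1 multiplexer over the voltages of nodes 1a-4a.

    Assumption: cs0 is the low-order select bit. The text only gives
    cs0/cs1 = 0/0 -> node 1a.
    """
    return nodes[(cs1 << 1) | cs0]


def detection_ack(
    nodes: Sequence[float], cs0: int, cs1: int, se: int, v_th: float
) -> int:
    """Return da: buffered value of the selected node ANDed with se."""
    buffered = 1 if select_node(nodes, cs0, cs1) > v_th else 0
    return buffered & se


def store_termination_step(
    trace: Sequence[Sequence[float]], cs0: int, cs1: int, v_th: float
) -> Optional[int]:
    """Index of the first sample at which da goes high, or None.

    In the circuit this is the moment the store enable circuit pulls the
    WRE signal low and the store current stops.
    """
    for step, nodes in enumerate(trace):
        if detection_ack(nodes, cs0, cs1, se=1, v_th=v_th):
            return step
    return None


V_TH = 0.5  # hypothetical buffer threshold in volts

# Hypothetical samples of (node 1a, node 2a, node 3a, node 4a) during a
# store on context 1: node 1a rises once the MTJ pair has switched.
trace = [
    (0.20, 0.0, 0.0, 0.0),
    (0.25, 0.0, 0.0, 0.0),
    (0.30, 0.0, 0.0, 0.0),
    (0.80, 0.0, 0.0, 0.0),
]

stop_step = store_termination_step(trace, cs0=0, cs1=0, v_th=V_TH)
print("store terminated at sample", stop_step)
```

Because da is gated by se, the detection circuit stays silent outside a store operation even if the selected node voltage happens to be high.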
**Figure 4.9:** (a) Block diagram of proposed multi-storage SRAM cell (b) Proposed multi-storage SRAM cell with write/store assist circuit

# **4.3.2.1.1 Store operation**

The store operation is initiated by enabling one of the WRE1/WRE2/WRE3/WRE4 control signals. The selection of a particular WRE control signal depends on the context requesting the store operation, where the context is indicated by a low value on the corresponding control signal cx (cx1-cx4). For example, if context 1 requests a store operation, the control signal cx1 is asserted low. Consequently, the control signal WRE1, which is connected to the gate of the isolation transistor XE1, is asserted high. Upon receiving the detection acknowledgement (da) signal, the store enable circuit immediately disables the I<sub>Store</sub> current in the main cell by pulling WRE low. As a result, power consumption during the store operation is reduced.

Let us consider the case in which the store operation is performed on context 1. With context 1 active, the status of the control signals for the store operation is listed in Table 4.3. The WRE1 control signal, which enables the current path in the cell, is generated by the store enable circuit. To begin the operation, the control signal pch is asserted low for a short period, as shown in Figure 4.10.

TABLE 4.3: STATUS OF CONTROL SIGNAL DURING STORE OPERATION

| Control signal | Store operation performed on context 1 |
|---|---|
| cs0 | low |
| cs1 | low |
| cx1 | low |
| WL | low |
| se | high |
| pch | goes low for a short time |
| RE | low |
| da | goes high for a short time |
| WRE1 | high then low |

Figure 4.10: Timing waveform during store operation

The value of the control signal cx1 is low since context 1 is active. With cx1 and pch both low, the control signal WRE1 is pulled to vdd, as shown in Figure 4.11.
As soon as WRE1 is asserted high, the current I<sub>Store</sub> starts flowing through the hybrid SRAM cell to switch the states of the MTJs.

Figure 4.11: Generation of WRE control signal by store enable circuit

Meanwhile, the detection circuit is enabled by the store enable (se) signal. The voltage at node 1a, shown in Figure 4.12 (a), is continuously monitored to detect the switching of the MTJ states. Before switching, the voltage at node 1a is below the threshold of the buffer, so the output of the buffer is logic '0'. This output is then ANDed with the store enable signal to generate the detection acknowledgement signal (da). The initial value of da is low, so transistor tx1 in the store enable circuit is turned OFF and WRE1 remains high, as shown in Figure 4.12 (b).

**Figure 4.12:** (a) Generation of detection signal (b) Self-termination of store operation in multi-storage SRAM cell

Once switching is complete, the voltage at node 1a rises above the threshold of the buffer, and the detection acknowledgement signal (da) is asserted high. With da high, tx1 in the store enable circuit is turned ON to quickly discharge WRE1 to gnd, as shown in Figure 4.12 (b). As a result, the current I<sub>Store</sub> in the main cell is disabled.

#### 4.3.2.2 Simulation Results

Figure 4.13 plots the simulation waveform for the store operation to highlight the self-terminating feature of the proposed hybrid multi-storage SRAM cell. As soon as both MTJs switch to the required state, the detection acknowledgement signal (da) is generated by the detection circuit, as shown in Figure 4.13. With da asserted high, WRE1 is pulled low to stop the current flow, thereby terminating the operation.

Figure 4.13: Simulation waveform for store operation in proposed multi-storage SRAM cell

The energy consumption and latencies of the proposed multi-storage SRAM cell are plotted in Figure 4.14.
Our results indicate that the write energy with 4 contexts is 25%, 15.2%, and 5.4% higher than that of the multi-storage SRAM cell with 1, 2, and 3 contexts, respectively. A similar trend is observed for the read operation: the read energy of the multi-storage SRAM cell with 4 contexts is 19%, 11%, and 5.7% higher than that of the cell with 1, 2, and 3 contexts, respectively. On the other hand, increasing the number of contexts has no significant impact on the read and write access times. Also, an area estimate based on the number of transistors indicates that the proposed cell design has no area overhead compared with SRAM cells of the same capacity.

**Figure 4.14:** Energy consumption and latencies of the hybrid multi-storage cell plotted against the number of contexts

# 4.3.2.2.1 Comparative analysis of proposed multi-storage SRAM cell with existing multi-storage SRAM cell

To evaluate the efficiency of the proposed multi-storage SRAM cell, we compare it with the existing Yanjun Ma multi-storage SRAM cell [10]. For a fair comparison, both memory cells are designed using the same MTJ model. Table 4.4 summarizes the energy consumption of different operations in the proposed multi-storage SRAM cell and the Yanjun Ma multi-storage SRAM cell [10]. It can be observed that the overall energy requirement of Yanjun Ma's SRAM cell is higher than that of the proposed hybrid SRAM cell, which can be explained as follows:

- Energy consumption during normal read/write operation in the Yanjun Ma multi-storage SRAM cell [10] is high because the read/write operation is performed through the MTJs, whereas in the proposed multi-storage SRAM cell the MTJs are completely isolated from the main cell during normal operation.
- During the store operation in the Yanjun Ma multi-storage SRAM cell [10], the bitlines are biased to vdd/2 while the nodes hold either vdd or gnd.
Therefore, the effective potential difference between the nodes and the bitlines is vdd/2, which reduces the current. Since the switching time depends on the amount of current, it takes longer to switch the MTJs into the desired state. As a result of the increased switching latency, energy consumption during the store operation increases. On the other hand, the proposed circuit has a potential difference of vdd, resulting in shorter switching times and lower store energy. In addition, the assist circuit is integrated with the proposed circuit to further reduce the energy consumption, as can be observed in Table 4.4.
- Switching context in the Yanjun Ma multi-storage SRAM cell [10] consumes more energy, since switching context involves storing the content of the current context and then loading the content of the next context. Both the store and the load operations in the Yanjun Ma cell [10] are energy-intensive. Therefore, the total energy consumption of the context switch operation is high.

TABLE 4.4: ENERGY CONSUMPTION (fJ) DURING DIFFERENT OPERATIONAL MODES IN YANJUN MA AND PROPOSED MULTI-STORAGE SRAM CELL

| Cell | Normal operation: read | Normal operation: write | Store operation (with assist circuit) | Store operation (without assist circuit) | Load operation | Switch operation (store + load) |
|---|---|---|---|---|---|---|
| Proposed multi-storage SRAM cell | 2 | 13 | 148 | 229 | 63 | 135 |
| Yanjun Ma multi-storage SRAM cell [10] | 135 | 334 | - | 544.3 | 135 | 679.23 |

#### 4.4 Conclusion

We present a multi-storage hybrid SRAM cell for multi-context applications in the Internet of Things network. The proposed cell has an active context in its 6T SRAM core, while other contexts are stored in multiple MTJ pairs.
We demonstrated that adding non-volatility and multiple storage capabilities to the conventional 6T SRAM cell using MTJ devices has little impact on its read and write performance. Since storing data bits into the MTJ is energy-costly, we also propose an assist circuit for the proposed single-bit and multi-storage SRAM cells that terminates the store operation asynchronously once the MTJs are switched to the required value, thereby achieving a power saving of 35% compared with the SRAM cell without the assist circuit. The different operational modes of the proposed cell with the assist circuit are explained, and simulation results are presented. An exhaustive circuit analysis is also performed using multiple key performance parameters, including read/write energies, store/load energies, and latencies. The proposed cell shows a significant improvement in performance compared with the existing multi-storage SRAM cell. It is worth noting that the proposed cell reduces energy consumption by 72% during the store operation and by 68% during the context switch operation.

#### REFERENCES

- [1] ITRS, "International Technology Roadmap for Semiconductors 2.0: Executive Report," *Int. Technol. Roadmap Semicond.*, p. 79, 2015.
- [2] P. Sparks, "Whitepaper: The route to a trillion devices: The outlook for IoT investment to 2035," *ARM Community*, pp. 1–14, Jun. 2017.
- [3] S. Gaudin, "Get ready to live in a trillion-device world," *Computer World*, 2015. [Online]. Available: https://www.computerworld.com/article/2983155/internet-of-things/get-ready-to-live-in-a-trillion-device-world.html.
- [4] M. Alioto, Ed., "Enabling the Internet of Things: THE IOT VISION," 1st ed. *Springer International Publishing*, 2017.
- [5] C. Perera, A. Zaslavsky, P. Christen, and D. Georgakopoulos, "Context Aware Computing for The Internet of Things: A Survey," *IEEE Commun. Surv. Tutorials*, vol. 16, no. 1, pp. 414–454, 2014.
- [6] S.
Cheng, "Drawing Dominant Dataset From Big Sensory Data in Wireless Sensor Networks," *2015 IEEE Conf. Comput. Commun.*, no. 1, pp. 531–539, 2015.
- [7] L. G. Ríos and J. A. Incera Diéguez, "Big Data Infrastructure for analyzing data generated by Wireless Sensor Networks," in *2014 IEEE International Congress on Big Data*, 2014, pp. 816–823.
- [8] O. B. Sezer, E. Dogdu, and A. M. Ozbayoglu, "Context-Aware Computing, Learning, and Big Data in Internet of Things: A Survey," *IEEE Internet Things J.*, vol. 5, no. 1, pp. 1–27, 2018.
- [9] R. Bergelt, M. Vodel, and W. Hardt, "Energy efficient handling of big data in embedded, wireless sensor networks," in *2014 IEEE Sensors Applications Symposium (SAS)*, 2014.
- [10] Y. Ma, "Nonvolatile multibit SRAM, bit level caching, and multi-context computing for IoT," *2015 15th Non-Volatile Mem. Technol. Symp. (NVMTS 2015)*, pp. 15–17, 2015.
- [11] R. Aitken, V. Chandra, J. Myers, B. Sandhu, L. Shifren, and G. Yeric, "Device and technology implications of the Internet of Things," in *2014 Symposium on VLSI Technology (VLSI-Technology): Digest of Technical Papers*, 2014, pp. 1–4.
- [12] Y. Gong, N. Gong, L. Hou, and J. Wang, "MTJ based data restoration in non-volatile SRAM," in *2016 13th IEEE International Conference on Solid-State and Integrated Circuit Technology, ICSICT 2016 Proceedings*, 2017, vol. 1, pp. 1011–1013.
- [13] B. Jovanovic, R. M. Brum, and L. Torres, "MTJ-based Hybrid Storage Cells for 'Normally-OFF and Instant-ON' Computing," *Electron. Energ.*, vol. 28, no. 3, pp. 465–476, 2015.
- [14] O. Turkyilmaz *et al.*, "RRAM-based FPGA for 'normally Off, Instantly On' applications," *J. Parallel Distrib. Comput.*, vol. 74, no. 6, pp. 2441–2451, 2014.
- [15] A. Chen, "A review of emerging non-volatile memory (NVM) technologies and applications," *Solid. State. Electron.*, vol. 125, pp. 25–38, 2016.
- [16] J. Meena, S. Sze, U. Chand, and T.-Y. Tseng, "Overview of emerging nonvolatile memory technologies," *Nanoscale Res. Lett.*, vol. 9, no. 1, p. 526, 2014.
- [17] R. Bishnoi, M. Ebrahimi, F. Oboril, and M. B. Tahoori, "Asynchronous Asymmetrical Write Termination (AAWT) for a low power STT-MRAM," in *Des. Autom. Test Eur. Conf. Exhib. (DATE)*, 2014, pp. 1–6.
- [18] P. Zhou, B. Zhao, J. Yang, and Y. Zhang, "Energy reduction for STT-RAM using early write termination," *Comput. Des. Dig. Tech. Pap. 2009. ICCAD 2009. IEEE/ACM Int. Conf.*, pp. 264–268, 2009.
- [19] R. Bishnoi, F. Oboril, M. Ebrahimi, and M. B. Tahoori, "Avoiding unnecessary write operations in STT-MRAM for low power implementation," *Proc. Int. Symp. Qual. Electron. Des. ISQED*, pp. 548–553, 2014.
- [20] D. Lee, S. K. Gupta, and K. Roy, "High-Performance Low-Energy STT MRAM Based on Balanced Write Scheme," in *ISLPED '12*, 2012, pp. 9–14.
- [21] D. Suzuki, M. Natsui, A. Mochizuki, and T. Hanyu, "Cost-efficient self-terminated write driver for spin-transfer-torque RAM and logic," *IEEE Trans. Magn.*, vol. 50, no. 11, pp. 2–5, 2014.
- [22] M. K. Gupta, "Self-Terminated Write-Assist Technique for STT-RAM," *IEEE Trans. Magn.*, vol. 52, no. 8, 2016.
- [23] D. Zhang *et al.*, "High-Speed, Low-Power, and Error-Free Asynchronous Write Circuit for STT-MRAM and Logic," *IEEE Trans. Magn.*, vol. 52, no. 8, pp. 2–5, 2016.
- [24] H. Farkhani, M. Tohidi, A. Peiravi, J. K. Madsen, and F. Moradi, "STT-RAM Energy Reduction Using Self-Referenced Differential Write Termination Technique," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 25, no. 2, pp. 476–487, Feb. 2017, doi: 10.1109/TVLSI.2016.2588585.
- [25] T. Zheng, J. Park, M. Orshansky, and M. Erez, "Variable-Energy Write Architecture with Bit-Wise Write-Completion Monitoring," in *International Symposium on Low Power Electronics and Design (ISLPED)*, 2013, pp. 229–234.
- [26] R. Bishnoi, M. Ebrahimi, F. Oboril, and M. B. Tahoori, "Improving Write Performance for STT-MRAM," *IEEE Trans. Magn.*, vol. 52, 2016.