Real Computer Science begins where we almost stop reading ...: Random-Access Memories

Saturday, 5 October 2013

Random-Access Memories

We will examine memory components in more detail. In particular, we will focus on two important aspects of memory system design: the detailed timing waveforms for a static RAM component, and the design of the register and control logic that surrounds the memory subsystem, making it possible to interface the memory to the rest of the digital system. But first, we must begin with the basics.

7.6.1 RAM Basics: A 1024 by Four-Bit Static RAM

We begin with a relatively simple memory component, a 1024 by 4-bit static RAM. The basic storage element of the static RAM is a six-transistor circuit, shown in Figure 7.46.

The "static" storage element is provided by the cross-coupled inverters. This circuit configuration will hold a 1 or 0 as long as the system continues to receive power. There is no need for a periodic refreshing signal or a clock.

The nMOS transistors provide access to the storage element from two buses, denoted Dataj and its complement. To write the memory element, special circuitry in the RAM drives the data bit and its complement onto these lines while the word enable line is asserted. When driven in this fashion, the data bit can overwrite the previous state of the element.

To read the contents of the storage element, the word enable line is once again asserted. Instead of being driven onto the data lines (also called bit lines), the data are "sensed" by a different collection of special circuits. These circuits, called sense amplifiers, can detect small voltage differences between the data line and its complement. If Dataj is at a higher voltage than its complement, the cell contained a logic 1. If the situation is reversed, the cell contained a logic 0.

RAMs are efficient in packing many bits into a circuit package for two reasons. First, only a small number of transistors are needed to implement the storage elements. And second, it is easy to arrange these elements into rows and columns. Each row of memory cells shares a common word enable line. Each column shares common bit lines. The number of columns determines the bit width of each word. Thus, you can find memory components that are 1, 4, or 8 bits wide and that read or write the bits of a single word in parallel.

Figure 7.47 shows the pin-out for our 1024 by 4-bit SRAM (static RAM). The pins can be characterized as address lines, data lines, and control lines. Since the RAM has 1024 words, there must be 10 lines to address them. Since each word is 4 bits wide, there are four data lines. The same pins are used for reading or writing and are called bidirectional. The value on the active low control signal Write Enable (WE) determines their direction.

The final signal on the chip is the active low chip select control line (CS). When this signal goes low, a read or write cycle commences, depending on the value of write enable. If write enable is also low, the data lines provide new values to be written into the addressed word within the RAM. If it is high, the data lines are driven with the contents of the addressed word.
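The pin behavior just described can be sketched as a small behavioral model. This is only an illustration: the class name `Sram1Kx4` and its `cycle` method are hypothetical, all timing is ignored, and the model captures only which operation the two active low control pins select.

```python
class Sram1Kx4:
    """Behavioral sketch of a 1024 x 4-bit static RAM (no timing modeled)."""

    WORDS = 1024       # 10 address lines -> 2**10 words
    WIDTH_MASK = 0xF   # 4 bidirectional data lines

    def __init__(self):
        self.cells = [0] * self.WORDS

    def cycle(self, address, cs_n, we_n, data_in=None):
        """Apply one set of pin values; cs_n and we_n are active low.

        Returns the 4-bit word driven onto the data pins during a read,
        or None when the chip is not driving them.
        """
        if not 0 <= address < self.WORDS:
            raise ValueError("address must fit in 10 bits")
        if cs_n == 1:
            return None                       # deselected: pins not driven
        if we_n == 0:
            self.cells[address] = data_in & self.WIDTH_MASK   # write cycle
            return None                       # pins act as inputs
        return self.cells[address]            # read cycle


ram = Sram1Kx4()
ram.cycle(0x3FF, cs_n=0, we_n=0, data_in=0b1010)   # write the last word
print(ram.cycle(0x3FF, cs_n=0, we_n=1))            # read it back
print(ram.cycle(0x3FF, cs_n=1, we_n=1))            # chip deselected
```

Returning `None` stands in for the high-impedance state of the data pins; a real part leaves them floating rather than producing any value.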

Internal Block Diagram

From the preceding discussion, you might infer that the RAM is organized as an array with 1024 words and four columns. In terms of performance and packaging, this is not the best internal organization. A long thin array leads to long wires, which take more time to drive to a given logic voltage. Also, rectangular integrated circuits are more difficult to arrange on a silicon wafer for processing. A square configuration is much more desirable.

Figure 7.48 gives a more realistic block diagram of the internal structure of a typical 1024 by 4-bit SRAM. The RAM array consists of four banks of 64 words by 16 bits each. This makes the array square. Let's consider a read operation. The high-order 6 bits of the address select one of 64 words. Four groups of 16 bits each emerge from the storage array, one group for each of the possible data bits. The four low-order address bits select one of 16 bits from each of the four groups to form the 4-bit data word. Writes are similar, except with data flowing in the opposite direction.

This form of two-dimensional decode, with row and column decoders, is used universally in memory components. Not only does it keep the memory array square, it also limits the longest lines in the decoders.
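As a rough sketch of the two-dimensional decode, the following Python splits a 10-bit address into the 6-bit row and 4-bit column fields and assembles a word from the four 16-bit groups. The function names and the mapping of group number to data bit position are assumptions made for illustration; Figure 7.48 is not reproduced here.

```python
ROW_BITS = 6   # high-order address bits select one of 64 rows
COL_BITS = 4   # low-order address bits select one of 16 bits per group
GROUPS = 4     # one 16-bit group per data bit


def split_address(address):
    """Split a 10-bit address into (row, column) for the two decoders."""
    row = address >> COL_BITS
    column = address & ((1 << COL_BITS) - 1)
    return row, column


def read_word(array, address):
    """Assemble a 4-bit word: one bit from each 16-bit group of the row."""
    row, column = split_address(address)
    return sum(array[row][group][column] << group for group in range(GROUPS))


def write_word(array, address, value):
    """Scatter a 4-bit word across the four groups of the selected row."""
    row, column = split_address(address)
    for group in range(GROUPS):
        array[row][group][column] = (value >> group) & 1


# 64 rows x 4 groups x 16 bits = 4096 cells, arranged as a 64 x 64 square.
array = [[[0] * 16 for _ in range(GROUPS)] for _ in range(1 << ROW_BITS)]
write_word(array, 0b1010110011, 0b1001)
print(split_address(0b1010110011))      # (43, 3)
print(read_word(array, 0b1010110011))   # 9
```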

Simplified Read Cycle and Write Cycle Timing

Controlling the function of a RAM chip requires precise sequencing of the address pins and control signals.

Figure 7.49 gives a simplified logic timing diagram for the RAM read cycle (we defer a more precise description of RAM timing to Section 7.6.4). First, a valid address must be set up on the address lines. Then the chip select (CS) line is taken low while the write enable (WE) stays high. The memory access time is the time it takes for new data to be ready to appear at the output. It is measured from the last change in the address lines, although the output is not visible off-chip unless the chip select is low. Once the chip select line goes high again, deselecting the chip, the output on the data lines will no longer be valid.

Figure 7.50 gives the write cycle sequencing. Because an erroneous write could have destructive consequences, we must be especially careful during the sequencing of the write signals. To be conservative, CS should be brought low and the address and data lines should be stable before WE goes low. A similar sequence occurs in reverse to end the write cycle.

While conceptually correct, this specification is more restrictive than it needs to be. Technically speaking, the write cycle begins when both CS and WE go low. It ends when the first of them goes high. The only absolute requirement is that the address is stable a setup time before both signals go low and satisfies a hold time constraint after the first one goes high. The data setup and hold times are also measured from the first control signal to rise.

Another important metric for RAMs is the memory cycle time. This is the time between subsequent memory operations. In general, the access time is less than or equal to the memory cycle time.

7.6.2 Dynamic RAM

Static RAMs are the fastest memories (and the easiest to interface with), but the densest memories are dynamic RAMs (DRAMs). Their high capacity is due to an extremely efficient memory element: the one-transistor (1-T) memory cell.

The 1-T memory cell, consisting of a single access transistor and a capacitor, works as follows (see Figure 7.51).

The word line and bit line provide exactly the same function as in the SRAM. To write the memory cell, the bit line is charged to a logic 1 or 0 voltage while the word line is asserted. This enables the access transistor, making it possible to charge up the storage capacitor with the desired logic voltage.

The read operation takes place by asserting the word line. The access transistor is turned on, sharing the voltage on the capacitor with the bit line. Sensitive amplifier circuits detect small changes on the bit line to determine whether a 1 or 0 was stored in the selected memory element.

The destructiveness of the read operation makes DRAMs complex. To read the contents of the storage capacitor, we must discharge it across the bit line. Thus, external circuitry in the DRAM must buffer the values that have been read out and then write them right back.

The second problem with DRAMs, and the most significant one from your viewpoint as a system designer, is that their contents decay over time. Every once in a while (measured in milliseconds), the charge on the storage capacitors leaks off. To counteract this, the DRAM must be refreshed. Periodically, the memory elements must be read and written back to their storage locations.

To make this operation reasonably efficient, the DRAM's memory array is a two-dimensional matrix organized along the lines of the SRAM block diagram of Figure 7.48.

Figure 7.52 shows the block diagram of a 4096 by 1-bit DRAM. Rather than refresh individual bits, a refresh cycle reads out and writes back an entire row. This happens about once every few microseconds. Just as in the SRAM, the row is typically a multiple of the DRAM's word size. In this case, it is 64 bits wide. The refresh cycles are generated by an external memory controller.

DRAM Access with Row and Column Address Strobes

Every time a single-bit word is accessed within the memory array of Figure 7.52, the DRAM actually accesses an entire row of 64 bits. The column latches select one of the 64 bits for reading or writing. The DRAM often accesses adjacent words in sequence, so it is advantageous if access is rapid.

Memory chip designers have developed clever methods to provide rapid access to the DRAM. The key is to provide separate control lines for DRAM row access, RAS (row address strobe), and column access, CAS (column address strobe). Normally, access involves specifying a row address followed by a sequence of column addresses. This has the extra advantage of reducing the number of address pins needed. A single (smaller) set of address lines is multiplexed over time to specify the row and column to be accessed. This becomes a critical issue as memory chips exceed 1 million bits (20 address pins).

In Figure 7.52 we have done away with the chip select signal and replaced it with two signals: RAS and CAS. The address lines, normally 12 for a 4096-bit memory, can now be reduced to 6. Memory access consists of a RAS cycle followed by a CAS cycle. During the RAS cycle, the six address lines specify which of the 64 rows to access. In the following CAS cycle, the address lines select the column to access.
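A small sketch of the pin-count arithmetic and the time multiplexing of the address may help here. The helper names `address_pins` and `mux_address` are hypothetical, and the sketch assumes the row is carried in the high-order half of the address, which the text does not state.

```python
def address_pins(bits, multiplexed):
    """Number of address pins for a memory of `bits` one-bit words."""
    full = (bits - 1).bit_length()     # 4096 -> 12, 2**20 -> 20
    return (full + 1) // 2 if multiplexed else full


def mux_address(address, half_bits=6):
    """Split a full address into the (row, column) values sent in turn.

    Assumption: the row is the high-order half of the address.
    """
    row = address >> half_bits
    column = address & ((1 << half_bits) - 1)
    return row, column


print(address_pins(4096, multiplexed=False))    # 12
print(address_pins(4096, multiplexed=True))     # 6
print(address_pins(2**20, multiplexed=False))   # 20
print(mux_address(0b101010010101))              # (42, 21)
```

The row value is presented during the RAS cycle and the column value during the CAS cycle, over the same six pins.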

Figure 7.53 shows the RAS/CAS timing for a memory read. Throughout this sequencing, the WE line is held high. First the row address is provided on the address lines. When the RAS line is brought low, the row address is saved in a latch within the DRAM and the memory access begins. Meanwhile, the address lines are replaced by a column address. When CAS goes low, the column address is latched. At this point the output is enabled, although it is valid only after a propagation delay. When RAS goes high again, the accessed row is written back to the memory array, restoring its values. When CAS goes high, the output returns to the high-impedance state.

Figure 7.54 shows the RAS/CAS sequencing for a memory write. The signaling begins as before: the row address appears on the address lines and is internally latched when RAS goes low. The row is now read out from the memory array. While the address lines are changing to the column address, valid data is placed on the data input line and WE is taken low. Once the address lines are stable, CAS can also be taken low. This latches the column address and also replaces the selected bit with the DIN value within the column latches. At this point, WE can be driven high. When RAS goes high again, the entire row, including the replaced bit, is written back to the memory array. Finally, CAS can be driven high, and another memory cycle can commence.
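The read and write sequences can be approximated by a behavioral model in which the row is pulled into the column latches when RAS falls and written back when RAS rises. This is a sketch under stated assumptions: the class `Dram4Kx1` and its method names are hypothetical, timing is ignored, and the destructive read is represented simply by overwriting the stored row with `None` until it is written back.

```python
class Dram4Kx1:
    """Behavioral sketch of a 4096 x 1-bit DRAM with RAS/CAS addressing."""

    ROWS = COLS = 64

    def __init__(self):
        self.array = [[0] * self.COLS for _ in range(self.ROWS)]
        self.row_addr = None
        self.latches = None          # column latches holding the open row

    def ras_low(self, row):
        """RAS falls: latch the row address and read out the whole row."""
        self.row_addr = row
        self.latches = list(self.array[row])
        self.array[row] = [None] * self.COLS   # reading destroys the row

    def cas_low(self, column, we_n=1, din=None):
        """CAS falls: latch the column; read a bit, or replace it if WE is low."""
        if we_n == 0:
            self.latches[column] = din
            return None
        return self.latches[column]

    def ras_high(self):
        """RAS rises: write the column latches back, restoring the row."""
        self.array[self.row_addr] = self.latches
        self.row_addr = self.latches = None


dram = Dram4Kx1()
dram.ras_low(3)
dram.cas_low(17, we_n=0, din=1)      # write cycle replaces one bit
dram.ras_high()
dram.ras_low(3)
bit = dram.cas_low(17)               # read cycle
dram.ras_high()
print(bit)                           # 1
```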

Refresh Cycle

The storage capacitor at the heart of a DRAM memory cell is not perfect. Over time, it leaks away the charge it is meant to hold. Thus DRAMs must undergo periodic refresh cycles to maintain their state. In its simplest form, a refresh cycle looks like an abbreviated read cycle: data is extracted from the storage matrix and then immediately written back without appearing at the output pin.

Suppose every DRAM word must be refreshed once every 4 ms. This means that the 4096-word RAM would require a refresh cycle once every 976 ns. Assuming the cycle time is 120 ns, approximately one in every eight DRAM accesses would be a refresh cycle!

Fortunately, the two-dimensional organization of the storage matrix makes it possible to refresh an entire row at a time. Since the DRAM of Figure 7.52 has 64 rows, we can refresh the rows in sequence once every 62.5 µs, still meeting the overall 4-ms requirement. This is approximately one refresh cycle every 500 accesses. Larger-capacity DRAMs usually have 256 to 512 rows and require a refresh cycle once every 8 to 16 µs.
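The refresh arithmetic in the last two paragraphs can be checked with a few lines of Python. The variable names are illustrative; the numbers are the ones given in the text (4 ms refresh period, 120 ns cycle time, 4096 words, 64 rows).

```python
REFRESH_PERIOD_NS = 4_000_000   # every word must be refreshed within 4 ms
CYCLE_TIME_NS = 120             # assumed memory cycle time
WORDS = 4096
ROWS = 64

# Refreshing one word at a time.
per_word_interval_ns = REFRESH_PERIOD_NS / WORDS
accesses_per_word_refresh = per_word_interval_ns / CYCLE_TIME_NS

# Refreshing one 64-bit row at a time.
per_row_interval_ns = REFRESH_PERIOD_NS / ROWS
accesses_per_row_refresh = per_row_interval_ns / CYCLE_TIME_NS

print(f"per word: every {per_word_interval_ns:.1f} ns, "
      f"one refresh per {accesses_per_word_refresh:.1f} cycles")
print(f"per row:  every {per_row_interval_ns / 1000:.1f} us, "
      f"one refresh per {accesses_per_row_refresh:.0f} cycles")
```

The exact figures are about 976.6 ns (roughly one cycle in eight) for word-at-a-time refresh and 62.5 µs (roughly one cycle in 520) for row-at-a-time refresh, in line with the rounded values quoted above.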

The RAS-only refresh cycle provides a simple form of refresh. It looks very much like the read timing with the CAS phase deleted. The row address is placed on the address lines and RAS is taken low. This causes the row to be read out of the storage matrix into the column latches. When RAS goes high again, the column latches are written back, refreshing the row's contents.

This refresh cycle requires an external memory controller to keep track of the last row to be refreshed. To simplify the memory controller design, some DRAMs have a refresh row pointer in the memory chip. A special CAS-before-RAS signaling convention implements the refresh. If CAS goes low before RAS, the chip recognizes this as a refresh cycle. The indicated row is read, written back, and the internal indicator points to the next row to be refreshed.
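A sketch of how the CAS-before-RAS convention might be detected is shown below. The class `RefreshPointer` and its `set_pins` method are hypothetical names; the model assumes the pointer wraps around after the last row and that a refresh is recognized only when CAS was already low at the moment RAS falls.

```python
class RefreshPointer:
    """Sketch of an on-chip refresh row pointer with CAS-before-RAS detection."""

    def __init__(self, rows=64):
        self.rows = rows
        self.next_row = 0
        self.ras_n = 1       # both strobes idle high (active low)
        self.cas_n = 1

    def set_pins(self, ras_n, cas_n):
        """Update the strobes; return the refreshed row number, or None."""
        refreshed = None
        ras_falls = self.ras_n == 1 and ras_n == 0
        if ras_falls and self.cas_n == 0:       # CAS was low before RAS fell
            refreshed = self.next_row
            self.next_row = (self.next_row + 1) % self.rows
        self.ras_n, self.cas_n = ras_n, cas_n
        return refreshed


chip = RefreshPointer()
chip.set_pins(ras_n=1, cas_n=0)          # CAS goes low first
print(chip.set_pins(ras_n=0, cas_n=0))   # RAS follows: refreshes row 0
chip.set_pins(ras_n=1, cas_n=1)          # both strobes released
chip.set_pins(ras_n=0, cas_n=1)          # ordinary access: RAS first
print(chip.set_pins(ras_n=0, cas_n=0))   # then CAS: not a refresh
```

With this arrangement the external controller only has to issue refresh cycles at the right rate; it no longer supplies a row address.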

7.6.3 DRAM Variations

We have described the basic internal functioning of a DRAM with RAS/CAS addressing, but have not shown how such an organization can improve DRAM performance. Variations on the basic DRAM model take advantage of the row-wide access to the storage matrix to reduce the time to access bits in the same DRAM row. These are called page mode, static column mode, and nibble mode DRAMs. All three support conventional RAS/CAS addressing. They differ in how they specify accesses to additional bits in the same row.

Page mode DRAMs can read or write a bit within the last accessed row without repeating the RAS cycle. The first time a bit within a row is accessed, the controller sequences through a RAS followed by a CAS cycle, as described earlier. To access a subsequent bit in the row, the controller simply changes the column address and pulses the CAS strobe (RAS is held low throughout). CAS pulsing can be repeated several times to access a sequence of bits in the row. The result is much faster access than is possible with complete RAS/CAS cycling.

Static column mode DRAMs provide a similar function but present a slightly simpler interface to the memory controller. Changing the column address bits accomplishes a static column read, eliminating multiple strobes on the CAS line altogether. Writes are a little more complicated. To protect against accidentally writing the wrong memory location, either CAS or WE must be driven high before the column address can be changed.

Nibble mode DRAMs are yet another variation on page mode. Most memory locations are accessed in sequence, and the DRAM can take advantage of this to reduce the complexity of the control sequencing. After the first RAS/CAS cycle, a subsequent CAS pulse accesses the next bit in sequence. This can be done three times, yielding 4 bits in sequence, before a RAS/CAS cycle is needed again. Thus the sequence is RAS/CAS, CAS, CAS, CAS, RAS/CAS, CAS, CAS, CAS, etc.
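To make the difference between the strobe sequences concrete, here is a short sketch that lists the cycles needed to fetch a run of bits from one row. The function names are hypothetical, and the page mode version assumes every requested bit lies in the same row.

```python
def page_mode_strobes(n_bits):
    """One full RAS/CAS cycle, then one CAS pulse per further bit in the row."""
    if n_bits <= 0:
        return []
    return ["RAS/CAS"] + ["CAS"] * (n_bits - 1)


def nibble_mode_strobes(n_bits):
    """A full RAS/CAS cycle is needed again after every fourth bit."""
    return ["RAS/CAS" if i % 4 == 0 else "CAS" for i in range(n_bits)]


print(", ".join(nibble_mode_strobes(8)))
print(", ".join(page_mode_strobes(8)))
```

For eight sequential bits, nibble mode needs two full RAS/CAS cycles where page mode needs one, but page mode requires the controller to supply a new column address for each CAS pulse.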

Video RAMs (VRAMs) are DRAMs that can be used as frame buffers for computer displays. A frame buffer is a display memory that allows new data to be written to storage without affecting how the screen is being refreshed from the old data. A VRAM has a conventional DRAM storage matrix and four serial-access memories (SAMs). Its signaling convention allows a data row to be transferred from the storage matrix to the SAMs. Once in a SAM, the data can be read out a bit at a time at a high rate even while new data is being written into the storage matrix. In this way, VRAMs support a kind of dual-port access to memory: one from the standard read/write interface and one from the serial memories.

In addition, some VRAMs support logical operations, such as XOR, between the current contents of the memory and the bit that is overwriting it. This is useful for certain graphics-oriented operations, such as moving items around smoothly on the display.
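One reason an XOR write is handy for moving items is that applying the same pattern twice restores the original contents. The sketch below illustrates this on a plain list of bits; the function `xor_write` and the sample data are made up for the example and do not model any particular VRAM.

```python
def xor_write(memory, offset, pattern):
    """XOR each incoming bit with the bit already stored at that location."""
    for i, bit in enumerate(pattern):
        memory[offset + i] ^= bit


background = [1, 0, 1, 1, 0, 0, 1, 0]
frame = list(background)
sprite = [1, 1, 1]

xor_write(frame, 2, sprite)    # draw: the sprite's bits invert the background
print(frame)                   # [1, 0, 0, 0, 1, 0, 1, 0]
xor_write(frame, 2, sprite)    # draw again at the same place: erases it
print(frame == background)     # True
```

An item can therefore be moved by XOR-writing it at the old position and then at the new one, without saving the background separately.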

7.6.4 Detailed SRAM Timing

Here we expand on the discussion of SRAM components begun in Section 7.6.1. We will describe the detailed timing of a 1024 by 4-bit static RAM, the National 2114. We have already shown the basic pin-out in Figure 7.47: 10 address lines, 4 data input/output lines, and active low chip select (CS) and write enable (WE) control signals. The generic read and write cycle sequencings were shown in Figures 7.49 and 7.50.

Read Cycle

Let's reexamine the read cycle timing in more detail. The following discussion assumes that WE is held high throughout the read operation. Any change on the address lines causes new data to be extracted from the storage matrix, independent of the condition of the chip select. Once CS goes low, the output buffers become enabled, latching the data from the storage array and driving the output pins.

An important metric of a memory component is the access time, tA. This is the time it takes for an address change to cause new data to appear at the output pins (the memory has already been selected by driving CS low). The MM2114 has an access time of 200 ns, which is relatively slow by today's standards. High-speed static RAMs, such as the Cypress CY2148, have an access time of 35 ns.

An equally important metric is the read cycle time, tRC. This is the time between the start of one read operation and a subsequent read operation. In the case of the MM2114, it is also 200 ns. In modern SRAM components, the access time and cycle time are usually the same. However, this is not the case for DRAMs, where cycle times are often longer than access times.

Figure 7.55 shows the read cycle timing waveform. It shows two back-to-back read cycles, the first commencing before CS goes low and the second while CS is low.

The first cycle begins with a change on the address lines. Valid data cannot appear on the output lines before an access time, 200 ns, has expired. The timing waveform assumes that the limiting condition is not the access time, but rather the time from when the chip is selected. The time from chip select to output active, tCX, is 20 ns. This means that the memory chip will begin to drive its outputs no sooner than 20 ns after chip select goes low, although the data is not yet valid. Any other components driving the same wires as the RAM's output lines must become tri-state within this time.

The time from chip select to output valid, tCO, is 70 ns. Thus, the time to data valid is determined by whichever is later: 200 ns after the last address line change or 70 ns after the chip is selected.
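The rule for when read data becomes valid is simply the later of two deadlines. A minimal sketch, using the MM2114 figures quoted in the text (the function and constant names are illustrative):

```python
T_A_NS = 200    # access time: address change to data valid
T_CO_NS = 70    # chip select low to output valid
T_CX_NS = 20    # chip select low to output active (not yet valid)


def data_valid_time(t_address_change, t_cs_low):
    """Earliest time (ns) at which valid data is guaranteed at the outputs."""
    return max(t_address_change + T_A_NS, t_cs_low + T_CO_NS)


def output_active_time(t_cs_low):
    """Earliest time (ns) at which the RAM may start driving its outputs."""
    return t_cs_low + T_CX_NS


# Chip selected late: chip select is the limiting condition.
print(data_valid_time(t_address_change=0, t_cs_low=150))   # 220
# Chip selected early: the access time dominates.
print(data_valid_time(t_address_change=0, t_cs_low=50))    # 200
print(output_active_time(t_cs_low=150))                    # 170
```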

Once the address lines begin to change for the next read cycle, the outputs are guaranteed to remain valid for another 10 ns, the output-hold-from-address-change time, tOHA. External logic latching the output of the RAM must factor this into its timing before it can allow the addresses to change.

Once the address lines are stable, it takes another 200 ns for valid data to appear at the outputs. If a third read cycle were to commence now, by changing the address lines, the output data would remain valid for 10 ns. However, the read cycle is terminated in the waveform by deselecting the chip. In this case, the hold time is determined by tCOT, the time from chip select to output tri-state. For the MM2114, tCOT is a minimum of 0 ns and a maximum of 40 ns. Thus external logic must wait at least 40 ns before it can begin to drive the wires at the RAM's I/O pins.

Write Cycle

As in all RAMs, the write cycle timing is more complex and requires more careful design. The write operation is enabled whenever CS and WE are simultaneously driven low. This is called the write pulse. To guard against incorrect writes, one or both of the signals must be driven high before the address lines can change.

Figure 7.56 gives the timing waveform for the write operation. Once the address lines are stable, we must wait an address setup time before driving the last of the chip select and write enable signals low. This is the address-to-write setup time, tAW, and is 20 ns. The write cycle time, tWC, is defined from address change to address change and must be at least 200 ns for this component. The write pulse width, tWP, is at least 100 ns.

The write cycle ends when the first of the two control signals goes high. Thus, data setup and hold times are measured with respect to this event. The data setup time, tDS, is at least 100 ns. The data hold time, tDH, is 0 ns.

The write recovery time (the time between the end of the write pulse and when the address lines are allowed to change) is denoted by tWR. For this RAM component, the recovery time is 0 ns.

The bottommost waveform in Figure 7.56, labeled Data Out, illustrates the interaction between read and write cycles. The writing circuitry must realize that the outputs could take as long as 40 ns to be tri-stated, so this amount of time must pass before the data to be written can be placed on the I/O pins. This is indicated by tWOT, the time from write enable to output tri-state.

The final timing specification, tWO, the write-enable-to-output-valid time, is the time between the end of the write cycle (write enable going high) and when the RAM turns around to drive the output lines with the data just written. In this case, it is 80 ns.
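The write cycle constraints lend themselves to a simple checker. The sketch below takes the times of the relevant events in one write cycle and reports which of the limits quoted above are violated. The function name, its argument names, and the treatment of the 20 ns tAW figure as a minimum are assumptions for illustration.

```python
# Minimum values in ns, as quoted in the text for this 2114 component.
LIMITS_NS = {"tAW": 20, "tWP": 100, "tDS": 100, "tDH": 0, "tWR": 0, "tWC": 200}


def check_write_cycle(t_addr, t_pulse_start, t_data_valid,
                      t_pulse_end, t_data_release, t_next_addr):
    """Return the names of violated write timing parameters.

    t_pulse_start: when the last of CS and WE goes low.
    t_pulse_end:   when the first of CS and WE goes high.
    """
    measured = {
        "tAW": t_pulse_start - t_addr,         # address setup before write
        "tWP": t_pulse_end - t_pulse_start,    # write pulse width
        "tDS": t_pulse_end - t_data_valid,     # data setup before pulse end
        "tDH": t_data_release - t_pulse_end,   # data hold after pulse end
        "tWR": t_next_addr - t_pulse_end,      # write recovery
        "tWC": t_next_addr - t_addr,           # write cycle time
    }
    return [name for name, limit in LIMITS_NS.items() if measured[name] < limit]


print(check_write_cycle(0, 20, 20, 120, 120, 200))   # []
print(check_write_cycle(0, 20, 20, 100, 120, 200))   # ['tWP', 'tDS']
```

The second call shortens the write pulse to 80 ns, which also leaves only 80 ns of data setup, so both limits are reported.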

7.6.5 Design of a Simple Memory Controller

At the heart of most digital systems is a data path consisting of one or more interconnection pathways (called buses) and several registers, arithmetic circuits, and memory attached to some or all of these interconnections. This is the "switchyard," which routes data items from memory to a unit that executes some operation on them and then back into memory.

In this subsection, we will design a simple memory controller, using the TTL register and counter components introduced in this chapter, as well as a 2114 RAM chip. We will develop control circuitry for sequencing through the write enable and chip select signals to implement read and write operations.

Memory Subsystem Data Path

Figure 7.57 gives a block diagram view of the data and address paths of a simple memory subsystem. To keep it simple, the address and data paths are just 4 bits wide. We can read data from four input switches and store them in the 2114 RAM when the tri-state buffer is enabled. Data stored in the RAM can be read out, latched into a register, and then displayed on LEDs attached to the register's outputs (inverter drivers are used to buffer the LEDs). We use a 4-bit binary counter to access locations in memory sequentially for reading or writing. We also use LEDs to monitor memory addresses.

Figure 7.58 provides the schematic representation for the data path, using TTL components to implement the logic blocks of Figure 7.57. We implement the address counter with a 74163 four-bit binary up-counter. For the tri-state buffers and output latches we use one-half of a 74244 and a 74379 component, respectively. The 74379 is a 4-bit version of the 74377 introduced in Figure 7.3(a). For the purposes of the schematic, we have replaced the output LEDs by symbols for hexadecimal displays and the input switches by a hex keypad.

Memory Controller

The following signals control the data path:

INC_ADR Add one to address.

WE Write Enable on 2114.

CS Chip Enable on 2114.

Latch valid data on data bus in display register during read cycle.

Enable buffer to put switch data on data bus during write cycle.

User input to select read or write mode.

In addition, we need a global reset signal to force the counter to the 0000 state.

The memory controller reads from or writes to the current address, then increments the counter to point to the next address. First, a sequence of write cycles fills the RAM with data. Then the controller is reset, setting the address counter back to 0. A sequence of read cycles then views the data that has been stored in the RAM.

Figure 7.59 shows a skeletal sequencer circuit diagram. The timing waveform it generates is given in Figure 7.60.

Pressing the momentary push-button switch generates a signal that lasts for one clock cycle. This signal enables the 74194 shift register, which shifts right, generating overlapping clock signals F1, F2, F3, F4. The circuit halts when all of the clock signals return to 0. This sequencer could be driven with a slow clock, such as the 555 timer described in the last chapter.

We can use simple combinational logic to derive pulses of the correct start time and length for the various control signals from the multiphase clock.

Figure 7.61 shows how this is accomplished.

We start by partitioning the overlapping clocks into seven periods, each of which is defined as a unique function of two of the clock phases. By combining these functions, we can obtain equations for the individual control signals. For example, we choose to implement CS simply as the inversion of clock phase F2. To be safe, we will design the high-to-low-to-high transitions on the WE signal to be properly nested within the CS transitions. WE should be low exactly during the time that clock phases F1 and F3 overlap (periods 3 and 4 in Figure 7.61). This is easy to generate with combinational logic: the write enable signal should be asserted whenever clock phases F1 and F3 are asserted simultaneously and the read/write mode input is low. Since WE is active low, we invert this logic to obtain the correct sense of the signal.

We assert the display register's latch signal during period 5, when F1 is low and F2 is high. Since the signal is active low, the control signal is implemented as the complement of (NOT F1 AND F2), which is equivalent to F1 OR NOT F2.

INC_ADR is active during period 7, so its implementation becomes

. Finally,

is identical to the signal

.

Of course, alternative implementations are possible as long as they lead to valid sequencing of the control signals. Also, the logic must be designed so that the sequence meets the setup and hold time requirements for the 2114 RAM.
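The derivation of control signals from the overlapping clock phases can be explored with a short simulation. This is a sketch under several assumptions, since Figures 7.60 and 7.61 are not reproduced here: the shift register is assumed to fill with 1s and then with 0s as it shifts right, which yields seven nonzero periods consistent with the descriptions above; the signal names `cs_n`, `we_n`, `latch_n`, and `read_write` are hypothetical; and the expression used for INC_ADR (F3 low and F4 high) is inferred from that assumed pattern for period 7 rather than taken from the text.

```python
def phase_sequence():
    """States (F1, F2, F3, F4): periods 1 to 7, then the all-zero halt state."""
    state = (0, 0, 0, 0)
    states = []
    for serial_in in (1, 1, 1, 1, 0, 0, 0, 0):
        state = (serial_in,) + state[:3]       # shift right by one stage
        states.append(state)
    return states


def controls(f1, f2, f3, f4, read_write):
    """Control signals for one period; read_write = 0 selects write mode."""
    return {
        "cs_n": 1 - f2,                              # inversion of F2
        "we_n": 1 - (f1 & f3 & (1 - read_write)),    # low in periods 3 and 4
        "latch_n": 1 - ((1 - f1) & f2),              # low in period 5
        "inc_adr": (1 - f3) & f4,                    # assumed: period 7
    }


for period, phases in enumerate(phase_sequence()[:7], start=1):
    print(period, phases, controls(*phases, read_write=0))
```

In the printed table for write mode, the WE pulse (periods 3 and 4) is nested inside the CS pulse (periods 2 to 5), and the address is incremented only after both have returned high.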
