Saturday, 15 June 2013

Cache Memory Organization in a Nutshell


Introduction

The goal of an effective memory system is that the effective access time the processor sees is very close to the access time of the cache. Most accesses that the processor makes to the cache are contained within this level. The achievement of this goal depends on many factors: the architecture of the processor, the behavioral properties of the programs being executed, and the size and organization of the cache. Caches work on the basis of the locality of program behavior. There are three principles involved:

  1. Spatial Locality - Given an access to a particular location in memory, there is a high probability that other accesses will be made to that or neighboring locations within the lifetime of the program.
  2. Temporal Locality - This is complementary to spatial locality. Given a sequence of references to n locations, there is a high probability that subsequent references will fall within that sequence; elements of the sequence will be referenced again during the lifetime of the program.
  3. Sequentiality - Given that a reference has been made to a particular location s, it is likely that within the next several references a reference to location s + 1 will be made. Sequentiality is a restricted type of spatial locality and can be regarded as a subset of it.
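As a toy illustration of these principles, the sketch below simulates a tiny fully associative LRU cache (the line size and capacity are made-up parameters, not from any real machine) and compares the hit rate of a sequential scan, which enjoys spatial locality, with a strided scan that touches a new line on every access:

```python
# Toy simulation of locality: made-up line size and cache capacity.

LINE_WORDS = 8          # words per cache line (illustrative)
NUM_LINES = 4           # lines the toy cache can hold (illustrative)

def hit_rate(addresses):
    """Simulate a tiny fully associative LRU cache of line addresses."""
    cache, hits = [], 0
    for addr in addresses:
        line = addr // LINE_WORDS
        if line in cache:
            hits += 1
            cache.remove(line)      # move to most-recently-used position
        elif len(cache) == NUM_LINES:
            cache.pop(0)            # evict the least recently used line
        cache.append(line)
    return hits / len(addresses)

sequential = list(range(64))              # walks neighboring words
strided    = [i * 32 for i in range(64)]  # jumps past a line every access

print(hit_rate(sequential))  # 0.875: 7 of every 8 accesses hit the fetched line
print(hit_rate(strided))     # 0.0: every access touches a new line
```

The sequential scan misses only once per line, which is exactly the payoff the line-sized fetch is designed to capture.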

Some common terms

Processor references that are found in the cache are called cache hits. References not found in the cache are called cache misses. On a cache miss, the cache control mechanism must fetch the missing data from memory and place it in the cache. Usually the cache fetches a spatial unit of locality, called a line, from memory. The physical word is the basic unit of access in the memory.
The processor-cache interface can be characterized by a number of parameters. Those that directly affect processor performance include:

  1. Access time for a reference found in the cache (a hit) - property of the cache size and organization.
  2. Access time for a reference not found in the cache (a miss) - property of the memory organization.
  3. Time to initially compute a real address given a virtual address (not-in-TLB time) - property of the address translation facility, which, strictly speaking, is not part of the cache but resembles it in most respects and is discussed here.

Cache Organization

Within the cache, there are three basic types of organization:
  1. Direct Mapped
  2. Fully Associative
  3. Set Associative
In fully associative mapping, when a request is made to the cache, the requested address is compared in a directory against all entries in the directory. If the requested address is found (a directory hit), the corresponding location in the cache is fetched and returned to the processor; otherwise, a miss occurs.
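The directory compare can be sketched in Python as follows; the dictionary stands in for the parallel hardware comparison against all entries, and every name and address here is illustrative:

```python
# Hypothetical sketch of a fully associative directory lookup.
# The dict models the simultaneous compare against all directory entries.

def lookup_fully_associative(directory, cache_data, line_addr):
    """Compare the requested line address against every directory entry."""
    slot = directory.get(line_addr)     # models the parallel compare
    if slot is not None:
        return ("hit", cache_data[slot])
    return ("miss", None)

directory = {0x1A0: 0, 0x2B4: 1}        # line address -> cache slot (made up)
cache_data = ["line A bytes", "line B bytes"]

print(lookup_fully_associative(directory, cache_data, 0x2B4))  # ('hit', 'line B bytes')
print(lookup_fully_associative(directory, cache_data, 0x3C8))  # ('miss', None)
```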


Fully Associative Cache


In a direct mapped cache, lower order line address bits are used to access the directory. Since multiple line addresses map into the same location in the cache directory, the upper line address bits (tag bits) must be compared with the tag stored in the directory entry to ensure a hit. If the comparison fails, the result is a cache miss, or simply a miss. The address given to the cache by the processor is actually subdivided into several pieces, each of which has a different role in accessing data.
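A minimal sketch of this address subdivision, assuming illustrative sizes of 8 words per line and 64 lines (the bit widths are assumptions, not from the text):

```python
# Hypothetical direct-mapped address split: offset | index | tag.
# Sizes (8-word lines, 64-line cache) are illustrative assumptions.

OFFSET_BITS = 3   # 8 words per line
INDEX_BITS  = 6   # 64 lines in the cache

def split_address(addr):
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index  = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag    = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

def access(directory, addr):
    """Directory holds one tag per index; a tag mismatch is a miss."""
    tag, index, _ = split_address(addr)
    if directory.get(index) == tag:
        return "hit"
    directory[index] = tag        # fetch the line, replacing the old tag
    return "miss"

directory = {}
print(access(directory, 0x1234))  # miss: line not yet cached
print(access(directory, 0x1234))  # hit: same tag now in the directory
```

Note that only one tag can live at each index, which is why two addresses sharing an index but differing in tag evict one another.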


Direct Mapped Cache


The set associative cache operates in a fashion somewhat similar to the direct-mapped cache. Bits from the line address are used to address a cache directory. However, now there are multiple choices: two, four, or more complete line addresses may be present in the directory. Each of these line addresses corresponds to a location in a sub-cache. The collection of these sub-caches forms the total cache array. In a set associative cache, as in the direct-mapped cache, all of these sub-arrays can be accessed simultaneously, together with the cache directory. If any of the entries in the cache directory match the reference address, there is a hit, and the corresponding sub-cache array is selected and outgated back to the processor.
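A sketch of a 2-way set associative lookup, with made-up set count, tags, and data; the loop over ways models the simultaneous compare done in hardware:

```python
# Hypothetical 2-way set associative lookup; each set holds two
# (tag, data) entries. Set count and contents are illustrative.

NUM_SETS = 4   # set index = line address mod NUM_SETS

def lookup_set_associative(sets, line_addr):
    index = line_addr % NUM_SETS
    tag = line_addr // NUM_SETS
    for way, (stored_tag, data) in enumerate(sets[index]):
        if stored_tag == tag:               # matched: outgate this sub-cache
            return ("hit", way, data)
    return ("miss", None, None)

# Four sets, two ways each: (tag, data) pairs, all made up.
sets = [
    [(0, "s0w0"), (5, "s0w1")],
    [(1, "s1w0"), (9, "s1w1")],
    [(2, "s2w0"), (7, "s2w1")],
    [(3, "s3w0"), (8, "s3w1")],
]

print(lookup_set_associative(sets, 20))  # ('hit', 1, 's0w1'): set 0, tag 5
print(lookup_set_associative(sets, 21))  # ('miss', None, None): set 1 holds tags 1 and 9
```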


Set Associative Cache 

Some example calculations regarding caches:

Cache Time Analysis (Write Back, No Write Allocate)

Cache Specification:
  Cache Size = 2 KB
  Associativity = 4
  Words / Block = 8
  Hit Rate = 0.936
  Miss Rate = 0.064
Parameters:
  % Writes = 22%
  % Reads = 78%
  % Dirty = 10%
  Hit Time = 1 clock
  Miss Penalty = 40 clocks
Write Policies:
  Write Back
  No Write Allocate
Average Memory Access Time Calculations
Read Hit Contribution :
%Reads * HitRate * HitTime =
78% * 0.936 * 1 =
0.73008
Read Miss Contribution :
%Reads * MissRate * ((MissPenalty + HitTime) + (%Dirty * MissPenalty)) =
78% * 0.064 * (( 40 + 1 ) + ( 10% * 40 )) =
2.2464
Write Hit Contribution :
%Writes * HitRate * HitTime =
22% * 0.936 * 1 =
0.20592
Write Miss Contribution :
%Writes * MissRate * MissPenalty =
22% * 0.064 * 40 =
0.5632
Total Average Memory Access Time (avg. clocks per memory access) =
3.7456
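The write-back calculation above can be checked with a few lines of Python (variable names are just shorthand for the quantities in the worksheet):

```python
# Reproduces the write-back / no-write-allocate AMAT worked above.
reads, writes = 0.78, 0.22
hit_rate, miss_rate = 0.936, 0.064
hit_time, miss_penalty, dirty = 1, 40, 0.10

read_hit   = reads  * hit_rate  * hit_time
# A dirty victim line must be written back, adding %Dirty * MissPenalty.
read_miss  = reads  * miss_rate * ((miss_penalty + hit_time) + dirty * miss_penalty)
write_hit  = writes * hit_rate  * hit_time
write_miss = writes * miss_rate * miss_penalty

amat = read_hit + read_miss + write_hit + write_miss
print(round(amat, 4))  # 3.7456
```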

Cache Time Analysis (Write Through, No Write Allocate)

Cache Specification:
  Cache Size = 2 KB
  Associativity = 4
  Words / Block = 8
  Hit Rate = 0.936
  Miss Rate = 0.064
Parameters:
  % Writes = 25%
  % Reads = 75%
  % Dirty = 10%
  Hit Time = 1 clock
  Miss Penalty = 40 clocks
  Mem Write Time = 6 clocks
Write Policies:
  Write Through
  No Write Allocate
Average Memory Access Time Calculations
Read Hit Contribution :
%Reads * HitRate * HitTime =
75% * 0.936 * 1 =
0.702
Read Miss Contribution :
%Reads * MissRate * ( MissPenalty + HitTime ) =
75% * 0.064 * ( 40 + 1 ) =
1.968
Write Hit Contribution :
%Writes * HitRate * MemWriteTime =
25% * 0.936 * 6 =
1.404
Write Miss Contribution :
%Writes * MissRate * MemWriteTime =
25% * 0.064 * 6 =
0.096
Total Average Memory Access Time (avg. clocks per memory access) =
4.17
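Likewise for the write-through case; the Mem Write Time of 6 clocks is taken from the worked numbers above, and with write-through every write pays the memory write time whether it hits or misses:

```python
# Reproduces the write-through / no-write-allocate AMAT worked above.
reads, writes = 0.75, 0.25
hit_rate, miss_rate = 0.936, 0.064
hit_time, miss_penalty, mem_write_time = 1, 40, 6

read_hit   = reads  * hit_rate  * hit_time
read_miss  = reads  * miss_rate * (miss_penalty + hit_time)
# Write-through: writes go to memory regardless of hit or miss.
write_hit  = writes * hit_rate  * mem_write_time
write_miss = writes * miss_rate * mem_write_time

amat = read_hit + read_miss + write_hit + write_miss
print(round(amat, 2))  # 4.17
```

Note there is no dirty-line term here: write-through keeps memory up to date, so a replaced line never needs to be written back.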