Monday, 24 June 2013

Virtual Memory

Most modern computers use the programmers addresses as virtual
addresses. The virtual addresses must be converted to physical
addresses in order to access data and instructions in RAM.

The RAM is divided into many pages. A page is some number of
bytes that is a power of 2. A page could be as small as 2^12=4096
bytes up to 2^16=65536 bytes or larger. The page offset is the
address within a specific page. The offset is 12-bits for a
4096 byte page and 16-bits for a 65536 byte page.

The virtual address and physical address do not necessarily
have to be the same number of bits. The operation of virtual
memory is to convert a virtual address to a physical address:

          Programmers Virtual Address
  +----------------------------+-------------+
  |    Virtual Page Number VPN | page offset |
  +----------------------------+-------------+
               |                    |
               v                    |
              TLB                   |
               |                    |
               v                    v
    +--------------------------+-------------+
    | Physical Page Number PPN | page offset |
    +--------------------------+-------------+
                RAM Physical Address

TLB is the acronym for Translation Lookaside Buffer. The TLB
is the hardware on the CPU that converts the virtual page number
to a physical page number. The Operating System is the resource
manager and thus assigns each process the physical page numbers
that the process may use. The virtual page numbers come from the
programmers source code through compiler, assembler and loader
onto disk.




Two programs, p1 and p2, with code segments  p1c and p2c,
and data segments p1d and p2d. The operating system runs
a simple program as a process. Now, each segment is
divided into pages. p1c0, p1c1, p1c2 are the first three
pages of program 1 code segment. These are virtual pages.
These pages may be loaded into any physical pages in RAM.
Each segment is consecutive as stored on disk as an
executable program.
   disk pages, each line is a page
        ...
        p1c0   executable program 1
        p1c1
        p1c2
        p1d0
        p1d1
        ...
        p2c0   executable program 2
        p2c1
        p2d0
        p2d1
        p2d2

There are also other types of segments.
You may recall from Homework 3, the address of
"main" was  0x08048390  28 bits of virtual address.

The page size may be chosen by the operating system author or
in some computer architectures the page size is determined by
the hardware, as shown below.

As time goes on, the operating system allocates and frees physical pages.
Physical memory could look like this at some time:
(Each line is a page, e.g. 8192 bytes)

      os0  operating system pages
      os1
      ...
      osn
      p2d3  somewhat randomly scattered pages
      p1c2
      empty
      p2c0
      p2c1   
      p1d5
      p1c4
      etc

Pages for a program may not be contiguous.
Pages for a segment of a program may not be contiguous.
Basically, any virtual page can be in any physical page.
Code and data segments may not all be in physical memory.


A TLB attached to a cache. Any cache could be used,
a simple one word per block cache is shown.
Note that the TLB is fully associative.




A flow diagram showing the logical steps to get from
an executable programs virtual address to a physical address
that can access RAM.





Note that a TLB is a cache yet it typically has some extra
complexity. In addition to the valid bit there may be a
"read only" bit that can easily prevent a store operation
into a page. Another bit may be an "execute only" bit for
instruction pages that prevents both load and store operations.

A required bit is a "dirty" bit. Consider a page that is
referenced: The page must be loaded from disk or may be
a created page of zeros in RAM. Then eventually that page
in RAM is needed for some process. If any store operation
changed that page in RAM, the page must be written out
to disk. The page is "dirty" meaning changed. If the page
in RAM is not dirty, the new page information just
over writes the physical page in RAM with some other
page.

A significant performance requirement for the operating
system is to efficiently handle paging. If there are
no physical pages on the OS free page list, a Least
Recently Used, LRU, strategy is typically used to
choose a page to over write.

The specific architecture of the TLB must be known in order
to compute the number of bits of storage needed.

Given: a 36-bit virtual address,
       a 32-bit physical address,
       a 8192 byte page:
Compute:
 log2 8192 = 13-bit page offset. (2^13=8192)
 Thus the VPN is 36-13 = 23-bits
      the PPN is 32-13 = 19-bits

 or, drawn
          Programmers Virtual Address 36-bits
  +----------------------------+-------------+
  |    Virtual Page Number VPN | page offset |
  |      23-bits               |  13-bits    |
  +----------------------------+-------------+
               |                    |
               v                    |
              TLB                   |
               |                    |
               v                    v
    +--------------------------+-------------+
    | Physical Page Number PPN | page offset |
    |  19-bits                 |  13-bits    |
    +--------------------------+-------------+
                RAM Physical Address  32-bits

Given 128 blocks in the TLB,
      3 bits for valid, dirty and ref
Compute:
   log2 128 = 7-bits in TLB index
   VPN = 23 - 7-bit index gives 16 bits in TLB tag
or drawn
    V D R tag 16-bits       PPN 19-bits       3+16+19=38-bits
   +-+-+-+------------+---------------------+
   | | | |            |                     |
   +-+-+-+------------+---------------------+
                ...                            128 of these
   +-+-+-+------------+---------------------+
   | | | |            |                     |
   +-+-+-+------------+---------------------+

thus 38 * 128 = 4864 bits in TLB.

Now, given a simple page table is used, indexed by VPN,
the page table has 2^VPN = 2^23 = 8,388,608 entries.

Given a page table with three control bits V,D and R
and a Physical Page Number then the page table needs
 1 + 1 + 1 + 19 = 22 bits.
Total bits 22 * 8,388,608 = 184,549,376 bits.
Using power of 10, 184*10^6, 184 million bits, Each 
process requires a page table. Fortunately, the OS uses
intelligence and only builds a page table big enough
for the size of the program or possibly for only the
pages that are actually used. The page table itself is
in a page and may, of course, be paged out. :)

A reminder on bits in address vs size of storage:
  bits    size              approximate
   10     kilobyte  2^10    10^3
   20     megabyte  2^20    10^6
   30     gigabyte  2^30    10^9
   40     terabyte  2^40    10^12
   50     petabyte  2^50    10^15

Actually, modern computers use a hierarchy of page tables.