Sunday, 3 November 2013

VIRTUAL MEMORY

All the memory management policies we have discussed so for try to keep a number of processes in memory simultaneously to allow multiprogramming, but they require an entire process to be loaded in memory before it can execute.
With the virtual memory technique, we can execute a process which is only partially loaded in memory. Thus, the logical address space may be larger than physical memory, and we can have more processes executing in memory at a time, hence a greater degree of multiprogramming.
Demand paging
Demand paging is the most common virtual memory oriented memory management system. It is based on the locality model of program execution:
AS A PROGRAM EXECUTES, IT MOVES FROM LOCALITY TO LOCALITY.
Locality is defined as a set of pages actively used together. A program is composed of several different localities which may overlap. For example, a small procedure when called, defines a new locality. A repeat-until statement when being executed defines a new locality.
Procedural languages with while-do repeat-until for-do constructs (i.e. PASCAL, C, ALGOL, etc.) have less frequently changing localities than other high-level languages with go-to constructs. (i.e. FORTRAN, BASIC)
In demand paging programs reside on a swapping device commonly known as the backing store. The backing store, for most of today's operating systems is a disk.
When the operating system decides to start a new process (remember our discussion on the long term scheduler in chapter 4), it swaps only a small part of this new process (a few pages) into memory. The PT of this new process is prepared and loaded into memory, and the valid/invalid bits of all pages that are brought into memory are set to "valid". All the other valid/invalid bits are set to "invalid" showing that those pages are not brought into memory.
If the process currently executing tries to access a page which is not in memory, a PAGE FAULT occurs, and the operating system must bring that page into memory.
A page fault is serviced in a number of steps :
a. Service the page fault
b. Swap in the page
c. Restart the process which caused the page fault
In more detail:
1. Trap to the OS
2. Save registers and process state for the current process
3. Check if the trap was caused because of a page fault and whether the page reference is legal
4. If yes, determine the location of the required page on the backing store
5. Find a free frame
6. Read the required page from the backing store into the free frame. (During this I/O, the processor may be scheduled to some other process)
7. When I/O is completed, restore registers and process state for the process which caused the page fault and save state of the currently executing process
8. Modify the corresponding PT entry to show that the recently copied page is now in memory
9. Resume execution with the instruction that caused the page fault.
Steps a. and c. can be performed in about 100 instructions, so in about 100 ® 1000 microseconds.
Step b. can be performed in about 10 ® 30 milliseconds (we may also have to include device queuing time if a number of processes are waiting for disk input/output).
Calculation of the performance of demand paging systems
If there is no page fault:
effective access time = effective memory access time.
Let p be the probability of a page fault. (0 <= p <= 1).
Then, the effective access time must be calculated as follows :
e.a.t. = p * "page fault service time" + (1-p) * emat
For example, if memory access time for a system is given as 1 microseconds, and the average page fault service time is given as 10 milliseconds, then,
effective access time = p * (10 msec) + (1-p) * 1 msec
= 1 + 9999 * p msec.
If p is given as 0.001, then e.a.t. = 11 msec.
Page Replacement
While executing a process, a page fault occurs. The OS finds the desired page on the backing store, but there are no free frames. In that case, the OS must choose a page in memory (which is not the one being used) as a victim, and must swap it out to the backing store.
Then, the OS changes the valid/invalid bit for the victim page in the PT to indicate that it is no longer in memory. It swaps the desired page into the newly freed frame, modifies the frame table, and sets the valid/invalid bit of the new page to valid. The executing process may now go on.
Therefore, in servicing a page fault, the OS performs two page transfers:
a. Swap the victim page out,
b. Swap the desired page in.
Virtual memory can be implemented in:
a. paging systems
b. paged segmentation systems
c. segmentation systems (However, segment replacement is much more complex then page replacement, because segments are of variable size).
In the rest of this chapter, we shall try to find answers to the following two crucial questions for demand paging systems:
a. How many frames should we allocate to each process?
b. How shall we select the victim page when page replacement is required?
Page replacement algorithms:
A page replacement algorithm determines how the victim page(s) is selected when a page fault occurs. The aim is to minimize the page fault rate.
The efficiency of a page replacement algorithm is evaluated by running it on a particular string of memory references and computing the number of page faults.
Reference strings are either generated randomly, or by tracing the paging behavior of a system and recording the page number for each logical memory reference.
Consecutive references to the same page may be reduced to a single reference, because they won't cause any page fault except possibly the first one.
Example :
(1,4,1,6,1,1,1,3) ---> (1,4,1,6,1,3).
We have to know the number of page frames available in order to be able to decide on the page replacement scheme of a particular reference string. Normally, one would expect that with the total number of frames increasing, the number of page faults caused by a particular reference string should decrease.
With the above reference string, if we had 4 frames, it would cause only 4 page faults : one fault for the first reference to each page.
If we had only a single frame, then the above string would cause 6 page faults, one for every new page reference.
In the following part of this chapter, we shall introduce three different page replacement algorithms.
1. Optimal Page Replacement Algorithm (OPT)
With this algorithm, we replace that page which will not be used for the longest period of time. OPT has the lowest page fault rate for a fixed number of frames among all the page replacement algorithms. However, it is not possible to implement in practice, because it requires future knowledge. It is mainly used for performance comparison.
Example : Consider the reference string :
5,7,6,0,7,1,7,2,0,1,7,1,0.
and assume there are 3 page frames:
5

7

6

0

7

1

7

2

0

1

7

1

0

f1
5

5

5

0

0

0

0

0

0

0

0

0

0
f2


7

7

7

7

7

7

2

2

2

7

7

7
f3




6

6

6

1

1

1

1

1

1

1

1
pf
1

2

3

4

same

5

same

6

same

same

7

same

same
This string causes 7 page faults.
2. First-In-First-Out (FIFO)
This is a simple algorithm, and easy to implement. The idea is straight forward, choose the oldest page as the victim.
Example : Assume there are 3 page frames, and consider the following reference string:
5,7,6,0,7,1,7,2,0,1,7,1,0.
5

7

6

0

7

1

7

2

0

1

7

1

0

f1
5

5

5

0

0

0

0

2

2

2

7

7

7
f2


7

7

7

7

1

1

1

0

0

0

0

0
f3




6

6

6

6

7

7

7

1

1

1

1
pf
1

2

3

4

same

5

6

7

8

9

10

same

same
10 page faults are caused by FIFO.
Belady's Anomaly
We said that normally as the number of page frames increase, the number of page faults should decrease. However, for FIFO there are cases where this generalization will fail! This is called Bleady's Anomaly. Notice that OPT never suffers from Belady's anomaly. 
Example : Consider the reference string:
1,2,3,4,1,2,5,1,2,3,4,5.
 3. Least Recently Used (LRU)
The idea in LRU is to use recent past as an approximation of the near future. LRU replaces that page which has not been used for the longest period of time. So, the operating system has to associate with each page the time it was last used.
Example : Consider
5,7,6,0,7,1,7,2,0,1,7,1,0.
Assume there are 3 page frames:
5

7

6

0

7

1

7

2

0

1

7

1

0

f1
5

5

5

0

0

0

0

2

2

2

0

0

0
f2


7

7

7

7

7

7

7

7

1

1

1

1
f3




6

6

6

1

1

1

2

2

2

2

2
pf
1

2

3

4

same

5

same

6

7

8

9

same

same
9 page faults are caused by LRU.
LRU algorithm does not suffer Belady's anomaly, since it is a stack algorithm. It can be shown that the set of pages in memory when the number of frames is n, is always a subset of the set of pages in memory when the number of frames is n+1.
LRU requires extra hardware to determine and order between pages defined by the last-use time. For implementation with minimum extra hardware, we can use a "reference bit" for each frame. The OS sets the reference bit of a page to "1" when it is referenced. This bit will not give the order of use but it will simply tell whether the corresponding frame is referenced or not. The OS resets all reference bits periodically.
Dirty Bit:
In order to reduce the page fault service time, a special bit called the dirty bit can be associated with each page.
The dirty bit is set to "1" by the hardware whenever the page is modified. (written into).
When we select a victim by using a page replacement algorithm, we examine its dirty bit. If it is set, that means the page has been modified since it was swapped in. In this case we have to write that page into the backing store.
However if the dirty bit is reset, that means the page has not been modified since it was swapped in, so we don't have to write it into the backing store. The copy in the backing store is valid.
Using a dirty bit may reduce the page fault service time by half.
Allocation of frames among processes:
There is a minimum number of frames that must be allocated to each process because of the hardware requirements. There must be enough frames that any single instruction can reference.
In page replacement, one of the following two policies can be applied :
a. Local Replacement : each process can replace only from its own set of allocated page frames.
b. Global Replacement : a process can replace any page in memory.
With global replacement, the same program may perform quite differently in different executions, because the set of pages in memory for that program depends on the paging behavior of other processes as well as itself.
Frame allocation algorithms
a. Equal allocation: There are n frames, and p processes. allocate n/p frames to each process.
b. Proportional allocation:
Let the virtual memory size for process i be v[i].
Let there be m processes and n frames.
The total virtual memory size will be : V = v[i]
Allocate ( v[i] / V ) * n frames to process i.
Example : There are 64 frames in a system. There are four processes with the following virtual memory sizes :
v[1] = 16 v[2] = 128 v[3] = 64 v[4] = 48
a. Equal allocation : allocate 64 / 4 = 16 frames to each process.
b. Proportional allocation : V = 16 + 128 + 64 + 48 = 256
Allocate 16 / 256 * 64 = 4 frames to process 1,
128 / 256 * 64 = 32 frames to process 2,
64 / 256 * 64 = 16 frames to process 3, and
48 / 256 * 64 = 12 frames to process 4.
Thrashing
A process is thrashing if it is spending more time for paging-in/out (because of frequent page faults) than executing.
Thrashing causes considerable degradation in system performance. If a process does not have enough number of frames allocated to it, it will issue a page fault. A victim page must be chosen, but all the pages are in active use. So, the page chosen as victim and replaced will be needed in a very short time. This means another page fault will be issued shortly, and so on and so forth.
In case a process thrashes, the best thing to do is to suspend its execution and page out all its pages in memory to disk. ( This is swapping, or, intermediate term CPU scheduling.)
Local replacement algorithms can limit the effects of thrashing.
If the degree of multiprogramming is increased over a limit, processor utilization falls down considerably because of thrashing.
To prevent thrashing, we must provide a process as many frames as it needs. For this, we shall use the working set model, which depends on the locality model of program execution, discussed earlier.
We shall use a parameter, D , called the working set window. We shall examine the last D page references.
The set of pages in the last page references shall be called the working set of a process.
Example : assume D = 10 , and the reference string is :
2 2 1 5 7 7 7 7 5 1 3 4 4 4 3 4 3 4 4 4
+ + +
t1 t2 t3
WS(t1) = {2,1,5,7} WS(t2) = {7,5,1,3,4} WS(t3) = {3,4}
Note that in calculating the working sets, we cannot reduce consequent references to the same page to a single reference. Choice of D is crucial. If D is to small, it will not cover the entire working set. If it is too large, several localities of a process may overlap. Madnick and Donovan suggested D to be about 10.000 references.
Now, compute the WS size (WSS) for each process, and find the total demand, D of the system at that instance in time, as the summation of all the WS sizes.
If the number of frames is n, then
a. If D > n , the system is thrashing.
b. If D < n, the system is all right, the degree of multiprogramming can possibly be increased.
In order to be able to use the working set model for virtual memory management, the OS keeps track of the WS of each process. It allocates each process enough frames to provide it with its WS.
If at one point D > n, OS selects a process to suspend. The frames that were used by the selected process are reallocated to other processes.
We can also use the page fault frequency to decide on decreasing or increasing the no. of frames allocated to a process.
 QUESTIONS
1. Consider a demand paging system with associative registers.
a. If the hit ratio for the system is 80%, and the main memory access time is 950 nanoseconds, calculate the effective memory access time (emat). Assume that associative registers can be loaded in parallel with an access to the main memory.
b. Let the probability of a page fault be 0.005 and calculate the effective access time if the page fault service time is given as 5 msec. Use the emat calculated in part 'a'.
2. Consider a demand paging system. assume working set window is 7, and the following reference string is given for process P:
1, 2, 1, 4, 3, 4, 1, 2, 1, 4, 5, 2, 5, 3, 5, 2, 3
tnow
a. What is the current working set of this process?
b. What is the current working set size of this process?
c. What happens if the summation of working set sizes for all the currently executing processes is larger than the total number of frames in the memory?
d. What would you suggest to solve the problem in 'c'?
3. The following reference string is given for a process executing in a demand paging system which uses FIFI page replacement algorithm:
4, 3, 2, 1, 4, 3, 5, 4, 3, 2, 1, 5
a. If there are three memory frames allocated to these process, give a picture of all p ages in memory for the given reference string, after each page fault.
b. Repeat part 'a' for four memory frames.
c. Comment on the total number of page faults in part 'a' and part 'b'.
4. Consider a demand paging system with the following observed average device utilization values:
processor ut.=30% disk ut.=95% I/O devices ut.=10%
Discuss whether the following measures can improve the processor utilization: (considering them one by one)
a. Replace the processor with a slower one.
b. Replace the processor with a faster one.
c. Replace the disk with a faster one.
d. Increase the degree of multiprogramming.
e. Decrease the degree of multiprogramming.
f. Replace the I/O devices with faster ones.
5. In a demand paging system, the page replacement policy is: examine each page in memory regularly, and swap those pages, which have not been used since the last examination, to disk. Discuss whether this policy would result in better performance than LRU page replacement policy or not.
6. Prove that OPT and LRU page replacement algorithms do not suffer Bleady's anomaly.
7. Consider a demand paging system. The effective memory access time is 8000 nanoseconds. It takes 9 milliseconds to service a page fault if a free frame is found in memory, or the page to be replaced is not dirty, and takes 22 milliseconds to service a page fault if there are no free frames and the victim p age is dirty. In 7 out of every 10 page faults, a victim page has to be chosen and in 5 of these 7 cases, the victim page is dirty. What is the acceptable page fault rate for this system in order to achieve an effective access time of no more than 2 milliseconds ?
8. In a virtual memory system, what determines the size of a logical address space? What determines the size of a physical address space? What (if any) relationship is there between them?
9. A working set window is the most recent k page references made by a process. Assume the most recent page references of a process are :
..., 6, 5, 3, 3, 4, 3, 2, 1, 4, 2
and k is 7.
a. What is the current working set window of this process?
b. What is the current working set of this process?
c. Explain briefly what happens when k is too large.
d. Explain briefly what happens when k is too small.
10. a. Explain how the LRU page replacement algorithm works.
b. Why do we need extra hardware such as a dirty bit and a reference bit in order to implement the LRU page replacement algorithm in a demand paging system?
c. A demand paging system with 4 page frames uses the LRU algorithm for page replacement. Give a diagram of all pages in memory after each page fault, for the following reference str