Why we want to run transactions concurrently?
Concurrent or overlapping execution of transactions are efficient
How we ensure the correctness of the concurrent transactions?
Concurrency control(Serializability) and Recovery are two criteria that ensure the correctness of concurrent transactions
Why concurrency control is needed
Lost Update : update of some data by one transaction is lost by update from another transactionDirty Read or temporary update problem : one transaction updates the value of common data and aborts before it can revert the changes transaction 2 reads the value of updated variable.
incorrect summary problem: transaction reads the data while another transaction is still changing the data
Why recovery is needed
In any kind of
problem like hardware malfunction, software error exceptions or
violating the concurrency property, deadlock recovery of transaction is
needed
Transaction states
- begin transaction marks the beginning of transaction.
- end_transaction specifies transaction execution is complete and system check whether changes can be permanently applied.
- rollback or abort for unsuccessful end of tranaction
Fig. transaction states |
- At commit point all transaction operations have been logged and new entry is done in log 'commit T' stating that all transaction operation permanently logged
- before writing commit T the complete log should be written to disk from buffers
- Rollback : when commit T statement is not found in log, its rollbacked
How recoverability is implemented
System log is kept on disk which logs transaction like write old value new value read.Protocol that do not provide cascading rollback do not need to keep read entry
Schedules
Recoverable Schedule : If T2 reads a data item written by T1 commit operation of T1 should appear before commit operation of T2.
Cascadeless Schedule: If T2 reads a data item written by T1 commit operation of T1 should appear before read operation of T2.
Strict Schedule : If a write operation of T1 precedes a conflicting operation of T2 (either read or write), then the commit event of T1 also precedes that conflicting operation of T2.
Fig. schedules recoverability |
pattern for recoverable schedule would be like
Serializabality might be compromised in some cases but
recoverability compromise would mean violating database integrity.
Isolation levels decide tradeoff between correctness and concurrency
Isolation level
when the changes made by one operation becomes visible to another operation
most relaxed ACID property
From more stringent to relaxed isolation levels
Serializable as if transactions execute in complete isolation, in serial fashion. read locks released immediately but write locks released at end of transaction
Read Uncommited Dirty reads allowed that is another transaction reads the data value but then first transaction can revert so other transaction has incorrect value
Repeatable read: A phantom read occurs when range of rows is read by one transaction and another transaction inserts or delete a record in between rows returned by same query is different at two points in time for a transaction that is another transaction updates and commits in between
Read commited(Non repeatable read):prevents dirty read that is reads commited data only.Fig. simple pattern for recoverability schedules on vertical time line |
What is Serializability
If
executing interleaved transaction results in same outcome as serial
schedule(running transaction in some sequnence) then they are considered
serializable. this schedule is type of nonserial schedule.
Isolation level
when the changes made by one operation becomes visible to another operation
most relaxed ACID property
From more stringent to relaxed isolation levels
Serializable as if transactions execute in complete isolation, in serial fashion. read locks released immediately but write locks released at end of transaction
Read Uncommited Dirty reads allowed that is another transaction reads the data value but then first transaction can revert so other transaction has incorrect value
Repeatable read: A phantom read occurs when range of rows is read by one transaction and another transaction inserts or delete a record in between rows returned by same query is different at two points in time for a transaction that is another transaction updates and commits in between
UPDATE or DELETE of any of the rows read by earlier transaction
Dirty Read
|
Lost Update
|
Phantom Records
| ||
Read uncommitted
|
yes
|
yes
|
yes
| |
Read committed
|
No
|
yes
|
yes
| |
Repeatable read
|
No
|
no
|
yes
| |
Serializable
|
No
|
no
|
no
|
Types of serializability
View and conflict serializability
- conflict is subset of view serializability
- Conflict is widely utilized because it is easier to determine and covers a substantial portion of the view serializable
In view serializable, two schedules write and read the same data values.
and In conflict seriablizable, same set of respective chronologically ordered pairs of conflicting operations.
Conflict serializable
In conflict serializabability two schedules are conflict equivalent and we can reorder the non conflicting operation to get the serial schedule
Conflicting operation
1) they are upon same data item
2)At least one of them is write
3) they are from different transactions
Non commutative that is their orders matter
Testing conflict serializability
Test is through precedence graph. The acyclic preced graph shows conflict serialiczable schedule and topological sorting of that graph gives the serializable schedules
cycle of commited transaction can be prevented by aborting an undecided transaction on each cycle in precedence graph of all transaction
view serializability is NP complete
materialized conflict if the requested conflicting operation is actually executed
//pending
Problem 1 Consider two schedules
S1 : r1(x) w1(x) r1(y) c1
S2: r2(x) r2(y) c2
1.What the number of possible schedules.
2. How many serial schedules are possible
3. Determine type of schedules(recoverable cascadeless strict) and serializability(conflict seriablizable or not) for below schedules
S1: r 1 (X); w 1 (X); r 2 (X); w 2 (X); r 1 (Y); w 1 (Y); C 2 ; C 1
S2: r 1 (X); r 2 (X); w 1 (X); r 1 (Y); w 1 (Y); C 1 ; w 2 (X); C 2
S3: r 1 (X); w 1 (X); r 2 (X); w 2 (X); r 1 (Y); w 1 (Y); C 1 ; C 2
Solution
1.In general, given m transactions with number of operations n1, n2, ..., nm, the number
of possible schedules is: (n1 + n2 + ... + nm)! / (n1! * n2! * ... * nm!)
here m=2 and n1= 4 and n2 = 3
so (3+4)! / (3!*4!)
7C3 or 7C4
2.Number of serial schedules is m! where m is number of transactions
so here we have 2!
3. To check type of schedule
we need to check all conflicting operations of schedule
S1 : r2(X) comes after w1(x) , since no commit before r2(X) its not cascadeless and strict
commit of T1 should be before T2 in this case for it to be recoverable which is not so it is not even recoverable. Its non recoverable schedule
S2: there is only one conflicting operation w1(X) w2(X) and for that we have commit of T1 before w2(X) so its strict schedule and so is cascadeless also
S3: for one of the conflicting operation w1(X) r2(X) there is no commit in between so its not cascadeless and strict. To check for recoverable commit of T1 should be before that of T2 which is there
To check serializability
S1 precedence graph is
T1 ---> T2
acyclic so its conflict serializable
S2 precedence graph
T1-------->T2
^ |
|_________|
there is cycle so its not conflict serializable
//pending problems
http://academic.udayton.edu/SaverioPerugini/courses/Winter2006/cps432/index.html
Very clear. I love your article.
ReplyDelete