Tuesday 26 February 2013

FAULT TOLERANT SYSTEMS

LECTURE #    FILES
Dependability Concepts

Lecture 01 [PPT] [pdf]
Lecture 02 [PPT] [pdf]
Lecture 03 [PPT] [pdf]


Fault-Tolerant (FT) Design Techniques
Lecture 04 [PPT] [pdf]
Lecture 05 [PPT] [pdf]
Information Redundancy - self reading [pdf]


Dependability Modeling
Reliability, MTTF, etc. [PPT]
Fault Tree Analysis [PPT]
Petri Nets


FT in Distributed Systems
Stable storage -- RAID [PPT]
Stable storage - advanced RAID [PPT]
Consensus [PPT]
Clock Synchronization [PPT]
System-level diagnosis [PPT]
Checkpoint and Rollback recovery [PPT]
Atomic actions -- Lock & Commit Protocols [PPT]
Replica management protocols [PPT]


FT in Networks
Dependable communication - 1 [PPT]
Primary-backup path [PPT]
Fault Localization [PPT]
Dependability-Security [PDF]


FT in Real-Time Systems
Lecture 06 [PPT] [pdf]
Lecture 07 [PPT] [pdf]
Lecture 08 [PPT] [pdf]




Spring 2010 Student Presentations
Recovery-Oriented Computing - Peter Scott [PDF]
ZFS - a RAID-based file system - Henri Bai http://hub.opensolaris.org/bin/view/Community+Group+zfs/docs
2-dimensional error coding - Long Chen [PDF]
Software based fault detection - Tim Prince [PPT]
Self Recovery of Server Programs - Chesta Dwivedi [PPT]
Dynamic Fault Trees - Ashok Aditya [PPT]
Device Failure Tolerance Using Software - Haribabu Narayanan [PPT]
FPGA Fault Tolerance - Matt Clausman [PPT]
Byzantine Storage - Debkanta Chakraborty [PPT]




Spring 2009 Student Presentations
Fault-Tolerant Internet Services -- Indranil Roy [PPT]
Checkpoint Recovery in Petaflop systems -- Paul Jennings [PPT]
Highly Available Systems - Case Study -- Cory Kleinheksel [PPT]
Fault-Tolerant TCP Server -- Preethika K. [PPT]
Fault-Tolerant CORBA (NVP implementation) -- Indranil Roy [PPT]
Fault-Tolerant Multipath Routing - Ganesan Mani [PPT]
Petri Net modeling - Phased Mission - Siddharth Sridhar [PPT]




Spring 2007 Student Presentations
Energy-aware scheduling Weakly-hard real-time systems (Julie Rursch) [PPT]
Fault-Tolerance in Multiprocessor SoC (Premkumar) [PPT]
Fountain Codes (Long Long) [PPT]
Network Time Protocol (Lizandro) [PPT]
RAID architectures (Russell Graves) [PDF]


Spring 2006 Student Presentations
Architecture fault-tolerance (Viswanathan) [PPT]
Advanced Quorum protocols (Kamna) [PPT]
Fault-tolerant objects (Bebek) [PPT]
Hierarchical system-level Diagnosis (Qin Wen) [PPT]
Checkpointing in mobile systems (Ben) [PPT]
Dependability and Security (Srdjan) [PPT]
Decidability and Schedulability -- Timed Automata () [PPT]
System-level diagnosis in adhoc networks (Kavitha) [PPT1] [PPT2]

NOTE: You can print the handout slides from Microsoft PowerPoint.

Fault-Tolerant Systems
General Description
Computers and networks are increasingly used in critical applications, where system failures can be expensive or even catastrophic. Example applications include aircraft fly-by-wire control, automobile control, computers used in medical systems, spacecraft, and databases in a large variety of financial and enterprise applications. The overall reliability expected of a computer system in these applications far exceeds that of any individual computer. This course is about how to build a highly reliable system that continues to function acceptably even after a number of its components (hardware or software) have failed.
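
A small calculation can illustrate how a system can be more reliable than any one of its computers. The sketch below is not taken from the course materials. It assumes that components fail independently, that the voter combining their outputs never fails, and that each component works with the same probability; the function name `k_of_n_reliability` and the value 0.9 are made up for illustration.

```python
from math import comb


def k_of_n_reliability(k: int, n: int, r: float) -> float:
    """Probability that at least k of n components are working.

    Assumes independent failures and the same per-component
    reliability r (a simplification chosen for this example).
    """
    return sum(comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(k, n + 1))


# Hypothetical numbers: one computer that works with probability 0.9,
# compared with three such computers and a majority (2-of-3) vote.
single = 0.9
triplicated = k_of_n_reliability(2, 3, single)
print(f"single: {single:.3f}  2-of-3: {triplicated:.3f}")  # 2-of-3 is 0.972
```

Under these assumptions, redundancy helps only when the individual components are already fairly good: for a per-component reliability below 0.5, the 2-of-3 arrangement comes out worse than a single component.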

Main Topics
  • Introduction to fault tolerance.
  • Measures of fault tolerance.
  • Exploiting and managing redundancy in:
    • Hardware.
    • Software.
    • Time.
    • Data.
  • Network fault tolerance.
  • Issues in distributed systems.
    • Byzantine generals algorithm.
    • Fault-tolerant clock synchronization.
    • Reliable remote procedure calls.
  • Reliability evaluation techniques.
Slides: HWFT Part 1
Slides: HWFT Part 2
Slides: HWFT Part 3
Slides: Networks Part 1 
Slides: Networks Part 2 
Slides: Networks Part 3 
Slides: Data Replication
Slides: Checkpointing Part 1 
Slides: Checkpointing Part 2 
Slides: Checkpointing Part 3 
Slides: Coding
Slides: Coding Part 2
Slides: Software Fault Tolerance Part 1
Slides: Software Fault Tolerance Part 2
Byzantine Generals Algorithm
Slides: Byzantine Generals algorithm
 
