LECTURE # | FILES |
Dependability Concepts | |
Lecture 01 | |
Lecture 02 | |
Lecture 03 | |
Fault-Tolerant (FT) Design Techniques | |
Lecture 04 | |
Lecture 05 | |
Information Redundancy - self reading | PDF |
Dependability Modeling | |
Reliability, MTTF, etc. | PPT |
Fault Tree Analysis | PPT |
Petri Nets | |
FT in Distributed Systems | |
Stable storage -- RAID | PPT |
Stable storage - advanced RAID | PPT |
Consensus | PPT |
Clock Synchronization | PPT |
System-level diagnosis | PPT |
Checkpoint and Rollback recovery | PPT |
Atomic actions -- Lock & Commit Protocols | PPT |
Replica management protocols | PPT |
FT in Networks | |
Dependable communication - 1 | PPT |
Primary-backup path | PPT |
Fault Localization | PPT |
Dependability-Security | |
FT in Real-Time Systems | |
Lecture 06 | |
Lecture 07 | |
Lecture 08 | |
Spring 2010 Student Presentations | |
Recovery-Oriented Computing - Peter Scott | |
ZFS - a RAID based file system - Henri Bai | http://hub.opensolaris.org/bin/view/Community+Group+zfs/docs |
2-dimensional error coding - Long Chen | |
Software based fault detection - Tim Prince | PPT |
Self Recovery of Server Programs - Chesta Dwivedi | PPT |
Dynamic Fault Trees - Ashok Aditya | PPT |
Device Failure Tolerance Using Software - Haribabu Narayanan | PPT |
FPGA Fault Tolerance - Matt Clausman | PPT |
Byzantine Storage - Debkanta Chakraborty | PPT |
Spring 2009 Student Presentations | |
Fault-Tolerant Internet Services -- Indranil Roy | PPT |
Checkpoint Recovery in Petaflop systems -- Paul Jennings | PPT |
Highly Available Systems - Case Study -- Cory Kleinheksel | PPT |
Fault-Tolerant TCP Server -- Preethika K. | PPT |
Fault-Tolerant CORBA (NVP implementation) -- Indranil Roy | PPT |
Fault-Tolerant Multipath Routing - Ganesan Mani | PPT |
Petri Net modeling - Phased Mission - Siddharth Sridhar | PPT |
Spring 2007 Student Presentations | |
Energy-aware scheduling Weakly-hard real-time systems (Julie Rursch) | PPT |
Fault-Tolerance in Multiprocessor SoC (Premkumar) | PPT |
Fountain Codes (Long Long) | PPT |
Network Time Protocol (Lizandro) | PPT |
RAID architectures (Russell Graves) | |
Spring 2006 Student Presentations | |
Architecture fault-tolerance (Viswanathan) | PPT |
Advanced Quorum protocols (Kamna) | PPT |
Fault-tolerant objects (Bebek) | PPT |
Hierarchical system-level Diagnosis (Qin Wen) | PPT |
Checkpointing in mobile systems (Ben) | PPT |
Dependability and Security (Srdjan) | PPT |
Decidability and Schedulability -- Timed Automata () | PPT |
System-level diagnosis in adhoc networks (Kavitha) | PPT1, PPT2 |
NOTE: You can print the handout slides from Microsoft PowerPoint.
Fault-Tolerant Systems
General Description
Computers and networks are increasingly used in critical applications,
where system failures can be expensive or even catastrophic. Example
applications include aircraft fly-by-wire control, automobile control,
medical systems, spacecraft, and databases in a wide variety of
financial and enterprise applications. The overall reliability expected
of a computer system in these applications far exceeds that of any
individual computer. This course is about how to build a highly reliable
system that continues to function acceptably even after a number of its
components (hardware or software) have failed.
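To make the claim about exceeding single-computer reliability concrete, here is a minimal sketch of the standard triple modular redundancy (TMR) calculation: three identical modules feed a majority voter, and the system works as long as at least two modules work. The sketch assumes independent module failures and a perfect voter; the function name `tmr_reliability` is hypothetical and not taken from the course materials.

```python
def tmr_reliability(r: float) -> float:
    """Reliability of a 2-out-of-3 majority-voted system.

    Assumes the three modules fail independently, each with
    reliability r, and that the voter never fails.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must be a probability in [0, 1]")
    # P(all three work) + P(exactly two work)
    # = r^3 + 3 * r^2 * (1 - r) = 3r^2 - 2r^3
    return 3 * r**2 - 2 * r**3


if __name__ == "__main__":
    for r in (0.9, 0.99, 0.999):
        print(f"module reliability {r}: TMR reliability {tmr_reliability(r):.6f}")
```

Under these assumptions the redundant system is more reliable than a single module only when the module reliability is above 0.5; below that point, majority voting makes things worse.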
Main Topics
Slides: HWFT Part 2
Slides: HWFT Part 3
Slides: Networks Part 1
Slides: Networks Part 2
Slides: Networks Part 3
Slides: Data Replication
Slides: Checkpointing Part 1
Slides: Checkpointing Part 2
Slides: Checkpointing Part 3
Slides: Coding
Slides: Coding Part 2
Slides: Software Fault Tolerance Part 1
Slides: Software Fault Tolerance Part 2
Byzantine Generals Algorithm
Slides: Byzantine Generals algorithm
- Introduction to fault tolerance.
- Measures of fault-tolerance.
- Exploiting and managing redundancy in:
  - Hardware.
  - Software.
  - Time.
  - Data.
- Network fault tolerance.
- Issues in distributed systems.
  - Byzantine generals algorithm.
  - Fault-tolerant clock synchronization.
  - Reliable remote procedure calls.
- Reliability evaluation techniques.