Wednesday, 19 February 2014

Dependable Systems: Introduction

"Fundamental Concepts of Dependability" outlines the results of nearly 20 years of activity in this domain and related ones. I will use the concepts and taxonomy it introduces in my further posts. The next figure (taken from the article) shows the so-called 'dependability tree', which gives some intuition for what it is all about:


Dependability is a system characteristic, like functionality, performance, or cost. The formal definition is as follows: "Dependability of a (computing) system is the ability to deliver service that can justifiably be trusted".

According to the dependability tree, we can describe dependability from three points of view: attributes, means, and threats. Attributes - a kind of sub-characteristic of dependability: 
  • Availability: readiness for correct service 
  • Reliability: continuity of correct service 
  • Safety: absence of catastrophic consequences on the user(s) and the environment 
  • Confidentiality: absence of unauthorized disclosure of information 
  • Integrity: absence of improper system state alterations 
  • Maintainability: ability to undergo repairs and modifications 
Means - the ways in which dependability is attained:
  • Fault prevention: how to prevent the occurrence or introduction of faults; 
  • Fault tolerance: how to deliver correct service in the presence of faults; 
  • Fault removal: how to reduce the number or severity of faults; 
  • Fault forecasting: how to estimate the present number, the future incidence, and the likely consequences of faults. 
Threats - classification of threats:
  • A fault is a defect in the system that can be activated and become the cause of an error (a broken wire, an electrical short, a bug in a program). 
  • An error is an incorrect internal state of the system, or a discrepancy between the intended behavior of the system and its actual behavior inside the system boundary. 
  • A failure is an instance in time when the system displays behavior that is contrary to its specification. 
I want to say a little more about the threats, because this concept is very interesting but not so obvious. Faults, errors, and failures are linked in the chain shown in the next figure:


A fault stays dormant until it is activated; once it is activated, an error occurs. Examples of fault activation are the execution of a line of code that contains a bug, an attempt to send a signal through a corrupted connector, or the use of a broken hardware part. An error may act in the same way as a fault: it can create the conditions for further errors. An invalid state generated by the error may lead to another error or to a failure. It is important to note that failures are defined with respect to the system boundary. They are essentially errors that have propagated out of the system and become observable: if an error propagates outside the system boundary, a failure is said to occur.
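
To make the chain more concrete, here is a minimal sketch in Python. It is a toy of my own, not something taken from the article: the `Averager` class, its deliberately planted bug, and the `run` helper are all hypothetical. The bug is the fault, the wrong `count` is the error, and a wrong value returned by `mean()` (the system boundary in this toy) is the failure.

```python
class Averager:
    """Toy system (hypothetical): collects readings and reports their mean."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def add(self, reading):
        self.total += reading
        # FAULT: a planted defect. Negative readings are not counted.
        # It stays dormant until a negative reading actually arrives.
        if reading >= 0:
            self.count += 1

    def mean(self):
        # The service delivered across the system boundary.
        return self.total / self.count


def run(readings):
    """Feed readings to the system and report (error, failure)."""
    system = Averager()
    for reading in readings:
        system.add(reading)
    expected = sum(readings) / len(readings)
    error = system.count != len(readings)   # incorrect internal state
    failure = system.mean() != expected     # incorrect delivered service
    return error, failure


print(run([2.0, 4.0]))        # fault never activated
print(run([2.0, 4.0, -3.0]))  # fault activated, error reaches the boundary
print(run([2.0, 4.0, -6.0]))  # fault activated, error happens to be masked
```

The first run prints `(False, False)`: the fault is present in the code but dormant. The second prints `(True, True)`: the negative reading activates the fault, the internal counter becomes wrong, and the wrong mean crosses the system boundary. The third prints `(True, False)`: the internal state is just as wrong, but the total happens to be zero, so the delivered value is still correct and no failure is observed. That last case is why an error and a failure are not the same thing.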