Availability Management
Methods and Techniques
Although we have already been talking about availability for a while, we have not yet suggested a means of quantifying it.
Availability is commonly expressed as a percentage, as follows:
Availability = ((AST - DT) / AST) × 100%
where:
AST is the agreed service time and DT is the downtime of the service during the intervals in which it has been agreed that it should be available.
For example, if the service is 24/7 and over the last month the system has been down for four hours to carry out maintenance, then, taking a 30-day month (AST = 720 hours, DT = 4 hours), the real availability of the system was:
((720 - 4) / 720) × 100% ≈ 99.4%
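The same calculation can be sketched in a few lines of Python. The function name is hypothetical, and the 30-day month is an assumption made for illustration, since the text only says "the last month".

```python
def availability_pct(agreed_service_hours: float, downtime_hours: float) -> float:
    """Return availability as a percentage of the agreed service time (AST)."""
    if agreed_service_hours <= 0:
        raise ValueError("agreed service time must be positive")
    if not 0 <= downtime_hours <= agreed_service_hours:
        raise ValueError("downtime must lie between 0 and the agreed service time")
    return (agreed_service_hours - downtime_hours) / agreed_service_hours * 100


# Assumption: a 30-day month of 24/7 service, i.e. 30 * 24 = 720 agreed hours.
ast_hours = 30 * 24
dt_hours = 4
print(f"{availability_pct(ast_hours, dt_hours):.2f}%")  # prints 99.44%
```

Only downtime that falls inside the agreed service time counts as DT, so an outage outside the agreed hours would not lower this figure.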
Availability Management has a wide variety of methods and techniques at its disposal for determining which factors affect service availability and, consequently, for deciding what resources should be assigned to prevention, maintenance, and recovery tasks. It can also use the results of this analysis to prepare improvement plans.
These techniques include:
CFIA
CFIA stands for Component Failure Impact Analysis.
This method identifies the impact that the failure of each configuration item involved would have on IT service availability. It therefore requires a fully up-to-date CMDB.
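As a minimal sketch of the idea, the following Python snippet inverts a service-to-CI dependency map to show which services each configuration item would affect if it failed. The service and CI names are hypothetical stand-ins for a CMDB extract, and the sketch assumes no redundancy, so every listed CI is treated as a single point of failure.

```python
# Hypothetical CMDB extract: each service lists the configuration items (CIs)
# it depends on. All names are illustrative only.
SERVICE_DEPENDENCIES = {
    "email": {"mail-server", "core-switch", "dns"},
    "intranet": {"web-server", "core-switch", "dns"},
    "payroll": {"db-server", "core-switch"},
}


def failure_impact(dependencies):
    """Map each CI to the sorted list of services affected if that CI fails."""
    impact = {}
    for service, cis in dependencies.items():
        for ci in cis:
            impact.setdefault(ci, []).append(service)
    return {ci: sorted(services) for ci, services in impact.items()}


impact = failure_impact(SERVICE_DEPENDENCIES)

# List CIs from most to least damaging (ties broken alphabetically).
for ci, services in sorted(impact.items(), key=lambda item: (-len(item[1]), item[0])):
    print(f"{ci}: {', '.join(services)}")
```

In this toy data the core switch affects all three services, which is the kind of result a CFIA is meant to surface when deciding where redundancy is worth paying for.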
FTA
FTA stands for Fault Tree Analysis.
Its aim is to study how faults propagate through the IT infrastructure, so as to better understand their impact on service availability.
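Fault propagation of this kind is usually modelled as a tree of AND/OR gates over basic failure events. The sketch below is one possible way to evaluate such a tree in Python; the `Gate` class, the `occurs` function, and the example events are all assumptions made for illustration, not part of any ITIL tooling.

```python
from dataclasses import dataclass


@dataclass
class Gate:
    kind: str     # "AND" or "OR"
    inputs: list  # child Gate objects or basic event names (str)


def occurs(node, failed_events):
    """Return True if the event at this node occurs given the failed basic events."""
    if isinstance(node, str):
        return node in failed_events
    results = [occurs(child, failed_events) for child in node.inputs]
    if node.kind == "AND":
        return all(results)
    if node.kind == "OR":
        return any(results)
    raise ValueError(f"unknown gate kind: {node.kind}")


# Hypothetical tree: the service is down if the network fails,
# OR if both the primary and the backup server fail.
service_down = Gate("OR", [
    "network",
    Gate("AND", ["primary-server", "backup-server"]),
])

print(occurs(service_down, {"primary-server"}))                   # False
print(occurs(service_down, {"primary-server", "backup-server"}))  # True
print(occurs(service_down, {"network"}))                          # True
```

The AND gate captures redundancy: a single server failure does not propagate to the service, whereas the network, sitting directly under the OR gate, is a single point of failure.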
CRAMM
CRAMM stands for CCTA Risk Analysis and Management Method.
The aim of CRAMM is to identify the risks and vulnerabilities to which the IT infrastructure is exposed in order to take countermeasures to reduce them or enable rapid recovery of the service in the event of an interruption.
SOA
SOA stands for Service Outage Analysis.
The aim of this technique is to analyse the causes of the faults detected and propose solutions to them.
It differs from the previous methods in that it performs its analysis from the point of view of the customer, placing special emphasis on factors other than purely technical aspects directly linked to the IT infrastructure.




