Availability Management
Controlling the Process
Availability Management needs to prepare regular reports on its activities, providing information relevant to customers and to the rest of the IT organisation.
These reports should include:
- Techniques and methods used to prevent and analyse faults.
- Statistical information on:
  - Times taken to detect and respond to faults.
  - Times taken to repair faults and recover the service.
  - Average time of service between faults.
  - Real availability of each of the services.
- Fulfilment of the SLAs with regard to the availability and reliability of the service.
- Fulfilment of the OLAs and UCs with regard to the service capacity provided by internal and external service providers.
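As an illustration of the statistical items in the list above, the following minimal sketch derives them from a log of incidents for a single service. The `Incident` record, its field names and the `availability_report` function are hypothetical, introduced only for this example. It also assumes that incidents do not overlap, that they all fall inside the reporting period, and that at least one incident was recorded.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    # Hypothetical record layout; adapt to the fields your tooling logs.
    occurred: datetime  # when the fault began
    detected: datetime  # when the fault was noticed
    restored: datetime  # when the service was usable again


def mean(deltas):
    """Average of a non-empty list of timedeltas."""
    return sum(deltas, timedelta()) / len(deltas)


def availability_report(incidents, period_start, period_end):
    """Summarise detection, recovery and availability figures for one service."""
    period = period_end - period_start
    # Total downtime is the sum of each fault's occurred-to-restored interval.
    downtime = sum((i.restored - i.occurred for i in incidents), timedelta())
    uptime = period - downtime
    return {
        "mean_time_to_detect": mean([i.detected - i.occurred for i in incidents]),
        "mean_time_to_restore": mean([i.restored - i.occurred for i in incidents]),
        # Average time in service between faults: total uptime per fault.
        "mean_uptime_between_faults": uptime / len(incidents),
        "availability_pct": 100 * uptime / period,
    }
```

The same figures can then be set against the targets agreed in the SLAs, OLAs and UCs to report on their fulfilment.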
For this information to be interpreted and analysed correctly, it is essential to establish precise metrics that allow parameters such as downtime and uptime to be determined unambiguously. For example, for an online e-commerce service, response times of over 10 seconds may be treated as equivalent to the system being down, even though strictly speaking the system does eventually respond.
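The e-commerce example can be sketched as a simple rule applied to monitoring probes. This is only an illustration: the names `SLOW_THRESHOLD_S`, `is_up` and `measured_availability` are hypothetical, the probes are assumed to be evenly spaced in time, and a probe that received no answer at all is recorded as `None`.

```python
# Threshold taken from the e-commerce example; in practice it is agreed per service.
SLOW_THRESHOLD_S = 10.0


def is_up(response_time_s):
    """A probe counts as 'up' only if it was answered within the threshold."""
    return response_time_s is not None and response_time_s <= SLOW_THRESHOLD_S


def measured_availability(samples):
    """Percentage of probes for which the service counted as up."""
    if not samples:
        raise ValueError("at least one sample is required")
    up = sum(1 for s in samples if is_up(s))
    return 100.0 * up / len(samples)
```

With this rule, a response that arrives after 12.5 seconds is counted as downtime in exactly the same way as a probe that never received a reply, so the reported figure reflects the service as the customer experiences it.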




