Problem Management
Practical Case
The Service Desk of "Cater Matters" has informed Problem Management about an incident which could not be associated with a known error and which caused a low impact interruption to service.
Problem Management decides to analyse the problem following the established protocol, which is based on the Kepner-Tregoe method:
- Identifying the problem.
- Classifying the problem.
- Establishing the possible causes.
- Checking the most likely cause.
- Confirming the actual cause.
Identification: In this case, the problem is easy to define:
- The online orders application produces unpredictable errors when recording certain orders. There is no apparent relationship between the error and other hardware/software components.
Classification: The problem may be classified according to the following parameters:
- Identification: Problems recording orders.
- Source: Online orders module.
- Frequency: the problem is not recurrent; this is the first time it has been detected.
- Impact: slight. The incident was resolved without a serious interruption to service.
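
The classification above could be captured as a structured record, for example in a problem database. The following is only a minimal sketch: the class and field names are hypothetical, and the impact levels other than "slight" are assumed, since the case mentions only one.

```python
from dataclasses import dataclass
from enum import Enum


class Impact(Enum):
    # Only "slight" appears in the case; the other levels are assumed.
    SLIGHT = "slight"
    MODERATE = "moderate"
    SEVERE = "severe"


@dataclass(frozen=True)
class ProblemClassification:
    # Hypothetical record mirroring the four parameters listed above.
    identification: str
    source: str
    recurrent: bool
    impact: Impact


classification = ProblemClassification(
    identification="Problems recording orders",
    source="Online orders module",
    recurrent=False,  # first time the problem has been detected
    impact=Impact.SLIGHT,
)
```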
Possible causes: The candidate causes include:
- Errors in programming on the client side of the application.
- Errors in the web server recording modules.
- Database configuration errors.
The analysts decide that the most likely origin of the problem is in the application's recording modules.
Checking the most likely cause: With the help of the information recorded by Incident Management:
- Problem Management tries to reproduce the problem.
- They find that the error is only reproduced with a particular brand of ice-cream.
- They notice that the brand of ice-cream has an apostrophe in its name and that if this is removed the order is recorded without problems.
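
The case does not state why the apostrophe breaks the recording. One plausible mechanism, assumed here purely for illustration, is that the recording module builds its SQL statement by pasting the brand name into the query text, so the apostrophe ends the string literal early. The sketch below reproduces that behaviour with Python's built-in sqlite3 module; the table, function, and brand names are all hypothetical.

```python
import sqlite3


def record_order_unsafe(conn, brand, quantity):
    # Assumed faulty pattern: the brand name is concatenated into the SQL text.
    sql = (
        "INSERT INTO orders (brand, quantity) VALUES ('"
        + brand + "', " + str(quantity) + ")"
    )
    conn.execute(sql)


def try_order(conn, brand, quantity):
    # Returns True if the order was recorded, False if the database rejected it.
    try:
        record_order_unsafe(conn, brand, quantity)
        return True
    except sqlite3.Error:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (brand TEXT, quantity INTEGER)")

# Hypothetical brand names: one plain, one containing an apostrophe.
results = {
    brand: try_order(conn, brand, 10)
    for brand in ["Gelato Uno", "Luigi's Gelato"]
}
```

Under this assumption only the brand with the apostrophe fails, and the same name with the apostrophe removed is recorded normally, which matches what the analysts observe.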
Verification:
- A test environment is set up that reproduces the relevant module of the live environment.
- The necessary programming changes are made.
- They confirm that the order is recorded correctly.
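
The case does not describe the programming changes themselves. Assuming the fault lay in how the brand name was placed into a SQL statement, a common remedy is to pass it as a bound parameter instead of as part of the query text. The following sketch uses Python's built-in sqlite3 module; the table, function, and brand names are hypothetical, and it doubles as the kind of check the test environment might run.

```python
import sqlite3


def record_order(conn, brand, quantity):
    # Bound parameters (the "?" placeholders) keep the brand name as data,
    # so an apostrophe in it cannot alter the SQL statement.
    conn.execute(
        "INSERT INTO orders (brand, quantity) VALUES (?, ?)",
        (brand, quantity),
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (brand TEXT, quantity INTEGER)")

# Hypothetical brand name containing an apostrophe.
record_order(conn, "Luigi's Gelato", 10)
stored = conn.execute("SELECT brand, quantity FROM orders").fetchall()
```

Binding parameters is generally preferable to escaping the apostrophe by hand, because it handles every special character in the input and not only the one that triggered this incident.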
The problem has been converted into a known error. It is now the task of Error Control to:
- Raise an RFC with the proposed solution.
- Carry out the post-implementation review, if Change Management decides to implement the RFC.




