Risk based optimisation of inspection planning in ships (June 2010)

Ujjwal Bharadwaj and John Wintle

TWI Ltd

Paper presented at 5th International ASRANet Conference, Edinburgh, UK, 14-16 June 2010.

Abstract

There is competitive pressure on industry to find cost-effective approaches to managing ships such that they remain fit for service. Inspection provides information on the state of a structure or equipment and is thus a vital input to the process of maintaining ships. This paper presents a new method using risk based principles for planning inspection as applied to the ship hull structure. The same approach can be expanded to include other parts of the ship.

The current approach to the inspection of ships is heavily influenced by time based (class) rules. The aim of this paper is to demonstrate a methodology using information generated by class inspection and/or historical operational data to optimise inspection such that the combination of 1) the risk of failure of a structure and 2) the cost of such inspection is minimised or kept within a user-specified acceptable limit.

The proposed inspection planning methodology does not purport to do away with class inspections - it supplements these. It uses all available failure information including that from class inspections such that the resources spent on inspection (and maintenance) are focussed on those structural locations that are deemed high risk. Here risk involves not just the likelihood of failure occurring but also its consequence.

The paper describes work-in-progress on research undertaken for a project with the acronym RISPECT (Risk Based Expert System for Through-life Structural Inspection, Maintenance and New-Build Structural Design) under the auspices of the European Commission's Framework 7 programme.

Nomenclature

CI - Class Inspection
ECI - Extra Class Inspection
ETA - Event Tree Analysis
FMEA - Failure Modes and Effects Analysis
FMECA - Failure Modes and Effects Criticality Analysis
FTA - Fault Tree Analysis
RISPECT - Risk Based Expert System for Through-Life Structural Inspection, Maintenance and New-Build Structural Design
RPN - Risk Priority Number

1. Introduction

For industry to keep its fleet of ships fit for service, there is a need for cost-effective approaches to the life management of ships. The current approach to ship life management is mainly time based or rule based, as stipulated by Classification bodies. This paper presents a new risk based methodology for inspection planning as an efficient and somewhat flexible approach to be used in conjunction with class rules for inspection of the ship hull.

In the context of this paper, inspection not prescribed by class rules is termed 'extra-class inspection' (ECI) and inspection in accordance with class rules is termed 'class inspection' (CI). It is expected that such ECI will result in remedial action such that i) the reliability of the ship will improve due to inspection being conducted on a more continual basis, as opposed to the fixed periodicity of the class based approach, and ii) ultimately, when the ship goes for classification as per its fixed time-frame, there will be financial gains due to the timely ECI actions taken beforehand.

The ECI planning as described in this paper is informed by reports from CI, engineering judgement/expert opinion, as well as operating data from a relevant sample of ships from a generic database such as the one envisaged within the RISPECT (Risk Based Expert System for Through-Life Structural Inspection, Maintenance and New-Build Structural Design) project.^[1] RISPECT involves, inter alia, the setting up of a hierarchy of databases (mainly comprising the ship's database, ship managers' database, and a central statistical database with data recorded in a suitable format to ensure its confidentiality) containing relevant shipping data and a number of modules that make risk and reliability calculations of the ship hull structure using this data. It must be noted here that inspection necessitated as part of reactive maintenance may also be termed as ECI; however, this is the type of inspection (and the costs associated with it) that the methodology shown here aims to avoid by taking timely remedial action.

This paper extends principles in asset integrity management well established in other sectors of industry. It draws on the risk based practices in industry such as oil and gas, and construction ^[2,3,4], and standard optimisation procedures. ^[5]

In the sections that follow, a risk based approach to inspection planning of the ship hull structure is delineated. The approach is demonstrated by creating a model with suitable values to bring out its prominent features. The values are for the sole purpose of illustrating the approach and need to be considered as such.

2. Acceptable risk levels in asset life management

In the context of this paper, 'risk' has a special meaning. The risk of a failure event is a measure of the combination of the likelihood of a failure consequence occurring and its severity. Activities broadly included in the life management of an engineering asset, such as inspection, maintenance and repair, aim at maintaining assets within identified acceptable risk levels. Figure 1 shows the main approaches to establishing what an acceptable level of risk/reliability is.

Fig.1. Approaches to establish acceptable risk levels

The first approach can be based on expert-opinion elicitation for want of real data or experience in operating the new type of structure/equipment. There are formal procedures for doing this. ^[6,7]

The second approach, i.e. calibration to existing successfully used design codes, is the most commonly used approach as it provides the means to build on previous experiences.^[8] For example target reliabilities can be established by assessing those implied in current codes or rules provided by classification bodies and industry societies in similar applications.

The third approach is based on finding the optimum trade-off between the economic costs of an action that mitigates risk and the cost (risk of failure) without that action. Here the aim is to minimise the total 'expected cost' of operating a plant (structure/equipment). Expected cost is a useful measure of risk and, in the current context, is the potential cost of failure which is a combination of the likelihood of the consequences of the failure and the cost of these consequences.

The approaches are by no means always mutually exclusive in their application. The risk based approach described in the following sections aims to minimise expected costs while remaining within certain constraints including stipulated reliability/risk levels.

In the discussion that follows, inspection reduces risk in two ways: i) it reduces the uncertainty regarding the condition of the structure, and ii) the subsequent remedial action that is assumed to be taken, as indicated by inspection, mitigates the likelihood of failure occurring. Although it may not be explicitly mentioned elsewhere in this paper, it is assumed that inspection will lead to remedial action.

3. Effect of inspection on the operational risk level of a system

Activities such as inspection, maintenance and repair aim at maintaining structures within identified acceptable levels of risk.

Figure 2, adapted from ^[9], depicts the risk of failure within a system of components over its operating life and the impact of inspection (and subsequent remedial action) on the operating risk.

Fig.2. Effect of inspection on a system of components over its operating life - each sawtooth represents inspection being carried out

Stage 1 is when the structure has just been commissioned. The risk of failure is mainly due to incorrect application, installation flaws, inherent weakness in component design and material manifesting itself as failure when service conditions are initially experienced, and fabrication defects not identified by Quality Control/Manufacturing NDT or inspection.

Stage 2 is the useful life of the structure. By this time, initial problems have been identified. The structure is comparatively more predictable and reliable in performance, and inspection and maintenance keep the failure rate low.

Stage 3 is when the structure begins to show signs of ageing. Damage that has accumulated over previous stages manifests itself in an increasing failure rate.

Inspection is mainly aimed at i) keeping the structure within identified acceptable levels of risk and ii) prolonging stage 2, which is the useful and most productive period in a structure's operating life.

4. Risk based optimisation of inspection planning

4.1 Risk based approaches

Inspection and maintenance strategies have followed an evolutionary continuum from the more time based traditional approaches to the advanced ones such as risk based.

Time-based approaches are those in which a specified action is required at some point in time; often there are industry standards stipulating when or how frequently the action is required. The approach is also called the rule-based approach, as it is prescriptive and the scheduling of the action concerned is not at the discretion of the operator. These rules or standards are based on industrial experience and are influenced by historical data; in this sense, the rules assume that the asset is operating in industry-wide average conditions.

Condition-based approaches are those in which action is informed by the condition of the asset. Relative to the rule-based approach, the approach here is more case-by-case, i.e., based on current state of the asset and on local conditions.

The more advanced risk based approaches prioritise action based on the risk profile of various components within a system; the aim here is to focus resources on components that are deemed more risky. Relative to other approaches, risk based approaches to asset integrity management are more sophisticated in that, apart from factors considered in the previous approaches, here the context in which the asset (or a component within a system) is being operated is also considered.

In the interest of continuity, mention must be made of some other approaches, each having its own bias. Reliability Centred Maintenance (RCM) is a subset of risk based maintenance in that maintenance is optimised taking into consideration the effect the structure/equipment has on plant reliability. Reactive Maintenance (RM) is an approach in which maintenance is performed only when the structure fails or shows signs of failing. A run-to-failure policy is one in which the structure is operated until it breaks down, leading either to RM or to the equipment being discarded (for example, a satellite).

The above discussion shows that the level of sophistication in various approaches to inspection and maintenance (asset life management, in general) increases from the run-to-failure policy to the risk based approach.

4.2 Risk based optimisation

The risk based optimisation approach described here, in a way, adds a third dimension to assessing potential failures. Apart from the likelihood and consequence of failures, the approach, by focussing resources on high risk components, considers the manageability of failures within resource constraints. It answers the question: how can the risk from failures within a system of components best be managed, given the resources available?

5. Risk model

The approach to the risk based optimisation of inspection is shown by the model as described below. The model and the values used are for the sole purpose of demonstrating the approach; the focus of this paper is on the risk based optimisation methodology.

There are two parts to the risk model. Part 1 is a technique for according priority to structures within the ship hull based on certain measures of risk. Part 2 of the model is used for optimising inspection actions, given the risk based order of priority established in Part 1 and the financial resource available for the purpose.

5.1 Part 1: profiling ship hull structures

There are a number of formal, established engineering techniques such as FMEA (Failure Modes and Effects Analysis), FMECA (Failure Modes and Effects Criticality Analysis), FTA (Fault Tree Analysis) and ETA (Event Tree Analysis) that help to identify high risk components within a system. FMEA identifies components that have high impact failures by identifying the different modes of failure for the component and the effects of these failures. FMECA is an extension of FMEA in that it formally, qualitatively or quantitatively, ranks components in terms of their relative failure criticality. FTA is a graphical model, created by reasoning, that considers various combinations of events leading to the occurrence of some top event failure. ETA is likewise a graphical model, created by reasoning, that considers initial events followed by other events leading to the final set of consequences. These techniques may be purely qualitative or may have some quantitative measures too. Each of these techniques has some advantages and limitations in relation to the others. FMEA, FTA and ETA provide different perspectives on risk and are not competing tools; they often act in conjunction with each other to give a larger picture of the risk within the system under consideration.

In the discussion that follows, critical components are identified through an FMEA, from which a risk priority number (RPN) is accorded to each component. A detailed discussion of the FMEA conducted is beyond the scope of this paper. A typical FMEA will include, inter alia, a list of components, the potential causes of failure, the failure modes and their effects, and factors from which the RPN can be assessed. The format for an FMEA is shown in Table 1.

The RPN technique for criticality assessment is an extension of the FMEA or FMECA originally developed by the US Department of Defence. ^[10]

The RPN method of assessment is based on fairly general observations relevant to inspection planning. These are:

The more a structure is susceptible to failure, the more critical it is for it to be inspected.
The more a structure is susceptible to high impact/consequence failure, the more critical it is for it to be inspected.
The lower the probability of detection and timely management of potential failure, the more critical it is for it to be inspected.

Table 1: Illustration of an FMEA format

The RPN assessment uses engineering judgement/expert opinion and/or historical failure data (such as from previous CI or ECI reports or data from generic databases) to rate each potential failure mode on three rating scales: severity rating, occurrence rating and detection rating.

Severity rating (S) grades all assessed effects (or impacts) of failure modes depending on their severity. The grading requires a scale to be used. For example, a relative rating scale of 1 to 10 may be used, where '1' implies 'low' and '10' implies 'extremely high'. An example of the rating scale is shown in Table 2 where even 'low', 'medium' and 'high' ranks have different values, higher values being accorded to relatively more severe failure events. Such a scale can also be used for the other two types of ratings.

Occurrence (O) rating is the relative grade given to a failure mode depending on the likelihood of the cause leading to a particular failure mode and its particular effect. Table 3 shows a possible scheme for scaling the frequency of occurrence of a (failure) event.

Detection rating (D) is the effectiveness of controls currently in place to prevent or detect i) the cause of failure mode before the failure occurs, or ii) the failure mode before it results in the postulated effect. Table 4 shows a possible scheme for scaling the effectiveness of mechanisms in place to control failures.

Tables 2, 3 and 4 are for illustration only; what is important is for the assessment to capture available information using a consistent and auditable method.

Once ratings have been assigned, RPN for each failure mode assessed to be associated with the failure of a structure within the hull can be calculated as,

RPN_fm1,effect1 = Severity × Occurrence × Detection    (Equation 1)

where

fm1 = failure mode designated as 1,

effect1 = one of the effects of fm1, designated as 1.

In the above equation, the Severity, Occurrence and Detection ratings are those given to failure mode 1 resulting in effect 1; the use of further subscripts is avoided here for the sake of convenience.

Table 2: A possible scheme for ranking severity of failure events

Rating	Description
Low 1-3	Failure implies low risk of human injury, environmental pollution or economic loss; - minor injuries, if at all; < 0.5 t spill, minimal clean-up required if spillage occurs; temporary repair till next scheduled dry docking.
Medium 4-7	Failure implies medium risk of human injury, environmental pollution or high economic consequences^[11]; - many injuries, few fatalities; 35-350 t spillage, clean-up necessary; immediate repair essential, ship remains out of service for less than 2 weeks.
High 8-10	Failure implies high risk of human injury, substantial environmental pollution or very high economic consequences; - many injuries, many fatalities; spillage >350t requiring major clean-up operation, economic costs are very high.

Table 3: A possible scheme for ranking the frequency of occurrence of failure events ^[12]

Rating	Description	Failure Probability
1	Low	< 1 in 10⁶
2	Low	1 in 20,000
3	Low	1 in 4000
4	Moderate	1 in 1000
5	Moderate	1 in 400
6	Moderate	1 in 80
7	Moderate	1 in 40
8	High	1 in 20
9	High	1 in 8
10	High	> 1 in 2

Thus, if there are 'i' effects associated with fm1, RPN for fm1 can be defined as:

RPN_fm1 = RPN_fm1,effect1 + RPN_fm1,effect2 + ... + RPN_fm1,effecti    (Equation 2)

Table 4: A possible scheme for assessing the effectiveness of controls to manage failures ^[13]

Rating	Description
10 - Very low	Controls are not expected to detect the existence of a defect; there are no controls in place with the ability to do so.
9 - Low	Controls are not likely to detect the existence of a defect.
7-8 - Low	Controls have a poor chance of detecting the existence of failure.
5-6 - Moderate	Controls may detect the existence of a defect.
3-4 - High	Controls have a fair chance of detecting the existence of a defect.
1-2 - Very high	Controls will usually detect the existence of a defect.

The RPN of the structural component with 'j' failure modes can be defined as

RPN_str1 = RPN_fm1 + RPN_fm2 + ... + RPN_fmj    (Equation 3)
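
Equations 1 to 3 can be read as a simple roll-up from effects, to failure modes, to a structure. The following is a minimal sketch of that roll-up, not the RISPECT implementation; the class and function names and the ratings in the example are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Effect:
    """Ratings (1-10 scales) for one effect of one failure mode."""
    severity: int
    occurrence: int
    detection: int

    def rpn(self) -> int:
        # Equation 1: product of the three ratings
        return self.severity * self.occurrence * self.detection


def failure_mode_rpn(effects: List[Effect]) -> int:
    # Equation 2: sum over the effects of one failure mode
    return sum(e.rpn() for e in effects)


def structure_rpn(failure_modes: Dict[str, List[Effect]]) -> int:
    # Equation 3: sum over the failure modes of one structure
    return sum(failure_mode_rpn(effects) for effects in failure_modes.values())


# Hypothetical structure with two failure modes (ratings invented for illustration)
example_structure = {
    "fm1": [Effect(2, 7, 8), Effect(3, 2, 4)],  # 112 + 24 = 136
    "fm2": [Effect(5, 5, 5)],                   # 125
}
print(structure_rpn(example_structure))  # 261
```

Keeping the individual S, O and D ratings alongside the summed RPN, as the `Effect` record does here, means they remain available for any later check on individual ratings.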

The RPN, being a product, must be treated with caution. It is not enough just to establish some threshold value for RPNs in the assessment of risk. For example, other things being equal, a case with a low severity rating and a high occurrence rating may have the same RPN as a case with a high detection rating and a high occurrence rating. A more cautious approach would be to look at the individual ratings - S, O and D - and ensure that remedial action is taken in cases in which any of these is above a certain critical level. For example, if S, O and D are each rated 1-10, then an RPN of 125 associated with S=O=D='5' would be medium risk. Rather than considering just the RPN of 125 as a threshold value, one can consider a value of '5' or more for any of S, O or D for the component failure modes to merit attention. The scheme of rating needs to be tailored to individual requirements.
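
The caution about relying on the product alone can be illustrated with a short sketch. It uses the thresholds from the example in the text (an RPN of 125, or any single rating of 5 or more); the function names and the three cases are hypothetical.

```python
from typing import Dict, Tuple

Ratings = Tuple[int, int, int]  # (S, O, D), each on a 1-10 scale


def rpn(ratings: Ratings) -> int:
    s, o, d = ratings
    return s * o * d


def needs_attention(ratings: Ratings, rpn_threshold: int = 125,
                    rating_threshold: int = 5) -> bool:
    """Flag a case when its RPN or any single rating reaches its threshold."""
    return rpn(ratings) >= rpn_threshold or max(ratings) >= rating_threshold


# Hypothetical cases: A and B share an RPN of 80 but have very different profiles
cases: Dict[str, Ratings] = {
    "A": (10, 2, 4),  # severity-driven
    "B": (2, 10, 4),  # occurrence-driven
    "C": (4, 4, 4),   # RPN 64, no rating at or above 5
}
for name, ratings in cases.items():
    print(name, rpn(ratings), needs_attention(ratings))
```

Cases A and B would both pass a threshold on the RPN alone, yet each carries one rating at the top of its scale; the per-rating check is what flags them.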

The RPN calculated above will be used in part 2 of the model in which optimisation of inspection planning will be carried out. Table 5 gives the RPN assessed for, as illustration, 36 types of structural components within the ship hull. The table does not list all 36 types in order to capture the model in one screen.

Table 5: RPN of types of structures within ship hull

5.2 Part 2: optimising inspection within stipulated constraints

Figure 3 shows the left hand side (LHS) of part 2 of the model which optimises inspection planning. The model starts with a list of types of hull structures under consideration - with identification numbers and names; it is assumed there are 36 such types of structures that make up the hull. The third and fourth columns are the number of class inspections (CI) most recently undertaken and the costs associated with them.

Fig.3. LHS of the part 2 of the risk model

This is required to establish some point of reference for planning inspection. The post-CI risk profile is as assessed using the RPNs calculated in Part 1 of the model; the RPN values are shown in Table 5. As shown in the figure, the total cost of CI and the total post-CI RPN are also calculated. The number of ECI considered, in the next column, is also (like the assessment of RPNs) the result of reports from previous inspections in conjunction with engineering judgement. For the sake of simplicity, the cost of ECI is assumed to be the same as the cost of CI. The total estimated cost of the ECI considered is also calculated to establish a point of reference.

The optimisation is carried out on the right hand side (RHS) of part 2 of the risk model (shown in Figure 4) where the first two columns are the same as the LHS of the model. The third column is the result of optimising ECI such that

priority in planning ECI is accorded to those structures with higher RPN values, and
the cost of the resulting optimised list of ECI is less than that stipulated, i.e. the allocated budget for ECI. This is 40,000 Euros, for example, as shown in Figure 3 in the LHS of part 2 of the risk model. (For planning purposes, this can be a fraction of the calculated total cost of the most recent CI, i.e. 105,100 Euros in the illustration in Figure 4, or of the total cost of all ECI considered, i.e. 99,300 Euros in the example.)

The optimisation, which minimises post-ECI RPNs subject to the budget allocated, is carried out using linear programming. The post-ECI risk profile is assessed by way of the change in the RPN due to inspection actions. It is assumed that with inspection, remedial action as indicated will be taken to bring the structure to a 'low risk' status, which is associated with an RPN of 0. It is also assumed that there is a direct linear relationship between ECI (with remedial action) and RPN; for example, if 5 inspections are being considered for a structure with an RPN of 250, conducting one will reduce the RPN to 200. In reality, this may not be true, but this simplification enables one to prioritise and optimise inspection. In the comments column, 'status-quo' is used when the optimised ECI is '0' for a particular type of structure (Str... 1 and Str... 2, for example). 'Action taken' is used when the optimised ECI is the same as the ECI considered; the risk profile is assumed to be 'low risk' after inspection and remedial action, with an associated RPN of '0'. The term 're-assess' is used when not all ECI considered are undertaken in the optimised inspection plan, so that there is only a partial reduction in the RPN (for example, Str...36). In such cases, the assessor must re-assess the RPN after ECI, because the relationship between inspection and risk mitigation as measured by a reduced RPN may not be linear.
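
The budget-constrained optimisation and the three comment labels can be sketched as follows. This is an illustration under stated assumptions, not the model shown in the figures: it replaces the linear programming solver with an exhaustive search over whole numbers of inspections (practical only for a handful of structures), it assumes a fixed cost per ECI for each structure, and the structure names, costs and budget are hypothetical.

```python
from itertools import product
from typing import List, NamedTuple, Tuple


class Structure(NamedTuple):
    name: str
    rpn: int             # post-CI RPN from Part 1
    eci_considered: int  # number of ECI considered
    cost_per_eci: int    # Euros per inspection (hypothetical)


def post_eci_rpn(s: Structure, eci_done: int) -> float:
    # Linear assumption from the text: each ECI removes an equal share of the RPN
    if s.eci_considered == 0:
        return float(s.rpn)
    return s.rpn * (s.eci_considered - eci_done) / s.eci_considered


def comment(s: Structure, eci_done: int) -> str:
    if eci_done == 0:
        return "status-quo"
    if eci_done == s.eci_considered:
        return "action taken"
    return "re-assess"


def optimise(structures: List[Structure], budget: int) -> Tuple[Tuple[int, ...], float]:
    """Exhaustive search for the ECI plan with the lowest total post-ECI RPN within budget."""
    best_plan, best_total = None, float("inf")
    ranges = [range(s.eci_considered + 1) for s in structures]
    for plan in product(*ranges):
        cost = sum(n * s.cost_per_eci for n, s in zip(plan, structures))
        if cost > budget:
            continue  # plan exceeds the allocated budget
        total = sum(post_eci_rpn(s, n) for n, s in zip(plan, structures))
        if total < best_total:
            best_plan, best_total = plan, total
    return best_plan, best_total


# Hypothetical data for four structure types and a hypothetical budget
hull = [
    Structure("A", 250, 5, 1000),
    Structure("B", 392, 3, 4000),
    Structure("C", 60, 2, 3000),
    Structure("D", 112, 1, 2000),
]
plan, total = optimise(hull, budget=12_000)
for s, n in zip(hull, plan):
    print(s.name, n, round(post_eci_rpn(s, n), 1), comment(s, n))
print("total post-ECI RPN:", round(total, 1))
```

With these invented figures, the search fully inspects A and D, carries out one of the three inspections considered for B (hence 're-assess') and leaves C at 'status-quo'. For the 36 structure types of the illustration, a linear or integer programming solver would be the practical choice.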

Fig.4. RHS of part 2 of the risk model

In Figure 4, note the optimised schedule for ECI and the change in the RPN values post-ECI. For example, Str no 4 has a pre-ECI RPN of 392, which has reduced to 131. For some structures, RPN values have remained the same. The remaining RPN values represent the residual risk corresponding to the budget available for risk mitigation action, i.e. ECIs (40,000 Euros in the illustration shown in Figures 3 and 4).

Figure 5 shows the change in RPN values before and after ECI.

Fig.5. Change in RPN values after optimised ECI planning

6. Discussion

In the illustration of the risk model as above, the following points merit further attention:

In engineering there is a justifiable bias in favour of quantitative models. However, it is not always possible to get precise quantitative data necessitating the use of qualitative or semi-quantitative data. Under such circumstances, the focus is on how to make the best use of available information and use methods that are consistent and auditable.

Part 1 of the risk model captures risk profile of structures using ratings based on the likelihood of occurrence of failure, its severity and the effectiveness of current measures to mitigate the failure and/or its consequence. These ratings provide a RPN for each structure under consideration. The procedure of assessment leading to RPNs is semi-quantitative in nature and dependent on engineering judgement/ expert opinion and data from a generic database, if available.

It is worth noting that the same optimisation approach will work if quantitative data on the consequences of failure and the likelihood of occurrence of each consequence are available. If the failure consequence cost can be obtained (in monetary terms), then risk can be measured in terms of expected costs. For example, if the probability of failure of a structure, definitely resulting in a consequence costing 100,000 Euros, is 0.05 per year, and assuming only one consequence is applicable, then risk can be defined as Risk (Euros/year) = (probability of failure/year) x (consequence cost (Euros) of the failure).

i.e. Risk = (0.05) x (100,000) = 5000 Euros/year

If reliable quantitative information is available, one can use quantitative FMECA models where criticality is defined directly by using the failure rate and consequence cost of a component or a structure within a system.

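
The expected-cost measure of risk can be written as a short function. This is a minimal sketch; the function name is hypothetical, and summing over several consequences is an assumed generalisation of the single-consequence case given in the text.

```python
from typing import Iterable, Tuple


def expected_cost(consequences: Iterable[Tuple[float, float]]) -> float:
    """Sum of (annual probability x consequence cost in Euros) over the listed consequences."""
    return sum(p * cost for p, cost in consequences)


# Single-consequence example from the text: 0.05 per year, 100,000 Euros
print(expected_cost([(0.05, 100_000)]))  # 5000.0
```
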
With reference to Figures 4 and 5, the post-ECI RPNs are less than or equal to those before (i.e. post-CI RPNs). However, even a reduced RPN or 'status-quo' RPN may have individual values of 'S', 'O' or 'D' that are unacceptably high. For example, consider structure type Str...33 with an RPN of 112. The S, O and D values are 2, 7 and 8 respectively (seen in Table 5), even after optimum ECI planning. In such cases, the optimisation may need to be carried out again using different criteria. For example, the optimisation model shown in Figure 6 determines the budget needed to conduct ECI on all structures with values of S, O or D greater than 7, regardless of the final RPN value for each structure. With this constraint, no feasible solution could be found within the allocated budget of 40,000 Euros. Repeated running of the model showed that the budget required to mitigate risk to this level (i.e. S, O and D ratings all being less than 8) is 56,300 Euros, as shown in Figure 6. The total post-ECI RPN is now 629, as opposed to the original 1738 and the 670 of the previous optimisation. Consider Str...33: whereas in the previous optimisation the residual RPN was 112, in the optimisation shown in Figure 6 it has reduced to zero. This is because this optimisation is targeted at S, O or D values that are greater than 7; for Str...33, the S, O and D values are 2, 7 and 8 respectively. The extra risk mitigation comes at an increased cost of 56,300 Euros.
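
A rough first estimate of the budget implied by such a rating-targeted criterion can be obtained by summing the cost of all considered ECI for every structure with any rating above the limit. The sketch below is a simplification under that assumption and is not the model of Figure 6, which was run repeatedly to find the required budget; the structure names, ratings and costs are hypothetical.

```python
from typing import List, NamedTuple


class Structure(NamedTuple):
    name: str
    s: int
    o: int
    d: int
    eci_considered: int
    cost_per_eci: int  # Euros per inspection (hypothetical)


def targeted_budget(structures: List[Structure], rating_limit: int = 7) -> int:
    """Cost of carrying out every considered ECI on structures with any rating above the limit."""
    return sum(
        st.eci_considered * st.cost_per_eci
        for st in structures
        if max(st.s, st.o, st.d) > rating_limit
    )


# Hypothetical structure types (ratings and costs invented for illustration)
hull = [
    Structure("X1", 2, 7, 8, 2, 1500),  # D = 8 exceeds 7: flagged
    Structure("X2", 5, 5, 5, 3, 2000),  # no rating above 7: not flagged
    Structure("X3", 9, 3, 2, 1, 4000),  # S = 9 exceeds 7: flagged
]
print(targeted_budget(hull))  # 7000
```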

Fig.6. Optimised ECI planning targeted at RPN values containing S, O or D values that are greater than 7

In the current application, it is assumed that inspection is followed by remedial action to mitigate the risk of failure. Once inspection is done and the cost of remedial action is established, one could use the same risk based approach to prioritise and optimise identified actions within user-specified constraints of acceptable risk and the financial resources available.

7. Conclusions

This paper presents a new risk based methodology to prioritise and optimise inspection based on the criticality of failure modes and the resources available for the purpose. The approach is applied to structures within the ship hull but can be extended to other parts of the ship. The approach described above uses semi-quantitative measures, but can be adapted to suit the type of data available. The application of the risk model shown here has the potential to increase the reliability of ships in a cost-effective way.

Acknowledgements

This working paper describes research for the project with the acronym RISPECT funded via the European Commission's Framework 7 programme. The authors have benefitted from their interaction with other partners involved in the project and express their gratitude to them.

The authors are grateful to Dr Li Xu (University of Strathclyde), the project manager, for her ready assistance, and Professor Nigel Barltrop (University of Strathclyde), the project co-ordinator, for his guidance and direction in facilitating the research.

References

[1] Public website at www.rispect.org.uk/ and at www.rispect.eu/
[2] API 2002, Risk-based Inspection, API Recommended Practice 580, American Petroleum Institute, USA.
[3] CIRIA 2009, Whole-life infrastructure asset management: good practice guide for civil infrastructure, CIRIA, London.
[4] EEMUA 2006, Risk Based Inspection - A guide to effective use of the RBI process, EEMUA, UK.
[5] Winston, W.L. 1993, Operations Research - Applications and Algorithms, Third edn, Duxbury Press, California, USA.
[6] O'Hagan, Anthony; Buck, Caitlin E.; Daneshkhah, Alireza; Eiser, J.; Garthwaite, Paul H.; Jenkinson, David J.; Oakley, Jeremy E.; Rakow, Tim (2006) Uncertain Judgements: Eliciting Experts' Probabilities (Statistics in Practice), UK, John Wiley & Sons Ltd.
[7] ASME International (2003), Risk based methods for equipment life management, CRTD Vol 41, ASME, New York.
[8] Ayyub, Bilal M. (2003) Risk Analysis in Engineering and Economics, Washington: CRC Press.
[9] Wintle, J., Moore, P., Smalley, S. & Amphlett, G. (2006), 'Section 3: Identification of Ageing' in Plant Ageing: Management of equipment containing hazardous fluids or pressure, HSE, UK, p. 43.
[10] MIL-STD-1629A (1980); and MIL-STD-1629 Notice 2 (1984).
[11] Taking cognisance of 'IACS Common Structural Rules for Double Hull Oil Tankers, January 2006' in which 'high' economic consequences are placed in 'medium' risk category and 'very high' economic consequences in 'high' risk category.
[12] Andrews, J.D. & Moss, T.R. 2002, Reliability and Risk Assessment, UK.
[13] Adapted from Ayyub, Bilal M. (2003) Risk Analysis in Engineering and Economics.
