Ujjwal R Bharadwaj
Julian B Speck and Chris J Ablitt
TWI Ltd, Cambridge
Paper presented at 26th International Conference on Offshore Mechanics and Arctic Engineering, OMAE 2007, San Diego, California, 10-15 June 2007. Paper no. 29260.
Offshore wind farm managers are under increasing pressure to minimise life cycle costs whilst maintaining reliability or availability targets and operating within safety regulations. This paper presents a risk-based decision-making methodology for run-repair-replace decisions, with the ultimate aim of maximising the Net Present Value (NPV) of the investment in maintenance. The methodology was developed for the risk-based life management of offshore wind farms under the remit of the CORLEX (Cost Reduction and Life Extension of Offshore Wind Farms) project, funded by the DTI (Department of Trade and Industry, UK) Technology Programme on Renewable Energy.
Unlike traditional approaches to decision-making that consider either the probability of failure of a component or the consequence of failure in isolation, a risk-based approach considers both these aspects in combination to arrive at an optimal solution. The paper builds a basic Qualitative Risk Analysis methodology to highlight high-risk components that are then investigated further by a Quantitative Risk Analysis. The risk is then quantified in monetary terms, and the time of action (replacement or maintenance) indicated by the model is such that the NPV of the action is maximized. The methodology is demonstrated by considering an offshore wind turbine tower as the critical component and corrosion as the damage mechanism.
Maintenance manages the process of ageing of a plant or machinery. Ageing is a relentless process that starts from the moment a product is manufactured or fabricated. The relationship between failure rate and time is usually depicted as in Figure 1. During the initial stage, the Infant Mortality stage, 'teething' problems cause the failure rate to be high. The rate then falls as these problems (design and manufacturing defects, etc.) are identified and solved. In the second stage, the Useful Life stage, standard maintenance practices keep the failure rate almost constant. During the third and final stage of the plant, the Ageing stage, the failure rate rises, mainly due to damage accumulated by ageing. At some point during this stage, the high failure rate requires operators to consider replacing the plant. In practice, there are usually a number of such plants and a limited budget available to decision makers.
Fig.1. Failure rate of plant components versus age
This paper develops and demonstrates a risk-based methodology to decide the optimum time of replacement or repair of a plant, given a number of such plants and limited budgetary support.
The paper starts with a discussion on risk and its analyses - qualitative and quantitative, the concept of NPV, and moves on to the trade off involved in risk-based decision-making within budgetary and safety constraints.
Risk has numerous definitions depending upon its use. Risk is a combination of the probability of an event and its consequence [API 2002]. It is a deviation from the normal or expected. Numerically, it is the product of the probability of an event occurring and the consequence of the event.
3 The risk based approach to maintenance
A risk-based approach considers failure in both its dimensions, taking cognizance of the two elements that constitute risk: the probability (or likelihood) of failure and the consequence of that failure. Figure 2 shows the two-dimensional risk profiles of the components in an offshore wind turbine plant, henceforth referred to as the plant.
Fig.2. Idealized Risk Plot of several components within a plant
The probabilities and the consequences of failure of ten components have been determined and presented as points on a Risk Plot. An iso-risk line is also plotted representing a constant risk level as defined by the operator according to their perception of what is an acceptable threshold level of risk. The iso-risk line separates acceptable risk components from the unacceptable risk components, enabling plant managers to focus maintenance resources on the relatively more risky components.
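The screening described above can be sketched in a few lines of Python. This is a minimal illustration only: the component names, probabilities, consequence costs and the threshold are hypothetical values, not data from the paper, and the iso-risk line is assumed to be a single constant value of probability multiplied by consequence.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    probability: float  # annual probability of failure (hypothetical value)
    consequence: float  # cost of failure in currency units (hypothetical value)

    @property
    def risk(self) -> float:
        # Risk is taken as the product of probability and consequence.
        return self.probability * self.consequence


def above_iso_risk(components, risk_threshold):
    """Return the components whose risk exceeds the iso-risk level, highest risk first."""
    flagged = [c for c in components if c.risk > risk_threshold]
    return sorted(flagged, key=lambda c: c.risk, reverse=True)


# Hypothetical components, for illustration only.
components = [
    Component("tower", 0.02, 2_000_000),
    Component("gearbox", 0.10, 300_000),
    Component("ladder", 0.05, 10_000),
]

for c in above_iso_risk(components, risk_threshold=25_000):
    print(f"{c.name}: risk = {c.risk:,.0f}")
```

Components returned by `above_iso_risk` correspond to the points lying above the iso-risk line on the Risk Plot, i.e. the ones on which maintenance resources would be focused.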
4 Risk analysis methods
Risk analysis is the systematic use of information to identify sources of risk, and to estimate the risk of failure. The information used in risk analysis includes historical data, theoretical analysis, informed opinions and stakeholder concerns.
Risk analysis methods are generally categorised as qualitative or quantitative. There may be an intermediate category (semi-quantitative) depending upon how quantitative the risk analysis is. The American Petroleum Institute's Recommended Practice 580 on risk-based inspection describes a 'continuum of approaches' ranging from the qualitative to quantitative, Figure 3. The figure depicts the level of detail in risk analysis corresponding to a purely qualitative approach on one end of the spectrum, to the purely quantitative one on the other, with intermediate approaches in between.
Fig.3. Continuum of Risk Analysis methods
4.1 Qualitative analysis
This method uses engineering judgement and experience as the basis for risk analysis. The results of the analysis largely depend on the expertise of the user. The primary advantage of qualitative risk analysis is that it enables assessment in the absence of detailed numerical data. It is also the first pragmatic step to conduct a quantitative risk analysis by screening out components of less concern. Moreover, the results can serve as a reality check on the outcome of quantitative analysis. However, it is not a very detailed analysis and provides only a broad categorization of risk. Failure Modes, Effects and Criticality Analysis (FMECA), Hazard and Operability Studies (HAZOPS), and the Risk Matrix approach are examples of qualitative risk analysis. In the Risk Matrix [API Publication 2000] approach, the likelihoods and consequences of failure are qualitatively described in broad ranges (e.g. high, medium or low). Figure 4 shows the risk profiles of selected components of a wind turbine plant. The risk profiles are for demonstration only: in practice, the profiling is done by involving plant experts.
Fig.4. Qualitative Risk Analysis using a Risk Matrix
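A Risk Matrix lookup can be sketched as follows. The three-level scale follows the "high, medium or low" ranges mentioned above, but the rule used here to combine the two ratings is an assumption made for illustration; it is not the matrix from the API publication, and in practice the categories would be set by plant experts.

```python
LEVELS = ("low", "medium", "high")


def risk_category(likelihood: str, consequence: str) -> str:
    """Combine two qualitative ratings into a qualitative risk category.

    Assumed rule (illustrative only): add the two rank indices (0, 1 or 2)
    and band the total, which ranges from 0 to 4.
    """
    score = LEVELS.index(likelihood) + LEVELS.index(consequence)
    if score >= 3:
        return "high"
    if score == 2:
        return "medium"
    return "low"


# Hypothetical ratings for a few components, for illustration only.
ratings = {
    "tower": ("medium", "high"),
    "gearbox": ("high", "medium"),
    "ladder": ("low", "low"),
}

for name, (likelihood, consequence) in ratings.items():
    print(name, risk_category(likelihood, consequence))
```

Components that land in the highest category would be carried forward to the quantitative analysis, with the rest screened out.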
4.2 Quantitative analysis
Qualitative risk assessments become less discerning when the system complexity increases, so quantitative analysis is usually required for the risk discrimination of a system of components. Quantitative Analysis assigns numerical values to the probability (e.g. 10^-5 failure events per year) and the consequences of failure (e.g. inventory released over 1,100 m^2). Qualitative Analysis techniques such as FMECA and HAZOPS can become quantitative when the values of failure consequence and failure probability are numerically estimated. The numeric values can be determined from a variety of sources, such as generic failure databases or elicited expert opinion, or calculated by specific engineering and statistical analysis [ASME International 2003]. There are statistical methods for combining data from various sources or updating data with additional information [Jordan 2005] [Kallen, Noortwijk 2005] [Khan, Haddara & Bhattacharya 2006].
In the current discussion, it is assumed that the structure of a wind turbine tower is of critical importance, as highlighted by the Qualitative Analysis in the previous section.
For the Quantitative Risk Analysis method proposed in this paper, a failure frequency versus time curve for the Ageing period of life is developed by engineering analysis of the structure for the active or potentially active in-service damage mechanisms, e.g. corrosion. The consequence of failure is expressed in financial terms. For complex systems, event tree analysis is usually undertaken to determine the effect that a particular component has on the system, and thereby to resolve the individual cost of consequence of the component's failure.
5 Risk based optimisation
The next step is the calculation of the optimum schedule or date of the run-repair-replace action. This calculation weighs the financial benefits of maintenance action against the risk (as expressed in costs) of not taking the action. The ultimate aim is to maximise the net present value of the investment (i.e. the maintenance action) by adjusting the date of the action.
6 Decision-making using financial criterion
6.1 The need for financial criterion in maintenance decision making
Maintenance projects are increasingly being evaluated by decision makers who need to understand the implications of various options in financial terms. Although predictive maintenance techniques have matured, the predictions are in engineering terms, and these are not easily understood by financially oriented decision makers. Thus, there is a need to express the effects of engineering wear and tear in financial terms. In the current context, this is done by evaluating the cost of the consequences of failure owing to wear and tear of plant and machinery.
6.2 The drivers for a consistent decision making methodology in maintenance
Many old plants, structures, capital equipment or components are in their Ageing period of life. However, increasing competition means many of them cannot be replaced and need to have their useful life extended. In addition, new components are often designed to operate with maximum efficiency, and are designed with lower 'margins of error' against assumed operating conditions.
Each action (or project) has costs associated with it. These costs are, in essence, investments made by the asset owner concerned, with the expectation of a certain return on the investment(s). The decision maker will normally be faced with a number of projects competing for such investments, and therefore needs to take decisions that maximise the returns on these investments. The most widely understood financial techniques to evaluate projects include 'return on investment', 'pay-back period' and 'discounted cash flow (DCF)' methods [Brealey, Myers 1991]. These techniques have various strengths and weaknesses. This paper employs the net present value (NPV) technique, which is a form of DCF analysis.
7. Maximizing NPV using probabilistic damage mechanism models
7.1 NPV financial analysis
In the current discussion, it is assumed that a project with a higher NPV is a better investment than a project with a lower NPV. The NPV of a project is the present (current) value of the total future cash flows, both positive (income) and negative (cost). NPV considers the time value of money by discounting all the cash flows, and it is calculated as follows:
$$\mathrm{NPV} = \sum_{t=0}^{N} \frac{C_t}{(1+r)^t} \qquad (1)$$
where N = project life (years); t = timing of cash flow (year); r = interest rate, or discount rate; and Ct = cash flow in year t.
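As a minimal sketch of the discounting described above, each cash flow in year t is divided by (1 + r) raised to the power t, and the results are summed. The function name and the example cash flows are illustrative assumptions; indexing the first cash flow as year 0 is also an assumption.

```python
def npv(cash_flows, rate):
    """Net present value of cash_flows, where cash_flows[t] occurs in year t.

    Outflows are negative and inflows positive; rate is the discount rate r.
    """
    return sum(c / (1.0 + rate) ** t for t, c in enumerate(cash_flows))


# Hypothetical project: an outlay of 1000 now, then 600 in each of two years.
flows = [-1000.0, 600.0, 600.0]
print(round(npv(flows, rate=0.10), 2))
```

A higher discount rate reduces the weight of later cash flows, which is why the timing of a maintenance action affects its NPV.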
The future cash flows are 'expected' cash flows, as they do not occur with certainty. The uncertainty arises in the engineering analysis to calculate the probability of failure over time for the damage mechanism(s) of interest.
The risk associated with any project is finally expressed in terms of its NPV by using expected values (EV). The EV of a failure event is the product of the probability of the event occurring and the cost of consequence of that event.
The cost of consequence of a failure event is directly assessed from a prior quantitative consequence analysis, and it must be expressed in financial terms.
Thus, the NPV of a project with uncertain outcomes is the sum of the expected values of all future discounted cash flows, as follows:
$$\mathrm{NPV} = \sum_{t=0}^{N} \frac{p_t \, C_t}{(1+r)^t} \qquad (2)$$
where pt = the probability of the event occurring at time t.
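The expected-value form can be sketched by weighting each discounted cash flow by the probability of the event in that year. The function name, the probabilities and the failure cost below are hypothetical, and the first entry is assumed to correspond to year 0.

```python
def expected_npv(cash_flows, probabilities, rate):
    """Sum of probability-weighted, discounted cash flows.

    cash_flows[t] is the cash flow if the event occurs in year t and
    probabilities[t] is the probability of the event occurring in year t.
    """
    if len(cash_flows) != len(probabilities):
        raise ValueError("cash_flows and probabilities must have equal length")
    return sum(
        p * c / (1.0 + rate) ** t
        for t, (p, c) in enumerate(zip(probabilities, cash_flows))
    )


# Hypothetical failure event costing 500,000 whenever it occurs,
# with a probability of occurrence that rises year on year.
failure_cost = [-500_000.0, -500_000.0, -500_000.0]
p_failure = [0.0, 0.01, 0.02]
print(round(expected_npv(failure_cost, p_failure, rate=0.08), 2))
```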
7.2 Probabilistic damage mechanism model
This paper does not discuss the details of damage models for use in probabilistic analysis. Instead, it illustrates a simple probabilistic damage mechanism model for general corrosion of the tower structure. Consider a structure subject to corrosion. Assuming this to be the only damage mechanism causing failure of the structure, the remaining life of the structure can be calculated as:
$$RL = \frac{T_c - \mathrm{MAT}}{CR} \qquad (3)$$
where RL = remaining life (years); Tc = current thickness of the structure (mm); MAT = minimum allowable thickness (mm); and CR = corrosion rate (mm/year).
CR is derived from periodic in-service measurements of metal loss resulting from corrosion. Tc is known from the most recent thickness measurements on the structure (or at the start of the structure's life, Tc can be assumed to be equal to the original nominal thickness of the structure as specified by the designer including tolerances, corrosion allowance, etc). MAT is the absolute minimum thickness calculated by the designer to prevent failure by overload, collapse, etc as appropriate.
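A deterministic remaining-life calculation based on these three quantities can be sketched as the thickness margin (Tc minus MAT) divided by CR. The numerical values are hypothetical and the input checks are an addition for illustration.

```python
def remaining_life(tc_mm, mat_mm, cr_mm_per_year):
    """Remaining life in years: thickness margin divided by corrosion rate."""
    if cr_mm_per_year <= 0:
        raise ValueError("corrosion rate must be positive")
    # A structure already at or below MAT has no remaining life.
    return max(tc_mm - mat_mm, 0.0) / cr_mm_per_year


# Hypothetical tower section: 20 mm measured, 12 mm minimum, 0.25 mm/year loss.
print(remaining_life(tc_mm=20.0, mat_mm=12.0, cr_mm_per_year=0.25))
```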
The convention is to calculate RL in a deterministic manner, whereby each independent variable in Equation 3 is a specific value. This assumes that these variables have no random or probabilistic aspects but can be defined in a fixed predictable fashion. In reality, there is considerable uncertainty associated with these variables and each can be defined by a statistical distribution of values.
In the method here, a statistical analysis tool (Palisade's @RISK for Microsoft Excel) is used to describe all the independent variables probabilistically, and RL is then calculated using the Monte Carlo Simulation (MCS) technique. The RL calculated by MCS is therefore a distribution of values, from which the annual probability of failure (the failure rate per year) over time can be obtained. This probability versus time curve may then be used to derive the EV, where the EV of a failure event is the product of the probability of the event occurring and the cost of consequence of that event, at a specific point in time.
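The Monte Carlo step can be sketched with the Python standard library in place of @RISK. The distribution types and parameters below are assumptions chosen only for illustration (normal for the thicknesses, lognormal for the corrosion rate so that it stays positive); they are not the distributions used in the paper. The annual probability is taken here as the fraction of simulated lives that end within each year.

```python
import math
import random


def simulate_remaining_life(n_samples, seed=0):
    """Sample remaining life (years) from assumed input distributions."""
    rng = random.Random(seed)
    lives = []
    for _ in range(n_samples):
        tc = rng.gauss(20.0, 0.5)   # current thickness, mm (assumed)
        mat = rng.gauss(12.0, 0.3)  # minimum allowable thickness, mm (assumed)
        cr = rng.lognormvariate(math.log(0.25), 0.3)  # corrosion rate, mm/year (assumed)
        lives.append(max(tc - mat, 0.0) / cr)
    return lives


def annual_failure_probability(lives, horizon_years):
    """Fraction of simulated lives that end in each year [t, t + 1)."""
    counts = [0] * horizon_years
    for rl in lives:
        year = int(rl)
        if year < horizon_years:
            counts[year] += 1
    return [c / len(lives) for c in counts]


lives = simulate_remaining_life(10_000, seed=1)
p_fail = annual_failure_probability(lives, horizon_years=60)
print("mean remaining life (years):", sum(lives) / len(lives))
print("year with highest failure probability:", p_fail.index(max(p_fail)))
```

The list `p_fail` plays the role of the probability versus time curve: multiplying each entry by the cost of consequence gives the EV of failure for that year.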
7.3 Risk-based Optimisation
The key inputs to the optimisation model are as follows:
(a) The expected present value of the proposed action (replacement or repair of the asset);
(b) The expected present value of inaction which is equal to the expected present value of the production losses avoided as the result of undertaking the proposed action;
(c) Any financial constraints, such as the annual maintenance budget limit; and
(d) Any non-financial constraints, e.g. on failure rates due to safety regulations.
Thus, for the NPV of an action taken at time t=n, the following can be defined:
- CBt = Cash flows associated with production in year t;
- CPt = Cash flows associated with implementing the project in year t, including any tax credits (positive cash flow) on depreciation costs; [Collier, Glagola 1998]
- N = the maintenance planner's strategic planning period;
- n = year in which the action is proposed to be undertaken;
- pt = probability of the event (failure) occurring in year 't'; and
- r = interest rate, i.e. the cost of money (finance).
In the current context, the NPV of action in any year 'n' is given by:
NPV=(Expected present value of action)+(Expected present value of inaction) (4)
Assuming that cash outflows are negative and cash inflows are positive, and failure results in production loss,
The optimization algorithm calculates the year of maintenance action for which the NPV is maximum (least negative), subject to stipulated constraints.
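The search for the best action year can be sketched as below. The NPV expression is an assumed form built from the description above: the discounted cost of the action in year n (an outflow) plus the expected discounted failure losses avoided from year n onwards (treated as an inflow). It ignores tax credits and the failure probability of the renewed asset, and it should not be read as the paper's exact equation. All names and numbers are hypothetical.

```python
def action_npv(n, action_cost, failure_loss, p_fail, rate):
    """Assumed NPV of acting in year n (see the assumptions stated above)."""
    cost_term = -action_cost / (1.0 + rate) ** n
    avoided = sum(
        p_fail[t] * failure_loss / (1.0 + rate) ** t
        for t in range(n, len(p_fail))
    )
    return cost_term + avoided


def best_action_year(action_cost, failure_loss, p_fail, rate, max_annual_p=None):
    """Year with the highest NPV, optionally subject to a failure-probability cap.

    If max_annual_p is given, a year n is feasible only if the annual failure
    probability in every year before the action stays at or below the cap.
    """
    feasible = [
        n for n in range(len(p_fail))
        if max_annual_p is None or all(p <= max_annual_p for p in p_fail[:n])
    ]
    return max(
        feasible,
        key=lambda n: action_npv(n, action_cost, failure_loss, p_fail, rate),
    )


# Hypothetical rising failure probabilities for planning years 0..7.
p_fail = [0.005, 0.01, 0.02, 0.03, 0.05, 0.08, 0.12, 0.18]
year = best_action_year(1_000_000, 2_000_000, p_fail, rate=0.08)
capped = best_action_year(1_000_000, 2_000_000, p_fail, rate=0.08, max_annual_p=0.015)
print("unconstrained:", year, "with safety cap:", capped)
```

Under this assumed form, delaying the action saves money through discounting until the expected annual loss outweighs that saving; a cap on failure probability can pull the action earlier than the purely financial optimum.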
The maintenance action may be replacement or repair. In the case of replacement, the equipment or component begins its life cycle again, from the Infant Mortality stage through to the Ageing stage. In the current model, it is assumed that repair improves the condition of the equipment such that it returns to a stage prior to the Ageing stage, i.e. the Useful Life stage or, preferably, the Infant Mortality stage. It may be noted that the implications in terms of tax credits on plant or equipment depreciation may be different for replacement compared to repair, and these need to be reflected in the equation accordingly.
8. Inputs from inspection
It is possible to add value to the overall replacement decision-making in the Ageing period of life, by using Inspection findings to revise the rate of damage (e.g., corrosion rate, CR), or take corrective action to reduce the rate of damage (e.g. maintenance painting to eliminate corrosion). Using the model described above, the NPV of such an action can then be compared with the NPV without such action, to appraise the inspection financially.
9. Demonstration of the model
The approach described above was successfully demonstrated by evaluating three offshore wind turbine towers (i.e. the structures, and not the rotating machinery). The failure frequency from a probabilistic corrosion model is presented in Figure 5. The optimised replacement schedule for Structure #1 is shown in Figure 6. The year in which the 'ActionCost NPV' is maximum has been calculated using the linear optimization tool 'Solver' in the MS Excel spreadsheet. NPV is obtained by considering the expected net present value of: (a) cash flows resulting from the replacement of the structure; and (b) the avoided lost production outage cost due to replacement. The optimum action date for Structure #1 is 2013, Figure 7.
Fig.5. Probability of failure versus time
Fig.6. Optimum time of replacement of Structure #1
Fig.7. Structure #1, Action NPV and Probability of Failure versus time
Figure 7 shows the application of the risk-based approach to the maintenance of Structure #1. The probability of failure versus time curve derived from remaining life estimates is, on its own, incomplete information for the decision maker. The Action NPV versus time curve generated by the risk-based approach enables the user to make a more informed maintenance decision by considering the consequences of failure in conjunction with the probability of failure. The optimal action date is the time when the NPV of the action is maximum (2013, for Structure #1).
The optimized action years for the other two structures are derived in the same way. To determine the optimised action years for all the three structures within a budgetary constraint, the Solver in MS Excel was used. The resulting optimised schedule is shown in Figure 8.
Fig.8. Optimum schedule of replacement of three structures within budgetary constraint
Figure 8 shows the replacement schedule for three structures Str#1, Str#2 and Str#3 as 2013, 2016 and 2014 respectively.
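Scheduling several structures under an annual budget can be sketched by exhaustive enumeration, which is swapped in here for the Excel Solver used in the paper and is only practical for a handful of structures. The NPV expression is the same assumed form as before (discounted action cost plus expected discounted failure losses avoided), and the structure data and budget are hypothetical, so the output is not expected to reproduce Figure 8.

```python
from itertools import product


def action_npv(n, action_cost, failure_loss, p_fail, rate):
    """Assumed NPV of acting in year n: action cost plus avoided expected losses."""
    cost_term = -action_cost / (1.0 + rate) ** n
    avoided = sum(
        p_fail[t] * failure_loss / (1.0 + rate) ** t
        for t in range(n, len(p_fail))
    )
    return cost_term + avoided


def best_schedule(structures, rate, annual_budget):
    """Try every combination of action years; keep the best one within budget."""
    horizon = min(len(s["p_fail"]) for s in structures)
    best_years, best_value = None, float("-inf")
    for years in product(range(horizon), repeat=len(structures)):
        spend = {}
        for s, n in zip(structures, years):
            spend[n] = spend.get(n, 0.0) + s["action_cost"]
        if max(spend.values()) > annual_budget:
            continue  # spending in at least one year exceeds the budget
        value = sum(
            action_npv(n, s["action_cost"], s["failure_loss"], s["p_fail"], rate)
            for s, n in zip(structures, years)
        )
        if value > best_value:
            best_years, best_value = years, value
    return best_years, best_value


# Three hypothetical structures with different failure probability curves.
structures = [
    {"action_cost": 1_000_000, "failure_loss": 2_000_000,
     "p_fail": [0.005, 0.01, 0.02, 0.03, 0.05, 0.08, 0.12, 0.18]},
    {"action_cost": 1_000_000, "failure_loss": 2_000_000,
     "p_fail": [0.002, 0.004, 0.008, 0.015, 0.025, 0.04, 0.06, 0.09]},
    {"action_cost": 1_000_000, "failure_loss": 2_000_000,
     "p_fail": [0.004, 0.008, 0.015, 0.025, 0.04, 0.06, 0.09, 0.14]},
]
years, value = best_schedule(structures, rate=0.08, annual_budget=1_500_000)
print("action years (offsets from the start of planning):", years)
```

With a budget that covers only one replacement per year, structures whose individual optimum falls in the same year are pushed apart, which is the kind of staggering that a budget-constrained schedule produces.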
10. Limitations of the model
Some of the immediately apparent limitations of this tool are as follows:
- For more complex systems with an increasing number of components and constraints, non-linear optimization tools (e.g. based on genetic algorithms) more powerful than the linear Solver in Microsoft Excel may be required;
- There are economic dependencies in maintenance and inspection, so with increasing dependencies the methodology will become more complex and more computing power may be required for the analysis; and
- For safety critical systems, as opposed to business critical systems, where the target or acceptable levels of failure probability are of the order of 10^-4 to 10^-6 failures per year, the constraints on the failure probability may be so severe as to cancel out any potential financial benefits from applying the methodology.
This methodology is essentially for proposed action during the 'Ageing Phase' of a plant in which failure is primarily due to accumulated damage. Thus the method needs to be used in conjunction with an overall maintenance strategy rather than in isolation.
The paper describes and demonstrates work in progress of the DTI funded CORLEX project under the Technology Programme on Renewable Energy.
Risk-based maintenance optimization requires a detailed analysis using quantitative techniques. The proposed methodology uses engineering analysis, developing a basic probabilistic damage mechanism model to obtain failure rates. The resulting failure rates over time are used to calculate expected present values of cash flows before and after selected maintenance actions (e.g. equipment replacement).
It has been shown that the optimum year of replacement can be calculated when the net present value (NPV) of the maintenance action is maximised. If there is a budgetary constraint that does not allow for a series of actions in a system of structures to be undertaken in a given strategic planning period, multi-component optimisation can be easily undertaken using the approach.
Future work will focus on the derivation of failure rates using expert elicitation, as well as the combination of failure probabilities using Bayesian methods to update prior probability distributions with other data sources, e.g. generic failure databases with expert knowledge. The optimisation method currently used in the methodology will also be developed by exploring the practicalities of using off-the-shelf genetic algorithm solvers. Finally, further work will be undertaken to incorporate a wider range of in-service equipment damage mechanisms, as well as previous in-service inspection data, to optimise future inspection plans (i.e. coverage and schedule).
The authors wish to acknowledge the support of Professors Vadim Silberschmidt and John Andrews at Loughborough University, and Mr John Wintle at TWI.
- API 2002, Risk-based Inspection, API Recommended Practice 580, American Petroleum Institute, USA.
- API Publication 2000, 'Risk-Based Inspection, Base Resource Document' in, pp. 5-1-5-4.
- ASME International 2003, Risk Based Methods for Equipment Life Management, CRTD Vol 41, ASME, New York.
- Brealey, R.A. & Myers, S.C. 1991, Principles of Corporate Finance, 4th edn, McGraw-Hill, New York.
- Collier, A.C. & Glagola, R.C. 1998, 'Engineering Economic and Cost Analysis' in , 3rd edn, Addison-Wesley, California, pp. 438-464.
- Jordan, I. 2005, Decisions under uncertainty: Probabilistic Analysis for Engineering Decisions.
- Kallen, M.J. & Noortwijk, J.M.V. 2005, 'Optimal Maintenance decisions under imperfect inspection', Reliability Engineering & System Safety.
- Khan, F.I., Haddara, M.M. & Bhattacharya, S.K. 2006, 'Risk Based Integrity and Inspection Modeling (RBIIM) of Process Components/System', Risk Analysis, vol. 26, no. 1.