Paper presented at International Seminar on Railway Axles, Imperial College, London, 25-26 Sept. 2003
Summary: The detection of cracks in axles in the UK is carried out by a variety of non-destructive testing (NDT) methods. The probability of detection (POD) represents the effectiveness of the inspection, and is a quantity needed, together with crack growth rates, to determine inspection intervals to avoid failure in service.
This paper will review the various methods and the POD information available for each method from the rail and other industries. Recommendations for future work needed to establish POD curves will be given.
Index Terms: axles, inspection
Railway Safety asked TWI to review the non-destructive methods of detecting fatigue cracks in rail vehicle axles. The review was to include a survey of relevant inspection reliability data from other industries as well as any that could be found from the rail sector. The work consisted of a literature search (both library and web) and a series of visits to train operating companies to understand the methods they used, to canvass their opinions and to assess whether site conditions might influence the performance of the techniques.
This paper describes the results of the literature survey and a comparison of data obtained for probability of detection of cracks.
Probability of detection (POD) data is used, in combination with crack growth data, to give optimum inspection intervals for a given component in service. The interval is set such that an in-service defect which the NDT system would miss cannot grow to a size likely to cause failure before the next inspection. More sensitive inspection techniques, which detect smaller defects, allow less frequent inspection and are therefore desirable in most cases. A typical POD data curve is given in Fig. 1.
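The link between detectable flaw size and inspection interval can be illustrated with a minimal sketch. It assumes, purely for illustration, a crack growth rate proportional to crack depth, a critical depth of 30mm and a safety factor of 2. The function names, the growth law and all numerical values are hypothetical and are not taken from any axle data.

```python
def time_to_grow(a_start_mm, a_crit_mm, growth_rate, steps=10_000):
    """Service time for a crack to grow from a_start_mm to a_crit_mm.

    Integrates dt = da / growth_rate(a) with the trapezoid rule.
    growth_rate(a) returns depth growth per unit of service (assumed law).
    """
    if a_start_mm >= a_crit_mm:
        return 0.0
    da = (a_crit_mm - a_start_mm) / steps
    total = 0.0
    for i in range(steps):
        a0 = a_start_mm + i * da
        a1 = a0 + da
        total += 0.5 * da * (1.0 / growth_rate(a0) + 1.0 / growth_rate(a1))
    return total


def max_inspection_interval(a_detect_mm, a_crit_mm, growth_rate, safety_factor=2.0):
    """Longest interval such that a crack just missed at a_detect_mm
    cannot reach a_crit_mm before the next inspection, with a margin."""
    return time_to_grow(a_detect_mm, a_crit_mm, growth_rate) / safety_factor


def assumed_growth_rate(a_mm):
    # Hypothetical law: growth in mm per 1000 km, proportional to depth.
    return 0.01 * a_mm


A_CRIT_MM = 30.0  # hypothetical critical crack depth

# Two hypothetical techniques: one misses cracks below 3mm, one below 1mm.
interval_coarse = max_inspection_interval(3.0, A_CRIT_MM, assumed_growth_rate)
interval_fine = max_inspection_interval(1.0, A_CRIT_MM, assumed_growth_rate)

print(f"Detectable at 3mm: interval {interval_coarse:.0f} thousand km")
print(f"Detectable at 1mm: interval {interval_fine:.0f} thousand km")
```

Under these assumed values, lowering the detectable depth from 3mm to 1mm lengthens the permissible interval, which is the effect described above. A real assessment would replace the assumed growth law with measured crack growth data and the single detectable size with the full POD curve.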
The POD of a particular inspection can be established experimentally, with the help of some theory, or by 'expert judgement'. The experimental method is essentially a matter of providing a sample with a known set of flaws and comparing the number detected by a technique with the number present. However, the performance of a non-destructive testing technique is closely related to the procedure used and the particular component under test, and the cost of providing flawed samples usually limits the number of trials that can be carried out. Measurements to establish the POD are therefore constrained by this situation, and there are well established practices which need to be followed. These methods were largely established by the US aircraft industry, but have also been used by the UK offshore industry. Theoretical modelling can be used if the theory available accounts for the various flaw geometries and properties and allows for operator error. Usually, however, some support from experimental methods is useful. Expert judgement is frequently used to relate the POD of a technique known in one situation to that in another, similar one.
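The hit/miss counting described above can be turned into a POD estimate with a lower confidence bound. The sketch below uses the one-sided Clopper-Pearson (exact binomial) bound; this choice, and the function names, are assumptions of the example rather than a method stated in the sources reviewed here.

```python
from math import comb


def binom_tail_ge(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def pod_lower_bound(hits, trials, confidence=0.95):
    """One-sided Clopper-Pearson lower bound on the true POD.

    Finds, by bisection, the POD value p at which observing at least
    `hits` detections in `trials` flaws has probability 1 - confidence.
    """
    if hits == 0:
        return 0.0
    alpha = 1.0 - confidence
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if binom_tail_ge(hits, trials, mid) < alpha:
            lo = mid  # p too small to explain this many hits
        else:
            hi = mid
    return lo


# With no misses, how many flaws are needed to claim 90% POD
# with 95% confidence?
for n in (28, 29):
    print(n, "of", n, "detected: lower bound", round(pod_lower_bound(n, n), 4))

# A single miss in the same number of flaws lowers the bound.
print("28 of 29 detected: lower bound", round(pod_lower_bound(28, 29), 4))
```

With no misses the bound reduces to the closed form (1 - confidence)^(1/n), so 29 flaws of a given size must all be detected before 90% POD can be claimed with 95% confidence; 28 are not enough. This gives a sense of why the cost of flawed samples constrains experimental POD work.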
Comparison of POD curves is quite difficult because the curves may have different shapes and minor changes in procedure can cause significant changes in POD. For simplicity the flaw size for POD at 90% is often used for comparison, and will be used in the majority of cases below. It should be noted that this flaw size alone should not be used for determination of inspection intervals as 1 miss in 10 flaws may not be an acceptable 'miss rate'. Mathematical methods using the whole curve are preferred for this analysis.
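The limitation of comparing techniques by the 90% POD flaw size alone can be shown with a short sketch. It assumes a log-logistic (log-odds) POD model, a form commonly used in POD analysis but not confirmed as the one behind any curve discussed in this paper. The two parameter sets and the function names are hypothetical.

```python
import math


def pod(a_mm, alpha, beta):
    """Assumed log-logistic POD model: log-odds linear in ln(flaw size)."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * math.log(a_mm))))


def flaw_size_at_pod(p, alpha, beta):
    """Flaw size at which the assumed model reaches POD = p."""
    return math.exp((math.log(p / (1.0 - p)) - alpha) / beta)


def params_from_a50_a90(a50_mm, a90_mm):
    """Model parameters giving 50% POD at a50_mm and 90% POD at a90_mm."""
    beta = math.log(9.0) / math.log(a90_mm / a50_mm)
    alpha = -beta * math.log(a50_mm)
    return alpha, beta


# Two hypothetical techniques with the same 90% POD flaw size (3mm)
# but different curve shapes.
steep = params_from_a50_a90(a50_mm=2.0, a90_mm=3.0)
shallow = params_from_a50_a90(a50_mm=0.5, a90_mm=3.0)

for name, params in (("steep", steep), ("shallow", shallow)):
    a90 = flaw_size_at_pod(0.9, *params)
    print(f"{name}: a90 = {a90:.2f}mm, POD at 1.5mm = {pod(1.5, *params):.2f}")
```

Both hypothetical curves report the same 90% POD flaw size, yet they differ widely in how often a 1.5mm flaw would be found. This is the reason for preferring methods that use the whole curve when setting inspection intervals.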
A set of POD curves used for determination of inspection intervals on rail axles is given by Benyon and Watson. No reference was given for the source of this data, but since it is already in use for the intended purpose, it is used here as the rail industry reference for comparison with other industries.
2. NDT methods for axles
The methods used for axle inspection are summarised in Table 1. They are described in more detail below, with some information from TWI's visits.
Table 1 Summary of existing techniques
|Technique|Advantages|Disadvantages|
|---|---|---|
|Hollow Axles UT| |None except generic ultrasonic limitations.|
|Solid Axles Low angle scan for centre axle fatigue cracks|Can be carried out with limited access, and on axles in repair depots.|Manual and no electronic record. Apparent low sensitivity. Not highly regarded.|
|Solid Axles Near End/High Angle scan for wheel seats|Some inspection applicable from axle end.|Manual and no electronic record.|
|Axle Surface MPI|Traditional well regarded method.|Requires axle removal for inspection in MPI bay. Subject to operator vigilance and interpretation. Apparently many axles removed for corrosion indications.|
|EMFaCIS (Eddy current array)|Electronic recorded signal. Large coverage (multi-element probe).|Operator needs high skill level to interpret signals.|
|ACFM (Alternating current field measurement)|Electronic recorded signal. Large coverage (multi-element probe).|Operator needs high skill level to interpret signals.|
2.1 Ultrasonics: solid axles
There are well-established techniques for the ultrasonic inspection of solid axles. 'Near end' scans from the end of the axle and 'high angle' scans from the axle surface are used where inspection is required under bearings and wheels, where these have not been removed. All users visited by TWI in the course of their survey reported confidence in these scans. The 'far end' scan was developed to inspect the full length of the axle from one end and is generally applied during overhauls. The signal is very complex and difficult to interpret. None of those visited had any confidence in the capability of the far end scan to detect small defects at mid-span of the axle. This has also been reported in the literature. 
2.2 Ultrasonics: hollow axles
A limited number of vehicles in the UK have hollow axles and equipment is available to carry out ultrasonic inspection of these from the bore.  All users report a high degree of satisfaction with the bore inspection technique, partly because it produces a hard copy report but also because they have confidence that the technique has a better capability to detect small surface-breaking defects at mid-span than the techniques for solid axles. One operator made the point that on some locomotives there is a bearing tube which is not usually removed on overhaul and which prevents surface inspection. The ability to inspect these axles from the bore avoids a more expensive overhaul.
A detailed review of the hollow axle technique was beyond the scope of this review but, based on experience in using similar techniques in other industries, the users' confidence in the technique is probably well-based. If it has not already been done, it would be useful to carry out a systematic study of the capability and reliability of the technique in comparison with surface inspection techniques.
2.3 Surface inspection techniques
Possible surface inspection techniques are penetrant testing, magnetic particle inspection (MPI) and various electromagnetic techniques. Penetrant testing is not normally recommended for ferritic components like these where MPI is possible and we found no indication that it is used for axle inspection. The technique is not discussed further here.
MPI is widely used and is discussed below. There are two electromagnetic techniques available and these are also discussed below. Comparative trials have been carried out of MPI and the two electromagnetic techniques and these are reviewed in a separate section below.
2.3.1 Magnetic particle inspection (MPI)
MPI is a classical NDT technique for surface crack detection. A magnetic field is induced in a component and the magnetic flux leakage at a surface breaking defect is revealed by spraying the component with a magnetic particle ink. The magnetic field may be produced by an electromagnet yoke, by current flow in the specimen or by an encircling coil or adjacent conductor. The magnetic ink may contain black particles (in which case the surface is usually coated with a thin white contrast paint) or the particles may be fluorescent, in which case the item is inspected under ultraviolet light. For MPI to be effective, surface coatings such as paint need to be removed prior to inspection.
A typical MPI test installation for rail axle inspection was visited by TWI, although the test equipment was not working at the time of the visit. Wheelsets are usually tested with bearings removed and wheels present. Paint is removed by grit blasting and the axle is placed on a stand where it can be rotated for testing. Magnetisation is by encircling coil and inspection is carried out using fluorescent ink and ultraviolet light. This type of test installation is regarded as 'state of the art' in many industry sectors and probably represents the best that can be obtained by MPI.
Withdrawal of axles from service because of signals detected in MPI is not uncommon. However in many cases this was reported to be due to corrosion pits or ballast impact dents.
2.3.2 AC field measurement (ACFM) and electro-magnetic array (EMA)
ACFM is a surface crack inspection technique developed by TSC Inspection Systems and is being offered as an inspection technique for various applications by Bombardier. EMA is another electromagnetic surface inspection system operating on a slightly different physical principle to ACFM. It was developed by Newt International and is being offered as a test method by Alstom Transport Service under the name of EMFaCIS (Electro-magnetic Metal Fatigue & Crack Inspection Service). In both cases, an inspection probe (or array of probes) induces an alternating current in the surface of the component under test. This produces a magnetic field that is distorted by the presence of a crack. Components of the magnetic field are detected and the equipment displays and records signals that give the length and depth of the defect calculated by comparison with a theoretical model of the electromagnetic fields. The techniques are relatively insensitive to lift-off and so can be used to test through paint coatings. Another advantage arises from the fact that the data is recorded as the probe is scanned over the component. This means that:
- The scanning can be carried out by relatively inexperienced staff, with data analysis by more skilled staff.
- Data is available for further analysis and checking.
- Inspection can be carried out in conditions of poor access, so long as it is possible to manipulate the probe over the test surface.
3. Inspection reliability studies most relevant to rail axle inspection
3.1 Ultrasonics - Near end, high angle and bore inspection scans
Most inspection reliability trials carried out to assess ultrasonics have been concerned with weld defects, which are not particularly relevant to axle inspection. These studies generally showed very poor reliability for manual UT inspection (around 50% POD for defects of greater than 10mm length and 1-4mm depth) and wide variation in operator performance. However, some studies carried out by Martin Marietta Aerospace in 1973-5 are recorded in the NTIAC Non-Destructive Evaluation Capabilities Data Book. These give a 90% POD for fatigue cracks in aluminium using a full skip pulse echo method (angle not recorded) at a crack length of around 0.05 inches (depth 1.2mm). The similarities between this trial result and the near end scan are that (a) the defect is a fatigue crack, (b) angled ultrasonics is used and (c) the inspection and defect surfaces are smooth. The differences are (a) the material, (b) the inspection was full skip rather than half skip and (c) it was carried out at 10MHz rather than the 5MHz normally used for rail axle inspection. However, the type of sensitivity shown here is that which might be expected from near end and high angle scans. The most significant difference is the higher frequency in the aerospace trials, which in this case would improve the POD values compared to those for the axles.
The figure of 1.2mm given above seems to indicate that the technique is better than the estimate of Benyon and Watson (90% POD at around 3mm). Moreover, Hall estimates the minimum flaw sizes detectable by the high angle and far end scans to be 0.5mm and 1mm respectively, although this is not associated with a POD estimate. There is therefore some evidence that the Benyon and Watson figure might be conservative, although the fact that it is a manual inspection might have influenced their estimate.
3.2 Ultrasonics - Far end scan
The nearest equivalent result given is possibly that provided by Chapman and Bowker  from a summary of work in the nuclear industry. This quotes a figure of 4mm deep x 15mm long 'reliably detectable' for ranges above 150mm in welds. The axle inspection far end scan is at a range of around 700mm and therefore this limit of 150mm would be exceeded for the axle case.
The Chapman and Bowker result is not wholly inconsistent with the figure quoted by Benyon and Watson (50% POD at 5mm depth) given the increased range. This figure is typical of the one used to estimate intervals for inspection. However, strictly speaking there is no equivalent experimental data on POD for this inspection.
It is considered that this POD for the far end scan is needed. If that scan is not effective (as appears to be widely believed) then time and cost could be saved during overhaul by omitting it and relying on a surface inspection method carried out at less frequent intervals. More importantly, it was reported that this test is sometimes carried out where surface inspection is not possible. If that is the case, and if the inspection is not effective, then the results will provide a false confidence in the integrity of the axle.
3.3 Surface inspections
A comparative study of the surface inspection methods was organised by Angel Trains and carried out by Engineering Link following the Rickerscote accident. 
Extracts of the reports of this study have been made available to us. The fundamental conclusion appears to be that the ACFM and EMFaCIS systems are superior to MPI.
The difference is significant for cracks greater than 0.5mm deep. The POD results for this dimension range were:
Assuming that the conclusions reached by the report can be substantiated, the implication of this conclusion is that axles, after MPI, are being returned to service with less certainty of their integrity than if tested by the electromagnetic systems. However this trial is based on an analysis of the consistency of results between the techniques investigated using a traditional eddy current technique as a reference. There appears to be no independent confirmation of the existence or otherwise of the crack when an indication is reported. This means that some of the indications reported by the electromagnetic techniques and not reported by MPI could possibly be false calls by the electromagnetic techniques. False calls are a problem for MPI on welds but should be less of a problem on flat surfaces.
Studies have been made in other industries of MPI and of similar inspection systems to those used by ACFM and EMFaCIS. A comprehensive study of underwater inspection of welds using these techniques was carried out by University College London. 
The results are summarised in Table 2. 90% POD is used as a reference, where this data is available, although full POD curves are available in the literature.
Table 2 Summary of POD data for surface techniques from other sources
|Method|Source|Environment|Component|Result (90% POD)|
|---|---|---|---|---|
| |University College London (UCL) (1989-94)| | |Approximately 90mm length (Note 1). [Analysis by Rudlin and Wolstenholme (13) of the same data showed that defects of aspect ratio 5:1 gave 90% POD once they reached a size 10mm long and 2mm deep.]|
| |University College London (1991-4)| | |Approximately 90mm length (Note 1)|
|MOSER (EMFaCIS predecessor)|University College London (1990-94)| | |Approximately 100mm length (Note 1)|
| |NTIAC (8) (1974)| |Plates (¼ inch thick)|Depth 2-6mm (Note 2); after proof loading, depth 0.6-3.5mm (Note 2)|
|Eddy Current (High Frequency)|NTIAC (8) (1976)| |Plates (same as above)|90% POD not achieved|
| |BNFL (2000) (14)| | |100% detection of cracks greater than 1.5mm deep|
Note 1: 90% lower bound with 95% confidence
Note 2: Calculated from extremes of depth/thickness ratio given (limited to plate thickness)
Summarising these results is quite difficult, but there are some things which may be concluded:
- ACFM, MOSER (EMFaCIS) and MPI applied underwater had similar levels of detection for weld inspection. (Note: application underwater is more likely to be deleterious to performance for MPI).
- MPI sensitivity does vary considerably, but the majority of results suggest 90% POD for flaws around 1.5mm deep or less as a reasonable estimate on flat surfaces.
- Benyon and Watson give a 90% POD at around 1mm for both MPI and eddy current.
The conclusion of the UCL work above does not agree with the study carried out within the rail industry on axles; nor does the latter agree with the expert estimates of Benyon and Watson (which also give similar PODs for MPI and eddy current). It should be noted that the UCL work was carried out with very early versions of the ACFM and EMFaCIS systems. It should also be noted that the crack lengths used by UCL were originally determined by fluorescent MPI applied in air, with sample sectioning to confirm some crack lengths in critical cases (e.g. missed cracks).
However, there is no study available other than the rail industry one that compares the performance of the techniques on good quality (non-welded) surfaces, and this is also the most recent study. As NDT performance is very component and technique specific, the rail industry result should be the most appropriate to use. The implications of not using this result if an accident occurred after MPI had been carried out could be politically very significant. In view of the doubts expressed above, confirmation work is needed as a follow-up to give the result full credence. If the samples and records are still available, this would simply involve taking replicas or sections where cracks have been reported.
There is no equivalent POD available for the ultrasonic far end scan in the literature. The closest available comparison (from the nuclear industry) is fairly consistent with models already in use. There is some anecdotal evidence that this scan does not detect many defects, and its value has been questioned.
There is no equivalent to the PODs used in the rail industry for the near end and high angle scans. The available data closest to the application (from the aircraft and manufacturing industries) suggests that the estimates of sensitivity currently in use could possibly be conservative.
The data obtained within the rail industry for the surface inspection methods is highly relevant, and cannot be ignored. However it does appear inconsistent with some previous results obtained on welds in the offshore industry and with the expert judgement of other authors.
It is important that the results obtained are further validated, because if better methods are available than MPI and these are not put into use, then there are serious implications if an accident occurs after MPI.
The results obtained in the post-Rickerscote trials should be further validated by replicas, or by sectioning the predicted cracked areas. (Note: the trials may need to be repeated if the defect locations cannot be identified.)
To give greater confidence in the inspection interval used for axle testing and to evaluate new methods, the POD needs to be experimentally determined for each of the NDT methods. This may enable better use and choice of NDT techniques.
Thanks are due to Railway Safety for their support of this work and their permission to publish.
1. MIL-HDBK 1827: Nondestructive Examination System Reliability Assessment. US Department of Defense, April 1999.
2. Rudlin J R and Dover W D: The ICON Project - data for underwater inspection. Insight, Vol.38, No 6, June 1996.
3. Benyon J A et al: The Use of Monte Carlo Analysis to Increase Axle Intervals. Proc. International Wheelset Conference, Rome 2001.
4. Hall D C: Ensuring Wheelset Integrity - A Safety Regulators View. Proc. International Wheelset Conference, Rome 2001.
5. Gartside C: Automated Ultrasonic Testing of Rail Axles on TGV and Channel Tunnel Trains. Insight, Vol.36, May 1994, pp310-312.
6. Verkkooijen J: The Need for Reliable NDT Measurements in Plant Management Systems. Proc. European-American Workshop 'Determination of Reliability and Validation Methods for NDT', Berlin, June 18-20, 1997, pp299-309.
7. Programme for the Assessment of NDT in Industry. Report 1999 from AEA Technology (now SERCO) Inspection Validation Centre, RD9, Risley, Warrington, Cheshire WA3 6AT.
8. Rummel W D et al: Non-Destructive Evaluation (NDE) Capabilities Data Book (Third Edition). Non-Destructive Testing Information Analysis Centre (NTIAC), Austin, Texas.
9. Hall K G: Non-Destructive Testing of Axles Fitted to Railway Wheelsets. Insight, Vol.37, 1995, pp268-273.
10. Chapman R K et al: The Production of Capability Statements for Standard NDT Procedures. Insight, Vol.43, No 1, January 2001.
11. Pollard D M et al: Evaluation of Electromagnetic Non-Destructive Testing Techniques. Engineering Link Reports 11363/01, 11363/02 and 11363/03.
12. Rudlin J R et al: Results of Trials on Uncoated Nodes. IOCE Conference, Aberdeen 1992, Day 3, Spearhead Publications.
13. Rudlin J R et al: Development of Statistical Probability of Detection Models Using Actual Trial Inspection Data. Brit Jnl NDT, Vol.34, No 12, December 1992, pp583-589.
14. Bowker K J et al: Evaluation of the Capability of MPI for Surface Crack Detection. Proceedings of NDT-99, Poole 1999, pp215-220 (British Institute of NDT).