Neural network forecasting of equipment inoperability
for an aircraft operator 1

Kenneth Wang* and James T. Luxhøj

Department of Industrial Engineering
Rutgers, The State University of New Jersey
96 Frelinghuysen Rd.
Piscataway, NJ 08854-8088

1 This report is based on research performed at Rutgers University and is partially supported by Federal Aviation Administration grant # 97-G-005. The contents of this paper reflect the view of the authors who are solely responsible for the accuracy of the facts, analyses, conclusions, and recommendations presented herein, and do not necessarily reflect the official view or policy of the Federal Aviation Administration.

*Rutgers Undergraduate Research Fellow


Abstract

     The issue of aircraft safety is a major concern for the Federal Aviation Administration, the aviation industry, and the general public. In this research report, the problem of aircraft safety is dealt with by focusing on the Service Difficulty Report (SDR). SDRs are completed for each instance of equipment inoperability and an SDR often precedes a critical safety problem. The purpose of this research effort is to forecast the monthly number of SDRs for an aircraft operator.

     The method used to create the SDR forecasting model is a neural network. Neural networks are extremely useful in detecting underlying trends in data. Data from the Federal Aviation Administration and Department of Defense databases were used for the forecasting model. An important feature of this research effort is that the trending of SDRs was done using a mixed fleet composition of data, which in the past has not given favorable results. Next, several different architectures for the neural network model were implemented and analyzed. The General Regression Neural Network (GRNN) architecture was found to be the best. The forecasted values of the model were then compared to the actual values using statistical analysis. For the GRNN model, the coefficient of multiple determination, R2, was found to be 0.8962 which indicates that a good SDR forecasting model has been found for the operator.


Introduction


      Safety is a major concern for the Federal Aviation Administration (FAA) and for the entire aviation industry. Each airline operator and the FAA have an obligation to ensure the safety of each passenger aboard each aircraft. The major concern for most passengers, when traveling by air, is the issue of safety. The dangers and risks involved with flying will always exist, but with new research efforts, these dangers and risks can be minimized.

     Many improvements have been made over the past decade to improve safety standards for the entire aviation industry. However, the number of people utilizing air travel has significantly increased. The FAA is predicting 800 million domestic passengers will travel by air by the year 2000 and over a billion by the year 2010 (Luxhøj and Cheng, 1998). This has spurred new research efforts by both the FAA and the aviation industry to improve airline safety.

     One method of assessing the safety of an aircraft is by analyzing Service Difficulty Reports (SDRs). An SDR is completed for each instance of equipment inoperability, such as an in-service difficulty, malfunction, or defect. SDRs provide information about problems or failures of aircraft components and equipment ("Automated Trend Monitoring for Service Difficulty Reports", 1998). This information, which bears on airworthiness and reliability, is what is needed to assess the safety of an aircraft.


Problem statement


     The research problem that is being examined in this study is the issue of aircraft safety. The problem of aircraft safety can be approached in many different ways. This research effort is focused on the Service Difficulty Report (SDR). An SDR is completed each time any difficulties occur with an aircraft. The SDR often precedes a critical safety problem. Also, an SDR provides vital information necessary to assess the safety of an aircraft (Luxhøj and Cheng, 1998). The purpose of this project is to forecast SDRs. By forecasting SDRs, any potential safety problem can be dealt with before any serious repercussions occur. The method that has been used for this particular SDR forecasting model is a neural network.

     The primary sources of data used in this research include the National Program Tracking and Reporting Subsystem (NPTRS), the National Vital Information Subsystem (NVIS), and the Department of Defense (DoD). The data were taken for a single air operator, which will be referred to as Operator A. Data for only one operator were used so that the data would be consistent. The specific range of data examined was from August 1993 to May 1998. Data prior to August 1993 were not considered because they were inconsistent with the data required for this model.

     Next, the fleet composition of Operator A was taken into consideration. Operator A's fleet was first decomposed by aircraft make/model. The assumption was that data decomposed by aircraft make/model would outperform data composed of mixed fleets. The results from the forecasting model did not support this assumption. The data that had been decomposed by aircraft make/model gave very poor results in comparison to the data composed of mixed fleets. Therefore, the data composed of mixed fleets were used for this model.


SDR forecasting model


      In order to build an SDR forecasting model, many parameters must be considered. Several different modeling techniques exist. Each modeling technique is distinctly different and has its advantages/disadvantages. The particular model chosen for this project is a neural network.

     Neural networks have many advantages, but also some disadvantages. A significant advantage of a neural network is that it takes comparatively little time to build a model. A modeling technique such as a regression model could take more time to construct than a neural network. Neural network models also have outperformed other techniques of aviation safety modeling in many past cases (Luxhøj and Cheng, 1998). One of the disadvantages of a neural network is that it is essentially a black box, and not a causal model, so it is difficult to understand the exact process that the neural network uses to achieve the output. Although this is quite a disadvantage, the advantages of using a neural network, such as the ability to handle nonlinear data, far outweigh the disadvantages.

     Once the neural network was chosen as the modeling technique, the next step was to choose the inputs and outputs for the model. Choosing the output is fairly simple: since we are forecasting SDRs, the output of the model is the SDR Submission Rate. Choosing the inputs is much more difficult. Possible inputs included data on operations and airworthiness (i.e., maintenance) surveillances, fleet size, changes in personnel, and cabin en route inspections, among others. Using the statistical analysis software SPSS, the cross-correlation function between each possible input and the output, SDR Submission Rate, was found. The correlation measures how two variables vary together. The cross-correlation function extends this to lagged data: it gives the correlation between the output and the values of a possible input shifted by a given number of periods, and it is displayed graphically. In the cross-correlation plots, the vertical axis represents the correlation coefficient and the horizontal axis represents the lag number. In addition, two black lines run across each plot; they represent the confidence interval for the correlation coefficient. After studying the cross-correlation functions between the possible inputs and the output, the inputs for the neural network were chosen. Of the eleven possible inputs, five were chosen as actual inputs for the neural network.

     Possible inputs that had a correlation coefficient close to or beyond the confidence limits were chosen to be the inputs for the neural network. The inputs were chosen without consideration to lag: a possible input was selected if its correlation coefficient with the output met this criterion at any lag. A sample cross-correlation function is shown in Figure 1.


Figure 1. Cross-correlation of FAA/DoD Ramp Inspections and SDR Submission Rate
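
     The screening step described above can be illustrated with a short sketch. It is not the SPSS procedure itself: the function names are hypothetical, the lag convention (a positive lag pairs the input at time t with the output at time t + lag) is an assumption, and the limits of roughly plus or minus 2 divided by the square root of n are a common large-sample approximation that may differ from the limits SPSS draws.

```python
import math


def cross_correlation(x, y, lag):
    """Sample cross-correlation between x[t] and y[t + lag]."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sd_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / n)
    sd_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / n)
    if lag >= 0:
        pairs = zip(x[:n - lag], y[lag:])
    else:
        pairs = zip(x[-lag:], y[:n + lag])
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs) / n
    return cov / (sd_x * sd_y)


def significant_lags(x, y, max_lag):
    """Lags whose coefficient lies outside the approximate limits +/- 2/sqrt(n)."""
    limit = 2.0 / math.sqrt(len(x))  # assumed approximation, not taken from SPSS
    found = {}
    for lag in range(-max_lag, max_lag + 1):
        r = cross_correlation(x, y, lag)
        if abs(r) > limit:
            found[lag] = r
    return found
```

     Under the selection rule used in this study, a candidate input would be retained whenever significant_lags returns at least one lag for it, regardless of which lag that is.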


      After choosing the input data for the neural network, the next step was to partition the input and output data that were to be entered into the neural network. The input and output data were divided into three data sets - training, test, and production sets. The neural network first learns patterns through the inputs and outputs of the training set and builds a network. This network is then used on the inputs of the test set to predict the outputs of the test set. Each time the outputs of the network give a result closer to actual outputs of the test set the network is saved. The procedure of learning patterns through the training set and predicting the output of the test set continues until the network no longer improves results on the test set.
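
     The save-the-best procedure described above can be sketched as follows. This is an illustration only, not the training routine of the software used in this study: the model is a single linear neuron trained by gradient descent, and the function names, the learning rate, and the patience parameter (the number of epochs without improvement that ends training) are all assumptions.

```python
def mean_squared_error(w, b, xs, ys):
    """Mean squared error of the linear model w * x + b on one data set."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train_with_test_set(train, test, rate=0.01, patience=20, max_epochs=5000):
    """Learn on the training set; keep the weights that score best on the test set."""
    train_x, train_y = train
    test_x, test_y = test
    w, b = 0.0, 0.0
    best_weights = (w, b)
    best_error = mean_squared_error(w, b, test_x, test_y)
    epochs_without_gain = 0
    n = len(train_x)
    for _ in range(max_epochs):
        # One gradient-descent step on the training set.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(train_x, train_y)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(train_x, train_y)) / n
        w -= rate * grad_w
        b -= rate * grad_b
        # Save the network only when it improves on the test set.
        error = mean_squared_error(w, b, test_x, test_y)
        if error < best_error:
            best_weights, best_error = (w, b), error
            epochs_without_gain = 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= patience:
                break  # the network no longer improves on the test set
    return best_weights, best_error
```

     The weights returned are those saved at the best test-set error, not necessarily those of the final epoch.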

     The accuracy of the model is at this point unknown. The production set, which was partitioned earlier, is now used to see how accurate the neural network model is. The inputs from the production set are fed into the neural network model. The neural network model then forecasts the outputs of the production set. The actual outputs of the production set are then compared to the forecasted outputs of the production set. Since the production set is data the neural network model has never seen before, the forecasting of the production set output is a good measure of the accuracy of the neural network model. The data for this neural network model were partitioned by allocating 60% of the data to the training set, 20% of the data to the test set, and 20% of the data to the production set.
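
     A minimal sketch of a three-way partition is given below. The report does not state whether records were assigned to the sets chronologically or at random, so the chronological split here is an assumption, as are the function name and the use of rounding to set the sizes.

```python
def partition(records, train_frac=0.6, test_frac=0.2):
    """Split records, in their given order, into training, test, and production sets."""
    n = len(records)
    n_train = round(n * train_frac)
    n_test = round(n * test_frac)
    training = records[:n_train]
    test = records[n_train:n_train + n_test]
    production = records[n_train + n_test:]  # whatever remains after the first two sets
    return training, test, production
```

     The production set receives the remainder, so the three sets always account for every record.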

     The next step for building a neural network model is to choose the specific architecture of the neural network. Many different architectures exist for neural networks, and they vary in their computational properties. Certain architectures work well for certain forecasting models, but it is difficult to determine in advance which architectures will work well with which type of data. Some general guidelines do exist, and these guidelines were followed to limit the choices of architecture. Various neural network architectures were used to build a model.


SDR forecasting using neural networks

      A neural network is a mathematical model loosely based on the human brain. The mathematical computing elements in neural networks are known as neurons. A neural network is made up of many neurons, which are connected by links. Input neurons receive the problem and transmit the data to adjacent neurons. These data are then propagated through the network of neurons until they are output through the output neurons (Burke and Ignizio, 1997).

     The neurons in a neural network are simple computing elements. Each neuron receives data from its input links and computes an activation level that is then propagated through to the output links. The activation level is based on the data from the input links and on the weights. The computation has two components, an input function and an activation function. The input function is a linear function that computes the weighted sum of the input values (a_j) using the weights (W_j). The activation function, g, is a nonlinear function that transforms the weighted sum into its final value (Russell, 1995).

     The following formula is for the input function:

     $$in_i = \sum_j W_j a_j$$

     The activation value, a_i, is computed by applying the activation function, g, to the input function, in_i:

     $$a_i = g(in_i) = g\left(\sum_j W_j a_j\right)$$
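
     The two components of the neuron computation can be sketched in a few lines. The report does not say which activation function g was used, so the logistic (sigmoid) function below is an assumption chosen only for illustration, and the function names are hypothetical.

```python
import math


def input_function(weights, inputs):
    """Weighted sum of the input values: the sum over j of W_j * a_j."""
    return sum(w * a for w, a in zip(weights, inputs))


def sigmoid(x):
    """Logistic activation function (an assumed choice of g)."""
    return 1.0 / (1.0 + math.exp(-x))


def activation(weights, inputs, g=sigmoid):
    """Activation value obtained by applying g to the weighted sum."""
    return g(input_function(weights, inputs))
```

     For example, with weights (0.5, -1.0, 2.0) and inputs (1.0, 2.0, 0.5) the weighted sum is -0.5, and the sigmoid maps it to a value between 0 and 0.5.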


SDR forecasting results with backpropagation and Ward networks

     The first type of architecture used for the neural network model was the standard backpropagation network (BPNN). The standard backpropagation network is a simple neural network. This network has an input layer, an output layer, and one or several hidden layers. Backpropagation networks were chosen because of their ability to generalize data well on a wide variety of problems (NeuroShell2, 1996).

     The results using the standard backpropagation network were not satisfactory. A brief explanation of the statistics reported for each network is given here. The R2 value is the coefficient of multiple determination. The closer this value is to 1, the more closely the output from the neural network matches the actual output. The r2 value is the square of the correlation coefficient r, which measures the linear relationship between the output of the neural network and the actual output. When r is close to one, a positive linear relationship exists; when r is close to negative one, a negative linear relationship exists. The Mean Squared Error (MSE) is the mean of the squared differences between the neural network output and the actual results. The mean, min., and max. absolute errors are the mean, minimum, and maximum of the absolute differences between the neural network output and the actual results. The percent within X% rows give the percentage of neural network outputs that fall within X percent of the actual output. Table 1 gives the forecasting results for the backpropagation neural network.



Table 1. Operator A backpropagation network (BPNN) results


Statistic                     Value
R squared                     0.5031
r squared                     0.5602
Mean squared error            4033.113
Mean absolute error           44.839
Min. absolute error           6.588
Max. absolute error           135.232
Correlation coefficient r     0.7485
Percent within 5%             45.455
Percent within 5% to 10%      9.091
Percent within 10% to 20%     9.091
Percent within 20% to 30%     9.091
Percent over 30%              27.273


     The Ward network was the next type of architecture used. The Ward network is a backpropagation neural network with multiple hidden layers. Past experience has shown that Ward networks work well with data that have several inputs. Also, Ward networks are known for their ability to detect features (NeuroShell2, 1996). For this reason, the Ward network was applied to this model.

     The Ward network also did not have much success. The results from the Ward network are shown in Table 2.



Table 2. Operator A Ward network results


Statistic                     Value
R squared                     0.4018
r squared                     0.708
Mean squared error            3946.317
Mean absolute error           49.078
Min. absolute error           0.984
Max. absolute error           147.311
Correlation coefficient r     0.8414
Percent within 5%             18.18
Percent within 5% to 10%      0
Percent within 10% to 20%     36.364
Percent within 20% to 30%     18.182
Percent over 30%              27.273


SDR forecasting results with general regression neural network

     The general regression neural network (GRNN) was the next type of architecture that was used. This type of network works well with few data points; therefore, this type of architecture is suitable for this forecasting model. The GRNN produced very good forecasts and was ultimately chosen as the SDR forecasting model. The results of the GRNN are given in Table 3. The smoothing factor listed there controls how closely the network fits the training data. Table 4 displays the SDR Submission Rate forecasted by the model alongside the actual SDR Submission Rate.



Table 3. Operator A general regression neural network (GRNN) results


Statistic                     Value
R squared                     0.8962
r squared                     0.8978
Mean squared error            842.132
Mean absolute error           23.985
Min. absolute error           2.999
Max. absolute error           63
Correlation coefficient r     0.9475
Percent within 5%             27.273
Percent within 5% to 10%      27.273
Percent within 10% to 20%     36.364
Percent within 20% to 30%     0
Percent over 30%              9.091
Smoothing factor              0.043
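

     To illustrate the role of the smoothing factor, the following sketch implements the commonly cited formulation of a GRNN prediction as a Gaussian-kernel weighted average of the training outputs. It is a sketch under stated assumptions, not the implementation used in this study: the function name is hypothetical, the inputs are assumed to be already scaled to a common range, and the software used here may scale inputs or measure distance differently.

```python
import math


def grnn_predict(train_inputs, train_outputs, query, smoothing):
    """Gaussian-kernel weighted average of the training outputs."""
    dist_sq = [sum((a - b) ** 2 for a, b in zip(pattern, query))
               for pattern in train_inputs]
    # Subtracting the smallest distance leaves the weighted average unchanged
    # and keeps at least one weight equal to 1, which avoids dividing by zero.
    nearest = min(dist_sq)
    weights = [math.exp(-(d - nearest) / (2.0 * smoothing ** 2)) for d in dist_sq]
    weighted_sum = sum(w * y for w, y in zip(weights, train_outputs))
    return weighted_sum / sum(weights)
```

     With a very small smoothing factor the prediction approaches the output of the nearest training pattern; with a very large one it approaches the mean of all training outputs.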


Table 4. Actual SDR submission rate and neural network results


Actual SDR     Network SDR

190            127
210            207.0006866
194            206.9901733
221            230.0124054
249            226.0010834
201            234.9858093
215            239.2892456
171            174.9997711
164            195.082428
172            204.0213776
501            473.53998


     Figure 2 presents the GRNN results from Table 4 in a graphical format.


Figure 2. Operator A actual SDR submission rate and general regression neural network (GRNN) results
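

     The GRNN statistics can be checked directly against the values in Table 4. The sketch below assumes the conventional definitions: R squared as one minus the ratio of the sum of squared errors to the total sum of squares of the actual values, r as the Pearson correlation coefficient, and the percent rows as bands of absolute error relative to the actual value. These definitions are assumptions rather than documented formulas of the software used, but under them the Table 4 values reproduce the GRNN statistics in Table 3 to within rounding.

```python
import math

# Table 4 of this report: actual and GRNN-forecasted SDR Submission Rates.
actual = [190, 210, 194, 221, 249, 201, 215, 171, 164, 172, 501]
network = [127, 207.0006866, 206.9901733, 230.0124054, 226.0010834,
           234.9858093, 239.2892456, 174.9997711, 195.082428,
           204.0213776, 473.53998]


def fit_statistics(actual, predicted):
    """Summary statistics under the conventional definitions (assumed)."""
    n = len(actual)
    abs_errors = [abs(a - p) for a, p in zip(actual, predicted)]
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    sse = sum(e * e for e in abs_errors)
    ss_aa = sum((a - mean_a) ** 2 for a in actual)
    ss_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_ap = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    r = s_ap / math.sqrt(ss_aa * ss_pp)
    return {
        "R squared": 1.0 - sse / ss_aa,
        "r squared": r * r,
        "Mean squared error": sse / n,
        "Mean absolute error": sum(abs_errors) / n,
        "Min. absolute error": min(abs_errors),
        "Max. absolute error": max(abs_errors),
        "Correlation coefficient r": r,
    }


def percent_in_band(actual, predicted, low, high):
    """Percent of forecasts with absolute error of at least low and below high percent."""
    hits = 0
    for a, p in zip(actual, predicted):
        pct = 100.0 * abs(a - p) / a
        if low <= pct < high:
            hits += 1
    return 100.0 * hits / len(actual)


stats = fit_statistics(actual, network)
limits = [(0, 5), (5, 10), (10, 20), (20, 30), (30, float("inf"))]
bands = [percent_in_band(actual, network, low, high) for low, high in limits]
```

     Because R squared is measured against the spread of the actual values, the single large observation of 501 contributes most of the total sum of squares, which is worth keeping in mind when comparing R squared across production sets.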


Analysis


     The general guideline used to determine the adequacy of the forecasting models was the coefficient of multiple determination, R2. The backpropagation network and the Ward network models obtained poor results, with R2 values of 0.5031 and 0.4018, respectively. The general regression neural network SDR forecasting model showed good results in comparison: its R2 was 0.8962. For Operator A, the general regression neural network model produced the best forecasts.

     The predicted values from the forecasting model fit the actual data from the production set for Operator A reasonably well. The data from the production set were somewhat unstable, ranging from 164 to 501, yet ten of the eleven forecasts fell within 30% of the actual value, and the model forecasted about 474 for the largest actual value of 501 (Table 4). This indicates that the forecasting model created for Operator A was accurate on data it had not seen before.

     The SDR forecasting model used a mixed fleet composition of data, which in previous studies has not given favorable results, but in this study has shown some very promising results. The data were initially decomposed by aircraft make/model, but the results of the model were poor, suggesting that fleet homogeneity may not be a crucial factor in SDR trending for some operators. Further research is required to understand the impact that other factors, such as operating and maintenance policies, may have on SDR trending.


Conclusions

     The SDR forecasting model created does a very good job of forecasting the SDR Submission Rate for the selected operator. However, some improvements can be made to this model. Because of the limited availability of data, this SDR forecasting model has been very limited in the number of inputs. A model is difficult to create with sparse data. With improved availability of data, an improved SDR forecasting model can be created.

     This SDR forecasting model has shown promise in its results. This research effort has demonstrated how a neural network forecasting model can be utilized for the trending of equipment inoperability. More data are necessary to confirm the results of the SDR forecasting model. However, the techniques used for creating this SDR forecasting model did show good results. The idea of utilizing an SDR forecasting model to trend equipment inoperability has merit and should continue to be researched.



References

"Automated Trend Monitoring for Service Difficulty Reports" (1998), Transport Canada Civil Aviation

Burke, Laura and James P. Ignizio (1997), "A practical overview of neural networks." Journal of Intelligent Manufacturing. London: Chapman and Hall.

Luxhøj, James T. (1998), "Trending of Equipment Inoperability for Commercial Aircraft." Piscataway, NJ: Department of Industrial Engineering Rutgers University.

NeuroShell 2 (1996), Frederick, MD: Ward Systems Group, Inc.

Ross, Sheldon (1998), A First Course in Probability. Upper Saddle River, NJ: Prentice-Hall, Inc.

Russell, Stuart J. (1995), Artificial Intelligence: A Modern Approach. Englewood Cliffs: Prentice-Hall Inc.



Copyright 1999 by James T. Luxhøj
Current URL: http://rutgersscholar.rutgers.edu/volume01/luxhwang/luxhwang.htm