Tracking Field Reliability

The importance of tracking reliability within the engineering development process cannot be overstated. Reliability growth (or decline) can be measured with the Crow-AMSAA model (AMSAA = Army Materiel Systems Analysis Activity). It is based on the Weibull probability distribution, which is typically used for tracking the reliability of commercialized equipment and appliances. Where the Weibull distribution itself is continuous, Crow-AMSAA models a count of discrete failure events over time, and is known as a Non-homogeneous Poisson Process. While these are complicated names for mathematical ways of thinking, the practical upshot is that this methodology allows the developer to assess and correct wear components in newly developed machines.
In a previous post on Fatigue Analysis, I spoke about a preproduction field test of about 300 machines. The machines in this test had not undergone any reliability analysis before being placed into the field for the market test. The result was nearly twenty unique systemic failures, ranging from user-interface screen freezes caused by static discharge, seal chemical compatibility issues, and tolerance stack-up issues to inadequate equipment packaging for global transport. These failures amounted to an average of seven reactive work tasks per machine per year, which was excessively high ahead of a potential launch. Once a root cause analysis was completed and corrective actions were put into place, a method to validate the improvement was needed. This is where the Crow-AMSAA plots showed whether or not a corrective action actually addressed the issue. The plot also allowed the developer to project how many more test days would be necessary to prove that the machine could attain the desired reliability. The plot was constructed the following way:


1. The cumulative life of the entire fleet was calculated by adding up the machine-days of operation (each machine multiplied by its days in operation).
2. For each date of the test, the cumulative life was divided by the cumulative total of reactive work tasks to arrive at a "mean time between failures (or repairs)."
3. The mean time between repairs (MTBR) was plotted on the vertical axis against a time domain, either the date or the elapsed time of the test in days.
4. The data was then curve-fitted to a power function by a Levenberg-Marquardt algorithm (or equivalent). Since it is the MTBR function that is plotted, the exponent of the power function is subtracted from 1 to arrive at the Crow shape parameter, "beta".
5. "Beta" is the slope of the line for cumulative failures plotted in the time domain on a log-log scale. A beta < 1 means failures are accumulating more slowly over time, so the machine is becoming more reliable. In the graph for this machine test, the slope of the MTBR plot increased (beta < 1) once corrective actions were put into place about 100 days into the test, indicating that the corrective actions were working to resolve the issues.
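
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tooling used in the test: the function names and the daily-record layout are hypothetical, the fleet data is assumed to arrive as one record per test day, and the Levenberg-Marquardt fit is swapped for an ordinary least-squares line on log-log axes (one of the "equivalent" options for a pure power function). If the cumulative MTBR follows a * t^b, then beta = 1 - b, and the same fit can be extrapolated to estimate how many test days are needed to reach a target MTBR.

```python
import math

def cumulative_mtbr(daily_records):
    """Steps 1-3: cumulative machine-days divided by cumulative reactive tasks.

    daily_records: iterable of (elapsed_day, machines_operating, reactive_tasks),
    one entry per test day (assumed layout). Returns [(elapsed_day, mtbr), ...].
    """
    machine_days = 0
    tasks = 0
    points = []
    for day, machines, new_tasks in daily_records:
        machine_days += machines   # each operating machine adds one machine-day
        tasks += new_tasks
        if tasks > 0:              # MTBR is undefined before the first task
            points.append((day, machine_days / tasks))
    return points

def fit_power_law(points):
    """Step 4: least-squares fit of mtbr = a * t**b on log-log axes.

    Returns (a, b, beta), where beta = 1 - b is the Crow shape parameter.
    Elapsed days must be positive so that their logarithm exists.
    """
    xs = [math.log(t) for t, _ in points]
    ys = [math.log(m) for _, m in points]
    n = len(points)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(y_mean - b * x_mean)
    return a, b, 1.0 - b

def days_to_reach(target_mtbr, a, b):
    """Extrapolate the fitted curve to the elapsed day where MTBR hits the target."""
    if b <= 0:
        raise ValueError("MTBR is not growing, so the target is never reached")
    return (target_mtbr / a) ** (1.0 / b)
```

When a corrective action changes the trend partway through a test, as it did here at around day 100, it is probably more meaningful to fit only the points recorded after the change, since a single power function will not describe both regimes. The projection is an extrapolation of the fitted trend and should be read as an estimate rather than a guarantee.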

For more technical background, conduct a web search: "Crow-AMSAA."
