The Kaplan-Meier Estimator: Nonparametric Estimation of Reliability
What is the Kaplan-Meier estimator?
Kaplan-Meier estimation is an approach to reliability analysis that estimates survival (or failure) probabilities over time without assuming an underlying statistical distribution of failure times, which makes it a nonparametric method. The Kaplan-Meier estimator was developed for the analysis of morbidity and mortality data from medical subjects. This kind of analysis is frequently called survival analysis; it is common in medicine and the life sciences but generalizes to product reliability and any other time-to-event data.
How do you use the Kaplan-Meier estimator?
The Kaplan-Meier estimator is based on the empirical survival function. Because the probabilities are calculated from the data alone, they do not depend on a fitted probability distribution such as the Weibull. Because this estimator was developed in the medical field, it focuses on the probability of survival instead of failure or mortality. (For use in product reliability, we can use the fact that the probability of failure is 1 – the probability of survival.) The cumulative probability of survival at a certain time is the product of the conditional probabilities of surviving each observed time up to and including that time.
How do you calculate reliability or survival using Kaplan-Meier estimation?
To estimate survival probability at a specific time, follow these steps:
- Sort the life data in ascending order of time.
- Estimate the probability of surviving at the time of the ith observation as:
$\pi_i = \frac{n_\text{at risk} - n_\text{failed}}{n_\text{at risk}}$
Here, the number at risk is the number of units still under observation just before the ith observation, which is n – i + 1, where n is the total sample size and i counts from 1.
The number failed is 1 if row i is a failure and 0 if row i has survived (that is, the observation is censored).
- Calculate the Kaplan-Meier estimate of survival as $\rho_i = \prod_{j=1}^{i} \pi_j$
- For product reliability instead of survival, estimate failure probability as 1 – survival probability, that is, $1-\rho_i$
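The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `kaplan_meier`, its arguments, and the sample data are all hypothetical. It assumes each row is one unit, that the number at risk just before the ith sorted observation is n – i + 1 (counting i from 1), and that failures are placed before censored rows when times are tied.

```python
def kaplan_meier(times, failed):
    """Return one (time, survival, failure_probability) tuple per observation.

    times  : observed times (failure time or censoring time), one per unit
    failed : True if the unit failed at that time, False if it survived (censored)
    """
    # Step 1: sort by time; on tied times, list failures before censored rows
    # (an assumed convention, not stated in the text above).
    rows = sorted(zip(times, failed), key=lambda r: (r[0], not r[1]))
    n = len(rows)

    survival = 1.0
    results = []
    for i, (t, is_failure) in enumerate(rows, start=1):
        # Step 2: conditional probability of surviving the ith observation.
        n_at_risk = n - i + 1          # units still under observation just before row i
        n_failed = 1 if is_failure else 0
        pi_i = (n_at_risk - n_failed) / n_at_risk

        # Step 3: cumulative survival is the running product of the pi values.
        survival *= pi_i

        # Step 4: failure probability is 1 - survival.
        results.append((t, survival, 1.0 - survival))
    return results


if __name__ == "__main__":
    # Hypothetical life data: hours on test and whether each unit failed.
    hours = [340, 150, 1130, 560, 800]
    did_fail = [False, True, False, True, True]
    for t, s, f in kaplan_meier(hours, did_fail):
        print(f"t={t:>5}  survival={s:.4f}  failure={f:.4f}")
```

With this made-up data set, the survival estimate is 0.8 after the first failure, about 0.533 after the second, and about 0.267 after the third. The censored rows leave the estimate unchanged but reduce the number at risk for later rows.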
The Kaplan-Meier estimate is plotted as a step function: reliability is assumed to stay the same (i.e., flat) until the next failure. In product reliability, however, it is often more useful to plot the probability midway between failures instead of as a step. The midpoint gives a better estimate of reliability than the step end points, and it agrees with the maximum likelihood estimate for parametric models.
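One way to read "midway" is as the average of the step values just before and just after each failure. The sketch below takes that interpretation as an assumption. The function name `midpoint_estimates` and the input values are hypothetical.

```python
def midpoint_estimates(step_survival):
    """Average each Kaplan-Meier step value with the value that preceded it.

    step_survival : survival estimates just after each successive failure,
                    in time order (censored rows omitted).
    """
    midpoints = []
    previous = 1.0  # survival is 1 before the first failure
    for current in step_survival:
        # Midpoint of the vertical drop at this failure (assumed interpretation).
        midpoints.append((previous + current) / 2)
        previous = current
    return midpoints


if __name__ == "__main__":
    # Hypothetical step values after three failures.
    steps = [0.8, 0.8 * 2 / 3, 0.8 * 2 / 3 / 2]
    for s, m in zip(steps, midpoint_estimates(steps)):
        print(f"step={s:.4f}  midpoint={m:.4f}")
```

For these illustrative step values, the midpoints are 0.9, about 0.667, and 0.4, so each plotted point sits halfway down the drop at its failure time rather than at the bottom of it.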