What is Fault Tolerance?
Fault tolerance enables systems to continue operating in the event of one or more component failing. Some systems can carry on with full functionality depending on the severity of the failure while other system’s functionality may scale down proportionally to the severity of the failure. Traditionally, most systems are designed in a way that one failure leads to total system failure. This is why fault tolerance is a system property desired by many users– it allows for the availability of critical systems.