This is another blog in a continuing series discussing what makes for a resilient manufacturing organization. If you missed the earlier discussions, please read part 1, a summary of the topic; part 2, the introduction; part 3, Flexible Manufacturing Capacity & Scheduling; and part 4, Enterprise Asset Management.
Redundancy is one way of making a system resilient. Adding and installing backup equipment is a straightforward task, but an expensive one, and it should be done only after a proper cost-benefit analysis. Keep in mind that too many pieces of equipment tie up much-needed capital.
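As a rough illustration of what such a cost-benefit analysis might compare, here is a minimal Python sketch. The function names, the simple "failures x hours x cost per hour" model, and all of the figures are assumptions made for illustration only; a real analysis would use your own plant data and a fuller cost model.

```python
def expected_annual_downtime_cost(failures_per_year: float,
                                  downtime_hours_per_failure: float,
                                  cost_per_downtime_hour: float) -> float:
    """Expected yearly cost of downtime under a simple multiplicative model."""
    return failures_per_year * downtime_hours_per_failure * cost_per_downtime_hour


def net_annual_benefit_of_backup(failures_per_year: float,
                                 hours_without_backup: float,
                                 hours_with_backup: float,
                                 cost_per_downtime_hour: float,
                                 annual_cost_of_backup: float) -> float:
    """Downtime cost avoided by the backup, minus what the backup costs per year."""
    without = expected_annual_downtime_cost(
        failures_per_year, hours_without_backup, cost_per_downtime_hour)
    with_backup = expected_annual_downtime_cost(
        failures_per_year, hours_with_backup, cost_per_downtime_hour)
    return (without - with_backup) - annual_cost_of_backup


# Hypothetical figures: 2 failures a year, 24 hours of downtime without a backup,
# 1 hour with one, 5,000 per downtime hour, and 150,000 a year to own the backup.
benefit = net_annual_benefit_of_backup(2.0, 24.0, 1.0, 5000.0, 150000.0)
print(f"Net annual benefit of backup: {benefit:,.0f}")
```

A positive result suggests the backup pays for itself under these assumptions; a negative result suggests the capital is better spent elsewhere.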
NASA, in one of its preferred reliability practice documents, defines redundancy as multiple ways of performing a function and breaks redundancy into the following categories:
1. Operational or active ("fully on") redundancy
2. Standby redundancy
3. Like redundancy
4. Unlike redundancy
Operational redundancy (also called parallel redundancy) is the case where all the pieces of redundant equipment operate simultaneously rather than being switched on when needed. The redundant pieces of equipment are connected in such a manner that the failure of one will not disrupt operations, because the remaining equipment automatically continues the process. Switching out the failed piece of equipment is not required for the process to continue. As with any automatic redundancy switching, the process must be monitored and someone notified when a failure occurs. The failed piece of equipment must be replaced before the next failure; otherwise, when the remaining redundant equipment fails, the whole process fails. You could add redundant equipment for the redundant equipment, but eventually you will run out of equipment and the process will fail.
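A quick calculation can show both why operational redundancy helps and why the failed unit needs prompt replacement. The Python sketch below uses the standard textbook formula for parallel units, and it assumes the units fail independently and share the same reliability over the period in question. The function name and the 0.90 reliability figure are illustrative assumptions, not values from the NASA document.

```python
def parallel_reliability(unit_reliability: float, units: int) -> float:
    """Probability that at least one of `units` identical, independent units
    keeps working: 1 - (1 - r) ** n."""
    if not 0.0 <= unit_reliability <= 1.0:
        raise ValueError("unit_reliability must be between 0 and 1")
    if units < 1:
        raise ValueError("units must be at least 1")
    return 1.0 - (1.0 - unit_reliability) ** units


# Hypothetical unit that survives the period 90% of the time.
for n in (1, 2, 3):
    print(f"{n} unit(s) in parallel: {parallel_reliability(0.90, n):.4f}")
```

Under these assumptions, two units in parallel do much better than one. Once one of them has failed and has not been replaced, however, the process is back to the reliability of a single unit.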
With standby redundancy, the redundant equipment is non-operative until it is switched into the system on the failure of the primary piece of equipment. The switch can be made by either automated or manual means. The disadvantage of standby redundancy is that there is usually a period of disruption before the redundant piece of equipment can be brought into service. This type of redundancy is rarely satisfactory for critical systems in the modern manufacturing environment.
With like redundancy, identical pieces of equipment are used to perform the same function. These pieces of equipment are not installed on the line; when the process has a failure, they are swapped in for the equipment at fault. Of course, the time it takes to remove the failed equipment and replace it with a functioning piece adds to the total downtime of the process. This downtime is usually greater than the downtime associated with standby redundancy, so like redundancy is not normally used for critical systems.
A special case of like redundancy is where the redundancy is built in through parallel processes. In normal operations the parallel line is not seen as part of a redundancy system but as a production line in itself. Only when there is a problem with one line does the parallel line become one of the possibilities for recovery. Usually there is a changeover process in which the product being produced is transferred to the parallel line, with some associated production interruption. One thing to keep in mind is that these parallel lines are usually already in use producing product. An important part of this type of redundancy is therefore the scheduling and rescheduling of the production lines to account for the new circumstances.
A slight variation of this strategy is replacing or supplementing the failed equipment with unlike equipment and then running at reduced operational capacity. This again adds to the complexity of meeting the production schedule, so rescheduling will most likely be needed.
With unlike redundancy, non-identical pieces of equipment are used to perform the function of the equipment that is in trouble. In this case the replacement equipment is installed and made to perform the function of the troubled equipment. Additional set points, testing, and adjustments would have to be made on this equipment to make it perform like the original. All of this increases the amount of downtime.
Adding redundancy to operations is costly, so equipment and processes must be analyzed to identify the candidates for redundant equipment. A risk assessment has to be carried out in which the risk of specific failures of the equipment or system is judged. Priority should be given to those failures whose combination of probability and organizational effect is the highest. The assessment should include all the single points of failure (those areas where a single failure can interrupt the whole process). It should not look only at pieces of equipment, but should also include control boxes, signal wiring, controlling software, equipment controllers, and other supporting pieces of infrastructure. If the supporting infrastructure fails or is the cause of the failure, then investment in standby equipment will be of no use.
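One simple way to sketch this kind of prioritization is to score each failure as probability multiplied by impact and sort the list. The Python example below is only an illustration: the `FailureMode` class, the scoring rule, and every item and number in it are hypothetical, and a real risk assessment may weigh probability and organizational effect quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    name: str
    annual_probability: float      # assumed chance of the failure in a year
    impact_cost: float             # assumed organizational effect, as a cost
    single_point_of_failure: bool = False

    @property
    def risk_score(self) -> float:
        # Illustrative scoring rule: probability times impact.
        return self.annual_probability * self.impact_cost


def rank_by_risk(modes: list[FailureMode]) -> list[FailureMode]:
    """Return failure modes ordered from highest to lowest risk score."""
    return sorted(modes, key=lambda m: m.risk_score, reverse=True)


# Hypothetical entries, including supporting infrastructure, not only equipment.
candidates = [
    FailureMode("Signal wiring", 0.10, 300_000),
    FailureMode("Feed pump", 0.20, 500_000),
    FailureMode("Equipment controller", 0.05, 1_200_000, single_point_of_failure=True),
]

for mode in rank_by_risk(candidates):
    flag = " (single point of failure)" if mode.single_point_of_failure else ""
    print(f"{mode.name}: {mode.risk_score:,.0f}{flag}")
```

Note that the list deliberately mixes equipment with supporting infrastructure, since a failure in either can interrupt the process.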
Have you faced issues with creating a resilient organization? Is it possible to build a resilient organization in the chemical industry? Feel free to share stories about these questions, or about manufacturing in the chemicals industry in general, in the comment space below.
Or join the conversation at @SAP4Chemicals
NASA Preferred Reliability Practices – Redundancy in Critical Mechanical Systems – Practice Number GSE-3003, October 1995