Here is the final statement from Bluesquare regarding the outage last Thursday:
“This is a Reason for Outage Report with details of the power supply failure in BS2/3 at BlueSquare Data Services Ltd.
At 10:06 on Thursday 17th March, one of the six UPS modules located in BlueSquare 2/3 suffered a critical component failure, which resulted in a dead short on the output side (critical load side) of the UPS. The failure also caused the failed UPS module to release some smoke, which activated the fire alarm and led to the fire service attending. Once the fire service was satisfied with the situation, we were able to restore power to the site via the generators, with the UPS system bypassed, whilst we investigated the fault further.
Because the short circuit occurred on the output side of the UPS, the other UPS modules immediately went into an overload condition, which switched all modules into bypass mode, as per the design of the system. The overload then transferred to the raw mains and tripped the main incomer to the site. This ended the overload condition, but power to the site was lost. The UPS manufacturers then checked all the remaining UPS modules to ensure the same component was within specification, and fully tested each UPS module, replacing some components where necessary. No further faults were found on the remaining UPS modules. Load was switched back to full UPS protection at approx 02:15, and building load was transferred back from the generators to utility mains at approx 02:25.
Due to the scale of the failure, we have commissioned an independent organisation to forensically examine the failed UPS module. This work is scheduled to be completed next week, and we will provide further details once we receive their report. This was an extremely unusual type of failure, and the manufacturers have not experienced such a problem before, despite over 3,000 similar UPS units being deployed. This suggests there isn’t an inherent design problem in the units, but we will not reach any conclusions until the forensic examination is complete.
The failed UPS module will be replaced within the next 4 weeks, and until then we will remain at ‘N’ redundancy level at BlueSquare 2/3. Further updates will be provided before this replacement work takes place.
A number of customers have asked why this failure could occur when we operate an N+1 UPS architecture. The reason is that all six UPS modules in BlueSquare 2/3 are paralleled together as one large UPS system. BlueSquare 2/3 only requires 5 modules to hold the critical load to the site; the additional unit provides redundancy in the event of a UPS module failure. In this case, however, the failure was on the common critical load side of the UPS (the same output that feeds the distribution boards, which in turn feed the racks). Because all the UPS modules are paralleled together onto that output, the fault had the effect of causing all UPS modules to go down.
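
The N+1 reasoning in the report can be illustrated with a small sketch. This is not part of Bluesquare's statement and is not a model of their actual equipment: the class `ParallelUps` and its methods are hypothetical names, and the only figures taken from the report are the six installed modules and the five needed to hold the critical load. It shows why a spare module covers the loss of one module but cannot help when the fault sits on the shared output that every module feeds.

```python
from dataclasses import dataclass


@dataclass
class ParallelUps:
    """Toy model of UPS modules paralleled onto one common output bus.

    Hypothetical illustration only; names and behaviour are assumptions.
    """
    modules_installed: int  # 6 in the report
    modules_required: int   # 5 needed to hold the critical load

    def redundancy_level(self, failed_modules: int = 0) -> str:
        """Return 'N+k', 'N', or 'insufficient' for the healthy modules."""
        spare = self.modules_installed - failed_modules - self.modules_required
        if spare < 0:
            return "insufficient"
        return "N" if spare == 0 else f"N+{spare}"

    def load_supported(self, failed_modules: int = 0,
                       output_bus_shorted: bool = False) -> bool:
        """Can the UPS system still carry the critical load?"""
        # A short on the common output affects every paralleled module,
        # so the number of healthy modules no longer matters.
        if output_bus_shorted:
            return False
        healthy = self.modules_installed - failed_modules
        return healthy >= self.modules_required


system = ParallelUps(modules_installed=6, modules_required=5)

# Normal operation, and an ordinary single-module failure:
normal_level = system.redundancy_level()                  # "N+1"
degraded_level = system.redundancy_level(failed_modules=1)  # "N"
survives_module_loss = system.load_supported(failed_modules=1)

# The scenario described in the report: one module fails in a way that
# shorts the shared critical-load output.
survives_bus_short = system.load_supported(failed_modules=1,
                                           output_bus_shorted=True)

print(normal_level, degraded_level, survives_module_loss, survives_bus_short)
```

Under these assumptions, losing one module drops the system from N+1 to N while the load stays up, which matches the interim state described until the replacement module is fitted. A short on the common output takes the load down regardless of how many spare modules are installed.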