InformationWeek: RIM Outage Explanation Leaves Big Questions

RIM did more dancing around the issues than frank sharing as it tried to explain the BlackBerry outage, leaving CIOs to speculate. And goodwill’s running short.

After an almost four-day outage of RIM’s BlackBerry service, RIM’s co-CEOs gave a status update Thursday morning. Mike Lazaridis delivered what appeared to be a prepared statement, followed by questions, largely from the media. The way that RIM reacted to the outage will likely shape the company’s fortunes for the foreseeable future. And on the key question, the future health of RIM’s network and its ability to scale, too many questions went unanswered.

Lazaridis started out with an apology and something of a promise. “You expect better of us, I expect better of us,” he said. “We are, and will take every action feasibly, to minimize the risk of this happening again.”

Apparently, one switch’s failure with a bonked-up backup system had such a tremendous “ripple effect” that it caused a world-wide outage for days.

The question that many CIOs and CTOs are asking is: if the architecture was planned out right, and testing occurred on a reasonably diligent basis, how exactly could that happen?

More of the InformationWeek article from Jonathan Feldman

Alex Carroll

Managing Member at Lifeline Data Centers
Alex, co-owner, is responsible for all real estate, construction and mission critical facilities: hardened buildings, power systems, cooling systems, fire suppression, and environmentals. Alex also manages relationships with the telecommunications providers and has an extensive background in IT infrastructure support, database administration and software design and development. Alex architected Lifeline’s proprietary GRCA system and is hands-on every day in the data center.