Procedures could be key to reducing data center downtime

Anyone in the data center industry has heard the statistics about the heavy costs of data center downtime, with some estimates putting the rate at an average of $7,900 per minute. But where does the fault lie? And how do you minimize your risks?

While you must contend with cyberattacks, human error is also a significant cause of downtime, according to Steven Shapiro, mission critical practice lead at Morrison Hershfield, an engineering firm with a history of studying data center practices. Shapiro said one of the biggest culprits in downtime caused by human error is a company’s failure to have documented procedures.

When dealing with downtime caused by human error, you could practically eliminate those risks through training and robust procedures. “If the training is there, and the procedures are there, we find a facility that has almost no human error associated with failure,” Shapiro said.

To reduce the risk of downtime caused by human error, follow these guidelines for documenting and implementing procedures:

  1. Train numerous staffers. Here’s a situation that arises in many departments: You have a senior staff member who can deal with issues quickly and effectively. While this is a “nice to have,” it doesn’t make for a good backup plan. Unless other staff members are comprehensively trained to take on all tasks, you run the risk of a critical situation if the senior staff member isn’t available, or worse yet, decides to part ways with your company.
  2. Document procedures in writing. While it can be laborious and time-consuming, getting all policies and procedures in writing is an essential step in ensuring that procedures can be quickly accessed for training and reference. The procedures should include the following:

Standard Operating Procedure (SOP), which outlines common operating procedures for easy reference.

Method of Procedure (MOP), which provides procedures for all tasks in the data center.

Emergency Operating Procedure (EOP), which provides procedures for emergencies, from safety procedures to those followed in a disaster recovery situation.

At Lifeline Data Centers, a wholesale colocation center operating in the Midwest, we believe that a comprehensive solution helps our clients maintain a high rate of uptime. Take a virtual tour of our facility and let’s talk about how we can help you with your data center solutions.

Rich Banta

Managing Member at Lifeline Data Centers
Rich is responsible for Compliance and Certifications, Data Center Operations, Information Technology, and Client Concierge Services. He has an extensive background in server and network management, large-scale wide-area networks, storage, business continuity, and monitoring. A former CTO of a major health care system, Rich is hands-on every day in the data centers. He also holds many certifications, including: CISA – Certified Information Systems Auditor; CRISC – Certified in Risk and Information Systems Control; CDCE – Certified Data Center Expert; CDCDP – Certified Data Center Design Professional.