Top 3 Challenges To Creating an Effective DRP
Whatever the industry, the days when a business could keep operating in spite of a major IT failure are over. The Business Continuity Plan (BCP) has therefore become a strategic asset for executives facing mounting risks, whether criminal, technological, climatic or terrorist.
As part of the Business Continuity Plan, the Disaster Recovery Plan (DRP) aims to sustain IT services and reduce downtime to an absolute minimum. DRP objectives are specific to each company and are mostly measured with two metrics: the Recovery Time Objective (RTO), which is the maximum time allowed before service is recovered, and the Recovery Point Objective (RPO), which is the maximum amount of data loss, usually expressed as a period of time, that the business can accept.
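To make the two metrics concrete, here is a minimal sketch that checks a single incident against its objectives. The function name, the timestamps and the 4-hour RTO and 15-minute RPO are illustrative assumptions, not figures from any real plan.

```python
from datetime import datetime, timedelta

def meets_objectives(last_backup, outage_start, service_restored, rto, rpo):
    """Return (rto_met, rpo_met) for one incident (hypothetical helper)."""
    # Downtime is measured from the start of the outage to restored service.
    downtime = service_restored - outage_start
    # Data written after the last usable backup is lost, so the loss
    # window runs from that backup to the start of the outage.
    data_loss_window = outage_start - last_backup
    return downtime <= rto, data_loss_window <= rpo

# Illustrative values only.
result = meets_objectives(
    last_backup=datetime(2024, 1, 10, 8, 30),
    outage_start=datetime(2024, 1, 10, 9, 0),
    service_restored=datetime(2024, 1, 10, 12, 0),
    rto=timedelta(hours=4),
    rpo=timedelta(minutes=15),
)
print(result)  # (True, False)
```

In this example, service is back after 3 hours, within the RTO, but 30 minutes of data are lost, which exceeds the RPO. The two objectives are independent and have to be verified separately.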
Disaster Recovery, a leap into the unknown
While most companies are well versed in designing a DRP, triggering actual recovery operations is, most of the time, a big leap into the unknown. A survey by Forrester and the Disaster Recovery Journal delivers interesting insights into companies’ readiness to handle a disaster:
- Only 18% of the companies surveyed believe they are fully prepared to trigger their disaster recovery processes
- More than 45% say they have no central coordination for their disaster recovery processes
- Only 19% report being able to test their disaster recovery processes more than once a year, and nearly 21% never test them
However, for a business to survive a catastrophic event, the IT organization must respond fast. The RTO sets the acceptable limit before business activities are severely impacted. Problem detection, the decision to enact DR, execution of the recovery procedures, system checks after recovery: the total duration of these operations must stay within the RTO agreed in the DRP.
A successful disaster recovery therefore depends mainly on how quickly and how reliably the steps formalized in the DRP are run. It is critical to avoid the risks of slow execution and human error.
The 3 challenges of an effective DRP
We all know that a disaster always strikes when and where we least expect it. Organizations must remain cautious, and above all very realistic, about their ability to get access to the right skills, in the right place, at the right time. Whether the cause is a natural disaster or an influenza epidemic, experience has shown that even the most elaborate on-call systems can be undermined. In the event of a regional-scale fire or flood, a significant proportion of the people supporting the execution of the DRP might be busy evacuating the danger zone. Such situations are totally unpredictable, and they are likely to slow down the recovery process, either because the skills needed to configure a system are not at hand or, more trivially, because nobody can reach the holder of the admin password.
The DRP’s worst enemy is probably change, or at least change made to the infrastructure and applications after the DRP has been established. If the plan is not updated on a regular basis, inconsistencies will affect the recovery procedures. In the best case, the restart of business activities will fail; in the worst case, the business will restart with corrupted data. The main difficulty lies in centralizing changes so that the DRP procedures can be updated easily. This problem often exceeds what classical CMDB-based governance can handle, since it involves very granular procedures such as cleanup or restart scripts. These procedures are most often scattered throughout the information system and are frequently poorly referenced.
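One simple way to notice that scattered scripts have drifted since the plan was last validated is to record a checksum for each of them and compare later. The sketch below assumes a hypothetical manifest mapping script paths to SHA-256 digests; the file names, script contents and the `find_drift` helper are invented for illustration and are not part of any particular DRP tool.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's content."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def find_drift(manifest: dict[str, str], root: Path) -> dict[str, str]:
    """Compare checksums recorded at validation time with the scripts on disk.

    Returns {relative_path: "missing" | "changed"} for every script that
    no longer matches the manifest (hypothetical helper).
    """
    drift = {}
    for rel_path, recorded in manifest.items():
        script = root / rel_path
        if not script.is_file():
            drift[rel_path] = "missing"
        elif sha256_of(script) != recorded:
            drift[rel_path] = "changed"
    return drift

# Demo in a temporary directory; the scripts are only written, never run.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "restart_db.sh").write_text("systemctl restart db\n")
    (root / "cleanup.sh").write_text("rm -rf /var/tmp/app\n")
    # Manifest as it would be recorded when the DRP was last validated.
    manifest = {n: sha256_of(root / n) for n in ("restart_db.sh", "cleanup.sh")}
    # Later changes made without updating the plan.
    (root / "restart_db.sh").write_text("systemctl restart db-cluster\n")
    (root / "cleanup.sh").unlink()
    drift = find_drift(manifest, root)

print(drift)  # {'restart_db.sh': 'changed', 'cleanup.sh': 'missing'}
```

A check like this only flags that a referenced procedure has changed or disappeared. Deciding whether the plan is still valid remains a review task.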
During a crisis, it is essential to focus your best resources on what is really important. This seems obvious, because not all IT assets have the same value or the same criticality. However, today’s systems are more complex, interconnected and dependent on each other than ever before. It is therefore difficult to coordinate teams effectively in the moment of intense stress that a disaster brings. Proper disaster recovery execution requires clear visibility of the order and progress of operations, both to make the right decisions and to provide real-time information to executive management and the wider business.
Another survey, by Gartner, shows that nearly three companies out of four have not yet automated the recovery procedures involved in their DRP. These companies rely almost entirely on human resources to restart their business activities. However, automation remains the solution of choice to compensate for missing key resources, centralize changes and effectively manage system and application restart dependencies, regardless of their complexity. Process automation also relieves DRP stakeholders of manual and repetitive tasks, offering better prospects for regularly testing recovery procedures.
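As an illustration of how automation can handle restart dependencies, the sketch below derives a restart order from a dependency map using a topological sort (Python's standard `graphlib` module, available from Python 3.9). The system names and their dependencies are assumptions made up for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical map: each system lists the systems that must be up before it.
dependencies = {
    "storage": set(),
    "network": set(),
    "database": {"storage", "network"},
    "app_server": {"database"},
    "web_frontend": {"app_server"},
}

def restart_waves(deps):
    """Group systems into waves; every system in a wave can restart in
    parallel once all earlier waves have completed."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable display order
        waves.append(ready)
        sorter.done(*ready)  # mark the wave as restarted to unlock the next
    return waves

waves = restart_waves(dependencies)
for number, wave in enumerate(waves, start=1):
    print(f"Wave {number}: {', '.join(wave)}")
```

Grouping the restart into waves also gives the crisis team a simple progress indicator, since each completed wave is a step that can be reported to management.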
IT organizations face a proliferation of risks of all kinds and, like a phoenix, must ensure their own resilience. While the DRP is a great asset for formalizing the means and goals of restarting business activities, the challenge of running recovery procedures efficiently in a moment of great tension must not be underestimated. As is often the case, automation finds its place as an indispensable tool, an insurance policy that helps ensure the business is impacted as little as possible in the event of a disaster.