Disaster recovery plans explained

Develop a disaster recovery plan that boosts your cyber resilience and recovery capability

 

What is a disaster recovery plan and how does it work?

A disaster recovery plan (DR or DRP) is a formal document created by an organization that contains detailed instructions on how to respond to unplanned incidents such as natural disasters, power outages, cyber attacks and any other disruptive events. The plan contains strategies to minimize the effects of a disaster, so an organization can continue to operate or quickly resume key operations.

Disruptions can lead to lost revenue, brand damage and dissatisfied customers, and the longer the recovery time, the greater the adverse business impact. A good disaster recovery plan should therefore enable rapid recovery from disruptions, whatever their source.

A DR plan is more focused than a business continuity plan and does not necessarily cover all contingencies for business processes, assets, human resources and business partners.

A successful DR solution typically addresses all types of operational disruption, not just the major natural or man-made disasters that make a location unavailable. Disruptions can include power outages, telephone system outages, and temporary loss of access to a facility due to a bomb threat, a "possible fire," or a low-impact, non-destructive fire, flood or other event. A DR plan should be organized by type of disaster and location. It must contain scripts (instructions) that can be implemented by anyone.

Before the 1970s, most organizations only had to concern themselves with making copies of their paper-based records. Disaster recovery planning gained prominence during the 1970s as businesses began to rely more heavily on computer-based operations. At that time, most systems were batch-oriented mainframes. Another offsite mainframe could be loaded from backup tapes, pending recovery of the primary site.

In 1983 the U.S. government mandated that national banks must have a testable backup plan. Many other industries followed as they understood the significant financial losses associated with long-term outages.

By the 2000s, businesses had become even more dependent on digital online services. With the introduction of big data, cloud, mobile and social media, companies had to cope with capturing and storing massive amounts of data at an exponential rate. DR plans had to become much more complex to account for much larger amounts of data storage from a myriad of devices. The advent of cloud computing in the 2010s helped to alleviate this disaster recovery complexity by allowing organizations to outsource their disaster recovery plans and solutions.

Another current trend that emphasizes the importance of a detailed disaster recovery plan is the increasing sophistication of cyber attacks. Industry statistics show that many attacks stay undetected for well over 200 days. With so much time to hide in a network, attackers can plant malware that finds its way into the backup sets, infecting even the recovery data. Attacks may stay dormant for weeks or months, allowing malware to propagate throughout the system. Even after an attack is detected, it can be extremely difficult to remove malware that has spread so widely throughout an organization.
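
To make the effect of long dwell times concrete, here is a minimal sketch that picks the most recent backup taken before an estimated compromise date. The function name, the backup dates and the 200-day dwell time are illustrative assumptions, not part of any particular backup product, and a real investigation would rely on forensic evidence rather than a fixed dwell-time estimate.

```python
from datetime import date, timedelta

def last_clean_backup(backup_dates, detected_on, dwell_days):
    """Return the newest backup taken before the estimated compromise date.

    Hypothetical helper: assumes the attacker entered `dwell_days` before
    detection and that every backup taken since then may contain malware.
    Returns None when no retained backup predates the compromise.
    """
    compromised_on = detected_on - timedelta(days=dwell_days)
    clean = [d for d in backup_dates if d < compromised_on]
    return max(clean) if clean else None

# Illustrative data: one backup retained per month for a year.
backups = [date(2023, month, 1) for month in range(1, 13)]
detected = date(2023, 12, 15)

print(last_clean_backup(backups, detected, dwell_days=200))  # 2023-05-01
```

If only the last two monthly backups had been retained, the function would return `None`, which is the situation the paragraph above warns about: every available recovery point may already be infected.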

Business disruption due to a cyber attack can have a devastating impact on an organization. For instance, a cyber outage at a package delivery company can disrupt operations across its supply chain, leading to financial and reputational loss. In today’s digitally dependent world, every second of that disruption counts.

 

Why is a disaster recovery plan important?

The need to deliver a superior customer experience and better business outcomes is fueling enterprise adoption of hybrid multicloud. Hybrid multicloud, however, creates infrastructure complexity and potential risks that require specialized skills and tools to manage. As a result of this complexity, organizations suffer frequent outages and system breakdowns, compounded by cyber attacks, skills shortages and supplier failures. The business impact of outages or unplanned downtime is extremely high, more so in a hybrid multicloud environment. Delivering resiliency in a hybrid multicloud environment requires a disaster recovery plan that includes specialized skills, an integrated strategy and advanced technologies, including orchestration for data protection and recovery. Comprehensive enterprise resiliency with orchestration technology helps organizations mitigate business continuity risks in hybrid multicloud and achieve their digital transformation goals.

Other key reasons why a business would want a detailed and tested disaster recovery plan include:

  • To minimize interruptions to normal operations.
  • To limit the extent of disruption and damage.
  • To minimize the economic impact of the interruption.
  • To establish alternative means of operation in advance.
  • To train personnel in emergency procedures.
  • To provide for smooth and rapid restoration of service.

To meet today's expectation of continuous business operations, organizations must be able to restore critical systems within minutes, if not seconds, of a disruption.

How are organizations using disaster recovery plans?

Many organizations struggle to evolve their DR plan strategies quickly enough to address today’s hybrid-IT environments and complex business operations. In an always-on, 24/7 world, an organization can gain a competitive advantage, or lose market share, depending on how quickly it can recover from a disaster and restore core business services.

Some organizations use external disaster recovery and business continuity consulting services to address their needs for assessments, planning and design, implementation, testing and full resiliency program management.

Proactive services are also available to help businesses overcome disruptions with flexible, cost-effective IT DR solutions.

With the growth of cyber attacks, companies are moving from a traditional, manual recovery approach to an automated, software-defined resiliency approach. Other companies turn to cloud-based backup services that provide continuous replication of critical applications, infrastructure, data and systems for rapid recovery after an IT outage. There are also virtual server options that protect critical servers in real time, enabling rapid recovery of applications to keep businesses operational during periods of maintenance or unexpected downtime.

For a growing number of organizations, the solution is resiliency orchestration, a cloud-based approach that uses disaster recovery automation and a suite of continuity-management tools designed specifically for hybrid-IT environments and for protecting business process dependencies across applications, data and infrastructure components. The solution increases the availability of business applications and lets companies access high-level or in-depth intelligence on Recovery Point Objective (RPO), Recovery Time Objective (RTO) and the overall health of IT continuity from a centralized dashboard.
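
As a rough illustration of what such a dashboard check might compute, the sketch below compares a replication interval and a measured restore time against RPO and RTO targets. It assumes the common definitions (RPO as the maximum tolerable data loss expressed as time, RTO as the maximum tolerable time to restore service). The class, function and figures are hypothetical and do not represent any vendor's API.

```python
from dataclasses import dataclass

@dataclass
class RecoveryObjectives:
    rpo_minutes: float  # maximum tolerable data loss, expressed as time
    rto_minutes: float  # maximum tolerable time to restore service

def meets_objectives(objectives, replication_interval_minutes, measured_restore_minutes):
    """Compare observed figures against the targets.

    Assumption: worst-case data loss equals the gap between two
    replication points, so that gap must not exceed the RPO.
    """
    return {
        "rpo_met": replication_interval_minutes <= objectives.rpo_minutes,
        "rto_met": measured_restore_minutes <= objectives.rto_minutes,
    }

# Illustrative targets and measurements.
targets = RecoveryObjectives(rpo_minutes=15, rto_minutes=60)
result = meets_objectives(targets, replication_interval_minutes=5, measured_restore_minutes=90)
print(result)  # {'rpo_met': True, 'rto_met': False}
```

In this example replication is frequent enough for the RPO, but the last measured restore took longer than the RTO allows, which is the kind of gap a recovery test is meant to reveal.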

In today’s always-on world, your business can’t afford downtime, which can result in revenue loss, reputational damage and regulatory penalties. Learn how Kyndryl can help transform your IT recovery management through automation to simplify the disaster recovery process, increase workflow efficiency, and reduce risk, cost and system testing time.

How is a disaster recovery plan used in industry?

Hyundai Heavy Industries (HHI) faced the reality of disaster risk when a 5.8 magnitude earthquake struck in 2016. Because the company’s backup center was located near its headquarters in Ulsan City, Korea, the earthquake served as a wake-up call for HHI to examine its disaster recovery systems and determine its preparedness for a full range of potential disruptions. The event showed just how close a natural disaster could come to damaging Hyundai's mission-critical IT infrastructure. The IT leadership responded quickly, working with Kyndryl Business Resiliency Services to implement a robust disaster recovery solution with a remote data center.

What are the key steps of a disaster recovery plan?

The objective of a DR plan is to ensure that an organization can respond to a disaster or other emergency that affects information systems and minimize the effect on business operations. Kyndryl has a template for producing a basic DR plan. The following are the suggested steps as found in the DR template. Once you have prepared the information, it is recommended that you store the document in a safe, accessible location off site.

  1. Major goals: The first step is to broadly outline the major goals of a disaster recovery plan.
  2. Personnel: Record your data processing personnel. Include a copy of the organization chart with your plan.
  3. Application profile: List applications and note whether each is critical and whether it is a fixed asset.
  4. Inventory profile: List the manufacturer, model, serial number, cost and whether each item is owned or leased.
  5. Information services backup procedures: Include information such as: “Journal receivers are changed at ________ and at ________.” And: “Changed objects in the following libraries and directories are saved at ____.”
  6. Disaster recovery procedures: For any DR plan, these three elements should be addressed:
    • Emergency response procedures to document the appropriate emergency response to a fire, natural disaster or any other emergency, in order to protect lives and limit damage.
    • Backup operations procedures to ensure that essential data processing operational tasks can be conducted after the disruption.
    • Recovery actions procedures to facilitate the rapid restoration of a data processing system following a disaster.
  7. DR plan for mobile site: The plan should include a mobile site setup plan, a communication disaster plan (including the wiring diagrams) and an electrical service diagram.
  8. DR plan for hot site: An alternate hot site plan should provide for an alternative (backup) site. The alternate site has a backup system for temporary use while the home site is being reestablished.
  9. Restoring the entire system: To get your system back to the way it was before the disaster, use the procedures on recovering after a complete system loss in Systems management: Backup and recovery.
  10. Rebuilding process: The management team must assess the damage and begin the reconstruction of a new data center.
  11. Testing the disaster recovery and cyber recovery plan: In successful contingency planning, it is important to test and evaluate the DR plan regularly. Data processing operations are volatile in nature, resulting in frequent changes to equipment, programs and documentation. These changes make it critical to treat the plan as a changing document.
  12. Disaster site rebuilding: This step should include a floor plan of the data center, the current hardware needs and possible alternatives, as well as the data center square footage, power requirements and security requirements.
  13. Record of plan changes: Keep your DR plan current. Keep records of changes to your configuration, your applications and your backup schedules and procedures.
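
Some of the records described in the steps above lend themselves to a structured format. The following minimal sketch models the application profile (step 3), the inventory profile (step 4) and the record of plan changes (step 13). All class names, field names and sample values are hypothetical and are not taken from the Kyndryl template itself.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ApplicationProfile:  # step 3
    name: str
    critical: bool
    fixed_asset: bool

@dataclass
class InventoryItem:  # step 4
    manufacturer: str
    model: str
    serial_number: str
    cost: float
    owned: bool  # False means the item is leased

@dataclass
class PlanChange:  # step 13
    changed_on: date
    description: str

def critical_applications(profiles):
    """Return the names of applications flagged as critical."""
    return [p.name for p in profiles if p.critical]

def latest_change(changes):
    """Return the most recent plan change, or None if none are recorded."""
    return max(changes, key=lambda c: c.changed_on) if changes else None

# Illustrative records only.
profiles = [
    ApplicationProfile("Order processing", critical=True, fixed_asset=False),
    ApplicationProfile("Internal wiki", critical=False, fixed_asset=False),
]
changes = [
    PlanChange(date(2024, 1, 10), "Updated backup schedule"),
    PlanChange(date(2024, 3, 2), "Added new application server"),
]

print(critical_applications(profiles))     # ['Order processing']
print(latest_change(changes).description)  # Added new application server
```

Keeping these records in a structured form makes it easier to check, for example, which critical applications exist and when the plan was last updated.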