Types of Disaster Recovery and Business Continuity Testing: A Comprehensive Overview

0:03 - 0:05

we've spent a long time or a couple of
0:05 - 0:08

videos now discussing our DR testing and
0:08 - 0:10

processes and our business continuity
0:10 - 0:13

testing and processes let's jump in and
0:13 - 0:14

take a look at some of the testing
0:14 - 0:17

methodologies across both, with respect
0:17 - 0:20

to our disaster recovery
0:20 - 0:22

and business continuity and what's
0:22 - 0:24

available to us so in order for us to
0:24 - 0:26

test and plan let's take a look at what
0:26 - 0:28

some of those are I'm going to separate
0:28 - 0:30

this out into a couple of areas here
0:30 - 0:32

and then we'll just sort of work through
0:32 - 0:33

this because there's a couple that I
0:33 - 0:35

want to sort of outline now the first
0:35 - 0:38

one is walkthroughs now we can outline a
0:38 - 0:40

couple of walkthroughs as we just
0:40 - 0:42

finished writing that out
0:42 - 0:45

walkthroughs is basically running our
0:45 - 0:47

tabletop exercises or scenarios right so
0:47 - 0:50

these are all sort of theory based and
0:50 - 0:53

they're sort of tabletop scenarios and
0:53 - 0:54

you may have a bunch of people that sort
0:54 - 0:56

of come around
0:56 - 0:59

that's horrible isn't it tabletop
0:59 - 1:02

scenarios I should write 10 for sure and
1:02 - 1:03

these are a bunch of people that we can
1:03 - 1:05

maybe get together in a boardroom so
1:05 - 1:07

we've got a table here and we can all
1:07 - 1:09

just come around here you might have a
1:09 - 1:11

few people and basically what we can do
1:11 - 1:13

is we basically give scenarios on how
1:13 - 1:14

we're going to handle and these are
1:14 - 1:17

obviously you know your
1:17 - 1:19

okay I can't draw so I'm just going to
1:19 - 1:21

remove that all together and just write
1:21 - 1:24

boardroom we're in a boardroom
1:24 - 1:27

okay that's good boardroom great uh so
1:27 - 1:29

basically when we're in the boardroom we
1:29 - 1:31

bring everyone around us and do and
1:31 - 1:33

perform either from the DR team and we
1:33 - 1:35

sit around a tabletop and then you know
1:35 - 1:37

the leader of that you know who's
1:37 - 1:39

driving that who's got the initiation or
1:39 - 1:41

the actual delivery focus of that exact
1:41 - 1:43

scenario and we basically walk through
1:43 - 1:45

that scenario so they'll say okay well
1:45 - 1:46

we're gonna
1:46 - 1:49

walk through X
1:49 - 1:51

well walk me through this scenario on
1:51 - 1:53

how you're going to handle uh this
1:53 - 1:55

scenario and then you've got you know
1:55 - 1:57

your DR team which are your IT people
1:57 - 1:58

that are responsible for that you may have
1:58 - 2:00

your network team
2:00 - 2:01

you know your network engineering team
2:01 - 2:05

you have your systems guys girls uh you
2:05 - 2:08

may have your you know your change
2:08 - 2:10

management team there you may have you
2:10 - 2:12

know your engineering maybe you've got
2:12 - 2:15

your dev teams, a dev
2:15 - 2:17

engineering team you know and so on
2:17 - 2:19

right so you've got your responsible
2:19 - 2:20

IT people that are going to be
2:20 - 2:21

responsible for that process of
2:21 - 2:23

restoration should we go down that and
2:23 - 2:26

we'll run through you know uh you know
2:26 - 2:28

site goes offline and you know um you
2:28 - 2:30

know we've got three sites
2:30 - 2:33

to give an example we've got three sites
2:33 - 2:35

now I'm just sort of spitballing here so
2:35 - 2:36

bear with me if I don't get anything
2:36 - 2:38

right but I'm just going to walk through
2:38 - 2:42

this so we've got three sites alrighty
2:42 - 2:47

site X goes down so this is location
2:48 - 2:50

I don't know Brisbane now Brisbane
2:50 - 2:53

branch has gone offline well what do we
2:53 - 2:54

do
2:54 - 2:56

okay this is Sydney
2:56 - 2:58

this is Melbourne and obviously all
2:58 - 2:59

these are all connected and such and
2:59 - 3:01

then we've got our backbones back here
3:01 - 3:03

obviously they're connecting things as
3:03 - 3:04

well so obviously these are all our
3:04 - 3:07

backbone infrastructure well uh Brisbane
3:07 - 3:09

location goes offline for whatever it is
3:09 - 3:12

and someone you know walks into the data
3:12 - 3:13

center, into the comms room, and they've
3:13 - 3:16

tripped over the cable and now our data
3:16 - 3:18

center is offline well what do we do and
3:18 - 3:19

then we walk through that scenario on
3:19 - 3:21

how someone's going to recover from that
3:21 - 3:23

situation could be minor it could be
3:23 - 3:24

significant depending on the scenario so
3:24 - 3:26

basically the walkthrough is us walking
3:26 - 3:29

through it's the least amount of risk
3:29 - 3:31

with our DR and obviously our business
3:31 - 3:33

continuity testing because we're talking
3:33 - 3:35

but we're not actually doing anything so
3:35 - 3:37

again we're going to involve
3:37 - 3:39

the you know the relevant people and
3:39 - 3:40

parties and then we're going to walk
3:40 - 3:41

through that scenario based on those
3:41 - 3:45

SMEs and then obviously their knowledge
3:45 - 3:46

of how they're going to recover and
3:46 - 3:48

bring the systems back up to normal in
3:48 - 3:51

obviously a time-sensitive approach so
3:51 - 3:53

that's walkthroughs
3:53 - 3:54

uh to the other side we've got
3:54 - 3:57

simulation so we could run actual
3:57 - 3:59

simulations now simulation could be a
3:59 - 4:01

physical walkthrough or basically what
4:01 - 4:04

we call something like a mock event
4:04 - 4:06

and we give it a scenario we give a very
4:06 - 4:09

specific scenario and we walk through
4:09 - 4:11

what we're actually going to do so we
4:11 - 4:13

simulate what we're going to do if we're
4:13 - 4:15

using backups or the restoration of our
4:15 - 4:17

backups process well we would log into
4:17 - 4:19

the backup app server so if we're using
4:19 - 4:21

a specific vendor we're saying okay well
4:21 - 4:22

we're going to log into this server
4:22 - 4:25

we're going to click our restoration you
4:25 - 4:26

know and then our process is you know
4:26 - 4:29

restore hard drive X from you know
4:29 - 4:31

server Y and then that's going to take
4:31 - 4:34

maybe eight hours to do a full recovery
4:34 - 4:36

and then I'm going to take that hard
4:36 - 4:37

drive and then that's going to be our
4:37 - 4:39

state from how we're going to recover or
4:39 - 4:40

whatever that process looks like so
4:40 - 4:43

you'll simulate to the point of not
4:43 - 4:45

actually clicking
4:45 - 4:48

or doing anything it's to the point of
4:48 - 4:50

action right so you're gonna yes I'm
4:50 - 4:52

gonna log into the server I'm going to
4:52 - 4:54

look around here's our hypervisors
4:54 - 4:56

here's our infrastructure and here's how
4:56 - 4:57

we're going to restore that process from
4:57 - 4:58

there we're going to log into this
4:58 - 5:00

vendor's portal page we're going to get a
5:00 - 5:03

copy of our off-site backups whatever
5:03 - 5:04

that process looks like right so you run
5:04 - 5:07

through that mock simulation
5:07 - 5:09

um you touch equipment you trial it out
5:09 - 5:11

up to the point of doing it but not
5:11 - 5:13

actively executing it so you're not
5:13 - 5:15

going to go away and actually execute
5:15 - 5:17

your recovery you're just going to
5:17 - 5:19

basically simulate it up to the point of
5:19 - 5:21

doing it uh from here on then we've
5:21 - 5:25

got something to do with a parallel
5:26 - 5:29

test and parallel testing is something
5:29 - 5:31

like uh basically if we have two
5:31 - 5:32

environments and you might have
5:32 - 5:35

something like a prod
5:35 - 5:38

and test environment
5:38 - 5:40

that is prod, this is test, and then with
5:40 - 5:42

parallel test we would recover our
5:42 - 5:44

production environment in that test
5:44 - 5:46

environment so we would go through all
5:46 - 5:48

the restore process but not take
5:48 - 5:50

production offline so I'm going to say
5:50 - 5:53

not offline
5:54 - 5:56

this is basically just us doing you know
5:56 - 5:57

we're just going to go away we're going
5:57 - 5:58

to test and ensure the backups are
5:58 - 6:01

working correctly if there are any faults
6:01 - 6:02

or lessons to learn or issues that we
6:02 - 6:04

need to define then we know what they
6:04 - 6:05

are we're aware of those and everyone
6:05 - 6:07

knows what to do so we're not taking
6:07 - 6:09

production offline production remains
6:09 - 6:10

online we're just going to take our
6:10 - 6:12

obviously
6:12 - 6:15

recover our production
6:15 - 6:16

environments we're going to take our
6:16 - 6:17

prod environment and then we're going
6:17 - 6:18

to replicate that into our test
6:18 - 6:20

environment so we've got a test bed and
6:20 - 6:22

we're going to see how that process kind
6:22 - 6:23

of looks but we're not going to tinker
6:23 - 6:25

with or touch our production and
6:25 - 6:27

production will remain online and
6:27 - 6:29

testing now the other part of that is
6:29 - 6:30

our cutover and the cutover is quite
6:30 - 6:33

similar in that nature
6:33 - 6:35

um again that sort of prod and test
6:35 - 6:37

scenario so I'm going to use that so
6:37 - 6:40

let's just go prod
6:40 - 6:42

and then test
6:42 - 6:44

and similar to that where we would go
6:44 - 6:45

well we're going to restore our prod
6:45 - 6:48

service and then take prod offline and
6:48 - 6:50

bring the restored service online so
6:50 - 6:52

it's a full test there's interruption
6:52 - 6:54

involved
6:54 - 6:55

um you know obviously interrupting
6:55 - 6:57

production as well so we're going to
6:57 - 6:59

obviously do the switch over and
6:59 - 7:01

obviously an interruption of some sort
7:01 - 7:03

right now even if it's a minor
7:03 - 7:05

interruption of you know a second or two
7:05 - 7:07

that is still that's still an
7:07 - 7:09

interruption right so there will be some
7:09 - 7:11

sort of interruption but the cutover
7:11 - 7:13

test is the full kit and caboodle right
7:13 - 7:15

it's the full test and that's the
7:15 - 7:17

highest risk because if something does
7:17 - 7:20

go wrong during that cut over
7:20 - 7:21

um obviously then it's going to be an
7:21 - 7:23

outage so you have to be very mindful of
7:23 - 7:25

if you're going to do a cut over in any
7:25 - 7:27

state of testing that you've either done
7:27 - 7:29

a parallel test or you've done some sort
7:29 - 7:30

of mock simulation you've sort of
7:30 - 7:32

rehearsed it you understood it not just
7:32 - 7:34

go and do a cut over straight away now
7:34 - 7:35

if you're a smaller environment and you
7:35 - 7:37

don't have really much to impact
7:37 - 7:39

I'm still cautioning against it because
7:39 - 7:42

a lot of things can go wrong we want to
7:42 - 7:44

avoid any disruption or keep that as
7:44 - 7:49

minimal as possible again I
7:49 - 7:51

probably wouldn't advise that we turn
7:51 - 7:52

off the infrastructure or turn off the
7:52 - 7:54

servers per se I'd probably keep them
7:54 - 7:55

online or maybe disconnect them from
7:55 - 7:57

their network ports that way the servers
7:57 - 7:58

still remain online if anything does go
7:58 - 8:01

wrong we can obviously plug them in and
8:01 - 8:02

obviously you know get things back up
8:02 - 8:04

and running depending on you know the
8:04 - 8:05

complexity and depending on how things
8:05 - 8:07

are situated and what's dependent on
8:07 - 8:09

what so we want to make sure that we're
8:09 - 8:12

reducing risk and keeping our downtime
8:12 - 8:15

as minimal as possible so again the cut
8:15 - 8:16

over is running through that actual
8:16 - 8:18

simulation and actually doing everything
8:18 - 8:20

and then restoring it into your
8:20 - 8:22

prod environment so you'll go away
8:22 - 8:23

you'll restore your test and then you'll
8:23 - 8:25

restore it back into prod again you
8:25 - 8:27

would go through the full cutover so you
8:27 - 8:29

will turn off the appliances if you do
8:29 - 8:30

want to otherwise you can just
8:30 - 8:31

disconnect them from the network
8:31 - 8:33

connection you know depending on how you
8:33 - 8:34

actually want to run the cutover but
8:34 - 8:35

the cut over essentially is running that
8:35 - 8:38

full test from there
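
The advice above, to rehearse with a simulation or a parallel test before attempting a cutover, can be sketched as a simple gate. This is a minimal Python sketch rather than a prescribed process: the names DrTestType and ready_for_cutover are hypothetical, and the relative order of simulation and parallel test is an assumption (the video only says the walkthrough carries the least risk and the cutover the highest).

```python
from enum import IntEnum


class DrTestType(IntEnum):
    """Test types from the video, ordered by assumed increasing risk."""
    WALKTHROUGH = 1  # tabletop discussion, nothing is touched
    SIMULATION = 2   # rehearsed up to the point of action, nothing executed
    PARALLEL = 3     # restore into test, production stays online
    CUTOVER = 4      # production is switched over to the restored service


def ready_for_cutover(completed: set[DrTestType]) -> bool:
    """Return True only if a simulation or a parallel test was rehearsed first."""
    rehearsals = {DrTestType.SIMULATION, DrTestType.PARALLEL}
    return bool(completed & rehearsals)


# Hypothetical history for a team that has only run tabletop exercises so far.
history = {DrTestType.WALKTHROUGH}
print(ready_for_cutover(history))
```

A team that has only run walkthroughs would fail this gate, which matches the caution in the video against going straight to a cutover.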
8:38 - 8:40

once we've done everything then we want
8:40 - 8:42

to go over and document and this is the
8:42 - 8:45

most vital part as well as equally
8:45 - 8:47

important as the rest of them because we
8:47 - 8:49

are going to want to document and keep
8:49 - 8:53

things updated right so RPO
8:53 - 8:55

RTO so
8:55 - 8:59

our recovery point, um, our recovery point objectives so
8:59 - 9:01

what is our recovery point what's our recovery
9:01 - 9:04

time objective what do they look like so
9:04 - 9:06

did we meet those objectives so I'm
9:06 - 9:08

going to say meet
9:08 - 9:10

objectives because obviously
9:10 - 9:11

everything's going to have some sort of
9:11 - 9:13

metrics associated with it so did we
9:13 - 9:15

meet this did this occur in the right
9:15 - 9:17

manner at the right time do we need to
9:17 - 9:18

work on it did something go wrong is
9:18 - 9:21

there room for improvement so room for
9:21 - 9:23

improvement
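
The "did we meet those objectives" question can be sketched as a comparison of measured results against the RPO and RTO. This is a minimal Python sketch under stated assumptions: the class and function names (Objectives, DrillResult, review) and the one-hour and four-hour targets are hypothetical, and the eight-hour recovery time simply echoes the restore example mentioned earlier in the video.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class Objectives:
    rpo: timedelta  # recovery point objective: maximum tolerable window of lost data
    rto: timedelta  # recovery time objective: maximum tolerable time to restore service


@dataclass
class DrillResult:
    data_loss: timedelta      # window of data that could not be recovered
    recovery_time: timedelta  # elapsed time from outage to service restored


def review(objectives: Objectives, result: DrillResult) -> list[str]:
    """Return a list of missed objectives; an empty list means both were met."""
    findings = []
    if result.data_loss > objectives.rpo:
        findings.append(f"RPO missed: lost {result.data_loss}, target {objectives.rpo}")
    if result.recovery_time > objectives.rto:
        findings.append(f"RTO missed: took {result.recovery_time}, target {objectives.rto}")
    return findings


# Hypothetical targets and measurements for one test run.
targets = Objectives(rpo=timedelta(hours=1), rto=timedelta(hours=4))
drill = DrillResult(data_loss=timedelta(minutes=30), recovery_time=timedelta(hours=8))
for finding in review(targets, drill):
    print(finding)
```

Each finding is a candidate entry for the room-for-improvement list that gets documented after the test.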
9:23 - 9:24

right that's an item room for
9:24 - 9:26

improvement because you had something
9:26 - 9:27

that's going to need improvement right
9:27 - 9:29

did we do something wrong were we not
9:29 - 9:30

aware of something did someone need
9:30 - 9:32

some training to do something else you
9:32 - 9:34

know it's a multitude of
9:34 - 9:36

different issues that
9:36 - 9:39

um you know we we can improve on so
9:39 - 9:42

that's one and then the third point here
9:42 - 9:43

that I want to sort of mention is
9:43 - 9:46

lessons learned so lessons learned is what
9:46 - 9:47

are our key takeaways did we identify
9:47 - 9:49

something that needs updating because
9:49 - 9:52

something was missed did we maybe
9:52 - 9:55

um change a backup solution and did we
9:55 - 9:57

not know how to you know do we now need
9:57 - 9:59

to account for those plans and document
9:59 - 10:01

them plus you know lots of other things
10:01 - 10:02

right so we don't know what the solution
10:02 - 10:04

is you know if if we've maybe gone
10:04 - 10:06

through that solution around and we've
10:06 - 10:08

maybe implemented a change solution you
10:08 - 10:09

know
10:09 - 10:11

do we now need to account for
10:11 - 10:13

that right so if we've got that solution
10:13 - 10:14

there maybe we haven't accounted for it so
10:14 - 10:16

that could be something along the lines
10:16 - 10:18

of documentation or maybe a role or
10:18 - 10:19

responsibility with who is now
10:19 - 10:20

responsible for that maybe that was
10:20 - 10:21

missed
10:21 - 10:23

um there's obviously a lot of things
10:23 - 10:24

that come out of the Lessons Learned
10:24 - 10:25

basically what you're saying this
10:25 - 10:27

Lessons Learned is what have we defined
10:27 - 10:28

and what did we learn during that
10:28 - 10:30

exercise and then this could be through
10:30 - 10:31

a procurement this could be
10:31 - 10:33

technological this could be leadership
10:33 - 10:35

this could be documentation this could
10:35 - 10:38

be reporting this could be you know a bunch
10:38 - 10:40

of different areas that could improve
10:40 - 10:42

through that continuous cycle of
10:42 - 10:44

improvement around our disaster recovery
10:44 - 10:47

and business continuity so you know
10:47 - 10:49

that's the three sort of areas around
10:49 - 10:52

testing our disaster recovery and
10:52 - 10:54

business continuity so going through
10:54 - 10:56

your walkthroughs and there's obviously
10:56 - 10:57

depending on the appetite of the
10:57 - 10:59

organization there's no right solution
10:59 - 11:01

here for anyone it's just what works
11:01 - 11:02

and each customer or each
11:02 - 11:04

business is at a different phase right
11:04 - 11:07

you've got maybe six customers that are
11:07 - 11:08

doing you know cut over testing because
11:08 - 11:09

they're highly mature they've done
11:09 - 11:11

simulations they've done parallel
11:11 - 11:12

testing
11:12 - 11:13

and yeah they're just at the cutover
11:13 - 11:14

phase where they're doing actual
11:14 - 11:16

simulation of events but you've got
11:16 - 11:18

customers that are starting things out
11:18 - 11:19

and you know quite sensitive to these
11:19 - 11:21

things so you're going to run some
11:21 - 11:23

tabletops, walkthrough scenarios, and you
11:23 - 11:25

sort of gradually ease yourself into it
11:25 - 11:26

so
11:26 - 11:28

each of these have their own sort of
11:28 - 11:30

very specific areas there is no right
11:30 - 11:32

solution for you know there is no silver
11:32 - 11:34

bullet essentially so I hope you've
11:34 - 11:36

enjoyed this overview introduction into
11:36 - 11:39

testing of our disaster recovery and
11:39 - 11:40

business continuity I hope you've
11:40 - 11:42

enjoyed this video see you all in the
11:42 - 11:44

next video and thank you all for viewing
11:44 - 11:47

bye for now

Title:: Types of Disaster Recovery and Business Continuity Testing: A Comprehensive Overview
Video Language:: English
Duration:: 11:48
