Disaster recovery for clusters

  #16  
Old 12-02-2008
Edwin vMierlo [MVP]
 
Posts: n/a
Re: Disaster recovery for clusters

Have you considered using a standby cluster?
http://technet.microsoft.com/en-us/l...EXCHG.65).aspx




"CFPDSA" <CFPDSA@discussions.microsoft.com> wrote in message
news:07CE5B6A-52BA-4483-B6F1-D35928282703@microsoft.com...
> Well, the whole reason I started this thread was to see what the answer to
> the obvious question of "how do you restore a cluster" was.
>
> The particular cluster we are working with here (as stated previously) is a
> VMWARE based scsi cluster, just for testing purposes. But the procedure used
> to restore in this case should be the same for any scenario.
>
> We do not have MS support (AFAIK) so that is not an option for us. I have
> been doing research into a proper disaster recovery plan for our Exchange
> clusters and have been unable to find precise guidance on how to restore a
> dead cluster (i.e. the system state was backed up, now the cluster won't
> start, how do you restore the cluster?).
>
> I thought this would be an easy question... oh well...
>
> "Edwin vMierlo [MVP]" wrote:
>
> > > Verified the signatures are correct using diskpart disk detail compared w/
> > > the registry entries.
> > >
> > > Renamed mscs, then reenabled clusdisk.sys and rebooted.
> > >
> > > Attempted to start cluster service with -resetquorumlog and it fails again
> > > with the same error.
> >
> > hm, this is going to be a long time before we solve these type of problems
> > in a news group, if you need quick response I guess you need to start
> > getting help from Microsoft.
> >
> > Something not right with either you backup or your restore procedure...
> >
> > If you still want to keep trying to get this restored cluster online; I do
> > start to believe this is not the quorum then, maybe some group policy
> > blocking something, maybe the account which is running the cluster service
> > cannot access registry, or a file.... again this is going to be a tough one
> > to troubleshoot in a forum.
> >
> > have you checked that the user account running the cluster service is a
> > local admin ?
> >
> > on the server, log on with the account which is used to run cluster service.
> > Launch filemon.exe and regmon.exe. then start the cluster service, and
> > capture filemon and regmon files and see if this gives you a clue
> >
> > > FYI, when the clusdisk.sys driver is enabled, all the disk resources are
> > > inaccessible. They are visible in Explorer, but give an error of "The device
> > > is not ready." when double clicked.
> >
> > that is normal, first they have to be online in cluster prior to you can
> > access the disks



  #17  
Old 12-02-2008
Russ Kaufmann [MVP]
 
Posts: n/a
Re: Disaster recovery for clusters

"CFPDSA" <CFPDSA@discussions.microsoft.com> wrote in message
news:07CE5B6A-52BA-4483-B6F1-D35928282703@microsoft.com...
> Well, the whole reason I started this thread was to see what the answer to
> the obvious question of "how do you restore a cluster" was.
>
> The particular cluster we are working with here (as stated previously) is a
> VMWARE based scsi cluster, just for testing purposes. But the procedure used
> to restore in this case should be the same for any scenario.
>
> We do not have MS support (AFAIK) so that is not an option for us.


MS support is always an option. It should absolutely be something that you
are ready to use in case of a disaster. Microsoft has a pay per incident
support model which is extremely cost effective and very responsive.

> been doing research into a proper disaster recovery plan for our Exchange
> clusters and have been unable to find precise guidance on how to restore a
> dead cluster (i.e. the system state was backed up, now the cluster won't
> start, how do you restore the cluster?).


Google is your friend.

http://technet.microsoft.com/en-us/l...EXCHG.65).aspx

http://www.microsoft.com/technet/tec...y/default.aspx


--
Russ Kaufmann
MVP - Windows Server - Clustering
ClusterHelp.com, a Microsoft Certified Gold Partner
Web http://www.clusterhelp.com
Blog http://msmvps.com/clusterhelp

The next ClusterHelp classes are:
Mar 10- 13 in Denver
May 12-15 in New York

  #18  
Old 13-02-2008
Russ Kaufmann [MVP]
 
Posts: n/a
Re: Disaster recovery for clusters

"Russ Kaufmann [MVP]" <russ@clusterhelp.com> wrote in message
news:DAE37043-EECB-4CA0-B36A-5F455FF57E13@microsoft.com...
> Google is your friend.
>
> http://technet.microsoft.com/en-us/l...EXCHG.65).aspx
>
> http://www.microsoft.com/technet/tec...y/default.aspx


Oh, yeah... whoops... "Microsoft Live" is your friend. <G>

  #19  
Old 13-02-2008
CFPDSA
 
Posts: n/a
Re: Disaster recovery for clusters

Obviously you have not read the thread where previously I have indicated I've
read the DR ops guide and done extensive research already to try and find a
solution (other than ASR) to this problem.

Thanks to everyone who has actually tried to help, but I think this is a
dead end. If I can convince my company it is worth the up front cost I'll
try giving PSS a call and see if they can tell me what they would do if a
customer called up with the question. I'll post the results once I have them.
  #20  
Old 13-02-2008
Raistlin
 
Posts: n/a
Re: Disaster recovery for clusters

There is a straightforward and much easier way of restoring a cluster:
quit the original node, join a new one, and repair your service through
the Setup program on the disc shipped with your SQL or Exchange package.

I see no reason to get stuck on restoring from a backup, since the backup
contains no critical information about our services: neither the program
files nor the data files of SQL/Exchange. So joining a new node should do
all we need to rebuild a functional cluster.

Am I right? I just tried this on my VMware Server, and it seems to be
working as it did before the 'crash'.


  #21  
Old 13-02-2008
Edwin vMierlo [MVP]
 
Posts: n/a
Re: Disaster recovery for clusters


> there is a straight forward and much easier way restoring a cluster.
> quit the original node, join a new one,


Only if you have the luxury that at least one of your original nodes is
still up/running/accessible.

For a full disaster (loss of all cluster nodes), that is not an option.

rgds,
edwin.


  #22  
Old 13-02-2008
CFPDSA
 
Posts: n/a
Re: Disaster recovery for clusters

I got my answer from PSS. Based on their experience, the recommended approach
to a full cluster failure is to rebuild. Even if you can use a system state
restore to recover a node, you end up with a messed-up metabase (let alone the
problems I was experiencing).

Here was the recommended procedure for the full rebuild:

1) Have a machine that has the Windows operating system ready to go.
2) Swing the storage over to this machine from the SAN.
3) Install the cluster service, configure the cluster group and Exchange
group with the physical disk resources.
4) Ensure the appropriate drive letters are in use.
5) Install the Exchange pre-reqs.
6) Install the Exchange binaries.
7) Create the Exchange IP and Exchange NAME resources using the same name
and IP as prior installation.
8) Create the system attendant and link it to the Exchange network name.


"This works because we have code in Exchange 2003 SA creation on a cluster
that if we find the EVS already exists in active directory we will bind back
with it. It's sort of a /disasterrecovery installation for cluster."
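
The rebuild above only works if the original name, IP and drive letters were written down beforehand. As a rough illustration, here is a minimal Python sketch of such a record together with the eight PSS steps as an ordered checklist. The field names (`evs_name`, `evs_ip`, `drive_letters`), the example values and the helper `missing_fields` are all hypothetical, chosen for illustration; nothing here calls a real cluster or Exchange API.

```python
from dataclasses import dataclass, field, fields

# The eight rebuild steps as relayed from PSS, kept in their original order.
REBUILD_STEPS = [
    "Have a machine with the Windows operating system ready to go",
    "Swing the storage over to this machine from the SAN",
    "Install the cluster service; configure the cluster group and "
    "Exchange group with the physical disk resources",
    "Ensure the appropriate drive letters are in use",
    "Install the Exchange pre-reqs",
    "Install the Exchange binaries",
    "Create the Exchange IP and NAME resources using the same name "
    "and IP as the prior installation",
    "Create the system attendant and link it to the Exchange network name",
]


@dataclass
class ClusterRecord:
    """Hypothetical documentation template: facts a rebuild needs on paper."""
    cluster_name: str = ""
    evs_name: str = ""   # Exchange virtual server network name (assumed field)
    evs_ip: str = ""     # Exchange virtual server IP address (assumed field)
    drive_letters: dict = field(default_factory=dict)  # letter -> purpose


def missing_fields(record: ClusterRecord) -> list:
    """Return the names of fields that are still empty in the record."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


def print_checklist(steps=REBUILD_STEPS) -> None:
    """Print the rebuild steps as a numbered checklist."""
    for number, step in enumerate(steps, start=1):
        print(f"[ ] {number}) {step}")


if __name__ == "__main__":
    # Example values are made up for the demonstration.
    record = ClusterRecord(cluster_name="TESTCLUSTER", evs_name="EXCH-EVS")
    print("Still undocumented:", missing_fields(record))
    print_checklist()
```

Running a check like `missing_fields` against each cluster's record before a disaster is one way to confirm the documentation actually covers what steps 4 and 7 depend on.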

So what I'm going to do is take this and create our plan using it as an
outline. Obviously having good documentation is key to this approach, so
I'll be creating documentation templates for each of our nodes to ensure we
have everything covered should the worst happen.
  #23  
Old 13-02-2008
Russ Kaufmann [MVP]
 
Posts: n/a
Re: Disaster recovery for clusters

"CFPDSA" <CFPDSA@discussions.microsoft.com> wrote in message
news:F586ECD2-A966-482D-8B42-257B7990508A@microsoft.com...
> Obviously you have not read the thread where previously I have indicated I've
> read the DR ops guide and done extensive research already to try and find a
> solution (other than ASR) to this problem.


Actually, I did. I think you missed a great deal of information that is in
the DR guide, but I certainly understand what you are saying.


  #24  
Old 14-02-2008
Raistlin
 
Posts: n/a
Re: Disaster recovery for clusters

Thanks for your review.

Although it rarely happens that two or more nodes crash at the same time
while the quorum is not corrupted, I'd like to discuss what to do when a
Complete Cluster Failure happens.

In our daily administration work, to handle a complicated IT environment,
I have become accustomed to planning every step of my work according to
the dependencies and other relationships among the different components
in our infrastructure, be they apps, services or something else. It is
really worth considering what depends on what when planning to recover
the cluster. Obviously, SQL/Exchange depend on Cluster Service/Shared
Disk/IP/Name, but do they depend on the original state of the cluster
when their databases are fully backed up and handy for a restore? In
short, if we can rebuild a cluster faster, why take those steps to
restore one when it might take too much time? However, I would agree with
restoring every node if it takes only simple steps such as copying and
pasting chkxxx.tmp files or the like.

And when the quorum itself is corrupted, things get worse. We should
replace the Q:\, rebuild the cluster, or restore the checkpoint files.
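
The "what depends on what" planning described above can be sketched as a topological sort. This is a minimal Python illustration only; the component names and the exact dependency map are assumptions loosely based on the dependencies named in this post (SQL/Exchange on Cluster Service/Shared Disk/IP/Name), not an inventory of any real cluster.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: each key depends on the components in its set.
DEPENDS_ON = {
    "Cluster Service": {"Shared Disk"},
    "IP Address": {"Cluster Service"},
    "Network Name": {"IP Address"},
    "Exchange/SQL": {"Cluster Service", "Shared Disk", "IP Address", "Network Name"},
}


def recovery_order(depends_on):
    """Return components ordered so each comes after everything it depends on."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for position, component in enumerate(recovery_order(DEPENDS_ON), start=1):
        print(position, component)
```

Whether you rebuild or restore, an ordering like this makes it explicit which pieces must be back before the application layer is touched.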




  #25  
Old 15-02-2008
CFPDSA
 
Posts: n/a
Re: Disaster recovery for clusters

While I wouldn't claim to have memorized it, I did read the entire document
prior to beginning my testing. It appears to emphasize the backup/restore
method of recovering clusters instead of the rebuild approach actually
recommended by PSS. Naturally, this led me to believe that a system state
backup should be sufficient to restore a cluster from scratch, except that in
my testing it did not in fact work as advertised (though ASR did, I must
say).

Here is an interesting quote (in light of the recommendation from PSS) from
the DR Ops guide:

To rebuild a whole cluster using your cluster's information records instead
of restoring the quorum, contact Microsoft Help and Support. The procedures
required in this type of recovery are for advanced-level administrators only.
Additionally, advanced-level administrators should only consider this cluster
recovery method if there is no alternative method available.

I am now in the middle of testing the rebuild procedure using my vmware
cluster. I'll let you know how it goes.
  #26  
Old 21-02-2008
Russ Kaufmann [MVP]
 
Posts: n/a
Re: Disaster recovery for clusters

"CFPDSA" <CFPDSA@discussions.microsoft.com> wrote in message
news:72FEFCE4-A9C1-4278-9649-694AF2B1F219@microsoft.com...
> While I wouldn't claim to have memorized it, I did read the entire document
> prior to beginning my testing. It appears to emphasize the backup/restore
> method of recovering clusters instead of the rebuild approach actually
> recommended by PSS. Naturally, this led me to believe that a system state
> backup should be sufficient to restore a cluster from scratch, except that in
> my testing it did not in fact work as advertised (though the ASR did one must
> say).


I do understand what you are saying. Personally, I always evict the node,
rebuild it from scratch using the build documentation that was created the
first time, add it to the cluster, then install the appropriate
applications.

