Failover behaviour on a 2 node cluster

Windows Server Help


  #1  
Old 22-04-2005
Ariel
 
Posts: n/a
Failover behaviour on a 2 node cluster

Hello,

I've got a fibre SAN at work, plus 2 blade servers and a chassis.
My goal is to set up a 2-node Exchange cluster. I'm new to clusters.
At this point I have the 2 nodes configured in a cluster, and I was
testing failover with a few different failure scenarios.

I tested failure in 3 different ways (the third is the one giving me
trouble):
1) Gracefully shut down the active node: the passive node took over and
became the active node.
2) Selected a resource and initiated failure on it 4 times: the passive
node took over that resource.
3) Hard shutdown of the active node: the passive node is unable to take
over.

I'm getting errors like this in my event log:

-------Begin Event---------
Service: ClusSvc
Category: Physical Disk Resource
Event ID: 1034

The disk associated with cluster disk resource 'Disk Q:' could not be
found. The expected signature of the disk was 059E1D89. If the disk was
removed from the server cluster, the resource should be deleted. If the
disk was replaced, the resource must be deleted and created again in
order to bring the disk online. If the disk has not been removed or
replaced, it may be inaccessible at this time because it is reserved by
another server cluster node.
-------End Event-----------

However, if I power the "failed" node back on, the cluster comes
back up, and if I then fail over in either of the first 2 ways,
everything is fine again (the passive node becomes active).

At this point I have uninstalled the multipath drivers for our SAN disks,
because I read that multipath software can cause events like the one
above.

But this has not fixed it.

It's weird: it seems like the failed active node somehow keeps the
cluster disk resources locked. Does anybody have any ideas?

TIA
Ariel


  #2  
Old 23-04-2005
Daniel Escudero de Félix
 
Posts: n/a
Re: Failover behaviour on a 2 node cluster

Good afternoon.

As you said, some multipath software can cause this kind of problem
(http://support.microsoft.com/default...en-us;Q293778).

Could you extract the disk signatures and check whether the signature of
the quorum disk when it resides on "nodeA" is the same as when the quorum
disk is on "nodeB" after failover?

Best regards,
Daniel Escudero

  #3  
Old 23-04-2005
Charles Tolento
 
Posts: n/a
RE: Failover behaviour on a 2 node cluster

Ariel

Take a look at: http://support.microsoft.com/kb/895092

Thanks

CT

  #4  
Old 23-04-2005
Ariel
 
Posts: n/a
Re: Failover behaviour on a 2 node cluster

I was unable to find Dumpcfg.exe; it wasn't in the W2K3 resource kit
that Microsoft has for download, so I found a VBS script
(http://www.castalk.com/ftopic4344.html) to get the disk signatures.
After a hard shutdown of NodeA, NodeB listed the same disk but with
BLANK disk signatures, compared against the disk signatures (which
were the same on both nodes) it had before I shut down NodeA.

Graceful shutdowns do not have this effect.
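
For anyone comparing signatures by hand: on an MBR disk, the signature is the 4-byte little-endian value at offset 0x1B8 of the first sector. Below is a minimal Python sketch, assuming the 512-byte first sector has already been captured by some other means. The sample sector is synthetic and the function names are made up for illustration; they are not part of Dumpcfg.exe or the VBS script mentioned above.

```python
import struct

SECTOR_SIZE = 512
MBR_SIGNATURE_OFFSET = 0x1B8  # byte 440 of the first sector


def mbr_disk_signature(sector: bytes) -> str:
    """Return the 32-bit MBR disk signature as 8 uppercase hex digits."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("need the full 512-byte first sector")
    (value,) = struct.unpack_from("<I", sector, MBR_SIGNATURE_OFFSET)
    return f"{value:08X}"


def is_blank(signature: str) -> bool:
    """A signature of all zeros is what shows up as 'blank'."""
    return int(signature, 16) == 0


# Synthetic sector carrying the signature quoted in event 1034 (illustration only).
sample = bytearray(SECTOR_SIZE)
struct.pack_into("<I", sample, MBR_SIGNATURE_OFFSET, 0x059E1D89)

print(mbr_disk_signature(bytes(sample)))  # prints 059E1D89
print(is_blank(mbr_disk_signature(bytes(SECTOR_SIZE))))  # prints True
```

Running the same check on both nodes before and after a failover gives two strings that can be compared directly.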

  #5  
Old 23-04-2005
Ariel
 
Posts: n/a
Re: Failover behaviour on a 2 node cluster

I did obtain one of the hotfixes mentioned above, as it pertains to my
issue: http://support.microsoft.com/kb/886800
But I would not install it, as I have SP1 installed already.

Thanks Charles, I'm going to review the other hotfixes closely to see if
they also pertain to me.

  #6  
Old 23-04-2005
Daniel Escudero de Félix
 
Posts: n/a
Re: Failover behaviour on a 2 node cluster

Hi.

Disk signatures should not be blank. Could you check whether all the data under
HKLM\System\CurrentControlSet\Services\Clusdisk\Parameters is the same on
both servers?

Regards,
Daniel Escudero

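
One way to do the registry comparison suggested above is to export the values on each node and diff them. Here is a rough Python sketch, assuming the exported data has already been loaded into plain dictionaries. The value names and contents are illustrative placeholders, not the actual layout of the Clusdisk key, and `diff_values` is a hypothetical helper.

```python
def diff_values(node_a: dict, node_b: dict) -> dict:
    """Return {name: (value_on_a, value_on_b)} for every name that differs.

    A name missing on one node shows up with None on that side.
    """
    names = sorted(set(node_a) | set(node_b))
    return {
        name: (node_a.get(name), node_b.get(name))
        for name in names
        if node_a.get(name) != node_b.get(name)
    }


# Placeholder data: the names and the second signature are made up.
node_a = {"Signatures": ["059E1D89", "0A1B2C3D"], "ExampleSetting": 1}
node_b = {"Signatures": ["059E1D89"], "ExampleSetting": 1}

for name, (on_a, on_b) in diff_values(node_a, node_b).items():
    print(f"{name}: nodeA={on_a} nodeB={on_b}")
```

An empty result means the two exports match.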
  #7  
Old 25-04-2005
Ariel
 
Posts: n/a
Re: Failover behaviour on a 2 node cluster

OK, the disk signatures are correct in the registry. Here are
the events logged after a hard shutdown of the active
node. The 2nd event, Event ID 1177, seems interesting. Is it saying
that to move the quorum disk from NodeA to NodeB, NodeA has to
be up to facilitate the transfer? So are my
problems by design? Or should the passive node be able to take over after
a power failure of the active node?

Event Type: Error
Event Source: ClusSvc
Event Category: Physical Disk Resource
Event ID: 1034
Date: 4/25/2005
Time: 9:31:46 AM
User: N/A
Computer: BERT
Description:
The disk associated with cluster disk resource 'Disk Q:' could not be
found. The expected signature of the disk was 059E1D89. If the disk was
removed from the server cluster, the resource should be deleted. If the
disk was replaced, the resource must be deleted and created again in
order to bring the disk online. If the disk has not been removed or
replaced, it may be inaccessible at this time because it is reserved by
another server cluster node.

For more information, see Help and Support Center at
http://go.microsoft.com/fwlink/events.asp.
----------------------------
Event Type: Error
Event Source: ClusSvc
Event Category: Membership Mgr
Event ID: 1177
Date: 4/25/2005
Time: 9:31:46 AM
User: N/A
Computer: BERT
Description:
Cluster service is shutting down because the membership engine failed
to arbitrate for the quorum device. This could be due to the loss of
network connectivity with the current quorum owner. Check your
physical network infrastructure to ensure that communication between
this node and all other nodes in the server cluster is intact.

For more information, see Help and Support Center at
http://go.microsoft.com/fwlink/events.asp.

---------------------------------
Event Type: Error
Event Source: ClusSvc
Event Category: Startup/Shutdown
Event ID: 1073
Date: 4/25/2005
Time: 9:31:46 AM
User: N/A
Computer: BERT
Description:
Cluster service was halted to prevent an inconsistency within the
server cluster. The error code was 5892.

For more information, see Help and Support Center at
http://go.microsoft.com/fwlink/events.asp.

------------------------------------
Event Type: Error
Event Source: Service Control Manager
Event Category: None
Event ID: 7034
Date: 4/25/2005
Time: 9:31:47 AM
User: N/A
Computer: BERT
Description:
The Cluster Service service terminated unexpectedly. It has done this
1 time(s).

For more information, see Help and Support Center at
http://go.microsoft.com/fwlink/events.asp.
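
When a dump like the one above gets long, it can help to reduce it to the sequence of event IDs and categories. Below is a small Python sketch that does this for text pasted in the same "Name: value" layout, with records separated by lines of dashes. The `parse_events` helper and the shortened sample are assumptions for illustration, not the output of any Microsoft tool.

```python
import re

# A line made only of dashes separates records in the pasted dump.
SEPARATOR = re.compile(r"^-{5,}[ \t]*$", re.MULTILINE)
# Header lines look like "Event ID: 1034" or "Computer: BERT".
FIELD = re.compile(
    r"^(Event [A-Za-z]+|Date|Time|User|Computer):[ \t]*(.*)$", re.MULTILINE
)


def parse_events(dump: str) -> list[dict]:
    """Split a pasted dump into one dict of header fields per event."""
    records = []
    for chunk in SEPARATOR.split(dump):
        fields = {name: value.strip() for name, value in FIELD.findall(chunk)}
        if "Event ID" in fields:  # skip chunks that hold no event header
            records.append(fields)
    return records


# Shortened sample in the same layout as the events quoted in this thread.
sample = """Event Type: Error
Event Source: ClusSvc
Event Category: Physical Disk Resource
Event ID: 1034
Time: 9:31:46 AM
Computer: BERT
----------------------------
Event Type: Error
Event Source: ClusSvc
Event Category: Membership Mgr
Event ID: 1177
Time: 9:31:46 AM
Computer: BERT
"""

for event in parse_events(sample):
    print(event["Time"], event["Event ID"], event["Event Category"])
```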

  #8  
Old 25-04-2005
Ariel
 
Posts: n/a
Re: Failover behaviour on a 2 node cluster

http://www.microsoft.com/technet/pro...nsfl.mspx#EDAA
Caught this in another thread: a 2-node cluster cannot tolerate
any node failure, while a 3-node cluster can tolerate 1 failure.

So it seems this is by design; correct me if I'm wrong.
I have a lot to learn about clusters, I guess.

-Ariel

  #9  
Old 26-04-2005
John Toner [MVP]
 
Posts: n/a
Re: Failover behaviour on a 2 node cluster

This is only true if you are using a "majority node set" quorum. If you
plan to use a 2-node cluster, you should use a volume from the shared
storage as your quorum disk.

Regards,
John

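
The failure-tolerance figures quoted for majority node set follow from simple arithmetic. This is a short Python sketch of that arithmetic, assuming the rule that more than half of the configured nodes must be running for the cluster to stay up. It does not apply to a shared quorum disk, and the function names are only for illustration.

```python
def majority(nodes: int) -> int:
    """Smallest number of nodes that is more than half of the cluster."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes can fail while a majority is still running."""
    return nodes - majority(nodes)


for n in range(2, 6):
    print(f"{n} nodes: majority={majority(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

For 2 nodes the majority is 2, so no failure is tolerated; for 3 nodes the majority is still 2, so one failure is tolerated.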
  #10  
Old 30-04-2005
Daniel Escudero de Félix
 
Posts: n/a
Re: Failover behaviour on a 2 node cluster

Good night.

How are your cluster communications configured? Is the public network set to
"mixed" and the private network to internal communications only?
How are the thresholds configured? And the failover settings?

What does the cluster.log file on node 2 say?

Best regards,
Daniel Escudero