Thread: Failover behaviour on a 2 node cluster

  1. #1
    Ariel Guest

    Failover behaviour on a 2 node cluster

    Hello,

    I've got a fibre SAN at work, plus 2 blade servers and chassis.
    My goal is to set up a 2-node Exchange cluster. I'm a newb to clusters.
    At this point I have the 2 nodes configured in a cluster and I was just
    testing failover. I went through a couple of test failure configurations.

    I tested failure in 3 different ways (the third is the one I'm having
    issues with):
    1) Gracefully shut down the active node: the passive node took over and
    became the active node.
    2) Selected a resource and initiated failure on it 4 times: the passive
    node took over that resource.

    3) Hard shutdown of the active node: the passive node is unable to take
    over.

    I'm getting these sort of errors in my event log:

    -------Begin Event---------
    Service: ClusSvc
    Category: Physical Disk Resource
    Event ID: 1034

    The disk associated with cluster disk resource 'Disk Q:' could not be
    found. The expected signature of the disk was 059E1D89. If the disk was
    removed from the server cluster, the resource should be deleted. If the
    disk was replaced, the resource must be deleted and created again in
    order to bring the disk online. If the disk has not been removed or
    replaced, it may be inaccessible at this time because it is reserved by
    another server cluster node.
    -------End Event-----------

    However, if I power the pseudo-failed node back on, the cluster comes
    back up, and if I then fail over in one of the first 2 ways everything
    is fine again (the passive node becomes active).

    At this point I have uninstalled the multipath drivers for our SAN disks,
    because I read that issues such as the above event can be caused by
    multipath software.

    But this has not fixed it.

    It's weird: it seems like the active node that fails somehow locks the
    cluster disk resources. Does anybody have any ideas?

    TIA
    Ariel


  2. #2
    Daniel Escudero de Félix Guest

    Re: Failover behaviour on a 2 node cluster

    Good afternoon.

    As you said, some "multi-path" software can cause this kind of problem
    (http://support.microsoft.com/default...en-us;Q293778).

    When you extract the disk signatures, could you check that the
    signature for the quorum disk when it resides on "nodeA" is the same as
    when the quorum disk is on "nodeB" after failover?

    Best regards,
    Daniel Escudero




  3. #3
    Charles Tolento Guest

    RE: Failover behaviour on a 2 node cluster

    Ariel

    Take a look at: http://support.microsoft.com/kb/895092

    Thanks

    CT


  4. #4
    Ariel Guest

    Re: Failover behaviour on a 2 node cluster

    I was unable to find Dumpcfg.exe; it wasn't in the W2K3 resource kit
    Microsoft has for download, so I found a VBS script
    (http://www.castalk.com/ftopic4344.html) to get the disk signatures.
    After a hard shutdown of NodeA, NodeB showed the same disks but with
    BLANK disk signatures, compared against the disk signatures (which
    were the same on both nodes) it had before I shut down NodeA.

    But graceful shutdowns do not have this effect.


  5. #5
    Ariel Guest

    Re: Failover behaviour on a 2 node cluster

    I did obtain 1 of the hotfixes mentioned above, as it pertains to my
    issue: http://support.microsoft.com/kb/886800
    But I would not install as I have SP1 installed already...

    Thanks Charles, I'm going to review the other hotfixes closely to see if
    they also pertain to me.


  6. #6
    Daniel Escudero de Félix Guest

    Re: Failover behaviour on a 2 node cluster

    Hi.

    Disk signatures should not be blank. Could you check whether all the data
    under HKLM/System/CurrentControlSet/Services/Clusdisk/Parameters is the
    same on both servers?
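
A minimal sketch of that comparison, assuming the values under the Clusdisk Parameters key have already been exported from each node into a plain dictionary. The export step, the function name `diff_parameters`, and the value names in the sample are all hypothetical; only the signature 059E1D89 comes from the event text in this thread.

```python
def diff_parameters(node_a: dict, node_b: dict) -> dict:
    """Return {name: (value_on_a, value_on_b)} for every value that differs.

    A name that exists on only one node is reported with None on the other side.
    """
    differences = {}
    for name in sorted(set(node_a) | set(node_b)):
        value_a = node_a.get(name)
        value_b = node_b.get(name)
        if value_a != value_b:
            differences[name] = (value_a, value_b)
    return differences


# Hypothetical exports: the value names are placeholders, not real registry names.
NODE_A = {"QuorumSignature": "059E1D89", "ExampleSetting": 1}
NODE_B = {"QuorumSignature": "", "ExampleSetting": 1}

if __name__ == "__main__":
    for name, (a, b) in diff_parameters(NODE_A, NODE_B).items():
        print(f"{name}: nodeA={a!r} nodeB={b!r}")
```

An empty result means the two exports match; anything else lists exactly which values to look at on each node.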

    Regards,
    Daniel Escudero




  7. #7
    Ariel Guest

    Re: Failover behaviour on a 2 node cluster

    OK, the disk signatures are correct in the registry. Here are the
    event failures that happen after a hard shutdown of the active
    node. The 2nd event, Event ID 1177, seems interesting. Is it saying
    that to switch the quorum disk from NodeA to NodeB, NodeA has to
    be up to facilitate the transfer of the quorum disk? So are my
    problems by design? Or should the passive node be able to fail over on
    a power failure of the active node?

    Event Type: Error
    Event Source: ClusSvc
    Event Category: Physical Disk Resource
    Event ID: 1034
    Date: 4/25/2005
    Time: 9:31:46 AM
    User: N/A
    Computer: BERT
    Description:
    The disk associated with cluster disk resource 'Disk Q:' could not be
    found. The expected signature of the disk was 059E1D89. If the disk was
    removed from the server cluster, the resource should be deleted. If the
    disk was replaced, the resource must be deleted and created again in
    order to bring the disk online. If the disk has not been removed or
    replaced, it may be inaccessible at this time because it is reserved by
    another server cluster node.

    For more information, see Help and Support Center at
    http://go.microsoft.com/fwlink/events.asp.
    ----------------------------
    Event Type: Error
    Event Source: ClusSvc
    Event Category: Membership Mgr
    Event ID: 1177
    Date: 4/25/2005
    Time: 9:31:46 AM
    User: N/A
    Computer: BERT
    Description:
    Cluster service is shutting down because the membership engine failed
    to arbitrate for the quorum device. This could be due to the loss of
    network connectivity with the current quorum owner. Check your
    physical network infrastructure to ensure that communication between
    this node and all other nodes in the server cluster is intact.

    For more information, see Help and Support Center at
    http://go.microsoft.com/fwlink/events.asp.

    ---------------------------------
    Event Type: Error
    Event Source: ClusSvc
    Event Category: Startup/Shutdown
    Event ID: 1073
    Date: 4/25/2005
    Time: 9:31:46 AM
    User: N/A
    Computer: BERT
    Description:
    Cluster service was halted to prevent an inconsistency within the
    server cluster. The error code was 5892.

    For more information, see Help and Support Center at
    http://go.microsoft.com/fwlink/events.asp.

    ------------------------------------
    Event Type: Error
    Event Source: Service Control Manager
    Event Category: None
    Event ID: 7034
    Date: 4/25/2005
    Time: 9:31:47 AM
    User: N/A
    Computer: BERT
    Description:
    The Cluster Service service terminated unexpectedly. It has done this
    1 time(s).

    For more information, see Help and Support Center at
    http://go.microsoft.com/fwlink/events.asp.
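
For anyone lining up a dump like the one above, here is a rough sketch that pulls the source, category, ID and time out of pasted event text so the order of failures is easier to see. It assumes the "Field: value" layout shown in this post; the function name `summarize_events` is made up for the example, and the sample is an abbreviated copy of the first two events.

```python
import re

# Only the header fields needed for a one-line summary are captured.
FIELD = re.compile(r"^\s*(Event Source|Event Category|Event ID|Time):\s*(.+?)\s*$")


def summarize_events(text: str) -> list:
    """Split pasted event text into one dict of header fields per event."""
    events = []
    current = {}
    for line in text.splitlines():
        match = FIELD.match(line)
        if not match:
            continue  # description text, separators, blank lines
        key, value = match.groups()
        # "Event Source" is the first captured field of each record,
        # so seeing it again means a new event has started.
        if key == "Event Source" and current:
            events.append(current)
            current = {}
        current[key] = value
    if current:
        events.append(current)
    return events


SAMPLE = """\
Event Type: Error
Event Source: ClusSvc
Event Category: Physical Disk Resource
Event ID: 1034
Time: 9:31:46 AM
----------------------------
Event Type: Error
Event Source: ClusSvc
Event Category: Membership Mgr
Event ID: 1177
Time: 9:31:46 AM
"""

if __name__ == "__main__":
    for event in summarize_events(SAMPLE):
        print(event["Time"], event["Event ID"], event["Event Category"])
```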


  8. #8
    Ariel Guest

    Re: Failover behaviour on a 2 node cluster

    http://www.microsoft.com/technet/pro...nsfl.mspx#EDAA
    Caught this in another thread: a 2-node cluster cannot tolerate
    any node failure, while a 3-node cluster can tolerate 1 failure.

    So it seems as though it is by design; correct me if I'm wrong.
    I have a lot to learn about clusters, I guess.

    -Ariel


  9. #9
    John Toner [MVP] Guest

    Re: Failover behaviour on a 2 node cluster

    This is only true if you are using a "majority node set" quorum. If you
    plan to use a 2-node cluster, you should use a volume from the shared
    storage for your quorum disk.
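
The figures quoted earlier in the thread (no tolerated failure with 2 nodes, 1 with 3 nodes) follow from the majority rule, which can be sketched as below. This is only the arithmetic for a majority node set quorum, where more than half of the configured nodes must be running; it says nothing about a shared quorum disk. The function names are illustrative.

```python
def majority(total_nodes: int) -> int:
    """Smallest node count that is strictly more than half of the cluster."""
    if total_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return total_nodes // 2 + 1


def tolerated_failures(total_nodes: int) -> int:
    """How many nodes can be lost while a majority is still running."""
    return total_nodes - majority(total_nodes)


if __name__ == "__main__":
    for nodes in range(2, 6):
        print(f"{nodes} nodes: majority {majority(nodes)}, "
              f"tolerates {tolerated_failures(nodes)} failure(s)")
```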

    Regards,
    John




  10. #10
    Daniel Escudero de Félix Guest

    Re: Failover behaviour on a 2 node cluster

    Good night.

    How are your cluster communications configured? Is the public network set
    to "mixed" and the private network to internal communications only?
    How are the thresholds configured? And the failover settings?

    What does the cluster.log file on node 2 say?

    Best regards,
    Daniel Escudero



