Thread: Cluster problems, how to recover?

  1. #1
    Sigitas Skublickas Guest

    Cluster problems, how to recover?

    Hello,

    I have a Windows 2003 cluster with two nodes, hosting a SQL 2005 failover
    instance on top of it. Since this morning I have not been able to connect
    to the Cluster Administrator console from either node. When I try, after
    quite some time I get an error:

    "The Cluster Service on the node "SQLCluster" cannot be started. The network
    path was not found. Error ID: -2147024843 (80070035)."

    I have verified that the Cluster service is started on both nodes. I tried
    restarting it as a test but still can’t get to the console. Rebooting the
    servers and the storage did not help either. In Windows Explorer I can get
    into the quorum volume only, while the other two volumes are “listed” only
    as drive letters on BOTH nodes. I can’t ping the cluster IP resources from
    the network. I have noticed that the quorum volume fails over OK to the
    other node if I reboot one node, but the other two volumes with my SQL
    data stay dead no matter what I try. While I was rebooting the nodes, the
    Windows Event Viewer did not record anything interesting about the dead
    volumes or the cluster itself, other than events saying that the “…server
    was removed from the active server cluster membership…” and “…connection
    has been lost with the other node…”.

    The storage device, an HP 1500 StorageWorks, looks OK; no warnings are shown
    on the LCD panels. I’m able to connect to the controllers via HP ACU and the
    arrays look fine.

    Any ideas how to troubleshoot this situation?
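
    For readers puzzling over the error ID quoted above: the signed decimal and
    the hex value are the same 32-bit HRESULT. The sketch below is not from the
    thread; it only shows how the number can be split into its standard fields.
    The function name decode_hresult is made up for illustration.

```python
def decode_hresult(signed_value: int) -> dict:
    """Split a signed 32-bit HRESULT into its standard fields."""
    unsigned = signed_value & 0xFFFFFFFF  # reinterpret as unsigned 32-bit
    return {
        "hex": f"{unsigned:08X}",
        "failure": bool(unsigned >> 31),        # top bit set means failure
        "facility": (unsigned >> 16) & 0x7FF,   # 11-bit facility field
        "code": unsigned & 0xFFFF,              # low 16 bits: the error code
    }

info = decode_hresult(-2147024843)
print(info)
# {'hex': '80070035', 'failure': True, 'facility': 7, 'code': 53}
```

    Facility 7 is FACILITY_WIN32, and Win32 error 53 is ERROR_BAD_NETPATH ("The
    network path was not found"), which matches the text of the message.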


  2. #2
    Jeff Hughes [MSFT] Guest

    Re: Cluster problems, how to recover?

    Go to Start > Run and type

    cluadmin .

    That's cluadmin followed by a space, then a period '.'
    The period makes Cluster Administrator connect to the cluster on the local
    node instead of going through the cluster network name, so it should open
    even when that name is unreachable. You'll probably find that your cluster
    name and IP (and possibly everything else) are offline or failed. If they
    are offline, just bring them online manually. If any resources are failed,
    troubleshoot accordingly.

    --
    Jeff Hughes, MCSE
    Support Escalation Engineer
    Microsoft Enterprise Platforms Support (Server Core/Cluster)
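
    The same check can also be done from a command prompt with cluster.exe. The
    sketch below is not from the reply above; it only builds the command lines
    and prints them, so they can be reviewed before being run on a node. The
    resource names "Cluster IP Address" and "Cluster Name" are assumed defaults
    and may differ on a given cluster.

```python
import subprocess

CLUSTER_EXE = "cluster"  # cluster.exe, run locally on one of the nodes

def status_command() -> list:
    """Command that lists the resources and their current state."""
    return [CLUSTER_EXE, "resource"]

def online_command(resource_name: str) -> list:
    """Command that brings a single named resource online."""
    return [CLUSTER_EXE, "resource", resource_name, "/online"]

# Assumed default names; replace with the names shown by the status command.
# The IP address comes first because the network name depends on it.
resources = ["Cluster IP Address", "Cluster Name"]

commands = [status_command()] + [online_command(r) for r in resources]
for cmd in commands:
    # list2cmdline quotes arguments that contain spaces, Windows-style
    print(subprocess.list2cmdline(cmd))
```

    Printing rather than executing is deliberate: on a cluster in an unknown
    state it is safer to check the status output first and bring resources
    online one at a time.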


  3. #3
    Sigitas Skublickas Guest

    Re: Cluster problems, how to recover?

    Mr. Jeff,

    Thanks for your quick response!

    Looks like that was the case. The only thing I'm curious about is why the
    resources went offline by themselves. I'm the only one who has access to
    the cluster, and I know for sure that I did not take them offline. Do you
    know if Microsoft has some kind of guide or TechNet article about cluster
    troubleshooting/recovery?

    Thanks!
    Sigitas Skublickas


  4. #4
    Jeff Hughes [MSFT] Guest

    Re: Cluster problems, how to recover?

    The cluster does not take resources offline by itself. Either it was
    user-initiated or the Cluster service was started with the /fixquorum
    switch. Cluster troubleshooting is too broad a topic to be covered by a
    single article or guide, but you can start here:

    Windows Server 2003 R2 Enterprise Edition – Cluster Server Resource Center
    http://www.microsoft.com/windowsserv...resources.mspx


    Hope this helps.

    Jeff Hughes, MCSE
    Support Escalation Engineer
    Microsoft Enterprise Platforms Support (Server Core/Cluster)



  5. #5
    Sigitas Skublickas Guest

    Re: Cluster problems, how to recover?


    Thanks Jeff. I really appreciate your help.


