Cluster problems, how to recover?

Windows Server Help


  #1  
Old 08-02-2008
Sigitas Skublickas
 
Posts: n/a
Cluster problems, how to recover?

Hello,

I have a two-node Windows Server 2003 cluster hosting a SQL Server 2005
failover instance. As of this morning I can no longer connect to the
Cluster Administrator console from either node. When I try, after quite
some time I get this error:

"The Cluster Service on the node "SQLCluster" cannot be started. The network
path was not found. Error ID: -2147024843 (80070035)."

I have verified that the Cluster service is started on both nodes. I tried
restarting it as a test, but I still can’t get to the console. Rebooting the
servers and the storage did not help either. In Windows Explorer I can open
only the quorum volume; the other two volumes are listed only as drive
letters on BOTH nodes. I can’t ping the cluster IP resources from the
network. The quorum volume fails over to the other node fine when I reboot
a node, but the other two volumes, which hold my SQL stuff, stay dead no
matter what I try. While I was rebooting the nodes, Event Viewer recorded
nothing interesting about the dead volumes or the cluster itself, other than
events saying that the “…server was removed from the active server cluster
membership…” and “…connection has been lost with the other node…”.

The storage device, an HP 1500 StorageWorks, looks OK: no warnings are shown
on the LCD panels. I can connect to the controllers via HP ACU, and the
arrays look fine.

Any ideas on how to troubleshoot this?
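
For reference, the two numbers in that error message are the same value written two ways. A minimal Python sketch, assuming the standard HRESULT bit layout (top bit for failure, an 11-bit facility field, a 16-bit code), shows that the signed decimal -2147024843 is 0x80070035, i.e. facility 7 (Win32) carrying Win32 error 53, "The network path was not found". The function name `decode_hresult` is made up for this illustration.

```python
def decode_hresult(signed_value):
    """Split a signed 32-bit HRESULT into (hex text, failed?, facility, code)."""
    unsigned = signed_value & 0xFFFFFFFF   # reinterpret as unsigned 32-bit
    failed = bool(unsigned >> 31)          # severity bit: 1 means failure
    facility = (unsigned >> 16) & 0x7FF    # 11-bit facility field
    code = unsigned & 0xFFFF               # low 16 bits: the underlying error code
    return f"{unsigned:08X}", failed, facility, code


print(decode_hresult(-2147024843))  # ('80070035', True, 7, 53)
```

Read this way, the message points at a name or network path that could not be reached, not necessarily at the Cluster service itself failing to start.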


  #2  
Old 08-02-2008
Jeff Hughes [MSFT]
 
Posts: n/a
Re: Cluster problems, how to recover?

Go to Start > Run and type:

cluadmin .

That's cluadmin followed by a space, then a period ('.'). The period points
Cluster Administrator at the local node rather than at the cluster name, so
it should open even when the cluster name cannot be reached. You'll probably
find that your cluster name and IP (and possibly everything else) are
offline or failed. If they are offline, just bring them online manually. If
any resources are failed, troubleshoot accordingly.
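
The same step can also be done from a command prompt on a node with cluster.exe: `cluster resource` with no further arguments lists each resource and its state, and `cluster resource "<name>" /online` brings one online. The Python sketch below only builds and prints those command lines so they can be reviewed and pasted; it does not run anything. The resource names "Cluster IP Address" and "Cluster Name" are assumed defaults and may differ on your cluster, and the helper names are hypothetical.

```python
import subprocess


def online_commands(resource_names):
    """Build one 'cluster resource <name> /online' argument list per resource."""
    return [["cluster", "resource", name, "/online"] for name in resource_names]


def to_command_line(args):
    """Render an argument list as a Windows command line, quoting names with spaces."""
    return subprocess.list2cmdline(args)


if __name__ == "__main__":
    # Assumed default names; check the real ones with 'cluster resource' first.
    # The IP address is listed first because the network name depends on it.
    for cmd in online_commands(["Cluster IP Address", "Cluster Name"]):
        print(to_command_line(cmd))
```

Printing rather than executing keeps the sketch safe to try anywhere; on the cluster node itself you would type the printed lines at a command prompt.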

--
Jeff Hughes, MCSE
Support Escalation Engineer
Microsoft Enterprise Platforms Support (Server Core/Cluster)

  #3  
Old 08-02-2008
Sigitas Skublickas
 
Posts: n/a
Re: Cluster problems, how to recover?

Mr. Jeff,

Thanks for your quick response!

Looks like that was the case. The only thing I'm curious about is why the
resources went offline by themselves. I'm the only one who has access to the
cluster, and I know for sure that I did not take them offline. Do you know
if Microsoft has some kind of guide or TechNet article about cluster
troubleshooting and recovery?

Thanks!
Sigitas Skublickas

"Jeff Hughes [MSFT]" wrote:

> Do a start/run and type
>
> cluadmin .
>
> That's cluadmin followed by a space then a period '.'
> That should open up cluster administrator. You'll probably find that your
> cluster name and IP (and possibly everything else) are offline/failed. If
> they are offline, just manually bring them online. If any resources are
> failed, troubleshoot accordingly.
>
> --
> Jeff Hughes, MCSE
> Support Escalation Engineer
> Microsoft Enterprise Platforms Support (Server Core/Cluster)
>
> "Sigitas Skublickas" <SigitasSkublickas@discussions.microsoft.com> wrote in
> message news:96384475-32AB-4754-BB71-4C9D5AEE661E@microsoft.com...
> > Hello,
> >
> > I have Windows 2003 cluster server with two nodes. I was hosting SQL 2005
> > failover on top of it. For some reason this morning I'm not able to
> > connect
> > to the
> > cluster administrator console anymore from any node. If I try to do so,
> > after quite
> > some time I get an error:
> >
> > "The Cluster Service on the node "SQLCluster" cannot be started. The
> > network
> > path was not found. Error ID: -2147024843 (80070035)."
> >
> > I have verified that the cluster service is started on both nodes. I tried
> > restarting it
> > for testing purpose but still can’t get to the console. Servers and
> > storage
> > reboot did
> > not help as well. If I go to windows explorer I can get only into quorum
> > volume,
> > while the other two volumes are “listed” only as a drive letters on BOTH
> > nodes. I
> > can’t ping cluster IP resources from network. I have noticed that the
> > quorum
> > volume fails over OK to another node if I reboot one node, but the other
> > two
> > volumes with my SQL stuff totally dead no matter what I try to do. When I
> > was rebooting nodes windows event viewer did not record anything
> > interesting
> > about dead volumes or cluster itself other then the events that the
> > “….server
> > was removed from the active server cluster membership…” and “…connection
> > has
> > been lost with the other node”….
> >
> > The storage device HP 1500 StorageWorks looks ok, no warnings are shown in
> > the LCD panels. I’m able to connect to the controllers via HP ACU and see
> > arrays fine.
> >
> > Any ideas how to troubleshoot this situation?
> >

Reply With Quote
  #4  
Old 08-02-2008
Jeff Hughes [MSFT]
 
Posts: n/a
Re: Cluster problems, how to recover?

The cluster does not take resources offline on its own. Either it was
user-initiated or the Cluster service was started with the /fixquorum
switch. Cluster troubleshooting is too broad a topic to cover in a single
article or guide, but you can start here:

Windows Server 2003 R2 Enterprise Edition – Cluster Server Resource Center
http://www.microsoft.com/windowsserv...resources.mspx


Hope this helps.

Jeff Hughes, MCSE
Support Escalation Engineer
Microsoft Enterprise Platforms Support (Server Core/Cluster)

  #5  
Old 08-02-2008
Sigitas Skublickas
 
Posts: n/a
Re: Cluster problems, how to recover?


Thanks Jeff. I really appreciate your help.