Thread: Cluster Service fails to start 1st time following reboot.

  1. #1
    Chalkie Guest

    Cluster Service fails to start 1st time following reboot.

    Hi,

    I have a cluster with 6 active nodes and 2 passive nodes, running W2K3
    Enterprise Edition with SP1. The hardware is IBM and the SAN is HDS; both
    are on the HCL for clusters. The cluster works fine inasmuch as all
    resources can move between all nodes. However, when a node is first
    rebooted, it comes back online immediately. A few reboots later there is a
    delay, and in the system event log I see event ID 1009, 'Cluster service
    could not join an existing server cluster and could not form a new server
    cluster. Cluster service has terminated.', shortly followed by event ID
    7031, 'The Cluster Service service terminated unexpectedly. It has done
    this 1 time(s). The following corrective action will be taken in 60000
    milliseconds: Restart the service.' The service then starts. Further test
    reboots result in the service not starting at all. Am I missing something?

    Thanks for taking the time.

    --
    Regards,
    Chalkie
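
    The pattern described above (event 1009 from the Cluster service, then
    event 7031 announcing a scheduled restart) can be spotted mechanically once
    the System log has been exported. The following is only a minimal sketch:
    the record layout, a list of (source, event ID) pairs in log order, and the
    function name are assumptions made for illustration, not anything the
    cluster tools provide.

```python
from typing import List, Tuple

# One System-log record as (source, event_id), in the order it was logged.
# This layout is an assumption for the sketch; adapt it to your own export.
Event = Tuple[str, int]


def classify_cluster_start(events: List[Event]) -> str:
    """Classify one boot as 'clean', 'failed_then_scheduled_restart' or 'failed'."""
    join_failed = False
    for source, event_id in events:
        if source == "ClusSvc" and event_id == 1009:
            # Cluster service could neither join nor form a cluster.
            join_failed = True
        elif (join_failed
              and source == "Service Control Manager"
              and event_id == 7031):
            # The service terminated and a restart has been scheduled.
            return "failed_then_scheduled_restart"
    return "failed" if join_failed else "clean"
```

    A boot that logs 1009 with no later 7031 is reported as 'failed', which
    corresponds to the later reboots where the service did not start at all.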

  2. #2
    Jeremy Lyons Guest

    Re: Cluster Service fails to start 1st time following reboot.

    I see this quite often in our hosting environment. From event viewer
    investigations, it appears that the cluster service is trying to start
    before the NIC teaming is initialized. Later in the event log I see
    the NIC teaming software service start, and the cluster service
    successfully starts after that.

    Do you have a similar issue?

    JL!
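
    Whether the Cluster service failed before or after the teaming software
    came up can be checked from an exported System log. The sketch below
    assumes each record is a (source, event ID, message) tuple in log order and
    that the teaming software can be recognised by a substring of its log
    messages; both the layout and the marker string are placeholders, since the
    real names depend on the NIC vendor.

```python
from typing import List, Optional, Tuple

# One System-log record as (source, event_id, message), in log order.
# The layout is an assumption for this sketch.
Event = Tuple[str, int, str]


def cluster_failed_before_teaming(events: List[Event],
                                  teaming_marker: str) -> Optional[bool]:
    """Return True if ClusSvc event 1009 precedes the first record whose
    message mentions teaming_marker, False if it follows it, and None if
    either record is missing from the log."""
    failure_pos = None
    teaming_pos = None
    for pos, (source, event_id, message) in enumerate(events):
        if failure_pos is None and source == "ClusSvc" and event_id == 1009:
            failure_pos = pos
        if teaming_pos is None and teaming_marker.lower() in message.lower():
            teaming_pos = pos
    if failure_pos is None or teaming_pos is None:
        return None
    return failure_pos < teaming_pos
```

    A result of True is consistent with the start-order problem described
    above; False or None suggests looking elsewhere.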


  3. #3
    Chalkie Guest

    Re: Cluster Service fails to start 1st time following reboot.

    Hi Jeremy,

    I've had a scan through the system log and found that all the NICs appear to
    be initialized, immediately followed by the ClusDisk and ClusSvc errors.
    The ClusDisk error is event ID 1209, 'Cluster service is requesting a bus
    reset for \Device\ClusDisk0.' Then I get event ID 1009 from ClusSvc,
    followed by event ID 1122, 'The node (re)established communication with
    the cluster node 'whatever' on network 'Heartbeat LAN'.'
    So the NICs are active, but the links to the cluster have not yet been
    established when the Cluster service tries to join, and thus it fails.

    Did you follow that?
    --
    Regards,
    Chalkie

  4. #4
    Jeff Hughes [Microsoft] Guest

    Re: Cluster Service fails to start 1st time following reboot.

    Are you using NIC teaming as Jeremy mentioned? I've also seen Internet
    Connection Sharing cause this. Check whether that service is set to
    'Automatic'. If it is, disable it and try to reproduce the problem.
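
    Checking the start type can also be scripted by parsing the text printed by
    `sc qc <service>`. The sketch below only parses that text; it does not run
    the command. The service name SharedAccess (believed to be the short name
    of the Windows Firewall/ICS service on Windows Server 2003) and the sample
    output are assumptions for illustration and should be verified on the node
    itself.

```python
import re
from typing import Optional


def start_type(sc_qc_output: str) -> Optional[str]:
    """Extract the start type name (e.g. 'AUTO_START') from `sc qc` output."""
    match = re.search(r"^\s*START_TYPE\s*:\s*\d+\s+(\w+)",
                      sc_qc_output, re.MULTILINE)
    return match.group(1) if match else None


def starts_automatically(sc_qc_output: str) -> bool:
    """True if the service is configured to start at boot."""
    return start_type(sc_qc_output) == "AUTO_START"


# Illustrative sample only, not captured from a real cluster node.
SAMPLE = """SERVICE_NAME: SharedAccess
        TYPE               : 20  WIN32_SHARE_PROCESS
        START_TYPE         : 2   AUTO_START
"""

if __name__ == "__main__":
    print(starts_automatically(SAMPLE))
```

    If this reports an automatic start, the service is a candidate for the
    disable-and-retest step suggested above.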
    --
    Jeff Hughes, MCSE
    Support Escalation Engineer
    Microsoft Enterprise Platforms Support (Server Core/Cluster)

  5. #5
    Chalkie Guest

    Re: Cluster Service fails to start 1st time following reboot.

    Jeff,

    Spot on. Many thanks.
    --
    Regards,
    Chalkie

  6. #6
    bernardes15
    Join Date: Jan 2008, Posts: 1

    Hi Jeff and All,
    I am having the same problem on a 4-node cluster (Windows Server 2003 EE). I do not have ICS enabled on the network cards, but I am using HP Teaming.
    The behaviour is the same: the server restarts, the NICs start, then there is a bus reset (event ID 1209), then the Cluster service fails to join (event ID 1009). After that the Cluster service tries to start again, starts successfully and is able to join the cluster. Of course, by then another server/node has already become the active node because of the reboot.

    Any more information that can help?

    Thx,
    Berna

  7. #7
    Edwin vMierlo [MVP] Guest

    Re: Cluster Service fails to start 1st time following reboot.


    > but in fact i
    > am using HP Teaming.


    Is the behaviour the same if you break the team and just use single NICs?



  8. #8
    John Toner [MVP] Guest

    Re: Cluster Service fails to start 1st time following reboot.

    Do you have the Windows Firewall service enabled on these cluster nodes? If
    so, try disabling this service and then reboot again.

    Regards,
    John

    Visit my blog: http://msmvps.com/blogs/jtoner

  9. #9
    Chuck [MSFT] Guest

    Re: Cluster Service fails to start 1st time following reboot.

    Tell me if you are running TrendMicro Anti-Virus.

    Thanks.

    --
    Chuck Timon, Jr.
    Microsoft Corporation
    Windows Server 2008 Readiness Team
    This posting is provided "AS IS" with no warranties, and confers no rights.

  10. #10
    cedtech23
    Join Date: May 2008, Posts: 3

    nic teaming on active/passive

    sorry posted in the wrong section
    Last edited by cedtech23; 27-05-2008 at 10:45 PM. Reason: wrong section
