Thread: Cluster node hangs with event 1146

  1. #1
    Jari H Guest

    Cluster node hangs with event 1146

    Hi,

    I have a 2-node cluster running W2003 R2 Enterprise. Shared disks are in a
    FC SAN.
    Recently one of the nodes has started to hang. The event log reports
    warning 1146: 'The cluster resource monitor died unexpectedly, an attempt
    will be made to restart it.' followed by several ftdisk event 57 and ntfs
    event 50 warnings. I have to power cycle the hung node to regain control
    of the cluster.

    The cluster log shows the following errors:
    00000b9c.00000474::2006/10/17-06:24:54.335 INFO [Qfs] GetDiskFreeSpaceEx Q:\MSCS\, status 0
    00000940.00000784::2006/10/17-06:25:00.272 ERR [RM] Exception. Code = 0xc0000194, Address = 0x77E55E02
    00000940.00000784::2006/10/17-06:25:00.272 ERR [RM] Exception parameters: 0, d2fa2000, fffffff4, 7ffdb000
    00000940.00000784::2006/10/17-06:25:00.272 INFO [RM] GenerateMemoryDump: Start memory dump to file C:\WINDOWS\Cluster\resrcmon.dmp
    00000940.00000784::2006/10/17-06:25:01.085 ERR [RM] CallStack:
    00000940.00000784::2006/10/17-06:25:01.085 ERR [RM] Frame Address
    00000940.00000784::2006/10/17-06:25:01.085 ERR 0075F858 77E55E02 RaiseException+0x3C
    00000940.00000784::2006/10/17-06:25:01.085 ERR 0075F880 0100B4A4 0001:0000A4A4 C:\WINDOWS\cluster\resrcmon.exe
    00000940.00000784::2006/10/17-06:25:01.085 ERR 0075F8B4 77C70F3B NdrServerInitialize+0x462
    00000940.00000784::2006/10/17-06:25:01.085 ERR 0075FCB4 77CE23F7 NdrStubCall2+0x217
    00000940.00000784::2006/10/17-06:25:01.085 ERR 0075FCD0 77CE26ED NdrServerCall2+0x19
    00000940.00000784::2006/10/17-06:25:01.085 ERR 0075FD04 77C709BE I_RpcGetBuffer+0x1D8
    00000940.00000784::2006/10/17-06:25:01.085 ERR 0075FD58 77C7093F I_RpcGetBuffer+0x159
    00000940.00000784::2006/10/17-06:25:01.085 ERR 0075FD7C 77C70865 I_RpcGetBuffer+0x7F
    00000940.00000784::2006/10/17-06:25:01.085 ERR 0075FDBC 77C657EB NdrConformantArrayMemorySize+0x558
    00000940.00000784::2006/10/17-06:25:01.085 ERR 0075FDFC 77C71E26 RpcRevertToSelfEx+0x919
    00000940.00000784::2006/10/17-06:25:01.085 ERR 0075FE20 77C71BB3 RpcRevertToSelfEx+0x6A6
    00000940.00000784::2006/10/17-06:25:01.085 ERR 0075FF84 77C75458 I_RpcBindingIsClientLocal+0x68B
    00000940.00000784::2006/10/17-06:25:01.085 ERR 0075FF8C 77C5778F NdrOleFree+0x3C5
    00000940.00000784::2006/10/17-06:25:01.085 ERR 0075FFAC 77C5F7DD I_RpcTransGetThreadEvent+0x188
    00000940.00000784::2006/10/17-06:25:01.085 ERR 0075FFB8 77C5DE88 I_RpcLogEvent+0xE92
    00000940.00000784::2006/10/17-06:25:01.085 ERR 0075FFEC 77E6608B GetModuleFileNameA+0xEB
    00000940.00000784::2006/10/17-06:25:01.085 ERR [RM] Active Resource = 000B35C0
    00000940.00000784::2006/10/17-06:25:01.085 ERR [RM] Resource State is 9, "LooksAlivePoll"
    00000940.00000784::2006/10/17-06:25:01.085 ERR [RM] Resource name is Print Spooler
    00000940.00000784::2006/10/17-06:25:01.085 ERR [RM] Resource type is Print Spooler

    This has now happened twice on one node and once on the other.

    Any help would be appreciated,
    -Jari


  2. #2
    Mark Guest

    RE: Cluster node hangs with event 1146

    Got this from EventID.net. Does it apply?

    As per Microsoft: "The problem is caused by a limitation of 100 physical
    disk drives. During the scanning of the drives, when the software reached
    physical drive number 101, a buffer is released for the second time, and this
    causes damage in the heap for the process".
    --
    Mark


  3. #3
    Jari H Guest

    RE: Cluster node hangs with event 1146

    Thanks for the reply.

    Unfortunately your suggestion does not apply in this case. The cluster has
    one file/print cluster group and three SQL server cluster groups each having
    two clustered disks, which makes a total of eight plus two local disks.

    -Jari


  4. #4
    nnowak43@gmail.com Guest

    Re: Cluster node hangs with event 1146

    I am seeing exactly the same thing on a Windows 2003 4+1
    active/passive cluster, and I don't have over 100 disks either. In the
    event log there is Event ID 1146 stating that the cluster resource
    monitor died unexpectedly, followed by many events 50 and 57. I had to
    manually power the server off to trigger failover. So far it has only
    happened on one node of the cluster.

    I can't seem to find any information on this anywhere other than this
    group. Any help would be greatly appreciated.

    -Nolan


  5. #5
    jeremydavila
    Join Date: Apr 2007
    Posts: 1

    Cluster

    Hi all,

    Did anyone get an answer for this? I'm having the same issue at my office. I have a 2-node cluster using a SCSI disk array. Node A always freezes up, and I have to manually power it off and back on to get my network shares back.

  6. #6
    Mark Guest

    Re: Cluster node hangs with event 1146

    Not sure if this applies, but EventID.net shows the following for 1146:

    As per Microsoft: "The problem is caused by a limitation of 100 physical
    disk drives. During the scanning of the drives, when the software reached
    physical drive number 101, a buffer is released for the second time, and this
    causes damage in the heap for the process". See M314753 for more details.
    --
    Mark


  7. #7
    tony@taylor-townsend.com Guest

    Re: Cluster node hangs with event 1146

    I have the exact same problem on a Windows 2003 R2 cluster in an
    active/passive file server role. Does anyone have any ideas?


  8. #8
    Join Date: Nov 2008
    Posts: 1

    Re: Cluster node hangs with event 1146

    I have a theory. This happened to us yesterday in our production environment: Cluster Services hung on both nodes. From what I can read in the cluster.log, a resource (a disk in our case) errored, apparently because of some kind of file corruption. The resource monitor then hung while trying to restart it, which for some reason took down the rest of the resources. Restarting one node and failing over fixed it for me. I guess this happens more often with third-party resources (ours is Double-Take). I have since found out that it is a good idea to give these types of resources their own resource monitor (the option is on the General tab when you right-click the resource and open its properties).

    This doesn't specifically solve the issue, but it does keep one failed resource (a secondary resource at that, in our case) from taking down the whole cluster. Nice of Double-Take to let me know that I should have done this from the beginning. As for fixing the specific issue, you need to work with the vendor: the third party for a third-party resource, or Microsoft if it's a common resource like a shared drive or IP address.
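
    To work out which resource was active when the resource monitor died, something like the following could be run against a copy of cluster.log. It is only a minimal Python sketch. The record layout is assumed from the excerpt in post #1, with one record per line (not wrapped as newsreaders tend to show it). The names find_rm_crashes, RECORD, KNOWN_CODES and FIELDS are made up for this example. The one NTSTATUS name in the lookup table is the standard name for 0xC0000194 in ntstatus.h. It is there only to make the output readable, not as a diagnosis.

```python
import re

# Assumed record layout, taken from the log excerpt in post #1:
#   <pid>.<tid>::<YYYY/MM/DD-HH:MM:SS.mmm> <LEVEL> <message>
RECORD = re.compile(
    r"^(?P<thread>[0-9a-fA-F]{8}\.[0-9a-fA-F]{8})::"
    r"(?P<time>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<msg>.*)$"
)

# Only the exception code seen in this thread is listed here.
KNOWN_CODES = {0xC0000194: "STATUS_POSSIBLE_DEADLOCK"}

# Message prefixes that carry a plain text value after them.
FIELDS = {
    "[RM] Resource name is ": "resource",
    "[RM] Resource type is ": "type",
}


def find_rm_crashes(lines):
    """Return one dict per '[RM] Exception' block, with the resource it names."""
    crashes = []
    open_by_thread = {}  # pid.tid -> crash dict still being filled in
    for raw in lines:
        m = RECORD.match(raw.strip())
        if m is None:
            continue  # not a log record (blank or wrapped line)
        thread, msg = m.group("thread"), m.group("msg")

        if msg.startswith("[RM] Exception. Code ="):
            code = re.search(r"Code\s*=\s*0x([0-9a-fA-F]+)", msg)
            value = int(code.group(1), 16) if code else None
            crash = {
                "time": m.group("time"),
                "code": value,
                "code_name": KNOWN_CODES.get(value, "unknown"),
                "resource": None,
                "type": None,
                "state": None,
            }
            crashes.append(crash)
            open_by_thread[thread] = crash
            continue

        crash = open_by_thread.get(thread)
        if crash is None:
            continue  # no exception seen yet on this thread

        if msg.startswith("[RM] Resource State is "):
            quoted = re.search(r'"([^"]+)"', msg)
            crash["state"] = quoted.group(1) if quoted else None
        for prefix, key in FIELDS.items():
            if msg.startswith(prefix):
                crash[key] = msg[len(prefix):].strip()
    return crashes


# A few records copied from post #1, one per line.
SAMPLE = [
    "00000940.00000784::2006/10/17-06:25:00.272 ERR [RM] Exception. Code = 0xc0000194, Address = 0x77E55E02",
    "00000940.00000784::2006/10/17-06:25:00.272 ERR [RM] Exception parameters: 0, d2fa2000, fffffff4, 7ffdb000",
    "00000940.00000784::2006/10/17-06:25:01.085 ERR [RM] Active Resource = 000B35C0",
    '00000940.00000784::2006/10/17-06:25:01.085 ERR [RM] Resource State is 9, "LooksAlivePoll"',
    "00000940.00000784::2006/10/17-06:25:01.085 ERR [RM] Resource name is Print Spooler",
    "00000940.00000784::2006/10/17-06:25:01.085 ERR [RM] Resource type is Print Spooler",
]

crashes = find_rm_crashes(SAMPLE)
for c in crashes:
    print(c["time"], c["code_name"], c["resource"], c["state"])
```

    If the same resource keeps turning up in these blocks, it is the obvious candidate for its own resource monitor, and the one to raise with its vendor.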
