I have a two-node Windows 2003 SP1 EE cluster connected to an MSA1000 SAN
via the integrated FC hub. The SAN is single-path, since this is our Dev/QA
environment.
When I reboot any node in the cluster, all physical disk resources go
offline while the rebooted server goes through POST. I get "Delayed
Write Failed" errors in the event log of the node that is still running.
Once the rebooted node is up and running, the cluster returns to
normal.
I'm worried that our production cluster may exhibit the same issue
when it goes live, even though it is built in a more robust fashion.
I'm open to suggestions.
The servers are HP DL145s, using Emulex FC2243 cards. If I simply
fail over a cluster group, everything works fine.
Thanks.
-DK