Hi,
We are running Exchange 2007 SP3 RU13 on Windows 2003 R2 with SP2 in a 2003 native AD environment and recently decided to upgrade to Exchange 2013. We installed a pair of new Dell R420 servers running Windows 2008 R2 Enterprise, then installed Exchange 2013 SP1 on them. This all went fine and the servers are running stably.
We connected the second NIC of each server to the other via a separate switch. The second NIC has Client for Microsoft Networks and File and Printer Sharing disabled, and it sits on a totally separate subnet with no DNS or gateway address assigned. DAG setup was run and completed OK. I created the DAG network in Exchange and enabled replication on it; I also left replication enabled across the production LAN. Finally, I went into the advanced network settings and made sure the replication network was below the production network in the binding order.
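For anyone comparing their own setup against the one described above, here is a minimal sketch of those checks in Python. Everything in it is an assumption made for illustration: the `Nic` record, the `check_replication_nic` helper, and the example addresses are hypothetical and are not read from Windows or Exchange. It only encodes the rules stated in the paragraph (no gateway, no DNS, separate subnet, replication NIC below production in the binding order).

```python
from dataclasses import dataclass, field
from ipaddress import ip_network
from typing import List, Optional


@dataclass
class Nic:
    """Hypothetical description of one network adapter (not a Windows API type)."""
    name: str
    subnet: str                      # CIDR notation, placeholder values below
    gateway: Optional[str] = None
    dns_servers: List[str] = field(default_factory=list)
    binding_position: int = 0        # 0 = top of the binding order


def check_replication_nic(prod: Nic, repl: Nic) -> List[str]:
    """Return a list of deviations from the setup described in the post."""
    problems = []
    if repl.gateway:
        problems.append("replication NIC has a default gateway")
    if repl.dns_servers:
        problems.append("replication NIC has DNS servers configured")
    if ip_network(prod.subnet).overlaps(ip_network(repl.subnet)):
        problems.append("replication subnet overlaps the production subnet")
    if repl.binding_position <= prod.binding_position:
        problems.append("replication NIC is not below production in the binding order")
    return problems


# Placeholder values for illustration only; substitute your own.
prod = Nic("Production", "192.168.10.0/24", gateway="192.168.10.1",
           dns_servers=["192.168.10.5"], binding_position=0)
repl = Nic("Replication", "10.10.10.0/24", binding_position=1)
print(check_replication_nic(prod, repl))
```

An empty list means the described rules hold for the values supplied; it says nothing about what the servers are actually using at runtime.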
After an hour or two the BSODs started. Both servers would crash within a few minutes of each other and reboot with a stop error (bugcheck 0xF4, CRITICAL_OBJECT_TERMINATION). I have included the contents of the dump file below. This happens every few hours, and it always seems to be the server hosting the passive DB copies that crashes first, followed by the server hosting the active copies. Note that if we disable the replication NIC on both servers they do not crash.
I got the impression from somewhere that perhaps the servers had mixed up the binding order and were trying to use the replication network as primary, losing access to AD and rebooting (which I have read is the behaviour for Exchange now). Judging by the dump, it appears the Exchange Health Manager worker process (PROCESS_NAME: MSExchangeHMWo) has killed wininit.exe, which causes the crash.
Thanks!!!
The crash dump text is below:
CRITICAL_OBJECT_TERMINATION (f4)
A process or thread crucial to system operation has unexpectedly exited or been
terminated.
Several processes and threads are necessary for the operation of the
system; when they are terminated (for any reason), the system can no
longer function.
Arguments:
Arg1: 0000000000000003, Process
Arg2: fffffa80192ebb30, Terminating object
Arg3: fffffa80192ebe10, Process image file name
Arg4: fffff80001dc37b0, Explanatory message (ascii)
Debugging Details:
------------------
PROCESS_OBJECT: fffffa80192ebb30
DEBUG_FLR_IMAGE_TIMESTAMP: 0
MODULE_NAME: wininit
FAULTING_MODULE: 0000000000000000
PROCESS_NAME: MSExchangeHMWo
BUGCHECK_STR: 0xF4_MSExchangeHMWo
CUSTOMER_CRASH_COUNT: 1
DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP
CURRENT_IRQL: 0
LAST_CONTROL_TRANSFER: from fffff80001e4cab2 to fffff80001abebc0
STACK_TEXT:
fffff880`0d7f39c8 fffff800`01e4cab2 : 00000000`000000f4 00000000`00000003 fffffa80`192ebb30 fffffa80`192ebe10 : nt!KeBugCheckEx
fffff880`0d7f39d0 fffff800`01df7abb : ffffffff`ffffffff fffffa80`1bcf3060 fffffa80`192ebb30 fffffa80`383ea060 : nt!PspCatchCriticalBreak+0x92
fffff880`0d7f3a10 fffff800`01d77674 : ffffffff`ffffffff 00000000`00000001 fffffa80`192ebb30 00000000`00000008 : nt! ?? ::NNGAKEGL::`string'+0x17486
fffff880`0d7f3a60 fffff800`01abde53 : fffffa80`192ebb30 fffff880`ffffffff fffffa80`1bcf3060 00000000`00000000 : nt!NtTerminateProcess+0xf4
fffff880`0d7f3ae0 00000000`7772157a : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x13
00000000`34eed638 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x7772157a
STACK_COMMAND: kb
FOLLOWUP_NAME: MachineOwner
IMAGE_NAME: wininit.exe
FAILURE_BUCKET_ID: X64_0xF4_MSExchangeHMWo_IMAGE_wininit.exe
BUCKET_ID: X64_0xF4_MSExchangeHMWo_IMAGE_wininit.exe
Followup: MachineOwner
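
To pull the key facts out of analysis text like the above, a small parser can help when comparing dumps from both servers. This is only a sketch: `parse_analysis` and `summarize` are hypothetical helper names, the sample string is copied from the dump above, and the summary assumes that PROCESS_NAME is the process in whose context the bugcheck was raised and IMAGE_NAME is the critical process that was terminated.

```python
import re
from typing import Dict

# Matches "UPPER_CASE_KEY: value" lines such as "PROCESS_NAME: MSExchangeHMWo".
FIELD_RE = re.compile(r"^([A-Z_]+):\s+(\S.*)$")


def parse_analysis(text: str) -> Dict[str, str]:
    """Collect KEY: value pairs from pasted bugcheck analysis text."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line.strip())
        if match:
            # Keep the first occurrence of each key.
            fields.setdefault(match.group(1), match.group(2).strip())
    return fields


def summarize(fields: Dict[str, str]) -> str:
    """One-line summary; the interpretation of the fields is an assumption."""
    return "{} terminated {} (bugcheck {})".format(
        fields.get("PROCESS_NAME", "?"),
        fields.get("IMAGE_NAME", "?"),
        fields.get("BUGCHECK_STR", "?"),
    )


# Excerpt copied from the dump text in this post.
sample = """\
MODULE_NAME: wininit
PROCESS_NAME: MSExchangeHMWo
BUGCHECK_STR: 0xF4_MSExchangeHMWo
STACK_TEXT:
IMAGE_NAME: wininit.exe
"""

fields = parse_analysis(sample)
print(summarize(fields))
```

Header lines with no value, such as `STACK_TEXT:`, and mixed-case lines such as `Arg1:` are skipped by the pattern.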