Hi,
I have been having this issue for a while now. I have two physical Exchange servers in a DAG, both on Exchange 2013 CU1. Randomly, every few days and at various times, Server1 will fail all of its databases over to Server2. I'll redistribute them, and then Server2 will fail all of its databases over to Server1. In short, both servers have at times failed their databases over.
I started with this: http://technet.microsoft.com/en-us/library/dd351258(v=exchg.150).aspx which led me to set up monitoring of the Microsoft-Exchange-ManagedAvailability logs. I can tell you that replication tests work fine, and the health of all the databases is fine.
My monitoring turned up the following errors. In this example, "EX0001" was the server that failed all of its databases over to "EX0002". It seems pretty clear to me that Exchange Managed Availability is finding an issue with EWS, attempting to recycle the MSExchangeServicesAppPool app pool, and cannot due to "Throttling", so it fails the DBs over. That's my best guess. The problem is I don't know how to fix this. I've run through troubleshooting the EWS health set and nothing really turns up: http://technet.microsoft.com/en-us/library/ms.exch.scom.ews.protocol(v=exchg.150).aspx
| EX0001 |
| 1011 |
| Microsoft-Exchange-ManagedAvailability |
| Recovery |
| Microsoft-Exchange-ManagedAvailability/RecoveryActionLogs |
| 5/22/2014 7:06:43 AM |
| Warning (Info) |
| 1520183 |
| NT AUTHORITY\SYSTEM |
| RecycleApplicationPool-MSExchangeServicesAppPool-EWSSelfTestRestart: Throttling rejected the operation |
| EX0001 |
| 4 |
| Microsoft-Exchange-ManagedAvailability |
| Monitoring |
| Microsoft-Exchange-ManagedAvailability/Monitoring |
| 5/22/2014 7:17:27 AM |
| Error (Info) |
| 8287 |
| NT AUTHORITY\SYSTEM |
The EWS.Protocol health set has detected a problem on EX0001 beginning at 5/22/2014 10:55:12 AM (UTC). The health manager is reporting that recycling the MSExchangeServicesAppPool app pool has failed to restore health and it has tried to fail over active copies of local databases to a healthy server. Attempts to auto-recover from this condition have failed and requires Administrator attention. Details below: <b>MachineName:</b> EX0001 <b>ServiceName:</b> EWS.Protocol <b>ResultName:</b> EWSSelfTestProbe/MSExchangeServicesAppPool <b>Error:</b> System.Exception: System.Exception: >>> PRIMARY ENDPOINT VERIFICATION EwsUrl=https://localhost:444/ews/exchange.asmx UserName/Password=HealthMailbox663889950a344102878cede289222a46@domain.local/xGAVmP[^jn{qGgOx0Jtx:4X+-j@?d%XM?@7yErsoFF[_#u[%LcX=0hPzMln#1PiQ/7z?14rJJs8Dc)AYLi0F9mU)bMpL_gj{Q3*[Yt1:UgX=:CkQc=[Xuagz%Od=|@tt AuthMethod=CAFE ConvertId (Attempt #0) Status=The request failed. The operation has timed out ConvertId (Attempt #0) Latency=59521.1327 ConvertId (Attempt #1) Status=iteration 1; 55.427003 seconds elapsed ConvertId (Attempt #1) Latency=55427.003 at Microsoft.Exchange.Monitoring.ActiveMonitoring.Ews.Probes.EWSCommon.RetrySoapActionAndThrow(Action operation, String soapAction, ExchangeServiceBase service) at Microsoft.Exchange.Monitoring.ActiveMonitoring.Ews.Probes.EWSGenericProbeCommon.ExecuteEWSCall(String endPoint, String operation, Boolean verifyAffinity) at Microsoft.Exchange.Monitoring.ActiveMonitoring.Ews.Probes.EWSGenericProbeCommon.DoWorkInternal(CancellationToken cancellationToken) <b>Exception:</b> System.Exception: System.Exception: System.Exception: >>> PRIMARY ENDPOINT VERIFICATION EwsUrl=https://localhost:444/ews/exchange.asmx UserName/Password=HealthMailbox663889950a344102878cede289222a46@domain.local/xGAVmP[^jn{qGgOx0Jtx:4X+-j@?d%XM?@7yErsoFF[_#u[%LcX=0hPzMln#1PiQ/7z?14rJJs8Dc)AYLi0F9mU)bMpL_gj{Q3*[Yt1:UgX=:CkQc=[Xuagz%Od=|@tt AuthMethod=CAFE ConvertId (Attempt #0) Status=The request failed. 
The operation has timed out ConvertId (Attempt #0) Latency=59521.1327 ConvertId (Attempt #1) Status=iteration 1; 55.427003 seconds elapsed ConvertId (Attempt #1) Latency=55427.003 at Microsoft.Exchange.Monitoring.ActiveMonitoring.Ews.Probes.EWSCommon.RetrySoapActionAndThrow(Action operation, String soapAction, ExchangeServiceBase service) at Microsoft.Exchange.Monitoring.ActiveMonitoring.Ews.Probes.EWSGenericProbeCommon.ExecuteEWSCall(String endPoint, String operation, Boolean verifyAffinity) at Microsoft.Exchange.Monitoring.ActiveMonitoring.Ews.Probes.EWSGenericProbeCommon.DoWorkInternal(CancellationToken cancellationToken) at Microsoft.Exchange.Monitoring.ActiveMonitoring.Ews.Probes.EWSCommon.ThrowError(Object key, Object exceptiondata, String logDetails) at Microsoft.Exchange.Monitoring.ActiveMonitoring.Ews.Probes.EWSGenericProbeCommon.DoWorkInternal(CancellationToken cancellationToken) at Microsoft.Exchange.Monitoring.ActiveMonitoring.Ews.Probes.EWSGenericProbeCommon.RunEWSGenericProbe(CancellationToken cancellationToken) at Microsoft.Exchange.WorkerTaskFramework.WorkItem.Execute(CancellationToken joinedToken) at Microsoft.Exchange.WorkerTaskFramework.WorkItem.<>c__DisplayClass2.<StartExecuting>b__0() at System.Threading.Tasks.Task.Execute() <b>ExecutionContext:</b> EWSGenericProbeError:Exception=System.Exception: System.Exception: >>> PRIMARY ENDPOINT VERIFICATION EwsUrl=https://localhost:444/ews/exchange.asmx UserName/Password=HealthMailbox663889950a344102878cede289222a46@domain.local/xGAVmP[^jn{qGgOx0Jtx:4X+-j@?d%XM?@7yErsoFF[_#u[%LcX=0hPzMln#1PiQ/7z?14rJJs8Dc)AYLi0F9mU)bMpL_gj{Q3*[Yt1:UgX=:CkQc=[Xuagz%Od=|@tt AuthMethod=CAFE ConvertId (Attempt #0) Status=The request failed. 
The operation has timed out ConvertId (Attempt #0) Latency=59521.1327 ConvertId (Attempt #1) Status=iteration 1; 55.427003 seconds elapsed ConvertId (Attempt #1) Latency=55427.003 at Microsoft.Exchange.Monitoring.ActiveMonitoring.Ews.Probes.EWSCommon.RetrySoapActionAndThrow(Action operation, String soapAction, ExchangeServiceBase service) at Microsoft.Exchange.Monitoring.ActiveMonitoring.Ews.Probes.EWSGenericProbeCommon.ExecuteEWSCall(String endPoint, String operation, Boolean verifyAffinity) at Microsoft.Exchange.Monitoring.ActiveMonitoring.Ews.Probes.EWSGenericProbeCommon<b>FailureContext:</b> <b>ResultType:</b> Failed <b>IsNotified:</b> False <b>DeploymentId:</b> 0 <b>RetryCount:</b> 0 <b>ExtensionXml:</b> <b>Version:</b> <b>StateAttribute1:</b> EWS <b>StateAttribute2:</b> Unknown <b>StateAttribute3:</b> <b>StateAttribute4:</b> <b>StateAttribute5:</b> <b>StateAttribute6:</b> 0 <b>StateAttribute7:</b> 0 <b>StateAttribute8:</b> 0 <b>StateAttribute9:</b> 0 <b>StateAttribute10:</b> 0 <b>StateAttribute11:</b> <b>StateAttribute12:</b> <b>StateAttribute13:</b> <b>StateAttribute14:</b> <b>StateAttribute14:</b><b>StateAttribute16:</b> 0 <b>StateAttribute17:</b> 0 <b>StateAttribute18:</b> 0 <b>StateAttribute19:</b> 0 <b>StateAttribute20:</b> 120011 <b>StateAttribute21:</b> [000.000] EWSCommon start: 5/22/2014 11:13:13 AM [000.000] Configuring EWScommon [000.000] Probe time limit: 120000ms, HTTP timeout: 59500ms, RetryCount: 1 [000.047] using authN: CAFEHealthMailbox663889950a344102878cede289222a46@domain.local xGAVmP[^jn{qGgOx0Jtx:4X+-j@?d%XM?@7yErsoFF[_#u[%LcX=0hPzMln#1PiQ/7z?14rJJs8Dc)AYLi0F9mU)bMpL_gj{Q3*[Yt1:UgX=:CkQc=[Xuagz%Od=|@tt [000.047] using HTTP request timeout: 59500 ms [000.047] action iteration 0 [000.047] starting (total time left 119954 ms) [059.568] action threw Microsoft.Exchange.WebServices.Data.ServiceRequestException: The request failed. 
The operation has timed out [064.584] action iteration 1 [064.584] starting (total time left 55416 ms) [120.011] action wait timed out [120.011] action threw System.TimeoutException: iteration 1; 55.427003 seconds elapsed <b>StateAttribute22:</b> <b>StateAttribute23:</b><b>StateAttribute24:</b> <b>StateAttribute25:</b> <b>PoisonedCount:</b> 0 <b>ExecutionId:</b> 32395373 <b>ExecutionStartTime:</b> 5/22/2014 11:13:13 AM <b>ExecutionEndTime:</b> 5/22/2014 11:15:13 AM <b>ResultId:</b> 253233015 <b>SampleValue:</b> 0 ------------------------------------------------------------------------------- States of all monitors within the health set: Note: Data may be stale. To get current data, run: Get-ServerHealth -Identity 'EX0001' -HealthSet 'EWS.Protocol' State Name TargetResource HealthSet AlertValue ServerComponent ----- ---- -------------- --------- ---------- --------------- NotApplicable EWSSelfTestMonitor MSExchangeServicesAppPool EWS.Protocol Unhealthy None NotApplicable EWSDeepTestMonitor DG01DB15 EWS.Protocol Unhealthy None NotApplicable PrivateWorkingSetWarningThresholdExc... msexchangeservicesapppool EWS.Protocol Healthy None NotApplicable ProcessProcessorTimeErrorThresholdEx... msexchangeservicesapppool EWS.Protocol Healthy None NotApplicable ExchangeCrashEventErrorThresholdExce... msexchangeservicesapppool EWS.Protocol Healthy None States of all health sets: Note: Data may be stale. 
To get current data, run: Get-HealthReport -Identity 'EX0001' State HealthSet AlertValue LastTransitionTime MonitorCount ----- --------- ---------- ------------------ ------------ NotApplicable Autodiscover.Protocol Healthy 3/8/2014 12:46:17 AM 4 NotApplicable ActiveSync.Protocol Healthy 3/8/2014 1:15:35 AM 7 NotApplicable ActiveSync Healthy 3/8/2014 2:08:15 AM 3 NotApplicable EDS Healthy 5/22/2014 5:19:41 AM 13 NotApplicable ECP Healthy 3/8/2014 1:15:27 AM 3 NotApplicable EventAssistants Healthy 5/22/2014 5:48:56 AM 28 NotApplicable EWS.Protocol Unhealthy 5/22/2014 7:07:12 AM 5 NotApplicable FIPS Healthy 5/21/2014 10:24:01 PM 18 NotApplicable AD Healthy 2/23/2014 10:42:29 PM 10 NotApplicable OWA.Protocol.Dep Healthy 5/22/2014 5:19:40 AM 1 NotApplicable Monitoring Unhealthy 5/22/2014 5:35:31 AM 9 Online HubTransport Unhealthy 5/22/2014 5:19:43 AM 138 NotApplicable DataProtection Healthy 5/22/2014 7:08:02 AM 201 NotApplicable AntiSpam Healthy 5/22/2014 5:19:40 AM 4 NotApplicable Network Healthy 5/21/2014 10:36:54 PM 1 NotApplicable OWA.Protocol Healthy 3/8/2014 1:15:34 AM 5 NotApplicable MailboxMigration Healthy 3/8/2014 12:46:18 AM 4 NotApplicable MRS Healthy 3/8/2014 12:44:35 AM 9 NotApplicable MailboxTransport Healthy 5/22/2014 5:19:41 AM 57 NotApplicable PublicFolders Healthy 5/21/2014 10:44:15 PM 4 NotApplicable RPS Healthy 2/23/2014 11:38:33 PM 1 NotApplicable Outlook.Protocol Healthy 4/22/2014 11:04:18 AM 3 NotApplicable UserThrottling Healthy 5/22/2014 5:51:13 AM 7 NotApplicable SiteMailbox Healthy 3/8/2014 2:10:53 AM 3 NotApplicable UM.Protocol Healthy 5/22/2014 5:19:41 AM 17 NotApplicable Store Healthy 5/22/2014 5:19:43 AM 225 NotApplicable MSExchangeCertificateDeplo... 
Disabled 1/1/0001 12:00:00 AM 2 NotApplicable DAL Healthy 8/2/2013 12:59:03 AM 16 NotApplicable Search Healthy 5/22/2014 5:37:18 AM 269 Online EWS.Proxy Healthy 5/5/2014 1:34:08 AM 1 Online RPS.Proxy Healthy 5/5/2014 1:34:38 AM 13 Online OAB.Proxy Healthy 5/5/2014 1:34:37 AM 1 Online ECP.Proxy Healthy 5/5/2014 1:34:17 AM 4 Online OWA.Proxy Healthy 5/5/2014 1:34:25 AM 2 Online Outlook.Proxy Healthy 5/5/2014 1:34:08 AM 1 Online Autodiscover.Proxy Healthy 5/5/2014 1:34:08 AM 1 Online ActiveSync.Proxy Healthy 5/5/2014 1:34:35 AM 1 Online RWS.Proxy Healthy 5/5/2014 1:34:18 AM 10 NotApplicable Autodiscover Healthy 5/21/2014 10:24:01 PM 2 Online FrontendTransport Healthy 5/15/2014 12:49:31 AM 11 NotApplicable EWS Unhealthy 5/22/2014 7:06:01 AM 2 NotApplicable OWA Healthy 2/23/2014 11:37:56 PM 1 NotApplicable Outlook Healthy 3/8/2014 12:45:14 AM 5 Online UM.CallRouter Healthy 5/22/2014 5:19:41 AM 7 NotApplicable RemoteMonitoring Healthy 8/2/2013 12:58:03 AM 1 NotApplicable POP.Protocol Healthy 5/20/2014 9:22:12 AM 5 NotApplicable IMAP.Protocol Healthy 5/20/2014 9:22:21 AM 5 Online POP.Proxy Healthy 3/7/2014 1:31:10 PM 1 Online IMAP.Proxy Healthy 3/7/2014 1:31:10 PM 1 NotApplicable IMAP Healthy 5/20/2014 9:23:32 AM 2 NotApplicable POP Healthy 5/20/2014 9:17:18 AM 2 NotApplicable Antimalware Healthy 5/15/2014 8:33:13 AM 8 NotApplicable FfoQuarantine Healthy 8/2/2013 12:58:20 AM 1 Online Transport Healthy 5/22/2014 5:38:00 AM 9 NotApplicable Security Healthy 3/8/2014 12:46:09 AM 3 NotApplicable Datamining Healthy 3/8/2014 12:45:44 AM 3 NotApplicable Provisioning Healthy 3/8/2014 12:45:40 AM 3 NotApplicable ProcessIsolation Healthy 3/8/2014 12:47:05 AM 12 NotApplicable TransportSync Healthy 3/8/2014 12:45:37 AM 3 NotApplicable MessageTracing Healthy 3/8/2014 12:44:56 AM 3 NotApplicable CentralAdmin Healthy 3/8/2014 12:45:12 AM 3 NotApplicable OAB Healthy 8/2/2013 1:02:27 AM 3 NotApplicable Calendaring Healthy 8/2/2013 1:02:07 AM 3 NotApplicable PushNotifications.Protocol 
Healthy 2/23/2014 10:46:17 PM 3 NotApplicable Ediscovery.Protocol Healthy 5/21/2014 10:38:16 PM 1 NotApplicable HDPhoto Healthy 5/6/2014 9:36:25 AM 1 NotApplicable Clustering Healthy 3/8/2014 12:45:34 AM 4 NotApplicable DiskController Healthy 4/22/2014 2:51:30 AM 1 NotApplicable MailboxSpace Healthy 5/22/2014 6:16:51 AM 96 NotApplicable FreeBusy Healthy 5/22/2014 5:32:54 AM 1 Note: Subsequent detected alerts are suppressed until the health set is healthy again. |
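To make the probe trace in StateAttribute21 easier to read, here is a rough Python sketch that splits the flattened trace into timestamped entries and works out how long each EWS attempt ran before it threw. The trace text is copied from the event above with the credentials line left out; the function names `parse_trace` and `attempt_durations` are my own and not part of any Exchange tooling.

```python
import re

# Excerpt of the StateAttribute21 probe trace from the event above
# (the "using authN" line with the health mailbox credentials is omitted).
TRACE = (
    "[000.000] EWSCommon start: 5/22/2014 11:13:13 AM "
    "[000.000] Configuring EWScommon "
    "[000.000] Probe time limit: 120000ms, HTTP timeout: 59500ms, RetryCount: 1 "
    "[000.047] using HTTP request timeout: 59500 ms "
    "[000.047] action iteration 0 "
    "[000.047] starting (total time left 119954 ms) "
    "[059.568] action threw Microsoft.Exchange.WebServices.Data."
    "ServiceRequestException: The request failed. The operation has timed out "
    "[064.584] action iteration 1 "
    "[064.584] starting (total time left 55416 ms) "
    "[120.011] action wait timed out "
    "[120.011] action threw System.TimeoutException: "
    "iteration 1; 55.427003 seconds elapsed"
)

# One entry = "[sss.mmm]" followed by text up to the next "[sss.mmm]" or end.
ENTRY = re.compile(r"\[(\d{3}\.\d{3})\]\s*(.*?)(?=\s*\[\d{3}\.\d{3}\]|$)", re.S)


def parse_trace(text):
    """Split a flattened probe trace into (seconds, message) tuples."""
    return [(float(t), msg.strip()) for t, msg in ENTRY.findall(text)]


def attempt_durations(entries):
    """Map each 'action iteration N' to the seconds until its 'action threw'."""
    durations = {}
    current, started = None, None
    for t, msg in entries:
        m = re.match(r"action iteration (\d+)", msg)
        if m:
            current, started = int(m.group(1)), t
        elif msg.startswith("action threw") and current is not None:
            durations[current] = round(t - started, 3)
            current = None
    return durations


if __name__ == "__main__":
    for seconds, message in parse_trace(TRACE):
        print(f"{seconds:8.3f}s  {message}")
    print(attempt_durations(parse_trace(TRACE)))
```

Read this way, the first ConvertId call used up essentially the whole 59500 ms HTTP timeout, and the retry ran until the 120000 ms probe time limit expired. So the probe never got an error response from EWS on localhost:444, it simply never got an answer in time.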