Exchange DAG Witness in Failed State

WARNING: Database availability group ‘DAG01’ witness is in a failed state. The database availability group requires the witness server to maintain quorum. Please use the Set-DatabaseAvailabilityGroup cmdlet to re-create the witness server and the directory.

To try to resolve the issue on the DAG, let's have a look at the failover cluster settings using Failover Cluster Manager.

In Failover Cluster Manager you will notice the following warnings.

When clicking on the Warning next to Name, you will see at the bottom of the page that the cluster resource is offline and the File Share Witness is Failed.

Now let's go ahead and right-click on the Cluster Name and click on Bring resource online.

After a few minutes of trying to bring the cluster resource back online, you will see the following error.

To try to resolve the issue, I will create a new file share on another server and then run the following Exchange PowerShell cmdlet.

Set-DatabaseAvailabilityGroup -Identity DAG01 -WitnessServer EXCAS01 -WitnessDirectory D:\EXDAG01_Witness

After running the cmdlet I was presented with the following error.

So by now this error had started making me talk in funny languages, so I decided to take a green tea break.

After the tea break I realised there was a post from Microsoft regarding the same issue, and they have provided a workaround for it.

Cause

If you use the Move Group functionality to manually move a group from one node to another, or if the cluster fails, a Possible Owners list is built for the entire group and for all of the resources in the group. The behavior that is described in the “Summary” section of this article can occur if one of the resources in the group that does not come online does not have an online node listed in the Possible Owners list.

Resolution

To work around this behavior, bring the node that was taken offline back online. Move the group back to the node that you just brought online, and then check the Possible Owners list of each resource to verify that the node on which the group did not come online is listed.

If you cannot bring the node online because the node truly failed, add the proper node to the Possible Owners list:

1. In Cluster Administrator, open the General properties of each resource and review the Possible Owners list. After you find the resource that has a blank Possible Owners list and a dimmed Modify button, note the name of this resource.
2. Open a command prompt and type the following command, where name of resource is the name of the resource that you noted in step 1:
cluster res name of resource /listowners
A list of possible owners of the resource is displayed. This list indicates that the only possible owner is the node that is down.
3. From the command prompt, run the following command to add the other node to the Possible Owners list, where missing node is the node that is down:
cluster res name of resource /addowner:missing node
4. In Cluster Administrator, bring the group that was previously in the offline state online again.

NOTE: After the group comes online, if you have problems reviewing the resources in Cluster Administrator, quit Cluster Administrator, and then reconnect to the cluster.
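
On servers where the FailoverClusters PowerShell module is available, the same check and fix can probably be done without cluster.exe. This is only a minimal sketch: the resource name and node name below are hypothetical placeholders and are not taken from the Microsoft article. Note that Set-ClusterOwnerNode replaces the whole owner list, so the sketch merges the existing owners with the node being added.

```powershell
Import-Module FailoverClusters

# Hypothetical values for illustration only; substitute your own.
$resourceName = 'File Share Witness'   # the resource noted in step 1
$nodeToAdd    = 'EXMBX02'              # assumed name of the node to add

# Rough equivalent of "cluster res <name> /listowners"
Get-ClusterResource -Name $resourceName | Get-ClusterOwnerNode

# Collect the names of the current possible owners (may be empty)
$current = (Get-ClusterResource -Name $resourceName | Get-ClusterOwnerNode).OwnerNodes |
    ForEach-Object { $_.Name }

# Merge in the extra node, drop empty entries and duplicates
$owners = @($current) + $nodeToAdd | Where-Object { $_ } | Select-Object -Unique

# Rough equivalent of "cluster res <name> /addowner:<node>"
Set-ClusterOwnerNode -Resource $resourceName -Owners $owners
```

Running the Get-ClusterOwnerNode line again afterwards should show whether the list now contains the node you added.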

Let's apply this workaround to the Exchange environment, starting with the following command in cmd or PowerShell.

Cluster res (Press Enter)

As you can see, the “File Share Witness” resource is in a failed state. Let's go ahead and view the “OwnerList” of this resource.

Next I switched to PowerShell and ran the following commands to get the cluster resource.

Import-Module FailoverClusters

Get-ClusterResource "File Share Witness (\\excas01.m.biz\EXDAG01.m.biz)"

After running the PowerShell cmdlet, I was yet again faced with another error. 🙁 Fun times.

Now I need to re-create the “File Share Witness” resource object in “Failover Clustering”, so let's start by running the following cmdlet in Exchange PowerShell.

Set-DatabaseAvailabilityGroup -Identity DAG01

The “Set-DatabaseAvailabilityGroup” cmdlet did not resolve the issue, and I received the error below.

Next, let's run the Cluster Quorum Wizard to temporarily change the quorum from Node and File Share Majority to Node Majority.
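
If you prefer the shell to the wizard, the FailoverClusters module has cmdlets that should make the same change. This is a sketch under the assumption that it is run on a DAG member with the module installed; I used the wizard myself, so treat it as an alternative rather than the exact steps from this post.

```powershell
Import-Module FailoverClusters

# Record the current quorum configuration before touching it
Get-ClusterQuorum | Format-List *

# Temporarily switch to Node Majority (no witness resource involved)
Set-ClusterQuorum -NodeMajority

# Confirm the change took effect
Get-ClusterQuorum | Format-List *
```

The change is meant to be temporary: re-running Set-DatabaseAvailabilityGroup with the witness parameters afterwards is what puts a file share witness back in place.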

Rerun the Set-DatabaseAvailabilityGroup cmdlet with the WitnessServer and WitnessDirectory parameters, and after so much fun 🙂 we have the cluster back alive again.

Let's confirm all is good by running the following cmdlet in Exchange PowerShell.

Get-DatabaseAvailabilityGroup DAG01 -Status |ft Name,WitnessServer,WitnessDirectory,WitnessShareInUse

The order below can be followed to resolve the issue in the future.

In Failover Cluster Manager, change the quorum to Node Majority.
Then run the Set-DatabaseAvailabilityGroup cmdlet again with the WitnessServer and WitnessDirectory parameters.
Verify that the DAG status is healthy and not showing any warnings or errors.
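
For the last verification step, a few extra checks from the cluster side can complement the Get-DatabaseAvailabilityGroup output. This is a minimal sketch, assuming it is run in the Exchange Management Shell on a DAG member that also has the FailoverClusters module; what counts as "healthy" in your environment is for you to judge from the output.

```powershell
Import-Module FailoverClusters

# List any cluster resources that are not online (ideally returns nothing)
Get-ClusterResource | Where-Object { $_.State -ne 'Online' }

# Show the quorum configuration the cluster is using now
Get-ClusterQuorum | Format-List *

# Exchange-side replication and cluster health checks for the local server
Test-ReplicationHealth
```

If the first command still lists the File Share Witness resource, the witness has not come back and the steps above are worth repeating.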

#ThatLazyAdmin