Beginning with Exchange 2013 SP1, Microsoft introduced the IP-less DAG as an option. In this article, we explore how to create an IP-less DAG, as well as the pros and cons of deploying one.
Note: With Exchange 2016, IP-less DAGs will become the default configuration.
Why would I want to do this?
Quite simply, it’s easier to set up.
With an IP-less DAG, you don’t need to pre-stage a Cluster Name Object (CNO) in Active Directory. This is especially useful for organizations that have implemented the split-permission model. You also don’t need to burn an IP address for the cluster.
Any downsides?
A couple of gotchas.
Before you implement an IP-less DAG, which is a DAG without a cluster Administrative Access Point (AAP), you need to check compatibility with third-party programs. Backup software is typically the sticking point for migrating to the new DAGs. You need to make sure your third-party software doesn’t require an AAP.
The lack of an AAP means the cluster cannot be managed with Failover Cluster Manager either. But the Exchange Team doesn’t want you messing around in there anyway, and for good reason: they want Exchange to manage the cluster. Let Exchange do the heavy lifting for you.
Finally, there is no conversion process to take an IP-based DAG to an IP-less DAG. You will need to create a new DAG.
Choosing an OS (maybe)
While Windows Server 2008 R2 and above support Exchange 2013 DAGs, only 2012 R2 can support an IP-less DAG. If you wish to go with 2008 R2 or 2012 RTM, you will need to create an IP-based DAG instead. For that purpose, I recommend Paul Cunningham’s blog post here.
Are IP-less DAGs the only benefit of going with 2012 R2?
Actually, no. If I haven’t convinced you yet then consider these two additional benefits.
The first is that 2012 includes clustering in its Standard Edition. This can result in some serious cost savings compared to 2008 R2 Enterprise, especially if you plan on building a 16-member DAG.
The second is that 2012 introduced the concept of dynamic quorum. Dynamic quorum automatically adjusts the votes needed to maintain quorum as servers go offline. Take, for example, a traditional five-node DAG. To maintain quorum, three servers must remain online.
If three of the five servers were to go offline, quorum would be lost and the databases would dismount.
With dynamic quorum, if two of the servers went offline, their votes would be removed. At this point, the quorum is recalculated for the three remaining servers. To maintain quorum, only two of the three remaining servers would need to be online. Should a third server go offline, that server’s vote would be removed, and the quorum would be recalculated for the two remaining servers. In many situations, dynamic quorum can successfully navigate a ‘last man standing’ scenario where only a single server remains operational.
As servers come back online votes are assigned back and the quorum is recalculated.
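To make the arithmetic above concrete, here is a minimal Python sketch of the two quorum rules. Treat it as a simplified model rather than a description of how Windows clustering is actually implemented. It assumes nodes fail one at a time, ignores the file share witness, and counts a tie as lost quorum, so it does not model the final ‘last man standing’ step. The function names are made up for this illustration.

```python
def majority(votes: int) -> int:
    """Smallest number of votes that is more than half of `votes`."""
    return votes // 2 + 1


def static_quorum_survives(total_nodes: int, failed: int) -> bool:
    """Fixed quorum: the vote total never changes as nodes fail."""
    online = total_nodes - failed
    return online >= majority(total_nodes)


def dynamic_quorum_survives(total_nodes: int, failed: int) -> bool:
    """Simplified dynamic quorum (an assumption-laden model).

    Nodes fail one at a time. After each failure the cluster survives
    only if the nodes still online hold a majority of the current votes;
    the failed node's vote is then removed before the next failure.
    """
    votes = total_nodes
    for _ in range(failed):
        online = votes - 1
        if online < majority(votes):
            return False  # quorum lost at this failure
        votes = online    # drop the failed node's vote and recalculate
    return True


if __name__ == "__main__":
    # Walk a five-node DAG through zero to three sequential failures.
    for failed in range(4):
        print(
            f"failed={failed}",
            f"static={static_quorum_survives(5, failed)}",
            f"dynamic={dynamic_quorum_survives(5, failed)}",
        )
```

For a five-node DAG, the fixed rule reports quorum lost at the third failure, while the dynamic rule still reports quorum with two nodes left, which matches the example described above.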
Lab Setup
In our example lab, we will have two multi-role servers. We also have a file server that will host our File Share Witness (FSW) directory. Based on the Exchange Team’s preferred architecture, we will use a single network for all replication and MAPI traffic. We will leave the DAG to configure its own networking. We will have two copies of each database, with both servers hosting an active database. Our lab will look like this.
Configure the File Share Witness
Before we create the DAG we must set the appropriate permissions on the server destined to host our File Share Witness (FSW).
Note: If you plan to use a split-role Client Access server for the File Share Witness you can skip this section. The permissions are already in place. Keep in mind that in 2016 you will not be able to split the CAS role out anymore.
To configure permissions:
- From the file server click Start >> Administrative Tools >> Computer Management.
- Expand Local Users and Groups.
- Select Groups.
- Double-click the Administrators local group.
- Click Add.
- Type Exchange Trusted Subsystem and click Check Names.
- Click OK twice.
Deploying an IP-less DAG
Now that our file share witness is ready, let us create a DAG. In Exchange Admin Center navigate to Servers >> Database Availability Groups and select the + (New) button.
On the DAG creation page specify the following criteria.
DAG name (1) – Whatever you wish to make it (we simply named ours ‘DAG’).
Witness server (2) – If you leave this blank Exchange looks for a split-role CAS server. Otherwise, enter the name of a non-Exchange server (in our example we specified our file server FQDN of ‘fs.skaro.local’).
Witness directory (3) – If you do not specify a directory Exchange will create a default directory in the root of C: (in our example we went with C:\Witness).
IP address (4) – For an IP-less DAG specify 255.255.255.255.
Click Save.
Adding members to the DAG
Now that the DAG is created we need to add our mailbox servers. You can see from the screenshot below that the Member Servers column is currently empty. Select the DAG and click the Manage DAG membership icon.
Click the + (Add) icon.
Select the servers you wish to add to the DAG and click the Add button. Click OK.
Click the Save button.
At this point, Exchange installs the failover cluster components on each mailbox server, checks whether the server already belongs to a DAG and, if not, adds it to the specified DAG.
Once complete click Close.
You can now see that our two multi-role servers are listed in the Member Servers column. We now have an IP-less DAG with two member servers and a file share witness. In our next article, we will explore adding database copies to each server. Stay tuned!
How about you? Have you implemented an IP-less DAG yet? How did it go? Any hiccups? We’d love to hear from you. Drop a comment below and let us know about your experience.
Kevin says
Hello Gareth,
First and foremost thank you for the article it was excellent. Question, in an IP-less DAG with 2 nodes or even number members, the quorum model or type is supposed to be Node and File Share Majority. However, my DAG shows it to be Majority, how can this be when I only have 2 Exchange Servers and 1 File Server as a witness? Is there something about the IP-less DAG and dynamic quorum that alters the model? Normally I wouldn’t have noticed but the cluster was going offline randomly and I thought perhaps the reason was related to the incorrect quorum type. I tried changing it with set-clusterquorum -NodeAndFileMajority with no success as that generates an error. I want to avoid deleting the DAG and starting from scratch. Any help would be appreciated.
Thank You
Kevin
Corneel Stirbu says
Hi Gareth,
I’m trying to set in place a DAG across 2 sites within the same forest/domain. The sites are connected with each other using VPN IPsec.
This is the current topology:
Site A: 2 GC domain controllers Windows Server 2012 r2 (one of the two is also owning all the FSMO roles), Filesrv and other servers
Site B: 1 domain controller GC Windows Server 2016 and Exchange Server 2016 hosting approx 250 mailboxes. The Exchange server was originally placed in Site B because that site had a better internet connection.
The plan is to create a second exchange server in Site A and to configure the DAG between the servers.
I’ve created a lab where I am testing this deployment.
So far I’ve prepared all the VMs, and there is a direct connection between Site A and Site B (no filtering on the firewall).
Both exchange servers are online and can communicate with each other and with the Witness file server. Also the DC replication between the sites is ok across all domain controllers.
I was trying to configure the IP less model for Exchange 2016, but while adding both exchange servers in DAG, we get this error:
“A server-side database availability group administrative operation failed. Error The operation failed. CreateCluster errors may result from incorrectly configured static addresses. Error: An error occurred while attempting a cluster operation. Error: Cluster API failed: “AddClusterNode() (MaxPercentage=100) failed with 0x5b4. Error: This operation returned because the timeout period expired”. [Server: E2016-2.fci.local]”
Both exchange servers are configured with one network interface: E2016-2 (site A): 192.168.0.209, E2016 (siteB): 10.20.0.209. The DAG has as IP address 255.255.255.255.
Any advice is much appreciated.
Best Regards,
Corneel
Gareth Gudger says
Hi Corneel,
Just to make sure I understand correctly, is the existing Exchange 2016 server in Site B (the one that has 250 users) installed on a domain controller? Or is it a member server?
Corneel Stirbu says
Hi Gareth,
The Exchange 2016 server is installed in Site B on a member server (Windows Server 2016). There is also a domain controller in Site B (also running Windows Server 2016). DC replication between sites is working well (no errors reported by dcdiag). Please let me know if you need more details.
BR,
Corneel
Corneel Stirbu says
Hi Gareth,
Just a quick update on this. Rebooting the Exchange server (the one I couldn’t add to the DAG at first) seems to have helped. After a reboot I was able to add the second Exchange server to the DAG, but we’re not out of the woods yet. The issue now is that, in the DAG members list, the second Exchange server is shown as un-operational.
Also the DAGFileShareWitnesses folder was created but is empty. While running the Get-ClusterNode from the un-operational exchange we get this error: “Get-ClusterNode : The remote server has been paused or is in the process of being started”. Get-ClusterNetwork on the operational exchange server returns only one cluster network online: Cluster Network 1 10.20.0.0 255.255.255.0 ClusterAndClient Up. “Get-ClusterNode” on the “up” server reveals this:
Name     ID  State
----     --  -----
E2016    1   Up
E2016-2  2   Down
Gareth Gudger says
Hi Corneel,
Is the Cluster Service running if you go into Services.msc on the failed computer? You may also wish to check the component state by running Get-ServerComponentState -Identity E2016-2. Also have you tried to bring the node up? Is the DAG in DAC mode?
Corneel Stirbu says
Hi Gareth,
The Cluster Service is running on the failed DAG node. When I try to bring up the node, its state shows Joining, but then it fails with this log entry: “Cluster node ‘E2016-2’ failed to join the cluster because it could not communicate over the network with any other node in the cluster. Verify network connectivity and configuration of any network firewalls.” By the way, Windows Firewall is disabled on both Exchange nodes.
The DAG is in DAC mode (DagOnly) and there is also an AlternateWitnessServer configured.
This is the component state of E2016-2:
Server Component State
------ --------- -----
E2016-2.fci.local ServerWideOffline Active
E2016-2.fci.local HubTransport Active
E2016-2.fci.local FrontendTransport Active
E2016-2.fci.local Monitoring Active
E2016-2.fci.local RecoveryActionsEnabled Active
E2016-2.fci.local AutoDiscoverProxy Active
E2016-2.fci.local ActiveSyncProxy Active
E2016-2.fci.local EcpProxy Active
E2016-2.fci.local EwsProxy Active
E2016-2.fci.local ImapProxy Active
E2016-2.fci.local OabProxy Active
E2016-2.fci.local OwaProxy Active
E2016-2.fci.local PopProxy Active
E2016-2.fci.local PushNotificationsProxy Active
E2016-2.fci.local RpsProxy Active
E2016-2.fci.local RwsProxy Active
E2016-2.fci.local RpcProxy Active
E2016-2.fci.local UMCallRouter Active
E2016-2.fci.local XropProxy Active
E2016-2.fci.local HttpProxyAvailabilityGroup Active
E2016-2.fci.local ForwardSyncDaemon Inactive
E2016-2.fci.local ProvisioningRps Inactive
E2016-2.fci.local MapiProxy Active
E2016-2.fci.local EdgeTransport Active
E2016-2.fci.local HighAvailability Active
E2016-2.fci.local SharedCache Active
E2016-2.fci.local MailboxDeliveryProxy Active
E2016-2.fci.local RoutingUpdates Active
E2016-2.fci.local RestProxy Active
E2016-2.fci.local DefaultProxy Active
E2016-2.fci.local Lsass Active
E2016-2.fci.local RoutingService Active
E2016-2.fci.local E4EProxy Active
E2016-2.fci.local CafeLAMv2 Active
E2016-2.fci.local LogExportProvider Active
Corneel Stirbu says
Hi Gareth,
I forgot to mention an important detail for this troubleshooting. E2016-2 was joined to the DAG, but the database was dismounted. This may be the reason why this node is down in the cluster. The problem is that I can’t bring up any database on E2016-2.
“Failed to mount database “E2016-2-DB1″. Error: An Active Manager operation failed. Error: The database action failed. Error: An error occurred while trying to validate the specified database copy for possible activation. Error: E2016-2: Server ‘E2016-2.fci.local’ is not up according to the Windows Failover Cluster service. [Database: E2016-2-DB1, Server: E2016.fci.local]”
Creating a database copy from E2016 to E2016-2 works fine, but I can’t activate the copy on the second node (Server is not up according to the Windows Failover Cluster service).
Once again, thank you for your time and support!
Gareth Gudger says
Hey Corneel,
Are there any network firewalls between the two sites that could be blocking ports?
Corneel Stirbu says
Hi Gareth,
The sites are configured on the same firewall as separate hardware switches, but traffic on ALL ports is allowed between the sites (since this is a lab). In production the sites are connected via IPsec. Have you come across a 2-node IP-less DAG between 2 sites connected via IPsec?
I’ve tried as well the other way around – to add first the e2016-2 in the cluster. In this situation the E2016-2 is listed as UP in the cluster but I’m unable to add the E2016.
Not sure what I’m missing, but if I can be sure that this is the right design for this situation I can go ahead and deploy this in the production.
Corneel Stirbu says
Hi Gareth,
I’ll also share the tracert results from both nodes.
From E2016 (10.20.0.209)
Tracing route to e2016-2.fci.local [192.168.0.209]
over a maximum of 30 hops:
1 <1 ms <1 ms <1 ms 10.20.0.254
2 1 ms <1 ms <1 ms e2016-2.fci.local [192.168.0.209]
From E2016-2 (192.168.0.209)
Tracing route to E2016.fci.local [10.20.0.209]
over a maximum of 30 hops:
1 <1 ms <1 ms <1 ms 192.168.0.254
2 1 ms <1 ms <1 ms E2016.fci.local [10.20.0.209]
megatube says
We can still create a traditional DAG. Transition from traditional DAG to DAG without an administrative access point is not supported and there is no way to transition except creating new DAG and moving mailboxes.
Gareth Gudger says
That is correct.
AN says
Where you have 2 nodes in primary site A, 2 nodes in standby site B and a witness server in site C, how will dynamic quorum work when the link between A and B is down?
Emmanuel Bentil says
Hi,
I have two Exchange servers at Head Office on the 10.20.90.0/24 subnet and two others on the 10.20.100.0/24 subnet in DR. I have already created a DAG for Head Office and it is working fine, but when I add the DR Exchange servers to the Head Office cluster I get the error below…
A server-side database availability group administrative operation failed. Error The operation failed. CreateCluster errors may result from incorrectly configured static addresses. Error: An error occurred while attempting a cluster operation. Error: Cluster API failed: “AddClusterNode() (MaxPercentage=100) failed with 0x5b4. Error: This operation returned because the timeout period expired”
Ahsan says
Hi Gareth ,
I am kind of new to Exchange. I have Deployed Exchange 2016 CU4 on Windows 2016 Std in a LAB environment.
I have configured IP-Less DAG.
Now my question is, how do the clients access Exchange on Outlook or OWA.
What will be the URL to access it?
I have following:
Ex01 – IP 192.168.0.1
Ex02 – IP 192.168.0.2
Witness server – IP 192.168.0.10
Do I need to have some DNS entry like
192.168.0.5 -> mail.exch.com
If yes, then where will this IP point to?
Thanks in advance for your help and advice.
Gareth Gudger says
Hey Ahsan,
You will want to configure a namespace, virtual directories and a certificate on your Exchange servers. Essentially, the FQDNs you specify in your virtual directories are what your clients will connect to. Check this article.
https://supertekboy.com/2015/09/17/install-exchange-2016-in-your-lab-part-4/
Secondly, you will want to load balance the client connections between the two servers. You can do this quickly with something like DNS Round Robin. Or you can put a load balancer in front of them, such as a Kemp Load Balancer. I have an article on configuring a Kemp with Exchange 2016 here.
https://supertekboy.com/2015/11/17/configure-kemp-load-balancer-for-exchange-2016/
Al says
Hi Gareth,
Is there any outage caused or required when I create the IP less DAG on my current stand alone Mailbox server ?
Gareth Gudger says
Not when you create the DAG itself, but when you add a member (server) to the DAG it is installing Windows Clustering Services behind the scenes as well as making various other configuration changes and service restarts. I would plan for downtime on this one.
tom says
Maybe a dumb question, but how do I deliver mail to the server if there is no IP address? I can’t open firewall ports to all mail servers.
Gareth Gudger says
Hey Tom. Great question. You would want some form of load balancing behind your firewall for port 25 mail flow. For example a Kemp Load Balancer or IIS ARR. Your load balancer would have a Virtual IP (VIP). You would NAT port 25 on your firewall to that VIP. The load balancer would then distribute mail flow evenly between the servers. The DAG itself is not involved with the transport of mail.
Jozef Woo says
Hi Gareth, you can’t load balance SMTP traffic with ARR as far as I know?
Sukhrob says
We have 2 MBX and 2 CAS servers (the second CAS server is a virtual server).
I have done all the things that were shown here, except creating a CNO, and we are using the second CAS server as the FSW. But our witness directory is empty: there are no files in it at all. Today I did some tests. I turned off the mailbox server that contains the active database copies, and my Outlook lost its connection until I turned the mailbox server on again. I hope you can help me resolve the issue. Thank you.
Gareth Gudger says
Hey Sukhrob,
Something is definitely amiss. That directory should not be empty. What cumulative update are all Exchange servers on? Any errors when you created the DAG or added members to it? What is the underlying OS as well? Needs to be 2012 R2 to support IP-less DAGs.
Sukhrob says
Hi.
All Exchange servers are on 2013 SP1 CU4 (Version 15.0 (Build 847.32)) and the OS version is 2012 R2. There was an error when I tried to add mailbox servers to the DAG: “Some or all Identity references could not be translated”. I have read some articles about that error, and they say it is nothing I need to worry about, so I ignored it and added mailboxes.
Gareth Gudger says
Can you try moving the FSW to another server? Perhaps a File Server (not a DC or Exchange). You will need to do the permissions though for that server.
Sukhrob says
I tried, but got the same error again: “Some or all identity references could not be translated.”
Gareth Gudger says
Hey Sukhrob,
I did a little digging and Paul Cunningham explains this exact behavior here. Looks like it is resolved in CU5. Rather than implement his workaround for CU4 I would upgrade your nodes to the latest CU (CU10) and then recreate the DAG. http://exchangeserverpro.com/error-identity-references-translated-exchange-2013-dag/