Friday 16 January 2015

Invoking Pool failover when one Pool is the CMS

There are a lot of articles on the internet about Lync Front End Pool failover. Some are TechNet articles relating to Standard Edition Servers, like this one, whilst others talk about the concept.

I wanted to go through a real-world scenario that I worked through this week, as it threw up a few experiences I wanted to share with you.

So let's take a look at the scenario:

The topology is as follows:

1. There is a Front End Pool hosting the CMS in Site 1
2. There is a Front End Pool in Site 2
3. The Front End Pools are Backup Registrars for each other
4. The Edge Pool in each site has the respective Front End Pool in the same site as its next hop
5. Access Edge for external users in Site 1 is:
6. Access Edge for external users in Site 2 is:
7. Both Front End Pools have their own Back End SQL databases in the same site as the Front End Pool; we are not stretching any SQL services across sites.
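Before touching anything it is worth confirming that the topology really looks like the list above. A minimal sketch from the Lync Server Management Shell, assuming a hypothetical Site 1 pool FQDN of pool1.contoso.com (substitute your own):

```powershell
# Hypothetical FQDN for illustration only; replace with your Site 1 pool.
$site1Pool = "pool1.contoso.com"

# Shows which pool is paired with the Site 1 pool as its backup.
Get-CsPoolBackupRelationship -PoolFqdn $site1Pool

# Shows which pool currently hosts the Central Management Server service.
Get-CsService -CentralManagement | Format-List
```

If the backup relationship comes back empty, the pools are not paired and the failover steps in this post will not apply as written.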

First things first: before failing the Pool over to the Backup Registrar there are other things to consider.

So how do we make a user who is currently connecting to one Access Edge connect to the other Access Edge? In normal circumstances you don't have to worry about this, as hopefully things won't break, but I found that for the Lync client to successfully connect to a secondary Access Edge FQDN I had to ensure the second record was in public DNS with a different weight to the primary, or preferred, Access Edge FQDN. Then, in the event all your Edge servers were down, inaccessible, or failed, the client would automatically connect to the other Access Edge.


And utilizing the Weight:

If you didn't want to do this, then simply updating your _sip records to point to an alternative Access Edge would be absolutely sufficient, as would manually updating each client (not fun) to the alternative FQDN.
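To see what external clients will actually get back from public DNS, the SRV records can be queried directly. This is a rough sketch rather than anything Lync-specific: it assumes a hypothetical SIP domain of contoso.com, a machine with the Resolve-DnsName cmdlet (Windows 8 / Server 2012 or later), and a public resolver of your choosing (8.8.8.8 is used here purely as an example).

```powershell
# Hypothetical SIP domain; replace with your own.
$sipDomain = "contoso.com"

# Query the external SRV record used for Lync client sign-in,
# asking a public resolver so internal split-brain DNS is bypassed.
Resolve-DnsName -Name "_sip._tls.$sipDomain" -Type SRV -Server 8.8.8.8 |
    Where-Object { $_.Type -eq "SRV" } |
    Select-Object NameTarget, Priority, Weight, Port
```

You should see one row per Access Edge FQDN, which makes it easy to confirm both records are published and carry the values you intended.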


Now we've got that out of the way, let's look at invoking Pool failover when one Front End Pool is the CMS!

1. First of all, confirm replication is fine using Get-CsManagementStoreReplicationStatus. There's absolutely no point performing this failover if replication is broken (unless you are in a real DR scenario!)

2. If at this point we try to fail over the Pool in Site 1 that hosts the CMS, we will get a failure saying that it hosts the CMS and the CMS must be moved first, so let's do that. We simply use the command Invoke-CsManagementServerFailover. As the pools are paired, this command will know to move the CMS over to the Backup Registrar, which is the Front End Pool in Site 2. Select Y to confirm the move.

3. Once the CMS is failed over there is another task to perform before we can invoke a pool failover: we need to move the Edge Pool dependency. We do this using the Set-CsEdgeServer command (we can also use the Topology Builder, but the management shell is far simpler here, and in DR scenarios we may be staying away from publishing topologies until we are in a semi-working state).

What we are doing here is removing the association of the Edge Pool in Site 1 with the Front End Pool in Site 1, and now binding it to the Front End Pool in Site 2.

4. We are now in a position to fail the pool over using the Invoke-CsPoolFailover command. This is simply run as 'Invoke-CsPoolFailover -PoolFqdn "Site 1 Front End Pool FQDN"'. If this is an actual disaster then we can also add the -DisasterMode switch, but if the pool is up and working there's no need to add this.
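Pulling the four steps together, the sequence looks roughly like the sketch below. All three FQDNs are hypothetical placeholders, and I'm assuming the Site 1 pool is still up (so no -DisasterMode). Treat it as an outline to run line by line in the Lync Server Management Shell rather than a script to fire off unattended.

```powershell
# Hypothetical FQDNs for illustration only; replace with your own.
$site1Pool = "pool1.contoso.com"   # Front End Pool hosting the CMS
$site2Pool = "pool2.contoso.com"   # paired Front End Pool (Backup Registrar)
$site1Edge = "edge1.contoso.com"   # Edge Pool in Site 1

# Step 1: list any replicas that are not up to date before going further.
$stale = Get-CsManagementStoreReplicationStatus | Where-Object { -not $_.UpToDate }
if ($stale) {
    $stale | Format-Table ReplicaFqdn, UpToDate, LastStatusReport | Out-Host
    Write-Warning "Replication is not healthy; fix this first unless it is a real DR event."
}

# Step 2: move the CMS to the paired pool (answer Y at the prompt).
Invoke-CsManagementServerFailover

# Step 3: re-point the Site 1 Edge Pool's next hop to the Site 2 Front End Pool.
Set-CsEdgeServer -Identity "EdgeServer:$site1Edge" -Registrar "Registrar:$site2Pool"

# Step 4: fail the Site 1 pool over to its Backup Registrar.
Invoke-CsPoolFailOver -PoolFqdn $site1Pool

# In a genuine disaster (Site 1 pool unreachable) the last line would instead be:
# Invoke-CsPoolFailOver -PoolFqdn $site1Pool -DisasterMode
```

Running Get-CsManagementStoreReplicationStatus again between steps 2 and 3 is a cheap way to make sure the CMS move has settled before changing the Edge next hop.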

That's it, you'll find all users will switch over to using the other pool. Moving the users back is just a case of running 'Invoke-CsPoolFailBack -PoolFqdn "Site 1 Front End Pool FQDN"'. And don't forget to move the CMS back if you so desire!
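For completeness, here is a hedged sketch of the return journey, using the same hypothetical FQDNs plus a made-up test user. Pointing the Edge Pool back at the Site 1 pool is my assumption of the natural reverse of step 3, and how you move the CMS back depends on the state of your Site 1 databases, so that line is left commented out for you to check against Microsoft's guidance first.

```powershell
# Hypothetical names for illustration only; replace with your own.
$site1Pool = "pool1.contoso.com"
$site1Edge = "edge1.contoso.com"
$testUser  = "sip:test.user@contoso.com"

# Bring the users back to the Site 1 Front End Pool.
Invoke-CsPoolFailBack -PoolFqdn $site1Pool

# Assumption: restore the Site 1 Edge Pool's next hop to the Site 1 pool.
Set-CsEdgeServer -Identity "EdgeServer:$site1Edge" -Registrar "Registrar:$site1Pool"

# Check which pool a sample user now treats as primary and backup.
Get-CsUserPoolInfo -Identity $testUser

# Optional, and only once replication is healthy: move the CMS back.
# With the pools still paired and both online, repeating the failover
# cmdlet is one approach; verify it suits your environment before use.
# Invoke-CsManagementServerFailover
```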

Oliver Moazzezi - MVP Exchange Server
