Getting Azure 99.95% SLA for Cisco FTD virtual appliances in Azure

Leveraging ARM templates and Availability Sets

In the real world there are numerous lessons learned, experiences, opinions and vendor recommendations that dictate what constitutes “best practice” when it comes to internet edge security. It’s a can of worms that I don’t want to open, as I am not claiming to be an expert in that regard. I can say that I have enough experience to know that having no security at all is a really bad idea, and that bank-level security for regular enterprise customers can be excessive.

I’ve been working with an enterprise customer that falls pretty much in the middle of that spectrum. They are a regular large enterprise organisation that is concerned about internet security and has little experience with Azure. That said, the built-in tools and software-defined networking principles of Azure don’t meet the requirements they’ve set. So, to accommodate those requirements, moving from Azure NSGs, WAFs and all the goodness that Azure provides to dedicated virtual appliances was not difficult, but it did require a lot of thinking and working with various team members and third parties to get the result.

Cisco Firepower Threat Defence Virtual for Microsoft Azure

From what I understand, Cisco’s next generation firewall has been in the Azure Marketplace for about 4 months now, maybe a little longer. The exact timeline is not much of a concern in itself; it matters only in that it relates to the maturity of the product. Compared with competitors, there is indeed a lag in some features.

The firewalls themselves, Cisco Firepower Threat Defence Virtual for Microsoft Azure, are Azure-specific images, available in the Azure Marketplace, of the virtual appliances Cisco has made for some time. The background, again, is not that important. It’s just the foundational knowledge for the following:

Cisco FTDv supports four network interfaces in Azure. These interfaces include:

  • A management interface (Nic0) - cannot route traffic over this
  • A diagnostics interface (Nic1) - again, cannot route traffic over this. I found this out the hard way…
  • An external / untrusted interface (Nic2)
  • An internal / trusted interface (Nic3)

So we have a firewall that essentially is an upgraded Cisco ASA (Cisco Adaptive Security Appliance) with expanded feature sets unlocked through licensing. An already robust product with new features.

The design

Availability is key in the cloud. Scale-out dominates scale-up methodologies and, as the old maxim goes, two is better than one. For a customer, I put together the following design to leverage Azure availability sets (which guarantee that at least one instance in the set stays up, and that the instances are physically separated on the underlying Azure infrastructure) and so achieve a level of availability higher than a single instance. NOTE: Cisco FTDv does not support high availability (out of the box) and is not a stateful appliance in Azure.
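
To put the 99.95% figure from the title in context, here is a minimal Python sketch that converts an availability percentage into a monthly downtime budget. The 30-day month and the function name downtime_minutes are my own assumptions for illustration; they are not part of any Azure tooling.

```python
def downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Return the downtime (in minutes) permitted by an SLA over `days` days."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


if __name__ == "__main__":
    # 99.95% is the SLA figure from the title of this post
    print(f"{downtime_minutes(99.95):.1f} minutes per 30-day month")
```

In other words, the SLA still leaves room for roughly twenty minutes of downtime a month, which is why the design pairs the availability set with two appliances rather than one.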

Implementation

To deploy a Cisco FTDv in Azure, the quick and easy way is to use the Azure Marketplace and deploy through the portal. It’s a quick and pretty much painless process. To note though, here are some important pieces of information when deploying these virtual appliances from the Azure marketplace:

Going through the wizard is relatively painless and straightforward, and within 15-20 minutes you can have a firewall provisioned and ready to connect to your on-premises management server. Yes, another thing to note is that the appliance is managed from Firepower Management Centre (FMC). The FMC, from what I have read, cannot be deployed in Azure at this time. However, I’ve not looked into that tidbit too much, so I may be wrong there.

The problem

In my design I have a requirement for two appliances. These appliances would be in a farm, which is supported in the FMC, and the two appliances can have common configuration applied to both devices, such as allow/deny rules. In Azure, without an availability set, there is a small chance, but a chance nonetheless, that both devices could somehow be automagically provisioned in the same rack, on the same physical server infrastructure, in the Australia East region (my local region).

Availability is a rather large requirement, and ensuring that it is maintained for all workloads across upwards of 500 instances for the customer I was working with was a tricky proposition. Here’s how I worked around the problem at hand, given that Cisco do not officially state that availability sets are unsupported.

The solution

Pretty much all resources when working with the Azure Portal have a very handy tab under their properties. I use this tab a lot. It’s the Automation Script section of the properties blade of a resource.

Automation script

After I provisioned a single firewall, I reviewed the Automation Script blade of the instance. There is plenty of good information there. What is particularly handy to know is the following:

 },
 "storageProfile": {
   "imageReference": {
     "publisher": "cisco",
     "offer": "cisco-ftdv",
     "sku": "ftdv-azure-byol",
     "version": "620362.0.0"
   },
   "osDisk": {
     "osType": "Linux",
     "name": "[concat(parameters('virtualMachines_FW1_name'),'-disk')]",
     "createOption": "FromImage",

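If you would rather not scroll through the exported JSON by eye, a small script can pull the image details out. This is only a sketch: the helper name image_references and the trimmed-down sample template are hypothetical, and the sample holds only the values shown in the excerpt above.

```python
import json


def image_references(template: dict) -> list:
    """Collect the imageReference of every virtual machine in an ARM template."""
    refs = []
    for resource in template.get("resources", []):
        if resource.get("type") == "Microsoft.Compute/virtualMachines":
            storage = resource.get("properties", {}).get("storageProfile", {})
            if "imageReference" in storage:
                refs.append(storage["imageReference"])
    return refs


if __name__ == "__main__":
    # Hypothetical sample containing only the fields from the excerpt above
    sample = {
        "resources": [
            {
                "type": "Microsoft.Compute/virtualMachines",
                "properties": {
                    "storageProfile": {
                        "imageReference": {
                            "publisher": "cisco",
                            "offer": "cisco-ftdv",
                            "sku": "ftdv-azure-byol",
                            "version": "620362.0.0",
                        }
                    }
                },
            }
        ]
    }
    print(json.dumps(image_references(sample), indent=2))
```

For a real export you would load the file with json.load and pass the resulting dictionary to the same function.
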
So with that, we have all the key information to leverage ARM templates to deploy the firewalls. In practice though, I copied the entire Automation Script 850 line JSON file and put it into Atom. Then I did the following:

Lo and behold, the firewall instance provisioned just fine and there was indeed an availability set associated with it. Additionally, when I provisioned the second appliance, I followed the same process and both are now in the same availability set. This makes using the Azure Load Balancer nice and easy! Happy days!

For your reference, here’s the availability set JSON I added in my file. Note that “parameters” is an object, not an array:

"parameters": {
  "availabilitySetName": {
    "defaultValue": "FW-AS",
    "type": "string"
  },

Then you need to add the following under “resources”:

"resources": [
  {
    "type": "Microsoft.Compute/availabilitySets",
    "name": "[parameters('availabilitySetName')]",
    "apiVersion": "2015-06-15",
    "location": "[resourceGroup().location]",
    "properties": {
      "platformFaultDomainCount": 2,
      "platformUpdateDomainCount": 2
    }
  },

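Then you’ll also need to add the following to the resource of "type": "Microsoft.Compute/virtualMachines". The availabilitySet reference goes inside the VM’s "properties", while the dependency goes in the VM resource’s own "dependsOn" array, which sits alongside "properties" (the "..." stands for the existing VM properties):

  "properties": {
    "availabilitySet": {
      "id": "[resourceId('Microsoft.Compute/availabilitySets', parameters('availabilitySetName'))]"
    },
    ...
  },
  "dependsOn": [
    "[resourceId('Microsoft.Compute/availabilitySets', parameters('availabilitySetName'))]",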

Those are really the only things that need to be added to the ARM template. It’s quick and easy!
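
If you have more than one exported template to patch, the same three additions can be scripted. What follows is a minimal sketch rather than the process I actually followed (I edited the JSON by hand in Atom). The function name add_availability_set is made up for illustration, and it assumes the exported template keeps its virtual machines in the top-level "resources" array.

```python
import copy

AVSET_TYPE = "Microsoft.Compute/availabilitySets"
VM_TYPE = "Microsoft.Compute/virtualMachines"
AVSET_ID = (
    "[resourceId('Microsoft.Compute/availabilitySets', "
    "parameters('availabilitySetName'))]"
)


def add_availability_set(template: dict, default_name: str = "FW-AS") -> dict:
    """Return a copy of `template` with an availability set wired into every VM."""
    patched = copy.deepcopy(template)

    # 1. Parameter holding the availability set name
    patched.setdefault("parameters", {})["availabilitySetName"] = {
        "defaultValue": default_name,
        "type": "string",
    }

    resources = patched.setdefault("resources", [])

    # 2. The availability set resource itself (added only once)
    if not any(r.get("type") == AVSET_TYPE for r in resources):
        resources.insert(0, {
            "type": AVSET_TYPE,
            "name": "[parameters('availabilitySetName')]",
            "apiVersion": "2015-06-15",
            "location": "[resourceGroup().location]",
            "properties": {
                "platformFaultDomainCount": 2,
                "platformUpdateDomainCount": 2,
            },
        })

    # 3. Reference from each VM, plus a dependsOn entry so the set exists first
    for r in resources:
        if r.get("type") == VM_TYPE:
            r.setdefault("properties", {})["availabilitySet"] = {"id": AVSET_ID}
            depends = r.setdefault("dependsOn", [])
            if AVSET_ID not in depends:
                depends.append(AVSET_ID)

    return patched


if __name__ == "__main__":
    # Hypothetical, heavily trimmed stand-in for the exported template
    exported = {"resources": [{"type": VM_TYPE, "name": "FW1", "properties": {}}]}
    patched = add_availability_set(exported)
    print([r["type"] for r in patched["resources"]])
```

The function works on a copy, so the original export is left untouched, and running it twice does not add a second availability set or a duplicate dependsOn entry.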

BUT WAIT, THERE’S MORE!

No, I’m not talking about throwing in a set of steak knives with that, but there is a little more to this that you, dear reader, need to be aware of.

Once you deploy the firewall, the creation process finalises and its state shows as running, there is an additional challenge. When deploying via the Marketplace, the firewall enters Advanced User mode and can be connected to the FMC. I’m sure you can guess where this is going… When deploying the firewall via an ARM template, the same mode is not entered. You get the following error message:

User [admin] is not allowed to execute /bin/su/ as root on deviceIDhere

After I spent much time digging through Cisco documentation, which I am sorry to say is not up to standard, Cisco TAC were able to help. The following command needs to be run in order to get into the correct mode:

~$ su admin
[when prompted for the password, enter Admin123 (the default admin password, not the password you set)]

Once you have entered the correct mode, you can add the device to the FMC with the following:

~$ configure manager add [IP address of FMC] [key - one time use to add the FW, just a single word]

The summary

I appreciate that specialist network vendors provide really good quality products to manage network security. Due to limitations in the Azure Fabric, not all of them work 100% as expected. From a purist’s point of view, NSGs and the Azure-provided software-defined networking solutions, with the wealth of features they provide, work amazingly well out of the box.

The cloud is still new to a lot of people. The trust that network admins place in tried and true vendors and products is just not there yet with BSOD Microsoft. In time I feel it will be. For now though, deploying virtual appliances can be a little tricky.

