Deploy Azure VM with a Managed Service Identity

The Plan

This is something I have been working through over the last week or so. I was unable to find any comprehensive documentation on what is required to achieve the outcome I was after, so I am documenting the process here for my own future reference ;)

I want to use Terraform to deploy an application to Azure that supports automatic backup and recovery. To do this, the application needs to be able to write backups to, and read them from, an Azure Storage Container. In this case the application runs on a Virtual Machine, so we will deploy it as a VM Scale Set. A few specifics:

  • Deploy Azure VM Scale Set
  • Use a Custom Data script to restore backup on VM creation
  • Use a cron job to create periodic backups to Azure Storage Container
  • Authenticate the VM to Azure programmatically using a Managed Service Identity

Defining a Role in Terraform

The first thing we need to do is define the Role in Terraform that will be assigned to the Managed Service Identity of the VM Scale Set. The Role should follow the principle of least privilege. The example below allows access to Azure Storage Containers; note that its Microsoft.Storage/storageAccounts/* wildcard is broader than strictly necessary (it already covers the more specific actions listed after it), so tighten it where you can:

# Get information on the current subscription
data "azurerm_subscription" "subscription" {

}

# Create the backup role definition
resource "azurerm_role_definition" "backup_role" {
  name  = "backup_role"
  scope = data.azurerm_subscription.subscription.id

  permissions {
    actions = [
      "Microsoft.Storage/storageAccounts/*",
      "Microsoft.Storage/storageAccounts/blobServices/containers/read",
      "Microsoft.Storage/storageAccounts/blobServices/containers/write",
      "Microsoft.Storage/storageAccounts/blobServices/generateUserDelegationKey/action",
    ]
    not_actions = []
    data_actions = [
      "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read",
      "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/write",
      "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/add/action",
    ]
  }

  assignable_scopes = [
    data.azurerm_subscription.subscription.id,
  ]
}

Creating the VM Scale Set

The next step is to define the VM Scale Set. Most of it can be defined as normal (check the Terraform documentation). The only notable point is that the identity type should be SystemAssigned, as shown below. I am also defining custom_data in the os_profile block as a startup script that restores from backup on launch.

resource "azurerm_virtual_machine_scale_set" "vm-scale-set" {
  # ... other settings omitted ...

  identity {
    type = "SystemAssigned"
  }

  os_profile {
    # ... other settings omitted ...
    custom_data = file("config/restore.sh")
  }

  # ... other settings omitted ...
}

Assigning the Role

Now we need to assign the Role we defined to the VM Scale Set. We can do this using an azurerm_role_assignment resource:

resource "azurerm_role_assignment" "vm_backup" {
  scope              = data.azurerm_subscription.subscription.id
  role_definition_id = azurerm_role_definition.backup_role.id
  principal_id       = azurerm_virtual_machine_scale_set.vm-scale-set.identity[0].principal_id
}

Note that when deploying this you may get an error like the following:

Error: Invalid index

  on stack.tf line 319, in resource "azurerm_role_assignment" "vm_backup":
 319:   principal_id       = azurerm_virtual_machine_scale_set.vm-scale-set.identity[0].principal_id
    |----------------
    | azurerm_virtual_machine_scale_set.vm-scale-set.identity is empty list of object

The given key does not identify an element in this collection value.

Releasing state lock. This may take a few moments...

This is because, at the time of writing, there is a bug in the azurerm Terraform provider. Annoying as it is, the workaround is quite simple:

  • Remove the azurerm_role_assignment resource from the stack
  • Deploy the stack
  • Add the azurerm_role_assignment resource back into the stack
  • Deploy again

Hopefully this is resolved soon; I spent quite a while struggling with this issue.
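
The two-pass workaround can probably also be done without editing the stack, by using Terraform's -target option to create the scale set (and with it the system-assigned identity) before applying everything else. This is only a sketch and has not been verified against the affected provider version. The TERRAFORM variable and the two_stage_apply function are illustrative names, and the resource address is the one used in the examples above.

```shell
#!/bin/bash
# Sketch: apply the scale set first so its identity exists, then apply the rest.
# TERRAFORM and two_stage_apply are illustrative names, not part of the original stack.
TERRAFORM="${TERRAFORM:-terraform}"

two_stage_apply() {
  # Pass 1: create only the scale set (and anything it depends on).
  "$TERRAFORM" apply -target='azurerm_virtual_machine_scale_set.vm-scale-set' || return 1
  # Pass 2: apply the whole stack, including the role assignment.
  "$TERRAFORM" apply
}
```

Calling two_stage_apply from the stack directory runs both passes in order and stops if the first one fails. Terraform prints a warning when -target is used, since it is meant for exceptional situations like this one.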

Authenticate VM with Azure using the Managed Service Identity

With a Managed Service Identity assigned to the VM, you can easily authenticate to Azure using the assigned Role. In this case I am using the Azure CLI, so it becomes as simple as:

az login --identity
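
On a freshly created VM the login can fail for a short while, for example if the startup script runs before the role assignment has taken effect. A small retry wrapper is one way to ride that out. This is a minimal sketch: the retry function and its arguments (attempt count, delay in seconds) are illustrative and not part of the Azure CLI.

```shell
#!/bin/bash
# retry ATTEMPTS DELAY COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS runs out,
# sleeping DELAY seconds between tries.
retry() {
  local attempts="$1" delay="$2" n=1
  shift 2
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}
```

For example, retry 10 15 az login --identity would keep trying for a little over two minutes before giving up.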

Then you can ship your backups to an Azure Storage Container:

az storage blob upload-batch -d backups --account-name backups -s /home/azureuser/backups
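
When no credentials are passed, the CLI may fall back to looking up the storage account key. To have the upload authorized through the identity's own data_actions instead, the --auth-mode login option can be added. A sketch, assuming the same container and account names as above; upload_backups is an illustrative wrapper name:

```shell
#!/bin/bash
# upload_backups SOURCE_DIR CONTAINER ACCOUNT
# Uploads every file in SOURCE_DIR to CONTAINER, authorizing with the
# logged-in identity (--auth-mode login) rather than an account key.
upload_backups() {
  local source_dir="$1" container="$2" account="$3"
  az storage blob upload-batch \
    --auth-mode login \
    --destination "$container" \
    --account-name "$account" \
    --source "$source_dir"
}
```

With the names used in this post, the call would be upload_backups /home/azureuser/backups backups backups.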

You can then easily tie this into a bash script that runs from cron for periodic backups of your application to Azure. This is an example of backing up an InfluxDB instance:

#!/bin/bash
influxd backup /home/azureuser/backup
az login --identity
az storage blob upload-batch -d backup --account-name backup -s /home/azureuser/backup
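
If you would rather keep every run separate inside the container, one option is to upload each backup under a timestamped prefix. This is a sketch of that idea, not what the setup in this post does. backup_prefix is an illustrative helper name, and the sketch relies on the --destination-path option of az storage blob upload-batch.

```shell
#!/bin/bash
# backup_prefix [LABEL]
# Prints a blob path prefix such as "influxdb/2020-01-31T01-00-00Z" (UTC),
# so each run's files land under their own virtual folder in the container.
backup_prefix() {
  local label="${1:-influxdb}"
  printf '%s/%s\n' "$label" "$(date -u +%Y-%m-%dT%H-%M-%SZ)"
}

# Intended use inside the backup script (container and account names as in the post):
#   az storage blob upload-batch -d backup --account-name backup \
#     -s /home/azureuser/backup --destination-path "$(backup_prefix)"
```

The restore script would then need to decide which prefix to download, which this sketch does not cover.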

Restoring backup from Azure Storage Container on VM launch

This is done with a script very similar to the backup script, set as the custom_data script in the azurerm_virtual_machine_scale_set as above. We will also use this startup script to create our backup cron job:

#!/bin/bash
az login --identity
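# The destination directory must exist before download-batch runs
mkdir -p /home/azureuser/backup
# For download-batch, -s is the source container and -d is the local destination directory
az storage blob download-batch -d /home/azureuser/backup --account-name backup -s backup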
influxd restore /home/azureuser/backup
(crontab -l ; echo "0 1 * * * /home/azureuser/backup.sh >> /home/azureuser/backup.log") | crontab -
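
The crontab one-liner above schedules the backup daily at 01:00, but it appends the entry every time it runs, and crontab -l prints an error if the user has no crontab yet. An idempotent variant could look like the sketch below; ensure_cron_line is an illustrative helper name.

```shell
#!/bin/bash
# ensure_cron_line LINE
# Reads an existing crontab on stdin and prints it with LINE appended,
# unless an identical line is already present.
ensure_cron_line() {
  local line="$1" current
  current="$(cat)"
  if printf '%s\n' "$current" | grep -qxF -- "$line"; then
    printf '%s\n' "$current"
  elif [ -n "$current" ]; then
    printf '%s\n%s\n' "$current" "$line"
  else
    printf '%s\n' "$line"
  fi
}

# Intended use in the startup script; 2>/dev/null hides the "no crontab" message:
#   crontab -l 2>/dev/null \
#     | ensure_cron_line '0 1 * * * /home/azureuser/backup.sh >> /home/azureuser/backup.log' \
#     | crontab -
```

Because the helper only transforms text, running the startup script a second time leaves the crontab unchanged.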

If all goes well, you should now have automatic backups and restores for your application using an Azure Storage Container.