Posted on Leave a comment

Chaos Engineering in Azure: Automating Resilience Testing with Terraform & Pipelines

Chaos Engineering in Azure with Chaos Studio

Azure Chaos Studio is Microsoft’s managed Chaos Engineering service, allowing teams to create controlled failure scenarios in a safe and repeatable manner. With fault injection capabilities across compute, networking, and application layers, teams can simulate real-world incidents and enhance their system’s resilience.

Key Features of Azure Chaos Studio:

  • Agent-based and Service-based faults: Inject failures at the infrastructure or application level.
  • Targeted chaos experiments: Apply disruptions to specific resources like VMs, AKS, or networking components.
  • Integration with Azure Pipelines: Automate experiment execution within CI/CD workflows.

Automating Chaos Engineering with Terraform and Azure Pipelines

The repository https://github.com/geralexgr/ai-cloud-modern-workplace provides a ready-to-use automation pipeline that streamlines the deployment and execution of Chaos Engineering experiments.

Terraform for Experiment Setup

Terraform is used to define and deploy chaos experiments in Azure. The repository includes IaC (Infrastructure as Code) to:

  • Provision Chaos Studio experiments.
  • Define failure scenarios (e.g., CPU stress, network latency, VM shutdowns).
  • Assign experiments to specific Azure resources.

Using Terraform ensures that experiments are version-controlled, repeatable, and easily managed across different environments.

Azure DevOps Pipeline for Experiment Execution

A CI/CD pipeline is included in the repository to automate:

  1. Deployment of Chaos Experiments using Terraform.
  2. Execution of Chaos Tests within Azure Chaos Studio.
  3. Monitoring and reporting of experiment results.

This automation allows teams to integrate chaos testing into their release process, ensuring that new changes do not introduce unforeseen weaknesses.

Details

The pipeline consists of two stages. The first creates the experiment with Terraform, and the second runs the experiment created in the first stage.
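The two-stage layout described above might look roughly like the following sketch. It is not copied from the repository: the stage and job names, the `terraform` working directory, the state backend settings, the `SERVICE_CONNECTION` name, and the `SUB_ID`, `RG_NAME`, and `EXP_NAME` pipeline variables are all placeholders, and Terraform is assumed to be available on the agent.

```yaml
# Sketch only: names, paths, and the service connection are placeholders.
trigger: none

pool:
  vmImage: ubuntu-latest

stages:
# Stage 1: create the experiment with Terraform.
- stage: create_experiment
  displayName: Create chaos experiment
  jobs:
  - job: terraform_apply
    steps:
    - task: TerraformTaskV4@4
      displayName: terraform init
      inputs:
        provider: 'azurerm'
        command: 'init'
        workingDirectory: '$(System.DefaultWorkingDirectory)/terraform'
        backendServiceArm: 'SERVICE_CONNECTION'
        backendAzureRmResourceGroupName: 'TFSTATE_RG'
        backendAzureRmStorageAccountName: 'TFSTATE_ACCOUNT'
        backendAzureRmContainerName: 'tfstate'
        backendAzureRmKey: 'chaos.tfstate'
    - task: TerraformTaskV4@4
      displayName: terraform apply
      inputs:
        provider: 'azurerm'
        command: 'apply'
        workingDirectory: '$(System.DefaultWorkingDirectory)/terraform'
        environmentServiceNameAzureRM: 'SERVICE_CONNECTION'

# Stage 2: start the experiment, but only after stage 1 has succeeded.
- stage: run_experiment
  displayName: Run chaos experiment
  dependsOn: create_experiment
  jobs:
  - job: start_experiment
    steps:
    - task: AzureCLI@2
      displayName: start experiment
      inputs:
        azureSubscription: 'SERVICE_CONNECTION'
        scriptType: 'pscore'
        scriptLocation: 'inlineScript'
        inlineScript: 'az rest --method post --uri "https://management.azure.com/subscriptions/$(SUB_ID)/resourceGroups/$(RG_NAME)/providers/Microsoft.Chaos/experiments/$(EXP_NAME)/start?api-version=2023-11-01"'
```

The `dependsOn` entry makes the ordering explicit, so the experiment is never started if the Terraform stage fails.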

The experiment targets a specific web app, identified through a variable, and stops it. As a prerequisite, the experiment must run under a user-assigned managed identity that has been granted the IAM permissions it needs to act on the target resource.

Finally, you can find the result of the experiment in the Azure portal, inside Chaos Studio.

By combining Terraform, Azure Chaos Studio, and Azure Pipelines, you can automate and streamline Chaos Engineering in Azure. This approach helps identify system weaknesses early, improves system reliability, and ensures your cloud workloads can handle unexpected failures.

Links:

https://github.com/geralexgr/ai-cloud-modern-workplace

Automating chaos experiment execution with Azure DevOps

In the previous article I demonstrated how to create chaos experiments through the Azure portal to test your infrastructure against failures.

To automate the experiment execution through Azure DevOps, we need to create a new pipeline and use the Azure CLI task.

trigger:
- none

variables:
- name: EXP_NAME
  value: chaos-az-down
- name: SUB_NAME
  value: YOUR_SUB_ID
- name: RG_NAME
  value: chaos

pool:
  vmImage: ubuntu-latest
stages:
- stage: chaos_stage
  displayName: Chaos Experiment stage
  jobs:
  - job: run_experiment
    displayName: Run chaos experiment job
    steps:
    - task: AzureCLI@2
      displayName: run experiment to stop app service
      inputs:
        azureSubscription: 'MVP'
        scriptType: 'pscore'
        scriptLocation: 'inlineScript'
        inlineScript: 'az rest --method post --uri https://management.azure.com/subscriptions/$(SUB_NAME)/resourceGroups/$(RG_NAME)/providers/Microsoft.Chaos/experiments/$(EXP_NAME)/start?api-version=2023-11-01'

When we run the pipeline we will see that the task succeeded.

Finally the experiment execution will start automatically.
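To confirm from the pipeline that the experiment really started, an extra step could list the experiment's executions. The sketch below is an assumption rather than part of the original pipeline: it relies on the `executions` endpoint of the same `2023-11-01` API version and on each execution exposing its state under `properties.status`, and it reuses the `SUB_NAME`, `RG_NAME`, and `EXP_NAME` variables and the `MVP` service connection defined above.

```yaml
# Hypothetical follow-up step, appended to the same job's steps list.
- task: AzureCLI@2
  displayName: list experiment executions
  inputs:
    azureSubscription: 'MVP'
    scriptType: 'pscore'
    scriptLocation: 'inlineScript'
    # Assumes the executions list returns items under "value",
    # each with its state in "properties.status".
    inlineScript: |
      az rest --method get --uri "https://management.azure.com/subscriptions/$(SUB_NAME)/resourceGroups/$(RG_NAME)/providers/Microsoft.Chaos/experiments/$(EXP_NAME)/executions?api-version=2023-11-01" --query "value[].{execution:name, status:properties.status}" --output table
```

An experiment takes a moment to move from queued to running, so the status printed immediately after the start call may not be final.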

Links:

https://learn.microsoft.com/en-us/azure/chaos-studio/chaos-studio-tutorial-agent-based-cli

Deploying kubernetes applications with 2-clicks | Azure DevOps & Terraform

When you read the title you may think that this article is clickbait. Keep reading to the end and you will see that deploying a k8s application with Azure DevOps and Terraform can be very easy when you create everything through infrastructure as code.

In this example we will use Azure DevOps pipelines and Terraform to deploy a YAML definition to an AKS cluster running on Azure. To get there we need three steps.

The first step is to create an AKS cluster on Azure. Once the infrastructure is ready, we can bind Azure DevOps pipelines to the AKS resource so that we can deploy to the cluster. The last step is to provide the YAML definition of the application that we want to deploy and run the deployment process inside Azure DevOps.

The project is structured as follows.

  • The code folder contains the yaml k8s definition file.
  • The iac_aks folder creates the AKS cluster inside Azure.
  • The iac_devops folder creates the Azure DevOps resources needed (the service connection to AKS).
  • Finally, azure-pipeline and application-pipeline are the pipelines that run the automation.

To try out the example, the first thing you need to do is create a variable group inside Azure DevOps and store two values in it. The first value is the secret personal access token that will be used to create the Azure DevOps resources. The second is the URL of your Azure DevOps organization.

Once those are set, change the tfvars files and add the names that you prefer for the resources to be created. You can then have your deployment ready with just two clicks: one for the infra pipeline and one for the application pipeline.

The code is hosted on GitHub:
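As a rough illustration, a pipeline links a variable group through a `group` entry in its `variables` list. The group name `azure-devops-secrets` and the variable names `PAT` and `URL` below are hypothetical; use whatever names you chose when creating the group.

```yaml
variables:
# Hypothetical variable group. It is assumed to hold a secret named PAT
# (the personal access token) and a variable named URL (the organization URL).
- group: azure-devops-secrets
```

Steps in the pipeline can then refer to the values as `$(PAT)` and `$(URL)`.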
https://github.com/geralexgr/globalazuregreece2024

Pass terraform provider variables as secrets

You often need to supply values to a provider block when using Terraform. Take the code block below as an example. The azuredevops provider needs some variables in order to deploy successfully, and we need to pass those values as secrets because they contain sensitive information.

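terraform {
  required_providers {
    azuredevops = {
      source  = "microsoft/azuredevops"
      version = ">=1.0.0"
    }
  }
}

provider "azuredevops" {
  org_service_url       = "URL"   # placeholder: organization URL hardcoded in the file
  personal_access_token = "TOKEN" # placeholder: secret hardcoded in the file
}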

We should never hardcode such information in the code, because it may leak. To pass these values as secrets, we need to create a variable group or standalone pipeline variables and store the secrets there.

Then we will need to create some terraform variables and pass the values for those through the pipeline.

variable "org_service_url" {
  description = "The URL of your Azure DevOps organization."
  type        = string
}

variable "personal_access_token" {
  description = "The personal access token for authentication."
  type        = string
  sensitive   = true # keeps the value out of plan and apply output
}

The provider block should be updated accordingly.

provider "azuredevops" {
  org_service_url       = var.org_service_url
  personal_access_token = var.personal_access_token
}

Finally, we pass those values in the pipeline step through Terraform's -var argument.

    - task: TerraformTaskV4@4
      displayName: terraform apply
      inputs:
        provider: 'azurerm'
        command: 'apply'
        workingDirectory: '$(System.DefaultWorkingDirectory)/src/iac_devops'
        commandOptions: '-var="org_service_url=$(URL)" -var="personal_access_token=$(PAT)"'
        environmentServiceNameAzureRM: 'SUBSCRIPTION'

The pipeline should now succeed.
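A possible variation, not part of the original setup, is to hand the values to Terraform through environment variables instead of -var flags, which keeps the secret off the command line. Terraform reads any environment variable named `TF_VAR_<name>` as the value of the input variable `<name>`, and Azure Pipelines only exposes secret variables to a step when they are mapped explicitly under `env`. The sketch assumes the Terraform task passes its environment through to the `terraform` process.

```yaml
# Alternative sketch of the apply step, at the same level as the step above.
- task: TerraformTaskV4@4
  displayName: terraform apply
  inputs:
    provider: 'azurerm'
    command: 'apply'
    workingDirectory: '$(System.DefaultWorkingDirectory)/src/iac_devops'
    environmentServiceNameAzureRM: 'SUBSCRIPTION'
  env:
    # Terraform maps TF_VAR_<name> to the input variable <name>.
    TF_VAR_org_service_url: $(URL)
    TF_VAR_personal_access_token: $(PAT)
```

The variable declarations and the provider block stay exactly as shown earlier; only the way the values reach Terraform changes.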