Posted on 2 Comments

Run jobs with containers on Azure batch service

Azure Batch can be a great tool for instant batch processing as it creates and manages a pool of compute nodes (virtual machines), installs the applications you want to run, and schedules jobs to run on the nodes. Sometimes however a container could be a more appropriate solution for simplicity and scaling than a virtual machine. In this guide I will explain how you could use containers for batch service in order to run jobs and tasks.

Use the Azure Compute Gallery to create a custom image pool – Azure Batch | Microsoft Learn

First things first, you will need to have a azure container registry or another public or private registry to store your container image. I have already created mine and pushed my batchcontainer image inside which is a .NET micro service that returns a hello world message as an output.

using System;

namespace samplebatch
{
    internal class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine($"Hello {args[0]}");
        }
    }
}

https://github.com/geralexgr/samplebatch

The next step would be to create your batch service account. The part on which you set your container as the workload is when you create a pool. Pools consist of the compute node that will execute your jobs and there you will add a new pool which will host containers from the image that you pushed earlier.

On the node selection you will have to select Marketplace on the Image type and specifically microsoft-azure-batch and ubuntu-server-container of 20-04-lts version. Then you will need to select Custom on the container configuration and add your container registry by pressing the hyperlink.

selection of custom container image on the batch service

Then you will need to input the username and password for the container registry as well as the registry URL.

When you have your pool ready you can go and create your job. You can leave the default settings on the job creation but you should specify the pool where the job will run.

Then you can create a task or multiple tasks for your job and provide the commands or inputs for them. In my case I created a task named kati with the command of my name. This will be provided as input in my container which is a .NET microservice that prints a hello world message based on the input.

The important thing to do is to fill the image name from your repository. You can also provide container run options that you want for this node to have like mount of directories etc.

Example: repo.azurecr.io/batchservice:latest

As a result the output would be Hello gerasimos

The output of the run can be found on the stdout.txt file which is located on the task pane. You can also find a stderr.txt file which will log errors/failures that could appear during the execution.

Lastly, you can locate your job execution by navigating in the nodes where you can find a history of your tasks. As you can see I have two successful task executions and non failed.

YouTube video:

Posted on Leave a comment

Maintenance Jobs for build agents explained – Azure DevOps

When you need to scale up your infrastructure, you should enable as much automated options for maintenance as possible. One of the available options for devops agents are included under Organization Settings -> Agent pools -> Settings.

There you can define automated procedures for cleanup on your agent pools.

In my setup, I changed the days to keep unused working directories to 20.

The working directories of the agent are some folders with specific numbers inside C:\agent\work.

Every time a new build is initiated a new folder for this specific run is created. If the same pipeline runs more than one time, then the same working directory will be kept and the files will get overridden. For example lets say my pipeline A is bind to the folder 5. Then every time the pipeline A runs, then the folder 5 will be used for the sources (git repositories) builds, artifacts etc. All previous data hosted there will be deleted and written again.

The maintenance jobs will remove any directories than are not used for x period of days. In my example I had set up 20 days for that task.

You can configure agent pools to periodically clean up stale working directories and repositories. This should reduce the potential for the agents to run out of disk space. Maintenance jobs are configured at the project collection or organization level in agent pool settings.

You can check your history under Organization Settings -> Agent pools -> Maintenance History.

You can also download the log and figure out how much data have been deleted.

Maintenance jobs:

Create and manage agent pools – Azure Pipelines | Microsoft Docs

Video tutorial on YouTube: