
Automating virtual machine management – Designing Compute Solutions-1

What to watch out for

Power Automate is only for simpler workflows and is not suitable when deeper or more advanced integration is required.

In this section, we have briefly looked at the many different compute technologies available in Azure. PaaS options are fully managed by the platform, allowing architects and developers to focus on the solution rather than management. However, when traditional IaaS compute options are required, such as virtual machines, you must manage security and OS patching yourself. Next, we will look at the native tooling that Azure provides to make this management easier.

Automating virtual machine management

Virtual machines are part of the IaaS family of components. One of the defining features of VMs in Azure is that you are responsible for keeping the OS up to date with the latest security patches.

In an on-premises environment, this could be achieved by manually configuring individual servers to apply updates as they become available. However, many organizations require more control, for example, the ability to have patches verified and approved before mass rollout to production systems, to control when updates happen, and to manage reboots when required.

Typically, this could be achieved using Windows Server Update Services (WSUS) and Configuration Manager, part of the Microsoft Endpoint Manager suite of products. However, these services require additional management and setup, which can be time-consuming.

As with most services, Azure helps simplify managing VM updates with a native Update Management service. Update Management uses several other Azure components, including the following:

  • Log Analytics: Along with the Log Analytics agent, reports on the current status of patching for a VM
  • PowerShell Desired State Configuration (DSC): Required for Linux patching
  • Automation Hybrid Runbooks / Automation Account: Used to perform updates

Automation accounts and Log Analytics workspaces are not supported together in all regions, so you must plan region placement when setting up Update Management. For example, if your Log Analytics workspace is in East US, your Automation account must be created in East US 2.

See the following link for more details on region pairings: https://docs.microsoft.com/en-gb/azure/automation/how-to/region-mappings.
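The region pairing above can be thought of as a simple lookup that you validate before deployment. The following sketch illustrates the idea; only the East US pairing comes from the text, and the dictionary is a placeholder for the full mapping table at the link:

```python
# Illustrative region-pairing lookup for Update Management.
# Only the EastUS -> EastUS2 pairing is taken from the example above;
# the rest of the table would come from the documented region mappings.
SUPPORTED_MAPPINGS = {
    "EastUS": "EastUS2",
}

def automation_region_for(workspace_region: str) -> str:
    """Return the region where the linked Automation account must live."""
    try:
        return SUPPORTED_MAPPINGS[workspace_region]
    except KeyError:
        raise ValueError(f"No supported Automation account mapping for {workspace_region}")

print(automation_region_for("EastUS"))  # EastUS2
```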

When setting up Update Management, you can either create the Log Analytics workspaces and Automation accounts yourself or let the Azure portal create them for you. In the following example, we will select an Azure VM and have the portal set up Update Management.

What to watch out for – Designing Compute Solutions

In general, AKS and Kubernetes are more complicated than other technologies, especially Azure native alternatives such as App Service or Azure Functions. Additional tools are often required to better monitor and deploy solutions, which can sometimes lead to security concerns for some organizations. Although these can, of course, be satisfied, there is more work involved in setting up and using AKS for the first time.

Kubernetes is also designed to host multiple services and therefore may not be cost-effective for smaller, more straightforward applications such as a single, basic website. As an example, the recommended minimum number of nodes in a production AKS cluster is three nodes. In comparison, a resilient web app can be run on just a single node when using Azure App Service.

App Service

App Service is a fully managed hosting platform for web applications, RESTful APIs, mobile backend services, and background jobs.

App Service supports applications built in ASP.NET, ASP.NET Core, Java, Ruby, Node.js, PHP, and Python. Applications deployed to App Service are scalable, secure, and adhere to many industry-standard compliance standards.

App Service is linked to an App Service plan, which defines the amount of CPU and RAM available to your applications. You can also assign multiple apps to the same App Service plan to share resources.

For highly secure environments, App Service Environments (ASEs) provide isolated environments built on top of VNets.

App Service is, therefore, best suited to web apps, RESTful APIs, and mobile backend apps. It can be easily scaled by defining CPU- and RAM-based thresholds and is fully managed, so you do not need to worry about security patching or resilience within a region.

What to watch out for

Because App Service is always running, it always incurs costs; it is never idle. However, automated scaling can at least ensure a minimal cost during low usage, scaling out with additional instances in response to demand.
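The threshold-based scaling just described can be sketched as a simple decision function. The thresholds, deadband, and instance limits below are illustrative values, not Azure defaults:

```python
# Hypothetical sketch of a CPU-based autoscale rule for an App Service plan.
# Thresholds and limits are illustrative, not actual Azure defaults.

def autoscale_decision(avg_cpu_percent: float, current_instances: int,
                       min_instances: int = 1, max_instances: int = 10,
                       scale_out_at: float = 70.0, scale_in_at: float = 25.0) -> int:
    """Return the new instance count based on average CPU load."""
    if avg_cpu_percent > scale_out_at and current_instances < max_instances:
        return current_instances + 1   # scale out under load
    if avg_cpu_percent < scale_in_at and current_instances > min_instances:
        return current_instances - 1   # scale in during low usage
    return current_instances           # stay put inside the deadband

print(autoscale_decision(85.0, 2))  # high CPU: scale out to 3
print(autoscale_decision(10.0, 2))  # low CPU: scale in to 1
print(autoscale_decision(10.0, 1))  # already at the minimum: stays at 1
```

The gap between the scale-out and scale-in thresholds prevents "flapping", where instances are repeatedly added and removed around a single threshold.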

Container Instances – Designing Compute Solutions

Virtual machines offer a way to run multiple, isolated applications on a single piece of hardware. However, virtual machines are relatively inefficient in that every single instance contains a full copy of the operating system.

Containers wrap and isolate individual applications and their dependencies but still use the same underlying operating systems as other containers running on the host – as we can see in the following diagram:

Figure 7.4 – Containers versus virtual machines

This provides several advantages, including speed and the way they are defined. Azure uses Docker as the container engine, and Docker images are defined in code, which enables easier and repeatable deployments.

Because containers are also lightweight, they are much faster to provision and start up, enabling applications based on them to react quickly to demands for resources.

Containers are ideal for a range of scenarios. Many legacy applications can be containerized relatively quickly, making them a great option when migrating to the cloud.

Containers’ lightweight and resource-efficient nature also lends itself to microservice architectures whereby applications are broken into smaller services that can scale out with more instances in response to demand.

We cover containers in more detail later in this chapter, in the Architecting for containerization and Kubernetes section.

What to watch out for

Not all applications can be containerized, and containerization removes some controls that would otherwise be available on a standard virtual machine.

As the number of images and containers increases in an application, it can become challenging to maintain and manage them; in these cases, an orchestration layer may be required, which we will cover next.

Azure Kubernetes Service (AKS)

Microservice-based applications often require specific capabilities to be effective, such as automated provisioning and deployment, resource allocation, monitoring and responding to container health events, load balancing, traffic routing, and more.

Kubernetes is a service that provides these capabilities, which are often referred to as orchestration.

AKS stands for Azure Kubernetes Service and is the ideal choice for microservice-based applications that need to dynamically respond to events such as individual node outages or automatically scaling resources in response to demand. Because AKS is a managed service, much of the complexity of creating and managing the cluster is taken care of for you.

The following diagram shows a high-level overview of a typical AKS cluster; it is described in more detail in the Azure Kubernetes Service section later in this chapter:

Figure 7.5 – AKS cluster

AKS is also platform-independent – any application built to run on the Kubernetes service can easily be migrated from one cluster to another, regardless of whether it is in Azure, on-premises, or even another cloud vendor.

As already stated, we cover containers and AKS in more detail later in this chapter, in the Architecting for containerization and Kubernetes section.

What to watch out for – Designing Compute Solutions

Azure Batch is, of course, not suited to interactive applications such as websites or services that must store files locally for periods – although, as already discussed, it can output results to Azure Storage.

Service Fabric

Modern applications are often built or run as microservices, smaller components that can be scaled independently of other services. To achieve greater efficiency, it is common to run multiple services on the same VM. However, because an application may be built from numerous services, each of which needs to scale, managing, distributing, and scaling them (known as orchestration) can become difficult.

Azure Service Fabric is a container orchestrator that makes the management and deployment of software packages onto scalable infrastructure easier.

The following diagram shows a typical Service Fabric architecture; applications are deployed to VMs or VM scale sets:

Figure 7.3 – Azure Service Fabric example architecture

It is particularly suited to .NET applications that would traditionally run on a virtual machine, and one of its most significant benefits is that it supports stateful services. Service Fabric powers many of Microsoft’s services, such as Azure SQL, Cosmos DB, Power BI, and others.

Tip

When building modern applications, there is often discussion around stateful and stateless applications. When a client is communicating with a backend system, such as a website, you need to keep track of those requests – for example, when you log in, how can you confirm the next request is from that same client? This is known as state. Stateless applications expect the client to track this information and provide it back to the server with every request – usually in the form of a token validated by the server. With stateful applications, the server keeps track of the client, but this requires the client to always use the same backend server – which is more difficult when your systems are spread across multiple servers.
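The stateless, token-based pattern described in the tip can be sketched with a signed token: any server holding the secret can validate the client's token, so no per-client session state is needed. The use of HMAC and the names here are illustrative, not a specific product's implementation:

```python
# Minimal sketch of stateless client tracking via a signed token.
# Any backend server with the shared secret can validate the token,
# so no server needs to remember the client between requests.
import hashlib
import hmac

SECRET = b"server-side-secret"  # illustrative; keep real secrets out of code

def issue_token(user: str) -> str:
    """Sign the user identity so it can be returned with each request."""
    sig = hmac.new(SECRET, user.encode(), hashlib.sha256).hexdigest()
    return f"{user}.{sig}"

def validate(token: str) -> bool:
    """Recompute the signature and compare in constant time."""
    user, sig = token.rsplit(".", 1)
    expected = hmac.new(SECRET, user.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)

token = issue_token("alice")
print(validate(token))             # True: any server with the secret accepts it
print(validate("alice.deadbeef"))  # False: a forged signature is rejected
```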

Using Service Fabric enables developers to build distributed systems without worrying about how those systems scale and communicate. It is an excellent choice for moving existing applications into a scalable environment without the need to completely re-architect.

What to watch out for

You will soon see that there are many similarities between Service Fabric and AKS clusters – one of the most significant differences between the two is portability. Because Service Fabric is tightly integrated into Azure and other Microsoft technologies, it may not work well if you need to move the solution to another platform.

Comparing compute options – Designing Compute Solutions

Each type of compute has its own set of strengths; however, each also has its primary use cases, and therefore, might not be suitable for some scenarios.

Virtual machines

As the closest technology to existing on-premises systems, VMs are best placed for use cases requiring fast migration to the cloud, or for legacy systems that cannot run on other services without reworking the application.

The ability to quickly provision, test, and destroy a VM makes them ideal for testing and developing products, especially when you need to ascertain how a particular piece of software works on different operating systems.

Sometimes a solution may have stringent security requirements, such as not being able to use shared compute. Running such applications on VMs helps ensure processing is not shared. Through the use of dedicated hosts, you can even provision your own physical hardware to run those VMs on.

What to watch out for

To make VMs scalable and resilient, you must architect and deploy supporting technologies or configure the machines accordingly. By default, a single VM is not resilient. Failure of the physical hardware can disrupt services, and the servers do not scale automatically.

Building multiple VMs in availability sets and across Availability Zones can protect you against many such events, and scale sets allow you to configure automatic scaling. However, these are optional configurations and may require additional components such as load balancers. These options require careful planning and can increase costs.

Important note

We will cover availability sets and scale sets in more detail in Chapter 14, High Availability and Redundancy Concepts.

Azure Batch

With Azure Batch, you create applications that perform specific tasks, which run in node pools. Node pools can contain thousands of VMs that are created, run a task, and are then decommissioned. No information is stored on the VMs themselves. However, the input and output of datasets can be achieved by reading and writing to Azure storage accounts.

Azure Batch is suited to the parallel processing of tasks and high-performance computing (HPC) batch jobs. Being able to provision thousands of VMs for short periods, combined with per-second billing, ensures efficient costs for such projects.
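The pool/job/task model above can be illustrated locally by fanning independent tasks out across a worker pool. This is only a sketch: a thread pool stands in for the node pool, and the task and its output names are invented for illustration:

```python
# Local sketch of the Batch pool/job/task model: a thread pool stands in
# for the pool of VMs, and each task is an independent unit of work.
from concurrent.futures import ThreadPoolExecutor

def run_task(frame_number: int) -> str:
    # Stand-in for per-node work, such as rendering one video frame
    return f"frame-{frame_number}.png"

frames = range(8)                                   # the "job": independent tasks
with ThreadPoolExecutor(max_workers=4) as pool:     # the "node pool"
    outputs = list(pool.map(run_task, frames))      # tasks fan out across nodes

print(len(outputs), outputs[0])  # in Batch, results would be written to Azure Storage
```

Because each task is independent and stores nothing locally between runs, adding more nodes shortens the job roughly linearly, which is exactly the property Batch exploits at VM scale.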

The following diagram shows how a typical batch service might work. As we can see, input files can be ingested from Azure Storage by the Batch service, which then distributes them to nodes in a node pool for processing. The code that performs the processing is held within Azure Batch as a ZIP file. All output is then sent back out to the storage account:

Figure 7.2 – Pool, job, and task management

Some examples of a typical workload may include the following:

  • Financial risk modeling
  • Image and video rendering
  • Media transcoding
  • Large data imports and transformation

With Azure Batch, you can also opt for low-priority VMs – these are cheaper but do not have guaranteed availability. Instead, they are allocated from surplus capacity within the data center. In other words, you must wait for the surplus compute to become available.

Understanding different types of compute – Designing Compute Solutions-2

  • Scalability

Different services have different methods for scaling. Legacy applications may need to use traditional load balancing methods by building VMs in web farms with load balancers in front to distribute the load.

Modern web applications can make use of App Service or Azure Functions, which scale automatically without the need for additional components.

  • Availability

Each Azure service has a Service-Level Agreement (SLA) that determines a baseline for how much uptime a service offers. The mix of components used can also affect this value. For example, a single VM has an SLA of 99.9%, whereas two VMs across Availability Zones with a load balancer in front have an SLA of 99.99%.

Azure Functions and App Service have an SLA of 99.95% without any additional components.

Important note

Service-Level Agreements (SLAs) define specific metrics by which a service is measured. In Azure, it is the amount of time any particular service is agreed to be available for. This is usually measured as a percentage of that uptime – for example, 99.95% (referred to as three and a half nines) or 99.99% (referred to as four nines). Your choice of components and how they are architected will impact the SLA Microsoft offers.

An SLA of 99.95% means up to 4.38 hours of downtime a year is allowed, whereas 99.99% means only 52.60 minutes are permitted.
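The downtime figures above follow directly from the SLA percentage. The following converts an SLA into allowed downtime per year, using a 365.25-day year, which reproduces the numbers in the text:

```python
# Converting an SLA percentage into allowed downtime per year,
# using a 365.25-day year as in the figures above.
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes

def downtime_minutes_per_year(sla_percent: float) -> float:
    """Minutes of downtime per year permitted under the given SLA."""
    return MINUTES_PER_YEAR * (1 - sla_percent / 100)

print(round(downtime_minutes_per_year(99.95) / 60, 2))  # 4.38 hours
print(round(downtime_minutes_per_year(99.99), 2))       # 52.6 minutes
```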

  • Security

As services move from IaaS to PaaS and FaaS, the security responsibility shifts. For VMs, Microsoft is responsible for the physical security and underlying infrastructure, whereas you are responsible for patching, anti-virus software, and applications that run on them. For PaaS and FaaS, Microsoft is also responsible for security on the service. However, you need to be careful of different configuration elements within the service that may not be compliant with your requirements.

For some organizations, all traffic flow needs to be tightly controlled, especially for internal services; most PaaS solutions support this but only as a configurable option, which can sometimes increase costs.

  • Cost

FaaS provides a very granular cost model in that you pay only for execution time, whereas IaaS and some PaaS services require you to provision set resources based on required CPU and RAM. For example, a VM incurs costs as long as it is running, which is continual for many use cases.

When migrating existing legacy applications, this may be the only option, but it isn’t the most efficient from a cost perspective. Refactoring applications may cost more upfront but could be cheaper in the long run as they only consume resources and incur costs periodically.

Similarly, a new microservice built to respond to events on an ad hoc basis would suit an Azure function, whereas the same process running on a VM would not be cost-effective.
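A back-of-envelope comparison makes the point concrete. All prices below are hypothetical placeholders, not real Azure rates; the shape of the calculation, always-on hours versus per-execution billing, is what matters:

```python
# Hypothetical cost comparison: an always-on VM versus per-execution
# Functions billing. Prices are illustrative, not real Azure rates.
HOURS_PER_MONTH = 730

def vm_monthly_cost(price_per_hour: float) -> float:
    return price_per_hour * HOURS_PER_MONTH            # billed while running

def functions_monthly_cost(executions: int, price_per_million: float) -> float:
    return executions / 1_000_000 * price_per_million  # billed per execution

# A microservice handling 100,000 ad hoc events per month:
print(round(vm_monthly_cost(0.10), 2))                 # always-on cost
print(round(functions_monthly_cost(100_000, 0.20), 2)) # pay-per-event cost
```

For sparse, event-driven workloads the per-execution total can be orders of magnitude lower than running a VM continuously, while a constantly busy workload may tip the balance the other way.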

  • Architecture styles

How an application is designed can directly impact the choice of technology. VMs are best suited to older architectures such as N-tier, whereas microservice and event-driven patterns are well suited to Azure Functions or containerization.

  • User skills

Azure provides several technologies for no-code development. Power Automate, a workflow development system, is specifically built to allow end users with no development knowledge to quickly create simple apps.

As you can see, to decide on a compute technology, you must factor in many different requirements. The following chart shows a simple workflow to help in this process:

Figure 7.1 – Compute options workflow

Next, we will look in more detail at each service and provide example use cases.

Load balancing and advanced traffic routing – Network Connectivity and Security

Many PaaS options in Azure, such as Web Apps and Functions, automatically scale as demand increases (and within limits you set). For this to function, Azure places services such as these behind a load balancer to distribute the load between them and redirect traffic from unhealthy nodes to healthy ones.

There are times when either a load balancer is not included, such as with VMs, or when you want additional functionality not provided by the standard load balancers, such as the ability to balance between regions. In these cases, we have the option to build and configure our own load balancers. You can choose from several options, each providing its own capabilities depending on your requirements.

Azure Load Balancer

Azure Load Balancer allows you to distribute traffic across VMs, allowing you to scale apps by distributing load and offering high availability. If a node becomes unhealthy, traffic is not sent to it, as shown in the following diagram:

Figure 8.16 – Azure Load Balancer

Load balancers distribute traffic and manage the session persistence between nodes in one of two ways:

  • The default is a five-tuple hash. The tuple is composed of the source IP, source port, destination IP, destination port, and protocol type. Because the source port is included in the hash and the source port changes for each session, clients might be using different VMs between sessions. This means applications that need to maintain a state for a client between requests will not work.
  • The alternative is source IP affinity. This is also known as session affinity or client IP affinity. This mode uses a two-tuple hash (from the source IP address and destination IP address) or a three-tuple hash (from the source IP address, destination IP address, and protocol type). This ensures that a specific client’s requests are always sent to the same VM behind the load balancer. Thus, applications that need to maintain state will still function.
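The difference between the two modes can be sketched as a hash over different tuples. This is a simplified stand-in for the load balancer's real hash function, with invented field values; the point is that a two-tuple is stable across sessions, while a five-tuple changes with the client's ephemeral source port:

```python
# Sketch of hash-based backend selection. A two-tuple (source IP affinity)
# is identical for every session from a client, so it always maps to the
# same backend. A five-tuple includes the ephemeral source port, which
# changes per session and can map sessions to different backends.
# The hash function and field values are illustrative.

def pick_backend(tuple_fields: tuple, backend_count: int) -> int:
    """Deterministic stand-in for the load balancer's hash function."""
    return hash(tuple_fields) % backend_count

client = ("203.0.113.7", "10.0.0.4")  # source IP, destination IP

# Source IP affinity: the same two-tuple every session, so the same backend
first = pick_backend(client, backend_count=3)
second = pick_backend(client, backend_count=3)
print(first == second)  # True: the client always reaches the same VM

# Five-tuple: a new ephemeral source port each session may change the result
session_a = pick_backend(client + (50001, 443, "TCP"), backend_count=3)
session_b = pick_backend(client + (50002, 443, "TCP"), backend_count=3)
print(0 <= session_a < 3 and 0 <= session_b < 3)  # True: both are valid backends
```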

Load balancers can be configured to be either internal (private) or external (public) facing, and there are two SKUs for load balancers: Basic and Standard. The Basic tier is free but supports only up to 300 instances, only VMs in availability sets or scale sets, and only HTTP and TCP protocols when configuring health probes. The Standard tier supports more advanced management features, such as zone-redundant frontends for inbound and outbound traffic and HTTPS probes, and you can have up to 1,000 instances. Finally, the Standard tier has an SLA of 99.99%, whereas the Basic tier offers no SLA.