Quick start guide to the smallest OpenShift cluster for Windows workload

August 21, 2023

Introduction

As a Solution Architect for Red Hat’s ecosystem, I talk to many independent software vendors (ISVs) about modernizing their applications to enable hybrid cloud and edge strategies. Linux, containers, Kubernetes and microservices architectures are the default choice in many new application development projects nowadays. But sometimes you cannot get rid of certain dependencies in your legacy stack as quickly as you would like to. Maybe your monolithic application, with decades of development invested, is still at the core of your stack, and slicing it into microservices will take years. Or it has dependencies on libraries or drivers for peripheral devices that are only available on Microsoft Windows. Or you just have to run it a bit longer until a certain product version finally reaches its end of life.

Red Hat OpenShift provides several ways to run this legacy workload close to your modern workload. In bare metal setups, OpenShift Virtualization is one way to run Windows virtual machines within a Kubernetes cluster. VM workload can run on the same host as your containers for the highest density and easy network setups, especially at the edge where compute capacity may be scarce. If applications already use .NET Core, Red Hat even provides Linux-based container images for it. Windows containers can be an interesting option when you are somewhere in between. While they are often huge in size, they can help development teams of Windows applications benefit from automation capabilities. But even though they are called containers, they still require a Windows host to run. After years of previews and long lists of limitations, OpenShift has now supported adding Windows worker nodes to a cluster for quite some time, and the technology has matured.

This blog post is a quick start guide to setting up a small mixed cluster for Linux and Windows containers with Red Hat OpenShift. To minimize hardware requirements (maybe for deployments at the edge – or if you are in a limited lab environment like me), I will walk you through adding a new Windows worker node to an OpenShift cluster running on a single node (SNO).

In datacenter and cloud environments you may leverage a lot of automation in the OpenShift installer and control plane to achieve an installation like this. But this post describes how it can run in a “bring your own” scenario (with manually set up VMs running on a KVM host in this case). So if you’re looking for a fast track to the smallest OpenShift cluster running Windows containers, or you’re searching for inspiration for your new edge architecture, hopefully this post is valuable to you. Please note that while it works, some aspects of this architecture may be unsupported at this stage and require a Red Hat support review before taking it into production.

Overview of lab setup

My lab setup is depicted in the diagram below. I will be running both nodes for this scenario, the SNO (in red) and Windows worker (in green), side by side as virtual machines on a bare metal RHEL 8 installation hosted at Hetzner. Since I am following the concept of “user provisioned infrastructure” (UPI) for OpenShift and “bring your own host” (BYOH) for the Windows worker, the instructions below will also work in any other x86 environment (bare metal, virtualized or cloud) and do not rely on automation capabilities of the underlying platform. I also use this RHEL host as a bastion for connecting to the nodes and executing CLI commands (such as “oc”).

The virtualized setup allows me to quickly repurpose my lab and also implement supporting networking functions that you would expect in any production IT environment – and which are prerequisites for many OpenShift installations.

My DNS records for OpenShift ingress and API traffic are hosted at AWS Route 53 (in purple) for easy access during demos, but again this could be any other nameserver present in your environment (e.g. on your bastion system). The arrows give a rough indication of the communication flow when accessing the OpenShift console or workload ingress.

Here I am also running an HAProxy instance in a container on Podman on my RHEL bastion host. It serves as a load balancer in other scenarios I normally run on the same host, for example three-node OpenShift clusters with a highly available control plane. It’s also handy when adding a second Linux worker to the SNO and running my ingress router in HA mode. If you’re only planning a single OpenShift node without any intention to scale out with more Linux workers (as in this blog post), this is not required.

lab overview diagram

If you are interested in learning more about setting up OpenShift labs with limited hardware, check out the RedHat-EMEA-SSA-Team GitHub repository.

Prerequisites Overview

In summary the prerequisites for this lab are:

  • One system for SNO with minimum 8 vCPU cores, 16 GB RAM and 120 GB Storage
  • One system as Windows worker with minimum 2 vCPU cores, 8 GB RAM and 100 GB Storage
  • Three DNS entries pointing to your SNO for API and ingress traffic (you can verify them with dig, as shown after this list). For a cluster named “ocp4” in the “example.com” domain, the entries should look like this:
    • api.ocp4.example.com
    • api-int.ocp4.example.com
    • *.apps.ocp4.example.com
  • A Linux bastion or workstation to run installation commands and CLI clients
  • Internet connectivity for all systems
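
To make sure name resolution is in place before you start, you can check the records from the bastion; a quick sketch using the example names above (the wildcard record is checked with an arbitrary host under *.apps):

dig +short api.ocp4.example.com
dig +short api-int.ocp4.example.com
dig +short test.apps.ocp4.example.com

All three lookups should return the IP address of your SNO (192.168.50.10 in my lab).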

Technical implementation

In this blog I will guide you through the following high-level steps for adding a Windows worker node to an OpenShift cluster running on a single node:

  • Setting up a new SNO (optional if you already have a working cluster)
  • Setting up the new Windows worker node
  • Configuration of an existing OpenShift cluster to accept Windows nodes
  • Verifying the cluster setup with a new deployment

This is designed as an opinionated quick start guide for the cluster installation, based on many defaults and out-of-the-box settings of a given release. Please refer to the Red Hat OpenShift documentation for Windows container support for more configuration options and details.

Prerequisites check for existing OpenShift clusters

If you already have a running OpenShift cluster (or SNO) that matches the prerequisites, you may skip the following section.

Please make sure that the cluster was set up with the “OVNKubernetes” network plugin, which is the default for SNO and newer versions of OpenShift in general. Verify this either in your install-config.yaml (… networking -> networkType: OVNKubernetes …) or with the following oc command:

oc get network.operator.openshift.io -o jsonpath="{.items[0].spec.defaultNetwork.type}"

Windows worker nodes will only work with this network type, and unfortunately it can only be set at installation time. If your existing cluster configuration does not match, or you don’t have an OpenShift cluster yet, you may use the following instructions to quickly set up a SNO.

Installing a new SNO

I will walk you through the local agent-based installation, which is most likely to work in lab setups. Log in to https://console.redhat.com with your Red Hat account. Create a free account and use a 60-day evaluation subscription for OpenShift in case you don’t have one already. Then navigate to the OpenShift installer at https://console.redhat.com/openshift/install/platform-agnostic/agent-based.

Download and extract the installer, pull secret and CLI tools for your bastion system from section 1 of the page.

Create an install-config.yaml file for your new OpenShift cluster like this:

apiVersion: v1
baseDomain: example.com
compute:
- name: worker
  replicas: 0
controlPlane:
  name: master
  replicas: 1
metadata:
  name: ocp4
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 192.168.50.0/24
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
platform:
  none: {}
fips: false
pullSecret: '{"auths": …}'
sshKey: 'ssh-ed25519 AAA…'

Replace the base domain “example.com” and cluster name “ocp4” for your environment, matching your DNS records. Adjust the machine network for the subnet that both your nodes will reside in (192.168.50.0/24 in my lab). Copy in the content of the pull-secret.txt you downloaded before and your SSH public key to access the cluster.

If you don’t have an SSH key already, create one like this (make sure not to use a passphrase):

ssh-keygen -t ed25519 -N '' -f ./clusterkey

In this example, use the content of clusterkey.pub in the sshKey section at the bottom of your install-config.yaml.

Now create the agent-config.yaml file to provide the installer with configuration for your new node. My file with a static IP config for my node looks like this:

apiVersion: v1alpha1
kind: AgentConfig
metadata:
  name: ocp4
rendezvousIP: 192.168.50.10
hosts:
  - hostname: master-0
    interfaces:
      - name: enp1s0
        macAddress: 00:ef:44:21:e6:a5
    networkConfig:
      interfaces:
        - name: enp1s0
          type: ethernet
          state: up
          mac-address: 00:ef:44:21:e6:a5
          ipv4:
            enabled: true
            address:
              - ip: 192.168.50.10
                prefix-length: 24
            dhcp: false
      dns-resolver:
        config:
          server:
            - 192.168.50.1
      routes:
        config:
          - destination: 0.0.0.0/0
            next-hop-address: 192.168.50.1
            next-hop-interface: enp1s0
            table-id: 254

The installer will consume and delete the config files when it runs. So create a dedicated folder for the installation files (here ./install-dir) and copy the agent-config.yaml and install-config.yaml over there:

mkdir ./install-dir
cp agent-config.yaml ./install-dir
cp install-config.yaml ./install-dir

Before running the installer, make sure to install the required dependencies:

sudo dnf install /usr/bin/nmstatectl -y

Run the installer to create an iso image for the node installation:

./openshift-install --dir ./install-dir agent create image

Now boot your designated SNO system from the created image at install-dir/agent.x86_64.iso.
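
If your lab runs on a KVM host like mine, a minimal sketch for booting the SNO VM from the generated ISO with virt-install could look like this. The VM name, libvirt network name and os-variant are assumptions for illustration; the MAC address must match the one in your agent-config.yaml:

# assumptions: a libvirt network bridged to 192.168.50.0/24 named "lab",
# and resources matching the SNO minimum (8 vCPUs, 16 GB RAM, 120 GB disk)
virt-install \
  --name sno-master-0 \
  --vcpus 8 \
  --memory 16384 \
  --disk size=120 \
  --os-variant rhel8.6 \
  --network network=lab,mac=00:ef:44:21:e6:a5 \
  --cdrom ./install-dir/agent.x86_64.iso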

This will take some time; you can follow status updates on the system console during the installation.

The system will automatically reboot during this process. You can also track the progress of the installation from your bastion host:

./openshift-install --dir install-dir/ agent wait-for bootstrap-complete --log-level=info
./openshift-install --dir install-dir/ agent wait-for install-complete --log-level=info

While you’re waiting, have a break or skip ahead and start to prepare your Windows node.

After successful installation, log in to your newly provisioned OpenShift cluster. For CLI login:

export KUBECONFIG=<full-path-to>/install-dir/auth/kubeconfig
oc login --insecure-skip-tls-verify=true --username=kubeadmin --password=<kubeadmin-password>

And via your browser at: https://console-openshift-console.apps.ocp4.example.com/ 

Replace cluster name and domain name in that URL with your settings, accept the security risk for unknown certificates and login as kubeadmin with the provided password from the install-dir/auth/kubeadmin-password file.
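
As a quick sanity check of the fresh cluster, verify that the single node is Ready and all cluster operators report Available:

oc get nodes
oc get clusterversion
oc get clusteroperators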

Preparing a new Windows system

Now that you have a working OpenShift cluster, go ahead and prepare a Windows node on the same network. In this blog I am working with Windows Server 2022 Datacenter Edition, build 10.0.20348. Microsoft provides a Windows Server 2022 evaluation download at: https://info.microsoft.com/ww-landing-windows-server-2022.html

Log in as Administrator and configure the system to run on the same network as your SNO, with working DNS lookups. Its IP is 192.168.50.20 in my environment.

Then set up ssh connectivity for the node via these PowerShell commands:

Add-WindowsCapability -Online -Name OpenSSH.Server~~~~0.0.1.0
Add-WindowsCapability -Online -Name OpenSSH.Client~~~~0.0.1.0
Set-Service -Name ssh-agent -StartupType 'Automatic'
Set-Service -Name sshd -StartupType 'Automatic'
Start-Service ssh-agent
Start-Service sshd

You may have to reboot after the capability installation. Find more details about this in the Microsoft documentation.

Now you should be able to login to the Windows node via ssh from the bastion like this:

ssh Administrator@192.168.50.20

You will end up with a cmd.exe prompt and can switch to Windows PowerShell via the “powershell” command. Do not change the OpenSSH configuration to use a different default shell. Next, enable key-based SSH access for the Administrator user as well. For production systems refer to the Microsoft documentation; here is the quick start:

If you don’t have a key yet, create one on your bastion like this:

ssh-keygen -N '' -f ./windowskey

Copy the content of windowskey.pub into the administrators_authorized_keys file on the Windows node and adjust its file permissions via the following PowerShell commands:

"ssh-rsa AAA..." | Set-Content -Path 'C:\ProgramData\ssh\administrators_authorized_keys'
# Fix permission
$acl = Get-Acl C:\ProgramData\ssh\administrators_authorized_keys
$acl.SetAccessRuleProtection($true, $false)
$administratorsRule = New-Object system.security.accesscontrol.filesystemaccessrule("Administrators","FullControl","Allow")
$systemRule = New-Object system.security.accesscontrol.filesystemaccessrule("SYSTEM","FullControl","Allow")
$acl.SetAccessRule($administratorsRule)
$acl.SetAccessRule($systemRule)
$acl | Set-Acl

Also configure firewall access to the container logs:

New-NetFirewallRule -DisplayName ContainerLogsPort -Direction Inbound -Action Allow -Protocol TCP -LocalPort 10250 -EdgeTraversalPolicy Allow
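
Before moving on, verify that key-based SSH from the bastion works, because WMCO will use exactly this access path to configure the node:

ssh -i ./windowskey Administrator@192.168.50.20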

Preparing the SNO for Windows Workers

With the SNO and a Windows system running, let’s introduce both nodes to each other. For the networking of Windows pods in the Kubernetes cluster, we have to configure a hybrid OVNKubernetes overlay like this:

oc patch networks.operator.openshift.io cluster --type=merge \
  -p '{
    "spec":{
      "defaultNetwork":{
        "ovnKubernetesConfig":{
          "hybridOverlayConfig":{
            "hybridClusterNetwork":[
              {
                "cidr": "10.132.0.0/14",
                "hostPrefix": 23
              }
            ],
            "hybridOverlayVXLANPort": 9898
          }
        }
      }
    }
  }'
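
To confirm that the patch was applied, you can read the hybrid overlay settings back from the network operator configuration; the output should contain the CIDR and VXLAN port set above:

oc get network.operator.openshift.io cluster -o jsonpath='{.spec.defaultNetwork.ovnKubernetesConfig.hybridOverlayConfig}'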

The next task is to install the Windows Machine Config Operator (WMCO) from the OperatorHub in OpenShift. We’re going with all the defaults. We also enable monitoring on the namespace so that later on we’ll be able to see metrics from our Windows pods as well. For the CLI, you can create a wmco-install.yaml file with the following content:

apiVersion: v1
kind: Namespace
metadata:
  name: openshift-windows-machine-config-operator
  labels:
    openshift.io/cluster-monitoring: "true"

---

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: windows-machine-config-operator
  namespace: openshift-windows-machine-config-operator
spec:
  targetNamespaces:
  - openshift-windows-machine-config-operator

---

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: windows-machine-config-operator
  namespace: openshift-windows-machine-config-operator
spec:
  channel: "stable"
  installPlanApproval: "Automatic"
  name: "windows-machine-config-operator"
  source: "redhat-operators"
  sourceNamespace: "openshift-marketplace"

Create the resources specified in the YAML:

oc create -f wmco-install.yaml

This will install the latest version of WMCO, v8.0.1 in my case. When running in public cloud or VMware environments, WMCO may even help with automatically scaling Windows workers. Here we are choosing the Bring Your Own Host (BYOH) approach as outlined in the documentation.
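
Before continuing, you can verify that the operator rolled out successfully; a quick check (the exact CSV version will differ per release):

oc get csv -n openshift-windows-machine-config-operator
oc get pods -n openshift-windows-machine-config-operator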

For that we’re adding the SSH key for the Windows node as a secret to OpenShift:

oc create secret generic cloud-private-key --from-file=private-key.pem=./windowskey -n openshift-windows-machine-config-operator

WMCO watches the windows-instances configmap in its namespace for newly added or removed Windows nodes. Create a windows-instances.yaml file like this:

kind: ConfigMap
apiVersion: v1
metadata:
  name: windows-instances
  namespace: openshift-windows-machine-config-operator
data:
  192.168.50.20: |-
    username=Administrator

And apply it to the cluster:

oc create -f windows-instances.yaml

You can watch the log of the WMCO pod processing the configmap and configuring the new node. All necessary installations, such as the container runtime and the kubelet, are handled by the operator. The Windows host will also automatically reboot during this onboarding process.
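
A quick way to follow along from the bastion is tailing the operator log (assuming the deployment name created by the subscription above):

oc logs -f deployment/windows-machine-config-operator -n openshift-windows-machine-config-operator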

After a short time, the Windows worker will appear as the second node on the OpenShift cluster.
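
You can confirm this with the OpenShift client; the new node should report Ready and list windows as its operating system:

oc get nodes -o wide
oc get nodes -l kubernetes.io/os=windows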

We have successfully added a Windows worker node to a single node OpenShift cluster.

Testing the setup

You can create a win-test-deployment.yaml with the following content to quickly check that everything is working as expected. It will create a windows-demo namespace with a sample deployment of a small web server, a service and a route. Please make sure to adjust the spec of the route to match your *.apps wildcard domain.

apiVersion: v1
kind: Namespace
metadata:
  name: windows-demo

---

apiVersion: v1
kind: Service
metadata:
  name: win-webserver
  namespace: windows-demo
  labels:
    app: win-webserver
spec:
  ports:
    # the port that this service should serve on
  - port: 80
    targetPort: 80
  selector:
    app: win-webserver
  type: LoadBalancer

---

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: win-webserver
  name: win-webserver
  namespace: windows-demo
spec:
  selector:
    matchLabels:
      app: win-webserver
  replicas: 1
  template:
    metadata:
      labels:
        app: win-webserver
      name: win-webserver
    spec:
      tolerations:
      - key: "os"
        value: "Windows"
        effect: "NoSchedule"
      containers:
      - name: windowswebserver
        image: mcr.microsoft.com/windows/servercore:ltsc2022
        imagePullPolicy: IfNotPresent
        command:
        - powershell.exe
        - -command
        - $listener = New-Object System.Net.HttpListener; $listener.Prefixes.Add('http://*:80/'); $listener.Start();Write-Host('Listening at http://*:80/'); while ($listener.IsListening) { $context = $listener.GetContext(); $response = $context.Response; $content='<html><body><H1>You have successfully deployed your first Windows Container Workload!</H1></body></html>'; $buffer = [System.Text.Encoding]::UTF8.GetBytes($content); $response.ContentLength64 = $buffer.Length; $response.OutputStream.Write($buffer, 0, $buffer.Length); $response.Close(); };
        securityContext:
          runAsNonRoot: false
          windowsOptions:
            runAsUserName: "ContainerAdministrator"
      nodeSelector:
        kubernetes.io/os: windows
      os:
        name: windows

---

kind: Route
apiVersion: route.openshift.io/v1
metadata:
  name: win-route
  namespace: windows-demo
  labels:
    app: win-webserver
spec:
  host: win-route-default.apps.ocp4.example.com
  to:
    kind: Service
    name: win-webserver
    weight: 100
  port:
    targetPort: 80
  wildcardPolicy: None

Create the windows-demo namespace and all objects in the cluster via the OpenShift client:

oc create -f win-test-deployment.yaml

The first deployment may take a while because the container image has to be downloaded to the node. The Windows container image in this deployment is about 1.75 GB and thus a tad bigger than typical Linux base images. Explore the created resources like you would with a Linux container. The pod object also shows events, logs and even a terminal running cmd.exe.
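
For example, from the CLI (the same information is available in the web console):

oc get pods -n windows-demo -o wide
oc logs deployment/win-webserver -n windows-demo
oc exec -it deployment/win-webserver -n windows-demo -- cmd.exe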

Navigate to your route at http://win-route-default.apps.ocp4.example.com/ (again, adjust cluster name and domain for your environment). You should see the message “You have successfully deployed your first Windows Container Workload!” served by the Windows container.
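
You can also check the route from the bastion with curl (assuming the example hostname above):

curl http://win-route-default.apps.ocp4.example.com/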

Congratulations!

Next steps with the cluster

Please note that the Windows worker has a taint configured to avoid scheduling of pods onto it.

This is a good idea: it prevents Kubernetes from trying to schedule any Linux workload there (which would result in ErrImagePull and ImagePullBackOff errors). That is annoying for your normal workload, but it can even break your cluster control plane and upgrades when scheduling for those components ends up on the wrong node.

But if there is a taint – why did our test deployment work before? Have a closer look at the spec of the win-webserver deployment. It specifies a nodeSelector for “kubernetes.io/os: windows” and defines a toleration for our taint “os: Windows”. That means it will be scheduled there anyway.
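
If you want to see the taint for yourself, a quick check against the node spec (substitute your Windows node name):

oc get node <windows-node-name> -o jsonpath='{.spec.taints}'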

As a next learning item, you may want to explore labels, nodeSelector and RuntimeClass objects for pod scheduling in your mixed clusters. The Kubernetes documentation provides an introduction to the differences between Linux and Windows containers and what to watch out for.

If you are familiar with containers on Windows but new to OpenShift, there may be a few new elements for you to learn. Built-in security mechanisms in this enterprise distribution may make it seem harder to onboard your existing containers, but they will eventually make the transition to production a lot easier. In an OpenShift cluster, even on Windows nodes, principles such as Security Context Constraints (SCC) will always be applied by default.

Summary

Adding Windows worker nodes to an OpenShift cluster has become a lot easier with the latest releases. It comes down to enabling SSH access and firewall configuration on your Windows node, as well as configuring networking and an operator with a secret and a small ConfigMap on the OpenShift side. It even works for the smallest clusters, like the single node with minimum requirements shown in this guide.

Running Windows containers in enterprise Kubernetes environments will introduce security requirements for your Windows workload and nodes. You will also see additional day-2 effort when operating mixed clusters. Before going down the path of Windows containers, make sure your workload is suited for it, and always weigh this implementation against other concepts such as VMs in OpenShift or Linux containers with the .NET Core runtime.