10 lessons learned while automating elementary container deployment tasks on Linux

November 8, 2021

Our experiences preparing for a demo at AnsibleFest 2021

Speakers: Karoly “Charlie” Vegh and Robert “Bob” Baumgartner
Supported by: Chris Jung, Phil Griffith, Eric Lavarde, Elle Lathram

Introduction and Purpose

AnsibleFest is always pretty cool: new announcements, live demos, roundtable discussions and more. With Bob being the Kubernetes/Container Architect and my humble self being more of a Linux and Automation guy, we debated what kind of talk we could deliver together, and agreed on a topic where our areas of expertise overlap: container task automation on Linux.

It turns out we got ourselves into quite an experience (and you should too! Many open source conferences have Calls for Papers running throughout the year, submit your talks to share know-how!), and we thought it would make sense to share the lessons learned.

This article covers the top ten key technical and non-technical takeaways from our project: about conferences, about containers, about the new Ansible tools, about project preparation, and also some deeper-dive tech lessons we learned while building the use-case demonstration for AnsibleFest.

Anyway, here’s a teaser video for the session to get a feeling for the content before we dive into the list of lessons learned:

( …and here’s the link to the whole AnsibleFest recording: https://events.ansiblefest.redhat.com/widget/redhat/ansible21/sessioncatalog/session/16243547159370014zgW )

…but now let’s get down to it:

The Lessons, Part 1, the non-hacking lessons:

0x1: The non-tech lesson, the human side: Yes, you can deliver content people are interested in.

At any given conference the audience’s background and experience can be assumed to be quite diverse. Whatever the depth and focus of your talk, it will fit many of the participants. Not all, obviously, but even for an introductory session like this one (starting with Podman 101 and Ansible Automation Platform 2.x 101) we got very positive feedback, with attendees telling us it was very helpful and that they had learned a lot.

Ergo: Take a deep breath, and submit your talks (even if they aren’t completely finished yet) to conferences. Others (and you too!) will learn and profit from them.

0x2: The project management lesson: When automating, start simple, prepare by defining the project basics:

Where do you start? Use-cases, technology definitions, efficiency, up-to-date software

  • Use-cases are king in automation projects. Define them well, clarify the limits of what you want to implement as well as what you do NOT intend to address. See the teaser of this talk as an example: we pinned down that we would automate some container deployment procedures, but did not intend to build a new Ansible-based container-orchestration platform.
  • Define what technology you are going to deal with. In our case, containers: Podman on RHEL. RHEL is available for everyone through a Developer Subscription. Podman is included in it. Podman is OCI compliant, hence what you now have is an Enterprise Linux plus OCI containers. A good start. 
  • Reuse, reduce, recycle – efficiency is key. Look up what Ansible content you have readily available. In our case: To automate Podman we used the Podman Ansible Collection (among others). 
  • Be up-to-date: Make sure you start with current software versions. In our case: RHEL 8.4 and Podman 3.2. Ansible Automation Platform 2.0 was available with Ansible Core 2.11, so we went with that.
  • Be future oriented: Use the modern features. We looked into the AAP’s new Execution Environments and planned to use those.

0x3: The Field of technology lesson: Learn the technology you’ll automate.

Get some understanding and a good feeling for the technology you intend to automate. In our case: Podman. And Podman has grown quite a bit!

Last, but very important: Have someone to talk to with experience in the field. I thankfully had the one and only Robert Baumgartner as Container Know-How Carrier Instance and could convince him to support the project.

0x4: The development lesson: You’ll develop some code – create a development workflow!

  • Versioning: Yes, you’ll need git. You have a GitHub or GitLab account, don’t you?
  • Roles: Break your use-cases into Ansible Roles, obviously. For portability. 
  • Collections: Build your own Ansible Collection, add dependencies and your new roles.
  • Publish your Collection to Ansible Galaxy, the Red Hat Automation Hub or to your corp’s Private Automation Hub. 
  • Build your Execution Environments with the newly created Collection and its dependencies
  • Create a container repo on your container registry to upload your EE image to (we used quay.io, created a publicly available repo there, and uploaded the locally built EE image instead of trying to get quay.io to build it from a Containerfile)
  • (you might want to script/automate these steps, for you’ll rebuild every now and then)
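
The rebuild steps above lend themselves to a small helper script. What follows is a minimal sketch and not the tooling we actually used: the function names and directory arguments are made up for illustration, and it assumes that the collection directory holds a galaxy.yml and the EE directory an execution-environment.yml.

```shell
#!/bin/sh
# Hypothetical rebuild helpers; function names and arguments are
# illustrative assumptions, not the scripts used for the demo.

# Build the collection tarball from the collection source directory
# (the directory that holds galaxy.yml). --force overwrites an
# existing tarball of the same version.
build_collection() (
    collection_dir="$1"
    cd "$collection_dir" &&
        ansible-galaxy collection build --force
)

# Build the EE image from the execution-environment.yml found in
# ee_dir, then push the resulting image to the container registry.
build_and_push_ee() (
    ee_dir="$1"
    image="$2"
    cd "$ee_dir" &&
        ansible-builder build --tag "$image" --container-runtime podman &&
        podman push "$image"
)
```

Assuming you are already logged in to the registry (podman login), a rebuild could then look like: build_collection ./collection && build_and_push_ee ./ee quay.io/kvegh/podautee:latest. Publishing the collection tarball (ansible-galaxy collection publish) would sit between the two calls if your EE pulls the collection from Galaxy or an Automation Hub.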

0x5: The UseCase lesson: Automation projects rise and fall with their use-cases – define them clearly

Ask yourself the question: “What is the bare minimum target I want to automate?” – and take it from there, extending the use-case step by step. Keep an eye on the capabilities of the Ansible Collection you’re using. In our case: 

Bob built a fairly simple setup of two containers: One being the DB backend (PostgreSQL) and the other being the application layer (node.js frontend). 

We defined: 

  • Usecase1: Rootless standalone containers.
  • Usecase2: Rootless containers sharing a pod.
  • Usecase3: Root containers communicating via a Podman (CNI) network.
  • Usecase4: Root containers, replaying an OCP extracted Kubefile.

0x6: The Complexity lesson: the KISS principle. Start simple and classical, adopt more complex/modern features gradually.

  • Start with an ansible-playbook implementation
  • Then move on to ansible-navigator with Execution Environments pulled from your container registry
  • Move on to Automation Controller (ex Tower), import your EE, run there.
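
The first two stages differ only in the command that launches the playbook. Here is a minimal sketch of both; the function names and the argument order are our own invention for illustration, while the flags are the ones that appear elsewhere in this article.

```shell
#!/bin/sh
# Illustrative wrappers for the first two stages; the function names
# and argument order are assumptions made for this sketch.

# Stage 1: classic run with the locally installed ansible-playbook.
run_classic() (
    playbook="$1"
    inventory="$2"
    ansible-playbook -i "$inventory" "$playbook"
)

# Stage 2: the same playbook, executed inside an Execution Environment
# image that ansible-navigator pulls from the container registry.
# "-m stdout" prints plain ansible-playbook style output.
run_in_ee() (
    playbook="$1"
    inventory="$2"
    ee_image="$3"
    ansible-navigator run "$playbook" --eei "$ee_image" -i "$inventory" -m stdout
)
```

For example: run_classic usecase3_rootcontainers_networked.yml inventory/hosts, and once that works, run_in_ee usecase3_rootcontainers_networked.yml inventory/hosts quay.io/kvegh/podautee.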

So far so good. Now let’s get technical.

The Lessons, Part 2, the actual implementation lessons:

0x7: Implementation lesson #1: About building your EE image on Quay.

We intended to deliver only a Containerfile definition of our EE to quay.io and have Quay build the image itself. But building our EE required accessing the AAP 2.x EEs, and since we couldn’t get Quay to authenticate to registry.redhat.io within a few minutes, we opted for building our EE image locally with ansible-builder and making our container repo on quay.io public for pulls. It was simpler that way.

0x8: Implementation lesson #2: Port mapping belongs to the Pod, not to the container.

When we moved on to Usecase2 and Usecase3, with Pods, it took us a while (and some forehead slapping) to understand that the network configuration (port mapping) has to be configured at the Pod level, not at the container level.
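
On the plain Podman command line the same rule looks as follows. This is only a CLI sketch of the idea, not the playbook from our demo: the function name is hypothetical, and the pod name, port mapping and images are placeholders supplied by the caller. Image-specific settings such as database credentials are left out.

```shell
#!/bin/sh
# CLI sketch of a pod with a database and an application container.
# The function name is hypothetical; all names, images and ports are
# placeholders passed in by the caller.

# The port mapping is declared once, on the pod. The containers then
# join the pod with --pod and carry no publish option of their own,
# because they share the pod's network namespace.
create_app_pod() (
    pod="$1"
    port_map="$2"
    db_image="$3"
    app_image="$4"
    podman pod create --name "$pod" --publish "$port_map" &&
        podman run --detach --pod "$pod" --name "${pod}-db" "$db_image" &&
        podman run --detach --pod "$pod" --name "${pod}-app" "$app_image"
)
```

A call would look like: create_app_pod demo-pod 8080:8080 followed by your database image and your application image.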

0x9: Implementation lesson #3: ansible-navigator is very helpful at debugging.

If you need to understand what ansible-navigator is doing within your container, and how it is running ansible-playbook within it, have a look at the “--ll debug” option, the log level option. Setting it to debug allows you to see how ansible-navigator runs podman and how podman runs ansible-playbook:

Here it is as text, for copy-pasting:
--------------------------------------------------------8<------------------------------------


(navigator_venv) [ansible@AAP testenv]$ ansible-navigator --ll debug run usecase3_rootcontainers_networked.yml --vault-password-file /home/ansible/gitignored_secret --eei quay.io/kvegh/podautee -e @/home/ansible/vault-auth.yml -i inventory/hosts -m stdout -u ansible 

PLAY [UseCase3 create rootcontainers with network] *****************************

TASK [Gathering Facts] *********************************************************
ok: [rhel-pod-aut42.kveghdemo.at]
ok: [rhel-pod-aut41.kveghdemo.at]

TASK [kvegh.podman_autodemo.clean_env : clean all nonroot containers] **********
changed: [rhel-pod-aut42.kveghdemo.at]
changed: [rhel-pod-aut41.kveghdemo.at]

TASK [kvegh.podman_autodemo.clean_env : clean all root containers] *************
changed: [rhel-pod-aut41.kveghdemo.at]
changed: [rhel-pod-aut42.kveghdemo.at]

TASK [kvegh.podman_autodemo.clean_env : clean all nonroot containers] **********
^CTerminated
(navigator_venv) [ansible@AAP testenv]$ grep "container engine invocation" ansible-navigator.log | tail -1 
211020165954.123 DEBUG 'ansible-runner.wrap_args_for_containerization' container engine invocation: podman run --rm --tty --interactive -v /home/ansible/projects/podman_autodemo/testenv/:/home/ansible/projects/podman_autodemo/testenv/ --workdir /home/ansible/projects/podman_autodemo/testenv -v /home/ansible/:/home/ansible/ -v /home/ansible/projects/podman_autodemo/testenv/inventory/:/home/ansible/projects/podman_autodemo/testenv/inventory/ -v /tmp/ssh-JHIy2UTAtTkH/:/tmp/ssh-JHIy2UTAtTkH/ -e SSH_AUTH_SOCK=/tmp/ssh-JHIy2UTAtTkH/agent.2151411 -v /home/ansible/.ssh/:/home/runner/.ssh/ --group-add=root --ipc=host -v /tmp/ansible-navigator_tqcjp_gu/artifacts/:/runner/artifacts/:Z -v /tmp/ansible-navigator_tqcjp_gu/:/runner/:Z --env-file /tmp/ansible-navigator_tqcjp_gu/artifacts/b768e2a9-c7db-4a18-9718-4b788262d27a/env.list --quiet --name ansible_runner_b768e2a9-c7db-4a18-9718-4b788262d27a quay.io/kvegh/podautee:latest ansible-playbook /home/ansible/projects/podman_autodemo/testenv/usecase3_rootcontainers_networked.yml --vault-password-file /home/ansible/gitignored_secret -e @/home/ansible/vault-auth.yml -u ansible -i /home/ansible/projects/podman_autodemo/testenv/inventory/hosts
(navigator_venv) [ansible@AAP testenv]$ 
(navigator_venv) [ansible@AAP testenv]$ 
(navigator_venv) [ansible@AAP testenv]$ podman run --rm --tty --interactive -v /home/ansible/projects/podman_autodemo/testenv/:/home/ansible/projects/podman_autodemo/testenv/ --workdir /home/ansible/projects/podman_autodemo/testenv -v /home/ansible/:/home/ansible/ -v /home/ansible/projects/podman_autodemo/testenv/inventory/:/home/ansible/projects/podman_autodemo/testenv/inventory/ -v /tmp/ssh-JHIy2UTAtTkH/:/tmp/ssh-JHIy2UTAtTkH/ -e SSH_AUTH_SOCK=/tmp/ssh-JHIy2UTAtTkH/agent.2151411 -v /home/ansible/.ssh/:/home/runner/.ssh/ --group-add=root --ipc=host -v /tmp/ansible-navigator_tqcjp_gu/artifacts/:/runner/artifacts/:Z -v /tmp/ansible-navigator_tqcjp_gu/:/runner/:Z --env-file /tmp/ansible-navigator_tqcjp_gu/artifacts/b768e2a9-c7db-4a18-9718-4b788262d27a/env.list --quiet --name ansible_runner_b768e2a9-c7db-4a18-9718-4b788262d27a quay.io/kvegh/podautee:latest ansible-playbook /home/ansible/projects/podman_autodemo/testenv/usecase3_rootcontainers_networked.yml --vault-password-file /home/ansible/gitignored_secret -e @/home/ansible/vault-auth.yml -u ansible -i /home/ansible/projects/podman_autodemo/testenv/inventory/hosts

PLAY [UseCase3 create rootcontainers with network] *******************************************************************************************************************************************

TASK [Gathering Facts] ***********************************************************************************************************************************************************************
ok: [rhel-pod-aut41.kveghdemo.at]
ok: [rhel-pod-aut42.kveghdemo.at]

-------------------------------------------------------->8------------------------------------

...this can come in quite handy if for some reason Ansible cannot run, and you need to find out why.

0xa: Implementation lesson #3: ansible-navigator is very helpful at executing too:

I got to use some of the debugging features (see lesson 0x9) when Navigator executed Podman and Podman executed ansible-playbook, but the playbook couldn’t log in to the remote end nodes to be managed, because it couldn’t authenticate from within the EE container. I am using SSH keys for Ansible to log in, and those keys were (obviously!) not available within the EE, since the EE image is meant to be shared. Now what, how do I get those keys used? Turns out navigator is pretty cool at that too: if you start an SSH agent and add the private key, it will map the agent into the runtime environment for Ansible to use:

(navigator_venv) [ansible@AAP testenv]$ eval `ssh-agent -s` 
Agent pid 2155429
(navigator_venv) [ansible@AAP testenv]$ ssh-add /home/ansible/.ssh/id_rsa
(navigator_venv) [ansible@AAP testenv]$ 

Also, ansible-navigator bind-mounts the local directory into the runtime environment too, in case you needed to add some files at runtime. 

0xb: Implementation lesson #4: Open Source really is your friend.

Halfway through the demo project we ran into an issue where ansible-navigator failed to display output in case we used a custom EE (which we did). In a strict corporate environment we would’ve needed a Support Request to get this analyzed and fixed, but pretty soon we found out that this is a known issue, and there is a fix for it already in upstream, only it hasn’t yet been packaged and productized. 

The workaround? Create a new Python virtual environment (venv), and pip install ansible-navigator into it. Boom, workaround pulled in directly from Upstream, it worked right, we could proceed until the fix finds its way into an AAP2 version.
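
As a minimal sketch, the workaround boils down to two commands. The function name is hypothetical and the venv location is whatever you pass in; it assumes python3 with the venv module is installed.

```shell
#!/bin/sh
# Sketch of the workaround; the function name is an assumption and
# the venv location is chosen by the caller.

# Create an isolated Python environment in the given directory and
# install the current upstream ansible-navigator release from PyPI
# into it, using the pip that belongs to that venv.
install_navigator_venv() (
    venv_dir="$1"
    python3 -m venv "$venv_dir" &&
        "$venv_dir/bin/pip" install ansible-navigator
)
```

After install_navigator_venv "$HOME/navigator_venv", activate the environment in your shell with . "$HOME/navigator_venv/bin/activate" so that the freshly installed ansible-navigator comes first in your PATH.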

0xc: Implementation lesson #5: Accessing encrypted vault-like data for playbook use: Custom Credentials

I mentioned that we used AAP 2.x, and we also used the Automation Controller to demonstrate running our Usecase4 through it. Our use-cases included accessing container image registries that need authentication with sensitive data (our personal credentials). On the command line this is simply solved with a vault and a --vault-password-file.
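
For the command-line variant, a minimal sketch could look like this. The function names and the argument order are assumptions made for illustration; the file names in the usage line are the ones from the navigator run shown in lesson 0x9.

```shell
#!/bin/sh
# Illustrative helpers for the command-line vault workflow; function
# names and argument order are assumptions made for this sketch.

# Encrypt a variables file that holds the registry credentials,
# using the password stored in password_file.
encrypt_vars() (
    vars_file="$1"
    password_file="$2"
    ansible-vault encrypt --vault-password-file "$password_file" "$vars_file"
)

# Run a playbook that loads the encrypted variables as extra vars;
# Ansible decrypts them at runtime with the same password file.
run_with_vault() (
    playbook="$1"
    vars_file="$2"
    password_file="$3"
    ansible-playbook "$playbook" --vault-password-file "$password_file" -e "@$vars_file"
)
```

For example: encrypt_vars /home/ansible/vault-auth.yml /home/ansible/gitignored_secret once, then run_with_vault usecase3_rootcontainers_networked.yml /home/ansible/vault-auth.yml /home/ansible/gitignored_secret for every run. Keep the password file out of version control.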

In Automation Controller you’ll need Custom Credentials for that; here’s the article where Jan-Piet Mens explains this in a straightforward manner.

EOFArticle and Summary

I actually wanted to list up to 0xd, 0xe … 0x10 to fulfill the promise of the title, but ran out of lessons learned. Hope this is useful, let us (Karoly Vegh and Robert Baumgartner) know if you have thoughts and/or feedback on LinkedIn – here’s the LinkedIn post for the session: https://www.linkedin.com/posts/kvegh_teamwork-conference-community-activity-6849382145731940352-nIB2