Introduction
Conversational AI, a set of technologies that enable intelligent virtual assistants (IVAs) like chatbots to interact with users, has gained a lot of popularity in recent years. The potential use cases are vast – from smart speakers, like Google Home or Amazon Alexa, to customer service tasks like answering questions, scheduling appointments or connecting customers with the correct department. As a result, the opportunities for businesses to gain value from conversational AI systems are promising. Business Insider predicts that, by the year 2024, consumers may spend $142 billion through transactions with intelligent assistants – up from $2.8 billion in 2019. [0]
From a technical perspective, conversational AI is the symbiosis of Machine Learning (ML), which is closely related to Artificial Intelligence (AI), and Natural Language Processing (NLP), the study of interacting with computers through human language. [1] To put it in layman’s terms, the goal is to be able to talk to a machine without entering a predetermined set of commands – and still get the desired results.
Introducing DUSBot
When I started working at Red Hat’s Düsseldorf office in September 2018, I was quickly inspired to start a little side project: a chatbot that would entertain my colleagues and, occasionally, do something useful. Since our main communication tool is Google Chat, I built the first version of the chatbot (called DUSBot) in Google Apps Script. For a Google Workspace customer, this approach comes with a serverless deployment model as well as a straightforward programming language, monitoring and logging features and much more.
However, interacting with DUSBot always felt a bit unnatural – he seemed limited in his abilities. For example, he would immediately print out a help dialogue whenever user input was not exactly as expected (much like a command line interface tool would). So poor DUSBot was in dire need of an upgrade.
Rasa and OpenShift Pipelines
My interest in chatbots, conversational AI and open source software quickly led me to Rasa – an open source platform for building intelligent assistants. The San Francisco and Berlin based company Rasa Technologies Inc. does a great job of developing an easy-to-use yet powerful framework for building AI assistants. On top of that, their commitment to open source as well as to maintaining and supporting a healthy community (while sustaining a business) made it an easy choice for me to check it out (to be fair, it may also be related to the many resemblances to the way Red Hat thinks about software, business and people).
Switching to a new framework meant that I could no longer rely on Google to host DUSBot. Thankfully, finding a new home for DUSBot was a quick endeavour, as I operate a private, semi-highly available Red Hat OpenShift Container Platform 4.6 cluster in a data center in Frankfurt, Germany. Since Rasa already offers support for containerized environments, the only thing left to figure out was how to go from source code all the way to production, without breaking poor DUSBot in between.
At this point, I had to shamefully admit to myself that I had never worked with OpenShift Pipelines in practice before – so I thought I’d seize the opportunity. OpenShift Pipelines is part of OpenShift and is based on the popular open source project Tekton. Tekton, which is overseen by the Continuous Delivery Foundation (CDF), makes it easy to build CI/CD systems in a Kubernetes-native way. In other words: you write a bunch of YAML and end up with a CI/CD pipeline that does not require the maintenance of a central system like Jenkins, Bamboo or similar alternatives. With the building blocks in place and the context established, let’s dive a bit deeper into the technical implementation.
DUSBot in a nutshell
At the time of writing, DUSBot can help with the following things. Due to Rasa’s architecture the list can be expanded quite a bit – the only limits are my Python skills, time and creativity.
- Get information about the current price of a number of cryptocurrencies (for expert investors, he might even give advice on whether to “hodl” or “buy the dip”).
- Recommend a restaurant near the office.
- Open the door to the parking garage below the office.
- Tell you a joke (mostly involving Chuck Norris).
- Provide you with the top 5 posts from the orange website.
As mentioned before, all of the above can be achieved by telling DUSBot what you need in your own words. No particular input is required. If DUSBot is not exactly sure what you want from him, he’ll let you know in an instant.
Going into the internals of Rasa would be out of scope of this blog post. But, at a high level, DUSBot works like so:
- The bot is taught a bunch of user inputs that it may encounter (called intents). In this context, we provide a bunch of common variations that are used to learn and properly classify user input and deduce the correct intent. For example, the intent of greeting DUSBot can be expressed in quite a few different ways (hi, hello, what’s up, hey, etc.).
- We then teach DUSBot what to do based on said inputs (called actions). Those actions can either be immediate responses like greeting the user, or more complex operations like querying a database based on user input, querying third-party APIs, etc. The latter are executed in an external Rasa component called the action server.
- Lastly, we provide a few examples of what an actual conversation might look like, so that DUSBot knows how to map intents to actions.
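To make the intent and action idea above a bit more tangible, here is a deliberately naive sketch in plain Python. It is not how Rasa works internally (Rasa trains a machine learning model, while this toy simply compares strings), and the intent names, action names and the threshold are assumptions made up for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical example phrases per intent (illustrative only).
INTENT_EXAMPLES = {
    "greet": ["hi", "hello", "what's up", "hey"],
    "tell_joke": ["tell me a joke", "make me laugh", "do you know a joke"],
}

# Hypothetical mapping from a recognised intent to the action to run.
INTENT_TO_ACTION = {
    "greet": "utter_greet",
    "tell_joke": "action_tell_joke",
}

# Hypothetical action used when no intent is recognised with enough confidence.
FALLBACK_ACTION = "utter_ask_rephrase"


def classify(text, threshold=0.6):
    """Return (intent, score) for the most similar example phrase.

    The intent is None when the best similarity score is below the threshold.
    """
    normalized = text.lower().strip()
    best_intent, best_score = None, 0.0
    for intent, examples in INTENT_EXAMPLES.items():
        for example in examples:
            score = SequenceMatcher(None, normalized, example).ratio()
            if score > best_score:
                best_intent, best_score = intent, score
    if best_score < threshold:
        return None, best_score
    return best_intent, best_score


def next_action(text):
    """Pick the action for a user message, falling back when unsure."""
    intent, _ = classify(text)
    return INTENT_TO_ACTION.get(intent, FALLBACK_ACTION)
```

Calling next_action("Hello") selects the greeting response, while an input that resembles none of the examples ends up at the fallback, which mirrors DUSBot telling you that he is not sure what you want.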
In its current state, the architecture is relatively simple. The DUSBot (rasa) and action-server containers share the same pod. This way they share, among other things, their network namespace, which enables local network connectivity.
There are a few architectural adjustments that could be implemented to make DUSBot more scalable. Those are outlined later.
Building and deploying DUSBot on OpenShift
When adjusting the source code, especially the pieces that influence DUSBot’s behaviour (as in: the Machine Learning model), it is important to make sure that the user experience does not suffer. To achieve that, a combination of automated and manual tests and quality gates is implemented – of course with the help of OpenShift Pipelines.
From a developer perspective, there are two paths that can be taken in order to work on DUSBot:
- Local development, testing and validation with Podman (no actual deployment).
- Using a GitHub-based development approach that integrates with OpenShift (deployment to production possible).
The latter of the two consists of three Tekton pipelines that are triggered at different steps of the development process and serve different purposes. Furthermore, the processes span a total of three OpenShift projects, or Kubernetes namespaces. The following is a brief overview of each individual pipeline.
Open a pull request against the ‘test’ branch
Whenever the code that is running either in the testing or production environment needs to be adjusted, one has to open a GitHub pull request for review, aimed at the default branch, called ‘test’. As soon as this happens, a Pipeline is triggered that aims to validate the changes.
To be more specific, after downloading the codebase, the pipeline pulls the latest trained machine learning model from an S3 bucket (hosted by Backblaze). The reason is that in some cases (e.g. when only the documentation or other unrelated files are changed), we don’t want to execute the relatively expensive operation of training and testing the model (Rasa is smart enough to figure out if training is necessary or not based on the code and the trained model). Should training be required, the pipeline does this, followed by a bunch of machine learning related tests:
- Check for inconsistencies or mistakes in the training data.
- Cross-validation of the data.
- Test the assistant against the defined stories and test the model’s accuracy in predicting the user’s intent and corresponding action.
On that note, you might have noticed that the steps ‘pull from s3’ and ‘validate training data’ run in parallel before ever attempting to train the model. The reason is that training will fail (after using quite a few resources) when the training data is inconsistent, incorrect, etc.
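The ordering described above (pull and validate in parallel, fail fast, train only when needed) can be sketched in plain Python. This is a simplified model of the control flow, not the actual Tekton pipeline: the functions pull_model and validate_training_data are stand-ins for the real steps, and the SHA-256 fingerprint is my assumption of how a "has anything relevant changed?" check could look, not Rasa's own mechanism.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def fingerprint(files):
    """Hash file names and contents in a stable order."""
    digest = hashlib.sha256()
    for name in sorted(files):
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(files[name].encode("utf-8"))
        digest.update(b"\0")
    return digest.hexdigest()


def pull_model(store):
    """Stand-in for downloading the latest model from the S3 bucket."""
    return store.get("latest")


def validate_training_data(files):
    """Stand-in for the data validation step; returns a list of problems."""
    return [
        f"{name} is empty"
        for name, content in sorted(files.items())
        if not content.strip()
    ]


def run_validation_stage(files, store, train):
    """Return (model, trained) where trained says whether training ran."""
    # The two cheap steps do not depend on each other, so run them together.
    with ThreadPoolExecutor(max_workers=2) as pool:
        model_future = pool.submit(pull_model, store)
        problems_future = pool.submit(validate_training_data, files)
        model = model_future.result()
        problems = problems_future.result()

    # Fail before the expensive training step ever starts.
    if problems:
        raise ValueError("invalid training data: " + "; ".join(problems))

    # Reuse the pulled model when the training-relevant files are unchanged.
    current = fingerprint(files)
    if model is not None and model["fingerprint"] == current:
        return model, False
    return train(files, current), True
```

Here, files is expected to contain only the inputs that influence the model, so a change to the documentation alone leaves the fingerprint untouched and training is skipped.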
Before pushing the (successfully tested) model back to S3, a set of Python unit tests is executed in order to test the custom actions. The last step is to update the pull request with a short comment.
On that note, did you know that you can integrate OpenShift Pipelines with GitHub? In this example, the repository is set up in a way that a given Pull Request can only be merged after all checks (the pipeline) have successfully completed. This is done with a little tool called commit status tracker.
Merge pull request into ‘test’ branch
Given that the proposed changes look good from an automated testing perspective, the next step is to perform a bunch of real-world tests (as in: talk to DUSBot). For this, we trigger a second Tekton Pipeline by simply merging the pull request.
The main purpose of this Pipeline is to build two container images: one for the DUSBot (Rasa) itself and one for the action server. Both images are tagged with the latest Git commit hash to be able to quickly check what code is inside the image. In addition to that, we point the tag called ‘test’ to that very same commit hash. Since this alone will not trigger a deployment of the latest version to OpenShift, we use a little trick to do that (a combination of a specific ImagePullPolicy and the manipulation of a Kubernetes annotation). The Pipeline will conduct a rolling update of DUSBot (in the ‘test’ project) and then exit.
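For readers wondering what such a trick might look like, here is a minimal sketch that builds a patch for a Kubernetes Deployment. It rests on two assumptions that are not taken from the DUSBot repository: that the pull policy in question is Always, and that the annotation is a timestamp on the pod template. The annotation key is hypothetical. The underlying idea is that any change to the pod template causes a new rollout, and with the pull policy set to Always the freshly started pods fetch whatever the ‘test’ tag currently points to.

```python
import json
from datetime import datetime, timezone

# Hypothetical annotation key, chosen for illustration only.
ROLLOUT_ANNOTATION = "example.com/rolled-at"


def build_rollout_patch(container_names, now=None):
    """Build a strategic merge patch that forces a new rollout.

    Containers are matched by name, so only the pull policy of the
    listed containers is touched.
    """
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    return {
        "spec": {
            "template": {
                # A changed annotation value alters the pod template.
                "metadata": {"annotations": {ROLLOUT_ANNOTATION: timestamp}},
                "spec": {
                    "containers": [
                        {"name": name, "imagePullPolicy": "Always"}
                        for name in container_names
                    ]
                },
            }
        }
    }


if __name__ == "__main__":
    # Print the patch as JSON so it can be handed to a patch command.
    print(json.dumps(build_rollout_patch(["rasa", "action-server"]), indent=2))
```

The resulting JSON could then be applied to the Deployment with a patch command from within a pipeline task.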
It is now appropriate to have a little conversation with DUSBot’s brother, DUSBot-dev, in order to double check for any unforeseen errors or unexpected behaviour. Of course, rolling back to a previous version is trivial.
Merge pull request into ‘main’ branch
Assuming one is happy with the test results, triggering a production deployment would be the next logical step. For this, another GitHub pull request, this time from the ‘test’ to the ‘main’ branch, is necessary.
Note that this will not trigger another set of tests or container image builds. We use literally the same artefacts (container images) that are currently deployed in the test namespace and roll them out to production in two easy steps.
Instead of rebuilding anything, the Pipeline simply adjusts the container image tags of the DUSBot (Rasa) and action-server containers so that they point to the version that is currently deployed in the test environment. As in the previous pipeline, the last step is to trigger a rolling update so that the latest version is usable in production.
Going forward
While working on the project, I noticed a bunch of shortcomings and potential features that I’d like to look into going forward. Those include:
- The so-called Lock Store is currently configured as InMemoryLockStore, which makes it impossible to horizontally scale the bot (interactions between the user and the bot would be round-robin, which would make it hard for the bot to remember the context and previous answers). The goal would be to, instead, use Redis running on OpenShift.
- Similarly, the Tracker Store is currently InMemoryTrackerStore. This could be replaced with a PostgreSQL Database.
- The TensorFlow base container image (which Rasa is based on) is quite big (1.5 GB). In addition to that, the rasa server takes about 60 seconds to start. Apart from that, it would be a great candidate for a Knative / Serverless application.
- DUSBot could probably very easily be managed by a Kubernetes Operator.
- Add more complex use cases, e.g. make the restaurant search a two-way dialogue that takes the user’s cuisine preferences into account.
- Make the Rasa test results (confusion matrix, etc.) available in the GitHub pull request.
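The first two items in the list above come down to the same problem: state that lives inside a single process. The following toy simulation shows why that hurts once requests are distributed round-robin. It does not use Rasa at all. The Replica class is made up for illustration, and a plain shared dictionary stands in for an external store such as Redis or PostgreSQL.

```python
from itertools import cycle


class Replica:
    """A bot replica that keeps conversation history in a store."""

    def __init__(self, name, store=None):
        self.name = name
        # Without a shared store, each replica gets its own private dict.
        self.store = store if store is not None else {}

    def handle(self, user, message):
        """Record the message and return how many messages are known."""
        history = self.store.setdefault(user, [])
        history.append(message)
        return len(history)


def simulate(replicas, messages, user="alice"):
    """Send messages round-robin; return the history size seen per message."""
    seen = []
    for replica, message in zip(cycle(replicas), messages):
        seen.append(replica.handle(user, message))
    return seen


if __name__ == "__main__":
    messages = ["hi", "find a restaurant", "italian", "thanks"]

    # Private memory per replica: each one only sees half the conversation.
    print(simulate([Replica("a"), Replica("b")], messages))

    # Shared store: every replica sees the full conversation so far.
    shared = {}
    print(simulate([Replica("a", shared), Replica("b", shared)], messages))
```

With private memory, the replica that receives "italian" never saw "find a restaurant", which is exactly the kind of lost context that an external store is meant to prevent.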
With that said, feel free to check out the source code on GitHub to see the nitty-gritty details of this particular implementation. Stay safe and thank you for reading.
Conclusion
This blog post outlined that it is possible to enable powerful CI/CD use cases using OpenShift Pipelines with very little effort or domain-specific knowledge. Furthermore, the demonstrated workflow showcased that container-based development is a powerful tool to ensure software quality, reduce errors and increase developer productivity. Lastly, a big shout-out to the Rasa team: you can be fairly certain that you are on the right track when someone without any AI or ML experience (like myself) manages to get started with your project in a short amount of time.