Creating a New Python BPMN Process With Adhesive


BPMN is great. We draw what we want to execute, and the engine takes care of the parallelism. The diagram shows visually how the parts of the program are wired together. With Adhesive, we take it one step further and implement the backing process right away. So let’s start designing!

First we’ll model a simple process. We’ll just have some simple tasks wired together:

[Figure: Basic Process]

For this I’m using the amazing Yaoqiang BPMN Editor, but any BPMN modeler should be ok (simple.bpmn).

Then we’ll create a simple file (named _adhesive.py) that just tries to execute the process:

import adhesive
adhesive.bpmn_build("simple.bpmn")

Running python _adhesive.py in the shell, or just adhesive, prints the functions we still need to implement:

Missing tasks implementations. Generate with:

@adhesive.task('List files in /tmp')
def list_files_in_tmp(context):
    pass

@adhesive.task('Check if /etc/passwd is present')
def check_if_etc_passwd_is_present(context):
    pass

@adhesive.task('List files in /etc')
def list_files_in_etc(context):
    pass

To implement them, we’ll simply run ls in all cases. The execution works the same way as in Jenkins: if the shell command returns a non-zero exit code, an exception is thrown:

# ...
@adhesive.task('Check if /etc/passwd is present')
def check_if_etc_passwd_is_present(context):
    # a return code different than 0 will throw an exception
    context.workspace.run("""
        ls /etc/passwd
    """)
# ...
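
To make the "non-zero exit code throws" behavior concrete, here is a minimal standard-library sketch of the same idea. It does not use Adhesive at all: run_in_shell is a hypothetical helper, and subprocess.run with check=True only stands in for what context.workspace.run is described as doing above.

```python
import subprocess


def run_in_shell(script: str) -> None:
    """Run a script with the system shell (hypothetical helper, not Adhesive).

    check=True makes subprocess raise CalledProcessError whenever the
    script exits with a non-zero code.
    """
    subprocess.run(script, shell=True, check=True)


# A command that succeeds returns normally.
run_in_shell("exit 0")

# A command that fails raises, and the exit code is kept on the exception.
try:
    run_in_shell("exit 3")
except subprocess.CalledProcessError as error:
    print(f"command failed with exit code {error.returncode}")
```

In a task, letting that exception propagate is what marks the task as failed, so there is no need to inspect return codes by hand.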

To run it, we’ll simply call adhesive on the command line, since it implicitly picks up _adhesive.py files in the current folder:

adhesive

or just run it, since it’s just a python program:

python _adhesive.py

This outputs something similar to:

2019-10-18 05:19:33,782 INFO     Run  [root process]
2019-10-18 05:19:33,783 INFO     Run  List files in /tmp
2019-10-18 05:19:33,789 INFO     Done List files in /tmp
2019-10-18 05:19:33,791 INFO     Run  Check if /etc/passwd is present
2019-10-18 05:19:33,792 INFO     Run  List files in /etc
2019-10-18 05:19:33,800 INFO     Done Check if /etc/passwd is present
2019-10-18 05:19:33,803 INFO     Done List files in /etc
2019-10-18 05:19:33,810 INFO     Done [root process]

In the output, you’ll notice that the tasks "Check if /etc/passwd is present" and "List files in /etc" run in parallel: both start before either one finishes. Adhesive runs tasks in parallel, on different threads. (It also supports running tasks in different processes.)

This works by passing execution tokens between tasks. The token is available as the context parameter of the implementing task. We can assign data to a token so that it’s visible to the following tasks:

context.data.name = "abc"

In the following tasks, we just read the value.
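
As a rough illustration of that write-then-read flow, here is a sketch that runs without Adhesive. The two functions are hypothetical task bodies, and SimpleNamespace is only a stand-in for the real token; the way Adhesive actually stores and copies token data is not shown here.

```python
from types import SimpleNamespace


def first_task(context):
    # Assign data to the token, as in the line above.
    context.data.name = "abc"


def following_task(context):
    # A later task just reads the value back from the same token.
    return f"name is {context.data.name}"


# Stand-in for the execution token that the engine would pass along.
context = SimpleNamespace(data=SimpleNamespace())
first_task(context)
message = following_task(context)
print(message)
```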

We can also use the context to run commands against the underlying workspace, available as context.workspace, which can be the local operating system, Docker, Kubernetes pods, or remote SSH connections.

One last thing that’s still irritating is the similarity between "List files in /tmp" and "List files in /etc". The folder is really a parameter embedded in the task name. In such cases, we can extract it with a regular expression and pass it to the task implementation. We’ll rewrite the two list implementations as a single function:

@adhesive.task(re='List files in (.*)')
def list_files_in_folder(context, folder):
    context.workspace.run(f"""
        ls {folder}
    """)

With regex (re) names, the matched groups are passed as arguments to the implementing function, so all the tasks that do the same thing share a single backing implementation.
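
To see which task names the pattern covers and what ends up in folder, here is a small check using only Python’s re module. extract_parameters is a hypothetical helper, and matching the whole name with fullmatch is an assumption; Adhesive’s own matching rules may differ.

```python
import re

# The same pattern that is passed to @adhesive.task(re=...) above.
TASK_PATTERN = re.compile(r"List files in (.*)")


def extract_parameters(task_name):
    """Return the captured groups for a task name, or None if it doesn't match."""
    match = TASK_PATTERN.fullmatch(task_name)
    return match.groups() if match else None


for name in (
    "List files in /tmp",
    "List files in /etc",
    "Check if /etc/passwd is present",
):
    print(name, "->", extract_parameters(name))
```

Both "List files in ..." names yield a single captured folder, while the /etc/passwd check doesn’t match and keeps its own implementation.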

Why is designing the process with BPMN so much nicer than writing plain code? Because if we change the diagram to:

[Figure: Basic Process Changed]

The "List files in /tmp" task is executed only after both the /etc/passwd check and the /etc listing have finished, without manual synchronization, and the two earlier tasks still run fully in parallel (see "One Last BPMN Note" below).
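
For comparison, this is roughly the manual synchronization we would otherwise write by hand. It is a plain standard-library sketch, not how Adhesive is implemented: run_changed_process and the three lambdas are hypothetical stand-ins for the three tasks in the changed diagram.

```python
from concurrent.futures import ThreadPoolExecutor


def run_changed_process(check_passwd, list_etc, list_tmp):
    """Run the first two callables in parallel, then the last one."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(check_passwd), pool.submit(list_etc)]
        # Manual synchronization: block until both parallel tasks finish.
        # result() also re-raises any exception from the worker thread.
        for future in futures:
            future.result()
    # Only now is it safe to run the task that depends on both.
    list_tmp()


finished = []
run_changed_process(
    lambda: finished.append("check /etc/passwd"),
    lambda: finished.append("list /etc"),
    lambda: finished.append("list /tmp"),
)
print(finished[-1])  # list /tmp
```

Every change to the ordering means rewriting this wiring code, whereas with the diagram only the arrows move and the task implementations stay untouched.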

We can also put conditions on the connections, so that a particular task does not get executed. And we can use gateways, loops, subprocesses, etc. to structure the order in which our code is executed, without touching the implementation steps.

Resources

To run the examples, you just need adhesive installed (pip install -U adhesive).

One Last BPMN Note

Keen BPMN developers know that execution tokens aren’t simply merged: a gateway is needed for that, so strictly speaking the last task should have been executed twice, and a parallel gateway should have been inserted in front of it. Adhesive supports this behavior too; only the initial call needs to change to:

adhesive.bpmn_build("simple.bpmn", wait_tasks=False)

Adhesive supports exclusive, parallel, inclusive, and complex gateways.

The reason for the default (waiting for tokens and merging them in front of a task) is that in about 90% of cases people are trying to model workflows, not BPM processes.

On the other hand, when processing events we don’t want the tokens merged and held in front of a task; we want the task executed multiple times.