How to create LXD Containers with Ansible 2.2

(A working example from this post can be found on my GitHub page.)

I have been working with Ansible for a couple of years now and use LXD as my local test environment, so I had been waiting for a simple way to create LXD containers (locally and remotely) with Ansible from scratch, without using helper tasks that just shell out to the lxc client.

So, since Ansible 2.2 we have native LXD support.
Furthermore, the Ansible team also showed some respect to the Python 3 community and implemented Python 3 support.

Preparations

First of all, you need the latest Ansible release, or you can install it in a Python 3 virtual environment via pip install ansible.
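
If you go the virtual environment route, a minimal setup could look like this (just a sketch; the venv path is an example, not from the original post):

user@home: ~> python3 -m venv ~/venvs/ansible
user@home: ~> source ~/venvs/ansible/bin/activate
(ansible) user@home: ~> pip install ansible
(ansible) user@home: ~> ansible --version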

Create your Ansible directory layout

To make your life a little bit easier later, create your Ansible directory structure and turn it into a Git repository.

user@home: ~> mkdir -p ~/Projects/git.ansible/lxd-containers  
user@home: ~> cd ~/Projects/git.ansible/lxd-containers  
user@home: ~/Projects/git.ansible/lxd-containers> mkdir -p {inventory,roles,playbooks}

Create your inventory file

Imagine you want to create 5 new LXD containers. You could create 5 playbooks to do it, or you can be smart and let Ansible do it for you.
Working with inventory files is easy; an inventory is simply a file with an INI structure.

Let's create an inventory file for new LXD containers in ~/Projects/git.ansible/lxd-containers/inventory/containers:

[local]
localhost

[containers]
blog-01 ansible_connection=lxd  
blog-02 ansible_connection=lxd  
blog-03 ansible_connection=lxd  
blog-04 ansible_connection=lxd  
blog-05 ansible_connection=lxd  

We have now defined 5 containers.
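
To double-check which hosts Ansible resolves from that group, you can list them without connecting to anything (a quick sanity check; the path matches the layout above):

user@home: ~/Projects/git.ansible/lxd-containers> ansible -i inventory/containers containers --list-hosts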

Create a playbook for running Ansible

Now we need an Ansible playbook.

A playbook is just a simple YAML file. You can edit this file with your editor of choice. I personally like Sublime Text 3 or GitHub's Atom, but any other editor (like Vim or Emacs) will do.

Create a new file under ~/Projects/git.ansible/lxd-containers/playbooks/lxd_create_containers.yml:

- hosts: localhost
  connection: local
  roles:
    - create_lxd_containers

Let's go through this briefly:

  • hosts: ...: defines the hosts to run Ansible on. Used like this, the playbook runs on your local machine.
  • connection: local: Ansible will use a local connection instead of SSHing into your local box.
  • roles: ...: a list of Ansible roles to be used by this playbook.

You could also write all the tasks directly in this playbook, but since you will want to reuse several tasks for certain workloads, it's a better idea to split them into roles.

Create the Ansible role

Ansible roles are used to separate repeating tasks from playbooks.

Think about this example: You have a playbook for all your webservers like this:

- hosts: webservers
  tasks:
    - name: apt update
      apt: update_cache=yes

and you have a playbook for all your database servers like this:

- hosts: databases
  tasks:
    - name: apt update
      apt: update_cache=yes

What do you see? Yes, the same task twice, namely "apt update".

To make our lives easier, instead of writing a task in every playbook to update the system's package cache, we create an Ansible role (see the sketch below).

Ansible roles have a special directory structure; I advise reading the good documentation over at the Ansible HQ.
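
For the "apt update" example above, the shared task could live in a role (a sketch; the role name apt_update is hypothetical and not part of what we build below):

# roles/apt_update/tasks/main.yml
- name: apt update
  apt: update_cache=yes

Both playbooks then shrink to a role reference:

- hosts: webservers
  roles:
    - apt_update

- hosts: databases
  roles:
    - apt_update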

Let's start with our role for creating LXD containers:

Create the directory structure

user@home: ~> cd ~/Projects/git.ansible/lxd-containers/roles/  
user@home: ~/Projects/git.ansible/lxd-containers/roles/> mkdir -p create_lxd_containers/tasks  

Now create a new YAML file and name it ~/Projects/git.ansible/lxd-containers/roles/create_lxd_containers/tasks/main.yml with this content:

- name: Create LXD Container
  connection: local
  become: false
  lxd_container:
    name: "{{item}}"
    state: started
    source:
      type: image
      mode: pull
      server: https://cloud-images.ubuntu.com/releases
      protocol: simplestreams
      alias: 16.04/amd64
    profiles: ['default']
    wait_for_ipv4_addresses: true
    timeout: 600
  with_items:
    - "{{groups['containers']}}"

- name: Check if Python2 is installed in container
  delegate_to: "{{item}}"
  raw: dpkg -s python
  register: python_check_is_installed
  failed_when: python_check_is_installed.rc not in [0,1]
  changed_when: false
  with_items:
    - "{{groups['containers']}}"

- name: Install Python2 in container
  delegate_to: "{{item.item}}"
  raw: apt-get update && apt-get install -y python
  when: "{{item.rc == 1}}"
  with_items:
    - "{{python_check_is_installed.results}}"

Let's go through the different tasks:

Create the LXD Container

- name: Create LXD Container
  connection: local
  become: false
  lxd_container:
    name: "{{item}}"
    state: started
    source:
      type: image
      mode: pull
      server: https://cloud-images.ubuntu.com/releases
      protocol: simplestreams
      alias: 16.04/amd64
    profiles: ['default']
    wait_for_ipv4_addresses: true
    timeout: 600
  with_items:
    - "{{groups['containers']}}"
  • connection: local: means this task runs on your local machine only.
  • become: false: don't use su or sudo to become a superuser.
  • lxd_container: ...: this is the Ansible LXD module. Read the documentation about this module here: Ansible LXD Documentation.
  • with_items: ...: this is one of the many Ansible loop statements. In this case, we are looping over the inventory group 'containers' (which we defined in the inventory file earlier).

The "{{item}}" will be filled in by the with_items: ... loop; again, a hint to read Ansible's good documentation about loops.

Check if Python2 is installed inside the container

- name: Check if Python2 is installed in container
  delegate_to: "{{item}}"
  raw: dpkg -s python
  register: python_check_is_installed
  failed_when: python_check_is_installed.rc not in [0,1]
  changed_when: false
  with_items:
    - "{{groups['containers']}}"
  • delegate_to: ...: this key tells Ansible not to use the default connection anymore, but to delegate the connection and the work to the host named in delegate_to.
  • raw: ...: this key tells Ansible to use the raw module. Raw means nothing needs to be running on the target, not even Python, which Ansible normally requires. It simply uses an SSH connection (by default), or in our case the local LXD connection (like lxc exec <container-name> -- <command>). Here we execute dpkg -s python because we want to find out if Python2 is installed.
  • register: ...: during execution of the raw: ... command, Ansible catches the output (stdout, stderr) and the return code. register: ... defines a 'variable' to store this result. Normally this 'variable' is a Python/JSON dictionary for a particular host, but as we are iterating over the 'containers' inventory group, this 'variable' gets a results array (which we will use in the next task) where Ansible stores the output for every host. During this task's execution, though, the 'variable' is still usable as a single result set.
  • failed_when: ...: this marks the task as failed only if the return code is neither 0 nor 1, i.e. the command returned neither success nor the expected "not installed" failure, but something else. (More documentation can be found here.)
  • changed_when: false: a raw command would normally always be reported as a change (status 'changed'). To prevent this, we set changed_when to false. (More documentation can be found here.)
  • with_items: ...: this is one of the many Ansible loop statements. In this case, we are looping over the inventory group 'containers' (which we defined in the inventory file earlier).

The "{{item}}" will be filled in by the with_items: ... loop; again, a hint to read Ansible's good documentation about loops.

Install Python2 if it is not installed in the container

- name: Install Python2 in container
  delegate_to: "{{item.item}}"
  raw: apt-get update && apt-get install -y python
  when: "{{item.rc == 1}}"
  with_items:
    - "{{python_check_is_installed.results}}"
  • delegate_to: ...: this key tells Ansible not to use the default connection anymore, but to delegate the connection and the work to the host named in delegate_to.
  • raw: ...: this key tells Ansible to use the raw module again; here we execute apt-get update && apt-get install -y python to install Python2.
  • when: ...: this is a conditional. It says that this task only runs when the condition is met, in this case when the return code of the previous check equals 1, which is true when the check reported that Python2 was not installed.
  • with_items: ...: this is one of the many Ansible loop statements. In this case, we are looping over the result sets of the Python2 check, collected in the 'variable' python_check_is_installed.

The "{{item}}" will be filled in by the with_items: ... loop; again, a hint to read Ansible's good documentation about loops. Each item here is a single result from python_check_is_installed.results, which is why the delegation uses item.item (the original hostname) and the condition uses item.rc.

Some more information

In the playbook and in the first task (create LXD containers) we used a local connection, which simply means Ansible works on your local workstation.
Inside the inventory INI file there is this key/value pair: ansible_connection=lxd.

For the two other tasks, which are delegated to the newly created containers, Ansible would normally attempt an SSH connection (that's what you would get if you removed ansible_connection=lxd). With this setting in the inventory INI file it won't try to SSH into the containers, but will use the local LXD connection instead.

Bringing it all together

Let's start Ansible to do the work we want it to do:

~/Projects/git.ansible/lxd-containers > ansible-playbook -i inventory/containers playbooks/lxd_create_containers.yml

PLAY [localhost] ***************************************************************

TASK [setup] *******************************************************************  
ok: [localhost]

TASK [create_lxd_containers : Create LXD Container] ****************************  
changed: [localhost] => (item=blog-01)  
changed: [localhost] => (item=blog-02)  
changed: [localhost] => (item=blog-03)  
changed: [localhost] => (item=blog-04)  
changed: [localhost] => (item=blog-05)

TASK [create_lxd_containers : Check if Python2 is installed in container] ******  
ok: [localhost -> blog-01] => (item=blog-01)  
ok: [localhost -> blog-02] => (item=blog-02)  
ok: [localhost -> blog-03] => (item=blog-03)  
ok: [localhost -> blog-04] => (item=blog-04)  
ok: [localhost -> blog-05] => (item=blog-05)

TASK [create_lxd_containers : Install Python2 in container] ********************  
changed: [localhost -> blog-01] => (item={'changed': False, 'stdout': u'', '_ansible_no_log': False, '_ansible_delegated_vars': {'ansible_host': u'blog-01'}, '_ansible_item_result': True, 'failed': False, 'item': u'blog-01', 'rc': 1, 'invocation': {'module_name': u'raw', 'module_args': {u'_raw_params': u'dpkg -s python'}}, 'stdout_lines': [], 'failed_when_result': False, 'stderr': u"dpkg-query: package 'python' is not installed and no information is available\nUse dpkg --info (= dpkg-deb --info) to examine archive files,\nand dpkg --contents (= dpkg-deb --contents) to list their contents.\n"})  
changed: [localhost -> blog-02] => (item={'changed': False, 'stdout': u'', '_ansible_no_log': False, '_ansible_delegated_vars': {'ansible_host': u'blog-02'}, '_ansible_item_result': True, 'failed': False, 'item': u'blog-02', 'rc': 1, 'invocation': {'module_name': u'raw', 'module_args': {u'_raw_params': u'dpkg -s python'}}, 'stdout_lines': [], 'failed_when_result': False, 'stderr': u"dpkg-query: package 'python' is not installed and no information is available\nUse dpkg --info (= dpkg-deb --info) to examine archive files,\nand dpkg --contents (= dpkg-deb --contents) to list their contents.\n"})  
changed: [localhost -> blog-03] => (item={'changed': False, 'stdout': u'', '_ansible_no_log': False, '_ansible_delegated_vars': {'ansible_host': u'blog-03'}, '_ansible_item_result': True, 'failed': False, 'item': u'blog-03', 'rc': 1, 'invocation': {'module_name': u'raw', 'module_args': {u'_raw_params': u'dpkg -s python'}}, 'stdout_lines': [], 'failed_when_result': False, 'stderr': u"dpkg-query: package 'python' is not installed and no information is available\nUse dpkg --info (= dpkg-deb --info) to examine archive files,\nand dpkg --contents (= dpkg-deb --contents) to list their contents.\n"})  
changed: [localhost -> blog-04] => (item={'changed': False, 'stdout': u'', '_ansible_no_log': False, '_ansible_delegated_vars': {'ansible_host': u'blog-04'}, '_ansible_item_result': True, 'failed': False, 'item': u'blog-04', 'rc': 1, 'invocation': {'module_name': u'raw', 'module_args': {u'_raw_params': u'dpkg -s python'}}, 'stdout_lines': [], 'failed_when_result': False, 'stderr': u"dpkg-query: package 'python' is not installed and no information is available\nUse dpkg --info (= dpkg-deb --info) to examine archive files,\nand dpkg --contents (= dpkg-deb --contents) to list their contents.\n"})  
changed: [localhost -> blog-05] => (item={'changed': False, 'stdout': u'', '_ansible_no_log': False, '_ansible_delegated_vars': {'ansible_host': u'blog-05'}, '_ansible_item_result': True, 'failed': False, 'item': u'blog-05', 'rc': 1, 'invocation': {'module_name': u'raw', 'module_args': {u'_raw_params': u'dpkg -s python'}}, 'stdout_lines': [], 'failed_when_result': False, 'stderr': u"dpkg-query: package 'python' is not installed and no information is available\nUse dpkg --info (= dpkg-deb --info) to examine archive files,\nand dpkg --contents (= dpkg-deb --contents) to list their contents.\n"})

PLAY RECAP *********************************************************************  
localhost                  : ok=4    changed=2    unreachable=0    failed=0   

~/Projects/git.ansible/lxd-containers > lxc list
+---------+---------+-----------------------+------+------------+-----------+
|  NAME   |  STATE  |         IPV4          | IPV6 |    TYPE    | SNAPSHOTS |
+---------+---------+-----------------------+------+------------+-----------+
| blog-01 | RUNNING | 10.139.197.44 (eth0)  |      | PERSISTENT | 0         |
+---------+---------+-----------------------+------+------------+-----------+
| blog-02 | RUNNING | 10.139.197.10 (eth0)  |      | PERSISTENT | 0         |
+---------+---------+-----------------------+------+------------+-----------+
| blog-03 | RUNNING | 10.139.197.188 (eth0) |      | PERSISTENT | 0         |
+---------+---------+-----------------------+------+------------+-----------+
| blog-04 | RUNNING | 10.139.197.221 (eth0) |      | PERSISTENT | 0         |
+---------+---------+-----------------------+------+------------+-----------+
| blog-05 | RUNNING | 10.139.197.237 (eth0) |      | PERSISTENT | 0         |
+---------+---------+-----------------------+------+------------+-----------+

Awesome, 5 containers created and Python2 installed.
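
Because the inventory sets ansible_connection=lxd, you can also reach the running containers with ad-hoc commands over the LXD connection; for example (a quick sanity check, not part of the original run):

~/Projects/git.ansible/lxd-containers > ansible -i inventory/containers containers -m raw -a "hostname"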

Now it's time to do the real work (like installing your apps and testing them).
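
As a starting point, a follow-up playbook could target the containers group directly (a minimal sketch; nginx is only a stand-in for whatever you actually want to install):

# ~/Projects/git.ansible/lxd-containers/playbooks/lxd_install_app.yml
- hosts: containers
  become: false
  tasks:
    - name: install nginx
      apt: name=nginx state=present update_cache=yes

Run it with the same inventory:

~/Projects/git.ansible/lxd-containers > ansible-playbook -i inventory/containers playbooks/lxd_install_app.yml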

New Blog

Welcome to my new Blog :)

It has been a long time since I wrote an article, because I was too busy with work and with my private life.

But there is so much to write about: what I did in the past, what I will do in the future, and whatever else is important.

The old blog articles will go into this new blog as well, but there is no direct way to import them, so I will have to do that manually when time permits :)

10 Years of Ubuntu

OK, admittedly I am 2 months early, but I was appointed an Ubuntu Member on 2005-06-15...
and I started with Ubuntu packaging even earlier...

Anyhow, I already wrote my praise on Google+.

So just to make this public:

Thanks for 15.04 and all the other releases before (especially the LTS ones).

I think that during the last 10 years, Ubuntu has made a difference to the Linux community.

When I joined this journey, Ubuntu was just another distribution, with a SABDFL who was pumping a lot of
money into his free project. I guess it was his private money, and the whole Linux community should be
thankful to this geek.

Without Mark's engagement, I don't think Linux on the desktop would be as well known
to the wider public.

Don't get me wrong, we had SuSE, we had Red Hat, we had Debian (and other smaller distros), but most of
today's global players were famous for their involvement on the server side. (Well, not SuSE, because they were focused on the desktop
before they lost track and took a wrong turn [and no, I am not talking about openSUSE, that is a different story].)

10 years ago, actually 10 years and a couple of months, a small group of people were working on an integrated desktop environment
based on GNOME. And they were right to do so. Those people, many of whom are still doing their job at Canonical, were right to
invest their time in that.

And look where we are today! On the desktop, on the server, in the middle of the cloud and on a freaking phone!

Who would have thought of this 10 and a half years ago?

Yeah, I know, there were some decisions which were not so OK for the community, but honestly, even those wrong decisions were
needed. Without wrong decisions we don't learn. Errors are there to be learned from, even in a social environment.

To make my point: I think it's important to have one public figure to bring a project like Ubuntu forward. One person towards whom
all fame and hate is directed, and Mark is certainly one of those figures.

Just look at other huge open source projects, like OpenStack or Hadoop. Great projects, I give them that, but there is no single
person who drives them. No person making the decisions about where the project has to go. That's why OpenStack as a stock open source project
is not a product. Hadoop, with all its fame, is not a product out of the box.

Too many companies have a say. That's why, for example, it's far from practical to install OpenStack from source and end up with a running cloud system.
This is wrong, and those communities need someone who wears the hat and says where the community is moving.

Democracy is good, I know, but in some environments democracy blocks innovation. Too many people, too many voices, too many wrong directions.
Just look at the quality of the Ubuntu desktop pre-installed on Dell workstations or laptops. That's how you do it: you concentrate on quality, and
you get vendors who will ship your PRODUCT!

Let's see:

  • We have Ubuntu as a desktop OS (with Unity as the desktop).
  • We have Ubuntu as a server OS, running on uncounted bare metal machines.
  • We have Ubuntu as a cloud OS, running on many, many Amazon instances, Docker instances and probably Rackspace instances too.

But Ubuntu is more. The foundation of Ubuntu drives many other projects, like:

  • Kubuntu (aka the KDE Distro of Choice)
  • Ubuntu GNOME Remix
  • Ubuntu with XFCE, etc.
  • Mint Linux
  • Goobuntu
  • etc.

All those derivatives are built on the Ubuntu foundation, made, integrated and plumbed by so many smart and awesome people.

Thanks to all of You!

So what now?

Mobile is growing. Mobile first. Mobile is the way to go!

Ubuntu on the Phone is not an idea anymore, it's reality. Well done people. You made it!

But Ubuntu can even do more. Let's think about the next hype.

Hype like CoreOS.

A Linux OS which is image based, with no package management, just driven by some small utilities like systemd, fleetd and/or etcd.

CoreOS is one of the projects I am really looking forward to using. But I really want to see Ubuntu there.

And yes, there is Ubuntu Snappy... so why not try to use Snappy as a CoreOS replacement?

There is Docker. Docker is being used as the dev utility for spinning up instances with specialised software on them.

Hell, Stephane Graber and his friends over at the Linux container community have LXD!
LXD is driven by Stephane and his friends. Stephane works for Canonical. So, I say: LXD is a Canonical project!

And what is Canonical? Canonical is a major contributor to Ubuntu. I want to see LXD as the Docker replacement, with more
security, with more energy, with better integration into cloud systems like OpenStack and/or CloudStack!

To make a long story short, Ubuntu is one of those projects which are not going away.

Even if Mark (hopefully not) retires, Canonical will remain the driving force. There will be another Mark, and that's
why Ubuntu is one of the driving forces in open source development. Forget about contributor licenses, forget about
all the decisions which were wrongly made.

We are here! We don't go away! We are Ubuntu, Linux for Human Beings! And we are here to stay, whatever you say!
We are better, we are stronger, we are The Borg! ^W ^W ^W ^W forget this, this is a different movie ;)

And if you ask: "Dude, you are saying all this, and you were a member of this Project, where is your CONTRIBUTION!?!?"

My Answer is:

"I bring Ubuntu to the Business! I installed Ubuntu as Server OS in many Companies during the last couple of years.
I integrated Ubuntu as SupportOS in companies where you don't expect it would run and support Operations or Service Reliability Departments.
I am the Ubuntu Integrator and Evangelist you won't see, hear or read (normally). I am the one of the Ubuntu Apostles, who are not bragging,
but bringing the Light to the Darkness"

;-)

PS: companies like Netviewer AG, Podio (both now belong to Citrix Inc.) and Sony/Gaikai for their PlayStation Now product.

Python and JavaScript?

Is it possible to combine the world's most amazing prototyping language (aka Python) with JavaScript?

Yes, it is. Welcome to PyV8!


Prerequisites

So, first we need some libraries and modules:

  1. Boost with Python Support

    • On Ubuntu/Debian you just do apt-get install libboost-python-dev, for Fedora/RHEL use your package manager.
    • On MAC OSX:

      • When you are on Homebrew do this:

      brew install boost --with-python

  2. PyV8 Module

    (You need Subversion installed for this)

    mkdir pyv8
    cd pyv8
    svn co http://pyv8.googlecode.com/svn/trunk/
    cd trunk
    

    When you are on Mac OS X you need to add this first:

    export CXXFLAGS='-std=c++11 -stdlib=libc++ -mmacosx-version-min=10.8'
    export LDFLAGS=-lc++
    

    Now just do this:

    python ./setup.py install

    And wait!

    (Some words of advice: when you are installing Boost from your OS, make sure you are using the Python version which Boost was compiled with.)

  3. Luck ;)

    Meaning: if this doesn't work, you have to ask Google.
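
To check whether the build actually worked, a quick smoke test in a Python shell might look like this (a minimal sketch using the same PyV8 API as in the example further down):

import PyV8

ctx = PyV8.JSContext()    # a plain context without a custom global object
ctx.enter()
print(ctx.eval("6 * 7"))  # should print 42 if PyV8 works
ctx.leave()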

Now, how does it work?

Easy, easy, my friend.

The question is, why should we use JavaScript inside a Python tool?

Well, while doing some crazy stuff with our ElasticSearch cluster, I wrote a small Python script to do some nifty parsing and correlation. After not even 30 minutes I had a command-line tool which read in a YAML file with ES queries written in YAML format, and an automated way to query more than one ES cluster.

So, let's say you have a YAML like this:

title:  
  name: "Example YAML Query File"
esq:  
  hosts:
    es_cluster_1:
      fqdn: "localhost"
      port: 9200
    es_cluster_2:
      fqdn: "localhost"
      port: 10200
    es_cluster_3:
      fqdn: "localhost"
      port: 11200
indices:  
  - index:
      id: "all"
      name: "_all"
      all: true
  - index:
      id: "events_for_three_days"
      name: "[events-]YYYY-MM-DD"
      type: "failover"
      days_from_today: 3
  - index:
      id: "events_from_to"
      name: "[events-]YYYY-MM-DD"
      type: "failover"
      interval:
        from: "2014-08-01"
        to: "2014-08-04"
query:  
  on_index:
    all:
      filtered:
        filter:
          term:
            code: "INFO"
    events_for_three_days:
      filtered:
        filter:
          term:
            code: "ERROR"
    events_from_to:
      filtered:
        filter:
          term:
            code: "DEBUG"

No, this is not really what we are doing :) But I think you get the idea.

Now, in this example, we have 3 different ElasticSearch clusters to search in, and all three have different data, but all share the same event format.
So, my idea was to generate reports of the requested data, possibly for a single ES cluster, or correlated over all three.
I wanted to have the functionality inside the YAML file, so everybody who is writing such a YAML file can also add some processing code.
Well, the result set of an ES search query is a JSON blob, and thanks to elasticsearch.py it will be converted to a Python dictionary.

Huh... so, why don't you use Python code inside the YAML and eval it inside your Python script?

Well, if you have ever written front-/backend web apps, you know it's pretty difficult to write frontend Python scripts that run inside your browser. So, JavaScript to the rescue.
And everybody knows how easy it is to deal with JSON object structures inside JavaScript. So, why don't we use this knowledge and invite users who are not familiar with Python to participate?

Now, think about an idea like this:

title:  
  name: "Example YAML Query File"
esq:  
  hosts:
    es_cluster_1:
      fqdn: "localhost"
      port: 9200
    es_cluster_2:
      fqdn: "localhost"
      port: 10200
    es_cluster_3:
      fqdn: "localhost"
      port: 11200
indices:  
  - index:
      id: "all"
      name: "_all"
      all: true
  - index:
      id: "events_for_three_days"
      name: "[events-]YYYY-MM-DD"
      type: "failover"
      days_from_today: 3
  - index:
      id: "events_from_to"
      name: "[events-]YYYY-MM-DD"
      type: "failover"
      interval:
        from: "2014-08-01"
        to: "2014-08-04"
query:  
  on_index:
    all:
      filtered:
        filter:
          term:
            code: "INFO"
    events_for_three_days:
      filtered:
        filter:
          term:
            code: "ERROR"
    events_from_to:
      filtered:
        filter:
          term:
            code: "DEBUG"
processing:  
    for:
        report1: |
            function find_in_collection(collection, search_entry) {
                for (entry in collection) {
                    if (search_entry[entry]['msg'] == collection[entry]['msg']) {
                        return collection[entry];
                    }
                }
                return null;
            } 
            function correlate_cluster_1_and_cluster_2(collections) {
                collection_cluster_1 = collections["cluster_1"]["hits"]["hits"];
                collection_cluster_2 = collections["cluster_2"]["hits"]["hits"];
                similar_entries = [];
                for (entry in collection_cluster_1) {
                    similar_entry = null;
                    similar_entry = find_in_collection(collection_cluster_2, collection_cluster_1[entry]);
                    if (similar_entry != null) {
                        similar_entries.push(similar_entry);
                    }
                }
                result = {'similar_entries': similar_entries};
                return(result)
            }
            var result = correlate_cluster_1_and_cluster_2(collections);
            // this will return the data to the python method result 
            result
output:  
    reports:
        report1: |
            {% for similar_entry in similar_entries %}
            {{ similar_entry.msg }}
            {% endfor %}

(This is not my actual code, I just scribbled it down, so don't lynch me if this fails)

So, actually, I am passing a Python dict with all the query result sets from the ES clusters (defined at the top of the YAML file) to a PyV8 context object, so I can access those collections inside my JavaScript and return a JavaScript hash/object.
In the end, after the JavaScript processing, there could be a Jinja template inside the YAML file, and we can pass the JavaScript results into this template to print a nice report.
There are many things you can do with this.

So, let's see it in python code:

# -*- coding: utf-8 -*-
# This will be a short form of this,
# so don't expect that this code will do the reading and validation
# of the YAML file

from elasticsearch import Elasticsearch  
import PyV8  
from jinja2 import Template

class JSCollections(PyV8.JSClass):  
    def __init__(self, *args, **kwargs):
        super(JSCollections, self).__init__()
        self.collections = {}
        if 'collections' in kwargs:
            self.collections=kwargs['collections']

    def write(self, val):
        print(val)

if __name__ == '__main__':  
    es_cluster_1 = Elasticsearch([{"host": "localhost", "port": 9200}])
    es_cluster_2 = Elasticsearch([{"host": "localhost", "port": 10200}])
    collections = {}
    collections['cluster_1'] = es_cluster_1.search(index="_all", body={"query": {"filtered": {"filter": {"term": {"code": "DEBUG"}}}}}, size=100)
    collections['cluster_2'] = es_cluster_2.search(index="_all", body={"query": {"filtered": {"filter": {"term": {"code": "DEBUG"}}}}}, size=100)
    js_ctx = PyV8.JSContext(JSCollections(collections=collections))
    js_ctx.enter()
    #
    # here comes the javascript code
    #
    process_result = js_ctx.eval("""
            function find_in_collection(collection, search_entry) {
                for (entry in collection) {
                    if (search_entry[entry]['msg'] == collection[entry]['msg']) {
                        return collection[entry];
                    }
                }
                return null;
            } 
            function correlate_cluster_1_and_cluster_2(collections) {
                collection_cluster_1 = collections["cluster_1"]["hits"]["hits"];
                collection_cluster_2 = collections["cluster_2"]["hits"]["hits"];
                similar_entries = [];
                for (entry in collection_cluster_1) {
                    similar_entry = null;
                    similar_entry = find_in_collection(collection_cluster_2, collection_cluster_1[entry]);
                    if (similar_entry != null) {
                        similar_entries.push(similar_entry);
                    }
                }
                result = {'similar_entries': similar_entries};
                return(result)
            }
            var result = correlate_cluster_1_and_cluster_2(collections);
            // this will return the data to the python method result 
            result
    """)
    # back to python
    print("RAW Process Result".format(process_result))
    # create a jinja2 template and print it with the results from javascript processing
    template = Template("""
        {% for similar_entry in similar_entries %}
        {{ similar_entry.msg }}
        {% endfor %}
    """)
    print(template.render(process_result))

Again, I just wrote this down; it's not the actual code, so I don't know if it really works.

But still, this is pretty simple.

You can even use JavaScript events, or JS debuggers, or create your own server-side browsers. You can find these examples in the demos directory of the PyV8 source tree.

So, this was all a 30-minute proof of concept; last night I refactored the code, and this morning I thought, well, let's write a real library for this. So maybe there will be some code on GitHub over the weekend. I'll let you know.

Oh, before I forget: the idea of writing all this in a YAML file came from working with Juniper's JunOS PyEZ library, which does something similar. But they use the YAML file as the description for autogenerated Python classes. Very nifty.

Thanks Jono

Thanks, Jono, for being such an awesome community manager for Canonical/Ubuntu.

EoM