Network As a Service - NSX
As I began the journey towards network as a service, I was unsure where to start. How do I begin with the end in mind? The end target is a fully sustainable network environment where users can land on a user portal, click the network options they need, and have them delivered (provisioned) in a short period of time instead of days, weeks or months later. It is an environment where automation, configuration, management, testing, deployment and operation are controlled and managed. I am confident that this is where we need to go. The trick is to break this major activity into smaller, manageable puzzle pieces that, when put together, make the complete picture.
NSX
The first small puzzle piece I chose to tackle was NSX. When we began our journey with NSX we knew we wanted to implement some automation to accompany it. The problem was that we didn't know what we wanted to automate. We had to learn how NSX worked before we could begin to understand what we wanted to automate. Once we began to understand NSX and its workings, what we wanted to automate became clearer. Start small, start simple and evolve from there. So, starting small and simple, we knew we needed to automate the tagging of VMs, the creation of NSGroups and the deployment of firewall rules. The end game is to fully automate the lifecycle of a VM (or application) and have the user make the selections. NET as a Service.
Luckily, there is already a fairly extensive set of Ansible modules for NSX, available at https://github.com/vmware/ansible-for-nsxt. Recently, modules have been added that use the Policy API within NSX. This is important because it keeps everything under the same GUI menu instead of having to bounce between old and new. Those who have experienced this know what I am talking about.
Tags
To start simple and expand, I set out to tag a VM, create NSGroup(s) and then create firewall policies and rules. I accomplished this by taking two examples from the GitHub location mentioned above and customizing them for my environment.
Tagging a VM is fairly simple. I just had to know the VM display name and which tags I wanted to apply (the NSX admin access IP and credentials are of course also required). An extract of the simple tag file is shown here:
As I said, it's a simple file, with a current static limit of four tags. I plan on evolving this to be dynamic (with a loop) and to allow as many tags as are needed (and allowed by NSX). Currently I read the information in from a simple CSV file that contains the VM display name as well as the tags required.
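The dynamic version described above could look something like the following Python sketch. It is not the playbook itself, only an illustration of the looping idea: each CSV row holds a VM display name followed by any number of tags. The column layout, the optional scope:tag notation and the VM names are assumptions made for this example. The only detail taken from NSX is that a tag is represented as a scope and tag pair.

```python
import csv
import io


def load_vm_tags(csv_text):
    """Map each VM display name to a list of {"scope", "tag"} dicts.

    Assumed row layout (hypothetical): display name first, then any
    number of tags, each written as "tag" or "scope:tag".
    """
    vm_tags = {}
    for row in csv.reader(io.StringIO(csv_text)):
        cells = [cell.strip() for cell in row]
        if not cells or not cells[0]:
            continue  # skip blank lines
        name, raw_tags = cells[0], cells[1:]
        tags = []
        for raw in raw_tags:
            if not raw:
                continue  # tolerate trailing commas
            scope, sep, tag = raw.partition(":")
            if not sep:
                scope, tag = "", raw  # no scope given
            tags.append({"scope": scope, "tag": tag})
        vm_tags[name] = tags
    return vm_tags


# Hypothetical sample data: rows may carry different numbers of tags.
SAMPLE = "web-01,ST-Learn\napp-01,env:prod,ST-Learn,ST-App\n"

for vm, tags in load_vm_tags(SAMPLE).items():
    print(vm, tags)
```

Because the tag list is built per row, there is no fixed limit of four tags, and VMs with different numbers of tags can share one CSV file.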
Groups
Once the tags are created, I focus on the creation of NSGroups. In this example I am creating Learning groups. The reason will be explained when I get to the firewall policy and rules. I create two groups, SG-Learn-Ansible and SG-IaC-Ansible, and membership of each group depends on the tag applied to the VM, as shown below. Both groups have a condition (or key) for tag. The one I care about here is called ST-Learn. I use ST as the naming pattern so I can easily recognize Security Tags.
I also end my group name (for now) with -Ansible so that I know it was created via an Ansible playbook. When we have migrated all our consumption over to Ansible there will be no need for this as all the Groups will be created via Ansible.
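To make the tag-based membership concrete, here is a rough Python sketch of what a group definition with a single tag condition might look like as data. The field names follow my understanding of the NSX Policy API group schema (a Condition expression on the VirtualMachine Tag key), and the scope|tag value format is an assumption that should be checked against the API documentation for your NSX version. The helper name tag_group is made up for this illustration.

```python
import json


def tag_group(display_name, tag, scope=""):
    """Build a group body whose members are VMs carrying the given tag.

    Assumption: a tag condition value is written "scope|tag", or just
    the tag when there is no scope. Verify against your NSX version.
    """
    value = f"{scope}|{tag}" if scope else tag
    return {
        "display_name": display_name,
        "expression": [
            {
                "resource_type": "Condition",
                "member_type": "VirtualMachine",
                "key": "Tag",
                "operator": "EQUALS",
                "value": value,
            }
        ],
    }


# Group and tag names are the ones used in this post.
learn_group = tag_group("SG-Learn-Ansible", "ST-Learn")
print(json.dumps(learn_group, indent=2))
```

Any VM tagged ST-Learn then becomes a member of SG-Learn-Ansible without the group itself having to be edited.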
Policy
Now that I have NSGroups created, it's time to create a Learning policy and firewall rule. I choose to put all new VMs into what I call a learning policy so we can accept and log all traffic flows. This information is critical to have when locking NSX down to whitelist mode. A sample of the simple security policy I created is shown below:
I chose to create this policy in the Infrastructure category and make it the very first rule (sequence 1).
For the initial setup, all three of these playbooks are run from the build_topologies playbook. A successful run would look like what is shown below.
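As an illustration only, the following Python sketch builds a policy body along these lines: one allow rule with logging enabled, placed in the Infrastructure category with sequence number 1. The field names reflect my understanding of the NSX Policy API security policy schema. The display names, the default domain, and the choice to apply the rule to the Learning group through scope (rather than through source or destination) are assumptions for this sketch, not necessarily how my playbook is written.

```python
def learning_policy(group_id, domain="default"):
    """Build a security policy body that allows and logs all traffic
    for members of the given group.

    Assumptions: the group lives in the given domain, and the rule is
    applied to the group through "scope" with ANY source/destination.
    """
    group_path = f"/infra/domains/{domain}/groups/{group_id}"
    return {
        "display_name": "Learning-Policy",  # hypothetical name
        "category": "Infrastructure",
        "sequence_number": 1,  # evaluated first within the category
        "rules": [
            {
                "display_name": "Learning-Rule",  # hypothetical name
                "source_groups": ["ANY"],
                "destination_groups": ["ANY"],
                "services": ["ANY"],
                "scope": [group_path],  # applied to Learning VMs only
                "action": "ALLOW",
                "logged": True,  # log flows for later evaluation
            }
        ],
    }


policy = learning_policy("SG-Learn-Ansible")
print(policy["rules"][0]["scope"])
```

Keeping the rule tied to the group means that membership, driven by the ST-Learn tag, decides which VMs are in learning mode.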
And this is what it looks like in NSX:
To support application or VM lifecycles, the playbooks can also be called individually. For example, once I know all the flows for a VM, I will run my vm_tag playbook to remove the ST-Learn tag and apply the tags that allow the application to function. A successful run produces this output:
The run above changed the tag(s) on a particular VM but did not remove the newly created Learning policy or firewall rule. In the future, as I bring new VMs into NSX, all I need to do is tag them with the ST-Learn tag and they will automatically hit that Learning rule, and the flows will be logged for evaluation.
I have experimented with these playbooks and control the variables with an external CSV file. This allows information to be passed to the playbook without having to reveal the 'details' to the end user.
Any of these three playbooks can now be run independently as needed for CRUD of firewall rules, NSGroups and VM tags. Recently I used the VM tagging playbook to retag a handful of VMs that were being rebuilt. Luckily each VM had the same number of tags, so the CSV file was quite simple. After each VM was built and visible, I ran the playbook, and in a matter of seconds the VM had its required tags back and the proper traffic was being allowed through the NSX firewall.
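The tag swap at the end of the learning phase boils down to a small piece of list logic, sketched here in Python. This is not the vm_tag playbook, just the idea behind it: drop ST-Learn, keep everything else, and add the application tags without duplicates. The tag names ST-Web and ST-Prod are hypothetical.

```python
def promote_from_learning(current_tags, app_tags, learn_tag="ST-Learn"):
    """Return the tag list for a VM leaving learning mode.

    The learning tag is removed, existing tags are kept in order, and
    application tags are appended if not already present.
    """
    kept = [tag for tag in current_tags if tag != learn_tag]
    for tag in app_tags:
        if tag not in kept:
            kept.append(tag)
    return kept


# Hypothetical example: ST-Web is already present, ST-Prod is new.
print(promote_from_learning(["ST-Learn", "ST-Web"], ["ST-Web", "ST-Prod"]))
# prints ['ST-Web', 'ST-Prod']
```

Once ST-Learn is gone, the VM falls out of the Learning group on its own, so the policy and rule can stay in place for the next new VM.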
Conclusion
Automation will simplify and speed up our network deployments, and it will make us the enabler for Network as a Service. To be clear, automation is only as good as the person programming it, so it is not foolproof, but it has been shown to be more efficient than manual configuration. To that end, we will continue to automate our NSX environment as we move towards our goal of a self-service user portal.