Striving to reach Python literacy in Engineering

This article describes part of the journey of a group of IT engineers who built a framework for themselves, with the intention of making it available to non-IT engineers in R&D as well.

Recognizing change

For more than 20 years, our software, for both computer and data science, was written in C, C++ or Fortran. The reasoning was obvious: get the most out of your resources with highly optimized languages. This made writing software the prerogative of a chosen few.

Rationale for shooting ourselves in the foot

With the accessibility of languages like Matlab, R and Python, a lot of non-IT people started to write code. This is a known story found in many organizations, so I will make it short. There came a point where we thought:

"Python has its downsides, but it's a language we can all work with".

In a pragmatic sense: it can do almost everything we need across R&D and could therefore help foster collaborative work between IT and non-IT.

Hypotheses should hold against reality

We tested out that theory in 2014 with a small team of developers, and contributions from non-IT colleagues. We very quickly learned 3 major lessons:

  1. if you don't rewrite their code, people can keep improving it and fixing bugs for you
  2. if you provide them with a sensible framework, they are more than happy to use it
  3. the Python ecosystem is huge; standing on the shoulders of others makes you taller

The most agile one should be the one to adapt

At the core of my thinking was the fact that Python seemed like a powerful asset for engineers from all fields (read "not just IT"):

  • to prototype their ideas and refine their expectations from IT departments
  • to provide libraries that would organically integrate into our products

As a developer, you should find learning a new language fairly easy. So we pushed for Python to take over our IT department. This meant that C++ developers and Matlab aficionados had to start converting to Python.

This was a hard pill to swallow for a lot of people. On the one hand some developers saw moving to Python as short-selling themselves in terms of skills. On the other hand, some non-IT partners that were used to Matlab initially saw this as having to re-learn things, and therefore a loss of proficiency.

This article describes part of the framework that we put in place to onboard people to Python without forcing anyone. As everyone's job evolved towards data-driven activities and the need to collaborate, we saw the adoption rate increase through feedback, requests for support or training and consulting.

Be ambitious. Also reasonable.

Starting in 2018 we had one ambition: provide Michelin with a Python software factory that is both:

  • highly accessible for non-developers
  • state of the art for software engineers

Defining the baseline for code quality

Early on we agreed on one principle:

Our software factory should help people, not constrain them.

It is very easy to see code quality assurance tools as non-negotiable items. And indeed, you should not have to negotiate: people should be convinced. Make it so easy for them to write quality code that they don't even question it.

So let's make it easy to run

We decided to use cookiecutter with a handful of custom-written templates. The most popular ones are:

cookiecutter on a Click-based template

The above video shows how you get from zero to a working, configured Click app in 3 shell commands:

pip install cookiecutter
cookiecutter <template url>
pip install <your project>
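For readers who have not used Click, a freshly generated app boils down to something like the sketch below. It is an illustration, not the actual output of our template: the group name `cli` and the `hello` command are assumptions, chosen to match the `cli_demo hello World` call used later in this article.

```python
import click


@click.group()
def cli():
    """Entry point of a hypothetical cli_demo application."""


@cli.command()
@click.argument("name")
def hello(name):
    """Greet NAME on standard output."""
    click.echo(f"Hello, {name}!")
```

In a packaged project, a console script entry point would typically map a command name such as `cli_demo` to the `cli` function, which is why no explicit call to `cli()` appears here.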

All of these automatically set up the following tools:

  • black, because we don't have time to waste not using it
  • isort, because black but for your imports
  • pylint, because it helps you write better code
  • pytest, because code that is not tested does not exist
  • coverage, because you need metrics for your testing
  • sphinx, because documenting should not be an afterthought

And one final touch to all of that: on every git push, GitLab CI automatically runs all of the above against multiple versions of Python to check for compatibility issues.

Once this is done, the tools are all one command away:

Pipeline running all tools with multiple versions of Python
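To give an idea of what "one command away" can mean locally, the same checks can be chained in a few lines of Python. This is a sketch under assumptions, not the content of our templates: the flags are the usual command-line invocations of each tool, while the `docs` folder and the injectable `runner` argument are choices made for this example.

```python
import subprocess
from typing import Callable, List


def build_checks(package: str) -> List[List[str]]:
    """Return one command line per quality tool, in the order they run."""
    return [
        ["black", "--check", "."],        # formatting
        ["isort", "--check-only", "."],   # import ordering
        ["pylint", package],              # static analysis
        ["coverage", "run", "-m", "pytest"],  # tests, with coverage tracing
        ["coverage", "report"],           # coverage summary
        # Assumption: the Sphinx sources live in a "docs" folder.
        ["sphinx-build", "-b", "html", "docs", "docs/_build/html"],
    ]


def run_checks(package: str, runner: Callable[[List[str]], int] = subprocess.call) -> int:
    """Run every check and return the number of commands that failed."""
    failures = 0
    for command in build_checks(package):
        print("$", " ".join(command))
        if runner(command) != 0:
            failures += 1
    return failures
```

Calling `run_checks("cli_demo")` from the root of a project would run the six commands in sequence and report how many of them exited with a non-zero status. Every tool keeps running even if an earlier one fails, so a single run lists all the problems at once.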

Make it easy to read

Okay, we've made it easy for people to access basic code quality tooling. Next, we have to give them a way to quickly know two things:

  • if something is broken
  • where it broke

So we also configure a dashboard on the file of the project:

Default badges, out of the box

Badges highlight the current state of each tool and provide a link to a summary of the issues. That way, you can easily fix residual mistakes that were not caught locally before pushing to the remote.

Remember that in the end our goal is to provide people working with Python at Michelin with something they can use to write better, safer code, without having a PhD in software engineering.

And finally, make it easy to share

Now that code quality is under control, let's take the heavy lifting one step further and address packaging and publishing.

The template also comes with rules that are only triggered on tags. When a tag is created (and only if code quality is validated), a packaged wheel is pushed to our enterprise repository, making it instantly available to everyone within the company:

CI pipeline for a branch
CI pipeline for a tag

The above pipeline has been triggered after the creation of tag 1.0.0 of the cli-demo project. Anyone can now run the following commands and give it a go:

Installing and running an artifact that has been created by the pipeline
python -m venv demo-env
source demo-env/bin/activate
pip install cli-demo

cli_demo hello World

By the way, we also take care of setting the wheel version according to the tag name (in this case cli_demo-1.0.0-py3-none-any.whl).
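The mapping from tag to wheel name can be sketched as follows. This is a simplified illustration, not the code of our pipeline: `CI_COMMIT_TAG` is the variable GitLab CI sets on tag pipelines, but the function names, the tolerance for a leading "v" and the purely numeric version check are assumptions made for this example.

```python
import os
import re


def version_from_tag(tag):
    """Turn a git tag such as "1.0.0" into a package version string."""
    # Assumption: an optional leading "v" (as in "v1.0.0") is stripped.
    version = tag[1:] if tag.startswith("v") else tag
    # Simplification: only dotted numeric versions are accepted here.
    if not re.fullmatch(r"\d+(\.\d+)*", version):
        raise ValueError(f"tag {tag!r} does not look like a release version")
    return version


def version_from_ci(environ=os.environ):
    """Read the tag that triggered the current GitLab CI pipeline."""
    return version_from_tag(environ["CI_COMMIT_TAG"])


def wheel_filename(project, version):
    """Build the file name of a pure-Python wheel for the given project."""
    # Wheel file names use "_" where the project name has "-" or ".".
    distribution = re.sub(r"[-_.]+", "_", project)
    return f"{distribution}-{version}-py3-none-any.whl"


print(wheel_filename("cli-demo", version_from_tag("1.0.0")))
```

In a real project the build backend produces the file name. The point here is only to show that the tag is the single source of truth for the version.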

The version can also be read when importing the app as a Python module:
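A minimal sketch of such a lookup, assuming the package exposes the conventional `__version__` attribute (the helper name `read_version` and the stand-in module are made up for this example, so that it runs without `cli_demo` installed):

```python
import importlib
import sys
import types
from importlib import metadata


def read_version(module_name, distribution_name):
    """Return the version of an installed package.

    Prefer the module-level __version__ attribute and fall back to the
    metadata recorded at install time when the attribute is missing.
    """
    module = importlib.import_module(module_name)
    version = getattr(module, "__version__", None)
    if version is None:
        version = metadata.version(distribution_name)
    return version


# Stand-in for an installed cli_demo package, so the sketch runs on its own.
standin = types.ModuleType("cli_demo_standin")
standin.__version__ = "1.0.0"
sys.modules[standin.__name__] = standin

print(read_version("cli_demo_standin", "cli-demo"))
```

With the real package installed, `import cli_demo` followed by `cli_demo.__version__` would be the direct equivalent, provided the template defines that attribute.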

Compatible with PEP 8 and packaging recommendations.

Different is good

We have defined what would constitute the basics for code quality, and provided an environment to leverage that.

While many people will be very happy to use the out-of-the-box tools provided to them, some will also express the need to deviate from that standardization. That's good!

Circling back to the fact that we should "help people, not constrain them", our templates and tooling should leave people with the ability to experiment, customize, add, and remove easily. What new projects come up with is our source of inspiration for new standards. Each project is encouraged to add its own tools and badges beyond the ones already defined.

In the example below, vulture and pydocstyle have been added to the stack:

file showing the current status of a project

In this other example, in addition to the basics, the developers are tracking how far behind they are on the latest available package versions:

In yet another case, scientists have added exceptions to pylint's variable naming rules because these rules would prevent them from using conventions that are clearly established in their fields. What is important is that pylint warnings reflect what makes sense for that team, so that they can use the tool instead of being frustrated with it and dropping the metric entirely.

Example of additions made to the default .pylintrc file provided:

+   invalid-name,
+   arguments-differ,
+   protected-access,
+   too-many-function-args,
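To see why `invalid-name` ends up on that list, consider the kind of function a scientist might write. The example below is hypothetical and not taken from one of our projects: the single uppercase letters follow the textbook notation of fluid mechanics, which pylint's default snake_case rule for argument names rejects.

```python
def reynolds_number(rho, U, L, mu):  # pylint: disable=invalid-name
    """Return the Reynolds number Re = rho * U * L / mu (dimensionless).

    rho: fluid density, U: flow velocity, L: characteristic length,
    mu: dynamic viscosity, all in consistent units.
    """
    return rho * U * L / mu


# Water flowing at 2 m/s through a 5 cm pipe (illustrative values).
print(reynolds_number(rho=1000.0, U=2.0, L=0.05, mu=0.001))
```

Renaming `U` and `L` to something like `flow_velocity` and `characteristic_length` would satisfy the linter but make the code harder to compare with the equations it implements. Disabling the message for the whole project, as in the `.pylintrc` additions above, avoids repeating the inline comment on every function.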

Outlook and growth

We define Michelin today as a software-driven company. The innovations and disruptions of tomorrow will come in part from the ability of our engineers to translate ideas into executable, shareable code quickly, and to iterate on those ideas.

After four years of promoting Python as a side gig, we are officially organizing ourselves in a way that will help us support the growth of this language for the whole company, on par with our Angular/Java and .NET communities.

Special thanks to all the people who have contributed one way or another to making this chain of tools better and keeping it relevant ❤️.