This article describes part of the journey of IT engineers who created a framework for themselves, with the intention of making it available to non-IT engineers in R&D as well.
For more than 20 years, our software, for both computer and data science, was written in C, C++ or Fortran. The reasoning was obvious: get the most out of your resources with highly optimized languages. This made writing software the prerogative of a chosen few.
Rationale for shooting ourselves in the foot
With the accessibility of languages like Matlab, R and Python, a lot of non-IT people started to write code. This is a known story found in many organizations, so I will make it short. There came a point where we thought:
"Python has its downsides, but it's a language we can all work with".
In a pragmatic sense: it can do almost everything we need across R&D and could therefore help foster collaborative work between IT and non-IT.
Hypotheses should hold against reality
We tested that theory in 2014 with a small team of developers and contributions from non-IT colleagues. We very quickly learned three major lessons:
- if you don't rewrite their code, people can keep improving it and fixing bugs for you
- if you provide them with a sensible framework, they are more than happy to use it
- the Python ecosystem is huge; standing on the shoulders of others makes you taller
The most agile one should be the one to adapt
At the core of my thinking was the fact that Python seemed like a powerful asset for engineers from all fields (read "not just IT"):
- to prototype their ideas and refine their expectations from IT departments
- to provide libraries that would organically integrate into our products
For developers, learning a new language should come fairly easily. So we pushed for Python to take over our IT department. This meant that C++ developers and Matlab aficionados had to start converting to Python.
This was a hard pill to swallow for a lot of people. On the one hand, some developers saw moving to Python as selling themselves short in terms of skills. On the other hand, some non-IT partners who were used to Matlab initially saw this as having to re-learn things, and therefore as a loss of proficiency.
This article describes part of the framework that we put in place to onboard people to Python without forcing anyone. As everyone's job evolved towards data-driven activities and the need to collaborate, we saw the adoption rate increase through feedback, requests for support or training and consulting.
Be ambitious. Also reasonable.
Starting in 2018 we had one ambition: provide Michelin with a Python software factory that is both:
- highly accessible for non-developers
- at the state of the art for software engineers
Defining the baseline for code quality
Early on we agreed on one principle:
Our software factory should help people, not constrain them.
It is very easy to see code quality assurance tools as non-negotiable items. And indeed, you should not have to negotiate: people should be convinced. Make it so easy for them to write quality code that they don't even question it.
So let's make it easy to run
We decided to use cookiecutter with a handful of custom-written templates. The most popular are:
- Click for intuitive command-line interfaces
- fastapi for web services
- Django for database-backed websites
- streamlit for fast prototyping
The above video shows how you get from zero to a working, configured Click app in 3 shell commands:
pip install cookiecutter
cookiecutter <template url>
pip install <your project>
All of these automatically set up the following tools:
- black, because we don't have time to waste not using it
- isort, black but for your imports
- pylint, because it helps you write better code
- pytest, because code that is not tested does not exist
- coverage, because you need metrics for your testing
- sphinx, because documenting should not be an afterthought
And one final touch to all of that: on every git push, GitLab CI automatically runs all of the above in multiple versions of Python to check for compatibility issues.
Once this is done, the tools are all one command away:
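The template's own command is not reproduced here. As a rough illustration only, a small Python runner could chain the tools like this. The function names, the `docs` directory and the exact flags chosen are assumptions, and `isort` is assumed for the import-sorting step:

```python
import subprocess
import sys
from typing import List, Sequence


def quality_commands(package: str, docs_dir: str = "docs") -> List[List[str]]:
    """Return one command line per tool, in the order they would run."""
    return [
        ["black", "--check", "."],            # formatting, without rewriting files
        ["isort", "--check-only", "."],       # import ordering (assumed tool)
        ["pylint", package],                  # static analysis
        ["coverage", "run", "-m", "pytest"],  # run the tests and record coverage
        ["coverage", "report"],               # print the coverage summary
        ["sphinx-build", "-b", "html", docs_dir, f"{docs_dir}/_build/html"],
    ]


def run_all(commands: Sequence[Sequence[str]]) -> int:
    """Run every command, even after a failure, and return the failure count."""
    failures = 0
    for command in commands:
        print("$", " ".join(command), file=sys.stderr)
        if subprocess.run(command, check=False).returncode != 0:
            failures += 1
    return failures
```

Calling `run_all(quality_commands("my_package"))`, where `my_package` is a placeholder for the generated project's package name, runs every tool and reports how many failed. Running all of them instead of stopping at the first failure gives a complete picture in a single pass.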
Make it easy to read
Okay, we've made it easy for people to access basic code quality tooling. We have to provide them with the ability to quickly know two things:
- if something is broken
- where it broke
So we also configure a dashboard on the README.md file of the project:
Badges highlight the current state of each tool and provide a link to a summary of the issues. This makes it easy to fix residual mistakes that were not caught locally before pushing to the remote.
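How the per-tool badges are produced is not shown here. As a partial sketch, GitLab's built-in pipeline and coverage badges can be written into a README as Markdown. The host `https://gitlab.example.com`, the project path and the helper name `badge_markdown` below are hypothetical:

```python
from typing import List


def badge_markdown(base_url: str, project_path: str, branch: str = "main") -> List[str]:
    """Build Markdown lines for GitLab's pipeline-status and coverage badges."""
    badges_root = f"{base_url}/{project_path}/badges/{branch}"
    # Clicking a badge leads to the commit list of the same branch.
    commits_page = f"{base_url}/{project_path}/-/commits/{branch}"
    return [
        f"[![pipeline status]({badges_root}/pipeline.svg)]({commits_page})",
        f"[![coverage report]({badges_root}/coverage.svg)]({commits_page})",
    ]
```

For example, `print("\n".join(badge_markdown("https://gitlab.example.com", "group/cli-demo")))` prints two lines that can be pasted at the top of a README.md.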
Remember that in the end our goal is to provide people working with Python at Michelin with something they can use to write better, safer code, without having a PhD in software engineering.
And finally, make it easy to share
Now that code quality is under control, let's take the heavy lifting one step further and address packaging and publishing.
The template also comes with rules that are only triggered on tags. When a tag is created (and only if code quality is validated), a packaged wheel is pushed to our enterprise repository, making it instantly available to everyone within the company:
The above pipeline has been triggered after the creation of tag 1.0.0 of the cli-demo project. Anyone can now run the following commands and give it a go:
python -m venv demo-env
pip install cli-demo
cli_demo hello World
By the way, we also take care of setting the wheel version according to the tag name (in this case 1.0.0).
The version can also be read when importing the app as a Python module:
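The template's actual mechanism is not shown here. One possible approach, sketched under assumptions, is to derive the version from the tag name at build time (GitLab CI exposes it as the `CI_COMMIT_TAG` variable) and to read it back from the installed package metadata at import time. The helper names and the simplified tag pattern below are illustrative:

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Simplified pattern: accepts tags such as "1.0.0" or "v1.0.0" (not full PEP 440).
_TAG_PATTERN = re.compile(r"^v?(\d+\.\d+\.\d+)$")


def version_from_tag(tag: str) -> str:
    """Turn a git tag name into a package version, rejecting anything else."""
    match = _TAG_PATTERN.match(tag.strip())
    if match is None:
        raise ValueError(f"tag {tag!r} does not look like a release version")
    return match.group(1)


def installed_version(distribution: str, default: str = "0.0.0") -> str:
    """Read the version of an installed distribution, with a fallback."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        # Typical when running from a source checkout that was never installed.
        return default
```

A package could then expose `__version__ = installed_version("cli-demo")` in its `__init__.py`, so that the value seen at import time always matches the published wheel.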
Different is good
We have defined what would constitute the basics for code quality, and provided an environment to leverage that.
While many people will be very happy to use the out-of-the-box tools provided to them, some will also express the need to deviate from that standardization. That's good!
Circling back to the fact that we should "help people, not constrain them", our templates and tooling should leave people with the ability to experiment, customize, add, and remove easily. What new projects come up with is our source of inspiration for introducing new standards. Each project is encouraged to add its own tools and badges beyond the ones already defined.
In the example below, pydocstyle has been added to the stack:
In this other example, in addition to the basics, the developers are tracking how far their dependencies lag behind the latest available packages:
In yet another case, scientists have added exceptions to pylint variable naming rules because these rules would prevent them from using conventions that are clearly established in their fields. What is important is that pylint warnings reflect what makes sense for that team and that they can use the tool (instead of being frustrated with it and dropping the metric entirely).
Example of additions made to the default .pylintrc file provided:
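The team's actual additions are not reproduced here. As a hypothetical illustration, such an exception usually relies on pylint's `good-names` option. The fragment below is embedded in a short Python snippet that parses it, only to show its shape. The listed names are invented examples:

```python
import configparser
from typing import List

# Hypothetical additions to a .pylintrc file; the names are illustrative only.
PYLINTRC_ADDITIONS = """
[BASIC]
# Short names that are conventional in the team's field (assumed examples).
good-names=i,j,k,x,y,z,T,P,Re,dt
"""


def allowed_short_names(pylintrc_text: str) -> List[str]:
    """Return the variable names that pylint is told to accept as-is."""
    parser = configparser.ConfigParser()
    parser.read_string(pylintrc_text)
    raw = parser.get("BASIC", "good-names", fallback="")
    return [name.strip() for name in raw.split(",") if name.strip()]
```

With this fragment, a physics-style name such as `T` or `Re` no longer triggers an invalid-name warning, while the rest of the naming rules stay in force.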
Outlook and growth
We define Michelin today as a software-driven company. The innovations and disruptions of tomorrow will come in part from the ability of our engineers to translate ideas into executable, shareable code quickly, and to iterate on those ideas.
After four years of promoting Python as a side gig, we are officially organizing ourselves in a way that will help us support the growth of this language for the whole company, on par with our Angular/Java and .NET communities.
Special thanks to all the people who have contributed one way or another to making this chain of tools better and keeping it relevant ❤️.