Responsible Business Intelligence - Secure Reporting to External Customers via Salesforce + PowerBI
I work in the Business Intelligence and Analytics department as a full-stack architect for the Americas region. Michelin as a company is in the midst of a very important data-driven transformation. Our department is directly in the middle of it, often working to set foundational practices and approaches for the region.
The BI&A department has grown quite a bit over the last couple of years, and is segmented into 10 squads based on the customer for the analytics product being delivered. Examples include: sales, marketing, finance and supply chain, just to name a few. All come with their own sets of challenges and opportunities depending upon the business stakeholders, but one of our more unique cases is the squad for external customers, known as the Mercury squad.
The majority of BI&A's direct customers are internal, presenting Mercury with a unique challenge. They are tasked with delivering to a user base that doesn't work for Michelin, but is an irreplaceable partner in our business - primarily fleets and dealers.
Historically, our department has used Microstrategy as the primary user-facing tool for industrialized BI reporting. For a variety of reasons, we are in the middle of a journey to become more data literate and expand to other reporting tools, namely PowerBI. Some of these critical dashboards are already being built by the Mercury team in PowerBI. Globally, Microstrategy dashboards have been available through the external facing Michelin MyPortal, a customized Internet portal running on Salesforce.com. This has been the primary place for these external users to view dashboards for years. A shift to look at others tools created an urgent need to integrate these two in Michelin's ecosystem, so this article is about the underlying technical design and security validation to implement it properly.
Identifying the Need
As I said previously, Michelin had been using an existing solution with Microstrategy dashboards for some time. But the focus of this article is not on the previous solution. The objective was to build a solution that met customer needs and did so securely and conveniently, not to create an exact replica of an existing implementation.
Mercury's first project was for a dashboard called Fleet Business Insights. This would allow a fleet to analyze their multiple figures on their business with Michelin, including total spending, spending trends over time, number of invoices, proportion of service vs. product expenses, and spending categories (such as retread tires vs. new tires).
Several key architecture questions needed to be answered before moving forward, which I will itemize below. Some are quickly answered, some required additional analysis:
- Who is the target audience for these dashboards?
The entire reason for these dashboards is for external users. The dashboards are created as a tool for the dealers and fleets, so that will be the target audience.
- Who needs access to these dashboards?
Along with the above mentioned groups, the analytics team developing the dashboards will obviously need access. Michelin's internal sales team, working with these external user groups, will need access as well.
- Where will the dashboards be accessed by each group?
We could build a separate environment, but this would have a high development and support cost. Salesforce is Michelin's standard solution already used for external partners, so this made the most sense. Michelin's internal sales team already accesses Salesforce.com via integration with Azure Active Directory. There they could authenticate and access dashboards with their Michelin credentials. Dealers and fleets do not have Michelin credentials and use the Michelin MyPortal on top of Salesforce.com I mentioned earlier for external consumption. This would also work.
The technical and business needs were both clear, as was the framework. But now for the harder questions:
- How would external users authenticate?
- How would users' rights be appropriately managed?
- How would Salesforce/MyPortal interface with PowerBI?
- How would we verify this was secure?
All these questions would be answered in the design phase.
Designing the Solution
First step was to meet with two groups: Michelin's internal Salesforce development team and Microsoft. That way we could find out what was both possible and recommended.
After these meetings, we determined the PowerBI embedded module was an option but not necessary. The dashboards could be embedded by putting an iframe inside a Lightning component (proprietary component within Salesforce), and would not require an additional subscription or paid add-on.
High level functional architecture of access is shown below, server names and network information excluded for security reasons.
Authentication and Rights Management
Authentication was a little bit trickier for the external groups. Could we put guest accounts in Azure Active Directory? Would we consider using OAuth? SAML?
After doing some research and looking back at the previous solution, I realized the problem wasn't nearly as complex as it first appeared. These external users were not accessing the Michelin internal network, and would never have to. They were only accessing through Michelin's Salesforce MyPortal, to which they already had logins. These MyPortal accounts were completely independent of Michelin accounts, and they could stay that way.
Authentication could be completely segmented into two separate systems and mechanisms, which I will detail below. This closely fits a pattern that Microsoft recommends for embedding reports, which I will provide a reference link for later in my explanation.
1. Internal PowerBI. Michelin's PowerBI nodes pull data from our back-end corporate data lake to run queries for internal consumption already. PowerBI could continue to use this same connection mechanism with this external facing solution; the resulting PowerBI dashboard would just be embedded in the portal. But this dashboard had to be run as a user. Internally, users would login with their Michelin credentials, which wasn't an option for those not in Michelin systems. So we decided to utilize a service principal, effectively functioning as a robot account. This account executes queries for this dashboard and would have full rights to execute the necessary queries within PowerBI, along with read access to all back-end data needed for said queries. This account would also have access to the PowerBI REST APIs from the back-end web server, which could tokens and URLs back to the end user. More on this below.
As a former network security administrator in a previous job, my instant reaction was, "Access a dashboard in an Internet facing portal? With a robot account loading the dashboard? We are creating a security risk!" With further analysis, and design of the second part of rights management, this wasn't the case.
Here is a link for additional details Microsoft provides about embedding in an app with a separate authentication method securely. Something they refer to as embedding for customers: https://learn.microsoft.com/en-us/power-bi/developer/embedded/embed-tokens?tabs=embed-for-customers. Microsoft provides more detail, so please check out the link if you want to learn more, but I will paraphrase and include one of their diagrams here.
When a user accesses the URL for the web app, the web app makes a call using the service principal credentials to Azure Active Directory. Once this is successful, a token is passed back from the AD to the web app, confirming it can load the embedded report. Please note that this initial Azure AD token pass is exclusively back-end, server to server communication. It does not exist in the user's web session.
The end user still needs access. So the web app then issues a call to PowerBI's REST APIs, which send an embed token back through the web app and to the web app user. This token is effectively a session token, and also specifies what PowerBI content can be embedded, sending a URL to PowerBI with it. Lastly, the web app user sends a request to the provided PowerBI URL, along with the embed token, loading the embedded dashboard on the webpage.
This method depends upon the assumption that the web app user is permitted to access the web app page in the first place, which brings me to the second part of authentication.
2. Michelin MyPortal authentication. The only place these dashboards will be accessible external to Michelin is this portal, making it an appropriate place to manage rights. All external fleets and dealers that required access would have accounts in Michelin's MyPortal. Users would be separated into groups. There would be a top level external group for fleets and dealers, and a separate group for those working for Michelin, primarily for testing the user experience. Additional granularity of access could then be applied via filtering at two separate levels. First at the dashboard level. There is only one dashboard right now for this particular user need, but will be important for potential to expand this technical solution to other products. Second, filters could be applied as needed through the embedded URL in the iframe, tied to the customer ID assigned to the user's MyPortal credentials. This would restrict access to only the user's company's data.
Here is a flow diagram of the entire process - from when a user logs in to MyPortal to successfully loading the PowerBI embedded dashboard. Actions or data loads are in blue rectangles, validation checks are in pink diamonds.
Lastly, I've described how a few scenarios would work with respect to user rights. These dashboards are set up to deny access by default. Examples:
- User A has no account on MyPortal. User A login fails, cannot access portal.
- User B has an account on MyPortal, but isn't in the external fleet group nor the Michelin internal access group. Login succeeds, but dashboard is not visible in dropdowns nor is URL accessible.
- User C has an account on MyPortal, and is in the external fleet group. User is not a member of any fleets. Login succeeds, user can see dashboards, filters don't allow visibility over any data.
- User D has an account on MyPortal, and is in the external fleet group. User is a member of fleet 1. Login succeeds, user can see dashboards. Filters applied to only allow visibility over fleet 1 data.
Verifying the Security
Since this integration of these two components was completely new to the company, it was important to check for security vulnerabilities in our implementation before moving to production.
Putting on my security hat for a minute, if user permissions are managed entirely in MyPortal, parameters would have to pass through the URL called by the iframe with PowerBI embedded. This URL was not accessible by the end user, but for someone with nefarious intentions, could this be manipulated to elevate rights? I needed to test and verify all the checks and tokens worked as intended. I went to one of our security architects for advice, and he recommended using Fiddler to dig deeper.
Fiddler is a freeware tool, useful as a debugging proxy, for Windows that allows capture of web traffic to and from the local machine. This would allow me to not only see the embedded URLs that were hidden in the user interface, it would allow me to see how the parameters were passed. Fiddler also enabled me to view the headers and payload information, and would even allow me to manipulate those and resend them with information I specify. Sample screenshot of a capture accessing Google.com below.
My first goal was just to learn how the web traffic worked. From there I could investigate my two security scenarios - whether I could access the dashboard without logging in to MyPortal, and whether I could manipulate my rights to see data I shouldn't be able to once logged in as a different user.
The team created two test users, let's call them User A and User B. To identify the data each should see, a company ID accompanies each user. Let's call User A's company ID 123 and User B's company ID 456.
I logged into MyPortal as User B and accessed the dropdown menu for our embedded PowerBI report. Screenshot from that URL connection forward posted below.
Once accessing the URL of the report, my laptop then sends a set of 4 requests to call the embedded Salesforce lightning components. Digging into the payload, I discovered in the response to the 4th message, the company ID 456 was passed twice, once in a field, the other in a full URL; the exact URL that my client would then connect to app.powerbi.com with, along with the PowerBI embed token. This connection then loads the appropriate filtered data from app.powerbi.com, and the Salesforce components continue to load on the page. Now for the actual testing:
Test #1
My first test was simple, log out of MyPortal on my computer and attempt to resend the MyPortal request. The first packet went through, but automatically forwarded to the portal login page. The others all gave me an error about an inaccessible component or an invalid session. This was exactly the behavior I was looking for; without a valid session, even with all the connection information and payload, a user could not spoof a session without logging in.
Test #2
Test two was also simple. The Lightning component calls an embedded URL from app.powerbi.com. I wanted to try and connect directly to that URL. Since this URL included the company ID, I could try and manipulate it. Resending the request resulted in the PowerBI logo loading on the screen, and the dashboard never loading. Looking at the payload, the URL would only load if there was a valid embed token. But even if that check succeeded, the dashboard must still be loaded. This was the previously mentioned server to server communication from Microsoft, which occurred on the back end. The Azure AD token and account were not present. This part failed, which is why the report would not load. This was also desired behavior.
Test #3
Test 3 was a little bit more complex. I wanted to take a valid, active login session as User B and attempt to see data intended for User A. Fiddler featured a nifty feature called "AutoResponder" that helped greatly with this. Since I initially found two places company ID 456 was passed in the 4th request via a Salesforce URL, I downloaded the response payload to my PC and manipulated it to instead include company ID 123. The AutoResponder feature allowed me to trigger responding with this file any time that 4th request URL was accessed.
After attempting to load the MyPortal URL with the dashboard, I received similar behavior to the previous test. Where the dashboard usually rendered, a spinning wheel showed. Eventually, I received the following message via response in Fiddler. This was also the behavior I was hoping for.
With these results, I was comfortable with proceeding with our official security validation process with both the technical and business contacts.
Deployment and Shout Outs
I was happy we could design a solution and mechanism that allowed convenient access to these dashboards for external users, but did so in a controlled and secure manner. This solution is now live in production, rolled out for several fleets. We will continue to evolve the solution, working with the customer and improving based on customer feedback. Screenshot with dummy data shown below:
Although the technical design was the focus of this article, the hard work on actually implementing and deploying this solution came from several people: the Mercury squad for their great and detailed PowerBI dashboards, the internal Salesforce development team in Michelin for their customized coding work to apply filters and rights management, and our PowerBI subject matter expert for his invaluable insight and execution on seamlessly connecting PowerBI with MyPortal.
They're the reason this is a success, so a big thank you to that whole group.
The Americas is also the first region to roll out an embedded PowerBI dashboard for external-facing customers within Michelin, so my hope is this creates a framework and jumping off point for other regions looking to do the same.