Row Level Security, Object Level Security, Data Masking: What Are the Business Use Cases?

In a previous entry I explained at length the various ways to secure data in business intelligence products. In this follow-up I want to elaborate on which method(s) to use for what purposes. This entry points to Power BI specifics in some instances, but should be relevant to many other BI and data storage products.

RLS vs. OLS vs. DDM: Three Shades of Need to Know

Row Level Security (RLS) is meant to restrict the depth of what any specific user or group may see in a report, i.e. which rows of any tables they have access to they can see at all. RLS’s main purpose is to reflect organizational boundaries by geography, business unit, product line, or more rarely time. In other words, RLS makes the most sense to show you the dimensional subset of facts that are relevant to you, without oversharing either adjacent – say your colleague’s detailed sales by account – or broader – say your entire department or company – data.

RLS can usually be implemented without revisiting the structure of your reports. At worst you may have empty charts if no row is returned in specific filtering conditions. On the other hand RLS can have interesting data modeling ramifications and requires planning to be reasonably maintainable.

Object Level Security (OLS) intends to hide the breadth of what can be seen within a data model. The purpose is to prevent users from having cues that certain business logic even exists, with the sole appearance of certain column of table names possibly revealing trade secrets and business practices that are only shared on a “need to know basis.” Let’s say for instance that only some channel partners gain access to “back margins” under certain conditions. People who are not privy to these deal structures do not need to know that there’s a corresponding column in your data model.

Unlike RLS, OLS is a bit trickier to implement in reports, as blocking specific columns or tables from “existing” from the perspective of specific users, will outright break visuals that feature these entities. At least that’s how Power BI currently works. You may need to author different reports, or at least report variations using separate pages or bookmarks, to deliver a user-friendly experience. Field parameters provide the most elegant way to avoid charts from breaking without having to create bookmarked report variations by audience.

(Dynamic) Data Masking on the third hand doesn’t outright hide rows, tables, or columns. Instead, what it does is hide true values by obfuscating or randomizing specific data points, from Personal Identifiable Information (PII) to margin rates and other data that, again, are distributed on a need-to-know basis. The benefit is that the entire data model and structure is still accessible, which is essential for data engineers and BI developers to be able to work. This practice is very often used when hiring outside consultants, though it is common to just do it by providing fake data spreadsheets rather than leveraging DDM capabilities available in some relational databases such as MS SQL Server.

I like to think of Data Masking as a tool primarily targeting BI authors while RLS and OLS are meant for BI consumers. RLS and OLS do not even apply to developers in most tools, but thanks to DDM, giving developer full data model access doesn’t have to mean giving them full data access.

Sensitivity labels are quite different from all the above. They don’t hide data outright nor are the applied at a granular level. Instead, they’re used to label entire assets and files – datasets, reports, dashboards, and dataflows – to prevent them from being shared outside of the confines of the Power BI service – think Excel or PowerPoint exports – beyond their intended audience.

Common Implementation Challenges

Things can get pretty complicated whenever requirements ask for the combination or RLS, OLS, and/or data masking, not just in function of the target audience, but also depending on the grain of the returned data. It’s pretty common to want to hide or obfuscate data at the smallest grain while still allowing the lowest resolution data points to be displayed. That sounds straightforward, except when it isn’t, such as in cases when there are not enough data points to prevent people from easily guessing missing values by subtracting the granular values they do see from the total.

Another challenge arises from deciding where to apply RLS/OLS/Data Masking. My general advice follows the recommended ETL pattern of doing transformations as upstream as possible and as downstream as necessary. In other words, if you can enforce these security mechanisms in source data systems, you’re better off doing so as various downstream tools will then all benefit. That means leveraging data warehouse capabilities rather than BI frontends if possible. Just because you can do it in Power BI doesn’t mean you should.

Third, combining these techniques adds its own layer of complexity, with the combination of RLS and OLS in particular requiring handling users and rules in specific ways.

Fourth, these rules need to evolve continuously as people get hired, fired, demoted, promoted, or moved to different jobs with different attributions. You’d better leverage the enterprise system of record where these things are recorded – typically an IT directory such as Active Directory and/or a HR app, otherwise you’ll have a maintenance nightmare on your hands and likely only partial implementation, not in technical terms, but in organizational terms.

“How come this guy who moved from sales to marketing 6 months ago can still see sales quotas?

Irate boss finding out your security rules are not keeping up with the org chart

In conclusion, all this complexity introduces management and technical overhead and should be considered against the real impact of information leaks. It’s easy to underestimate security challenges, but it’s also easy to over-value internal data. There are tradeoffs here in terms of development speed, data pipeline and security maintenance, and report performance.

Thanks to Aswanth Venkatesan at Lumel for suggesting this topic. As always, constructive feedback is welcome in the comments below.

Don’t get them confused, they’re different animals!

Leave a Reply

Your email address will not be published. Required fields are marked *