CoP TIP: Prevent column changes from breaking your Power BI model

by Welcome to Marquee Insights

This is a post from our Managed Community of Practice service that supports clients in building a data culture. If you need help for your organization, contact us at hello@marqueeinsights.com to discuss.

Who Broke My Model?

Don’t you hate it? You have a Power BI model pulling data from many sources, like SharePoint and a few Excel spreadsheets. You come in in the morning and see a Teams message that your Power BI model isn’t updating! Oh no, what now?

You look in the Power BI service and see that there’s an error on the refresh. You sigh as once again; someone renamed a column, or they deleted a column from the data source.

Business is constantly changing so this shouldn’t be such an issue, but it is. Power BI refers to all columns by name, so the default behavior will lead to issues. Power BI is getting rolled out everywhere in the org, so this problem is growing quickly.

Models Should Adapt to Changing Data

In a perfect world, Power BI handles schema changes without blowing up. The column that is renamed or removed is simply returned as null values, giving you time to figure out where it is used and make changes accordingly.

Thankfully, that world is here today. Let’s review how to make that happen.

Here’s How to Make Power Query M Adapt Automatically

The key is in how you specify what columns to include in your Power Query. As discussed in your Marquee Insights Power BI class, this action should be done immediately after the Navigation step as this action generally folds back to the data source. The order in which this action is made is of even greater importance, using this technique.

It’s important to select your columns first, after navigation, for this to work.

Normally, when you select your columns, you specify the following Power Query M statement, usually through the UI action:

= Table.SelectColumns(Table_Name,{"Field1", "Field2",
"Field3"})

If Field3 is no longer returned by the data source, the query errors out. Why is that? Well the Table.SelectColumns command has three parameters. The third parameter is what action to take when a field is missing.

= Table.SelectColumns(Table_Name,{"Field1", "Field2",
"Field3"},ActionToTakeWhenFieldIsMissing)

There are four potential values for this third parameter.

  • MissingField.Error
  • MissingField.Ignore
  • MissingField.UseNull
  • MissingField.Type

MissingField.Error is the default so when a column is missing, an error is thrown, and the column is not returned as part of the dataset. The MissingField.Ignore option doesn’t throw the error but also doesn’t return the missing column. Unfortunately, these two options will break any reports that use that missing column.

Thankfully, there’s MissingField.UseNull. This option is the key to making your queries resilient to changes.  If a specified column is not found in the underlying data, it returns the column as a column of null values. The benefit is that your reports will continue to work, and it allows you to track down where the column is being used.

I’ve yet to find any documentation regarding MissingField.Type so that will be covered in a later blog post.

An Example

I’ll use the Projects oData feed from Project Online to illustrate the behavior. Project Online is an example of a product that allows Administrators to define and change custom metadata easily. If you are not reading this on the MarqueeInsights.com site, please go there for the original version. Consequently, model refresh failures are a common issue with Power BI and Project Online.

I’ve created a Project Custom Field called “A Test of Field Removal” in Project Online.

Set up of the Project level Custom Field in Project Online.

I’ve created a new query against the Project oData feed and selected five columns. I’ve specified the default MissingField.Error value. The values load normally as shown below.

Select columns using the default MissingField.Error value.

Now, let’s delete the column from Project Online and refresh Power BI. The result is this error.

Deleting the source column yields an error.

If we change the MissingField parameter to MissingField.UseNull, we get the following result instead. Note, the missing column is returned but now as a column of nulls.

Using MissingField.UseNull, the column is returned with nulls, allowing use to continue.

That’s all there is to it. Simply ensure you are using the correct parameter and you’ll see less data model issues due to renamed and/or removed columns.

Leave a Reply

Your email address will not be published. Required fields are marked *


The reCAPTCHA verification period has expired. Please reload the page.