Analyzing Office 365 Photo EXIF data using Power BI

In an earlier post, I wrote about how to mine the OCR data from Office 365 document folders using Power BI. During the research for that post I discovered that the photo EXIF data is also available for analysis from Office 365. I’ll show you how to get to it in this post.

What is EXIF data anyway?

Think of EXIF as special metadata for photos. With today's cameras, it records the camera settings as well as the location where the photo was taken, if the camera has GPS. You can read more about the format here: https://photographylife.com/what-is-exif-data. Many smartphones today automatically include EXIF data when you take photos. Office 365 automatically extracts this data from photos when they are uploaded to a document library.

Why do I care about this?

Imagine you work for a company that does inspections. Your inspectors already use their phones to take photos of issues. Wouldn't it be great to show the photos and the related location data together on a report page? This technique allows you to easily mine that EXIF data.

How do I get to it?

First, upload your images to a SharePoint document folder in Office 365 and the metadata extraction process will start automatically. I've had it take up to 15 minutes to process, so you may need to be patient. You can do this via the SharePoint mobile app if you are uploading photos from your phone.

Second, from Power BI desktop, connect to the document folder using the SharePoint Online List connector. By doing so, you’ll get access to the correct column that contains the EXIF data. Once in the dataset, you can use Power Query M to parse the data and start analyzing.

Demo

In this video, I’ll show you how to access the EXIF data and what you can do with the data.

Here’s the Power Query M Code

let
    // Connect to the SharePoint Online site that hosts the document library.
    Source = SharePoint.Tables("https://YourInstance/YourSite", [ApiVersion = 15]),
    // Select the document library by its list Id and retrieve its items.
    #"8d19b1eb-42b8-4843-b721-fc1e8ef47688" = Source{[Id="88888888-8888-8888-8888-888888888888"]}[Items],
    #"Renamed Columns" = Table.RenameColumns(#"8d19b1eb-42b8-4843-b721-fc1e8ef47688",{{"ID", "ID.1"}}),
    // Keep only items the media service has tagged, created on or after the cutoff date.
    #"Filtered Rows" = Table.SelectRows(#"Renamed Columns", each [MediaServiceAutoTags] <> null and [Created] >= #datetime(2019, 4, 19, 0, 0, 0)),
    #"Expanded FirstUniqueAncestorSecurableObject" = Table.ExpandRecordColumn(#"Filtered Rows", "FirstUniqueAncestorSecurableObject", {"Url"}, {"FirstUniqueAncestorSecurableObject.Url"}),
    #"Removed Other Columns" = Table.SelectColumns(#"Expanded FirstUniqueAncestorSecurableObject",{"FirstUniqueAncestorSecurableObject.Url", "File", "Properties"}),
    #"Expanded File" = Table.ExpandRecordColumn(#"Removed Other Columns", "File", {"Name", "ServerRelativeUrl", "TimeCreated", "TimeLastModified"}, {"File.Name", "File.ServerRelativeUrl", "File.TimeCreated", "File.TimeLastModified"}),
    // Combine the two URL parts into a single Picture URL column.
    #"Merged Columns" = Table.CombineColumns(#"Expanded File",{"FirstUniqueAncestorSecurableObject.Url", "File.ServerRelativeUrl"},Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Picture URL"),
    // vti_mediaservicemetadata holds the extracted EXIF data as a JSON string.
    #"Expanded Properties" = Table.ExpandRecordColumn(#"Merged Columns", "Properties", {"vti_mediaservicemetadata"}, {"Properties.vti_mediaservicemetadata"}),
    #"Parsed JSON" = Table.TransformColumns(#"Expanded Properties",{{"Properties.vti_mediaservicemetadata", Json.Document}}),
    // Expand the address and GPS location records into individual columns.
    #"Expanded Properties.vti_mediaservicemetadata" = Table.ExpandRecordColumn(#"Parsed JSON", "Properties.vti_mediaservicemetadata", {"address", "location"}, {"Properties.vti_mediaservicemetadata.address", "Properties.vti_mediaservicemetadata.location"}),
    #"Expanded Properties.vti_mediaservicemetadata.location" = Table.ExpandRecordColumn(#"Expanded Properties.vti_mediaservicemetadata", "Properties.vti_mediaservicemetadata.location", {"altitude", "latitude", "longitude"}, {"Properties.vti_mediaservicemetadata.location.altitude", "Properties.vti_mediaservicemetadata.location.latitude", "Properties.vti_mediaservicemetadata.location.longitude"}),
    #"Expanded Properties.vti_mediaservicemetadata.address" = Table.ExpandRecordColumn(#"Expanded Properties.vti_mediaservicemetadata.location", "Properties.vti_mediaservicemetadata.address", {"City", "State", "Country", "EnglishAddress"}, {"Properties.vti_mediaservicemetadata.address.City", "Properties.vti_mediaservicemetadata.address.State", "Properties.vti_mediaservicemetadata.address.Country", "Properties.vti_mediaservicemetadata.address.EnglishAddress"}),
    // Set the column data types.
    #"Changed Type" = Table.TransformColumnTypes(#"Expanded Properties.vti_mediaservicemetadata.address",{{"File.Name", type text}, {"Picture URL", type text}, {"File.TimeCreated", type datetime}, {"File.TimeLastModified", type datetime}, {"Properties.vti_mediaservicemetadata.address.City", type text}, {"Properties.vti_mediaservicemetadata.address.State", type text}, {"Properties.vti_mediaservicemetadata.address.Country", type text}, {"Properties.vti_mediaservicemetadata.address.EnglishAddress", type text}, {"Properties.vti_mediaservicemetadata.location.altitude", type number}, {"Properties.vti_mediaservicemetadata.location.latitude", type number}, {"Properties.vti_mediaservicemetadata.location.longitude", type number}})
in
    #"Changed Type"

How to query Project Online multiple value fields in Power BI

Power BI

This post addresses a need that arises when querying Project Online data via OData. It can also serve as a solution template for other data sources.

Synopsis

A number of projects in our portfolio have impacts in different countries. These countries are designated via a project-level multi-value (MV) custom field named Country. When multiple values are selected from the dropdown list, OData returns a comma-delimited list of values. For example, you could see Brazil, Canada, United States returned as a single field value.

Challenge

One of the challenges is that this list can be any number of values and will grow and contract over time. Therefore, the technique should automatically adjust to the underlying maximum number of specified values.

Approach

For each MV field, a new data set representing all values in the MV field will be added to the data model. This new MV field data set will be a child to the original Project master data set. The relationship between the Project set and the MV set will be 1:N.

An inline function will also be used so that it can be executed for each record in the data set. This function will perform the split of values into separate columns.

Once split, the values will be unpivoted, creating a record for each ProjectId, MV Field value pair.
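To make the approach concrete, here is a minimal Power Query M sketch of the split-and-unpivot step. This is not the code from the downloadable PBIX (which wraps the split in an inline function); it assumes a hypothetical ProjectData query containing only a ProjectId column and a comma-delimited Country MV column.

let
    // Hypothetical source: ProjectId plus the comma-delimited Country MV field from OData.
    Source = ProjectData,
    // Find the largest number of values in any row so the split adjusts automatically.
    MaxValues = List.Max(List.Transform(Source[Country], each if _ = null then 0 else List.Count(Text.Split(_, ",")))),
    // Split the MV field into MaxValues columns (Country.1, Country.2, ...).
    #"Split Country" = Table.SplitColumn(Source, "Country", Splitter.SplitTextByDelimiter(",", QuoteStyle.None), MaxValues),
    // Unpivot so each ProjectId / Country value pair becomes its own row; null cells are dropped.
    #"Unpivoted Values" = Table.UnpivotOtherColumns(#"Split Country", {"ProjectId"}, "Attribute", "Country"),
    // Trim stray spaces left by the ", " delimiter and drop the helper column.
    #"Trimmed Values" = Table.TransformColumns(#"Unpivoted Values", {{"Country", Text.Trim, type text}}),
    Result = Table.RemoveColumns(#"Trimmed Values", {"Attribute"})
in
    Result

The resulting child table can then be related back to the Project master data set on ProjectId as a 1:N relationship.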

Potential Issues

  • Incorrect data set relationships. Initially, Power BI created the MV data set with a 1:1 relationship instead of 1:N. Once the relationship was corrected, all errors went away.
  • You will need to watch refresh performance as you are basically querying the same data multiple times within the model.

Procedure

Register here for this free session to see the step by step video and to download the PBIX file containing the data model with sample code. http://academy.tumbleroad.com/courses/how-to-parse-multi-value-columns-in-power-bi

Using Project Online, Power BI Alerts and Microsoft Flow for Custom Alerts

This is so cool! I just received an alert about a data condition in Project Online from Microsoft Flow. The best thing is that this condition isn't anything Microsoft provides out of the box. With the functionality now available in Flow and Power BI, we can construct our own custom alerts.

With recent updates, Microsoft Power BI added the ability to set alerts on the card visual. You can read more about the details here. One of the settings allows an email to be sent with a distinct subject. That got me thinking since I knew from playing with Microsoft Flow that you can connect it to Office 365 Outlook and drive flows from specific emails. Flow also provides a Push Notification function so that I can push a notification to the Flow app on my iPhone.

It seems like all of the pieces I need are there to create my own custom notification.

First, I have a Project Online instance where I've got projects and people assigned to those projects. I'm using Project to manage capacity, so knowing when the total number of overallocations is growing helps me react accordingly.

Project Web App

Second, using our Marquee Project Dashboards product, I have a count of Overallocated resources already available. I’m using Power BI Pro, so my data model refreshes automatically multiple times a day.

Marquee Dashboard

Third, I’ve created my own dashboard and I pinned the count of Overallocated Resources to it. Once the card is on your dashboard, you can click the ellipsis menu … and access the alert function (bell icon).

When the Alert panel opens, I turned it on, changed the alert title, and set my threshold. You can also choose whether alerts are sent at most once an hour or once a day. The key is to check the box that says to send an email. This gives us a way to use Flow to act upon the alert.

Now, we go to http://flow.microsoft.com. Once you log in (because you already have an account, right?), go to My Flows and select Create a New Flow. You are going to create a two-step flow: the first step watches your email for a specific sender and subject, and the second step sends the notification.

  • Type Outlook into the Search window to see the Outlook events.
  • Select Office 365 Outlook – When a new email arrives
  • Ensure the connection is to the right account that will receive the alert email
  • Click Show Advanced Options
  • I filled in mine to look like this. Note the From and Subject Filter values.

  • Now click New Step
  • Click Add an Action
  • Type Push into the search window
  • Select Push Notification – Send a push notification
  • I filled out mine to look like the following. You could have the URL set to bring you back to Power BI or Project. I chose for simplicity to bring me to Outlook.

I would have loved to have an SMS message generated, but that would have required a Twilio account, which I don't have at the moment. Since I was just experimenting, I took the free route and loaded the Flow app on my iPhone. The push notification would then show up there on my phone.

Once set up,

  • I updated a Project Online project plan to create a new resource overallocation and published it.
  • Power BI automatically refreshed the dataset, and the number of overallocated resources increased.
  • This value change triggered a Power BI alert and sent an alert email to my Outlook inbox.
  • Flow picked up the alert and fired off the Flow.
  • The push notification was sent to my phone and bam, there it is on my Apple Watch.

What do you think? What have you created with Flow and Power BI? Tell me in the comments below! I hope you found this useful.

Free Marquee™ for Google Sheets Template demo

Microsoft Power BI isn't just for getting data from Microsoft products. The PBIX demo file, which you can get once you register below, allows you to query data from your Google Sheet into Power BI and then share the resulting reports and dashboards via PowerBI.com with co-workers, or with the world if you desire. If you have the Power BI mobile app, you now have your Google Sheets data on the go.

Demo File Only

This PBIX is provided as a demo only, with no support or warranty offered as a result. Testing was only sufficient for a demo and not for production use. You may encounter errors in your environment when using this model in its current state. You are welcome to expand the solution. If you do, please add to the comments below so that we can all share in your experience.

Note: the PBIX file only connects to the first tab in your Google Sheet.

Google Sheets API Oddities

This was an interesting project, as Google Sheets doesn't have the same concept of a table that Excel does. Therefore, there are two conditions you may encounter for which we don't yet have a good solution.

First, you shouldn’t have a blank column heading. This will cause the model to error out on the last data transformations as Power BI expects column headings to be present.

Second, the Google Sheets API doesn't appear to return null cells that fall in the last column of your sheet. Since the cells are returned as a flat list and we fold that list every X rows, missing cells throw off the row count and fold points. As a workaround, we recommend ensuring the last column of data has values in all cells.
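To illustrate why the fold matters, here is a minimal, hypothetical M sketch of the folding step; the names and sample values are made up, and the demo file's actual query is more involved.

let
    // Hypothetical flat list of cell values returned by the API, in row-major order.
    ColumnCount = 3,
    CellValues = {"Item", "Region", "Amount",
                  "Widget", "West", "100",
                  "Gadget", "East", "250"},
    // Fold the flat list into rows of ColumnCount cells each. If the API omits trailing
    // null cells in the last column, every subsequent fold point shifts by one, which is
    // why we recommend keeping the last column fully populated.
    Rows = List.Split(CellValues, ColumnCount),
    // The first folded row holds the column headings; the rest are data rows.
    Result = Table.FromRows(List.Skip(Rows, 1), List.First(Rows))
in
    Result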

Send me the data model!

 

Setup

You need three pieces of data in order to use this PBIX file; the sketch after the list below shows where they plug in.

  • The number of columns in the Sheet
  • Your Spreadsheet ID
  • Your Browser API Key
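As a rough illustration of where these values plug in, here is a minimal M sketch. It is not the exact query in the demo file, and it assumes the Google Sheets v4 values endpoint and a first tab named Sheet1.

let
    // Hypothetical parameter values; replace with your own.
    SpreadsheetID = "YourSpreadsheetID",
    BrowserAPIKey = "YourBrowserAPIKey",
    // Request the first tab's values, authenticating with the browser API key.
    Url = "https://sheets.googleapis.com/v4/spreadsheets/" & SpreadsheetID & "/values/Sheet1?key=" & BrowserAPIKey,
    Source = Json.Document(Web.Contents(Url)),
    // "values" is a list of row lists; the first row holds the column headings.
    Values = Source[values],
    Result = Table.FromRows(List.Skip(Values, 1), List.First(Values))
in
    Result

The demo file also takes the number of columns so that it can fold the flat cell list it retrieves into rows, as described under Google Sheets API Oddities above.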

Steps to get your SpreadsheetID

  • Navigate to your Sheets page.
  • The spreadsheet ID is the long string in the URL between /d/ and /edit, as in the example below.
    • https://docs.google.com/spreadsheets/d/1gLzc8AxdlUl1MPY4t2ATKjc1UfBNj7iUaHRexwLYSKQ/edit#gid=0

Steps to get your Browser API Key

  • Log into your Google account
  • Click this link to start the wizard.
  • Click Continue button to create a new project.
  • Unfortunately, the wizard makes you create an OAuth client ID in the following steps, even though we won't use it.
  • Click Go to credentials button.
  • Select Web Browser (Javascript) in the Where will you be calling the API from?
  • Select User Data under What data will you be accessing?
  • Click What credentials do I need? button
  • Click Create client ID button
  • Enter a Product name
  • Click Continue button.
  • Click Done button.
Now to create the credential we need for this to work.
  • Click the Create credentials button.
  • Select API key.
  • Select Browser key.
  • Give it a name and click the Create button.
  • Copy the API key and paste it into the BrowserAPIKey parameter.

Setting Up Your PBIX File for Use

Once you receive your PBIT file, do the following.
  • You must have Power BI Desktop installed prior to performing this procedure.
  • In File Explorer, double-click on the Google Spreadsheet Template – Final.pbit file.
  • Power BI Desktop will open and you will be presented with this dialog.
  • Fill in the values and click the OK button.
  • The model will refresh and it should load your Google data.

Setting Up Scheduled Refresh on PowerBI.com

Once you have saved the model, verified the data and built your reports, you can publish this model to PowerBI.com. Once there, you can set it up to automatically refresh the data so that any reports and dashboards are up to date.

Procedure for Scheduled Refresh

  • In Power BI Desktop, click File, Save to save the model
  • Click Publish
  • If you aren’t signed into Power BI, you’ll be prompted to do so.
  • You may be prompted for the location to publish. My Workspace is the default
  • Once done, go to PowerBI.com.
  • Follow the procedure in the video below.
  • Navigate to the Datasets in the left navigation to start the process.
  • Note: the API key you entered earlier in the model acts as your credential, which is why the data source is set to Anonymous in PowerBI.com.
Send me the data model!

Microsoft Project Online Resource Management Overview

Want to know more? Register for our Resource Management webinar: http://bit.ly/TalkResMgmt

This two-minute video provides a high-level look at the resource management capabilities within Project Online and Project Server 2013. All content was created using out-of-the-box functionality. The demo also shows the use of team resources as containers for assigned, but unscheduled, work.

 

Controlling Chaos: Calculating Your Project Contingency Budget

I was sitting in my graduate-level statistics class when it hit me: Expected Value calculations could be used to solve my project contingency budget problem!

My Quandary

Earlier that same day, I was sitting in a project review with my senior management. The project team had identified a number of risks with the project. I included a contingency budget task in the schedule based on that qualitative risk assessment. The management challenged me on this inclusion and said the team was artificially inflating the estimates.

The project itself involved a lot of moving pieces, with external vendors, timed deliveries of equipment and geographically dispersed personnel. The project ran nine months, ending in October. We had factored in things like vendor delays, employee sick time, etc. I thought the risk analysis was a reasonable precaution. However, management said “Prove it!”

I Need Days, Not Rankings

Every risk assessment article I had seen at the time involved the use of a qualitative risk ranking (High, Medium, and Low). This qualitative assessment didn’t meet my needs to prove that the amount of contingency was correct.

What I needed was a way to quantify the risk in days, so that I could create a task of X days to track the contingency. It was also the first time this organization had seen this type of risk analysis so the analysis needed to be effective but not overly complex.

Let’s Get Statistical

Back in my 3½ hour statistics class, we also reviewed the Pareto principle, which holds that roughly 80% of your outcome impacts are likely the result of 20% of your events. We also discussed Expected Value calculations and Normal Distribution, and how all of these techniques could be used together.

Normal Distribution, which you may know as the "Bell Curve", occurs in many settings and charts the probability distribution for a given scenario. Every point on the curve has an associated Z-score, which can be used to create an Expected Value result for a specific occurrence.

The Idea

My epiphany was that by identifying a small number of project risks and calculating the Expected Value of each risk in days, the sum of the Expected Value outcomes could be used as the duration for my contingency task. The sample should be big enough to cover most of the potential project variance.

Risks are a way of trying to quantify variance in your schedule. Each task finish date can be thought of as a little bell curve and the sum of those individual task finish variances decides where your final project finish date occurs.

The idea seemed to have merit. I did some reading to validate it and found that disaster recovery planners do a similar calculation when assessing risk. They also add an expiration date for a given event and a number of potential occurrences for a given time frame. Expiration dates are needed, for example, if you have a risk that a vendor will not deliver some equipment on time. Once the equipment is delivered, the risk expires as it is no longer relevant.

Implementation Challenges

Imagine my dilemma. How the heck am I going to explain this concept to my team without it sounding like a lot of work?

Another consideration is that most people don’t think of probability in terms of a number. They use language like:

  • Very Unlikely
  • Unlikely
  • Possible
  • Likely
  • Very Likely

You may be familiar with the word Sigma, as in Six Sigma. Sigma is a measure of variance around a mean. Roughly 95% of all outcomes will occur between -2 sigma and +2 sigma; anything beyond +/- 2 sigma is exceptionally unlikely. Each sigma point has a corresponding Z-score, from which you can derive the probability of an event occurring at that point on the curve.

To make this user friendly, I mapped the language terms above to the Z-Scores at the corresponding -2, -1, 0, 1, and 2 Sigma points.

My core calculation is [Expected impact in days if the risk occurs] * [Likelihood it will occur, from the Z-score mapping] * [Number of possible occurrences in the time period], applied only while the risk's expiration date has not passed. I needed to capture this in a spreadsheet at the time, and it needed to be easily understandable to the user.
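For illustration only (the probability values here are assumptions, not the exact mapping from my spreadsheet): suppose a vendor-delay risk has an expected impact of 5 days, the team rates it Possible (say, a probability of 0.5), and it can occur once before its expiration date. Its contribution to the contingency task would be 5 x 0.5 x 1 = 2.5 days. Summing that contribution across every active risk on the list gives the duration of the contingency task.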

The Tool

The resulting spreadsheet captured the risk, the impact measured in days, the expiration date, and a dropdown for the likelihood that the risk would occur. We formulated a risk list of ten items and found our calculations added two days of exposure to our original estimate of contingency. Ten items seemed like a sufficient sample without creating a lot of additional work to formulate the list.

For example, one of our vendors was in Miami and had a key deliverable in late August. I grew up on the Gulf Coast and knew this was peak hurricane season. If a hurricane hit the area, they would have a week of downtime.

Originally, we were thinking this was an unlikely event. One of the team members pointed out that the National Weather Service was predicting a higher than normal number of hurricanes for the season. The team then upgraded the Risk rating to Possible. The risk was then documented as shown in the table below. We did this for each of the risks and the sum of the values was the duration of the contingency task.

Risks Screenshot

The Result

The new analysis was introduced in the next management meeting. They were dubious but they allowed us to use it in our project. As risks occurred, we kept track of them and used days from the contingency budget. We encountered a number of issues along the way, some anticipated and a number that were not.

We ended the project only one day over our contingency budget date. Considering we had 28 days of contingency, the management reaction to a 1-day slip was much more muted than it would have been to a 29-day slip. We also knew why we consumed the 28 days of contingency, which gave management confidence that the situation was being actively managed.

Summary

I've used this basic technique successfully on other projects, where we were able to increase project on-time rates from 35% to 70%. This technique also gets your team in the right mindset, as the analysis is reviewed at every status meeting and gets them thinking about how to address risks proactively.

This post is part of the Chaos and the Cubicle Hero series. Other posts can be found here, here, and here.

 

Have an immediate Project Online or Business Intelligence need? We now offer coaching and advisory services for Project Online, Project Server, and Business Intelligence via Clarity.fm, which allows you to request specific times to meet so that we can discuss your immediate need: https://clarity.fm/trebgatte

Project Tasks are Your Lowest Priority

Project tasks are the lowest priority work you have on any given day. Wait, what?

It's true! Strategically, we know project work is the most important future investment for the company. Yet when you break down what you do every day, you'll see that you are fitting project work in around the other work you have to do. It's frustrating. You know you could be doing more, and someone clearly thought you had the time to get this work done.

If you don't believe the premise, imagine the following scenario. You are staying late at the office to get some project work completed. Your manager's manager sees that you are in the office, comes over, and asks you to do a task for tomorrow morning. If your answer is "I'm sorry, but I can't because I really need to get this project work completed," their response will determine the relative priority of project work in your environment. For some, rejecting the task would be a career-limiting move.

Perhaps then, we are asking the wrong question when it comes to resource capacity management. Instead of asking whether this resource has free capacity to do the work, shouldn’t we be asking if the resource has enough consolidated free time to work on project work? If they do not, what can we do to remedy this situation?

In my "Done in 40" webinar, we discussed recent research by time-tracking software companies that identified how the top 10% of productive employees work in an agile fashion. These employees typically work 52 minutes and then take a 17-minute break away from the work. This is consistent with studies of ultradian body rhythms from the '90s and '00s, which showed that your focus naturally waxes and wanes on a 1.5-2 hour cycle. These work sprints can make you very productive and help reduce mistakes and rework.

I’ve personally tried the sprint approach and I can say, it works well for me. I use a timer app on my Pebble watch to monitor my sprints. Fifty minutes is roughly the time where the mind starts wandering to “Did Joe ever respond to my email?” or “Is there coffee?”. Three sprints enable the top three daily tasks to get done easily.

The catch is that you need 69 uninterrupted minutes to complete a personal sprint. This leads us back to the question: does a resource have consolidated availability? Yes, they may have 3 hours available that day, but if it's in 15-minute increments, that time isn't usable.

When a client with project throughput issues engages my services, I find it’s usually not a project management issue. Many times, the lack of consolidated availability is preventing the project work from happening. If you are interrupted every 10 minutes, as are most office workers in the United States, it’s very hard to get work done. If you are having issues getting projects through the pipe, perhaps it’s time to look beyond your projects and to your operational work processes.

We spend the majority of our energy providing oversight and processes on projects, which are a minority of the work, instead of doing the same for operational work. McKinsey recently released a white paper showing that most operational spend goes to keeping the company running; new projects are a small portion of the overall effort. Yet we don't monitor operational work holistically the way we do projects. Perhaps it's time we start.

Project management processes are very helpful and needed. We’ve worked out how to anticipate and reduce risk and how to deliver the reward. We need to apply these approaches to how we manage all work. It’s the operational work that provides the overall context within which we do our project work. If improperly managed, it also constricts our ability to get our project work done. Operational work management improvements could yield the biggest benefit by enabling the consolidation of availability, yielding more usable time for project work.

If you are interested in finding out more about the specific techniques and how to use Microsoft Project to support this need, sign up here and get the recording link to the full “Done In 40” webinar.