Abstract and encapsulate with Power BI and SQL Server Table-Valued Functions – Use Case 2: Change results based on a database point in time and a user defined parameter

This is the second article in which I cover different use cases for using SQL Server Table-Valued Functions with Power BI.

In the previous article I showed you how a table valued function can be used to hide lower levels of a hierarchy based on user id. This is handy if you need to prevent certain users from breaking down aggregate measures while interacting with a report or dashboard in Power BI.

In this article I will show you how you can use table valued functions to filter the results based on a date range stored in your database and allow the report author to control how many years of data to bring back with an optional input parameter.

Prerequisites

Use case 2: Change results based on time

This use case comes in handy if you want your results to be filtered based on a date and time in your database and not on the date time of your Power BI users. This situation may occur in a globally distributed system in which Power BI users and the database they are querying are located in different time zones or in situations where the data lags behind the current date/time of your users. I will cover a situation where you want the data set to only show data up to a point in time stored in your database.

Using the Wide World Importers sample database, suppose you wanted to limit the orders that a person can report on to the last year in which there were orders in the database. Run the following query to see the range of order dates.

USE WideWorldImportersDW
GO

SELECT
MIN([Order Date Key]) AS [Earliest Order Date]
, MAX([Order Date Key]) AS [Latest Order Date]
FROM
WideWorldImportersDW.Fact.[Order]
GO

As you can see from the results, we have orders from 2013 up to 2016. If we were to use the Power BI relative date slicer and set it to only show data from the past 1 year, we would not see any results, because the relative date slicer is based on today’s date (April 6, 2019 at the time of writing) and not on a date in the database. One way to overcome this problem is to use a SQL Server table-valued function that encapsulates the logic to only show orders from June 1 2015 to May 31 2016. To do this we will create a new function using the code below.

Create Function

Run the following code in the Wide World Importers database to create a new function. We are including an optional parameter so that the report author can change how many years they want to go back when they connect to the data.

--DROP FUNCTION dbo.ufn_Orders_PastYear
--GO


CREATE FUNCTION dbo.ufn_Orders_PastYear(@NumberOfYears INT = NULL)
RETURNS TABLE
AS
RETURN
(
SELECT
*
FROM
WideWorldImportersDW.Fact.[Order] ord
WHERE
ord.[Order Date Key]
BETWEEN
(SELECT DATEADD(year, -ISNULL(@NumberOfYears, 1), MAX([Order Date Key])) FROM WideWorldImportersDW.Fact.[Order])
AND
(SELECT MAX([Order Date Key]) FROM WideWorldImportersDW.Fact.[Order])
);
GO

Notice that I negate the number of years by adding a negative sign in front of the ISNULL function in the select statement. This simplifies the report authoring experience, because the report author can enter a positive number of years. If no value is supplied, ISNULL falls back to 1 year. Next, we will make a DirectQuery connection to the function using Power BI.
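To make the date logic concrete, here is a minimal Python sketch of the window that the function computes. It is only an illustration and not part of the solution. The helper names add_years and order_window are made up, the sample dates are hypothetical, and the leap-day handling is my assumption about how best to mirror DATEADD.

```python
from datetime import date

def add_years(d, years):
    """Shift a date by whole years, clamping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Only Feb 29 can fail; fall back to Feb 28 of the target year.
        return d.replace(year=d.year + years, day=28)

def order_window(order_dates, number_of_years=None):
    """Return (start, end) anchored on the latest order date, not on today."""
    # Same idea as ISNULL(@NumberOfYears, 1): default to one year back.
    years_back = 1 if number_of_years is None else number_of_years
    latest = max(order_dates)
    return add_years(latest, -years_back), latest

# Hypothetical sample dates, not real rows from the database.
sample = [date(2013, 1, 1), date(2015, 6, 1), date(2016, 5, 31)]
window_default = order_window(sample)     # parameter left blank
window_three = order_window(sample, 3)    # report author asks for 3 years
print(window_default, window_three)
```

The important design choice is that the window is anchored on the latest date in the data, so the result does not depend on when or where the report is opened.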

Connect with Power BI

As before, connect to the SQL Server database using a DirectQuery connection and select the function ufn_Orders_PastYear from the list of database objects.

As you can see in the image above the function parameter @NumberOfYears appears in Power BI as an optional parameter. If you leave it blank and click apply it will pull back 1 year’s worth of data based on what is available in the database. You can enter in your own number to control how many years back you query the Orders fact table. Incorporating parameters is a very powerful way to give the report author control of the results.

Once the data has been loaded in let’s visualize it using a simple bar chart.

Your results should look like the image below.

As you can see in the chart, we only have data from 2015 to 2016. To make the chart a bit easier to read, let’s add a proper date hierarchy. We need to build it ourselves because we are using DirectQuery to access the data, so the autogenerated date hierarchies are not available. Those are only created when you import data into Power BI and set the data type to be a date.

Create Year Column

Create a new calculated column and use the following DAX code to pull out the year value from the Order Date Key field.

OrderYear = Year([Order Date Key])

Create Month Columns

Next, we will create two month columns: one will be used to sort and the other will be used to display the month name on the report.

Use the following DAX code to create a new month number calculated column.

OrderMonth = Month([Order Date Key])

Now create a new calculated column to store the month name using the following DAX code.

Order Month Name = 
SWITCH (
    [OrderMonth],
    1, "January",
    2, "February",
    3, "March",
    4, "April",
    5, "May",
    6, "June",
    7, "July",
    8, "August",
    9, "September",
    10, "October",
    11, "November",
    12, "December"
)

Now we need to set the sort by column property of the Order Month Name to use the value of the OrderMonth column. To do this navigate to the model viewer and click on the Order Month Name field and then set the Sort by column to OrderMonth.
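The reason this step matters is that a text column sorts alphabetically by default. The short Python sketch below is illustrative only (it is not DAX and not how Power BI works internally), but it shows the difference between the two orderings.

```python
# Month names in calendar order, matching the SWITCH expression above.
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]
month_names = {number: name for number, name in enumerate(MONTHS, start=1)}

# Default text sort: alphabetical, which puts April first.
alphabetical = sorted(month_names.values())

# Sorting the names by their month number restores calendar order.
by_month_number = [month_names[number] for number in sorted(month_names)]

print(alphabetical[:3])
print(by_month_number[:3])
```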

Now we will create a new Hierarchy based on OrderYear and Order Month Name.

Click on the chart and replace the Axis value with the new hierarchy we just created. Drill down a level to see the years and months.

As you can see, this makes the chart much easier to read. Now let’s insert some new data into the table and refresh the report.

Add some data

Run the following SQL to create new Date and Order records.

INSERT INTO [Dimension].[Date]
([Date]
,[Day Number]
,[Day]
,[Month]
,[Short Month]
,[Calendar Month Number]
,[Calendar Month Label]
,[Calendar Year]
,[Calendar Year Label]
,[Fiscal Month Number]
,[Fiscal Month Label]
,[Fiscal Year]
,[Fiscal Year Label]
,[ISO Week Number])
VALUES
('4/1/2019'
,1
,1
,'April'
,'Apr'
,4
,'CY2019-Apr'
,2019
,'CY2019'
,6
,'FY2019-APR'
,2019
,'FY2019'
,14)
GO
INSERT INTO [Fact].[Order]
([City Key]
,[Customer Key]
,[Stock Item Key]
,[Order Date Key]
,[Picked Date Key]
,[Salesperson Key]
,[Picker Key]
,[WWI Order ID]
,[WWI Backorder ID]
,[Description]
,[Package]
,[Quantity]
,[Unit Price]
,[Tax Rate]
,[Total Excluding Tax]
,[Tax Amount]
,[Total Including Tax]
,[Lineage Key])
VALUES
(45901
,0
,175
,'4/1/2019'
,'4/1/2019'
,76
,67
,9073
,NULL
,'April Fools ain''t no joke'
,'Each'
,3
,13.00
,15.00
,39.00
,5.85
,44.85
,9)
GO

Refresh the report and notice how it updates so that it only has one column.

This is because there is only 1 record between April 2018 and April 2019, the one-year window that ends at the new latest order date.
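To see why a single new row changes the result so much, here is a small Python sketch that applies the same kind of window to a handful of made-up order dates. The function name rows_in_window is hypothetical and the dates are not real rows from the sample database.

```python
from datetime import date

def rows_in_window(order_dates, years_back=1):
    """Keep only dates within years_back years of the latest date (inclusive)."""
    latest = max(order_dates)
    # Simple year shift; this sketch assumes the latest date is not Feb 29.
    start = latest.replace(year=latest.year - years_back)
    return [d for d in order_dates if start <= d <= latest]

orders = [date(2015, 8, 15), date(2016, 1, 10), date(2016, 5, 31)]
before = rows_in_window(orders)

orders.append(date(2019, 4, 1))  # the new April 2019 order
after = rows_in_window(orders)

print(len(before), len(after))
```

Because the window follows the latest order date, the older rows fall outside it as soon as the April 2019 order is added.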

Combining SQL Server database functions with Power BI is a powerful way to abstract and encapsulate logic in the database, thus simplifying the report author’s job and ensuring the right data is presented to report consumers.

Hopefully you have found this to be another practical post.

Until next time.

Anthony

Abstract and encapsulate with Power BI and SQL Server Table-Valued Functions – Use Case 1: Change results based on user

If you’ve ever required a dynamic data source in Power BI that can change based on who the user is, when they are querying the data source, or whether certain data elements have changed, you can leverage the ability of Power BI to connect to a table-valued function in SQL Server.

Table-valued functions allow you to abstract complex business logic from the report author and encapsulate it into a database object. This simplifies report building and enables you to do things like hide hierarchy levels, filter data based on a certain point in time stored in the database or check for certain data conditions and alter the query results as appropriate.

Prerequisites

  • SQL Server 2016 or later. You can download the SQL Server 2017 developer edition HERE
  • Wide World Importers sample database. A copy can be found HERE
  • Power BI Desktop. You can download the latest version from HERE

In this series of articles I will step through several use cases for direct queries from SQL Server Table-valued Functions in Power BI.

Use case 1: Change results based on user

For this first use case we will cover how you can embed some simple logic in your table-valued function to hide lower levels of a hierarchy. This is useful if you want to prevent certain individuals from breaking down aggregated values but still allow them to use data at a summary level.

Create user accounts

For the purposes of simplicity, we will create some users in the database using SQL Server authentication. Connect to your SQL Server database and execute the following SQL code.

USE [master]

GO
--Create Bob
CREATE LOGIN [Bob] WITH PASSWORD='Bob', DEFAULT_DATABASE=[WideWorldImportersDW], DEFAULT_LANGUAGE=[us_english], CHECK_EXPIRATION=OFF, CHECK_POLICY=OFF
GO

--Create Mary
CREATE LOGIN [Mary] WITH PASSWORD='Mary', DEFAULT_DATABASE=[WideWorldImportersDW], DEFAULT_LANGUAGE=[us_english], CHECK_EXPIRATION=OFF, CHECK_POLICY=OFF
GO

USE [WideWorldImportersDW]
GO

--Grant Bob access to WideWorldImportersDW
CREATE USER [Bob] FOR LOGIN [Bob] WITH DEFAULT_SCHEMA=[dbo]
GO

--Grant Mary access to WideWorldImportersDW
CREATE USER [Mary] FOR LOGIN [Mary] WITH DEFAULT_SCHEMA=[dbo]
GO

--Grant Bob read access to WideWorldImportersDW
ALTER ROLE db_datareader ADD MEMBER [Bob]
GO

--Grant Mary read access to WideWorldImportersDW
ALTER ROLE db_datareader ADD MEMBER [Mary]
GO

 

Next, we will create a function in SQL Server with the following code.

Create function

Use the code below to create a new Table-Valued function in SQL Server. The function is what we will directly connect to in Power BI.

CREATE FUNCTION dbo.ufn_Customer()  
RETURNS TABLE 
AS
RETURN   
(  
    SELECT  
       [Customer Key]
      ,[WWI Customer ID]
      ,[Customer]
      ,[Bill To Customer]
      ,[Category]
      ,CASE SYSTEM_USER
        WHEN 'Bob' THEN
            NULL
        WHEN 'Mary' THEN
            [Buying Group]
        ELSE
            [Buying Group]
        END AS [Buying Group]
      ,[Primary Contact]
      ,[Postal Code]
      ,[Valid From]
      ,[Valid To]
      ,[Lineage Key]
  FROM [WideWorldImportersDW].[Dimension].[Customer]
);  

GO

As you can see in the code above, I have created a function called dbo.ufn_Customer which returns the data from the Customer dimension table. In the code I have added a simple CASE expression that returns different data for the Buying Group based on who is executing the function.
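If it helps to see the branching outside of SQL, here is the same idea as a minimal Python sketch. The function name buying_group_for and the sample row are made up for illustration. Note also that Python string comparison is case-sensitive, whereas the SQL comparison depends on the collation in use.

```python
def buying_group_for(system_user, buying_group):
    """Mirror the CASE expression: Bob gets NULL (None), everyone else the real value."""
    if system_user == "Bob":
        return None
    return buying_group

# Made-up row, not taken from the Customer dimension.
row = {"Customer": "Example Customer", "Buying Group": "Example Group"}

as_bob = buying_group_for("Bob", row["Buying Group"])
as_mary = buying_group_for("Mary", row["Buying Group"])
print(as_bob, as_mary)
```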

Next we will bring this function into Power BI and see the results.

Connect with Power BI

Open Power BI and get data from SQL Server. Enter in the server name and select DirectQuery.

Click on OK. Log in using a Database account. First, we will try using Bob.

Click on connect and select the function ufn_Customer from the list of available objects.

If you wanted to force the report author to use the function rather than the actual customer table you can use database security to only expose the function and not the table. I typically use custom database roles and schemas because it is easier to manage and allows me to enable “data discovery with guard rails”.

Load the data in and create a simple hierarchy using the fields Category and Buying Group.

Drop a Matrix onto the canvas of the report and use the Category Hierarchy you just created for the rows and the Customer Key for the values.

You should have a report that looks similar to the image below. Notice how the Buying Group is null: for Bob, the function returns NULL instead of the Buying Group value.

Now let’s switch to Mary and see how the lower level values of the hierarchy appear in the report. Click on Home > Edit Queries > Data Source Settings. Select the data source that you are using for this report and click on Edit Permissions…

In the Edit Permissions pop-up menu click on Edit… then, under Database, enter Mary for both the user name and the password.

Click on Save and OK and Close. Refresh the report.

Notice how the lower levels of the hierarchy now appear in the matrix visual, because for Mary the actual Buying Group value is being returned as specified in the function.

NOTE: You need to refresh because of caching. When using this technique to obfuscate lower levels of a hierarchy, build your visuals so that the default view of the report is at an aggregated level, and minimize the amount of caching so that Power BI re-queries the source and updates the results appropriately.

If you want to add an enterprise semantic layer such as an Analysis Services Tabular model and still have the same dynamic results you will need to build your SSAS model using DirectQuery mode because the results of the model need to change and cannot be processed and stored in memory in advance of the user querying it.

In the next article I will cover how you can use a database function to curate the result set based on a date and time in your database.

Hopefully you have found this to be another practical post.

Until next time.

Anthony

Instant insights, automation and action – Part 6 Integrate Power BI, Power Apps, Azure Machine Learning and Dynamics 365 using MS Flow

This is the last article in a 6-part series in which I will explain how you can integrate Power BI, Power Apps, Azure Machine Learning and Dynamics 365 using MS Flow.

For reference here are the descriptions and links to the previous articles.

Instant insights, automation and action – Part 1 Create Power App

Instant insights, automation and action – Part 2 Create Azure Machine Learning Experiment

Instant insights, automation and action – Part 3 Create the Power BI Report

Instant insights, automation and action – Part 4 Register Power BI in Azure Active Directory

Instant insights, automation and action – Part 5 Integrate with MS Flow

In this article I will explain how you can kick off an MS Flow by adding an action to your Power App and then how you can integrate the Power App into a Power BI Dashboard. Data alerts can be tied to tiles in the Power BI Dashboard, and these alerts can kick off additional flows which insert records into Dynamics. The complete system is depicted in the diagram below.


Modify the Power App

In Part 1 of this series we created a simple app that allowed a user to enter new sales data. We now need to go back to this app and modify it. Navigate to Power Apps and edit the app.


Once the app is open click on the submit button to select it and then from the Action menu at the top select Flows.


This will open up a new pane in which you can select the flow that we created in Part 5 of this series. Once you have selected the flow enter the following code into the formula expression bar.

PowerApptoAzureMLtoPowerBIbkp.Run(NAME.Text, CHANNEL.Text, REGION.Text, FRESH.Text, MILK.Text, GROCERY.Text, FROZEN.Text, DETERGENT.Text, DELICASSEN.Text,CATEGORY.Text)


This will execute the flow and pass the data values from each of the text input boxes into the flow. You can test the flow by clicking on the play button in the top right-hand corner of the screen.

Save the app and publish it so that the new version with the flow attached to the submit button is available to integrate into Power BI.

Modify the Power BI Report

Next, we will need to modify the Power BI report to drop in a PowerApps visual. Open the Power BI report that we created in Part 3 and add a new custom visual from the marketplace. We need to add the Power App custom visual to the report.


Once the new visual has been successfully added we will add it to a new page in the report. In the Power BI report create a new page and call it Data Entry. We are doing this to keep the report clean and simple. We will integrate various visuals including the Power App in a Power BI Dashboard once we have finished putting the necessary polish in the report.

Drop the new visual onto the canvas of the new page in the report and add any field from the list of fields in the dataset; I used customer name. You should see a screen like the image below.


We are not creating or editing an app since we already built it in Part 1. Click ok and then select Choose app. Select the app we created for entering new wholesale customer sales data.


Click Add. You may see another warning about creating or editing the app; just ignore this by clicking ok.


The new report page should now look like the image below.


Rename Page 1 and call it Wholesale Customer Report. You can spruce up the first page to make it look more appealing. I modified my report to make it look like this.


Once you are happy with the design of the report you need to publish it to Power BI. You can replace the existing report that we created in Part 3. Once the report has been published, navigate to the cloud service and go to the report that you just published.

Build the Dashboard

It’s now time to build a dashboard. With the report open pin the following visuals to a new dashboard.


To pin a visual to a dashboard click on the visual and select the pin from the menu bar.


A menu like the one below will pop up. Give the new dashboard a name such as Wholesale customer dashboard.


Select pin to create and add the visual to the new dashboard. Repeat this for all of the card visuals in the report except instead of selecting New Dashboard select Existing dashboard and if not already selected pick the Wholesale customer dashboard that we just created.

Next, we will need to pin the Power App visual. Go to the Data Entry page and pin the Power App just like we did for the card visuals. If you are having trouble selecting the pin option you may need to edit the report to pin the visual.

Your dashboard should now look something like this.


Let’s rearrange the tiles and add some new visuals by using Q&A.

First add a new visual by typing the following questions in the Q&A bar at the top of the screen.

Fresh by customer sort by fresh

Pin the visual to the existing Wholesale customer dashboard.


Then place this at the bottom of the dashboard.

Repeat these steps using the following questions:

Milk by customer sort by milk

Grocery by customer sort by grocery

Frozen by customer sort by frozen

Detergent paper by customer sort by detergent paper

Delicassen by customer sort by delicassen

Your dashboard should now look similar to the image below.


Try adding a new customer by using the Power App embedded in the Power BI Dashboard. After you have entered data into each of the input boxes in the Power App, hit the submit button, and in about 5 seconds or less you should see the customer count go up and your new customer appear on the dashboard in real time. Also try entering a new customer but leave the Category field blank. Notice how, even though the field is blank, it is populated by the time it shows up in Power BI. That is because the Azure Machine Learning model is supplying this data.

Integrate with Dynamics 365

The last step is to add a data alert to one of the tiles which will create a record in Dynamics 365. Navigate to the dashboard if not already there and click the … in the top right hand corner of the Fresh tile.


Then select Manage alerts.


This will open a new menu on the right-hand side of the screen. From this screen click + Add alert rule. Create an alert that will fire once the Fresh goes above a certain value. In my case I used 60,000.


For the purposes of this tutorial an alert based on an absolute value is adequate. However, a better choice would be to create an alert on a relative value such as % change, since you do not want to have to go in and raise the threshold every time you surpass it. Click Save and close.
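The difference between the two kinds of rule can be sketched in a few lines of Python. This is only an illustration of the idea and not a description of how Power BI evaluates alerts. The function names absolute_alert and relative_alert and the sample values are made up.

```python
def absolute_alert(value, threshold):
    """Fire when the value rises above a fixed threshold."""
    return value > threshold

def relative_alert(previous, current, pct_change):
    """Fire when the value grows by more than pct_change percent."""
    if previous == 0:
        # Avoid dividing by zero; treat any growth from zero as a trigger.
        return current > 0
    return (current - previous) / previous * 100 > pct_change

# Fixed threshold of 60,000: once passed, every later reading is above it.
fixed = [absolute_alert(v, 60_000) for v in (59_000, 61_000, 75_000)]

# 10% change rule: only the large jump triggers, whatever the absolute level.
relative = [relative_alert(p, c, 10) for p, c in ((59_000, 61_000), (61_000, 75_000))]

print(fixed, relative)
```

With the fixed rule you would have to keep raising the threshold as sales grow, while the relative rule keeps working without maintenance.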

Go back to Manage alerts for this tile (Fresh) and this time select Use Microsoft Flow to trigger additional actions.



This will launch MS Flow. Use the default template to create a new flow triggered from a Power BI alert.


Use the template and select the Alert for Fresh from the Alert id drop down menu. Next select add new step and search for Dynamics 365. Then select Create a new record Dynamics 365.

Your flow should now look like this.


Enter the details for the Dynamics 365 tenant and select the Entity that you want a record created in. For my purposes I created a new task to follow-up with the customer by using the tasks entity. Save the flow and test it out by entering in new sales data using the Power App embedded in the Power BI report. If you have wired up the flow correctly a new record should be created in Dynamics 365 once you have triggered the data alert in your Power BI dashboard.

We have now reached the end of this series. Hopefully you have realized that by combining Power BI, Power Apps, Flow, Azure Machine Learning and Dynamics 365 you can open up new possibilities which lead to insights, automation and action at the speed of business.

Until next time.

Anthony


Instant insights, automation and action – Part 4 Register Power BI in Azure Active Directory

This is the fourth post in a series of articles in which I explain how to integrate Power BI, Power Apps, Flow, Azure Machine Learning and Dynamics 365 to rapidly build a functioning system which allows users to analyze, insert, automate and action data.

In the previous article I covered building the Power BI Report.

In this article I will cover how to enable data to be pushed into Power BI using Flow. This is a fast, no-code solution.


This is a one-time setup that is required in order to use the Power BI connector in MS Flow. If you do not do this step you will see an error screen in MS Flow like the screen clip below.


Prerequisites

In order to complete this tutorial, you will need permission to register applications in your Azure Active Directory tenant.


For more information on the Azure AD Tenant you can click the following link.

https://docs.microsoft.com/en-us/power-bi/developer/create-an-azure-active-directory-tenant

Power BI Development Center

Log onto the Power BI Development Center and enable API features and get the key to register the app in Azure.

Go to the following URL and sign in.

https://dev.powerbi.com/apps


Enter a meaningful name for your app. I called mine AnthonysPowerBIApp, but you can call yours whatever you would like. Choose Native for the Application Type and select Read all datasets and Read and write all datasets for the API Access.


Click on Register. A screen like the one below should pop up. Be sure to copy down the Application ID as this is needed to register the application in Azure.


Azure Portal

Next, log onto the Azure portal using the following URL: https://portal.azure.com/#home

Once in the portal admin page, navigate to the Azure Active Directory menu blade.


Next click on App registrations and select the app that we created using the Power BI Development Center.


You can change settings in the app if you wish to tailor it by clicking on Properties.

Now that the Power BI App has been registered in Azure Active Directory you can use it in various Microsoft cloud services such as Flow.


As you can see in the image above, I no longer get a permission error and I am able to select the workspace, dataset and table.


In the next post we will build out the flow so that data is passed from the Power App to an Azure Machine Learning experiment for scoring and then into the Power BI API Enabled Dataset for real-time analytics.

Hopefully you have found this to be another practical post.

Until next time

Anthony

References

Here is the official documentation from Microsoft on how to register Power BI to push data into it using REST API calls.

https://docs.microsoft.com/en-us/power-bi/developer/overview-of-power-bi-rest-api


Instant insights, automation and action – Part 3 Create the Power BI Report

This is the third post in a series of articles in which I explain how to integrate Power BI, Power Apps, Flow, Azure Machine Learning and Dynamics 365 to rapidly build a functioning system which allows users to analyze, insert, automate and action data.

In the previous article I covered building the Power App. In this article I will cover the Power BI report.

We will build out this system in the following order: Power App, Azure Machine Learning, Power BI and then last MS Flow to connect the components. Before you can begin this tutorial there are some required prerequisites.

Prerequisites

  • Power Apps
  • MS Flow
  • Power BI Pro or Premium
  • Access to Azure Active Directory to register Power BI App
  • Dynamics 365

Create the API Enabled Dataset

Log onto Power BI and create a new app workspace called Customer Segmentation. This step is not required; however, if you are like me you create a lot of different content, so it’s a good habit to get into so that you can better manage your work.


In case you are wondering the screen clip above is using the new App Workspace experience. Next, we will create a new streaming data set.

On the splash page for the app click Skip at the bottom right corner of the page.


Now select +Create > Streaming dataset.


Select API and click next.


Next create the WholeSaleCustomer dataset.

It will have the following field names and data types:

Field Name          Data Type
Customer Name       Text
Channel             Number
Region              Number
Fresh               Number
Milk                Number
Grocery             Number
Frozen              Number
Detergents_Paper    Number
Delicassen          Number
Category            Number


Click the Create button to generate the dataset.

Next, we will leverage the generated PowerShell script to create some test records in our newly formed dataset. Click on PowerShell and copy the code into Notepad.


We will create three test records by running the PowerShell code below. Modify the code you copied into Notepad so that it looks similar to the code below. Before you can run this you will need to replace <Your Key> with the key displayed in your Power BI service.

$endpoint = "https://api.powerbi.com/beta/8c17d9d4-2652-4573-8a9c-d5dde0750715/datasets/13b74183-5eb2-480b-ba11-c0af0ecbdd26/rows?key=<Your Key>"

$payload = @{
"Customer Name" ="Test1"
"Channel" =1
"Region" =1
"Fresh" =98.6
"Milk" =98.6
"Grocery" =98.6
"Frozen" =98.6
"Detergents_Paper" =98.6
"Delicassen" =98.6
"Category" =0
}


Invoke-RestMethod -Method Post -Uri "$endpoint" -Body (ConvertTo-Json @($payload))
$payload = @{
"Customer Name" ="Test2"
"Channel" =2
"Region" =2
"Fresh" =98.6
"Milk" =98.6
"Grocery" =98.6
"Frozen" =98.6
"Detergents_Paper" =98.6
"Delicassen" =98.6
"Category" =1
}


Invoke-RestMethod -Method Post -Uri "$endpoint" -Body (ConvertTo-Json @($payload))
$payload = @{
"Customer Name" ="Test3"
"Channel" =3
"Region" =3
"Fresh" =98.6
"Milk" =98.6
"Grocery" =98.6
"Frozen" =98.6
"Detergents_Paper" =98.6
"Delicassen" =98.6
"Category" =2
}


Invoke-RestMethod -Method Post -Uri "$endpoint" -Body (ConvertTo-Json @($payload))

To do this launch PowerShell in Administrator mode and copy and paste the code into the PowerShell desktop app.
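If you prefer Python to PowerShell, the same request can be sketched with the standard library. This is a hedged sketch and not code generated by Power BI. The ENDPOINT value is a placeholder that you would replace with the push URL shown for your own dataset, the helper name build_request is made up, and the request is only built here, not sent.

```python
import json
import urllib.request

# Placeholder only: paste the push URL for your own dataset, including its key.
ENDPOINT = "https://api.powerbi.com/beta/WORKSPACE_ID/datasets/DATASET_ID/rows?key=YOUR_KEY"

def build_request(row, endpoint=ENDPOINT):
    """Build (but do not send) a POST whose body is a JSON array with one row."""
    body = json.dumps([row]).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

row = {
    "Customer Name": "Test1", "Channel": 1, "Region": 1,
    "Fresh": 98.6, "Milk": 98.6, "Grocery": 98.6, "Frozen": 98.6,
    "Detergents_Paper": 98.6, "Delicassen": 98.6, "Category": 0,
}
request = build_request(row)

# To actually push the row you would call: urllib.request.urlopen(request)
print(request.get_method(), json.loads(request.data)[0]["Customer Name"])
```

As in the PowerShell version, the row is wrapped in an array before it is converted to JSON.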


The data set now has three records in it and you can start to use it in Power BI. To do this go to the dataset and click the three dots beside the name of the dataset. This will open a new report with a blank canvas. Add a table and drop all of the fields from the data set into the visual.


As you may notice from the screen shot above the fields Fresh, Milk, Grocery, Frozen, Detergents_Paper and Delicassen are not formatted as currency but should be. Unfortunately, API enabled data sets only have three data types Text, Number and Date and no formatting options so we cannot specify that these fields are currency fields.

Thankfully we can leverage the Report level measures for live connections to Analysis Services tabular models & Power BI service datasets feature that was released in May 2017 to add new measures with the proper currency data type defined.

Instant insights, automation and action – Part 2 Create Azure Machine Learning Experiment

This is the second post in a series of articles in which I explain how to integrate Power BI, Power Apps, Flow, Azure Machine Learning and Dynamics 365 to rapidly build a functioning system which allows users to analyze, insert, automate and action data.

In the previous article I covered building the Power App. In this article I will cover the Azure Machine Learning Studio Experiment.

We will build out this system in the following order: Power App, Azure Machine Learning, Power BI and then last MS Flow to connect the components. Before you can begin this tutorial there are some required prerequisites.

Prerequisites

  • Power Apps
  • MS Flow
  • Power BI Pro or Premium
  • Access to Azure Active Directory to register Power BI App
  • Dynamics 365

Build the Azure Machine Learning Experiment

The Azure Machine Learning Studio platform is a powerful cloud service from Microsoft that allows data scientists to rapidly build and deploy machine learning experiments. For the purpose of brevity, we will leverage an existing template from the Azure AI Gallery. The Azure AI Gallery is a great resource for creating and learning about Machine Learning experiments in the Microsoft platform.

Weehyong Tok from Microsoft created an experiment that segments customers based on the dataset Wholesale customers Data Set from UCI Machine Learning Repository which is perfect for our purposes.

You can find the experiment here https://gallery.azure.ai/Experiment/Customer-Segmentation-of-Wholesale-Customers-3 .

Open the experiment in the Azure Machine Learning Studio by clicking on Open in Studio. Be sure to log in using the same account that you used to build the Power App.

This will launch the Azure Machine Learning Studio platform and create an experiment for you based on Weehyong Tok’s template. You may notice a prompt saying that the experiment has to be updated; click ok.

This is because the Assign to Clusters module has been deprecated and replaced by a new module called Assign Data to Clusters. Thankfully the upgrade takes care of the necessary changes and we can use the experiment as is without having to modify it.

Click the Run button at the bottom of the page.

Once the experiment has finished running click on the output of the Assign to Cluster module and select Visualize from the drop down menu.

As you can see in the image the data is grouped into clusters.

This experiment uses the K-Means clustering algorithm to assign the data points to groups. As you can see in the image below it currently uses 2 centroids which essentially means that each row will be assigned to 1 of 2 groups based on the distance of the data points in the row to the centroid.
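To illustrate what "assigned based on distance to the centroid" means, here is a minimal Python sketch of the assignment step only. It does not include the part of K-Means that repeatedly moves the centroids, and it is not the code that the Studio module runs. The function name assign_to_clusters and the sample points are made up.

```python
import math

def assign_to_clusters(points, centroids):
    """Return, for each point, the index of the closest centroid (Euclidean distance)."""
    assignments = []
    for point in points:
        distances = [math.dist(point, centroid) for centroid in centroids]
        assignments.append(distances.index(min(distances)))
    return assignments

# Made-up two-dimensional data and two centroids, purely for illustration.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 9.0), (9.0, 8.5)]
centroids = [(1.0, 1.5), (8.5, 9.0)]

assignments = assign_to_clusters(points, centroids)
print(assignments)
```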

Modify the experiment to determine the optimum number of centroids

Now you may wonder if this is the optimal number of clusters or not. Thankfully we can use an elbow chart to help determine the optimal number of centroids. To do this we will add a Python module to drop some code into our experiment.

Search for the Execute Python Script and drop it onto the canvas of the experiment. Connect the first output (the one on the left) of the Split Data module to the first input of the Execute Python Script module. Your experiment should look as follows.

Now you will need to add the following code to the Execute Python Script module. Replace the generated code with the code below.

Python Code

# The script MUST contain a function named azureml_main
# which is the entry point for this module.

# imports up here can be used to 
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from math import sin, cos, sqrt, atan2
from sklearn.cluster import KMeans
from sklearn import metrics
from scipy.spatial.distance import cdist

# The entry point function can contain up to two input arguments:
#   Param<dataframe1>: a pandas.DataFrame
#   Param<dataframe2>: a pandas.DataFrame
def azureml_main(dataframe1 = None, dataframe2 = None):

    # Execution logic goes here
    #print('Input pandas.DataFrame #1:\r\n\r\n{0}'.format(dataframe1)) #We don't need this, we just want the visual.

    colors = ['b','g','r']
    markers = ['o','v','s']

    distortions = []
    centroids = range(1, 10)
    for i in centroids:
        # Fit K-Means once for each candidate number of centroids
        kmeanModel = KMeans(n_clusters=i).fit(dataframe1)
        # Distortion: average distance from each row to its nearest centroid
        distortions.append(sum(np.min(cdist(dataframe1, kmeanModel.cluster_centers_, 'euclidean'),axis=1))/dataframe1.shape[0])

    plt.plot(centroids, distortions, 'bx-')
    plt.xlabel('Number of centroids')
    plt.ylabel('Distortions')
    plt.title('Elbow chart showing the optimal number of centroids')
    plt.show()

    plt.savefig("elbow.png") #To see the chart in Azure Machine Learning Studio we need to save the image as a png.

    # If a zip file is connected to the third input port,
    # it is unzipped under ".\Script Bundle". This directory is added
    # to sys.path. Therefore, if your zip file contains a Python file
    # mymodule.py you can import it using:
    # import mymodule

    # Return value must be of a sequence of pandas.DataFrame
    return dataframe1,

Run the experiment and click on the second output, Python device (Dataset), of the Python Script module and select visualize. You should see something like the image below.

The optimal number of centroids is at the “elbow” of the chart above which looks to be about 5. Based on this insight we will update the algorithm and change the number of centroids to 5. We will also increase the number of iterations to 500 since we have more centroids.

Run the experiment and click on the output of the Assign to Cluster module and select Visualize from the drop down menu. The output should look like the image below.

Next, we will convert this experiment into a Predictive Web Service. At the bottom of the screen select Predictive Web Service > Predictive Web Service [Recommended]

Once the predictive experiment has been setup, we are going to modify it slightly so that it only returns the Assignment field. To do this we need to drop in the Select Columns in Dataset module and place it between the Assign to Clusters module and the Web service output.

Launch the column selector and enter in the Assignments column as the only value to get passed through to the web service output.

Run the experiment and Deploy Web Service.

This concludes the second part of this series. Next, we will build the API enabled dataset in Power BI which will store the data that we will use in the Power BI Reports and Dashboards. Since the dataset is API enabled we can push data into it using Flow.

Hopefully you have found this to be another practical post.

Until next time

Anthony

References

@Python Programming has a good site for understanding the Python code to plot an elbow chart.

https://pythonprogramminglanguage.com/kmeans-elbow-method/ 

 

Featured

Transforming data into value one blog post at a time

 

 

Thanks for joining me!

My name is Anthony Bulk and I am passionate about making data useful. Whether it’s through data storytelling, business intelligence, artificial intelligence or data storage I will cover it all.

Come join me on this journey down data alley!

Here are the topics that I have covered so far…

Last updated March 15, 2019.

Instant insight, automation and action using Power Apps, Power BI, Flow and Azure Machine Learning

Extend your information reach without over stretching by virtualizing data using SQL Server 2019 and MongoDB


“Without data you’re just another person with an opinion.”

W. Edwards Deming

 “If I had only one hour to save the world, I would spend fifty-five minutes defining the problem, and only five minutes finding the solution.”

Albert Einstein