{"id":22580,"date":"2021-01-22T11:18:56","date_gmt":"2021-01-22T11:18:56","guid":{"rendered":"https:\/\/www.experfy.com\/blog\/visualize-your-nested-iot-data-3d-spark-power-bi\/"},"modified":"2023-09-05T13:31:49","modified_gmt":"2023-09-05T13:31:49","slug":"visualize-your-nested-iot-data-3d-spark-power-bi","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/iot\/visualize-your-nested-iot-data-3d-spark-power-bi\/","title":{"rendered":"How To Visualize Your Nested IoT Data In 3D Using Spark And Power BI"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"22580\" class=\"elementor elementor-22580\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-ea9356d elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"ea9356d\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-bd75f88\" data-id=\"bd75f88\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-3eab068 elementor-widget elementor-widget-text-editor\" data-id=\"3eab068\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p class=\"has-medium-font-size\">Using Azure Databricks, Python, Spark and Power BI python script visuals<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-af7969b elementor-widget elementor-widget-heading\" data-id=\"af7969b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">A. Introduction<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-054fd74 elementor-widget elementor-widget-text-editor\" data-id=\"054fd74\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"42e7\">Every now and then, you run into new unique problems to solve. This time it was a client getting nested IoT data. Storing and visualizing IoT data is usually a standard task, but getting nested IoT data as a \u201cmatrix\u201d per message with corresponding vectors is not as straightforward as usual.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-3d15a6b elementor-widget elementor-widget-heading\" data-id=\"3d15a6b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\">In this post, we will learn:<\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-727eed3 elementor-widget elementor-widget-text-editor\" data-id=\"727eed3\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ul><li>how to transform nested IoT data in a scalable way<\/li><li>how to create custom python visualizations in Power BI<\/li><li>how data can be consumed and visualized by non-technical end-users through a web browser<\/li><\/ul>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-cc199d4 elementor-widget elementor-widget-text-editor\" data-id=\"cc199d4\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>We will use Python, Spark and Power BI to solve our problem in some simple steps.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-436e2e0 elementor-widget elementor-widget-heading\" data-id=\"436e2e0\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">B. Data Structure<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-0c684fb elementor-widget elementor-widget-text-editor\" data-id=\"0c684fb\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"c246\">First, let\u2019s take a look at what the data looks like when it comes from the IoT devices:<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-8f4d1ea elementor-widget elementor-widget-image\" data-id=\"8f4d1ea\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img fetchpriority=\"high\" decoding=\"async\" width=\"694\" height=\"274\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1aUQzJU8g2rWgT95u9YsO3g.png\" class=\"attachment-large size-large wp-image-18509\" alt=\"Spark And Power BI - An array with arrays forming the matrix in a Spark dataframe.\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1aUQzJU8g2rWgT95u9YsO3g.png 694w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1aUQzJU8g2rWgT95u9YsO3g-300x118.png 300w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1aUQzJU8g2rWgT95u9YsO3g-610x241.png 610w\" sizes=\"(max-width: 694px) 100vw, 694px\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">Example IoT data structure in the spark DataFrame. Image by Author.<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-26ed98f elementor-widget elementor-widget-text-editor\" data-id=\"26ed98f\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"9473\">In the picture above we can see a subset of the IoT data that we will focus on to solve the specific problem with a \u201cmatrix\u201d type of data per message\/row. The IoT data contains many other columns of standard types, but we will only look at how to handle the complex types in this post.<\/p>\n<p id=\"dbf9\">Each position and value in the \u201cmatrix\u201d are connected to a specific x_value and y_value. See the picture below.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-eec26cc elementor-widget elementor-widget-image\" data-id=\"eec26cc\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" width=\"470\" height=\"350\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1f9L_H7GFb7gDYJXvMc2FSw.png\" class=\"attachment-large size-large wp-image-18510\" alt=\"Spark And Power BI- A matrix with x- and y-vales belonging to each value in the matrix.\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1f9L_H7GFb7gDYJXvMc2FSw.png 470w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1f9L_H7GFb7gDYJXvMc2FSw-300x223.png 300w\" sizes=\"(max-width: 470px) 100vw, 470px\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">For every x- and y-value there exists one z-value. Image by author.<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-2cacc3a elementor-widget elementor-widget-text-editor\" data-id=\"2cacc3a\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<pre class=\"wp-block-preformatted\">z = f(x,y)<br>1 = f(0,40)<br>3 = f(40,40)<br>2 = f(100,75)<\/pre>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-23b5e37 elementor-widget elementor-widget-text-editor\" data-id=\"23b5e37\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"b625\">This is how we get the raw data from the IoT devices. In reality, the matrix is much larger, but for simplicity we use a matrix of size 5&#215;5 here. The approach we will use will also work for larger matrices.<\/p>\n\n<p id=\"d889\">For every combination of x and y we have one z value. Our task is to transform the data so it can be used and visualized in Power BI. BI-tools can\u00b4t consume the raw data in the compact form we receive from the IoT devices. <a href=\"https:\/\/www.experfy.com\/blog\/bigdata-cloud\/how-to-pick-the-best-business-intelligence-tool\/\" target=\"_blank\" rel=\"noreferrer noopener\">BI-tools<\/a> usually want tabular data. So we want to create a regular table looking like this:<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e45ed8b elementor-widget elementor-widget-text-editor\" data-id=\"e45ed8b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<pre class=\"wp-block-preformatted\">[id ][X-value] [y-value] [z-value] [any meta-data-column(s)]<br>1000    0        10         0            meta-data<br>1000   20        30         2            meta-data<br>... <\/pre>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-f611e1a elementor-widget elementor-widget-heading\" data-id=\"f611e1a\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">C. Transforming the data<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4e5c1ab elementor-widget elementor-widget-text-editor\" data-id=\"4e5c1ab\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"c314\">We want to transform the raw source data, where every row consists of:<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-1f10e04 elementor-widget elementor-widget-text-editor\" data-id=\"1f10e04\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ul><li>a matrix of z-values<\/li><li>an array of x-values<\/li><li>an array of y-values<\/li><\/ul>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-efd6280 elementor-widget elementor-widget-text-editor\" data-id=\"efd6280\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"9b2d\">into a regular table of unique rows making up all possible combinations of the data.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4509224 elementor-widget elementor-widget-heading\" data-id=\"4509224\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\">C0. A note about Spark<\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-bf9e1f5 elementor-widget elementor-widget-text-editor\" data-id=\"bf9e1f5\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"e7a7\">In this example, we will work with a limited amount of data. In a production environment, we often have several Terabytes of data, so we want to use a technology that scales with the incoming IoT data. Therefore we will look at how to solve this problem using Spark.<\/p>\n\n<p id=\"36d7\">We will use&nbsp;<a href=\"https:\/\/azure.microsoft.com\/en-us\/services\/databricks\/\" target=\"_blank\" rel=\"noreferrer noopener\">Azure Databricks<\/a>&nbsp;and Spark to transform the data. We will have our data in a Spark DataFrame. We can see above that the matrix is stored as arrays in an array in the Spark DataFrame.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-74c4c10 elementor-widget elementor-widget-heading\" data-id=\"74c4c10\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">C1. Generate test data<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-928a51c elementor-widget elementor-widget-text-editor\" data-id=\"928a51c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"f032\">Let\u2019s generate some test data to work with.<\/p>\n<p id=\"1b26\">This gives us a spark DataFrame looking like this:<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-298c124 elementor-widget elementor-widget-image\" data-id=\"298c124\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" width=\"702\" height=\"286\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1htgsvRG8w9hRL7sPdwDPaQ.png\" class=\"attachment-large size-large wp-image-18511\" alt=\"A Spark dataframe containing the matrix and x- and y-vales.\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1htgsvRG8w9hRL7sPdwDPaQ.png 702w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1htgsvRG8w9hRL7sPdwDPaQ-300x122.png 300w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1htgsvRG8w9hRL7sPdwDPaQ-610x249.png 610w\" sizes=\"(max-width: 702px) 100vw, 702px\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">Spark DataFrame with IoT data. Image by author.<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-0550596 elementor-widget elementor-widget-text-editor\" data-id=\"0550596\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"fe29\">We get the IoT data in a compact form as described above Z = f(x,y), but Power BI does not understand that form so we need to create a regular table of the data. Spark has built-in functions to help us. The function we are after in this case is:<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-9135599 elementor-widget elementor-widget-text-editor\" data-id=\"9135599\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<pre class=\"wp-block-preformatted\">posexplode(col)<\/pre>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ca7a892 elementor-widget elementor-widget-text-editor\" data-id=\"ca7a892\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"116c\">In short\u00a0<em>posexplode<\/em>\u00a0will create a new column with the exploded data and additional rows for the column we chose to explode. Also, we will get an extra column with the position\/index of where the explode comes from in the original data structure. See docs\u00a0here.<\/p>\n\n<p id=\"b1b7\">We need to get every single value of the \u201cmatrix\u201d on its own row with the index of the x- and y-coordinate so that we can match the correct z-value with the correct x- and y-value.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-aea1a6d elementor-widget elementor-widget-image\" data-id=\"aea1a6d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"903\" height=\"283\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1vWwTZmhB_ojcUwpwAKKrpg.png\" class=\"attachment-large size-large wp-image-18512\" alt=\"A display of a Spark dataframe after the first posexplode.\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1vWwTZmhB_ojcUwpwAKKrpg.png 903w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1vWwTZmhB_ojcUwpwAKKrpg-300x94.png 300w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1vWwTZmhB_ojcUwpwAKKrpg-768x241.png 768w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1vWwTZmhB_ojcUwpwAKKrpg-610x191.png 610w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1vWwTZmhB_ojcUwpwAKKrpg-750x235.png 750w\" sizes=\"(max-width: 903px) 100vw, 903px\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\"> We explode each array to a new column called \u201dvalues\u201d. We also store the index. Image by author.<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-1a06fb8 elementor-widget elementor-widget-text-editor\" data-id=\"1a06fb8\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"0c2c\">As we can see in the display of the first posexplode above we have exploded the first array of the column \u201cmatrixdata\u201d in a new column called \u201cvalues\u201d. We also store from what row of the \u201cmatrix\u201d we have exploded the data in the column called \u201cy_index\u201d. I have kept the \u201cmatrixdata\u201d column to make it easier to follow, even though we technically don\u2019t need the column anymore.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-06c47ee elementor-widget elementor-widget-heading\" data-id=\"06c47ee\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">Next, we need to posexplode the resulting values:<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-77c06b4 elementor-widget elementor-widget-image\" data-id=\"77c06b4\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"272\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1hOMEQ3nvybFruARwUoWimw-1024x272.png\" class=\"attachment-large size-large wp-image-18513\" alt=\"A display of a Spark dataframe after the second posexplode.\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1hOMEQ3nvybFruARwUoWimw-1024x272.png 1024w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1hOMEQ3nvybFruARwUoWimw-300x80.png 300w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1hOMEQ3nvybFruARwUoWimw-768x204.png 768w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1hOMEQ3nvybFruARwUoWimw-610x162.png 610w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1hOMEQ3nvybFruARwUoWimw-750x199.png 750w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1hOMEQ3nvybFruARwUoWimw-1140x303.png 1140w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1hOMEQ3nvybFruARwUoWimw.png 1149w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">posexplode of the exploded values column. Keeping the x_index. Image by author.<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-89febb6 elementor-widget elementor-widget-text-editor\" data-id=\"89febb6\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"95f2\">So, in the end, we get z_value with x_index and y_index. The last part is to match the x_index with the same index in the x_values column and the same for the y_index with the y_values column.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-fa9aa88 elementor-widget elementor-widget-image\" data-id=\"fa9aa88\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"564\" height=\"246\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/107clme-EqyCaWHVk0FUhxw.png\" class=\"attachment-large size-large wp-image-18514\" alt=\"Spark And Power BI- A table containing ID, x-, y- and z-values.\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/107clme-EqyCaWHVk0FUhxw.png 564w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/107clme-EqyCaWHVk0FUhxw-300x131.png 300w\" sizes=\"(max-width: 564px) 100vw, 564px\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">Final table to be consumed in power BI. Image by author.<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-934e2f6 elementor-widget elementor-widget-text-editor\" data-id=\"934e2f6\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"5577\">Now we have a regular table with our data that can be consumed by Power BI.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-19258eb elementor-widget elementor-widget-heading\" data-id=\"19258eb\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">D. Exporting the data<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-68a1c9a elementor-widget elementor-widget-heading\" data-id=\"68a1c9a\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">D0. Export to Data Lake<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-c671afd elementor-widget elementor-widget-text-editor\" data-id=\"c671afd\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"1951\">Let\u2019s export the data to our Storage Account as a CSV file. We will use Pandas to export the Spark DataFrame to a named CSV. For larger amounts of data, we would use Spark directly. The variable&nbsp;<em>path_export<\/em>&nbsp;holds the mounted path in our Data Lake. Later we will load the CSV file from Power BI.<\/p>\n<p id=\"1490\">We need to preface the mounted path with \/dbfs\/ when we work with&nbsp;<a href=\"https:\/\/docs.databricks.com\/data\/databricks-file-system.html#local-file-apis\" target=\"_blank\" rel=\"noreferrer noopener\">single node libraries<\/a>&nbsp;and local file API.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-a8f390c elementor-widget elementor-widget-heading\" data-id=\"a8f390c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">D1. Alternative local export<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-98d53dd elementor-widget elementor-widget-text-editor\" data-id=\"98d53dd\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"6d04\">If we don\u2019t have a data lake account we can, for test purposes, export the CSV to our local computer and import the CSV into Power BI from the local file. Azure Databricks comes with a DBFS filesystem that is included in every workspace. We can export our CSV to DBFS and download the CSV file from the DBFS file system using Databricks CLI. See&nbsp;<a href=\"https:\/\/docs.microsoft.com\/en-us\/azure\/databricks\/dev-tools\/cli\/\" target=\"_blank\" rel=\"noreferrer noopener\">here<\/a>&nbsp;how to install the Databricks CLI. Then we can download the file to our local computer with a CLI command like:<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-3704654 elementor-widget elementor-widget-text-editor\" data-id=\"3704654\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<pre class=\"wp-block-preformatted\"># Usage: databricks fs cp [OPTIONS] SRC DSTdatabricks fs cp dbfs:\/path_we_exported_to\/heatmap_from_spark.csv destinationpath<\/pre>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-c48213e elementor-widget elementor-widget-text-editor\" data-id=\"c48213e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"cef2\">After that, we can&nbsp;<a href=\"https:\/\/docs.microsoft.com\/en-us\/power-bi\/connect-data\/desktop-connect-csv\" target=\"_blank\" rel=\"noreferrer noopener\">load the CSV file<\/a>&nbsp;into the Power BI desktop.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-3f09d82 elementor-widget elementor-widget-heading\" data-id=\"3f09d82\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">D2. A note about Pandas<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-2c56cf7 elementor-widget elementor-widget-text-editor\" data-id=\"2c56cf7\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"18b3\">We do not have any \u201cbig data\u201d matrices and arrays so we allow us to use Pandas to export our Spark DataFrame as a named CSV. We could have exported the data as CSV directly from Spark in a scalable way, but we would then get CSV files named by spark and that would require some extra work before loading the data in Power BI. So we choose to use Pandas here as it fulfills our needs in this case. Also if we would have a huge amount of data we would not export the data as CSV. In that case, we could consume the data directly via direct queries from Power BI to Databricks. (Databricks and Microsoft have optimized the connection between Power BI and Databricks so the performance is much better nowadays than it used to be some time ago. Also, passthrough authentication via&nbsp;<a href=\"https:\/\/docs.microsoft.com\/en-us\/azure\/databricks\/integrations\/bi\/power-bi#access-azure-databricks-using-the-power-bi-service\" target=\"_blank\" rel=\"noreferrer noopener\">SSO<\/a>&nbsp;is supported, but that is out of scope for this post.)<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ff0be91 elementor-widget elementor-widget-heading\" data-id=\"ff0be91\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">E. Visualization in Power BI<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-025975f elementor-widget elementor-widget-text-editor\" data-id=\"025975f\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"7efc\">We now have our data as a table and want to visualize the data in Power BI. We can&nbsp;<a href=\"https:\/\/docs.microsoft.com\/en-us\/power-query\/connectors\/datalakestorage\" target=\"_blank\" rel=\"noreferrer noopener\">load data from a Data Lake Gen 2 account<\/a>&nbsp;directly from Power BI. We can then use the built-in visualizations or download additional add-ons to visualize the data. In this example, we will use a python script visual from within Power BI. This allows us to create a custom visual in the way we want using Python.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-5f4f8bc elementor-widget elementor-widget-heading\" data-id=\"5f4f8bc\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">E0. Making python visuals work in Power BI<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-37e0ae4 elementor-widget elementor-widget-text-editor\" data-id=\"37e0ae4\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"3284\">The idea of using python from within Power BI is simple. We get a Pandas DataFrame into our script and then we can use selected python libs to visualize our data. It\u2019s also possible to publish our report with the python visuals to the Power BI service online.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-70d5f25 elementor-widget elementor-widget-heading\" data-id=\"70d5f25\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h4 class=\"elementor-heading-title elementor-size-default\">Set up Power BI desktop to work with python visuals:<\/h4>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-d657e27 elementor-widget elementor-widget-text-editor\" data-id=\"d657e27\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ul><li>1. This step is only needed if you are using a non-Windows computer like a MAC or Linux. Install Windows on a Virtual Machine or make sure you have access to a Windows installation as Power BI desktop requires Windows.<\/li><li>2. Install Power BI desktop if needed. Can be&nbsp;<a href=\"https:\/\/powerbi.microsoft.com\/en-us\/downloads\/\" target=\"_blank\" rel=\"noreferrer noopener\">installed<\/a>&nbsp;from the Microsoft Store.<\/li><li>3. Make sure you have the correct python environment matching Power BI python requirements. The python version and supported versions of packages are listed here:&nbsp;<a href=\"https:\/\/docs.microsoft.com\/en-us\/power-bi\/connect-data\/service-python-packages-support\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/docs.microsoft.com\/en-us\/power-bi\/connect-data\/service-python-packages-support<\/a><br>When setting up Python on the VM (which I did as I use a MAC) I noticed a&nbsp;<a href=\"https:\/\/developercommunity.visualstudio.com\/content\/problem\/1207405\/fmod-after-an-update-to-windows-2004-is-causing-a.html\" target=\"_blank\" rel=\"noreferrer noopener\">problem<\/a>&nbsp;with NumPy and Windows 10 that forced me to use the 32-bit version of python, as a workaround, to make it work.<\/li><li>4. Use the suggested versions in the link above and create the python environment using&nbsp;<a href=\"https:\/\/www.anaconda.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">anaconda<\/a>.<\/li><\/ul>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-9630c8d elementor-widget elementor-widget-text-editor\" data-id=\"9630c8d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<pre class=\"wp-block-preformatted\">conda create \u2014name powerbi377 python=3.7.7<br>activate powerbi377<br>pip install pandas==1.0.1<br>pip install matplotlib==3.2.1<\/pre>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-c6f9394 elementor-widget elementor-widget-text-editor\" data-id=\"c6f9394\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ul><li>5. Point Power BI to our python environment.<br>Start Power BI.<br>Select File -&gt; Option and settings-&gt; Options -&gt; Python Scripting<br>Set the path to the anaconda environment we created.<\/li><\/ul>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-1f1af51 elementor-widget elementor-widget-image\" data-id=\"1f1af51\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"307\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1N2VyqNifs-6bd71wcpccYg-1024x307.png\" class=\"attachment-large size-large wp-image-18515\" alt=\"Python script settings in Power BI. Spark And Power BI\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1N2VyqNifs-6bd71wcpccYg-1024x307.png 1024w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1N2VyqNifs-6bd71wcpccYg-300x90.png 300w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1N2VyqNifs-6bd71wcpccYg-768x230.png 768w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1N2VyqNifs-6bd71wcpccYg-610x183.png 610w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1N2VyqNifs-6bd71wcpccYg-750x225.png 750w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1N2VyqNifs-6bd71wcpccYg-1140x341.png 1140w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1N2VyqNifs-6bd71wcpccYg.png 1202w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">Set Power BI python home directory. Image by author.<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-c7387db elementor-widget elementor-widget-text-editor\" data-id=\"c7387db\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<pre class=\"wp-block-preformatted\">Detected home directories: Other \n\nSet Python Home Directory: C:\\Users\\YOURUSERNAME\\Anaconda3\\envs\\powerbi377<\/pre>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-795dc29 elementor-widget elementor-widget-text-editor\" data-id=\"795dc29\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"3787\">Now we are ready to do some python coding in Power BI by simply dropping the python visual:<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-a3f5383 elementor-widget elementor-widget-image\" data-id=\"a3f5383\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"350\" height=\"372\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1wgLkrucIJAYW09i0aO7W3w.png\" class=\"attachment-large size-large wp-image-18516\" alt=\"Spark And Power BI visualizations.\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1wgLkrucIJAYW09i0aO7W3w.png 350w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1wgLkrucIJAYW09i0aO7W3w-282x300.png 282w\" sizes=\"(max-width: 350px) 100vw, 350px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ffc0bbc elementor-widget elementor-widget-image\" data-id=\"ffc0bbc\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"54\" height=\"54\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/11yBhgEt2u3L5lLFffakgog.png\" class=\"attachment-large size-large wp-image-18517\" alt=\"Python script icon.\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">Built in python visual in Power BI. Image by author.<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-491e35a elementor-widget elementor-widget-text-editor\" data-id=\"491e35a\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"c0e5\">to the canvas. And you will get a script editor with the data as a Pandas DataFrame called dataset. You decide what fields of the table to use. In our case, we dropped x_value, y_value and z_value onto our canvas Py-script.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-db115b4 elementor-widget elementor-widget-image\" data-id=\"db115b4\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"361\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/129i190foI42x_-BSSWX9bQ-1024x361.png\" class=\"attachment-large size-large wp-image-18518\" alt=\"Python script template in Power BI.\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/129i190foI42x_-BSSWX9bQ-1024x361.png 1024w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/129i190foI42x_-BSSWX9bQ-300x106.png 300w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/129i190foI42x_-BSSWX9bQ-768x271.png 768w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/129i190foI42x_-BSSWX9bQ-610x215.png 610w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/129i190foI42x_-BSSWX9bQ-750x264.png 750w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/129i190foI42x_-BSSWX9bQ-1140x402.png 1140w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/129i190foI42x_-BSSWX9bQ.png 1164w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">We get a Pandas DataFrame with the columns we select. Image by author.<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4590221 elementor-widget elementor-widget-text-editor\" data-id=\"4590221\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"3479\">Now we can visualize our data using python and e.g.&nbsp;<a href=\"https:\/\/matplotlib.org\/\" target=\"_blank\" rel=\"noreferrer noopener\">Matplotlib<\/a>. Below is an example of plotting our data as a bar3d visualization, but you could choose the visualization you\u2019d like.<\/p>\n<p id=\"d98e\">We are using in the Power BI desktop client, but we want our end-users to be able to consume the data online in a browser. To achieve that we can&nbsp;<a href=\"https:\/\/docs.microsoft.com\/en-us\/power-bi\/create-reports\/desktop-upload-desktop-files\" target=\"_blank\" rel=\"noreferrer noopener\">publish<\/a>&nbsp;our report to the Power BI service and let our end-users consume the report in the browser:<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-2b283d0 elementor-widget elementor-widget-image\" data-id=\"2b283d0\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t<figure class=\"wp-caption\">\n\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"913\" src=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1Bu-u2ZVR76dadQrkhDjAhQ-1024x913.png\" class=\"attachment-large size-large wp-image-18519\" alt=\"A web browser with Python 3d visualization of our data.\" srcset=\"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1Bu-u2ZVR76dadQrkhDjAhQ-1024x913.png 1024w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1Bu-u2ZVR76dadQrkhDjAhQ-300x267.png 300w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1Bu-u2ZVR76dadQrkhDjAhQ-768x684.png 768w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1Bu-u2ZVR76dadQrkhDjAhQ-610x544.png 610w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1Bu-u2ZVR76dadQrkhDjAhQ-750x668.png 750w, https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/1Bu-u2ZVR76dadQrkhDjAhQ.png 1093w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/>\t\t\t\t\t\t\t\t\t\t\t<figcaption class=\"widget-image-caption wp-caption-text\">Published in-browser python script visualization using the Power BI service. Image by author.<\/figcaption>\n\t\t\t\t\t\t\t\t\t\t<\/figure>\n\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ece5cb1 elementor-widget elementor-widget-heading\" data-id=\"ece5cb1\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">E.Conclusion<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6db71a3 elementor-widget elementor-widget-text-editor\" data-id=\"6db71a3\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"caad\">When we have nested structures of IoT data we can use built-in Spark scalable transformation to get the data in the shape we need.<\/p>\n\n<p id=\"482f\">We can consume the transformed data in Power BI by exporting the data to a data lake and loading it in Power BI or by using direct query.<\/p>\n\n<p id=\"b666\">If we want to visualize our data in Power BI in a way that is not supported by the built-in visualizations or by 3rd party extensions we can use python scripting visuals to visualize our data.<\/p>\n\n<p id=\"d80d\">We need to carefully set up our python environment to match the versions supported in the Power BI service.<\/p>\n\n<p id=\"1a9c\">We can publish our custom python visuals to the Power BI service and share them with others so they can consume the reports online using a web browser.<\/p>\n\n<p id=\"2a5f\">A notebook (.ipynb) with the source code above can be found&nbsp;<a href=\"https:\/\/github.com\/blendax\/Databricksnotebooks\/blob\/master\/notebooks\/Blog%20IoT%20Matrix%20Matrices%20Posexplode%20explode.ipynb\" target=\"_blank\" rel=\"noreferrer noopener\" class=\"broken_link\">here<\/a>.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>Storing and visualizing IoT data is usually a standard task, but getting nested IoT data as a \u201cmatrix\u201d per message with corresponding vectors is not as straightforward as usual.<\/p>\n","protected":false},"author":1031,"featured_media":18520,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[195],"tags":[1277,836,114,921],"ppma_author":[3748],"class_list":["post-22580","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-iot","tag-iot-data","tag-power-bi","tag-python","tag-spark"],"authors":[{"term_id":3748,"user_id":1031,"is_guest":0,"slug":"mikael-hermansson","display_name":"Mikael Hermansson","avatar_url":"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/Mikael-Hermansson-150x150.jpeg","user_url":"https:\/\/news.microsoft.com\/","last_name":"Hermansson","first_name":"Mikael","job_title":"","description":"Mikael Hermansson is Senior Cloud Solution Architect, Data &amp; AI at Microsoft."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22580","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/1031"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=22580"}],"version-history":[{"count":4,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22580\/revisions"}],"predecessor-version":[{"id":32401,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22580\/revisions\/32401"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/18520"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=22580"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=22580"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=22580"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=22580"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}