The Apress Certification Study Companion Series offers guidance and hands-on practice to support technical and business professionals who are studying for an exam in pursuit of an industry certification. Professionals worldwide seek certifications to advance in a career role, reinforce knowledge in a specific discipline, or apply for or change jobs. This series focuses on the most widely taken certification exams in a given field. It is designed to be user friendly, tracking the topics as they appear in a given exam. Authors in this series are experts and instructors who not only possess a deep understanding of the content but also have experience teaching the key concepts that support readers in the practical application of the skills learned in their day-to-day roles.
More information about this series at https://link.springer.com/bookseries/17100
This Apress imprint is published by the registered company APress Media, LLC, part of Springer Nature.
The registered company address is: 1 New York Plaza, New York, NY 10004, U.S.A.
This book is dedicated to my husband who has always supported me in my endeavors.
I cannot honestly say that I have always wanted to write a book. What I can say is that I have relied on books written by others throughout my learning journey with Power BI. Call me “old-school,” but having the physical document available to write notes on, refer to, and reread helps me master a concept, technique, or idea. When Apress approached me with the idea of writing a book about the PL-300, I was excited and nervous. Excited because it would give me a chance to “pay it forward”—to help others who want to pass the PL-300 and, more importantly, become skilled Power BI practitioners. Nervous because I worried that I wouldn’t be able to write cogently and succinctly about all the topics covered by the PL-300.
In the process of writing, I had to research some topics further, because I realized that I didn’t fully understand them. I had to think deeply about each topic to come up with simple, clear language to explain complex elements. In other words, I had to really ensure that I understood each topic before I could write about it. This process has made me a better teacher and a better practitioner. Because teaching is my priority, I really hope that this book will reach folks who may feel intimidated by Power BI. I hope it will empower you, dear reader, to tackle the PL-300 exam. Even more importantly, I hope the book will encourage you to pick up other books written by other Power BI professionals. Apress publishes many of these titles, and they are all excellent.
Finally, let me share with you my personal mantra: #showup. Do not be afraid to take the exam; so what if you fail? Failure is integral to learning. Do not be afraid to speak at a conference; there are always folks who know less than you do and folks who know more than you do. Do not be afraid to become active in the Power BI community. We are a very welcoming, inclusive group of people who want everyone to succeed, and that includes you!
I would like to thank Ginger Grant, my technical editor, for her kindness when correcting my errors. It takes tact and empathy to tell someone they are wrong, and she has both in spades.
Welcome, reader. I hope that this book will be informative and helpful as you prepare for the PL-300 exam. I am a big believer in certification and want to encourage as many people as possible to get certified.
The very premise of this book raises a question: Why certify? Lots of very talented professionals in the Power BI community are not certified and don’t feel the need to be. You certainly can go this route if your “street cred” is high, that is, if other professionals recognize your expertise. Building up your reputation over time is one way to earn professional recognition. But what if you are in a hurry? Or you don’t have an extensive network to leverage? This is where a certification is valuable.
Any Microsoft certification test you take is rigorous and comprehensive—it is supposed to be! If you pass, that is a very good indicator to others (hiring managers, colleagues, your wider network) that you possess the minimum amount of expertise deemed necessary to be a competent Power BI practitioner. The PL-300 exam is what I call a “broad-spectrum” exam; it will test you on a wide range of concepts and techniques. To prepare for the exam, you need to make sure you are conversant with the full Power BI ecosystem. That’s what makes the PL-300 (and the DA-100 before it) such a good indicator of your competencies and why certification is a great way to advertise them.
I take exams as part of my job; I am a Microsoft Certified Trainer (MCT) and am required to maintain active certifications to renew my MCT yearly. I was never terribly test-phobic and am certainly not now—I can’t afford to be! But I recognize that lots of people are terrified of taking a test. In my observation, there are two types of exam takers: (1) those who want to be completely prepared and won’t sit for an exam unless they are close to 100% confident in their knowledge and (2) those who take the exam as soon as they feel even slightly ready. (Full disclosure: I am firmly in the second camp.) If you are in the “completely prepared” camp, you are your own best judge of when to take the exam. But I can say don’t wait for the perfect moment. No such moment exists. I can’t honestly say that this book (or any study material) will completely prepare you for every question on the exam. There will be questions even a seasoned professional may not have encountered. Nor do I expect that you will want to take exams the way I do: on a wing and a prayer. I use exams as a gauge of how far I am from competency. Occasionally, my approach works, and I unexpectedly pass an exam. But more frequently, I fail. I have a high tolerance for failure, which is not the same as saying it doesn’t bother me. It does. But I work at taking it in stride. I respond to passing the same way I react to failing; it’s a milestone along the way. In this way, I normalize the taking of exams, without overvaluing either outcome. In this book though, I am trying to prepare you for a middle way: taking the exam once you are reasonably confident in your knowledge.
Determining if you are reasonably confident is subjective, but there are objective measures you can use. The first, and most important, method is to follow the list provided by Microsoft that outlines each of the subject areas (also called domains). This book is organized using those domains. Don’t skip any chapters because you can be sure that some material from each domain will appear on the exam. Practice the techniques described as much as possible. Read or listen to other supplementary material, particularly on topics that you don’t understand. (I will provide well-reputed sources for additional content in Chapter 15.) Attend as many user group meetings as you can. Post-COVID, most of them have a virtual option, so attending is much easier than it used to be.
To assess my understanding of a topic, I describe the concept using layman’s terms—no technical terms at all. If I can’t “translate” the idea into “plain English,” I know I am not quite there. I also perform “teach-back” to someone else. Almost anyone in the Power BI community is a likely audience for a “teach-back.” If the audience is more proficient than you are, they can give you feedback on aspects where you were unclear or, worse yet, incorrect. If your audience is not as knowledgeable as you are, they can tell you which parts you explained clearly and areas where your explanation was “muddied.”
Before you take the exam, there are some steps you need to take.
The first step to taking the exam is signing up. You do this on the Microsoft website:
https://docs.microsoft.com/en-us/learn/certifications/exams/pl-300
The retail price for the exam is 165 USD. (You might have discounts based on your workplace or other memberships.)
Once you choose “Schedule exam,” you will need to create a certification profile if you haven’t taken an exam before. If you have taken an exam before, you will be taken to your profile page.
When you check in for the exam, you will need to present a valid form of identification. Acceptable forms include:
Passport
Driver’s license
Non–US military ID (including spouse and dependents)
Identification card (national or local)
Registration card (green card, permanent residence, visa)
Unacceptable forms of identification include renewal forms with expired IDs and government-issued name change documents with government IDs. If your identification is not considered valid, you will not be permitted to complete your exam, and you are unlikely to receive a refund.
If you are creating a profile for the first time, make sure that the name that you provide matches the name on your identification you will use to validate your identity. If it doesn’t, you may be challenged during your sign-in process.
Use an email address you will continue to have access to even if you change employers. For best results, use an email address associated with a Microsoft account: Hotmail, Outlook, or your own domain.
After creating or verifying your profile, you will then be taken to the “Exam Discounts” page. If you have received a discount code, this is where you will be able to enter it.
Now proceed to the scheduling page.
Are there testing centers near you? A lot of them closed during the COVID pandemic and may not have reopened.
Do you have a quiet, uncluttered space in which to take the exam? If you are taking the exam at home, you cannot have people coming into the room where you are taking the exam. You also need to clear out the space around your computer so there are no books or other potential sources of information near you. The exam proctor will ask you to take a picture of all four walls of your room to make sure there is nothing that could provide any extra information. This means removing any pictures from the walls, covering all screens but the one you are working on, and closing the door to the room.
Do you do your best work later in the day or at night? If so, a virtual test may be the right option for you.
Do you have lots of distractions you cannot control at home? If so, a testing center may be a better option.
Does your computer have a microphone and camera attached? Both are required for an online exam.
Can your computer run the required software? Pearson VUE provides a test that you should run before your first online exam, to ensure you won’t have any issues. If you are taking the test from a work-provided computer, make sure that you are not on the virtual private network or behind your company’s firewall.
When my local library was open, I booked a conference room for two hours and took the test there. It was great because I didn’t have to clear anything out. I put a handwritten sign “Do Not Disturb—Testing” on the door and then locked the door so no one could accidentally come in.
I took a test in my bathroom because it was the least cluttered of my rooms. I don’t recommend this—it was very uncomfortable. The proctor didn’t like it either as my device camera was at a strange angle because I had my laptop on my lap.
I have taken a test in a bedroom, with a door that I thought I had closed. My dogs came in and jumped up on the bed. I thought it would invalidate my test, but fortunately it didn’t. (These dogs are poodles, so they are smart, but their data analysis skills are not renowned.)
I had a Dell laptop in which the camera was on the bottom of the screen. This created an awkward angle, which the proctor was not used to. She constantly asked me to adjust my camera, something that was not easy based on its position. I would recommend using a free-standing camera if you can, as it will be easy to adjust on request.
Proctors do not like it if you cover your mouth with your hand. Be aware of your habitual gestures because the proctor may object.
Proctors do not like it if you read the questions to yourself, even if you are only mouthing the words. If that is a habit you have while reading, be aware of it.
Do not expect to have drinks or food. I have had proctors challenge my soda bottle.
These are just tidbits from my experience. Please read the Pearson VUE website carefully, particularly if this is your first exam.
If you are scheduling an online exam, you will see a calendar with dates on which an appointment is available. (Grayed-out dates do not have available time slots.) You can choose the clock type you want to use (12-hour or 24-hour). Be sure to set the time zone where you will be when you are taking the exam.
The time you choose is the starting time for the exam. You can begin your check-in process no earlier than 30 minutes prior to the starting time. I recommend checking in as soon as you can—you can always start the exam a little early, but if you run into problems when checking in, you may need that entire 30 minutes.
Once you have successfully scheduled your exam, you will receive a confirmation email at the email address listed on your profile. (This may not be the email address you check regularly.) I forward the confirmation to my business email, where I keep my calendar. I then set up an appointment for the exam at the starting time of the exam and make a note that I can sign in up to 30 minutes early. I attach the confirmation to the appointment item. That way, I have all the relevant information I need right in the appointment entry.
You should block out two hours for the exam. The time allotted for the exam varies by exam, but two hours should be plenty for the PL-300.
Up to 24 hours in advance, you can reschedule or cancel your exam. I have rescheduled exams several times. Each time the process worked seamlessly, until I forgot to reschedule an exam and therefore “lost” the exam. If you can’t reschedule more than 24 hours in advance, you will need to contact Pearson VUE to discuss your individual situation.
I haven’t taken an onsite exam since the beginning of COVID, so I can’t speak to the rescheduling or cancellation policies of an onsite testing center. If you choose to take the test at an onsite center, be sure to read the guidance documents carefully.
Now you are prepared to take the exam. You have scheduled, you have studied, and you are ready! What can you expect? I am not allowed to give you information about the content of the exam, but I can describe the format and structure of the exam.
Case Studies: You will have at least one case study, which will describe a scenario, with the current setup, the solution requirements, and the problems/issues/constraints fully described. You will then be given a series of specific questions, and you will need to determine if the proposed solution will meet the requirements. Case study sections are self-contained. You can mark a question for review, but you will only be able to review it while you are still in the case study.
My advice for Case Studies: Take your time and read all the descriptive material meticulously. Then read each question carefully and go back to the descriptive material and look for wording or specific points that could affect your answer. If you are uncertain about your answer, mark it for review and come back to it after answering the other questions in the case study. Sometimes answering other questions will jog your memory, and you will be able to confidently answer the question you weren’t sure about.
Drag and Stack: These questions will require you to build an answer using various options that you can drag over from a list. For example, you will need to provide three steps, in sequence, and you are provided four options from which to choose.
My advice for Drag and Stack: Eliminate the clearly incorrect answer. There will usually, but not always, be one. Then, of the remaining options, identify the one that is clearly either first or last in the sequence. You have thus minimized the options you have to choose from and thereby reduced your risk of being incorrect. Read the options carefully (have I already said this?!). There are times when the way an option is written will make it clear whether it is correct or not.
Fill in the Blank: These questions provide you with options to use to fill in the blank (or multiple blanks) in an answer. Unlike Drag and Stack, you can use an option more than once or not at all.
My advice for Fill in the Blank: Look for the parts of the answer about which you are confident and fill in that blank. Then look at the other options and eliminate anything that is clearly wrong. After that, it is something of a guess. I do mark these questions for review because after finishing all the questions, I feel more confident in some of my answers.
Problem/Solution: These questions pose a particular problem and then provide a proposed solution. You are then asked if the proposed solution will solve the stated problem.
My advice for Problem/Solution: Read the problem statement carefully. (I know I am repeating myself, but I cannot overemphasize this. Sometimes there is a clue in the wording of the problem.) As you review the proposed solution, remember that there can be multiple solutions to a problem. Just because you marked a previous proposal as a solution does not mean the next one is not also a solution. Don’t try to outwit the exam algorithm—there is no predetermined pattern of correct answers. Read each proposed solution at face value.
Don’t be afraid to fail. Failing a test does not mean you are a failure. Rather, it is a gauge that tells you that more study is needed. As I said earlier, I have failed plenty of exams!
Get a good night’s sleep. You are taking an exam, not facing a judge and jury.
Do not eat or drink anything unusual the night before your exam. You can step away for an unscheduled break, but this is a recent development, so I can’t give you any specifics.
Breathe. Again, it seems obvious, but what I really mean is pay attention to your breath. Don’t hold your breath or take shallow breaths. These breath patterns can increase your anxiety.
Wear comfortable clothing and make sure the room temperature is comfortable for sitting in one place for over an hour. Again, you won’t be able to get up and put on a sweater or take off a garment.
Plan to keep your face in the camera’s view and your hands resting on the table/desk or lap when you aren’t using your mouse/touchpad. Proctors do not like it when you cover your mouth in any way.
Manage your exam clock. Don’t spend too much time on one question. If I can’t answer the question completely or confidently within two minutes, I answer to my best ability and then mark it for review. I rarely leave an answer completely blank. You would be surprised how much your nerves have settled toward the end of the exam. You can look at questions you weren’t sure of and answer them confidently when you are reviewing them.
Read carefully. Yes, I am saying it again. I have passed exams for which I really didn’t know the material very well. But because I read each question and answer carefully and I guessed very judiciously, I passed the exam. I don’t recommend this as a surefire way to pass an exam, but it certainly helps.
Handle failure and passing with the same amount of reaction—don’t overly celebrate a pass or overly accentuate a failure. Neither is anything other than a gauge of your knowledge. There’s always more to learn.
You will find out immediately whether you passed. Each Microsoft certification has a “badge” with an icon. I search for the badge icon, capture it (or get it in a .png format), and put it into a folder I keep for certification icons. I then add it to my email signature, my LinkedIn profile, and my website.
If an employer needs proof that you have passed the exam, you can share your transcript with them. You will have an MCID (once you’ve filled out your profile), and you will have a transcript code. You can provide your MCID and transcript code to any employer interested in your certifications.
Plan to take the exam again. Do not be discouraged. At the end of each exam, you will get a profile of your answers by domain. Take note of these areas and tackle the ones where you scored lowest.
Failed exams do not reflect on your transcript. No one needs to know if you failed. And even if they do, so what? There’s a reason I am so public about failing exams—failure is just a necessary part of success!
As I mentioned in Chapter 1, I consider the PL-300 exam a “broad-spectrum” exam. That means that the domains covered are comprehensive: from the Power Query Editor (PQE) to the Power BI Desktop and then the Power BI Service. Before we dive into each of these domains in turn (in subsequent chapters), I would like to provide an overview of the Power BI ecosystem. I start all my training classes with this overview, and invariably there are elements that surprise students.
Power BI has a deep heritage in database technology. It should therefore come as no surprise that there is a lot of database terminology. Throughout this book, I will take pains to identify and explain terms that you will encounter while studying for this exam.
There are three separate tool sets that are used in Power BI: the Power Query Editor, the Power BI Desktop, and the Power BI Service. You will need to be familiar with each of these tools and what their individual roles are in the product.
The Power BI Desktop (hereafter called “Desktop”) is the client where you will develop your data model and create reports. It is a free tool, which can be downloaded from the Microsoft Store. (See below for more information on downloading the Desktop.) You can use the Power BI Desktop without logging into the Power BI Service, but if you want to publish a report, you need to sign in.
If you have Office 365 through your employer, your Power BI license will probably be connected to those credentials. If you are learning Power BI proactively and do not have employer-provided credentials (yet), you can sign up for a free Power BI license. However, to do so you must use a “business” domain. For example, you cannot sign up for a Power BI license using a Gmail email account.
The Power BI Desktop is a resource glutton; Microsoft recommends having a machine with at least 8 GB of RAM. If you can, get a “beefier” machine, with at least 16 GB of RAM, preferably 32 GB. With less than 16 GB of RAM, you may struggle with the responsiveness of the Desktop. Additionally, the Power BI Desktop does not have a Mac version. If you are planning to use an Apple computer, you will need to run Windows in a virtual machine, such as Parallels.
The Desktop is a development environment. Many of us come to Power BI through Excel, and we don’t consider ourselves “developers.” But you need to adopt a developer mindset when using the Desktop to develop your data model and your reports. There will be some features and functionality that won’t be activated until the report is published. You also do not want to share your reports (in a PBIX file) directly with another user. Yes, they can open your report in the Desktop, but there will be things that won’t work properly or as designed until you publish the report into the Power BI Service.
The Power BI team at Microsoft delivers updates to the Power BI Desktop ten times a year. The easiest way to ensure that you have the latest Power BI Desktop is to download the application from the Microsoft Store. If your organization does not allow you to download from the MS Store, you can download the Desktop application directly from the Microsoft Download Center. (Scroll down to the bottom of the web page to find the download button.) If you cannot download from the MS Store, you will need to manually update the Power BI Desktop.
When I teach, I often hear students use the terms report and dashboard interchangeably. Reports and dashboards are two different products, and as I will be using these terms throughout this book, I want to define them here.
A report is a single-page or multi-page document with one or more visuals on each page. Reports are usually created in the Power BI Desktop but can be created in the Power BI Service. A report is always underpinned by a data model. A report typically covers a particular subject but doesn’t have to.
A dashboard is a single-page document with one or more visuals taken from one or more reports. Dashboards can only be created in the Power BI Service. Visuals that are “pinned” to a dashboard do not interact with each other as they do on a report. (There is one exception to this rule, which we will cover in a later chapter.) Dashboards can have streaming content, such as IoT data, whereas reports cannot.
Report or dashboard? The favorite answer in the Power BI community is “It depends” because it often does. If the report consumers need to actively filter the visuals, a report is probably the best choice. On a dashboard, when a visual tile is selected, the reader is directed back to the original report. If you are creating something to provide information at a glance, a dashboard may be a better option than a report. I think of a dashboard as something that can be displayed on a big monitor hanging in the lobby, cafeteria, or factory floor. People look at the monitor to see updated information, but don’t expect to interact with that information.
The Power Query Editor (PQE) tool is what you use to extract, transform, and load your data.
Extract, Transform, Load (ETL) describes the process in which you will connect to a data source, acquire the data you want (Extract), perform the changes you need to make (Transform), and then apply the changes you have specified to the selected data and import it into the Power BI Desktop (Load).
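To make this concrete, here is a minimal sketch of what an ETL query looks like in M, the Power Query language covered in Chapter 4. The file path, sheet name, and column names are hypothetical:

let
    // Extract: connect to the workbook and pull out one sheet
    Source = Excel.Workbook(File.Contents("C:\Data\Sales.xlsx"), null, true),
    SalesSheet = Source{[Item="Sales", Kind="Sheet"]}[Data],
    // Transform: promote the first row to headers and set data types
    Promoted = Table.PromoteHeaders(SalesSheet, [PromoteAllScalars=true]),
    Typed = Table.TransformColumnTypes(Promoted, {{"OrderDate", type date}, {"Amount", type number}})
in
    // Load: whatever the final step returns is what gets imported into the Desktop
    Typed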
The Power Query Editor is a tool that is available through the Power BI Desktop. It is also available through Excel 2016 (or later) and from the Power BI Service. Any techniques and skills you use in the Power Query Editor within the Power BI Desktop you can also use in the Power Query Editor in Excel or in the Service. Just a quick note, new features are released to the Power Query Editor in this order: PQE Service, PQE Desktop, and, finally, PQE Excel. There may be features that you use in the PQE Service or in the PQE Desktop that you won’t see in PQE Excel.
Navigate to https://app.powerbi.com. To sign in, use your Office 365 credentials or the credentials you used when you registered for a free license or trial.
From within Office 365, click the tile for Power BI. You may have to go to All apps to surface the tile if you have never used it.
From within the Desktop, after you have published your report, follow the hyperlink to the Service.
The Service has three main areas: My workspace, Workspaces, and Apps. For this exam, it is essential to have familiarity with Workspaces. If you are using a free Power BI account, you will not have access to Workspaces, only to My workspace. Practicing only with My workspace will not provide access to the full capabilities of the Service, which is why you should sign up for the 60-day trial Power BI Pro license as you study for the exam.
Admin: Has full control of the workspace
Member: Can create, edit, and delete content. Can add other members. Can decide what is included in the workspace’s app (if there is one)
Contributor: Can create, edit, and delete content
Viewer: Can view content
You will not see a specific workspace in the Service unless you have been added to it at one of the preceding levels of membership.
My workspace is a sandbox area where you can publish reports and experiment with the features of the Service. You should not publish content to My workspace that you wish to share with others.
An app is a vehicle for the distribution of reports and dashboards within the Power BI Service. It is not the same thing as the Power BI mobile app. I think of the app as a magazine that has curated content and is distributed at intervals (weekly, monthly, quarterly). An app is associated with a specific workspace on a one-to-one basis (one app, one Workspace). Someone with the right privileges in the workspace must create and update the app. Apps can contain reports, dashboards, and Excel content. I will cover Apps in more detail in a later chapter.
You can use the suite of tools with a free license. However, you will be limited in your capabilities using a free license: primarily you will not have access to Workspaces. To use Power BI within a workplace infrastructure, you will need a Pro license, which can be a trial license. If you want to work with the full Power BI ecosystem and you don’t already have a paid license, you will need to sign up for a Power BI Pro trial license, which is valid for 60 days.
You can’t sign up for a trial license with a consumer email address such as Gmail or Yahoo or AOL. You will need a “work”-related email to sign up.
There are two languages that are used with the Power BI ecosystem: M and DAX (Data Analysis Expressions). You do not need to be an expert in either of these languages for the exam. But you will need to be familiar with basic DAX and M functions and syntax.
We will cover the basics of the M language in Chapter 4 and DAX in Chapters 6 and 7.
If you are working in Power BI already, you may not need a lot of practice prior to sitting for the exam. If, however, there are techniques or features that I refer to in this book with which you are unfamiliar, your best bet is to practice. The biggest challenge is acquiring a dataset that is easy to use. The good news is that Microsoft has several sample datasets available that you can use.
The easiest way to practice is to use Excel and/or CSV files. Almost every business user is familiar with these two formats. Northwind is a dataset that Microsoft developed for use as sample data. The data is fake, which makes it safe for you to use without worrying about confidentiality. If you are using the Excel (or CSV) version, you can review the data so that you are familiar with it before using it in Power BI. You can also calculate results using Excel formulas, which can help you check your DAX code. The disadvantage of this data is that it is a small set with a limited number of fields in each worksheet. But as a starter dataset, I think it is excellent. You can find it on GitHub:
https://github.com/graphql-compose/graphql-compose-examples/tree/master/examples/northwind/data/csv
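If you are curious what connecting to one of these CSV files looks like in M, here is a minimal sketch; the local path is hypothetical, so point it at wherever you saved the download:

let
    Source = Csv.Document(File.Contents("C:\Data\northwind\categories.csv"), [Delimiter=",", Encoding=65001]),
    // The first row of the Northwind CSVs contains the column names
    Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars=true])
in
    Promoted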
If you want a more robust (and realistic) set of data with which to practice, Microsoft has developed several databases. AdventureWorks DW is the most commonly used, but there are others, such as Tailwind Traders.
AdventureWorks is available as a BAK file, which you can use to restore a database to your local device. This process was unfamiliar to me as I had no database background (and still don’t); I assume it will be unfamiliar to many of you as well. I am going to outline the steps you will need to take to use the database so you understand the overall process. I will then provide a link to the Microsoft documentation where they give step-by-step instructions.
Step 1: Download the BAK file for AdventureWorks DW. You can find it in the Microsoft documentation by searching for “AdventureWorks sample databases.”
Step 2: Install SQL Server Management Studio (SSMS). This is a free application (there are paid versions, but you don’t need one), available as a download from Microsoft.
Step 3: Install Microsoft SQL Server on your local device. This is where the AdventureWorks database will be hosted.
Alternate Step 3: Set up an Azure AD account and use the Azure SQL Server option. Note that you will need an Internet connection to access a database hosted in Azure. Read all the fine print if you choose this option so that you aren’t surprised by any charges.
Step 4: Use SSMS to restore the database from the BAK file. The Microsoft documentation provides detailed step-by-step instructions.
Step 5: Use the credentials that you set up for the database to connect from the Power BI Desktop. Take care to note if you are going to use database credentials or Windows credentials to connect to your database.
When planning this book, I had to decide on both content and perspective. I could have made this a “how-to” book, but there are many excellent “how-to” books—many of them published by Apress! My approach is to explain a feature (what it is) and why it is important to understand it. If I believe that knowing a feature or concept will make you a well-rounded Power BI practitioner, I have included it, even if it does not appear on the PL-300 exam. Don’t misunderstand me: there are many features that I do not cover at all because they are more complex or advanced. In Chapter 15, I will provide a list of books, blogs, and YouTube channels that provide excellent “how-to” content. The feature set in Power BI is constantly changing, and it is better to use frequently updated sources to learn how to use a feature. In this book, I want to provide a road map for you so that you know what to study and why.
Power BI can connect to a wide variety of data sources. These are some of the most common:
SQL Server (all the flavors)
Excel/CSV
Folders (SharePoint and File Explorer)
Power Platform
Azure
Power BI dataflows and datasets
Figure 3-1. The Home tab has several options for extracting data (Get data, Excel workbook, Data hub, SQL Server, Enter data, Dataverse, and Recent sources)
Figure 3-2. Selecting worksheets in the Navigator dialogue box
As shown in Figure 3-2, it is tempting to choose the Load button (because it looks like you should). But unless your data is clean and ready to work with, choose Transform Data instead because it will open the Power Query Editor window.
In the Navigator window, you will see three yellow dots that scroll across the top of the window. Do not be misled—they simply indicate that the preview is loading, not that you must complete another step.
When you connect to a data source, you may be asked to enter the appropriate credentials. Each type of connector will require a different type of credential, and some, such as CSV and Excel sources, do not require credentials. For example, if you are connecting to a SQL database, you may have to use a Windows credential or a database credential. As a practitioner, you do need to know what type of credentials you should use. In preparation for the exam, it is important for you to understand how and when you will need to change or update credentials. We will cover that later in this chapter.
Figure 3-3. Accessing the Data source settings in the Power Query Editor
It is quite common to need to change the file path for a particular file. For example, if a colleague shares a PBIX file with you, any files that were used in the PBIX file will have the pathway pointing to where your colleague stored the files. You will have an error until you change the pathway to point to the location on your device. The easiest way to do this is in the Data source settings dialogue box. But you can also correct the pathway directly from the error screen itself.
Figure 3-4. A broken file path
Figure 3-5. Edit Settings within an error
Figure 3-6. Browse to change the file path
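A third option is to edit the path directly in the Formula Bar (covered in Chapter 4). The Source step records the full path as a text string, so changing that string fixes the error. A minimal sketch, with hypothetical paths:

// Before: the Source step still points at your colleague's folder
Source = Excel.Workbook(File.Contents("C:\Users\Colleague\Documents\Sales.xlsx"), null, true)
// After: the same file in its location on your device
Source = Excel.Workbook(File.Contents("C:\Users\Me\Documents\Sales.xlsx"), null, true)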
This is one of the most common errors you can incur in the Power Query Editor. In Chapter 4, we will cover errors in more depth.
Figure 3-7. Editing permissions on a file source
Figure 3-8. The Edit Permissions dialogue box
Figure 3-9. Changing your login credentials for a database
The type of credentials you will need depends on how the database administrator has set up your account.
If you need to remove and/or replace existing permissions that you used to connect to a data source, follow the same process as editing permissions. (See Figure 3-7.)
None: There are no privacy restrictions on this data source. It can be combined with other data without any restrictions.
Public: This data can be combined with other public or organizational data sources, and the data is visible to anyone. However, only files, Internet data sources, and workbooks can be marked as public.
Organizational: If you mark a data source as organizational, that data source can be combined with other data sources classified as “organizational.”
Private: A data source classified as private will isolate that data completely. It cannot be combined with any other data source, even other data sources classified as private.
Figure 3-10. The Privacy Level options (None, Public, Organizational, and Private)
The challenge with setting privacy levels is that you can set them too rigorously. If you do so, you can incur a Formula.Firewall error in your query. Another concern with privacy levels is that they can slow down a refresh once your report is published in the Service. There is an option to disable the privacy levels in the File ➤ Options and settings ➤ Options ➤ Global ➤ Privacy menu. If you do choose to disable the privacy levels, you must be sure that you are only working with data within your organization’s control.
In addition to entering credentials, you will also need to choose a storage mode (if applicable) for the data source. For sources like SQL Server, you will be presented with two options: Import or Direct Query. Knowing the differences (and limitations) of these modes is required for the PL-300 exam.
The default storage mode is Import. In this mode, Power BI extracts a copy of the selected tables (or views) and stores them in the Analysis Services database that is spun up when you create a PBIX file. Import does not affect the source of the data—you are not “writing back” to the source in any way. Import is a copy of the data at the time of extraction. Contrary to what you might assume, you do not need to work with a constantly refreshed copy of the data to build your report. If the data that you are working with is representative of the latest data, that is sufficient.
What does it mean for data to be representative? It does not mean that the values are the same or the most recent. It means that the structure of the data remains consistent as data is added, updated, or deleted.
Using the Import mode is a much more efficient way to process your queries: your report will be more responsive. You will also have full control over the data model—you can add measures, create the tables you need, and structure relationships optimally.
When I teach a class on Power BI and Direct Query comes up, invariably a student says, “You should always use Direct Query so you can have the latest data.” Not to put too fine a point on it, this is not necessary. It is true that if you use Direct Query you will always be using the latest data. But you will pay a performance price for that timeliness. And you don’t usually need to work with the most recent data. If the data you are working with is representative (see earlier) of all the data, working with older data is not an issue.
Direct Query creates a connection between the Power BI Desktop and the data source. Your report’s responsiveness will be dependent on the latency of this connection, something that is probably not in your control but can negatively affect your report performance. You will not be able to perform any transformations or modeling of the data—you must use the data as is. If the data has been fully prepped for all your reporting requirements, this restriction may not be an issue. But you need to be sure that it meets your needs. Using Direct Query can also have a negative impact on the server that you are connecting to; be sure to talk to the database and/or server owner to make sure that your report doesn’t create an inordinate burden.
How often will the report consumers act on the data? If the data is only actioned once a day, or even twice a day, you don’t need Direct Query.
Do you have a lot of historical data in your dataset? If you have data that is several years old, typically you do not need the data from five minutes ago, because you are making decisions across a wider time frame.
Does the data change significantly in real time? How often is the data refreshed at the source?
Are parts of the source data refreshed at different rates? If yes, you don’t want to use Direct Query because parts of your report could conflict.
Don’t choose Direct Query unless you have some overriding business justification for it; the disadvantages often outweigh any advantages.
In the same dialogue box where you see the Import and Direct Query options, you will also see an Advanced drop-down. This is where you could write your own SQL to extract exactly what you need from the source. Writing your own SQL statement immediately breaks “query folding” (explained in Chapter 4), which is a bad thing. If you choose to write a custom SQL query, do so only if you are confident that you can write a better query than Power Query can generate.
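As a rough sketch, a custom SQL statement is passed in the optional Query field of the connection. In M, that looks something like the following; the server, database, and table names are hypothetical:

let
    // The native query is sent to the server as-is;
    // subsequent steps in the Power Query Editor will not fold back into it
    Source = Sql.Database(
        "myserver.database.windows.net",
        "SalesDB",
        [Query = "SELECT OrderID, OrderDate, Amount FROM dbo.FactSales WHERE OrderDate >= '2023-01-01'"]
    )
in
    Source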
When you connect to an Analysis Services source, the connection is called a Live Connection. There are three such sources: SQL Server Analysis Services Tabular, SQL Server Analysis Services Multidimensional, and a Power BI Service dataset.
In December 2021, Microsoft announced the ability to combine data that has been imported and data that is accessed via Direct Query. These are called hybrid tables and are accomplished by partitioning the data. The primary benefit of a hybrid table is you get the speedy performance of imported data and the latest data via Direct Query. Partitions are not covered in the PL-300 exam and are beyond the scope of this book, but I want you to know what is meant by hybrid tables.
Figure 3-11. Reviewing all the connectors available
And of course, there are always good old Excel and CSV files. But there is a whole new world of data sources that you have (potentially) available to you. (Whether you have them available to you or not, you need to know about them for the exam.)
Dataverse is the new name for what used to be called the “Common Data Service.” Approximately five years ago (don’t quote me on the exact date), Microsoft realized that within their Dynamics CRM tool, they were sitting on a gold mine: a ready-made database infrastructure that could be used by non-database-literate folks (like me) to build a database. There are premade tables in Dataverse for commonly used data structures, such as accounts, contacts, and so on. It’s a database in a box. And even better, Dataverse is part of the Open Data Initiative, something that Microsoft, SAP, and Adobe have collaborated to create. You can also use Dataverse to build your own tables, if you can’t find one that works for you in the premade ones.
Dataverse is the data infrastructure that underpins the entire Power Platform, so it should not be a surprise that Power BI can connect to a Dataverse model. Dataverse tables are housed in something called an environment, and you need to have the URL to connect to an instance of it. That URL will usually contain some reference to “CRM” and/or “Dynamics.” (Remember, Dataverse emerged from Dynamics CRM.)
A dataflow is a set of data on which someone has performed transformations and saved that work.
Imagine you have a colleague who is quite skilled with the Power Query Editor. This person could perform transformations on a variety of data and then share the completed work as a dataflow. Another use case: Often multiple report creators have the need to connect to the same underlying data source, but for some reason, they should not connect directly to the data source. A dataflow can be created that provides the content (tables, views) that the report creators need.
Figure 3-12. Connecting to a dataflow in a workspace
If someone in your organization has already put a lot of work into creating a dataset, anyone who has permissions to that dataset can use it to create a new report. We will talk further in a later chapter about datasets, but the benefit of using a shared dataset is the time savings. Someone else can do all the work of transforming and modeling the data, and you can use it all in a report. Just a side note: Using a shared dataset creates a Live Connection (see earlier).
A local dataset is one sitting on your computer. There isn’t anything unusual about connecting to a local dataset except when you try to set up the refresh schedule for the local dataset in the Power BI Service. The Service will not be able to “reach” this dataset without a gateway, which will be covered in a later chapter.
Figure 3-13. The contents of a folder as a query
Figure 3-14. Options for combining files
You usually combine files that have the same structure: number of columns, names of columns. If you combine queries with different structures, there may be a lot of “messy” transformations to perform.
SharePoint folders work the same way, but there is one trick: you need the top-level URL of the SharePoint site. Nothing else will work.
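For reference, here is roughly what the two folder sources look like in M; the local path and site URL are hypothetical. Note that the SharePoint connector takes the top-level site URL only, not the URL of a library or subfolder:

// Local folder: each file in the folder appears as a row in the query
Source = Folder.Files("C:\Data\MonthlyReports")
// SharePoint folder: supply the site URL, then filter down to the files you want
Source = SharePoint.Files("https://contoso.sharepoint.com/sites/Finance", [ApiVersion = 15])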
A parameter in Power BI does one or more of the following:
Provides a value to a function within the Power Query Editor
Allows a user to enter a value in the Power BI Desktop
Allows a user to enter a value in the Power BI Service
For the PL-300 exam, you will be expected to know how to change the value of a parameter, but I want you to understand what a parameter is, where you create one, and some of the ways you can use it.
This is one of the few features for which I do show how to create and use it, because parameters can be a little confusing when you are first getting started.
All parameters need to be defined in the Power Query Editor; they are part of data preparation. To define a parameter, follow this list of steps:
Step 1: Select Manage Parameters.
Step 2: Select New Parameter.
Figure 3-15. The New Parameter dialogue box
Step 4: Provide a name and a description for your parameter. Use a name that is unique.
Step 5: Specify the data type for the value that will be supplied to the parameter (e.g., if you are going to enter dates, specify the Date data type).
Figure 3-16. Populating the Manage Parameters dialogue box
Step 6: Under Suggested Values, choose how values will be supplied: Any value, List of values, or Query.
Step 7: Provide a default value for the parameter. This will be the value that displays when a user first opens the parameter. You can enter a helpful message such as “Enter a date here” or choose a default value.
If you have ever used the Data Validation feature in Excel, this process will be very familiar. The only feature this dialogue box doesn’t offer is the ability to add a message.
One use case that is very common is to enable users to change file paths easily. Creating this parameter is straightforward.
Figure 3-17. Setting up a list of file paths as a parameter
Figure 3-18. Changing the parameter value in the Power Query Editor
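Behind the scenes, a parameter is just a small M query with special metadata attached. A file path parameter like the one in Figure 3-17 is stored roughly as shown below (the paths are hypothetical), and any other query can then reference it by name:

// The parameter itself, as the Manage Parameters dialogue box records it
"C:\Data\Sales.xlsx" meta [
    IsParameterQuery = true,
    List = {"C:\Data\Sales.xlsx", "C:\Backup\Sales.xlsx"},
    DefaultValue = "C:\Data\Sales.xlsx",
    Type = "Text",
    IsParameterQueryRequired = true
]
// A query step that uses the parameter (named FilePath) instead of a hard-coded path
Source = Excel.Workbook(File.Contents(FilePath), null, true)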
Once you have created a parameter, you can then use it in the Power BI Desktop. In our file path parameter example, you can make the parameter available as a data source. The first step is loading the parameter to the data model, something that does not happen by default. You must Enable load for the parameter, as shown in Figure 3-19.
Figure 3-19. Enabling the load for the parameter
Figure 3-20. Accessing the parameters created in the Power Query Editor
Figure 3-21. Choosing the parameter
One of the most common uses of parameters is incremental refresh. The first time you publish your report in the Service (covered in a later chapter), you will set up a refresh schedule. Often, you want to bring in all the data when you first publish and, in subsequent refreshes, only bring in the new or changed data. Incremental refreshes allow you to specify the start date and end date of the data you want to target during the subsequent refreshes. Setting up a RangeStart and RangeEnd parameter is a prerequisite for setting up an incremental refresh.
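As a sketch of how this fits together: RangeStart and RangeEnd must be parameters of the Date/Time type, and you use them to filter the table you want to refresh incrementally. The source and column names below are hypothetical. Note the convention of >= RangeStart and < RangeEnd, which ensures a row on a partition boundary is never loaded twice:

let
    Source = Sql.Database("myserver", "SalesDB"),
    FactSales = Source{[Schema = "dbo", Item = "FactSales"]}[Data],
    // The Service substitutes real boundary values for RangeStart and RangeEnd at refresh time
    Filtered = Table.SelectRows(FactSales, each [OrderDate] >= RangeStart and [OrderDate] < RangeEnd)
in
    Filtered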
This chapter covers exam topics around cleaning data, transforming data, and loading that data into Power BI where it can be analyzed. We cover the Power Query Editor, the M language that underlies transformations, and various topics relating to data quality.
After initiating the Get data procedure within the Desktop, the Power Query Editor (PQE) will open in a new, separate window. This is intentional—the two tools are intended to work side by side. While you are active in the PQE window, you can minimize the Desktop window.
Let’s set the scene. You have a project that requires you to gather data from a variety of sources. You didn’t (necessarily) originate this data, and therefore it probably needs some shaping before you can use it. It may have columns you don’t need, or it might be missing columns you do need. It may not have necessary calculations, or the data might not be in the right format for you to perform calculations on it. It might have misspellings, inconsistent abbreviations, blank rows, and null values. Any or all these things are invariably true for most data sources. Can you go back to the sources and fix them? Maybe, but most likely not. If the data comes from a database, it has been created for and is used by a lot of different people. The database owner is probably not going to change the data to meet your specific requirements. The Power Query Editor to the rescue!
There are two ways to open the Power Query Editor from within the Desktop:
Get data on the Home tab
Transform data on the Home tab (as shown in Figure 4-1)
Figure 4-1. Accessing the Power Query Editor using Transform data
Here are some of the transformations you will need to know:
Deleting, merging, combining, and splitting columns
Deleting top, bottom, and alternate rows
Changing the data type of a column
Combining queries together (merging, appending)
Duplicating or referencing a query
Renaming and documenting columns, queries, and steps
Replacing values, blanks, and errors
Adding new columns
Adding new calculations
This is not a comprehensive list of all the types of changes (called transformations) you can perform, but to pass the PL-300 exam, you should be conversant with all the preceding transformations.
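To give you a feel for what these look like once recorded, here is a minimal sketch of a query that performs several of these transformations in M. The source file and column names are hypothetical, Northwind-style examples:

let
    Source = Csv.Document(File.Contents("C:\Data\customers.csv"), [Delimiter = ","]),
    Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    // Delete a column you don't need
    RemovedColumns = Table.RemoveColumns(Promoted, {"Fax"}),
    // Change the data type of a column
    Typed = Table.TransformColumnTypes(RemovedColumns, {{"PostalCode", type text}}),
    // Replace values (here, empty strings in the Region column)
    Replaced = Table.ReplaceValue(Typed, "", "N/A", Replacer.ReplaceValue, {"Region"}),
    // Add a new calculated column
    Added = Table.AddColumn(Replaced, "FullAddress", each [Address] & ", " & [City], type text)
in
    Added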
You use the ribbon tools to specify the transformations you want to perform for each query. Your mouse clicks are captured and recorded in individual steps, called Applied Steps. (If you have ever recorded a macro in Excel, this process will sound familiar.) Each step is “translated” into M and is visible using the Formula Bar (see Figure 4-6 for an example).
To become a very proficient “transformer” (sorry—couldn’t help myself!), you need to be very familiar with the extensive available commands. You do not need to know how to write M from scratch, but some basic M skills are helpful on the exam (and in real life!).
Turn on the Formula Bar under the View tab. Leave it on because it is the fastest way to absorb M while going about your transformations. Each action you perform using the ribbon is recorded in M and is visible in the Formula Bar, as you can see in Figure 4-2.
Figure 4-2. The Formula Bar visible
Figure 4-3. Column to be split (an example of a column to be split on the first comma)
Figure 4-4. Splitting the column by delimiter
Figure 4-5. One column split into two
Figure 4-6. The anatomy of a sample M statement
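For reference, the statement that this split generates in the Formula Bar looks roughly like this (the column names follow the running example):

#"Split Column by Delimiter" = Table.SplitColumn(
    Source,
    "Comment Text",
    Splitter.SplitTextByEachDelimiter({","}, QuoteStyle.Csv, false),
    {"Comment Text.1", "Comment Text.2"}
)

Each piece maps to what you did in the ribbon: the step name, the function, the table and column being split, the splitter (one delimiter, starting from the left), and the names of the two new columns.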
Here are some key characteristics of M:
M is a separate language from DAX. It is used in the PQE, not in the Desktop.
M is case sensitive. Pay attention to upper- and lowercase!
M is a functional language. You use a function to perform an action.
An M function has arguments (just like in Excel) that are separated by commas.
In M, when you see curly braces “{}”, that indicates a list.
To see the entire script you have written in the query you are working on, open the Advanced Editor on the Home tab.
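Opening the Advanced Editor on our running example would show a script along these lines. This is a sketch, not the exact generated code, and the file path is hypothetical:

let
    // Each Applied Step is a named expression; names containing spaces are written as #"..."
    Source = Csv.Document(File.Contents("C:\Data\comments.csv"), [Delimiter = ","]),
    #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    // {","} is a list with one item; remember, curly braces always mean a list
    #"Split Column by Delimiter" = Table.SplitColumn(#"Promoted Headers", "Comment Text",
        Splitter.SplitTextByEachDelimiter({","}, QuoteStyle.Csv, false),
        {"Comment Text.1", "Comment Text.2"})
in
    #"Split Column by Delimiter"

Notice the case sensitivity: writing table.splitcolumn instead of Table.SplitColumn would produce an error.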
Figure 4-7. Undoing or redoing a step
In addition to data not meeting your requirements, it often does not have user-friendly names. If your source is a database, you will often see column names prefixed with the term “DIM” or “FACT”—more on these prefixes in Chapter 5. You should rename columns and queries with names that make sense to your report readers. Two columns in the same query cannot have the same name, but you can have columns of the same name in different queries. In fact, you should make sure that your key columns have the same names in different queries. That will make it easier for you to relate them in the data model and easier for your report reader to know that they represent the same value.
You may have seen names where there are two words joined by an underscore (Name_Name2). Should you have spaces in the names of your columns and queries? Spaces are more familiar to your users, but if you have a space in your table name, you will always have to enclose it with single quotes (e.g., ‘Name Name2’).
Get in the habit of always putting your table name in single quotes—then, when a table name requires single quotes, you won’t be wondering what is wrong.
If there is a naming convention in your organization, follow it. If there isn’t a naming convention, establish one so that column and query names stay consistent.
Rename all your columns in a query as one Applied Step. That way, when you look for the step where you renamed a column, every rename will be in one place. It doesn’t matter whether you do this at the beginning or end of your process, but because you may be creating and deleting columns, you may want to save it until close to the end.
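In M, a single rename step looks something like this (a sketch with hypothetical column names; the PQE generates it for you when you rename column headers):

= Table.RenameColumns(#"Removed Columns", {{"DIM_CustomerName", "Customer Name"}, {"FACT_SalesAmt", "Sales Amount"}})

Each rename is a two-item list of {old name, new name}, and all of them live in one Renamed Columns step.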
A screenshot of the Power Query Editor with filter functionality option. The option removes data from the data model. The screen has a list of countries with marked check boxes except for 2 countries.
Filtering out data
A screenshot of the applied steps panel with the step renaming option. The screen opens a menu with rename option for the filtered rows in the applied steps section.
Renaming the resulting step
A screenshot of the applied steps panel. The last option from the list is highlighted. It reads removed the United Kingdom and the United States.
Much easier to find when in a panic
A screenshot of the All Properties window for query description. The query settings on the right highlight the all properties option. The screen has entry fields for query properties on the left.
Documenting a query
A screenshot of the applied steps panel with the step description option. The properties option from the menu opens a dialogue box titled step properties. It has entry fields for name and description.
Documenting a step
Bonus: If you add a description to a step, a little circle with an “i” in it appears next to it in your Applied Steps list.
I always document steps when I filter data, remove columns, or make other changes that may not be obvious to me or a colleague after some time has passed.
There’s an old expression: “a stitch in time saves nine.” It really applies to documentation—do it while you are doing the work. You (or your colleagues) will be grateful later!
Any documentation you add will be visible once you have applied your changes (see “Loading Your Queries (or Not)” below).
A screenshot of the file menu bar with an open view tab. The checkboxes for column quality, distribution, and profile are marked.
Turning on column profiling
Each of these features will introduce new elements to your screen real estate, so you may not want these features on all the time. Looking at them when you first bring in the data is a good practice; then you can turn them off after you have assessed your data.
Are there empty or blank values in a column? Decide what your strategy will be for filling these blanks. Do you want to put in a “0” or an “N/A” or a “blank”? Using Replace Values is a simple way to make these changes, but make sure you know if the empty value is truly blank or is a null. Blanks and nulls are treated differently in the Power Query Editor.
A screenshot of a customer I D column. A note mentions that the I D column has a 1 over 1 ratio between the distinct and unique values and none of the values are repeated.
Example of a high-cardinality column
A screenshot of a country column. A note mentions that the column has 21 different countries with only 3 countries listed only once. The column has a very low cardinality.
Example of a low-cardinality column
What is the distribution of values? Do you see different types of data in the same column (e.g., text and numeric)? Part of your work will be to make the values in a column a consistent type (either text or number).
A screenshot of the data columns with a profile shortcut bar. Simplified data profiling is available when hovering over the bar under the column header. The bar magnifies the customer I D column data.
Simplified data profiling is always available
It makes intuitive sense that every collection of data has something that uniquely identifies each row (or data point). Otherwise, how would you tell rows apart? Or identify duplicates? A column that contains a unique identifier is called a “key” in database terminology. A key can be native to the data (e.g., an employee ID number in a table with employee information). It can also be something that the database designer has added. Regardless of where the key comes from, it is a unique identifier for the data, and there must be one present for the data model to be constructed.
A screenshot of the customer I D column. A note mentions that the I D column has a 1 over 1 ratio between the distinct and unique values and none of the values are repeated.
A good example of a key column
If you have two columns with a 1/1 ratio, scrutinize each of them to see which you should keep. A key column has (by definition) high cardinality, which means it will be very “expensive” to “store.” I hate to keep putting you off, but we will get to cardinality. (It gets its own section!)
When a key column is present in another table, it is called a “foreign key.” When it is in its own “home” table, it is called a “primary key.” You may not see these terms directly on the exam, but the concepts they represent are critical. You can connect two tables together by matching the primary and foreign key columns. If we were working in a database, we would call this process “joining.” In Power BI, the term “join” is not used; rather, we create “relationships.” You need to have relationships between tables for the data model to work. (See Chapter 5.)
As part of your transformation process, you must ensure that each query has a column that serves as the “key.” Simplify this column as much as you can. For example, if the ID has two components (1234-56789) and only one of the components can serve as an ID, split the column and reduce the key down to the minimum number of characters to uniquely identify each row.
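For example, if only the portion before the hyphen is needed to identify each row, a step like this sketch (assuming a hypothetical column named “ID”) keeps just that portion:

= Table.TransformColumns(#"Previous Step", {{"ID", each Text.BeforeDelimiter(_, "-"), type text}})

You could also use Split Column by Delimiter on the ribbon and then remove the unneeded half; the result is the same.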
For Future Reference: Native and Surrogate Keys
A native key is something that the business (or the data owner) uses to uniquely identify a data point, for example, employee ID for employees.
A surrogate key is a unique ID that is added to the data by a database owner. The advantage of a surrogate key is that it is independent of the business, so does not change, even if the business changes its methodology for identification.
For example, the HR department assigns an employee the ID number XYZ123. The database administrator assigns the same employee the ID number 123456. Both IDs are unique, but the ID 123456 is the surrogate key and never changes even if the HR department changes their ID taxonomy to 123XYZ.
Currency should be stored as a Fixed Decimal type. This data type stores a number at four digits of precision after the decimal point.
A Fixed Decimal type is not the equivalent of a Decimal type.
A Decimal type stores up to 15 digits, including numbers before and after the decimal place. For example, if I have a number that is 13 digits before the decimal place, the Decimal data type will only store two digits after the decimal place, resulting in a loss of precision.
Dates should be stored as a Date data type. If you need to store a value for Time, store it in a separate column. (Don’t worry about formatting a date column—that happens in the Power BI Desktop.)
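In M, Fixed Decimal corresponds to Currency.Type, and a date column uses type date. A combined Changed Type step might look like this sketch (column names are hypothetical):

= Table.TransformColumnTypes(#"Previous Step", {{"Unit Price", Currency.Type}, {"Order Date", type date}})

Setting both in a single step keeps your Applied Steps list tidy.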
A screenshot of the using locale option in the order date dropdown menu. The using locale option enables the conversion of dates or currency to different formatting conventions.
Accessing Using Locale in the Data Type menu
A screenshot of the window titled change type with locale. The date option is selected from the data type dropdown menu. A note mentions that the data type in the using locale menu is chosen as normal.
Set the data type as normal
If you can store a value as a Whole Number, do so. A whole number is going to be the “cheapest” way to store values.
If you can avoid storing a value as a Text value, do so. String (text) values are the most “expensive” way to store values.
If you have two columns in two different queries that will be used to link the two tables together (once we get to the Power BI Desktop), make sure that the two columns have the same data type. (As mentioned earlier, it is nice if they have the same name as well, but that isn’t strictly required.)
Food for Thought
A colleague told me he once saved 30% in his overall data model size by changing a column from the text data type to the whole number data type. Indeed, data types matter!
Be deliberate and intentional in your data type choices. Checking the data types should be the last thing you do before you finish your query transformations. Watch out for extra Changed Type steps. You are usually safe removing these as they appear in your list of steps. Most of the time they are unnecessary. (There are always exceptions. If the Changed Type is necessary to perform a step below it, do not delete it. You will break the subsequent steps.)
This is the feature you don’t know you need until you break it. When you connect to a data source that is powered by something (meaning a server), you can “fold” the work of doing the transformations back to that engine. For example, if you connect to a data source on SQL Server, there’s an excellent chance that the server supports folding. If so, the PQE is going to fold back the transformations to the server to perform them and thus get them done faster. (Servers typically have more power than our laptops.) But there are some transformations that will break query folding, and changing a data type is one of them (usually). That’s why I said that you should delete any gratuitous Changed Type steps that the PQE creates. That is also why I said that changing data types should be the last step you perform on your query. All the previous steps will be folded back, speeding up the process.
There are sources that have no underlying “engines”: Excel, CSV, and PDF come to mind. When a query originates from one of these types of sources, there is no query folding. One school of thought is “don’t worry about breaking query folding in these queries.” But I disagree.
Practice the same good habits regardless of whether the query type supports folding. Then you won’t accidentally break folding when it is available because of sloppy practices.
A screenshot of the applied steps panel. The navigation option in the panel is highlighted and the view native query option is selected from a popup menu. The selected option has a bright background.
Query folding is occurring in this step
A screenshot of the native query window. It has 15 lines of code.
Example of a native query
One final note: Query folding is probably not going to show up on the PL-300, but while I want to help you pass the exam, I also want you to become a skilled Power BI practitioner. Knowing how to use query folding to speed up your transformations is one skill that you should have.
Your Chance to Practice
Look up Alex Powers’ 30 days of query folding on YouTube. It is a great opportunity to understand what steps do (and don’t) break folding.
A screenshot captures the path for data replacement. The screen has a vendor name column in the backdrop. A text reads, right-click on the column with null values and choose the replace values option.
Replacing values in a column
A screenshot of the replace values dialogue box. It has 2 entry fields for value to find and replace with. A note mentions that for null values, null is entered into the value to find box.
For null values, be sure to enter “null”
For blank values, leave the Value To Find box empty.
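Behind the scenes, the Replace Values dialog writes a Table.ReplaceValue step. Replacing nulls in a hypothetical “Vendor Name” column might look like this sketch:

= Table.ReplaceValue(#"Previous Step", null, "N/A", Replacer.ReplaceValue, {"Vendor Name"})

For a truly blank (empty string) value, the second argument would be "" instead of null, another reminder that blanks and nulls are not the same thing in the PQE.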
A screenshot of data columns with error cells. Clicking an error cell in the vendor number column opens a message at the bottom that reads, data format error with an invalid cell value.
Displaying the reason for the error
A screenshot of a data column with a data quality peek bar under the header. By right-clicking the bar, a popup menu opens which is specific to errors.
Several options for handling errors
A screenshot of the profiling behavior change facility. The profiling can be changed from the first 1000 rows to the entire data set by clicking the options in the bottom left corner of the window.
Changing the profiling behavior
A screenshot of the product code column data. Once the profiling behavior is changed, the data quality peek reflects all the rows and the error percentage.
After changing the profiling behavior
This example does not have errors, but if it did, they would be reflected in the Error category in the Data Quality peek.
A screenshot of a popup menu that is specific to errors. The menu opens by right-clicking the data quality peek bar. The keep errors option is selected from the menu.
Choose Keep Errors
A screenshot of the data column rows with errors. The keep errors option shows just the errors in the query. The applied steps panel on the right highlights the kept errors options.
Now only the errors are displayed
Now you can diagnose the errors and take the appropriate measures to fix them.
If only the data sources to which we connect had all the data we needed! But all too frequently the source(s) you are using lacks something important. There are two types of new data you can create in the Power Query Editor: a new column or a new table.
Column from Examples
Custom Column
Conditional Column
Index Column
Duplicate Column
A screenshot captures the column addition process. The United States of America is typed in the sixth column on the right, and Power BI determines a pattern and suggests it for the remaining rows.
Adding a column from examples
A screenshot of the custom column window. It has entry fields for a new column name and a custom column formula. A note mentions that the formula in the custom column has to be written in M.
Adding a custom column
I have provided these two examples to illustrate that there will often be several ways to perform the same task. Which method you choose will vary based on your understanding of each technique and the requirements of your task.
To create a new query, you have several options. You can create a blank query—essentially a blank sheet of paper. You can also duplicate an existing query, something we will cover in detail later in this chapter.
I tell my students that the Power Query Editor is very forgiving. It will allow you to easily fix or change a step that you created earlier in the process; very few decisions are irreversible. With that in mind, be ruthless as you look at your queries. Don’t need a column? Remove it. Don’t need an entire query but just part of it? Combine it with another query, either as a merge or append. One of your main objectives is to streamline and simplify the data before you bring it into the Power BI Desktop. Do not bring a query or a column over “just in case.” You can always come back for it.
A screenshot captures the path to append queries together into a new query. The path includes home, append queries, and append queries as new.
Appending queries as a new query
A screenshot of the append window. A box on the left lists the available tables, an add button with right arrowheads is at the center, and another box has tables to append.
Appending multiple queries into a new one
A screenshot of a window that lists the queries. The new query titled Append 1 is added to the bottom of the query list.
The new query
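The M behind an appended query is a single Table.Combine call over a list of queries (a sketch with hypothetical query names):

= Table.Combine({Sales2021, Sales2022, Sales2023})

Because the argument is a list (note the curly braces), you can append as many queries as you like in one step.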
The mechanics of merging two queries are like those you use when appending, with one key difference. When appending, it is a good idea to ensure that the structures of both queries (number, name, data types of columns) are the same. With a merge, it is important to understand exactly how you want to merge the two queries. There are seven different join patterns, which I will explain in the following.
A screenshot of 3 query tables titled query A, query B, and merged query. All rows from query A are kept; matching rows from query B are merged in, and rows without a match show null in the merged query.
Left outer join: all rows from the left, matching from the right
A screenshot of 3 query tables titled query A, query B, and merged query. All rows from query B are kept; matching rows from query A are merged in, and rows without a match show null in the merged query.
Right outer join: all rows from the right, matching from the left
A screenshot of 3 query tables titled query A, query B, and merged query. All rows from query A and query B are merged.
Full outer join: all rows from both queries
A screenshot of 3 query tables titled query A, query B, and merged query. Only the rows that are matched in both queries are merged.
Inner join: only rows that are present in both queries
A screenshot of 3 query tables titled query A, query B, and merged query. Only the rows from query A that are not matched in query B are merged.
Left anti join: only rows in Query A that are not matched
A screenshot of 3 query tables titled query A, query B, and merged query. Only rows in query B that are not matched in query A are merged.
Right anti join: only rows in Query B that are not matched
A screenshot of 3 query tables titled query A, query B, and merged query. The rows from both query A and query B that are not matched are merged.
Full anti join: only rows from either query that are not matched
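In M, a merge is written with Table.NestedJoin, where the last argument selects the join pattern. A left outer merge on a shared key column might look like this sketch (query and column names are hypothetical):

= Table.NestedJoin(QueryA, {"ProductKey"}, QueryB, {"ProductKey"}, "QueryB", JoinKind.LeftOuter)

Other patterns use JoinKind.Inner, JoinKind.RightOuter, JoinKind.FullOuter, JoinKind.LeftAnti, and JoinKind.RightAnti. After the merge, you expand the nested “QueryB” column to pick which of its columns to keep.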
A screenshot captures the process of creating a duplicate query. The screen has a list of queries and clicking right on a query opens a popup menu. The duplicate option is then selected.
Duplicating a query
A Matter of Style
Lots of choices you make in the Power Query Editor are questions of style. When appending files, I prefer to append (or merge) them all into one new query, leaving the originals as is. But this isn’t a “must-do.” Experiment with different methods so that you can choose the right one in each circumstance.
A screenshot captures the process of creating a referenced query copy. The screen has a list of queries and clicking right on a query opens a popup menu. The reference option is then selected.
Referencing a query
A screenshot captures the path to delete a query. The supplier query is selected from the list. Clicking right on the query opens a popup menu. The delete option is then selected.
Deleting a query
You can’t delete a query that is being used in another query (e.g., if you have appended or merged it into another query).
If your data has millions of rows, you do not need (or want) to work with all the rows to decide on your transformations. Or you may not want certain rows to appear in the final report. Either way, you can filter your data in the PQE to limit what you are working with. Filtering excludes rows that do not meet the filter conditions. These rows will not appear in the finished report. If you want to work with only a subset of the data during the report development process, remember to remove the steps in which you limited the rows before publishing the report.
A screenshot of the filter panel in the manufacturer column. The load more option on the bottom right is clicked to view more values in case the displayed list of manufacturers is incomplete.
Using the Filters panel to see all the unique values in a column
A screenshot of the product name column with a filter panel. The filter panel displays 1000 unique values.
Only 1000 unique values can be displayed
A screenshot of the text filter and the product name column. A note reads, if the data has more than 1000 unique values, then a manual filter like a text filter is used to build a filter.
Creating a manual filter
A screenshot of a window with multiple data columns. The screen has row numbers well past 1000 and the scroll bar on the right is used to scan through the data.
Scrolling through data in the Preview window
Scrolling through thousands of rows of data is tedious at best and impracticable at worst, which is why working with representative data is so important. If the data you have is very inconsistent, you may need to go back to the source to clean it prior to being able to use the Power Query Editor.
Once you create a step to filter out rows, I would strongly recommend renaming and documenting that step (see earlier).
Columns are renamed, added, or deleted.
Files are moved, added, or deleted.
When you are working with a database, the data is probably much more stable. But if your data source is Excel, you have a higher likelihood of having columns changed. You need to build your transformation script such that if a change happens, your script can handle it without breaking. This is called future-proofing. The first step is to make sure your Formula Bar is visible and read the M for every step you create.
= Table.RemoveColumns(#"Split Column by Delimiter",{"Vendor Number"})
This step will remove a column called “Vendor Number”
= Table.SelectColumns(#"Removed Columns",{"Vendor Name","CommentText.1","CommentText.2"})
This step will only keep the explicitly named columns
In Figure 4-49, one column has been excluded explicitly by using Remove Columns. What happens if the next time the data is refreshed another column is added? Will the new column appear in the report? In this example, the answer is yes because it has not been specifically excluded.
In Figure 4-50, three columns have been specifically included using Select Columns. What happens if the next time the data is refreshed another column has been added to the data source? Will the new column appear in the report? The answer is, no, it will not because it has not been specifically included.
There are many techniques to write M so that it is flexible and can handle changes to the source data, almost all of which are beyond the scope of this book. But there are several good books that have been published in which you learn how to future-proof your queries, which I will list at the end of the chapter. For the PL-300, learn to recognize an M expression that will break your query or perform unexpectedly. For real life, invest some time in learning how to future-proof your queries—your future self, and colleagues, will thank you.
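One small future-proofing technique you can apply right away: many table functions accept an optional MissingField argument, so a step does not break if a column disappears from the source. A sketch:

= Table.RemoveColumns(#"Split Column by Delimiter", {"Vendor Number"}, MissingField.Ignore)

With MissingField.Ignore, the step simply skips a column that is no longer there instead of throwing an error.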
A screenshot of the query list. Seven queries out of 8 have the warning icon that indicates the presence of errors.
A lot of alarming errors!
Hopefully you won’t often see this many errors, but the resolution process is always the same. Start with the step that has the error and then select each step preceding the one with the error, until you find a step that isn’t broken. Then identify what happened between the last working step and the first broken step.
The good news is that most errors are one of two types: a broken file path or a prior step in the wrong sequence. I covered broken file paths in the previous chapter, so here I will talk about a few common types of steps that break other steps. (I can’t cover every possible error type.)
Usually, you set your data types as your last step in the query because the Changed Type step will break query folding (see earlier). However, there are times when you need to perform an operation on a column or columns. If the existing data types are not compatible with that operation (e.g., you try to multiply two text columns), you may produce an error. The solution in this case is to insert a Changed Type step before the operational step.
A screenshot of an error message for a renamed column. The renamed column option in the query settings was used to rename a column. The column could not be found in the subsequent added custom step.
I renamed a column too early!
No matter the type of error, to troubleshoot I start at the broken step and then look at each step above it in turn. In this case, I would find that the Renamed Columns step caused the problem, and I can fix the error by deleting that step. Sometimes you must do more radical surgery and rearrange or redo your steps.
A well-written M script can be a work of art. That may seem an extravagant statement, until you spend a considerable amount of time building one! A lot of thought goes into choosing the correct steps, in the right order, preferably without breaking query folding (see earlier), and planning for future changes to the data. It is nice to know you can reuse your work.
A screenshot of the advanced editor window. The screen displays 14 M statements. A message at the bottom confirms there are no syntax errors. The done and cancel buttons are on the bottom right.
The M for a query, in the Advanced Editor
While using the Advanced Editor is beyond the scope of the exam, I want you to know what it can do for you. If you need to modify several steps and you feel confident in your M coding skills, you can edit the M directly in the Advanced Editor. Additionally, M steps that you see in the Advanced Editor can be copied as plain text and pasted into another query.
Make sure that you have a comma after each step except the last one (the one just above the in statement).
Make sure that your M script starts with a let statement and ends with an in statement (unless you are using functions).
The in statement usually references the last step name.
The Source step (the first one) must point to the right location for your data source.
When copying and reusing M in another query, check anything that is “hard-coded” (in double quotes) to ensure that an item with that name is present in the destination query.
When copying and reusing M in another query, check all the step references to ensure that they are present in the new query.
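Putting those rules together, the skeleton of every query script looks like this (a minimal sketch; the file path and step names are hypothetical):

let
    Source = Csv.Document(File.Contents("C:\Data\Sales.csv")),
    #"Promoted Headers" = Table.PromoteHeaders(Source),
    #"Renamed Columns" = Table.RenameColumns(#"Promoted Headers", {{"Amt", "Amount"}})
in
    #"Renamed Columns"

Each step references the one before it, commas separate the steps, and the in statement returns the last step.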
An even easier way to reuse the M script is to copy the query on the Queries panel. Select the query on the Queries panel (on the left side of the PQE) and then Ctrl+C. Open the PQE window for another PBIX file and Ctrl+V. The entire query is now available to the second PBIX file.
A screenshot of the advanced editor window. The screen displays 14 M statements. The highlighted statements from 7 to 14 can be copied and pasted into the advanced editor window for another query.
Sample of an M script to be copied into another query
Step names are not the same in the two queries.
Column names are not the same in the two queries.
The source reference is different (this is usually in the top step).
The comma at the end of a step is omitted.
A comma is included after the last step.
Take your time and pay attention to the details. Don’t try to edit in a hurry; that is the fastest way to break your M script.
Once you have completed your work, it is time to load your changes into the data model. It is important to understand exactly what is happening when you do this. The first thing that will happen is that the script, as it is written, is applied to the full set of the data that you have included. Sometimes, though, you don’t want to load a particular query. Perhaps it is a “helper” query. A good example of a “helper” query is one that you appended into another query (as in the example earlier in this chapter). In this scenario, you don’t need the original query as you have already appended (or merged) it into another query.
A screenshot of the query loading options. A query is right-clicked to open a popup menu. The enable load option is unchecked to disable the loading of the query which is now indicated in italics.
Disable loading to the Desktop
Document your queries and steps.
Keep like steps together.
Name your steps.
Data types matter—be intentional in selecting them.
Keep the Formula Bar open and read the M for each step.
Future-proof your queries to the best of your current ability.
Reuse your M script as/when appropriate.
Change data types as close to the last step as you can.
Collect, Combine, and Transform Data Using Power Query in Excel and Power BI by Gil Raviv
Master Your Data with Excel and Power BI by Ken Puls and Miguel Escobar
In Chapter 15, I will provide additional resources, including YouTube and other Internet resources.
What is a data model and why is it necessary? This is (perhaps) the most difficult concept to understand for a business user because Excel works in a flat world. Any summarization you want to perform requires that all the data is on the same worksheet. So, at most, you must use some LOOKUP functions to “retrieve” the data you need to include. While having everything on one worksheet is easy to comprehend, it is unwieldy and inefficient. Enter the data model.
A data model allows the logical combination of different groups of data (arranged in tables) without physically combining them. The model is the structure within which you will define these connections (relationships) between the tables.
Of all the skills we cover together in this book, data modeling may be the most critical to your success as a Power BI practitioner. Certainly, there are questions on the exam that test your knowledge of data modeling, and that is our main purpose here. However, when you use Power BI, by far most of your challenges will arise because of your data model structure. A good data model is vital for a performant Power BI report; a bad data model will hinder the speed and responsiveness of the report despite your best efforts. In this chapter I explain the basic principles of data modeling that you should know, both for the exam and in real life.
As we saw in Chapter 4, a query is the form your extracted data takes in the Power Query Editor. Once you finish all your transformations and apply your changes, those queries become tables inside the Power BI Desktop. (For example, if you extract data from an Excel worksheet, it first becomes a query in the Power Query Editor and then a table once it is loaded into the data model. The same is true for any extracted data.) Once all your data has been loaded, you will have at least one table and usually many more. Now we need to define the connections between all these disparate tables. (Remember, they may come from a wide variety of different sources and may never have been combined before. That’s part of the magic of Power BI!) Before we explore relationships, I need to define some important terminology.
Fact Tables
A fact table contains the data you are reporting on. In retail, it could be sales transactions. In medicine, it could be appointments and/or outcomes. In manufacturing, it could be data from each production line. Fact tables record things that have happened (usually), and so you will almost always see a date (and maybe time) stamp as part of a fact table record. A fact table ideally only contains values that are in numeric format (quantity; date; time; price; unique IDs for customers, patients, and products; etc.). For example, if the fact table contains sales transactions, each record will have an ID for the customer and for the product. Ideally, this ID will be an integer (whole number). This ID is a foreign key that allows the fact table to be related to the dim table for the customer (or product). The fact table is usually the longest (the greatest number of rows) of the tables in the model, and it can have many rows for each element such as product or customer or store. There can be more than one fact table in a model, but that is a more advanced modeling problem, so I assume only one fact table in this book.
Dimension Tables
Dimension (dim) tables describe the elements of your fact table in greater detail. A dim table usually has a lot of columns (it is very wide), many of which are “strings” or alphanumeric. In the dim table, the ID number for the product (or customer or patient) is a primary key. (See Chapter 4, “Keys and IDs,” for more information.) In the dim table, the record for that ID number is very detailed: the customer’s name, address, birth date, demographic info, etc. Dimension tables have just one row for each element being described (customer, product, store, etc.). Models usually have multiple dimension tables. Ideally, dim tables connect directly to the fact table, but they can be related to other dimension tables.
A quick note about other types of tables: There can be other types of tables in a data model, such as a factless fact table or a bridge table. This book does not tackle those concepts, nor will the PL-300.
Dimension table fields are descriptive; they should be used as labels on charts (e.g., rows, columns, and axes).
Fact table fields are the values that are analyzed in a visual (e.g., summarized, averaged, counted).
There are multiple types of relationships that Power BI supports. Let’s walk through each of them. But first a word for those of you with a database background: relationships in Power BI are similar to but not the same as a join. The purpose of a relationship in the Power BI context is to transmit filters. (Much more on filters later.)
A screenshot has three tables for the product, store, and sales. Each has different options. The product key option under the product and sales is selected.
A one-to-many relationship between a dimension and fact table
A screenshot has two icons for one and an asterisk connected by a line.
The one-to-many icons
If you can’t figure out which table is your fact table, look for the table that has lots of rows with repeating customer IDs or product IDs or store IDs (the ID will depend on what data you are reporting on). Each fact will usually have a date (and maybe a time) associated with it.
A screenshot has two tables for product and sales connected by an arrow. Each has different options. The product key option is selected in both.
Filter transmission from the Product to the Sales table
A screenshot has two tables for dimension and fact. Each has a different number of customers from A to D.
A dimension table with multiple rows for each customer
How can you select the right rows in the dimension table? What is the difference between each row listing Customer A in the dimension table? The multiple rows in the fact table are to be expected; after all, you hope that there are multiple sales for each customer. We can create a many-to-many relationship between these two tables and then ask the engine to process through the records. Here’s the problem, and it is a big one: You run the risk of duplicate counts or of records not being included. Either way, your report will be inaccurate. If your tables contain millions of records, you are not likely to spot the error. Your report will also be slow.

If you must create a many-to-many relationship, you can mitigate its potential ill effects by setting a filter direction (see below). But your best course of action is to redesign your model before resorting to a many-to-many relationship. Return to the Power Query Editor and simplify the dimension table. How you do this will vary based on your data, but keep the goal in mind: you want the dimension tables to contain one unique record for each element described (customer, store, product, etc.). Usually, you must simplify the dimension table (de-duplicate, add a field, etc.), which will result in the dimension table having a unique record for each element. Apply your changes to the model and create a one-to-many relationship.
A screenshot has three tables: dimension table A, B, and dimension table merged. For both A and B, there are 4 customers with attribute. For the merged table, each customer has 2 attributes.
Two dimension tables merged
Return to the Power Query Editor and combine the two tables (it will probably be a merge operation). Apply your changes; you will only have one table and can proceed to create the standard one-to-many relationship.
Unlike the many-to-many relationship, the one-to-one relationship doesn’t come with any significant disadvantages. Notwithstanding, if Power BI suggests a one-to-one relationship, you should try to merge the two tables to streamline your data model.
What happens if a table does not have a relationship with any other tables in the model? If you try to use a field from the disconnected table in a visual with a field from another table in the model, you will usually see repeating values. There are situations in which a disconnected table can be useful, but they are not in the scope of this book and are not included in the exam.
If you see repeating values in a visual, the first place to look is in your model. There is a missing relationship or a problem with the existing relationship between the two tables. Do not waste time trying to “fix” the visual—that is almost never the problem.
There can only be one active relationship between two tables, but there can be as many inactive relationships as necessary. We will explore one of the most common use cases for inactive relationships below.
As soon as you get started with Power BI, you will start reading and hearing about star schemas. This term refers to the “shape” of your data model, but it is much more metaphorical than physical. In a star schema, the fact table (usually just one) sits at the “virtual” center of your tables. The dimension tables are arranged in a circle around the fact table. Each dimension table is connected to the fact table through a relationship.
The star schema is not required. Rather, you should consider it the ideal that you are striving toward as you create your model. There is a common variant called the “snowflake” schema.
For Future Reference: Normalized Tables
A normalized schema is one in which tables have been deconstructed into their constituent parts. For example, if you have a Product table that has Subcategory and Category information in it, in a normalized schema, this table is broken down into three separate tables (or maybe even further). You will see this type of schema referred to as Third Normal Form, and it is quite common in data modeling generally.
Power BI aims for a happy medium. You don’t want to break down your tables to their furthest extreme, but you also don’t want all your data in one table (as in Excel).
Snowflake schemas are not the worst design for a data model, and sometimes they are necessary. Remember, the star schema is your ideal data model structure.
Now that we have covered relationships, we must address the direction of the relationship. Yes, relationships have a direction. In Power BI a relationship is not a join; rather, it is a method to convey a filter from one table to another.
A screenshot has three tables for the product, store, and sales. Each has different fields. The ProductKey is selected in the product table. The product and store tables are connected to sales via 1 and an asterisk.
Two dimension tables connected by single-direction one-to-many relationships
This behavior enables you to make a selection in the dim table where there is only one row for each customer (or product or store), thereby avoiding ambiguity in your selection. This selection is then transmitted from the dim table to the fact table via the direction of the relationship. As long as the direction is one way, you don’t have to worry about incorrect results arising from a bidirectional relationship.
A screenshot has tables for the date dim and employees connected by a bi-directional arrow. The text box reads a bi-directional relationship will have arrows pointing in both directions.
Bidirectional relationship
In a 1-1 relationship: These relationships are automatically created as bidirectional, and in this instance, the bidirectional filter generally won’t do much damage to your report. The risk of ambiguity is not present, because both tables have unique rows for each element.
In a many-many relationship: This is the scenario in which you do not want a bidirectional relationship, but a many-to-many relationship is always created as bidirectional. Usually, you are going to use one of the tables to filter the other table, so in this situation, mimic a single-direction relationship by setting the filter direction artificially. This selection will mitigate the potential for ambiguity because the selection is made in one table and is filtered to the other one.
With all that said, avoid bidirectional relationships as much as possible!
In the previous chapter, I mentioned cardinality, and I promised that I would go into great depth explaining this very important concept. First, a formal definition: A column’s cardinality is defined by how many unique (non-repeating) values it has. A column that is a “key” column has, by its very nature, high cardinality. In contrast, a column that has the same values repeated has lower cardinality. One of your primary jobs as a data modeler is to reduce or eliminate high cardinality everywhere you can, because a column with high cardinality is “expensive” to store.
For Future Reference: The VertiPaq Engine
The engine that powers Power BI is called “VertiPaq.” VertiPaq stores and accesses data in columns (vertically), not horizontally. This method makes VertiPaq very fast at retrieving data and performing computations. It uses several compression algorithms, all of which are less effective the higher the number of unique values in a column. Therefore, a column with high cardinality is “expensive” to store: the engine cannot perform as much compression.
If you are not using the time component, change the contents to date only.
When you do have a column with high cardinality that you cannot simplify (e.g., a key column), try to store that column as a whole number (integer) rather than text. Integers are easier to compress, so even if the column is “expensive,” you can mitigate the impact by selecting the most efficient data type.
The format of the data: You are not changing the data type (which is how the data is stored); rather, you are changing how it is displayed in your report. The only place you can change the format of the data type is in the Desktop; you cannot change the data format in the Power Query Editor.
A screenshot has drop-down menus for summarization and data categories. A pop-up window for fields has the option for the city highlighted.
Categorizing data using the Column tools menu in the Report view
Another use case for categorization is if you have a column with a URL that points to a storage location for an image. When you categorize that column as Image URL, and then use that column in a visual, it will display the image, not the text string for the URL.
Once you have successfully categorized a geographical field, a globe symbol will appear next to the field.
This is a Microsoft tool, which means that there are almost always at least two ways to perform a task. You can categorize data in the Model view as well, as shown in Figure 5-9.
A screenshot has a drop-down menu for the data category. The right side has a table of columns with the option city selected.
Data categorization in the Model view
The name of the column: Ideally, this is something you did in the Power Query Editor, but if you forgot or need to change the name of a column, you can do so in the Desktop. The change will propagate back to the Power Query Editor, if the data was loaded from the Power Query Editor.
The fact table may not have all the dates between the earliest and latest dates represented. What if the company in question does not accept orders, ship, or deliver on the weekends? (Improbable, I know, but work with me here.) The fact table will not contain any rows for dates that fall on a Saturday or Sunday. What if the company observes holidays that can fall on a weekday (Christmas, New Year’s Day, Fourth of July)? None of these dates will be represented in the fact table. These gaps in your data mean that you will not be able to make accurate calculations of time elapsed between an order date and a ship date (performed by a simple subtraction of dates) or any other calculations involving dates.
What if your company is based in a country whose weekend falls on Friday and Saturday?
Any reports that display information using your company’s specific calendar will not be correct, unless your fiscal calendar strictly follows the Gregorian calendar. For example, if your company’s fiscal year ends on June 30th, just using dates that are included in the fact table will not accurately portray your financial performance.
The built-in time intelligence functions in DAX will not work as designed. These functions will be covered in the next chapter, but here’s a sneak preview: you will want to use the time intelligence functions!
Use one that has been created by someone else and stored in a data warehouse. If you are lucky enough to have one of these, rejoice! All you do is to import that table the same way you would any other table from a data source. The good news is that it will be accurate, standardized, and complete.
Tip If your company does not already have a corporate date table, make it your mission to get one created and stored in an area where anyone using Power BI can access it.
Create one in the Power Query Editor. This method has the advantage of taking up less “space” in your model (more on that in the next chapter). You can readily find the M code for date tables that have been created by experts in M such as Ken Puls and others. Or you can watch one of the several excellent YouTube videos on the subject and create one yourself.
You can use DAX to create a date table in the Power BI Desktop. As with the M code date tables, there are excellent examples readily available on the Internet. Most would agree that the best examples are those provided by SQLBI (I have no affiliation with them, just a big fan). In the next chapter, wherein we explore DAX more fully, I will return to using DAX for your date dimension table.
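As a preview, a minimal DAX date table might be sketched like this (the date range here is hypothetical; derive yours from the earliest and latest dates in your model):

DateDim =
ADDCOLUMNS(
    CALENDAR(DATE(2020, 1, 1), DATE(2024, 12, 31)),
    "Year", YEAR([Date]),
    "Month Name", FORMAT([Date], "MMMM"),
    "Month Sort", YEAR([Date]) * 100 + MONTH([Date])
)

CALENDAR generates one row per date in the range, and ADDCOLUMNS adds the descriptive columns, including the sort column discussed next.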
Establish a numerical sort column for the months of the year. Otherwise, when you use the month column in a report, it will be sorted alphabetically (e.g., April, August, February, etc.).
Best Practice In your sort column, include the year number and the month number (e.g., 202101, 202102, 202103). That way, the months and years will be sorted appropriately in your report.
Mark your date table. You will need to have one column that has a row for every date between the earliest and the latest date present in your model. This column must be formatted as a date. Once you have successfully marked the date table, you see a little icon next to your date column that looks something like an ID card as shown in Figures 5-10 and 5-11.
A screenshot has the tab for table tools selected. It has an option for mark as date table. The right side has a table for fields with the datedim highlighted.
Marking a date table
A screenshot has a table for fields. It has different options, and the date dim is selected. The date option has an icon alongside.
The date column indicates that the date table has been marked
For Future Reference: Shadow Date Tables
A screenshot for options has tabs for global and current files highlighted. A text box details the steps to disable the time intelligence option.
Disabling “shadow” date tables forever
Connect the date dimension table to the fact table via a 1-many relationship. (Remember, the date dim table is a dimension table just like any other in your model. There is one row for each date between the earliest and the latest date in your model.)
Now that I have covered the why and the how of creating a date dim table, we can return to the first statement of this section: almost every data model will have multiple fields that are dates. To use each date field in a report visual accurately, the field must be connected to the date dimension table. But remember: there can only be one active relationship between two tables, which leads us to inactive relationships.
'FactTable'[OrderDate] to 'DateDim'[Date] (active)
'FactTable'[ShipDate] to 'DateDim'[Date] (inactive)
'FactTable'[DeliveryDate] to 'DateDim'[Date] (inactive)
A screenshot has 2 tables for orders and datedim with different options. Both are connected via 1 and an asterisk.
Active and inactive (dotted lines) relationships
Usually, the field that you use most in your reports is the one that has the active relationship. In this example, the 'Orders'[OrderDate] field is the most used in reports, which is why it is the active relationship. If 'Orders'[ShipDate] is more commonly used in reports, then the active relationship could be between 'Orders'[ShipDate] and 'DateDim'[Date].
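To use an inactive relationship in a calculation, you activate it with the DAX function USERELATIONSHIP inside CALCULATE. A sketch, assuming hypothetical table and column names and a simple sales measure:

Sales by Ship Date =
CALCULATE(
    SUM('Orders'[Sales Amount]),
    USERELATIONSHIP('Orders'[ShipDate], 'DateDim'[Date])
)

For the duration of this measure, the inactive ShipDate relationship takes over from the active OrderDate relationship. (Measures are covered in Chapter 7; this is just a preview.)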
FactTable[OrderDate] to OrderDateDim[Date] (active)
FactTable[ShipDate] to ShipDateDim[Date] (active)
FactTable[DeliveryDate] to DeliveryDateDim[Date] (active)
Every table that you add increases the size of your model, so think carefully before you choose role-playing instead of using active/inactive relationships. If your model is small overall, then the penalty for the extra tables may not be important for performance. But if your model is already complex, with lots of tables, opting for active/inactive relationships is probably a better choice.
Creating the correct relationships is critical to the success of your report. Without the proper relationships, your visuals will be, at best, inaccurate. At worst, they will be wrong. Relationships play an important role in the performance of your report. A bidirectional relationship can slow your report significantly, among its other drawbacks.
There is a rule of thumb in the Power BI community that you will spend 80% of your time cleaning, transforming, and modeling your data. If you find yourself wondering when you can get to building a report, be patient! Once the model is complete, you will be amazed at how fast the report comes together. In the meantime, let’s move to perhaps (?) the most challenging part of modeling: writing DAX.
In the previous chapter, we covered the basic principles of creating a data model. A star schema, dim and fact tables, and relationships are the building blocks of a data model. Once you have those in place, you can proceed to refine your model. The one assumption that we made was that all the data we needed was either in the source(s), or we created it using the Power Query Editor. But sometimes that isn’t sufficient or possible; we need to explore how to add data using DAX in the Power BI Desktop.
Data Analysis Expressions (DAX) is the language used to create four types of content in the Power BI Desktop:
Calculated tables
Calculated columns
Measures
Security roles
In this chapter we will cover calculated columns and tables and security roles in some depth. We are saving measures, and a deeper examination of DAX, for Chapter 7.
It is best to add data that you need “as far upstream as possible,” meaning at the source. But quite often you do not have access to the source data or permission to change it. The next best option is to add the data in the Power Query Editor. Let’s assume that adding the data in the source or in the Power Query Editor is not possible. In that situation, you will need to use DAX to add the content in the Power BI Desktop.
The Formula Bar isn’t visible unless you are creating something (table, measure, column) using DAX or you select something that was created using DAX. If you want to hide the Formula Bar, click the X. Don’t worry—you won’t delete anything that has already been created.
A screenshot has the tab for table tools selected. It has a sub-option for the new table on the right and a formula bar below it.
Adding a table by copying an existing one
Creating a calculated table can be a significant drag on the performance of your report because it is not compressed as effectively by the VertiPaq engine.
Calculated tables (and columns) are initiated when the report opens. Their values do not get recalculated unless the model is reinitiated.
A screenshot of the fields. It has a calculator icon over the options for an average customer order, copy of customers, date dim, large lines, and proposed discount.
Calculated tables are marked with a different icon
Avoid creating calculated tables if you can. Your best option is to create the needed content (tables or columns) at the source. Because modifying the source is often impossible, the next best option is to have tables and columns in the Power Query Editor.
When you want to use the resulting values in the column in a slicer or filter.
When the calculation must be performed row by row to be accurate. In Figure 6-3, you can see an example, using an Excel table for ease of presentation.
A screenshot has a table with four columns for item name, unit price, quantity sold, and total sales. A text box below reads, if we multiply unit price and quantity sold, the answer is incorrect.
Example of a row-by-row calculation
A screenshot has five columns for the order I D, the product I D, unit price, quantity, and discount. A text box reads, these are the original columns with no column for subtotal or a net total.
The original columns in the query; none provide a total
A screenshot has eight columns for order and product I D, unit price, quantity, discount, subtotal, discount value, and line total. A text box reads, cross-check the results of each D A X expression in the columns.
Compare the DAX expressions to the results in the columns
They are written the same way you would write them in Excel.
There are no functions (e.g., SUM, AVERAGE, COUNT) in the expressions.
Each column has a unique name.
Column names (in square brackets) are fully qualified by their table name.
Common mathematical operators are used (*, –, +).
Table names have single quotes around them.
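Written out, the three calculated columns from the example might look like this sketch (the table and column names follow the figure and are illustrative):

Subtotal = 'Orders'[Unit Price] * 'Orders'[Quantity]
Discount Value = 'Orders'[Subtotal] * 'Orders'[Discount]
Line Total = 'Orders'[Subtotal] - 'Orders'[Discount Value]

Each expression is evaluated row by row, which is exactly why a calculated column (rather than a simple aggregation) gives the correct answer here.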
DAX “thinks” primarily in columns. It can also “think” in rows, and when it does, it is iterating over a table or a column.
(Yes, I am speaking as if DAX is a living breathing organism, which of course it is not. But bear with my literary flights of fancy. Maybe once you have worked with DAX, you too will think it has a mind of its own!)
There are several ways to refine your model. They aren’t all necessary, but they are useful touches that make your model easier for your report consumers to understand.
Dates: Year, Quarter, Month, Date
Products: Category, SubCategory, Product
Organization: President, Vice President, Director, Manager, Supervisor
You make it clear to the user that the fields relate to each other in a hierarchical structure.
You make it easy for the report creator (who might be you!) to drag all the related fields on to a visual at once.
You create a clear “path” for drilling down in a visual.
Hierarchies can be created either in the Report view or in the Model view. In the following examples, I am using the Model view.
A screenshot has a drop-down menu for products, and options for category, product name, and subcategory are highlighted.
Several fields that can be combined into a hierarchy
A screenshot has a pop-up window with create hierarchy selected. The text box has details to initiate a hierarchy by selecting create hierarchy.
Initiating a hierarchy
A screenshot for properties has the option for hierarchy selected. The right side has options under fields. The sub-option for category hierarchy is highlighted.
Adding to a hierarchy on the Properties panel
If you don’t see the Hierarchy option on the Properties pane, make sure that you have selected the hierarchy.
A screenshot for hierarchy has options to select a column to add a level, category, subcategory, and product name.
The only place you can rearrange a hierarchy
To minimize report consumer confusion
To prevent the use of the wrong field in the visual
A screenshot for product and sales has the option for the product key highlighted.
The Product dim table and the Sales fact table
In this example, it would be best practice to hide the ProductKey field in the Sales table. The report consumers would have to use a field from the Product dimension table as a label in a visual.
It is also best practice to hide “helper” fields—fields that are present to provide a needed function but are not useful in a visual. An index field or a field that provides a sort order is a good example of a “helper” field.
A screenshot has options for view hidden, unhide all, collapse all, and expand all. A text box reads, right-clicking in a blank area of the fields pane in the report view will show the sub-menu.
Showing hidden fields in the Report view
A screenshot for the sales has multiple options. A few of the options have a sigma symbol alongside.
The sigma symbol indicates a summarization behavior
The default summarization that Power BI applies is a SUM if the field’s data type is a number. During the refinement of your model, you should examine each field that has a sigma symbol next to it and decide if that is an appropriate operation to apply. You can perform this operation in any of the three views within the Power BI Desktop.
A screenshot has the column tools tab with options, name, data type, format, data category, sort, group, relationships, and calculations. The summarization option is highlighted.
Set the summarization behavior
A screenshot has options to sort by column, data category, and summarize by. A text box reads, change the summarization behavior on the Properties panel of the Model view.
Changing the summarization behavior in the Model view
A screenshot has a tab for column tools selected. The option for summarization is highlighted. A text box reads, don't summarize will remove any summarization behavior.
Changing the default summarization behavior on a field
A screenshot has a tab for column tools selected. It has a box for names and a drop-down menu for data categories. The text box reads, from the data category drop-down, choose the appropriate category.
Categorizing a field
You can categorize a place (e.g., an airport or a stadium).
You cannot categorize a region (or any geographical unit not used in standard maps).
Make sure your zip/postal codes are in a text data type; if they are not, they cannot be categorized as a postal code.
Maps will work without categorization, but performing this step can remove ambiguity. For example, if you have a column with the abbreviation CA, is that California or Canada? Is AR Arkansas or Argentina? It is usually clear from the column heading (Country or State); nonetheless, categorization is a form of insurance. Finally, categorization will help the Q&A feature properly interpret a question about an address or place (see the discussion of Q&A later in this chapter).
Another interesting example of categorization is for URLs. If your data has a URL that points to an image or a web page, you want Power BI to display the image (or web page), not the text of the URL. Categorizing the data as either an Image URL or a Web URL will enable this capability.
A screenshot has a drop-down menu for the store. The options for country and state have a globe symbol.
Two fields categorized as geographic
Imagine a day when we can simply speak a question and Power BI will recognize and answer that question. This is what the Q&A function supports: the ability to use natural language to ask a question about the data. While Power BI will deliver results without any special preparation, you can enhance the user experience by completing a few steps in advance.
A screenshot has a box to ask a question and five options under try one of these to get started. The right side has icons to build visuals, and the Q&A visual icon is selected.
Q&A visualization with sample questions
The Q&A engine reviews the data model and generates questions based on the relationships between tables. (See Chapter 5 for more details on relationships.) The questions use the names of the columns in each table, so naming the columns with a user-friendly name (see Chapter 5 again) contributes to the success of Q&A.
Make sure to set the appropriate data types. For example, when working with date columns, set them as Date data type. The Q&A will be able to map the relevant date column to a “when” question.
Normalize your tables, to some extent. If your tables have compound information in one column (such as a full address), break the entry into separate columns (such as street address, city, state, zip).
Categorize your data (see earlier) so that Power BI recognizes that a specific column references a geographic location. The Q&A will be able to answer a “where” question.
Relationships are critical for Q&A success. For example, to answer a “when” and “where” question, the date table and the table with location information must have a relationship.
Set up synonyms for key terms.
“Seed” with some suggested questions.
Review questions that have been asked and correct as needed.
“Teach” the model by supplying some sample questions.
A screenshot has a tab for modeling highlighted. It has an icon for Q&A setup. The text box reads, set up Q&A within the Modeling Q&A menu in the Power BI Desktop.
Accessing the Q&A setup menu
A screenshot for getting started has four blocks for field synonyms, review questions, teach Q&A, and suggest questions.
The Q&A setup menu
A screenshot has name, description, and synonyms under properties and company name under the field. The text box details the settings for the properties and fields in the model view.
Entering synonyms in the Model view
Like setting up Q&A, setting up security is a two-step process. The first step happens in the Desktop, and the second happens in the Service. This chapter covers the steps you complete in the Desktop. A subsequent chapter will address the steps that are completed in the Service.
There are multiple layers to the security options available in the Power BI Desktop.
When connecting to a data source, you must enter your credentials. Those credentials determine what you, the model creator, can access in the source (such as a database); they do not control what the report viewer can see.
Once the data has been extracted, loaded into the data model, and combined, the model creator must set up the security controls that will determine what data the viewers can see in the report. This is called row-level security because the security controls the rows that a viewer can see.
A screenshot has the Manage roles option selected. It has options to create roles and a table filter DAX expression.
The Manage roles dialogue box
A screenshot to manage roles has options under roles, tables, and table filter DAX expression. The text box reads, create a role with the appropriate name and select tables and fields to restrict access.
Setting up a role restricted to customers in Germany
In this example, the restriction is simple. Roles can be set up using multiple conditions and much more complex DAX. One very common restriction uses the USERPRINCIPALNAME function, which allows you to set up restrictions based on a person’s identity. This type of security is called dynamic row-level security.
A screenshot has three columns for ID, full name, and UPN. The text box reads, each user is assigned an ID number, which is the actual ID for that user in an employee table.
An example of a simplified user table
A screenshot has two columns for brand and user ID and 11 rows. The text box reads that user ID is mapped to a particular brand.
The brands and the associated user IDs
A screenshot has four connected blocks for the product, user to brand, user UPN, and sales. The text box details the relationship between the four blocks.
The key to dynamic security is the relationships in the model
A screenshot to manage roles. It has the user selected under roles, user UPN under tables, and the user ID set to the user principal name under the table filter DAX expression.
Setting up a role that explicitly ties visibility to the UPN
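As a sketch, the table filter DAX expression behind this role might look like the following (the table and column names are assumptions based on the figures):

'UserUPN'[UPN] = USERPRINCIPALNAME()

When a user opens the report, USERPRINCIPALNAME() returns that user's login. The UserUPN table is filtered down to the matching row, and the relationships in the model propagate that filter to the brands, and therefore the sales, the user is allowed to see.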
A screenshot has two columns for the country and all sales. A text box reads, without any row-level security, the sales are visible for all customers in every country.
Without row-level security, all data is visible
A screenshot has two columns for the country and all sales. The text above reads now viewing as Germany account manager. The right side has options under view as roles, and the Germany account manager is selected.
Using the View as feature
Once the roles are established and tested, the second step is to assign users to roles in the Service, which will be covered later.
The PL-300 does not cover this topic, but as always, I want you to be fully prepared for the terminology and concepts you will read about when studying for the exam. Unlike row-level security, object-level security controls access to an entire object or entity. For example, you can restrict access to a table or a field such that it is not visible or useable by a report viewer. Object-level security must be implemented with an external tool, such as Tabular Editor.
DAX (Data Analysis Expressions) is the language Power BI uses to create content: tables, columns, security roles, and measures. We covered the first three uses of DAX in the last chapter. In this chapter, we will focus on measures. Measures are calculations that your data doesn’t already have. What kind of calculations? They can be anything from a simple addition to a way to conditionally format a visual or to create a dynamic title for a visual.
A screenshot of the Power BI Desktop that lists the options under the Measure tools, and the arrow points to the DAX expression in the formula bar, the New measure tab, and the measure option.
Entering a measure
This expression adds up the entire column called LineTotal in the table called 'Orders Details' and creates a measure called All_Sales. A measure is stored inside the data model (which is why the topic is introduced in this chapter). Because a measure is not materialized, it does not require resources from the engine until it is used in a visual.
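For reference, the measure shown in the figure is written like this:

All_Sales = SUM('Orders Details'[LineTotal])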
You won’t see a measure in the Data view because it is not materialized.
In the preceding example, the 'Orders Details'[LineTotal] column is added up and then divided by the number of rows in the column—in other words, it is averaged.
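Written as a measure, that averaging logic would look something like this (the measure name is my own):

Average_Sale = AVERAGE('Orders Details'[LineTotal])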
And just to remind you, let’s reverse the question. When must you use a calculated column and not a measure? If you need to use the values that are calculated in a slicer or filter, you must use a calculated column. If the calculations must be performed row by row to be accurate, you must use a calculated column or an iterator function (covered later in the chapter).
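To make the contrast concrete, here is a sketch of both approaches, assuming Northwind-style Quantity and UnitPrice columns:

-- Calculated column: evaluated row by row when the data is refreshed, and materialized in the model
LineTotal = 'Orders Details'[Quantity] * 'Orders Details'[UnitPrice]
-- Measure using an iterator: evaluated row by row at query time, not materialized
LineTotalAsMeasure = SUMX('Orders Details', 'Orders Details'[Quantity] * 'Orders Details'[UnitPrice])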
DAX is not case sensitive (unlike M in the Power Query Editor).
Every column name is enclosed in square brackets.
Every column name is “fully qualified” with its table name.
Table names do not require single quotes around them unless the name has a space in it. (Nonetheless, I make it a practice to always enclose my table names with single quotes because that way I won’t forget to put them in when necessary.)
Measures are enclosed in square brackets, without a table name.
After a function, the arguments required by the function are enclosed in parentheses and separated by commas (exactly as formulas are entered in Excel).
You can nest functions within other functions.
You can reuse measures within other measures.
Functions have mandatory arguments and optional arguments. In the IntelliSense menus, optional arguments are enclosed in square brackets.
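Here is a small, purely illustrative measure that exercises several of these rules at once: a reused measure, nested functions, and arguments separated by commas.

All_Sales_K = ROUND( DIVIDE( [All_Sales], 1000 ), 1 )

Notice that the measure reference [All_Sales] carries no table name, and that DIVIDE is nested inside ROUND.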
When you are first learning to write DAX, it can be confusing. One helpful tool is IntelliSense built into the Power BI Desktop. As soon as you start typing in the Formula Bar, IntelliSense initiates.
A screenshot of the Power BI Desktop that shows the formula bar with the expression, indicated by the arrow.
IntelliSense suggestions for functions
A screenshot of the Power BI Desktop that shows an expression evaluated in a context modified by filters; the arrow indicates the highlighted IntelliSense entry, a mandatory argument, and an optional argument.
IntelliSense explanation of the selected function
A screenshot of the Power BI Desktop that shows the expression with a column name; the arrow points to the list of tables available after typing a single quote.
Enter a single quote to see a list of tables (and columns)
A screenshot of the Power BI Desktop that shows the expression; the arrow points to the list of measures available in the model after typing a left square bracket.
Enter a left square bracket to display a list of measures in the model
If you are new to DAX, using IntelliSense can guide you toward writing DAX that is syntactically correct. This does not mean that the DAX you write will work. You should test every measure in a simple visual to see the results of the DAX you have written.
Many people test their DAX by using simple data and performing the same calculation in Excel. This is a great way to understand what DAX is doing.
Continuing with the “Cookie Monster” analogy, what if we want to split the cookie up, rather than just letting the “Cookie Monster” gobble the whole cookie in one bite? Enter the filter context.
A screenshot of a table with the company name, country, and all sales, and the arrow indicates the highlighted Germany with the all sales option and the filter icon.
Creating a filter context with a slicer
A screenshot of a table with the company name, country, and all sales, and the arrow indicates the customer country field with the filter context.
Creating a filter context with the Filters pane
A drill-through from another page (or report)
A selection on another visual on the same page
A synced slicer
Any combination of slicers, filters, drill-throughs, and/or hard-coded filters
You should always be aware of the current filter context because it can affect the results of any measure you are using in a visual.
In Figures 7-6 and 7-7, the slicer or Filters pane selection is affecting everything on the page. (I am only showing one visual, to keep the figure simple.) What happens if you want to include the total value of [All_Sales] on the same visual where it is also filtered? We need a new tool to be able to control the filter context’s impact on a measure.
A screenshot of the Power BI Desktop, which shows a formula bar with an expression; the arrow indicates the highlighted IntelliSense entry, a mandatory argument, and an optional argument.
Optional filter context modifiers
A filter context modifier that adds, modifies, or removes filters. Examples of filter modifiers include REMOVEFILTERS, ALL, and ALLEXCEPT (and there are many more).
A model modifier that alters the functionality of the model. Examples of model modifiers include USERELATIONSHIP and CROSSFILTER.
CALCULATE does not wipe out the existing filter context. It superimposes the filter conditions you have included in the CALCULATE measure onto the existing filter context.
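For example, the following measure (which also appears in the comparison table later in this section) removes any filters coming from the Country column while leaving the rest of the filter context intact:

All_Sales_AllCountries = CALCULATE(
    [All_Sales],
    REMOVEFILTERS( 'Customers'[Country] )
)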
A screenshot of a page, which has a table with the country, all sales, and all countries, and the arrow indicates the filter pane with the selected country Germany.
Removing filters coming from the Filters pane
A screenshot of a page, which has a table with the country, all sales, and all countries, and the arrow indicates the Filters pane with the category name Beverages and the country Germany.
All_Sales_AllCountries (and All_Sales) reflects the “Beverages” filter
The other optional argument that CALCULATE will accept is a model modifier argument. Examples of these include USERELATIONSHIP and CROSSFILTER functions.
'Customers'[Country] = "Germany" hard-codes the country to Germany during the evaluation of the measure.
'Categories'[CategoryName] = "Beverages" hard-codes the category to Beverages during the evaluation of the measure, as shown in Figure 7-11.
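Putting those two filter arguments together, the measure looks like this:

All_Sales_GermanyandBeverages = CALCULATE(
    [All_Sales],
    'Customers'[Country] = "Germany",
    'Categories'[CategoryName] = "Beverages"
)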
A screenshot of a page, which has a table with the country, all sales, and all countries, and the arrow indicates the sales of beverages in Germany of $54.45K.
All_Sales_GermanyandBeverages returns the same value as All_Sales filtered
A screenshot of a page, which has a table with the country, all sales, and all countries, and the arrow indicates the highlighted customer's country, categories, and beverages.
All_Sales_GermanyandBeverages does not respond to filter selections
The Measure | All_Sales (Measure 1) | All_Sales_AllCountries (Measure 2) | All_Sales_GermanyandBeverages (Measure 3) |
---|---|---|---|
The DAX | =SUM('Orders Details'[LineTotal]) | =CALCULATE( [All_Sales], REMOVEFILTERS( 'Customers'[Country] ) ) | =CALCULATE( [All_Sales], 'Customers'[Country] = "Germany", 'Categories'[CategoryName] = "Beverages" ) |
The explanation | This measure does not use CALCULATE. Therefore, it is not performing any additional manipulation of the filter context. Its results depend on the external filter context. In this example, the external filter context is Country=France and Category=Condiments. | This measure uses CALCULATE to remove any filters coming from the Country column of the 'Customers' table in the external filter context. The external context stipulates that Country=France, and this is still true, but the REMOVEFILTERS modifier is imposed on top of it. Because the measure removes any filters coming from 'Customers'[Country], the Country=France filter is “overwritten” for this measure. By contrast, the external filter context also has a filter Category=Condiments. This external filter is included when calculating the measure, because nothing in the CALCULATE statement changes it. | This measure uses CALCULATE to set Country=Germany and CategoryName=Beverages. Anything in the external filter context that conflicts with these two filter settings will be overwritten. The fact that the external filter context sets Country=France and Category=Condiments is “overwritten” for this measure. Any other filters present in the filter context are unaffected. |
KEEPFILTERS
ALL
ALLEXCEPT
ALLSELECTED
Every filter modifier argument for CALCULATE does one of two things: removes existing filters or adds a filter. Two different modifier arguments can be combined as well.
Another example of a model-modifying function is CROSSFILTER, which changes the direction of a relationship.
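As a sketch, using the Product and Sales tables from the earlier example (the measure name is my own), CROSSFILTER can make a single-direction relationship behave bidirectionally for the duration of one calculation:

Products_Sold = CALCULATE(
    DISTINCTCOUNT( 'Product'[ProductKey] ),
    CROSSFILTER( 'Sales'[ProductKey], 'Product'[ProductKey], BOTH )
)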
A calculated column called [LineTotal], used directly in a visual
A measure called LineTotalAsMeasure
A measure called All_Sales
A screenshot of a page, which has a table with the company name, all sales, line total, and line total measure, indicated by the arrows.
Three different methods to calculate the same results
A screenshot of a page, which has a table with the company name, all sales, line total, and line total measure, and the arrow indicates the LineTotal field in the Filters pane.
Using a calculated column in a filter on the Filters pane
A screenshot of a visualization window that lists the options under build visual, and the arrow indicates the highlighted count and the show value as options.
Changing the summarization of an implicit measure
A screenshot of a page, which has a table with the country, line total, and all sales, and the arrow indicates the implicit measure and the explicit measure.
The results of an implicit and explicit measure side by side
A screenshot of a page that lists the options under values, and the arrow indicates the explicit measure that changes the summarization behavior.
The summarization of an explicit measure cannot be modified
Once all the necessary explicit measures have been created, the “base” field can be hidden. As I stated in Chapter 6, the hidden column is not hidden from the report developer in the Power BI Desktop. Rather, hiding a base column (or any field) streamlines the model once the report is published.
It would not be unusual to hide all the fields in the fact table. Doing so ensures that report creators would have to use the dimension table fields to create a visual, which is almost always preferable.
A screenshot of a page that lists the options under the format tab with the quick measure, which has a variety of calculation options and is indicated by the arrow.
There are built-in calculations in the Quick measure dialogue box
A screenshot of a quick measure page with the calculation, base value, category, and fields, and the arrow indicates the list of fields and the product name.
Creating a quick measure
The Quick measure wizard writes the DAX for you, which you can see once the measure is created.
Select the table where you want the measure to appear, then right-click, and choose Quick measure. This will ensure that the measure appears in the selected table. You can always re-home a measure (and a measure doesn’t care where it is), but it is nice to be able to find a measure quickly.
Once the Quick measure wizard has created the measure, it will be named something awkward such as Price Minus Cost. You will almost always want to rename these measures.
It does take a long time to learn how to write good DAX, and the Quick measure wizard is a useful way to get started. However, DAX masters say that the DAX created by the Quick measure wizard can be “clunky.” This should not deter you from taking advantage of the wizard, but be aware that the DAX it generates can be improved.
Logical
Aggregation
Statistical
Text
Time intelligence
Mathematical/trigonometric
Date/time
Filter
Table manipulation
Financial
TOTALYTD
SAMEPERIODLASTYEAR
PARALLELPERIOD
Time intelligence functions require a marked date table to work (covered in Chapter 5) because they operate on the premise that there is one row for every day between the earliest date and the latest date.
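Assuming a marked date table called 'DateDim', the first two functions might be used like this:

YTD_Sales = TOTALYTD( [All_Sales], 'DateDim'[Date] )
Sales_Last_Year = CALCULATE( [All_Sales], SAMEPERIODLASTYEAR( 'DateDim'[Date] ) )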
A screenshot of a page, which has a table with the year, all sales, and YTD sales, and the arrow indicates the DateDim and TotalYTD fields.
Using YTD in a visual
A screenshot of a page, which has a table with the year, all sales, and YTD sales, and the arrow indicates the DateDim field that refers to the latest year in the data.
Using a time intelligence function without a date field can be ambiguous
A screenshot of a page, which has a table with the year, all sales, YTD sales, and sales last year, and the arrow indicates PARALLELPERIOD used to modify the context.
Using PARALLELPERIOD to modify a filter context
Not everything that we want to calculate should result in a grand total. I call these types of calculations “point-in-time” calculations. Officially, these types of calculations are called semi-additive. The classic example is a bank balance. When you check your bank balance, it is a “snapshot” of your balance at that point in time. Another example is inventory balances. If you are calculating inventory, you don’t want to see a total of the inventory over the month; you want to see the inventory at that point in time.
OPENINGBALANCEYEAR
CLOSINGBALANCEYEAR
ENDOFMONTH
STARTOFMONTH
Once again, I am not going to list all the semi-additive functions. A great place to research these functions is www.sqlbi.com. They have posted in-depth articles on how these functions work.
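As one hedged example, assuming an Inventory table and the 'DateDim' date table from earlier, a year-end balance could be written as:

Year_End_Inventory = CLOSINGBALANCEYEAR(
    SUM( 'Inventory'[Quantity] ),
    'DateDim'[Date]
)

Instead of adding up the quantities across the year, this measure reports the value on the last date of the year in the current filter context.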
A screenshot of a page, which has a table with the category name and all sales, along with their corresponding values.
A non-totaling measure
A screenshot of a visualization window that lists the options under visual, and the arrow indicates the turn-off option of the subtotals for every field used on the visual.
Turning off subtotals
HASONEVALUE
HASONEFILTER
ISINSCOPE
A screenshot of a page, which has a table with the category name, all sales, and non-total all sales, and the arrow indicates the beverages and the total values.
Using HASONEVALUE to turn off the totaling for one column
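The measure behind this figure might look like the following (the names are assumptions based on the figure):

NonTotal_All_Sales = IF( HASONEVALUE( 'Categories'[CategoryName] ), [All_Sales] )

On the total row, the CategoryName column contains many values, HASONEVALUE returns FALSE, and the IF returns a blank, so no total appears.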
DAX Expression | Explanation |
---|---|
Address_All_Sales = IF( ISINSCOPE( 'Suppliers'[Address] ), [All_Sales] ) | This measure tests whether the 'Suppliers'[Address] field is being used to group the data. |
City_All_Sales = IF( ISINSCOPE( 'Suppliers'[City] ), [All_Sales] ) | This measure tests whether the 'Suppliers'[City] field is being used to group the data. |
Country_All_Sales = IF( ISINSCOPE( 'Suppliers'[Country] ), [All_Sales] ) | This measure tests whether the 'Suppliers'[Country] field is being used to group the data. |
A screenshot of a page, which has a table with the country, address of all sales, city of all sales, and country of all sales, and the arrow indicates Australia, Melbourne, and the address.
Anything lower in the hierarchy is in scope
In the Address_All_Sales column, the measure tests for the presence of the 'Suppliers'[Address] field in the current filter context. Because the test returns TRUE, the All_Sales measure is calculated. The next column, City_All_Sales, tests for the presence of 'Suppliers'[City]. The test returns TRUE for both the row with the city name and for the row with the address. Anything that is lower in the hierarchy is considered in scope.
In the preceding example, I used the formal hierarchy that I created. If I were to use the base fields, the result would be the same.
In the exam prep items for PL-300, there is a specific entry for the DAX statistical functions. I am going to be candid here and state that I really don’t know why this subset of DAX functions was mentioned specifically. Statistical functions work the same way as other DAX functions. My guess is that while most of us can intuitively understand a logic or mathematical function, we may not be as familiar with basic statistics. I recommend that you familiarize yourself with some basic statistical principles, such as median, average, and standard deviation.
This is a common saying in Power BI circles. I think it originated with either Alberto Ferrari or Marco Russo, the DAX powerhouses who run www.SQLBI.com. If you want to get better at DAX, focus on understanding filter and row contexts. These are the two hardest concepts to master, but once you do, your DAX will improve dramatically.
When you first start writing DAX, you are bound to write some “ugly baby” DAX—DAX only an author can love! And that is okay—at the beginning. There’s an expression “Perfect is the enemy of done,” which means don’t let the quest for perfect DAX get in the way of writing DAX. Good DAX takes practice and some tools, one of which is built into the Power BI Desktop.
The Power BI Desktop menu bar, where the Performance analyzer pane can be opened by selecting the View tab.
On the View tab, click Performance analyzer
A tab for performance analysis with an arrow that points to a start recording option.
The Performance analyzer will record your selections
A table with data on the company name, average order value, total order value, and largest line value. Adjacent to it is the performance analyzer tab with DAX query data.
Note the four entries under the name Table
A great way to start improving your DAX is to copy the queries from the visuals you create. You may not understand everything in the query, but you can start to get a “feel” for the syntax and logic of DAX.
Cumulatively, the number of separate DAX queries on a page can slow page responsiveness. If this starts to happen, open the Performance analyzer and interact with each visual in turn. Identify the slowest elements of the visual. If the DAX query component is slow, and if your visual uses measures or calculated columns, you can optimize those elements by rewriting their DAX. You cannot, however, modify the query that the visual itself generates.
A report page with the customer profile for Franchi S.p.A., with data for the number of products from each category. The Performance analyzer pane, with a report, is adjacent to the page.
Refreshing a page to measure the overall responsiveness
Source | Remediation |
---|---|
Unnecessary (unused) columns | Remove those columns in the Power Query Editor. |
Columns with high cardinality | Remove these columns in the Power Query Editor or reduce their cardinality by simplifying them. For example, if the column has both date and time, separate these into two columns, date and time. |
Many-to-many relationships | Build bridge tables between two tables that currently have a many-to-many relationship. |
Bidirectional relationships | Use DAX to reverse a relationship (e.g., build a measure using CROSSFILTER). |
Using nonoptimized DAX | Many DAX functions can be replaced by one that is faster. Test various versions of a measure. |
Calculated columns | Replace with a measure, or add the column in the source or in the Power Query Editor. |
Large tables | Create aggregated tables in the Power Query Editor or in the source. |
Filter and slicer selections | Preset filters and slicers to the most commonly selected items. Close the Filters pane. |
Excessive tooltips | Tooltips generate a query. Avoid using too many fields in a tooltip. If a tooltip is unnecessary, disable tooltips for that visual on the Format visual panel. |
Excessive interactions | If a visual does not need to interact with other visuals, you can disable interactions on the Edit interactions panel. |
As you can see, report performance is dependent on many factors, many of which should be addressed before creating the first visual.
A graphical summary of various pieces of important information, typically used to give an overview of a business.
Reports | Dashboards |
---|---|
Can be created in the Power BI Desktop or the Power BI Service | Can only be created in the Power BI Service |
Multiple pages | One virtual page |
Must be underpinned by a data model | Uses visuals from published reports |
Can be shared via apps | Can be shared via apps |
Cannot contain streaming content | Can contain streaming content |
In this chapter, we will explore reports in depth. In Chapter 13, we will address dashboards.
A screenshot has four sections for visualizations. Section 1 has page information and canvas settings. Section 2 has a canvas background and wallpaper. Section 3 has a filter pane, and section 4 has filter cards.
The page- and report-level settings
Page information: If this is going to be a tooltip page, you need to enable it here. You can also rename a page here.
Canvas settings: This controls the overall size of your canvas. Notice that the default option is an aspect ratio rather than the familiar paper sizes (letter, legal, etc.).
Canvas background: This setting allows you to place colors or a picture within the canvas (as indicated by the dotted lines).
Wallpaper: This setting allows you to place colors or a picture across the entire report area (working and nonworking). By adjusting the transparency, the background and wallpaper colors can be interactive, as shown in Figure 9-2.
A screenshot has a few lines of text on the left. It reads build visuals with your data. The right side for visualizations has filters for color, image, image fit, and transparency under wallpaper.
Combining the background and wallpaper colors
Filter pane: You can set up a background color of your Filters panel.
Filter cards: You can change the look and feel of the individual filter cards.
…doesn’t mean you should. It is easy to fall down a design rabbit hole in the Power BI Desktop. It’s far better to standardize on the colors, fonts, page sizes, and images that you are going to use in your report. A great starting point is the PowerPoint template that your company probably has. You can capture your PowerPoint template slides as images and use them on your canvas or wallpaper. You can build a theme using JSON and embed it in a PBIT (template) file. Just like your company’s PowerPoint template, this PBIT file can be used as a starting point for every PBIX file, thus ensuring visual consistency and accurately “branded” reports.
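For context, a theme JSON file is quite approachable. A minimal sketch (the name and colors here are placeholders, not your company's actual palette) looks like this:

{
  "name": "Company Theme",
  "dataColors": [ "#1F4E79", "#2E75B6", "#70AD47", "#F4B183" ],
  "background": "#FFFFFF",
  "foreground": "#252423",
  "tableAccent": "#1F4E79"
}

Importing a file like this from the View tab's theme menu applies the palette to every visual in the report.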
A screenshot has a tab for view selected. It has different themes for this report and Power BI. There are options for browse, gallery, customize, save, and how to create a theme at the bottom.
Available standardized themes
Just as in PowerPoint, the chosen theme determines the color palette used in the visuals. Be careful, though, to not overwhelm your report readers with different color schemes in reports and dashboards.
A screenshot of a window has the option to customize the theme. The tab for names and colors is selected. The right side has options for name, theme, sentiment, and divergent colors.
Customizing a theme
Make sure your color scheme has sufficient contrast, particularly when used in highlighting on a visual.
Font sizes, for all text on a visual, should be at least 15 points.
There are fonts that are preferred for accessibility. Make sure that your report uses these fonts.
All visuals should have alternative text provided, describing the visual in detail. Avoid the use of “see” (and related words) in the description.
Every report should be tested with a screen reader, to ensure that all the accessibility settings are present.
Because ensuring a report is fully accessible is time consuming, ideally your company’s Power BI theme adheres to all the design principles of accessibility.
At the heart of every report are the visualizations. The visualization is the materialization of all the hard work of transforming and modeling the data. Because 80% of the work to create a report lies in transformation and modeling, choosing, populating, and formatting the visuals should take far less time. Despite this reality, when first getting started with Power BI, the visualization component looms large.
A screenshot for visualizations has multiple icons to build visuals. An option is labeled table.
The Visualizations menu
A screenshot has a table for matrix visual with three columns for category names, Argentina and Austria. The right side has icons, and options for rows, columns, values, and drill-through.
Automatic placement of fields into field wells
Every visualization has unique field wells in which fields can be placed. My preferred method for building my visuals is to drag a field to a specific field well, but of course, this is personal preference.
A screenshot for visualizations has icons to build visuals and an option to get more visuals. A menu option has a text box that reads, and custom visuals are accessed through the ellipses menu.
Accessing custom visuals
Many companies limit access to custom visuals, considering them security risks. There are custom visuals that have been certified by Microsoft, and many organizations consider that sufficient risk mitigation. Some custom visuals do require an additional purchase.
Among the standard and custom visual menus, there will probably be a visual type that will meet the requirements for your data. If, however, there are specific requirements that cannot be addressed, there are tools that allow you to create a custom visual, such as Deneb and Charticulator. (This is not an endorsement of either tool set.)
Comparison: Clustered column, bar, line
Change over time: Line and area
Parts of a whole: Stacked column and bar, pie, donut, treemap
Flow: Area and specialty custom visuals
Spatial: Globe and filled maps, ArcGIS maps
Distribution: Line, scatter, box and whisker
Correlation: Scatter, line
Single: Cards, KPI, gauge visuals
Before using a custom visual type (if your organization allows them), be sure to explore all the formatting features for the standard visual types that suit the data.
A screenshot for visualizations has two sub-menus for visual and general. A text box has details of the menu options.
The Format visual menu
When using a visualization for the first time, always make the time to explore all the features specific to the visual type.
A screenshot has three columns for the country, line total, and all sales. The top right side has three icons for the filter, focus mode, and ellipses.
The three controls all visual types share
Focus mode: Opens the visual in its own window
Filter: Displays all the filters currently active on the visual
Export data: Exports the data used in this visual to Excel.
Show as a table: Displays the data in a table format, which can be sufficient instead of exporting to Excel.
Remove: Usually selecting a visual and choosing Delete is sufficient to remove a visual from the canvas. However, if this doesn’t work, choosing Remove from the ellipsis menu works.
Spotlight: Fades back any other visuals on the report page.
Sort ascending, Sort descending, Sort by: This option is only available if there are multiple fields used in the visual.
A screenshot has the expansion of the ellipses option. It has export data, show as a table, remove, spotlight, sort descending, sort ascending, and sort by.
The ellipsis menu
Interactions are one way of filtering visuals. But there are several other more direct ways to filter visuals: by using slicers and/or the Filters panel. (Yes, you can use both of them together!)
A screenshot for visualizations has different icons to build visuals. An icon is highlighted.
Slicers are a visual type
Drag the field directly onto the slicer placeholder.
Check the box next to the field to use in the slicer.
Drag the field into the field well.
Any visual can be populated using one of these three ways, but if the visual has more than one field well, Power BI may put the field in an unexpected field well.
Slicers have different configuration options based on the data type of the field used. These options are underneath a caret in the upper-right corner, as shown in Figure 9-12.
In the most recent version of the Power BI Desktop (as of this writing), these settings have moved to the Format visual menu.
A screenshot has a block with eight category names and icons for an eraser and a drop-down arrow to select the type of slicer.
You can change the type of slicer in the slicer header
Text fields: List, Drop-down
Numeric fields: List, Drop-down, Between, Less than or equal to, Greater than or equal to
Date fields: Between, Before, After, List, Drop-down, Relative date, Relative time
A screenshot for visualizations has the tab for the visual selected. It has a drop-down menu for slicer settings, options, and selection.
Consider always turning Select all on
There are custom slicers available as well, but as always, investigate all the formatting settings for the standard slicer before resorting to a custom slicer.
A screenshot has eight options for category names and a pop-up window with four options, of which the Copy option is selected.
Two different methods for copying a visual
Use Ctrl+V to paste the slicer in a new location.
Any visual element can be copied. However, as of this writing, when copying a button, you will not be able to right-click the button and choose Copy visual. Simply select the button and press Ctrl+C.
A screenshot for sync visuals has a few lines of text and options for sync and don't sync. A text box reads, syncing slicers will ensure that a selection on one is reflected on the other.
To sync or not to sync?
A screenshot has the tab for view selected. It has themes and an option for sync slicers. The text box reads, to modify synced slicer behavior, open the sync slicers panel on the view tab.
Opening the Sync slicers panel
A screenshot for sync slicers has two columns for pages 5 and 7 with a tick mark. The columns indicate whether the slicer is synced or visible on the page.
Synced slicers settings
Even though you can have “invisible” slicers on a page, use this feature cautiously. It could be very confusing to a report user to have a page filtered without any visible indicator.
Slicers are wonderful tools for filtering a report page (or several pages), but they have limitations. They are simple to set up, but that also means that they have a limited set of features. They also take real estate on the report page, something that is in limited supply. It’s a good thing we have another option: the Filters pane.
Filters on this visual (only visible when a visual is selected)
Filters on this page
Filters on all pages
A screenshot for filters has three filter options, on this visual, on this page, and on all pages. The left side has a bar graph for all sales by country and a world map for unit quantity by country.
The Filters pane has three parts
A screenshot has three panels. The filters have a tab to add data fields here. The middle panel for visualizations has different icons. The right side for fields has an option for the company name.
Dragging a field onto the Filters pane
Basic filtering
Advanced filtering
Top N filtering (available only for Filters on this visual)
A screenshot of the filters on this page option has a drop-down menu for filter type. A text box details two options for basic and advanced filtering.
Basic filtering is very similar to slicer behavior
A screenshot of the filters on this page option has two drop-down menus for filter type and show items when the value. It has options for And and Or, and an apply filter tab.
Advanced filtering offers the opportunity to combine filter conditions
A screenshot of the filters on this visual has options for product name, filter type, show items, by value, and a tab for apply filter.
The Top N filtering is available for Filters on this visual
You, as the report developer, decide how much control the report user will have on the Filters pane. If the report user should not be able to change the filter pane settings, you can lock individual filter cards. Similarly, if the user should not even see the filter card, you can hide that card. Remember, by hovering over the Filters icon on a particular visual, the user can see the filters impacting an individual visual.
Every selection made on the Filters pane generates a query back to the engine. The overall performance of the page is dependent on a lot of variables, but the more of those variables you can control, the better your report’s response. If you know that most report readers are only interested in a specific set of data, you can preselect this data on the Filters pane and then close the pane. Once you publish the report, most of the users will be able to see the data they are interested in, without paying a performance hit. Users who are interested in different data can open the Filters pane and change the filters.
A screenshot has four columns for visualizations. 1 has page information and canvas settings. 2 has a canvas background and wallpaper. 3 has a filter pane. 4 has filter cards.
Numbers 5 and 6 have formatting features for the Filters pane
If you want the Filters pane to appear continuous with the report page, you can do so by using the same color scheme as your canvas and/or background.
A screenshot for visualizations has an option for drill-through. The text box reads, to use a field as a drill-through, drag it to the Drill through area on the Visualizations pane.
The Drill through area
A screenshot has a left directional arrow and a text box that reads when the back button appears, the drill through is set up successfully.
The back button indicates success
A screenshot for all sales by category has four blocks for beverages, dairy products, poultry, and seafood. A pop-up window has an option of drill through.
The Drill through option in the right-click menu
A screenshot has a bar graph for sales of products in the meat or poultry category. A text box details the use of drill-through selection in the title.
The drill-through page showing the filtered result
A screenshot has four options, and 1996 is selected. It has blocks for different categories and a pop-up window with an option for drill through at the bottom.
The selected value on a slicer will also be “carried through” to the drill-through page
The good news is that the same functionality allows you to create a drill-through from one report to another!
A screenshot has two columns for category name and all sales. The right side has an option under drill through. The text box reads, table and matrix visuals can be formatted by expanding fields in the visual.
Accessing Conditional formatting for tables or matrices
Background color
Font color
Icons
Web URL
A screenshot has two columns for category name and all sales. The right side has four options for background color, font color, icons, and web URL.
The Conditional formatting options
Because most readers will probably have used conditional formatting in Excel, I will focus on the interesting nuances in Power BI.
A screenshot has categories under queries. The middle panel has columns for category name, description, and web address. The right side has options for properties and applied steps under query settings.
Adding a custom column for conditional formatting as a web URL
Select Conditional formatting ➤ Web URL.
Set the format to be based on the value in the Web Address field, as shown in Figure 9-32.
A screenshot for the web URL category name has options for format style, apply to, what field should we base this on, and summarization.
Base the conditional formatting on the content of the Web Address field
This technique (basing conditional formatting on the value of a field) works for other formatting, such as background or font color. To turn the background of a cell a specific color, make sure the table has a column containing the hexadecimal code of the color you want to apply for each value.
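Alternatively, a measure can return the hexadecimal code directly. A hedged sketch (the threshold and colors are purely illustrative) might be:

Sales_Color = IF( [All_Sales] >= 100000, "#107C10", "#D83B01" )

Pointing the background color's conditional formatting at this measure turns the cell green or orange depending on the sales value.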
A screenshot has the tab for general selected. It has a box for text with the fx symbol. The text box reads, any place the fx symbol appears, a measure can be used to create variable content.
The Fx symbol indicates an opportunity to provide variable content
A screenshot has the tab for general selected. It has a box for text with the fx symbol. The text box reads, clear the title field of the default text created, then select fx.
Clear the default title text
A screenshot for the title text has two drop-down menus for format style and what field should we base this on. The text box reads, point the title text to the measure created to provide variable content.
Using a measure to create a variable title
A screenshot has a bar graph for all sales by country for Bigfoot Breweries. The right side has an option for Bigfoot Breweries under filters on this page.
The title reflects the selection for the supplier name made on the Filters pane
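The measure behind that title might look like the following (the column name is an assumption):

Title_All_Sales = "All Sales by Country for " & SELECTEDVALUE( 'Suppliers'[CompanyName], "All Suppliers" )

If exactly one supplier is selected, SELECTEDVALUE returns its name; otherwise, the alternate result "All Suppliers" is used.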
This is just one example. I am choosing to show the process so that the idea isn’t completely theoretical. The specifics aren’t as important as the overall concept.
Visuals are going to take up the bulk of the space on your page. But there are other elements that you can use on the page to enhance functionality.
Both JPEG and PNG images can be used to enhance the report page. Shapes can also be placed on the report page to create dimensional effects as well as create other composite shapes.
You can also place text boxes on the page to provide instructions or further details. Note that a text box contains static text. To create variable text, you need to use either the card visual type or buttons (see below).
Left and right arrows
Reset
Back
Information
Help
Q&A
Bookmark
Blank
Navigator
A screenshot has the tab for insert selected. It has a bar graph and options of the left arrow, right arrow, reset, back, information, help, Q&A, bookmark, blank, and navigator.
The Insert ➤ Buttons menu
A screenshot of the format has the tab for the button selected. It has options for shape, rotation, style, and action.
Action feature of buttons
Drill through
Web URL
What if you are not sure that your users will be savvy enough to right-click to drill through to another page? Placing a button on the page can make the process more transparent to the user. The first thing to do is to Insert ➤ Buttons ➤ Blank.
A screenshot has options for action, type, bookmark, destination, and web URL. The text box reads, turn the action on, set the type of action to drill through, and point the drill-through to the correct page.
Setting up a Drill through action on a button
A screenshot has options, text, font, font color, horizontal alignment, and vertical alignment. The text box reads, add appropriate text to instruct the report user on how to use the button.
Adding instructional text to a button
Default
On hover
On select
Disabled
A screenshot has a bar graph for the quantity shipped for different products. The right side has the tab for button selected with options of style and text.
Setting the button text for the Default state
On hover: “Drill through for more details,” as shown in Figure 9-42.
A screenshot has a bar graph for the quantity shipped for different products. The right side has the tab for the button with options of state selected as on hover and a text box drill through for more details.
Setting the button text for the On hover state
Now, all the report user must do is click the button to drill through. Leveraging the states of a button allows you to design in layers of functionality and instruction on the same page.
A screenshot has the tab for buttons selected with different options. The navigator option is expanded to page navigator and bookmark navigator.
Including a page navigator
A screenshot has different page numbers and a bar graph for sales of products in the beverage category. The right side has the visual tab selected under format.
Setting the page navigator button to not show hidden or tooltip pages
If the report must conform to a specific size of paper, usually because it is printed (e.g., an invoice)
If the report must include all the rows of a long table (e.g., an inventory report)
If all the elements must be aligned perfectly on a printed page (e.g., a report printed on pre-printed stationery or forms)
In these instances, a paginated report is required. The Power BI Report Builder is a free download, and you can use it to build a paginated report.
There is a paginated report visual that enables you to embed a paginated report into a standard Power BI report.
Once the report is created, you can publish it into the Service if you have Admin, Member, or Contributor rights to a workspace in a Premium capacity. If those terms mean nothing to you, fear not. We will be covering the Service in upcoming chapters.
In Chapter 9, we covered the steps and features for creating a report. In this chapter, we will delve a bit deeper into some of the features we have already covered and explore some new ones. Because in some cases we are talking about enhancements to features we covered previously, you may need to page back and forth between these two chapters.
A screenshot depicts the Power BI Desktop, which lists the options under table tools with date, year, and month fields; the month tab is highlighted, along with its corresponding data.
Using the Sort by feature to sort month names
A screenshot depicts the up control, drill down icon, and expander icons that list the options under the category and net price, along with their corresponding features.
The drill down and expand down controls
A screenshot depicts a window that lists the options under the rows, columns, values, and drills through, with multiple fields in a hierarchical relationship.
Fields in a hierarchical arrangement
Notice that I don’t say “in a hierarchy”; a hierarchy is a specific structure you can set up in a data model, as covered in Chapter 6. You can, of course, use a formal hierarchy, which will also enable the drill down and expand all down controls.
A screenshot depicts a window that lists the options under category and net price with its corresponding features, along with the trident, up control, drill down, and filter icons.
Category and Subcategory fields are visible after clicking the “trident” once
A screenshot depicts a window that lists the options under category and net price with its corresponding features. Its category, subcategory, and product name are visible on the right.
Category, Subcategory, and Product Name are all visible
A screenshot depicts a window that lists the options under subcategory and net price. The double arrows icon shows the next level of camcorders with its category, subcategory, and product name.
The double arrows show the next level of the hierarchy but not the level above it
A screenshot depicts a window that lists the options under subcategory and net price. The up arrow icon is used to travel back up the hierarchy.
The drill up arrow
A bar graph compares all sales by country, and its data points highlight the sales by category name. Some of them are beverages, dairy products, confections, meat, seafood, and condiments.
Selecting the USA data point highlights the sales by category in just the USA
A bar graph compares all sales by country, and by choosing beverages on the bottom visual, the upper visual shows the sales of the beverage category by country.
Interactions work from any visual to any other visual
Highlight (the default)
Filter
None
There are some visual types (such as maps) that do not highlight—they can only be filtered or have no interaction. The report developer can control the types of interactions visuals have with each other, either in the Power BI Desktop or in the Service. In this chapter, we will cover controlling interactions using the Desktop.
A screenshot depicts a window that highlights the edit interaction icons with the new set of icons: filter, highlight, and none.
The Edit interactions icons
A screenshot depicts a bar graph of all sales by country, and the arrow indicates the edit interactions icon, along with all sales by category, and unit quantity by country.
Editing interactions from the selected visual to the others on the page
The interaction between two types of visuals does not have to be reciprocal. In the preceding example, All Sales by Country can highlight All Sales by Category, and All Sales by Category can filter All Sales by Country.
When selecting colors for your theme, be sure to consider how the colors will look when highlighted. A light color will be difficult to see once it is grayed back and therefore difficult for a screen reader to “read.”
Interactions can have an impact on the performance of your report. If your users are complaining about the page performance, one option is to disable some or all of the interactions.
In theory, there are no limits on the number of pages in a report or, at least, none that I have been able to find. With that said, there is a limit to the number of pages a report user is going to be willing to tab through. But what if you have a lot of information to share? Or what if you want to make your report users’ lives a little easier by providing them with “shortcuts” to see details that you don’t want to put on the “front page”? In either scenario, and in many more, bookmarks are the answer.
A screenshot lists the options under the View tab, and the arrow mark highlights the Bookmarks pane.
Open the Bookmarks pane
A screenshot depicts a bar graph that compares the quantity shipped of the product category for the country, and two arrows highlight the filter option and the add tab in the bookmarks.
Setting the view to be captured and adding a bookmark
A screenshot depicts a bookmarks tab that lists the options under bookmark 1 with update, rename, delete, group, data, display, current page, all visuals, and selected visuals.
Renaming a bookmark
A screenshot depicts two bar graphs of the quantity shipped of the product and all sales by category name, along with the Bookmarks tab with the update option.
Remember to update the bookmark if you make a change to the page
Once you have created your bookmark, you can then revert to the default (or standard) view. If you make a change to the page that you want reflected on the bookmark, remember to select Update after making the change.
I find it easier to create a Default View bookmark first and then build bookmarks from that. You can delete the Default View bookmark once you have all of the bookmarks created.
A screenshot depicts a format tab that lists the options under visual and the bookmark navigator button allows the de-selection of a bookmark with the default view.
Selecting bookmarks to exclude in the bookmark navigator button
And there is a bonus: both the page and bookmark navigator buttons have all of the rich formatting options available for buttons.
A screenshot depicts a bookmarks tab that lists the options under the view option to enable a presentation-like experience.
Use the View option for presentations
Data
Display
Current page
All visuals
Selected visuals
Understanding the nuances of each of these settings is not on the PL-300 exam. However, the first time you create a bookmark, you will probably be curious about what each of these settings does. I will summarize the Data and Display options, and I encourage you to experiment with the others for yourself.
Data: Preserves the filtering and/or slicing selections you made when you first created (or when you update) the bookmark. In the preceding example, Data was checked, and my slicer selections were preserved in the bookmark. This is default behavior and without a doubt a setting you want active most of the time.
A screenshot depicts a default view of the page, and the select all option views the quantity shipped bar graph for different countries with product categories.
The Default View, with Spotlight turned on for one visual
A screenshot depicts a bookmarks tab that lists the options under visual with the selected data and current page, with its corresponding bar graph.
With Display unchecked, the bookmark inherits the selections made on the previous view of the data
Understanding all the bookmark settings, and how they work, requires a lot of trial and error (in my opinion). I suggest setting up a test page with at least two slicers (or a slicer and a filter) and two visuals. Then save that as a Default View bookmark. Make individual selections on the slicers and save the bookmarks with a name indicating what you did. Then toggle back and forth between the bookmarks to see the impact of changing the settings.
A screenshot depicts a window that lists the options under the view tab with the selection pane that is usually used in conjunction with the bookmark pane.
Opening the Selection pane
A screenshot depicts a selection tab that lists the options under layer order with the bar graph, and the arrow indicates the default view button, which toggles the eye icon.
Making an element invisible (or visible) on a page
A screenshot depicts a selection tab that lists the options under layer order, and the arrow indicates the default view button with the selected Canadian customer's bookmark.
Making an element visible using the Selection pane
Remember to toggle Update when making an element visible or invisible on a particular bookmark.
As in PowerPoint, you can arrange and layer the elements on your report page. On the Selection pane, use the Layer order to place shapes, buttons, and other visual elements on top of each other. Use the Tab order to determine the selection sequence when tabbing on a page.
A screenshot depicts a window that lists the options under the format option, and the arrow highlights the selected group features and the selected visual elements under the selection pane.
Select at least two visual elements to see Format ➤ Group
A screenshot depicts a selection tab that lists the options under the layer order, and the arrow highlights group 1, and a message that reads double-click on the name to rename the group.
Groups appear with their constituent elements on the Selection pane
A screenshot depicts a selection tab that lists the options under layer order, and the arrow indicates the category visual groups.
Setting a group as not visible
You might be familiar with the Selection pane in PowerPoint. (If not, it is worth checking out!) As in PowerPoint, you can rename elements on your Selection pane (as I did with the preceding group) to make them easier to select. However, there is one “gotcha.” If you rename a visualization on the Selection pane, that name will appear in the visualization’s title as shown in Figure 10-26. This is not usually a desired outcome. But you can rename any page element successfully, as long as you are not using the Title feature for the visual.
A screenshot depicts a selection tab that lists the options under the layer order with bar graphs, and the arrow indicates the highlighted visual option and the bar graphs.
Renaming visual elements on the Selection pane
Expecting the report user to know how to open the Bookmarks pane to view and use bookmarks might be a big “ask.” Many report developers combine buttons and bookmarks to make it easy for report users to navigate to a bookmark. We have already seen how to set up a button and how to set up a bookmark. All that remains is for us to combine them.
A screenshot depicts a format tab that lists the options under the button, and the arrow indicates the bookmark button option that automatically sets the action to bookmark.
A Bookmark button
A screenshot depicts a Return to Default View button, and the arrow indicates the Ctrl+click here to follow link tooltip on the bookmark button.
Add helpful text to your button
In an earlier chapter, I said that when you are working in the Power BI Desktop, you are in developer mode. Some features do not work without using the Ctrl key. Once the report is published into the Service, Ctrl is no longer required.
Perhaps you have reached saturation in the number of features you can use in a Power BI report. Take a break and stretch. Maybe take a walk outside—because we have one very rich area still to cover: tooltips.
A screenshot depicts a bar graph of the quantity shipped, along with its product category, and the arrow indicates the tooltip pops up over a specific data point on the bar graph.
A typical tooltip
A screenshot depicts the X-axis that lists the options under add data fields, and the arrow indicates the highlighted number of the product IDs and the tooltips.
Adding a measure to the Tooltips field well for a visual
Be very judicious when adding additional measures to a visual in the Tooltips field well because every measure requires additional calculation time, which can slow down the report’s performance. This is true even if the report user never hovers over a data point to see the tooltip! If your report page is slow, one possible reason is too many measures in the Tooltips field well for one or more visualizations.
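If you prefer an explicit measure to the implicit count shown in the preceding figure, here is a minimal DAX sketch. The Sales table and ProductID column names are my assumptions for illustration, not necessarily the names in your model:

// Hypothetical explicit measure for the Tooltips field well.
// Assumes a Sales table with a ProductID column (illustrative names).
Number of Product IDs =
DISTINCTCOUNT ( Sales[ProductID] )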
Don’t forget that if the field or measure has a “clunky” name, you can rename it just for the visual in question as shown in Figure 10-31.
A screenshot depicts the window that lists the options under the number of the product IDs with the highlighted rename for this visual option.
If you rename a field or measure, be consistent across visualizations
A screenshot depicts a window that has a bar graph of the quantity shipped, along with its product category, and the arrow indicates the fields in a readable tooltip.
A more readable tooltip
A screenshot depicts a visualizations tab that lists the options under general, and the arrow indicates the color of the background of the tooltip.
Formatting tools for the background and text of a tooltip
Always keep accessibility in mind before changing all your tooltips.
A screenshot depicts a tooltip page that lists the options with the selected hide page options.
A new tooltip page, named “Tooltip Page” for clarity
A screenshot depicts a visualizations tab that lists the options under the format page, and the arrow indicates the page information to update the tooltip and the canvas size.
Setting the page as a tooltip page
A screenshot depicts a canvas settings tab, and the arrow indicates the size of the tooltip to change the setting of canvas to custom and adjust the pixels.
Adjusting the tooltip display size
A screenshot depicts a window with the text largest order amount of $15.81K, indicated by an arrow.
The tooltip page, formatted
A screenshot depicts a visualizations tab with the bar graphs, and the arrow indicates the highlighted format visual and the tooltip page.
Setting up the visualization to display the tooltip page
A screenshot depicts a window with the bar graph along with the text largest order amount of $2.57K, which is indicated by the arrow.
Displaying the entire tooltip page
Not only are tooltip pages a cool concept, but they can also help improve the performance of your report page. Unlike a garden-variety tooltip, a measure used in a tooltip page is not evaluated unless the report user hovers over the data point.
For a tooltip to work, the fields (or measures) used in the visualization and tooltip must be related in the data model.
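The tooltip page in the preceding figures displays a “largest order amount” value. As a sketch of how such a measure could be written (assuming a Sales table with OrderNumber and LineAmount columns; both names are illustrative):

// Hypothetical measure for the tooltip page.
// Iterates the distinct order numbers in the current filter context,
// totals each order via context transition, and returns the largest total.
Largest Order Amount =
MAXX (
    VALUES ( Sales[OrderNumber] ),
    CALCULATE ( SUM ( Sales[LineAmount] ) )
)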
A screenshot depicts a window that lists the options under the view tab with the highlighted mobile layout, along with several graph designs.
Accessing the Mobile layout
A screenshot depicts a page visuals window with the bar graph of the quantity shipped, along with its country and category, which are indicated by the arrow.
Drag, drop, and arrange selected visuals on the mobile canvas
A screenshot depicts a visualizations tab that lists the options under visual with the bar graph of the quantity shipped along with the country and category, which are indicated by the arrow.
Adjust the formatting of visualizations for optimized viewing
Power BI’s purpose is to allow you to visualize your data and thereby facilitate analysis of that data. In this chapter, I will explore several features and visualization types that are available to make analysis both easy and comprehensive. Let’s start with the Analytics feature.
Power BI has a lot of visualizations that we readily recognize: column, bar, the much maligned pie, and, of course, line charts. Some of these visualization types have an additional feature available: Analytics.
A screenshot of a line visualization column with analytics features. The analytics icon on the right is selected to open a search bar.
The Analytics pane for a line chart
A screenshot of a trend line visualization in the analytics menu. The line chart plots total sales versus year. It depicts the sales trend of gadgets which mostly peaks in 2018 and declines by 2020.
Adding a Trend line to a line visualization
In the “Identifying Outliers” section, there is another example of adding an Average line to a visualization.
If the type of data you are working with supports it, the Analytics pane will display an option for Forecasting. For the PL-300 exam, be sure to understand the components that go into creating a forecast in case you are asked to create a forecast line. There are many excellent blog posts that walk through the steps, using data that supports forecasting.
A screenshot of a trend line chart for total sales by year. Right-clicking the peak sales data point in 2018 for computers opens a menu. The options, analyze and explain the increase, are selected.
Right-click a data point and choose Analyze ➤ Explain….
A screenshot of a visualization for sales data model. A bar chart analyzes the total sales for 2017 and 2018 by country. A plus icon on the top right adds useful visualizations to the report page.
One of the visualizations produced by the Analyze feature
A screenshot of a visualization for total sales by year and month. A shaded area in the scatterplot indicates a clustering pattern from around September 2017 to April 2018 between $0.0 trillion and $0.2 trillion.
The gray band indicates a pattern
A screenshot of an analytics menu with average sales line visualization. The scatterplot of total sales by year and month has an average line that makes the outliers more visible.
An average line makes the clustering clear
A screenshot of a line chart visualization. The line chart depicts the sales trend of the gadgets from 2017 to 2019. The year column is used on a continuous axis.
Using a Year field on a continuous axis
A screenshot of a format visual menu in the visualizations panel. The X-axis under the visual tab is set to categorical type.
Setting the X-axis to Categorical
A screenshot of a line chart sales visualization with a categorical X-axis. The years on the X-axis are not in chronological order because the menu has sorted total sales in descending order.
Using a Year field on a categorical axis
Most data will naturally fall definitively in one of the two types. But if you are working with data that can be either continuous or categorical, experiment with the different types of axes to see if one suits your analysis better than the other.
Sometimes your data has a lot of values, and it would be useful to be able to organize them into larger groupings. Power BI has several useful tools to help you accomplish this.
A screenshot of a bar chart visualization of total sales by year and month. The group data option from the menu is selected to group a selection of data points made while holding down the Shift key.
Selecting the data points for sales over 200 billion
A screenshot of a fields panel. Once a group of selected data points is created, it will appear in the same table as the base field. The year-month group from the date dropdown menu is highlighted.
A group appears in the same table as the base field
A screenshot of a menu option to edit groups in the fields panel. A popup menu appears after clicking the ellipses beside the group name. The edit groups option is selected.
Editing a group from the Fields pane
A screenshot of a group menu. It has dialog boxes for name, field, group type, ungrouped values, and groups and members. The last section provides the options to add, remove, and rename a group.
Editing groups
A screenshot of a bar chart of total sales by month and year. The bars of 2 different shades represent grouped data. The groups are reflected in the legend and the data colors reflect the group.
Naming a group improves the clarity of a visualization
A screenshot of a group in a slicer visualization field. The field on the left selects the year-month group and the corresponding groups are visible on the right.
Groups can be used in a slicer
A screenshot of the menu option to add a group to the filters panel. A popup menu in the visualization fields panel highlights the add to filters option which further opens the visual level filters.
Adding a group to the Filters pane
A screenshot of a filter pane. The year-month group filter is selected and all the group options are checked. A note mentions that a group behaves the same way a field does on the filter pane.
A group on the Filters pane
A screenshot of the menu option to create bins on a calculated column. The ellipses menu is chosen or the column is right-clicked to open a popup menu. The new group option is selected.
Choose New group to create bins on a calculated column
A screenshot of a group menu. It has blanks for name, field, group type, bin type, minimum and maximum values, and bin size. Bin type and bin size indicate the natural bin split.
Creating bins based on the size of each bin
A screenshot of the group menu. It has blanks for name, field, group type, bin type, minimum and maximum values, bin count, and bin size. The size of the bin here is determined by the bin number.
Setting a predetermined number of bins
A screenshot of a sales dropdown menu. The Total Sales CC bins option is highlighted, and a note reads, the bins will appear in the same table as the calculated columns.
Bins appear in the same table as the calculated column
A screenshot of a menu to create a number of orders in each bin. A note pointing to the Y-axis field reads, the Count distinct implicit measure is used on the order number.
Using a Count Distinct implicit measure with the bins
A screenshot of a bar chart visualization of the number of orders in each bin. The visualization divides the total sales column into bins to analyze other values. The highest bar is at the (0, 5685) bin.
The resulting visualization
If I were using the visualization shown in Figure 11-23 in a report, I would spend more time perfecting the formatting on the X-axis!
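For reference, the binning that the Group dialog performs can be approximated by hand with a calculated column. Here is a minimal sketch, assuming a Sales table with a Total Sales CC calculated column and an OrderNumber column; the bin size of 5,685 echoes the natural bin split shown earlier, and all names are illustrative:

// Hypothetical calculated column: floors each value to the start of its bin.
Total Sales Bin =
FLOOR ( Sales[Total Sales CC], 5685 )

// Hypothetical explicit equivalent of the Count Distinct implicit measure.
Number of Orders =
DISTINCTCOUNT ( Sales[OrderNumber] )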
The last way of grouping your data is to let Power BI do it for you with the clustering feature.
There are some very sophisticated statistical techniques for clustering. If you want to be able to use those techniques, you should probably use R or Python and then create a visual.
A screenshot of a menu to access the clustering feature. The screen displays the total sales and name columns. Automatically find clusters is selected from the ellipses menu on the top right.
Access the clustering feature from the ellipsis menu on the visual
A screenshot of a data table that uses the clustering feature. The 3 columns list the total sales, name, and cluster. Power BI found 4 clusters. The online store is in a cluster by itself.
Power BI found four clusters in the data
Now the resulting field can be used in a slicer, on the Filters pane, or on another visualization.
There are several types of visualizations that use artificial intelligence to detect patterns that may not be immediately apparent. I will cover two of these here, but I encourage you to experiment with all the visualization types before sitting the PL-300 exam.
A screenshot of the key influencers visualization. The visualization is set to analyze the impact on total sales by category.
Setting up the key influencer visualization
A screenshot of the key influencers visualization in the music, movies, and audio category. The analysis of the total sales is based on the impact. Total sales increase almost 6X.
The Music, Movies, and Audio Books category drives up sales
A screenshot of the top segments component of the key influencer visualization. It generates total sales in segment 1 for music, movies, and audio category and analyzes it with the overall.
Profiling a segment of the data
A screenshot of a format visual panel in the visualizations menu. It presents the key influencers visualization under the visual tab of the format option.
The Format visual pane has functional features as well
While the key influencer visualization is interesting, I confess that it isn’t my favorite AI visualization. That title is reserved for the decomposition tree. Despite its unattractive name, the decomposition tree is elegant and simple to use.
A screenshot of a decomposition tree visualization menu. The total sales option is selected from the analyze dropdown. The sub-category and category options are selected from the explain-by menu.
The decomposition tree setup
A screenshot of explain by fields. The high-value option is selected and a note reads, the analyze by field is expanded using the + icon. The explain-by field has ascending and descending options.
Sorting the Explain by fields
A screenshot of explain-by fields. The total sales bar has categories and sub-categories. The appropriate bar is selected to view the breakdown of each category.
Viewing the Explain by fields
A screenshot of explain-by fields with decomposition tree settings. The setting has default selected from the density dropdown menu and filter or highlight as the default action.
The formatting options for the decomposition tree
You have created and published your report. Bravo! In Chapter 14, you will read about how to publish and update a report once it is published in the Service. In that chapter, I specifically say that updating the report does not include updating the data. “But wait,” you say! Isn’t the whole point to be able to see the latest data in the report, at any time? Yes, you are correct—having up-to-date data in the report is one of the main benefits of using Power BI. It is time to talk about refreshing the data.
Free license: No automated refresh, just manual refreshes
Pro license: Up to eight scheduled refreshes per day
Premium per user: Up to 48 refreshes a day
Workspace in Premium capacity: Up to 48 refreshes a day
Once you have published a report (and its accompanying dataset) in the Service, you will need to schedule automated refreshes, license permitting. A refresh is set up on a dataset, not the report itself.
Datasets do not have to be in the same workspace as the report.
A screenshot of the Contoso PBIX window. A note pointing to an icon on the top right mentions that there are 2 refresh icons, namely scheduled refresh and manual refresh, next to the dataset entry.
Accessing the Scheduled refresh for a dataset
A screenshot of the settings panel under the datasets tab. The settings panel for Contoso PBIX has the scheduled refresh menu at the bottom.
The Scheduled refresh in the Settings menu
Turn on Scheduled refresh. It is not on by default (shown in Figure 12-3).
Set the refresh frequency.
Set the time zone.
Add the times when the refresh should occur.
Note: Microsoft does not commit to meeting these times to the minute. For example, if you set the refresh time as 2:00 p.m., the refresh may happen at 1:58 p.m. or 2:01 p.m.
Determine who should receive notification if the refresh fails. It is very important to make sure that someone is notified because if a refresh fails three times, the refresh is turned off.
A screenshot of the scheduled refresh menu. The refresh button is turned on and the refresh frequency, time zone, and time fields are displayed. The apply and discard buttons are at the bottom.
The Schedule refresh settings
If, when you open the Scheduled refresh menu, all the items are grayed out, that means that the Power BI Service is not able to access the source data to perform the refresh.
On-premises data gateway (personal mode)
Virtual network data gateway
A screenshot of the settings menu on the Contoso PBIX window. A note mentions that the settings menu can be accessed by using the ellipses on the same row as the dataset.
Accessing the Settings menu for the dataset
A screenshot of the settings menu under the datasets tab. The settings menu for Contoso PBIX displays the gateway connection option at the bottom.
The Gateway connection sub-menu
A screenshot of the gateway connection window. 5 columns display the gateway information. A row highlights a personal gateway that is running on a personal computer.
An example of a personal gateway running
A screenshot of the menu options to add gateways. The manage gateways option from the settings menu is selected.
Accessing the Manage gateways menu
On-premises data gateways
Virtual network data gateways
A screenshot of the data preview window. It has data gateways and data sources tabs. The latter opens 5 columns titled name, data source type, users, status, and gateway cluster names.
The list of gateway connections used in the workspace
Sometimes, you may want to access Excel and CSV content without having to install a gateway (personal or other). When you create a workspace, you can associate a OneDrive instance with it and then store Excel and CSV content. In this scenario, the Power BI Service can access the data sources and refresh once an hour.
A screenshot of the menu options to add people to roles in the Power BI Desktop. The security option in the settings menu is selected.
Accessing the Settings ➤ Security menu for the dataset
A screenshot of the row-level security section. The computer category manager role is mentioned on the left and a search bar on the right is used to find people or groups who belong to this role.
Adding individuals (or security groups) to a role
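Behind the scenes, each role is defined by a DAX filter expression that is evaluated row by row against a table in the model. As a hedged sketch, a role like the computer category manager role shown above might filter a Product table as follows, and a dynamic variant can key off the signed-in account (the table and column names are my assumptions):

// Hypothetical static RLS filter on a Product table:
[Category] = "Computers"

// Hypothetical dynamic filter on a Users table, matching the signed-in user:
[Email] = USERPRINCIPALNAME ()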
Anyone who is an Admin, Member, or Contributor in a workspace automatically has both build and share permissions. This means that they can create a new report using an existing published dataset (even if they didn’t publish it). Admins, Members, and Contributors can also share a report (and its underlying dataset) with others who do not belong to the workspace.
Sharing with colleagues who are not members of the workspace (or subscribed to the associated app) can lead to “security sprawl.” It is a good idea to have very clear governance about when sharing with non-workspace members should be allowed.
A screenshot of menu options to initiate, share and build permissions. The manage permissions option from the settings menu is selected.
Accessing the Manage permissions sub-menu
A screenshot of a direct access tab in the Contoso PBIX window. The tab opens 3 columns that list the people and groups with access, their email addresses, and permissions.
Workspace members have share and build permissions by default
A screenshot of menu options to grant access to the dataset. The add user tab is clicked to open a grant people access window. The person's name is entered and the permission type is selected.
Granting share or build permissions to an individual
A screenshot of a Contoso PBIX window with options to create a sharing link. The related contents are listed on the left, and the add link tab on the right opens the links.
To create a link, you must select the report
A screenshot of the permission settings on the Contoso PBIX window. The settings provide options for selected people to access the links. The link can be set to allow either share or build permissions.
Defining the permissions granted by the link
A screenshot of the links tab. The links tab opens 3 columns that list the links, people who have access, and permissions.
Displaying the links
A screenshot of the ellipsis menu under the links tab. The link column lists the links with an ellipsis menu that has options to copy links, delete, manage access, and see shared views.
Adjusting the links
When students introduce themselves in my classes, I ask them to tell me what they want to learn. Invariably, I will hear that they are using dashboards and they want to learn how to create one. When I dig a little deeper, it often turns out that they are using reports, not dashboards. Now that we are close to the end of this book, I hope that it is clear that dashboards and reports are two different things. So far, we have focused on reports. This chapter is dedicated to dashboards.
Is a virtual one-page document
Can only be created in the Service
Can contain visuals from multiple reports published in the workspace
Can contain streaming content (reports cannot)
The purpose of a dashboard is different from that of a report. It is intended as a “snapshot” of the data being represented by the visualizations that are included on the dashboard. I say “snapshot” quite deliberately because, by default, the visual tiles do not interact with each other. This is in sharp contrast to reports, where visualizations interact with each other by default. (Remember Edit interactions?) There is some anecdotal evidence that organizations don’t use dashboards as much as reports, perhaps because users expect dashboards to act exactly as reports do. I have a suggestion for the ideal purpose for a dashboard—feel free to use it as a jumping-off point for your imagination.
In a previous working life, I was a facilities manager. After the 9/11 events, I was asked to put up large TV monitors in common areas, for example, the lobby, the break rooms, and the cafeteria. (You may have noticed this in your own office.) The perfect application for a dashboard is to be displayed on these monitors. They are usually mounted high on a wall, so no one expects to interact with the dashboards. The data is being displayed for information, not investigation. Setting users’ expectations appropriately can reduce their frustration with the mechanics of dashboards.
A grouped bar graph plots the quantity shipped by product category for the years 1996 to 1998.
Pin your visualization using the pushpin icon
A screenshot of the Pin to dashboard dialog box, with options to pin to an existing dashboard or a new dashboard. A bar graph on the left displays the quantity shipped of various products.
Choose an existing or create a new dashboard
Keep scrolling to a minimum on a dashboard by limiting the number of visualizations you pin.
A screenshot of a popup window with a display pinned to a dashboard and two buttons to create a mobile layout and go to the dashboard.
This pop-up menu only stays up for a short time
A screenshot of a Power BI dashboard with the option of a training workspace. There is a box with text that reads, clicking on the workspace name returns you to the main menu of the workspace.
Follow the “breadcrumbs” at the top of the screen
A screenshot of the training workspace to create a sample dashboard under the navigation menu. There is a box with text that reads, use the left-hand navigation menu to get to dashboards, reports, datasets, and workbooks.
Using the left-hand navigation
A screenshot of the dashboard displays a logo of Adventure Works, a bar graph of revenue by category, units by manufacturer, and quantity by ship country.
Example of a dashboard
A screenshot of a menu bar with an edit option to select the dashboard theme. The editing options for the dashboard are limited.
Editing options on a dashboard
A screenshot of the Add a tile menu. Under the media heading, the options are web content, image, text box, and video tiles. Under real-time data is the custom streaming data tile.
Adding a tile with live data (or an image or a text box)
A screenshot of a dialog box to select the dashboard themes as required. There is a box with text that reads, to improve on the limited formatting options, a JSON theme can be uploaded to customize the dashboard.
Changing your dashboard theme
A bar graph of revenue by category. There is a drop down menu on the right with various options. An arrow points to the option to copy visuals as images.
Limited editing options on a tile
Most of the options are straightforward, but there are a couple worth exploring.
A spreadsheet with details on categories, subcategories, product names, and sales information. A text box on the right reads, the result of export to .csv.
Export to .csv result
A bar graph in focus mode, with an exit focus mode option and filters applied as needed. There are also visuals of total sales by subcategory and total sales by category.
Visualizations generated by the AI engine using View insights
This option may seem odd—isn’t this visualization already pinned? Yes, but what if you want to pin it to another dashboard? Nothing easier.
As I mentioned at the beginning of the chapter, some users are disappointed when viewing dashboards, because the tiles do not interact with each other. Upon consideration, this is only to be expected, as visuals can come from multiple reports and different datasets. If you click a tile, you will return to the source report. But there is one way to achieve interactions between tiles on a dashboard: pin an entire report page.
A graph of revenue by category: urban 872 M dollars, rural 198 M dollars, mix 28 M dollars, and youth 27 M dollars. There is also a drop-down menu with a pin to the dashboard option on the right.
Pinning an entire page to a dashboard
A screenshot of a window with order details on a live pinned page, with the report title and the number of pages.
A live page has the report name in the upper left-hand corner
Think carefully before pinning a live page to a dashboard. Can the users benefit from simply using the source report? Dashboards and reports are not the same thing and shouldn’t be used interchangeably. Can the report be simplified and shared with users via the app?
A screenshot of a menu bar whose drop-down menu has a settings option with several features. A logo for Adventure Works is in the background.
Accessing the File ➤ Settings menu on the dashboard
A screenshot of a dialog box with a settings menu displays options for featured, Q&A, comments, and dashboard tile flow.
Part of the Settings menu for a dashboard
We will be talking about Q&A in the following section. The Dashboard tile flow setting automatically rearranges the tiles if a tile is added or deleted.
Setting up your dashboard to perform well on a mobile device is easy. Simply choose Edit ➤ Mobile layout. You can unpin tiles from the dashboard to make the mobile view easy to navigate.
In an earlier chapter, we covered setting up the natural language feature, Q&A. There are several settings that must be enabled in the Service and in the Power BI Desktop.
A screenshot of the Power BI window with a modeling option on the ribbon, in which the Q&A setup is visible.
Setting up Q&A in the Desktop
A screenshot of the training workspace with the name of the file and the type of the file. A drop-down menu with a settings option is to be selected to access the Q&A feature.
Accessing the Settings menu for a dataset
A screenshot of the dialog box in the settings menu to turn on Q&A, which is used to create an index of the data.
Making sure Q&A is turned on for the dataset
Now that Q&A is enabled, you can take advantage of it on a dashboard and on a report.
A screenshot of a window with a header bar that reads, ask a question about your data.
The Q&A feature at the top of a dashboard
A screenshot of the Q&A window, with a type a question bar and a few suggested questions.
The Q&A window
A screenshot of the Q&A window, with details of the order ID, customer ID, employee ID, required date, shipped date, ship via, freight, ship name, ship address, ship city, and ship region.
Q&A generates a visual from a question
A screenshot of the Q&A window with details on the order ID, customer ID, employee ID, required date, ship via, and freight.
You can pin a Q&A visual if you like it
A screenshot of a refresh option to reload the data on the page. The data is automatically refreshed based on the refresh schedule.
Refreshing visualizations
Your report is done, and you want to share it with colleagues and, possibly, a wider audience within your organization. It can be tempting to email your PBIX (a Power BI Desktop file), but you will quickly find this method of sharing less than optimal, to say the least. PBIX files can be very large, and many email servers will block them. Additionally, there are features of your report that will not work properly unless the report is published to the Power BI Service. So what do you do? You publish your report to a specific workspace in the Power BI Service. Not sure what that means? Read on.
Save your report with a unique name.
Log into the Power BI Service.
Choose the workspace where you want the report to be published.
Publish.
A screenshot of the window. It lists the sign-in details to enter your credentials, along with the continue and cancel buttons.
Sign into the Power BI Service using your credentials
A screenshot of the Jessica Jolly Microsoft account page. The arrow indicates the sign-in with a different account option.
You can sign in with different credentials if needed
Now that you are signed in, you are ready to publish your report to a workspace in the Service.
My workspace
Workspaces
A screenshot of the publish to Power BI page. It has a Select a destination option, and the arrow indicates the publish tab and the training workspace option.
Selecting the appropriate workspace
A screenshot of the publish to Power BI page. The arrow indicates the open Northwind query in Power BI link, which goes straight to the report in the Service.
You can go directly to your report in the Service
A screenshot of the Office page. It lists various apps, and the arrow indicates the app launcher icon and the all-apps menu.
Find Power BI in the list of your apps in Office 365
Go to https://app.powerbi.com and sign in using your credentials.
A screenshot of the Power BI home page with the home, create, and browse options. The arrow indicates the Power BI Service with the app launcher icon.
From the Power BI Service, you can go to other places in Office 365
A screenshot of the Power BI home page. It has home, create, browse, a data hub, metrics, apps, learn, and workspaces. The arrow indicates the left-hand navigation icon.
The navigation menu for the Service
Home: Frequently accessed reports and dashboards, recent files, favorited files. Everybody’s Home screen will vary based on your organizational settings and your use of the Power BI Service.
Create: Initiate the creation of a report in the Service. Note that you will have to use a previously published dataset.
Browse: Browse the contents of the Service.
Data hub: Lists all the datasets published and available to you in the Service.
Apps: Where all the apps to which you have access are listed. You may have to hit Get apps to find an app initially.
Learn: One-stop shop for finding additional learning resources.
Workspaces: A list of all the workspaces to which you have access.
My workspace: Access to your My workspace, a sandbox environment ideal for experimentation. Not designed for collaboration on production reports and dashboards.
A screenshot of my workspace page with a search dialog box. The arrow indicates the training workspace option.
Choosing the workspace to work in
Reports: Reports can be published from the Power BI Desktop or created in the Service. If you want to create a report in the Service, you must use a dataset that has been published into the workspace or that has been shared directly with you.
Datasets: A dataset accompanies every report that is published from the Power BI Desktop. It is a combination of the data model and the data.
Dashboards: A dashboard is a virtual single page that can comprise visualizations from any report published in the workspace.
Dataflows: A dataflow is the output of the Power Query Editor in the Service. (Yes, the Power Query Editor is available in the Service.)
Excel content: Excel workbooks can also be stored in a workspace.
A screenshot of the training workspace page. It lists the options under all, and the arrow indicates the dashboard options and the appended files.
The colors of tiles indicate the type of content
A screenshot of the page. It lists the options under the filters menu with the type, owner, and other, which are indicated by the arrow.
The Filters menu
Admin: This is the highest level of privilege. Someone with Admin rights can delete the workspace. It is a good idea to have at least two people with Admin rights in the workspace.
Member: Members can add, edit, and delete content. They can also add other people to the workspace. They can publish, edit, and unpublish the app associated with the workspace.
Contributor: Contributors can add, edit, and delete content. Contributors can also be given optional rights to update the workspace’s app.
Viewer: Viewers can only view content. They have no other rights.
A screenshot of the access page that lists the options under the training workspace. It includes the email address dialog box and the selected member, indicated by the arrow.
Workspace membership levels
Because three of the four access levels allow the editing of any content in the workspace, establishing governance practices is very important. Anyone with Admin, Member, or Contributor access needs to follow agreed-upon practices to ensure that unintentional changes do not occur.
A screenshot of my workspace page. It has a search dialog box with the highlighted training workspace and an arrow that indicates the create workspace button.
Creating a workspace
Name: Ideally, your organization has a naming convention to avoid duplicative names. If not, name your workspace something that clearly identifies it.
Description: A good description will remind participants of the purpose and audience for the workspace. An overall governance statement in the description would not be a bad idea either.
Contacts: The default contacts are the workspace Admins, but it may be more appropriate to put the business owners in as contacts, particularly if IT is the only one with Admin privileges in the workspace.
Workspace OneDrive: A dedicated OneDrive space can be designated for the workspace. The advantage is that files that support datasets can be put in OneDrive. For example, if Excel files that are part of a dataset are stored in the OneDrive associated with the workspace, the Service will be able to refresh them once an hour.
Template app: A template app is one that is designed and developed for viewers external to the organization.
Contributors update app: This is an optional setting.
Before creating a workspace, it is best to have a clear idea of both the purpose of the workspace and the intended audience (if any) of the reports and dashboards. This is because, as of this writing, there is only one app allowed per workspace. Recently, Microsoft has added the ability to define multiple audiences for a single app, which does help. Your workspace still can only have one app, but different content within the same app can be viewed by different audiences. This is a relatively new feature (as of this writing), so you may not see it implemented in your workspace yet.
Larger datasets (above 10 GB).
Paginated reports.
More frequent refreshes (48/day).
Viewers do not require a Pro license.
If a workspace resides in Premium capacity, it will have a diamond icon next to it.
A screenshot of the training workspace. It lists the options under new with the report, paginated report, scorecard, dashboard, dataset, dataflow, streaming dataset, and upload a file.
Creating a report in the Service
A report that is created in the Service must use a preexisting data model. No additional content (e.g., columns, measures) can be added to the report in the Service.
A screenshot of the training workspace. It lists the options under the new menu, and the arrow indicates the upload a file option, which is used to add content to the workspace.
Adding additional content to the workspace
Once a report is published, imported, or created within the Service, the question of updating will eventually arise. Not the data, of course; that is taken care of when you establish the refresh schedule. But what happens when a visual needs to be added, deleted, or changed? You can edit the report in the Service, but this option should be exercised with significant caution.
If a report is updated in the Power BI Desktop and republished to the Service, it will overwrite the version of the Power BI report saved in the Service. This is by design—you want to make sure that the latest version of the report is what is available in the Service. This means that if a change is made in the Service, it will be overwritten when a new version of the report is published, unless the change is replicated in the Power BI Desktop.
With a restricted group of people (you can also share with a security group)
To the entire organization
These are either/or conditions—you can’t share the same app with both a specific group of people and the entire organization.
I think of an app as a magazine or newsletter. Someone decides what content is going to be included, and then the app is updated on a periodic basis (daily, weekly, monthly, quarterly). The great thing about using the app as a sharing vehicle is that app viewers do not see the working copy of the content. If a report is published in the app (e.g., in January), viewers will only see the January version of that report until the app is updated. To designate content that should be included in the app, toggle the Include in app to Yes, as shown in Figure 14-15.
Microsoft has released a newer set of features for apps, which allow for the designation of different audiences for content in the app. The screenshots below pertain to V1 Apps, not to V2 Apps.
A screenshot of a page. It lists the include in app option with the yes toggle to share the content in the workspace’s app, indicated by an arrow.
Toggle the Include in app to Yes
Only reports, dashboards, and Excel content can be shared in an app.
Viewing content in an app requires a Pro license unless the workspace is stored in Premium capacity (see earlier).
App name (which can be different from the workspace name)
Description (mandatory)
Support site: A place to put a URL to get more information
Contact information: One contact (or group) for the app or individual contacts for each report or dashboard (this must be set individually)
A screenshot of the training workspace. It lists the options under setup with the app name, description, support site, app logo, app theme color, and contact information.
The Setup screen for an app
Sections: You can use sections to organize your content. In the following example, the sections are Reports and Dashboards.
Links: You can add links to the app.
A screenshot of the training workspace. It lists the options under navigation with the Northwind query along with the report details and is indicated by an arrow.
Setting up Navigation in the app
A screenshot of the navigation page. It lists the options under the Northwind query with the report details, and the arrow indicates the dashboards and the hide from navigation options.
Assigning content to a specific section
Access: This is where the audience for the app is identified. Note that it is either to a restricted group of people or the entire organization.
Build permissions: Allowing build permissions gives anyone who has access to the app the ability to discover the underlying dataset and build a new report off the dataset. Think carefully before allowing build permissions in the app.
Share permissions: Allowing share permissions gives anyone who has access to the app the ability to share the report and the underlying dataset, with accompanying build permissions. Think carefully before allowing share permissions in the app.
Pushing the app: You can “push” the app to all authorized viewers, which saves them the step of going to Get apps and “pulling” the app.
A screenshot of the training workspace page. It lists the options under permissions with access. The arrow indicates the specific group and the app with build and share permissions.
Access controls for the app
If, after creating the app, you need to change the settings, click Update app. The Settings menu will open, and you can alter settings as needed.
If an app is no longer needed, you can delete it, provided you are an Admin or Member of the workspace.
Once a report or dashboard is published in an app, viewers can subscribe to receive notifications when the data is updated.
We have worked through a lot of settings already, but there are still more to examine.
A screenshot of the Microsoft page. It lists the toolbar menus, and the arrow indicates the data sensitivity labels.
The Sensitivity tile is available when the feature is enabled
A screenshot of the Microsoft page. It lists the options under name and type, and the arrow indicates the data sensitivity label.
The Sensitivity label is reflected once the report is published
It is important to know that as a Power BI user, you do not have control over whether data sensitivity labels are available or not.
Once an organization becomes data-centric, the origin and quality of data is very important. The Power BI Service has several tools that help report and dashboard users assess the reliability of the data used.
Reports
Datasets
Dataflows
Apps
When you promote something, it is akin to signing your name to your work. I would suggest establishing criteria for content worthy of promotion.
The source of the data
The completeness of the ETL process
The accuracy of any measures or calculated columns
What your organization considers worthy of certification will differ from another organization’s criteria. Once the criteria have been established, the policy needs to be written, and the accompanying process needs to be designed and implemented. Only after these steps are complete should the certification feature be implemented within Power BI. As a user, you don’t have any control on whether the certification option is available—that is up to the Power BI Admin. Once it is, only selected individuals will be able to certify content.
Congratulations! If you have made it this far in the book, my hope is that you are ready to sit the PL-300 exam. But remember what I said in the first chapter? I want you to be ready to work with Power BI. How can you continue your growth?
I am in no way affiliated with any of these resources.
This could never be a comprehensive list of resources. Any omissions are either unintentional or because I am unaware of them.
Let’s start with YouTube, which is where the lion’s share of the resources is.
This is the channel where I started my learning journey. Adam Saxton and Patrick LeBlanc (who both work for Microsoft) have been creating videos since 2017, so the library of videos is extensive. One of the things I love about their channel is that their content runs the gamut from beginner to complex topics. No matter the stage you are in, you will find helpful, relevant material on this channel.
Guy in a Cube (GIAC) provides a bonus offering: they hold a one-hour live session most Saturdays at 9:30 a.m. Eastern, which is open to anyone. Attendees are from all over the world, and, if you attend regularly, you will get to know some Power BI heavy hitters by name. Adam and Patrick take questions from participants and answer them. If you choose to join as a member, you can pre-submit questions.
I love this channel because Reid alternates between in-depth interviews with Power BI luminaries and practical videos. Reid has a design background, and many of his videos focus on enhancing both the appearance and functionality of your report. Do you have questions about buttons, bookmarks, and visualizations? Reid’s channel is a must-watch. You can also support Reid’s channel by becoming a member.
Ruth’s channel is wonderful because she produces videos that cover the entire Power BI spectrum. You will find videos on DAX, the Power Query Editor, and the Service. Her style is very approachable and easy to follow. She also has fun contests and events, such as the “25 Days of DAX,” which she ran in December 2021. You can’t go wrong following Ruth’s channel.
Reza Rad is royalty in the Power BI community. In addition to the extensive content he produces, he also runs conferences that are affordable and well worth your time and money.
No list of YouTube channels (or blogs) would be complete without talking about SQLBI, run by Marco Russo and Alberto Ferrari (colloquially known as “the Italians” in the Power BI community). There are so many good resources I don’t quite know where to begin.
They regularly post videos about DAX functions and concepts. Some of these videos are appropriate for beginners, and some are much more advanced. Every video is accompanied by a blog post on SQLBI.com.
They have an Unplugged series. In these videos they tackle a topic or challenge, but they do not edit the resulting video. I love these videos because you can follow their reasoning as they work on solving the problem. Many times, watching them solve a problem has enlightened me more than seeing the result.
Finally, they just launched a Whiteboard series, which they are basing on the demonstrations they perform in class. I have attended their live DAX class and can attest to the value of seeing a concept explained using diagramming on a whiteboard.
Chris Wagner’s YouTube channel is my go-to when I want to tackle some more technical concepts. He does a great job of explaining difficult concepts clearly. He has a wealth of knowledge and experience in implementing, maintaining, and governing Power BI within a large enterprise. Not to mention he has a great sense of fun!
For a while, Alex Powers and Alex Dupler ran a Saturday morning live stream right after Guy in a Cube. I loved this channel because they would go deep into some technical concepts. While there aren’t a lot of videos on this channel, every one of them is worth watching.
Mike Carlo, Tommy Puglia, and Seth Bauer host a twice-weekly podcast called “Explicit Measures.” I love this podcast because they cover real-world topics in a very approachable way. I can’t recommend them enough!
There are so many good blogs out there it is very hard to confine myself to just a few.
As I mentioned earlier, Marco Russo and Alberto Ferrari post articles that accompany the videos they post on YouTube. I am someone who learns by reading, so having an article to read really helps me solidify a concept. I often will watch the YouTube video, then read the article, and then rewatch the video.
I would be misleading you if I said that their posts are for beginners, but it is never too early to expose yourself to DAX concepts. You will be amazed by how much you retain. And, over time, it all begins to “click.”
Chris’ blog focuses on the Power Query Editor. I love it because he often delves into complex topics that I don’t even know how to find through a search. Often his topics are above my head, but that gives me the motivation to dig deeper and learn more.
Are the Power Query Editor and the M language “black boxes” to you? I just took a class with Ben (see “Paid Training” in the following) because I had lots of questions about both. The class was very enlightening. But you can benefit from Ben’s knowledge at no cost by reading his blog. He has a Power Query Editor Primer series that is a great place to start. He also regularly blogs about much more complex topics. Even if some of them are too complex for me, they inspire me to dig deeper.
Paul Turley wrote one of the best blog series I have ever read: Doing Power BI the Right Way. If you don’t explore any of the other blogs I mention, be sure to read this series. I recently met up with Paul, and, at the time of this writing, he is actively thinking about new topics for his blog page—I can’t wait for his new content to drop.
Sometimes it seems to me that no one uses books for learning anymore, but I love them. (Hopefully this book will be the exception to this trend!) There are several books that I rely on heavily.
This book is a “must” for anyone who wants to be good at DAX. I don’t know a Power BI rockstar who doesn’t own this book.
This is the book I used to “ramp up” my knowledge of the Power Query Editor. It covers the basics and then moves to crucial concepts, such as future-proofing queries. You may not be ready to put everything in this book into practice, but being aware of the concepts he explores is critical.
This is a great companion text to Gil Raviv’s book. It focuses on mechanics and techniques and is an invaluable reference.
If you are just starting to learn DAX, Matt Allington’s book is a perfect place to start. He doesn’t assume any prior knowledge and includes lots of exercises.
Eventually, you will want to know more about data modeling generally and star schemas specifically. This is a great book to increase your understanding of what a star schema is and how to build a good one.
Once you are beyond the basics, there are several paid training courses that I can highly recommend.
This is a three-day in-person class that is excellent. Be warned—it isn’t for beginners.
This is a 12-hour class split over three days. Take this class when you want to understand why and how the Power Query Editor does what it does.
Alberto Ferrari and Marco Russo have produced numerous video courses on a variety of subjects. The amount of content they pack into each course is incredible. You have several years to make your way through the content (and you will need that time!), and you can re-up your access when the initial period expires.
Stuttgart
Romania
London
Azerbaijan/Baku
But this is by no means an exhaustive list. You should explore and find your favorites.
At the time I am writing this, conferences are back. There are a lot of different options, but they vary from year to year. Twitter is a great place to hear about them. The biggest one, “Ignite,” hosted by Microsoft, has not yet returned in person but is available online. Most conferences have a virtual option now.
I love going to conferences because I get to meet Power BI folks I only talked to virtually. I also love sessions that are a stretch for me. I always make sure that I attend at least two to three of these sessions.
Don’t be afraid to introduce yourself to the speaker. Most of us love to meet new people, and we welcome your questions. We want to see others succeed—we wouldn’t be spending our time and money speaking if we didn’t!
Far and away, Twitter is the platform preferred by Power BI folks. Do you have a question? Do you have an insight or an opinion? Twitter is the place to post. Some good hashtags to follow are #PowerBI, #PowerQueryEditor, and #PowerBIHelp.
There is an active Power BI subreddit where you can post questions and get help. There are some heavy hitters in that group.
It is a good idea to expand your network with Power BI professionals on LinkedIn. You can start by connecting with me at www.linkedin.com/in/jessicabjolly/. The Power BI community is welcoming and generous. As you grow your career, you will find plenty of people who want to help you succeed.
Once you have gained a modicum of skills in Power BI, you will want to explore some of the external tools that will make your job easier. I have listed a few of these tools in the following, but you can find a complete list at www.sqlbi.com/tools/. Once you have downloaded the program, the next time you open the Power BI Desktop, a new tab will appear called External Tools.
DAX Studio enables you to write and test your DAX. You can connect it directly to your model and then evaluate different permutations of particular DAX expressions. In several places in this book, I mentioned that there are always ways to improve your DAX performance. DAX Studio helps you do this because you can measure how long a DAX expression takes to evaluate and analyze which components are taking the longest. Once you have fine-tuned the expression, you can then copy it back into the Power BI Desktop. DAX Studio is free, which is incredible given the value it adds to the development experience. And there is a bonus: Marco and Alberto have posted a free video series explaining how to use DAX Studio on their YouTube channel and on their website.
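To give you a flavor of testing in DAX Studio: queries there start with EVALUATE and return a table. Here is a minimal sketch, assuming a Product table with a Category column and an existing Total Sales measure (both are my assumptions):

// Returns total sales per category; running this with Server Timings
// enabled shows how long each part of the query takes.
EVALUATE
SUMMARIZECOLUMNS (
    'Product'[Category],
    "Total Sales", [Total Sales]
)
ORDER BY [Total Sales] DESC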
Bravo is a new tool (released in March 2022) that SQLBI has introduced. It can analyze your model, format your DAX, and create a date table for you. I have not had a chance to play with it yet, but any tool created by SQLBI is well worth exploring.
Building a data model is a nontrivial exercise. While you can do this in the Power BI Desktop, most developers are now using Tabular Editor, a tool developed by Daniel Otykier. There are two versions of Tabular Editor: the community version, which is free, and the paid version, which is approximately 350 USD/year. Please be sure to check the website (https://tabulareditor.com/) for exact pricing.
Both versions allow you to perform modeling steps more conveniently than in the Power BI Desktop. Additionally, Tabular Editor supports some features that the Power BI Desktop does not. With a free community version available, you can afford to check it out!
The SQLBI website maintains a comprehensive list of all the external tools available for Power BI developers (www.sqlbi.com/tools/). Be sure to check out the list as there are many other cool tools to investigate.
I have had a lot of fun writing this book. It has cemented some of my knowledge of topics that were a little “fuzzy,” and it has caused me to think deeply about what someone needs to know to get started in Power BI. I wanted to give you an in-depth guide to the topics you will need to study for the PL-300, but I also wanted to get you started on the road to becoming a Power BI “ninja.” I hope that I have accomplished both goals. Most importantly, I hope that this book helps you learn to love Power BI as much as I do!