3 Things: Data Analytics Highlights from May 2024
7th June 2024 . By Michael A
This month's '3 Things' blog post includes topics ranging from the introduction of 'Task Flows' in Microsoft Fabric to the welcome addition of Liquid Clustering in Delta Lake.
Read on as we highlight three things for each of the four technology areas that you should be aware of from last month.
Power BI
-
Copilot in Microsoft Fabric is now generally available in the Power BI experience. You can use it to summarise a report, create a new report, explain what DAX measures are actually doing, and much more. If you were waiting for the general availability milestone before rolling this capability out within your business, the wait is over. Remember though, you'll need an F64 Fabric capacity or higher to use Copilot features. Learn more.
-
The Power Automate Visual is generally available. With it, you can trigger workflows directly from your Power BI reports, providing a seamless and intuitive user experience. This type of integration can shorten your path from insight to action: significant data points can trigger a sequence of automated events, including workflow approval stages. Learn more.
-
The DAX query view feature has reached the general availability milestone. It was added to Power BI late last year as a preview. Before its arrival, Power BI professionals had to use third-party tools like DAX Studio and Tabular Editor 3 - both of which are excellent tools - to be productive when authoring DAX queries. Now that a native DAX query authoring experience is built into Power BI, using those tools is optional, and their absence is no longer a blocker where your organisation has a strict policy on the use of third-party tools or extensions. Learn more.
Microsoft Fabric
-
Microsoft Fabric was designed to alleviate the friction that commonly delays or prevents analytics and AI projects from getting off the ground. However, without some form of source control and automated deployment management, it is challenging to track, control, and carefully release changes. Fabric's continuous integration and continuous deployment (CI/CD) features simplify the management of these aspects. These features are still in preview, but recent updates added Git integration and deployment pipeline support for more Fabric items, introduced a deployment pipelines API, and rolled out some usability improvements. Learn more.
-
We know that, due to strict security requirements, many organisations held off on adopting Microsoft Fabric. The primary concern was often that the network security controls in Fabric were insufficient for highly sensitive and confidential data. A few months ago, Fabric introduced a set of preview features to address these concerns: Private Links, Trusted Workspace Access, and Managed Private Endpoints. All three of these features are now generally available so you can reassess the feasibility of using Fabric in your organisation. Learn more.
-
Task Flows is a new feature, introduced as a public preview, that enables you to create living, breathing data solution designs that map Fabric items to solution components in a visual form. You can use this feature to help with designing and implementing your Fabric data solutions, as well as to communicate the implementation details of an existing solution to others in your team. It's a simple, user-friendly tool that can go a long way. Learn more.
Azure Analytics and AI
-
Delta Lake Liquid Clustering is generally available in the Databricks Data Intelligence Platform. Unlike Z-Ordering and manual partitioning, Liquid Clustering evolves with your data to ensure that it's always stored and queried optimally. One of the biggest benefits is that you'll no longer need to finely tune the data layout of each table, freeing up your data engineering resources for other, higher-value tasks. Learn more.
-
Integrating one or more Azure ML and AI innovations into your Azure workloads can be challenging. That's why Microsoft released a framework to provide guidance on how you can transition generative AI experiments and prototypes to solutions that are suitable for production use. The framework is called the 'Azure OpenAI chat baseline architecture in an Azure landing zone' and it covers several aspects including a design for shared resources, governance strategy, and operational monitoring and intelligence. Learn more.
-
In March this year, support for Regular Expressions (Regex) was added to Azure SQL databases. This is an incredibly powerful capability that enables you to flexibly search, transform, and validate text beyond what was previously possible with T-SQL. In a recent episode of Data Exposed, the Azure Data team explores this new feature with demos of usage scenarios. Learn more.
Open-Source Analytics
-
Delta 3.2 was released, and it includes significant additions and improvements. One of these is the introduction of Liquid Clustering, which auto-optimises the data layout of your tables as they grow and evolve. There's now support for 'Type Widening', which means you can alter byte or short data type columns to have the next-largest data type. Support for broader data type widening options is coming in Spark 4.0. Another noteworthy improvement is preview support for the Apache Hudi open table format in UniForm tables, which means we are getting ever closer to table-format-agnostic open lakehouses. Learn more.
-
DuckDB added support for Hugging Face, which means the more than 150,000 datasets hosted there can be queried directly from DuckDB SQL. There's an article with lots of examples to show you how it works, including querying multiple files at a time. Imagine the possibilities. Learn more.
-
What's the difference between Delta Lake and a Data Lake? The Delta Lake team have written a great article that compares the two in the context of key areas including ACID transactions, performance, file listing, metadata, data skipping, and more. They may be similarly named but they are fundamentally different. Learn more.
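To make the DuckDB and Hugging Face item above a little more concrete, here is a minimal Python sketch of how such a query might be put together. It assumes the `hf://datasets/<user>/<dataset>/<file>` path form used by the DuckDB integration. The repository name `example-user/example-dataset` and the helper functions `hf_dataset_path` and `build_query` are hypothetical and exist only for illustration.

```python
def hf_dataset_path(repo_id: str, file_glob: str) -> str:
    """Build an hf:// path for a file (or glob of files) in a Hugging Face dataset.

    repo_id is "<user>/<dataset>"; file_glob is a path inside that repository
    and may contain a wildcard such as "data/*.parquet" to cover several files.
    """
    return f"hf://datasets/{repo_id.strip('/')}/{file_glob.lstrip('/')}"


def build_query(repo_id: str, file_glob: str, limit: int = 5) -> str:
    """Return a DuckDB SQL statement that previews rows from the dataset files."""
    path = hf_dataset_path(repo_id, file_glob)
    return f"SELECT * FROM '{path}' LIMIT {int(limit)};"


if __name__ == "__main__":
    # Hypothetical repository and file layout, used only for illustration.
    sql = build_query("example-user/example-dataset", "data/*.parquet")
    print(sql)

    # To actually run the query you would need the duckdb package, network
    # access, and a real dataset name, for example:
    #   import duckdb
    #   duckdb.sql(sql).show()
```

The sketch only builds the SQL text, so it runs without DuckDB installed. The wildcard in the file path is what lets a single query read multiple files at a time.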
Did You Find This Useful?
Get notified when we post something new by following us on X and LinkedIn.