3 Things: Analytics and AI Highlights from October 2024
8th November 2024 . By Michael A
This month's '3 Things' blog post includes topics ranging from the ability to edit Direct Lake models in Power BI Desktop to the availability of the Azure Multimodal AI and LLM Solution Accelerator.
Read on as we highlight three things for each of the four technology areas that you should be aware of from last month.
Power BI
-
If you were waiting for dynamic format strings for measures to become generally available before using them in your production reports, your wait is over. The feature was in preview for over a year, which means it has had extensive testing in the wild. Dynamic format strings enable you to conditionally control the formatting of your measures using DAX. For example, you might want a measure that returns monetary values prefixed with a different currency symbol depending on the country a user selects in your reports. Learn more.
-
It's now possible to edit Direct Lake models in Power BI Desktop. This feature also enables enterprise development as you can export the Direct Lake model to a Power BI Project (PBIP) file. That means you can get your Direct Lake models into Git for source control and manage the development life cycle of these models more effectively in your Fabric solutions. Learn more.
-
Four INFO.VIEW DAX functions have been added (INFO.VIEW.TABLES, INFO.VIEW.COLUMNS, INFO.VIEW.RELATIONSHIPS, and INFO.VIEW.MEASURES), which enable you to create self-documenting semantic models. Collectively, they provide information about a semantic model's tables, columns, relationships, and measures. Learn more.
Microsoft Fabric
-
The Native Execution Engine is a feature of the Spark Compute that powers the Data Engineering and Data Science experiences in Fabric. It enables your Spark notebooks and jobs to run significantly faster without you needing to make changes to your code. Best of all, Microsoft just announced that this feature comes at no additional cost. That makes the decision to enable it a no-brainer as it will simultaneously speed up your Spark workloads and save you money. Learn more.
-
It often comes as a surprise that the data stored in OneLake isn't accessible without at least one active Fabric capacity. This is by design because the transaction costs for reading from and writing to OneLake are billed through Fabric Capacity Units (CUs). Once you understand that, it makes a lot more sense. But did you know you can access data stored in OneLake from a different capacity to the one it's stored in, even when the capacity that stores the data is paused? The trick is to access the data through OneLake Shortcuts. The Fabric team published an article that explains how this works. Learn more.
-
It's almost impossible to read about lakehouse architecture without coming across the concept of Medallion Architectures. The Microsoft Fabric team recently published an article that presents some considerations and suggestions for how you can optimise your bronze (i.e. raw), silver (i.e. cleansed and enriched), and gold (i.e. curated) layers in Fabric when implementing a medallion architecture. Learn more.
Azure Analytics and AI
-
For many years, companies have been combining Databricks with Power BI to meet their data, analytics, and AI needs. It makes perfect sense to have a native Databricks-to-Power BI integration, and the good news is that this has just reached the general availability milestone. This means it's hardened and ready for use in your production workloads. With just a few clicks, you can create a Power BI semantic model from tables in your Databricks Unity Catalog and use the rich Power BI web browser experience to quickly build your reports. Learn more.
-
Databricks introduced a batch large language model (LLM) inference capability for Mosaic AI model serving. This feature aims to solve the problem of applying LLMs to large documents, which is often challenging due to the overhead of one-at-a-time inference. The ability to do LLM inference in bulk provides cost efficiencies in LLM inference scenarios such as data extraction, data transformation, and bulk content generation. Learn more.
-
The Azure AI Services team introduced the 'Azure Multimodal AI & LLM Solution Accelerator' which serves as an excellent point of reference for all your AI and LLM use cases. This includes content summarisation, data extraction, classification, and enrichment. It also covers scenarios including call centre analysis, document processing, insurance claim processing, and customer email processing. Read the detailed introduction on the Azure AI Services blog and follow the GitHub repo to explore the examples. Learn more.
Open-Source Analytics and AI
-
DuckDB has analytics-optimised support for multi-version concurrency. What does that actually mean? Well, it ensures that if more than one process is updating a table, they do not see each other's changes until the respective transactions are successfully committed. In a recent blog post, the DuckDB team do a great job of explaining how this works in DuckDB and what the benefits are for your workloads. Learn more.
-
Unity Catalog started off as a metastore (or metadata catalogue) that was exclusive to Databricks. Recently, Unity Catalog was open-sourced, making it the 'industry's first open-source catalogue for data and AI governance'. An article on the Unity Catalog blog explores how Apache Spark integrates with it through open APIs. You'll learn how this integration enables you to perform create, read, update, and delete (CRUD) operations on your tables registered in Unity Catalog through its open-source REST APIs. Learn more.
-
Daft, the unified engine for data engineering, analytics, and AI, recently gained a SQL interface called 'daft-sql'. As with Apache Spark, this means you can choose between querying and transforming your data with the DataFrame API, the SQL API, or any combination of the two. What's great to see is that functions in the DataFrame API have conveniently named equivalents in the SQL interface. Learn more.
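To make the multi-version concurrency idea from the DuckDB item above a little more concrete, here is a minimal toy sketch in plain Python. It is not DuckDB's implementation and does not use the DuckDB API. The class names `VersionedTable` and `Transaction` are hypothetical, and the sketch assumes a simple snapshot model in which each transaction reads from a private copy taken when it starts and buffers its writes until commit. It deliberately leaves out conflict detection between concurrent writers.

```python
class VersionedTable:
    """Toy key-value table whose committed state only changes on commit."""

    def __init__(self, rows):
        self.committed = dict(rows)

    def begin(self):
        # Every transaction starts from the currently committed state.
        return Transaction(self)


class Transaction:
    """Reads from a private snapshot and buffers writes until commit."""

    def __init__(self, table):
        self.table = table
        self.snapshot = dict(table.committed)  # copy taken at start
        self.pending = {}                      # uncommitted writes

    def read(self, key):
        # A transaction sees its own uncommitted writes first,
        # then falls back to the snapshot it started with.
        if key in self.pending:
            return self.pending[key]
        return self.snapshot.get(key)

    def write(self, key, value):
        # Buffered locally, so other transactions cannot see it yet.
        self.pending[key] = value

    def commit(self):
        # Publish the buffered writes to the shared committed state.
        self.table.committed.update(self.pending)


table = VersionedTable({"stock": 10})
writer = table.begin()
reader = table.begin()

writer.write("stock", 7)
before_commit = reader.read("stock")        # writer's change is not visible
writer.commit()
same_txn_after = reader.read("stock")       # reader keeps its own snapshot
fresh_after = table.begin().read("stock")   # a new transaction sees the commit

print(before_commit, same_txn_after, fresh_after)  # prints: 10 10 7
```

In this toy model the reader never observes the writer's uncommitted value, and the committed value only becomes visible to transactions that start after the commit. The DuckDB blog post covers how the real engine achieves this efficiently for analytical workloads.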