Revolutionize your Development with Microsoft Fabric

The Tech Platform
Jun 12, 2023
Updated: Jun 13, 2023

Data is everywhere. It is generated by the devices we use, the applications we build, and the interactions we have. Data can help us improve communication, collaboration, and productivity in our organizations. Data can also help us create and use AI experiences that can transform how we work and live.

But to make the most of data, we need a platform that can help us collect, process, analyze, and visualize data in an easy and efficient way. We need a platform that can handle different types of data, from structured to unstructured, from batch to real-time. We need a platform that can integrate various tools and services, from data movement to data science, from real-time analytics to business intelligence. We need a platform that can scale with our needs, from small projects to enterprise-wide solutions.

That platform is Microsoft Fabric.

Table of Contents:

What is Microsoft Fabric?
Why Was Microsoft Fabric Created?
How Is Fabric Built on a SaaS Foundation?
Microsoft Fabric Components
Pricing Options Available in Microsoft Fabric
Conclusion

What is Microsoft Fabric?

Microsoft Fabric is a complete analytics platform that combines all the necessary tools for data analysis in one place. It's a software-as-a-service solution that makes it easy to manage and analyze data, from storing and processing it to performing advanced analytics and creating insightful reports.

With Fabric, you don't have to use different services from different companies. Instead, you get a single, integrated platform that simplifies your analytics tasks and provides a seamless user experience. It's designed to make data analysis easier and more accessible for everyone.

Why Was Microsoft Fabric Created?

Microsoft Fabric was created to address the challenges and opportunities of the era of AI. AI is transforming how we work and live, enabling us to create and use experiences powered by generative AI and language model services, such as Azure OpenAI Service.

But to power these AI experiences, we need a lot of clean data from a well-organized and well-connected data system. However, most organizations’ analytics systems are very complicated and have disconnected services. This is because there are many different data and AI tools and services from different companies.

As a result, organizations have had to stitch these tools and services together themselves, spending considerable money and time to make them work with one another.

Microsoft Fabric fixes this problem by bringing together new and existing tools from Power BI, Azure Synapse, and Azure Data Explorer into one product. These tools and services are presented as distinct experiences so that organizations can tailor them to their tasks.

With Fabric, customers can use one product with one experience and design that has everything they need for a complete data and AI solution. And because the product is delivered as software as a service (SaaS), everything is already connected and working together, so users can sign up quickly and see benefits right away.

How Is Fabric Built on a SaaS Foundation?

Microsoft Fabric brings together different components from Power BI, Azure Synapse, and Azure Data Explorer into a unified environment. It includes Data Engineering, Data Factory, Data Science, Data Warehouse, Real-Time Analytics, and Power BI.

The integration offers several benefits:

Extensive analytics capabilities: Fabric provides a wide range of integrated analytics tools.
Familiar user experiences: Users can leverage shared experiences that are easy to learn and navigate.
Asset accessibility and reusability: Developers can easily access and reuse assets across different components.
Unified data lake: Fabric allows data to be retained in its original location while utilizing preferred analytics tools.
Centralized administration and governance: Administrators can manage and govern the entire Fabric environment from a centralized platform.

With Microsoft Fabric, all the data and services are seamlessly integrated, allowing creators to focus on their work without worrying about infrastructure management. It simplifies the process and enhances productivity by providing a unified and governed environment.

Microsoft Fabric Components

Each component of Microsoft Fabric is described in detail below:

1. Data Factory

Data Factory, within Microsoft Fabric, offers a modern data integration experience that allows you to ingest, prepare, and transform data from a wide range of sources, including databases, data warehouses, lakehouses, real-time data, and more.

One of the key features of Data Factory in Microsoft Fabric is the Fast Copy capability, which enables high-speed data movement for both dataflows and data pipelines. With Fast Copy, you can quickly move data between your preferred data stores, facilitating fast and efficient data transfer. This feature is particularly beneficial for bringing data into your lakehouse and data warehouse in Microsoft Fabric, enabling seamless analytics.

Data Factory implements two primary features: dataflows and pipelines.

Dataflows allow you to utilize over 300 transformations available in the dataflows designer, making data transformation easier and more flexible than ever before. The dataflows designer includes intelligent, AI-based data transformations, providing advanced capabilities for data manipulation.
Data pipelines provide rich data orchestration capabilities out-of-the-box, allowing you to compose flexible data workflows that meet the specific needs of your enterprise. Data pipelines enable you to efficiently orchestrate and manage the flow of data throughout your data integration processes.

2. Data Science

In Microsoft Fabric, Data Science allows users to run complete end-to-end data science workflows. These workflows are designed to enrich data and provide valuable business insights. With Microsoft Fabric, you have access to a wide range of activities that cover the entire data science process.

Data Science in Microsoft Fabric offers the following features:

Data Wrangler: Data Wrangler is a user-friendly tool in the form of a notebook that allows you to explore and analyze data. It provides a grid-like display of data with dynamic summary statistics and easy-to-use data cleansing operations. With just a few clicks, you can perform common data-cleaning tasks and generate reusable code scripts that can be saved in the notebook.
Experiment: A machine learning experiment is the main way to organize and control all related machine learning runs.
Model: A machine learning model is a file that has been trained to recognize patterns. It is produced by running an algorithm over a data set, and it can then make predictions on new data.
Run: In MLflow, a run represents a single execution of model code. It is tracked and organized within experiments.

The data science process, illustrated in the accompanying diagram, consists of the following steps:

1. Problem formulation/ideation: Data Science users, business users, and analysts all work on the same platform. This seamless integration enables easy data sharing and collaboration across different roles. Analysts can share Power BI reports and datasets with data science practitioners, making problem formulation and hand-offs much smoother.

2. Data discovery and pre-processing: Users can interact with data in OneLake using the Lakehouse item. This makes it easy to read data from OneLake into pandas DataFrames for exploration.

3. Experiment and model: With tools like PySpark/Python, sparklyr/R, and notebooks, users can handle machine learning model training. Microsoft Fabric supports a wide range of ML algorithms and libraries, allowing users to leverage popular libraries like scikit-learn.

4. Enrich and operationalize: Microsoft Fabric's Notebook can perform batch scoring using open-source libraries or the scalable Spark Predict function. ML models can be registered in the Microsoft Fabric model registry, enabling easy operationalization. Users can enrich their data and make predictions using these models.

5. Insight and reporting: Predictions can be written to OneLake and seamlessly integrated into Power BI reports using the Power BI Direct Lake mode. Scheduled notebooks ensure up-to-date predictions without manual data loading or refreshing.

How to Access Data Science Experiences?

To access these Data Science experiences, Microsoft Fabric provides a dedicated Data Science Home page. From this page, users can discover and access various resources that are relevant to their data science tasks. For example, you can create machine learning experiments, develop models, and work with notebooks. Additionally, you have the option to import existing notebooks directly from the Data Science Home page.

3. Real-Time Analytics

Real-Time Analytics in Microsoft Fabric allows organizations to effectively scale and streamline their analytics solutions while making data accessible to both citizen data scientists and advanced data engineers. Real-time analytics has become crucial in enterprise scenarios such as cybersecurity, asset tracking, and predictive maintenance.

Real-Time Analytics simplifies data integration and reduces complexity. It provides quick access to data insights by automatically streaming and indexing data from any source or format. It also allows on-demand query generation and visualizations, making it easier for users to analyze data. With Real-Time Analytics, you can focus on your analytics solutions and seamlessly scale up as your data and query requirements increase.

Microsoft Fabric offers three features in Real-Time Analytics:

Eventstream: It allows you to capture, transform, and send real-time events to different destinations without needing to write code. This feature simplifies the process of handling real-time events and makes it easy to route them where they need to go.
KQL Database: It provides a storage and management system for your data. When you load data into a KQL database, you can access it in OneLake (a data lake in Microsoft Fabric) and use it in other Fabric experiences. It offers a convenient way to store and organize your data.
KQL Queryset: This feature enables you to run queries on the data stored in a KQL Database, and to view and customize the results. A KQL Queryset lets you save queries for future use, export and share them with others, and generate Power BI reports based on the query results.

4. Data Engineering

Microsoft Fabric's Data Engineering offers a robust Spark platform that simplifies large-scale data transformation tasks and democratizes data through the lakehouse. With Microsoft Fabric Spark's integration with Data Factory, data engineers can schedule and orchestrate notebooks and Spark jobs seamlessly.

In simple terms, Microsoft Fabric provides an excellent environment for data engineers to work with Spark. They can easily perform complex data transformations on a large scale and make data accessible to a wider audience through the lakehouse. The integration with Data Factory enables convenient scheduling and orchestration of notebooks and Spark jobs, ensuring efficient data processing and analysis.

Microsoft Fabric offers a range of data engineering features that facilitate seamless data accessibility, organization, and quality. From the data engineering homepage, you can:

Create a lakehouse and manage your data: Microsoft Fabric provides a lakehouse environment where you can efficiently store and manage your data. This enables easy access, storage, and organization of your data assets.
Design data pipelines: With Microsoft Fabric, you can design and configure pipelines to efficiently copy data into your lakehouse. These pipelines streamline the process of ingesting data from various sources, ensuring its availability for analysis and processing.
Apache Spark job definition: Microsoft Fabric integrates with Apache Spark and allows you to define and submit batch or streaming jobs to a Spark cluster. This enables you to perform scalable data processing and analytics on your data within the lakehouse environment.
Notebook: Microsoft Fabric supports the use of notebooks, such as Jupyter Notebooks, where you can write code for data ingestion, preparation, and transformation. Notebooks provide an interactive and collaborative environment for executing code and performing data engineering tasks.

5. Data Warehouse

Data Warehouse offers excellent performance and scalability for SQL-based operations. It achieves this by separating the computing power from the data storage, allowing each component to scale independently. This ensures that the system can handle large amounts of data and process queries efficiently.

Furthermore, the Data Warehouse experience uses the Delta Lake format, which is an open and widely adopted data storage format. Storing data in this format provides advantages like data versioning, transactional capabilities, and improved data integrity.

Depending on different workloads, Microsoft Fabric provides two types of warehouses:

SQL Endpoint of the Lakehouse
Synapse Data Warehouse

Type 1: SQL Endpoint of the Lakehouse

A SQL Endpoint is created automatically from a Lakehouse. It provides a way to access the data stored in the Lakehouse using SQL commands. The SQL Endpoint is read-only, meaning you can't modify the data directly through it. Any changes to the data need to be made using the "Lake" view of the Lakehouse with Spark.

With the SQL Endpoint, users can use a subset of SQL commands to define and query data objects within the Lakehouse. However, they cannot manipulate the data directly.

Some of the actions you can perform with the SQL Endpoint include:

Querying tables that reference data stored in the Delta Lake folders within the Lakehouse.
Creating views, inline table-valued functions (TVFs), and procedures to encapsulate semantics and business logic using T-SQL.
Managing permissions on these objects to control access and security.

In a Microsoft Fabric workspace, a SQL Endpoint is labeled as "SQL Endpoint" under the Type column. Each Lakehouse has an automatically generated SQL Endpoint that can be accessed using familiar SQL tools such as SQL Server Management Studio, Azure Data Studio, or the Microsoft Fabric SQL Query Editor. These tools provide a familiar interface for interacting with the data through the SQL Endpoint.

Type 2: Synapse Data Warehouse

In a Microsoft Fabric workspace, a Synapse Data Warehouse, also known as a Warehouse, is labeled as 'Warehouse' under the Type column. A Warehouse provides support for transactions, Data Definition Language (DDL), and Data Manipulation Language (DML) queries.

A Warehouse offers more capabilities compared to a SQL Endpoint. While a SQL Endpoint only allows read-only queries and the creation of views and table-valued functions (TVFs), a Warehouse enables full transactional support for both DDL and DML operations. It is created by the customer and can be populated using various data ingestion methods such as COPY INTO, Pipelines, Dataflows, or cross-database ingestion options like CREATE TABLE AS SELECT (CTAS), INSERT..SELECT, or SELECT INTO.

Difference: A SQL Endpoint vs Warehouse

The diagram below illustrates the practical difference between a SQL Endpoint and a Warehouse.

Microsoft Fabric Data Warehouse: SQL Endpoint vs Warehouse

The following table summarizes the differences between the SQL Endpoint of the Lakehouse and the Synapse Data Warehouse:

Factors	SQL Endpoint of the Lakehouse	Synapse Data Warehouse
Creation	Automatically generated from a Lakehouse	Manually created by the customer
Data Source	Delta tables created through Spark in Lakehouse	Customer-controlled tables, loading, and transforming data
Discoverability	Automatically discoverable as tables	Tables and data created by the customer
Read/Write Access	Read-only access	Full transactional T-SQL capabilities for reading and writing
Relational Layer	Builds a relational layer on top of Lakehouse data	Offers a 'traditional' data warehousing experience
Connection and Query Language	SQL connection string and T-SQL	T-SQL commands and Microsoft Fabric portal
Usage	Designed for BI needs and data serving	Suitable for enterprise data warehousing and querying
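
To make the read-only versus read-write distinction concrete, the sketch below uses Python's built-in sqlite3 module as a local stand-in. It is an analogy only and does not connect to Microsoft Fabric: an ordinary SQLite connection plays the role of a Warehouse (DDL and DML succeed), and a connection opened with mode=ro plays the role of a SQL Endpoint (queries succeed, writes are rejected). The table, the values, and the function name are hypothetical. Note one difference from the real thing: a SQL Endpoint still lets you create views and functions, whereas a read-only SQLite connection refuses those too.

```python
import sqlite3
import tempfile
from pathlib import Path


def demo_read_only_vs_read_write(db_path: Path) -> dict:
    """Contrast a read-write connection with a read-only one (SQLite analogy)."""
    # Read-write connection: stand-in for a Warehouse (DDL and DML allowed).
    writer = sqlite3.connect(db_path)
    writer.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    writer.execute("INSERT INTO sales VALUES ('EMEA', 120.0), ('APAC', 80.0)")
    writer.commit()
    writer.close()

    # Read-only connection: stand-in for a SQL Endpoint (queries only).
    reader = sqlite3.connect(f"{db_path.as_uri()}?mode=ro", uri=True)
    try:
        total = reader.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
        try:
            reader.execute("INSERT INTO sales VALUES ('LATAM', 50.0)")
            write_rejected = False
        except sqlite3.OperationalError:
            # SQLite refuses to write through a read-only connection.
            write_rejected = True
    finally:
        reader.close()

    return {"total": total, "write_rejected": write_rejected}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demo_read_only_vs_read_write(Path(tmp) / "demo.db"))
```

In Fabric itself, the equivalent of the writer for Lakehouse data would be Spark, while a Warehouse accepts T-SQL writes directly.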

6. Power BI

Power BI is a Business Intelligence platform that enables business owners to easily and efficiently access data in Microsoft Fabric. With Power BI, users can retrieve and analyze data from Fabric in a quick and intuitive manner. This empowers business owners to make informed decisions based on accurate and up-to-date data, ultimately driving better business outcomes.

Pricing Options Available in Microsoft Fabric

Microsoft Fabric offers two pricing options:

Azure SKUs
Microsoft 365 SKUs

1. Azure SKUs: Azure SKUs are the recommended option for Microsoft Fabric. They have flexible billing based on your usage, allowing you to scale your capacity as needed.

You pay per second and can pause or resume your usage as required. The pricing for Azure SKUs starts at $0.18 per compute unit (CU) per hour.

For example, an F2 SKU (2 CUs) running continuously would cost around $262.80 per month, which is 2 CUs × $0.18 × 730 hours. You can purchase Azure SKUs from the Azure portal.

2. Microsoft 365 SKUs: Microsoft 365 SKUs are Power BI SKUs that also support Microsoft Fabric. They are billed on a monthly or yearly basis, with a monthly commitment.

The pricing for Microsoft 365 SKUs depends on the number of users and the type of license you choose. To purchase a Microsoft 365 SKU, you can visit the Microsoft 365 admin center.
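
The Azure SKU figures quoted earlier can be reproduced with a few lines of arithmetic. The sketch below is a minimal illustration, not an official pricing calculator. It assumes that an F2 SKU provides 2 capacity units and that a month averages 730 hours (8,760 hours per year divided by 12), which is consistent with the F2 monthly figure given in this article. The function names are hypothetical, and actual prices vary by region and over time.

```python
# Assumptions (not official pricing): the hourly rate comes from this article;
# the 730-hour month and "F2 = 2 CUs" are inferred from its F2 example.
PRICE_PER_CU_HOUR = 0.18   # USD per capacity unit (CU) per hour
HOURS_PER_MONTH = 730      # average month: 8,760 hours / 12


def monthly_cost(capacity_units: int, hours: float = HOURS_PER_MONTH) -> float:
    """Cost in USD of running a capacity for the given number of hours."""
    return round(capacity_units * hours * PRICE_PER_CU_HOUR, 2)


def cost_for_seconds(capacity_units: int, seconds: int) -> float:
    """Per-second billing: charge only for the seconds the capacity ran."""
    return round(capacity_units * PRICE_PER_CU_HOUR * seconds / 3600, 4)


if __name__ == "__main__":
    print(monthly_cost(2))             # F2 running for the whole month
    print(monthly_cost(2, hours=176))  # F2 paused outside 8 h x 22 working days
    print(cost_for_seconds(2, 1800))   # F2 running for 30 minutes
```

Under these assumptions, pausing an F2 capacity outside working hours (176 hours instead of 730) would reduce the monthly bill from about $262.80 to roughly $63.36, which is the practical benefit of the pause and resume option.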

Conclusion

In conclusion, Microsoft Fabric is a comprehensive and integrated platform that brings together various components from Power BI, Azure Synapse, Azure Data Explorer, and more. It provides a unified environment for data engineering, data science, data warehousing, real-time analytics, and business intelligence.

Fabric offers seamless integration and shared experiences, making it easier for users to access and leverage data across different roles and tools. It enables efficient collaboration and data sharing between data engineers, data scientists, business analysts, and other stakeholders.