Load Data to Warehouse Table from ADLS Gen2 Using Dataflow Gen2

In this Microsoft Fabric tutorial, you'll learn how to create a Dataflow Gen2 to load data from Azure Data Lake Storage Gen2 (ADLS Gen2) directly into a Warehouse table. Dataflows offer a graphical interface for no-code or low-code ETL and are ideal for self-service data preparation and reuse across the Fabric ecosystem.

✅ Setting up Dataflow Gen2 in Microsoft Fabric

To begin, go to your Microsoft Fabric workspace and click “New > Dataflow Gen2”. Choose Blank Dataflow or start from a template. This opens the web-based Power Query editor, where you can connect to, transform, and publish data.

Dataflow Gen2 supports various sources and destinations and is optimized for working with Fabric Lakehouses, Warehouses, and cloud file sources.

✅ Connecting to ADLS Gen2 as a Source

To load data from ADLS Gen2:

  • Click “+ Add New Source” and select Azure Data Lake Storage Gen2.
  • Provide the account URL and navigate to the desired folder or file.
  • Select the format (CSV, Parquet, etc.), and preview the dataset.
Once the data is previewed, click “Transform Data” to begin preparing it for loading.
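
For reference, the query the editor generates behind the scenes looks roughly like the following M sketch. The storage account URL, container, and file name are placeholders, so substitute your own; authentication is handled by the connection you configured when adding the source.

    let
        // Connect to the ADLS Gen2 file system (placeholder account and container)
        Source = AzureStorage.DataLake("https://mystorageaccount.dfs.core.windows.net/raw"),
        // Select one file by folder path and name (hypothetical file)
        SalesFile = Source{[#"Folder Path" = "https://mystorageaccount.dfs.core.windows.net/raw/sales/", Name = "sales_2024.csv"]}[Content],
        // Parse the CSV bytes and promote the first row to column headers
        Csv = Csv.Document(SalesFile, [Delimiter = ",", Encoding = 65001, QuoteStyle = QuoteStyle.Csv]),
        Promoted = Table.PromoteHeaders(Csv, [PromoteAllScalars = true])
    in
        Promoted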

✅ Transforming and Mapping Data for Warehouse Tables

Inside the Power Query editor:

  • Rename columns to match your Warehouse schema.
  • Change data types (e.g., convert string to datetime, float, boolean, etc.).
  • Apply filters, remove nulls, or enrich with calculated columns.
After transformation, go to “Destination” and choose a Fabric Warehouse table. If the table does not exist, you can create it from your transformed schema.
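
As a sketch of what these steps compile to in M, assuming the source query above is named SalesRaw and using hypothetical column names (order_dt, amt, IsReturned):

    let
        Source = SalesRaw,
        // Rename columns to match the Warehouse schema
        Renamed = Table.RenameColumns(Source, {{"order_dt", "OrderDate"}, {"amt", "Amount"}}),
        // Cast text columns to the types the destination table expects
        Typed = Table.TransformColumnTypes(Renamed, {{"OrderDate", type datetime}, {"Amount", type number}, {"IsReturned", type logical}}),
        // Drop rows missing the key date, then add a calculated column
        NoNulls = Table.SelectRows(Typed, each [OrderDate] <> null),
        Enriched = Table.AddColumn(NoNulls, "OrderYear", each Date.Year([OrderDate]), Int64.Type)
    in
        Enriched

Setting explicit types matters here: the Warehouse destination maps M types to T-SQL column types, so untyped (“any”) columns may block the load or produce an unwanted schema.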

✅ Publishing and Running the Dataflow

Once the transformations and mappings are complete:

  1. Click “Publish” to save the dataflow to your workspace.
  2. Run the dataflow manually or schedule it to refresh at a specific interval.
  3. Monitor load progress and view row counts and error logs from the Dataflow Monitoring tab.
The loaded data is now available for querying via T-SQL or Power BI directly from your Fabric Warehouse.

✅ Tips for Efficient and Scalable Ingestion

  • Use incremental refresh: Configure filters on date/time columns to process only new or updated rows (see the sketch after this list).
  • Partition large files: Break input files into manageable chunks to optimize performance.
  • Reuse dataflows: Promote common logic to shared dataflows across multiple teams or projects.
  • Profile data: Use the “Column profile” tool in Power Query to identify anomalies before loading.
  • Test transformations: Always validate small samples before running a full ingest on large datasets.
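
As a minimal illustration of the incremental-refresh idea from the first tip, the filter step can be expressed in M as below. SalesTyped and ModifiedDate are hypothetical names; note that Dataflow Gen2 also offers a built-in incremental refresh setting on the query itself, which manages the watermark for you.

    let
        Source = SalesTyped,
        // Keep only rows changed within the last day; on a daily schedule this
        // approximates processing only new or updated rows
        Cutoff = Date.AddDays(DateTime.LocalNow(), -1),
        Incremental = Table.SelectRows(Source, each [ModifiedDate] >= Cutoff)
    in
        Incremental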

Blog post written with the help of ChatGPT.