DataFlow vs Copy Activity | How to Join Two Files & Create Output Using Data Flow in Azure Synapse
📘 Overview
Azure Synapse Analytics provides two powerful ways to move and transform data: Copy Activity and Data Flow Activity. While Copy Activity is ideal for simple file movement, Data Flows allow for complex transformations such as joining, filtering, and aggregating data before writing it to the destination.
🔄 Copy Activity vs Data Flow: Key Differences
Feature | Copy Activity | Data Flow Activity |
---|---|---|
Use Case | Data movement (copy files/tables) | Data transformation (joins, filters, derived columns) |
Visual transformation logic | No | Yes |
Performance Tuning | Limited | More control (partitioning, caching) |
Join Support | No | Yes |
🛠️ Use Case: Join Two Files & Write Output
We demonstrate how to use Data Flow Activity to join two CSV files stored in ADLS Gen2 and output the joined result to a new file.
✅ Step 1: Create Linked Services
- Create a linked service that points to your Azure Data Lake Storage Gen2 account
✅ Step 2: Create Source Datasets
- Dataset 1: customers.csv
- Dataset 2: orders.csv
✅ Step 3: Design the Data Flow
- In Synapse Studio, open the Develop hub and create a new data flow
- Add two source transformations for customers and orders
- Add a Join transformation and join on customer_id
- Use a Select transformation to choose columns
- Add a Sink transformation to write the output to ADLS Gen2 in CSV format
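To make the logic of these transformations concrete, here is a minimal Python sketch of the same source, join, select, and sink sequence. It is not how Synapse runs a Data Flow; it only mirrors the steps on in-memory data. The sample rows, the column names other than customer_id, and the choice of an inner join are assumptions made for illustration.

```python
import csv
import io

# Hypothetical sample data standing in for customers.csv and orders.csv.
CUSTOMERS_CSV = """customer_id,name
1,Alice
2,Bob
3,Carol
"""

ORDERS_CSV = """order_id,customer_id,amount
100,1,25.00
101,1,10.50
102,2,99.99
"""


def read_rows(text):
    """Source step: parse CSV text into a list of dicts keyed by header."""
    return list(csv.DictReader(io.StringIO(text)))


def inner_join(left, right, key):
    """Join step: pair each left row with every right row sharing the key."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for left_row in left:
        for right_row in index.get(left_row[key], []):
            joined.append({**left_row, **right_row})
    return joined


def select(rows, columns):
    """Select step: keep only the listed columns, in the listed order."""
    return [{column: row[column] for column in columns} for row in rows]


def write_csv(rows, columns):
    """Sink step: serialize rows to CSV text (a file in a real pipeline)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


COLUMNS = ["customer_id", "name", "order_id", "amount"]
customers = read_rows(CUSTOMERS_CSV)
orders = read_rows(ORDERS_CSV)
joined_rows = select(inner_join(customers, orders, "customer_id"), COLUMNS)
output_csv = write_csv(joined_rows, COLUMNS)
print(output_csv)
```

With an inner join, a customer who has no orders (Carol in this sample) does not appear in the output. If such rows should be kept, a left outer join is the option to pick in the Join transformation.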
✅ Step 4: Create and Trigger Pipeline
1. Go to Integrate tab → New pipeline
2. Drag in a Data Flow activity and select the one you created
3. Debug and publish
4. Trigger manually or schedule
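Behind the designer, the pipeline is stored as a JSON definition. The sketch below builds an approximation of that definition as a Python dictionary so the structure is easy to inspect. The pipeline, activity, and data flow names are hypothetical, and the compute settings are example values; compare against the code view of your own pipeline before relying on any field.

```python
import json

# Hypothetical name of the data flow created in Step 3.
DATA_FLOW_NAME = "JoinCustomersOrders"

pipeline = {
    "name": "pl_join_customers_orders",  # hypothetical pipeline name
    "properties": {
        "activities": [
            {
                "name": "RunJoinDataFlow",  # hypothetical activity name
                "type": "ExecuteDataFlow",
                "typeProperties": {
                    # Points the activity at the data flow to execute.
                    "dataflow": {
                        "referenceName": DATA_FLOW_NAME,
                        "type": "DataFlowReference",
                    },
                    # Example Spark compute sizing; adjust to your workload.
                    "compute": {"coreCount": 8, "computeType": "General"},
                },
            }
        ]
    },
}

pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```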
📂 Output Example
The resulting file contains joined customer and order information, and is saved as a new CSV file in your configured output path in ADLS.
📌 Best Practices
- Use Data Flow for transformation-heavy pipelines
- Parameterize paths and filters for reusability
- Use debugging to test logic before publishing
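As a rough illustration of the parameterization advice above, the following Python sketch groups paths and a filter threshold into one parameter object, so the same logic can be reused across folders and dates. In Synapse these would be pipeline or Data Flow parameters; the class, field names, and folder layout here are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class JoinRunParams:
    """Hypothetical parameters for one run of the join."""

    container: str
    input_folder: str
    output_folder: str
    run_date: date
    min_amount: float = 0.0  # filter threshold; 0.0 keeps every order

    def input_path(self, file_name: str) -> str:
        """Build the path of a source file from the parameters."""
        return f"{self.container}/{self.input_folder}/{file_name}"

    def output_path(self) -> str:
        """Build a dated output path so each run writes to its own folder."""
        return (
            f"{self.container}/{self.output_folder}/"
            f"{self.run_date:%Y/%m/%d}/joined.csv"
        )

    def keep(self, row: dict) -> bool:
        """Filter predicate: keep orders at or above the threshold."""
        return float(row["amount"]) >= self.min_amount


params = JoinRunParams("data", "raw", "curated", date(2024, 1, 15), min_amount=20.0)
print(params.input_path("customers.csv"))
print(params.output_path())
```

Changing the run date or the threshold then requires no change to the transformation logic itself, which is the point of parameterizing.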
🎯 When to Use Each
- Use Copy Activity for fast, simple data transfers
- Use Data Flow Activity when joining, filtering, or reshaping data
📚 Credit: Content created with the help of ChatGPT and Gemini.