How to Use dropna() Function in PySpark | Remove Null Values Easily | PySpark Tutorial

The dropna() function in PySpark removes rows that contain NULL (missing) values from a DataFrame. It is a common step in data preprocessing, ensuring your dataset is complete and ready for analysis.

Basic Syntax

DataFrame.dropna(how='any', thresh=None, subset=None)
  • how: 'any' (default) drops a row if any column is null; 'all' drops a row only if every column is null.
  • thresh: Minimum number of non-null values required to retain the row. When set, thresh overrides how.
  • subset: List of column names to consider when checking for nulls; other columns are ignored.

Sample Data

from pyspark.sql import SparkSession
from pyspark.sql import Row

# Initialize SparkSession
spark = SparkSession.builder.appName("DropNAExample").getOrCreate()

# Sample data
data = [
    Row(name='John', age=30, salary=None),
    Row(name='Alice', age=None, salary=5000),
    Row(name='Bob', age=40, salary=6000),
    Row(name=None, age=None, salary=None)
]

# Create DataFrame
df = spark.createDataFrame(data)

# Show original DataFrame
df.show()

Expected Output

+-----+----+------+
| name| age|salary|
+-----+----+------+
| John|  30|  null|
|Alice|null|  5000|
|  Bob|  40|  6000|
| null|null|  null|
+-----+----+------+

Example 1: Remove Rows with Any NULL Values

# Remove rows with any NULL values
df_clean_any = df.dropna()
df_clean_any.show()

Expected Output

+----+---+------+
|name|age|salary|
+----+---+------+
| Bob| 40|  6000|
+----+---+------+

Example 2: Remove Rows Where All Columns Are NULL

# Remove rows where all columns are NULL
df_clean_all = df.dropna(how='all')
df_clean_all.show()

Expected Output

+-----+----+------+
| name| age|salary|
+-----+----+------+
| John|  30|  null|
|Alice|null|  5000|
|  Bob|  40|  6000|
+-----+----+------+

Example 3: Remove Rows with NULLs in a Specific Subset of Columns

# Remove rows with NULLs in 'name' or 'salary'
df_clean_subset = df.dropna(subset=['name', 'salary'])
df_clean_subset.show()

Expected Output

+-----+----+------+
| name| age|salary|
+-----+----+------+
|Alice|null|  5000|
|  Bob|  40|  6000|
+-----+----+------+

Conclusion

The dropna() function in PySpark is a powerful tool for handling missing data efficiently. It allows you to clean your datasets by removing incomplete rows based on flexible criteria, ensuring the quality of your data pipelines.
