Indexing in Azure Synapse Dedicated SQL Pool Explained: Columnstore, Heap, Clustered & Rebuild Tips | Azure Synapse Analytics Tutorial

📘 Introduction

Indexes in Azure Synapse Dedicated SQL Pool strongly affect both query performance and storage footprint. Synapse supports several index types, and the right choice depends on the workload: batch analytics, point lookups, or staging data.

This blog covers:

  • Clustered Columnstore Index
  • Heap Tables
  • Clustered Index
  • Nonclustered Index
  • Index Rebuild Tips

🔹 1. Clustered Columnstore Index (Default & Best for Analytics)

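Best suited for large analytical workloads. A clustered columnstore index stores data by column, which gives high compression and fast scans and aggregations over many rows. It is the default when no index option is specified. It pays off most on large tables: Microsoft's guidance is roughly 60 million rows or more, because each of the 60 distributions needs about 1 million rows to fill a compressed rowgroup.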

CREATE TABLE dbo.SalesFact_Columnstore
(
    SaleID INT,
    CustomerID INT,
    Amount FLOAT,
    SaleDate DATE
)
WITH (
    CLUSTERED COLUMNSTORE INDEX,
    DISTRIBUTION = HASH(CustomerID)
);
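
To illustrate the default behaviour, the sketch below creates a table with no index option and then looks up its index type in the sys.indexes catalog view. The table name dbo.SalesFact_Default is hypothetical and used only for illustration; on a dedicated SQL pool the query is expected to report a clustered columnstore index, but verify this in your own environment.

```sql
-- Hypothetical table: no index option in the WITH clause,
-- so the pool falls back to its default index type.
CREATE TABLE dbo.SalesFact_Default
(
    SaleID INT,
    CustomerID INT,
    Amount FLOAT,
    SaleDate DATE
)
WITH (
    DISTRIBUTION = HASH(CustomerID)
);

-- Check which index type was actually created.
SELECT t.name AS table_name,
       i.name AS index_name,
       i.type_desc
FROM sys.tables AS t
JOIN sys.indexes AS i
    ON i.object_id = t.object_id
WHERE t.name = 'SalesFact_Default';
```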

🔹 2. Heap Table (No Index)

No index is applied; rows are stored unordered. Ideal for fast loads into staging or transient tables, where the data is transformed and moved on before it is queried heavily.

CREATE TABLE dbo.SalesFact_Heap
(
    SaleID INT,
    CustomerID INT,
    Amount FLOAT,
    SaleDate DATE
)
WITH (
    HEAP,
    DISTRIBUTION = ROUND_ROBIN
);
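
A common staging pattern, sketched here as an assumption rather than something the steps above prescribe, is to land data in the heap first and then copy it into a columnstore table with CREATE TABLE AS SELECT (CTAS). The target name dbo.SalesFact_FromStage is hypothetical.

```sql
-- Copy the staged rows from the heap into a new columnstore table.
-- dbo.SalesFact_FromStage is a hypothetical name for illustration.
CREATE TABLE dbo.SalesFact_FromStage
WITH (
    CLUSTERED COLUMNSTORE INDEX,
    DISTRIBUTION = HASH(CustomerID)
)
AS
SELECT SaleID, CustomerID, Amount, SaleDate
FROM dbo.SalesFact_Heap;
```

The heap keeps the initial load cheap, and the CTAS step builds the compressed columnstore in one pass.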

🔹 3. Clustered Index (Rowstore)

Stores rows sorted by the index key in rowstore format. Best for small dimension or lookup tables and for highly selective filters on the key, such as single-row lookups.

CREATE TABLE dbo.Customer_Detail
(
    CustomerID INT NOT NULL,
    FullName NVARCHAR(100),
    Region NVARCHAR(50)
)
WITH (
    CLUSTERED INDEX (CustomerID),
    DISTRIBUTION = HASH(CustomerID)
);
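
As a minimal sketch of the kind of query this layout favours, the lookup below filters on the clustered index key. The value 42 is an arbitrary example.

```sql
-- Single-row lookup on the clustered index key (CustomerID).
SELECT CustomerID, FullName, Region
FROM dbo.Customer_Detail
WHERE CustomerID = 42;  -- arbitrary example value
```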

🔹 4. Nonclustered Index

Creates a separate index on one or more columns to speed up lookups on columns other than the clustering key. Use selectively: each index takes extra storage and slows down loads.

CREATE INDEX idx_Region ON dbo.Customer_Detail(Region);
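
The same statement accepts more than one column. The sketch below uses a hypothetical index name, idx_Region_FullName, and an example filter value; whether the optimizer actually uses the index depends on the query and the data.

```sql
-- Hypothetical composite index covering two columns.
CREATE INDEX idx_Region_FullName
    ON dbo.Customer_Detail (Region, FullName);

-- A filter on the leading column (Region) is a candidate for this index.
SELECT CustomerID, FullName
FROM dbo.Customer_Detail
WHERE Region = N'West';  -- example value

-- Drop the index again if it does not pay for its overhead.
DROP INDEX idx_Region_FullName ON dbo.Customer_Detail;
```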

🔄 5. Rebuilding Indexes (Especially for Columnstore)

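Rebuilding recompresses rowgroups, removes deleted rows, and restores scan performance. Schedule it after heavy loads, updates, or deletes, which tend to leave rows in the delta store or in small, poorly compressed rowgroups.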

-- Rebuild all indexes on a table
ALTER INDEX ALL ON dbo.SalesFact_Columnstore REBUILD;

-- Rebuild a specific partition (if partitioned)
-- ALTER INDEX ALL ON dbo.SalesFact_Columnstore REBUILD PARTITION = 1;

-- Optional compression setting
-- ALTER INDEX ALL ON dbo.SalesFact_Columnstore 
--   REBUILD PARTITION = 1 WITH (DATA_COMPRESSION = COLUMNSTORE);
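
Before rebuilding, it can help to check whether the rowgroups actually need it. The query below is a sketch modeled on the rowgroup-density query pattern in Microsoft's documentation; the catalog views and join columns should be verified against your pool before relying on it.

```sql
-- Summarize rowgroups per state for one columnstore table.
SELECT s.name AS schema_name,
       t.name AS table_name,
       rg.state_description,
       COUNT(*) AS rowgroup_count,
       SUM(rg.total_rows) AS total_rows,
       SUM(ISNULL(rg.deleted_rows, 0)) AS deleted_rows
FROM sys.pdw_nodes_column_store_row_groups AS rg
JOIN sys.pdw_nodes_tables AS nt
    ON rg.object_id = nt.object_id
   AND rg.pdw_node_id = nt.pdw_node_id
   AND rg.distribution_id = nt.distribution_id
JOIN sys.pdw_table_mappings AS mp
    ON nt.name = mp.physical_name
JOIN sys.tables AS t
    ON mp.object_id = t.object_id
JOIN sys.schemas AS s
    ON t.schema_id = s.schema_id
WHERE s.name = 'dbo'
  AND t.name = 'SalesFact_Columnstore'
GROUP BY s.name, t.name, rg.state_description
ORDER BY rg.state_description;
```

Many OPEN rowgroups, a high deleted_rows count, or compressed rowgroups that average far fewer than about 1 million rows each are all hints that a rebuild may be worthwhile.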

✅ Best Practices

  • Use CLUSTERED COLUMNSTORE for large fact tables and analytics.
  • Use HEAP for fast staging loads or temporary tables.
  • Use CLUSTERED INDEX for small dimension or lookup tables.
  • Use NONCLUSTERED INDEX for tuning specific query filters.
  • Rebuild columnstore indexes after heavy loads, updates, or deletes to restore compression and scan performance.

⚠️ Note: Index rebuilds are offline operations, so schedule them during maintenance windows.

📚 Credit: Content created with the help of ChatGPT and Gemini.