How to Write DataFrame to Azure SQL Table Using PySpark | PySpark JDBC write() Function Tutorial

In this tutorial, you'll learn how to save a PySpark DataFrame to an Azure SQL Database using the `write()` function and the JDBC connector. We'll cover the necessary steps, including building the JDBC URL, supplying credentials, and writing the data with a chosen save mode.

1️⃣ Step 1: Create Spark Session

from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("WriteToAzureSQL").getOrCreate()
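Note that Spark needs the Microsoft SQL Server JDBC driver on its classpath, or the write in Step 4 will fail with `java.lang.ClassNotFoundException: com.microsoft.sqlserver.jdbc.SQLServerDriver`. One common way to supply it is via Maven coordinates at launch time. The script name and the driver version below are illustrative; check Maven Central for a release matching your JVM and Spark versions:

```shell
# Fetch the mssql-jdbc driver from Maven Central at launch.
# The version shown is an example, not a requirement.
spark-submit \
  --packages com.microsoft.sqlserver:mssql-jdbc:12.4.2.jre11 \
  write_to_azure_sql.py
```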

2️⃣ Step 2: Create Sample DataFrame

data = [
  (1, "Aamir Shahzad", "aamir@example.com", "USA"),
  (2, "Ali Raza", "ali@example.com", "Canada"),
  (3, "Bob", "bob@example.com", "UK"),
  (4, "Lisa", "lisa@example.com", "Germany")
]
columns = ["customer_id", "name", "email", "country"]

df = spark.createDataFrame(data, columns)
df.show()

3️⃣ Step 3: JDBC Configuration

jdbcHostname = "yourserver.database.windows.net"
jdbcPort = 1433
jdbcDatabase = "yourdatabase"
# Placeholder credentials: replace with your own, and avoid hardcoding real passwords.
jdbcUsername = "sqladmin"
jdbcPassword = "YourPassword123!"

jdbcUrl = f"jdbc:sqlserver://{jdbcHostname}:{jdbcPort};database={jdbcDatabase};encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30"

connectionProperties = {
  "user": jdbcUsername,
  "password": jdbcPassword,
  "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver"
}
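Hardcoding a password in a script or notebook is risky. A safer pattern is to read credentials from environment variables and assemble the URL with a small helper. The helper `build_jdbc_url` and the environment variable names below are illustrative, not part of any Spark or Azure API:

```python
import os

def build_jdbc_url(hostname: str, database: str, port: int = 1433) -> str:
    """Assemble an Azure SQL JDBC URL with encryption enforced."""
    return (
        f"jdbc:sqlserver://{hostname}:{port};"
        f"database={database};"
        "encrypt=true;trustServerCertificate=false;loginTimeout=30"
    )

# Read secrets from the environment instead of embedding them in code.
jdbcUsername = os.environ.get("AZURE_SQL_USER", "sqladmin")
jdbcPassword = os.environ.get("AZURE_SQL_PASSWORD", "")

jdbcUrl = build_jdbc_url("yourserver.database.windows.net", "yourdatabase")
```

The same `connectionProperties` dictionary from above can then be built from these variables instead of literals.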

4️⃣ Step 4: Write DataFrame to Azure SQL Table

# "append" adds rows to an existing table (creating it if missing);
# other save modes: "overwrite", "ignore", "error" (the default).
df.write \
  .mode("append") \
  .jdbc(url=jdbcUrl, table="customer", properties=connectionProperties)
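To confirm the write succeeded, you can read the table back over the same connection. This sketch reuses `jdbcUrl` and `connectionProperties` from the steps above and assumes a reachable database, so it is for illustration only:

```python
# Read the table back to verify the rows landed (requires a live connection).
df_check = spark.read.jdbc(url=jdbcUrl, table="customer",
                           properties=connectionProperties)
df_check.show()
print(f"Rows in 'customer': {df_check.count()}")
```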

print("✅ Data successfully written to Azure SQL Database 'customer' table.")
