How to Use createTempView in PySpark | Run SQL Queries on DataFrames | PySpark Tutorial

How to Use createTempView() in PySpark | Step-by-Step Guide

How to Use createTempView() in PySpark | Step-by-Step Guide

Author: Aamir Shahzad

Published: March 2025

📘 Introduction

The createTempView() function registers a PySpark DataFrame as a temporary SQL view. This allows you to execute SQL queries using Spark SQL, combining the power of SQL with the flexibility of DataFrames.

🧾 Sample Dataset

Name           Department     Salary
Aamir Shahzad   Engineering     5000
Ali             Sales           4000
Raza            Marketing       3500
Bob             Sales           4200
Lisa            Engineering     6000

🔧 Create DataFrame in PySpark

from pyspark.sql import SparkSession

# Create Spark session
spark = SparkSession.builder.appName("createTempViewExample").getOrCreate()

# Sample data
data = [
    ("Aamir Shahzad", "Engineering", 5000),
    ("Ali", "Sales", 4000),
    ("Raza", "Marketing", 3500),
    ("Bob", "Sales", 4200),
    ("Lisa", "Engineering", 6000)
]

columns = ["Name", "Department", "Salary"]
df = spark.createDataFrame(data, columns)
df.show()

✅ Expected Output

+-------------+-----------+------+
|         Name| Department|Salary|
+-------------+-----------+------+
|Aamir Shahzad|Engineering|  5000|
|          Ali|      Sales|  4000|
|         Raza|  Marketing|  3500|
|          Bob|      Sales|  4200|
|         Lisa|Engineering|  6000|
+-------------+-----------+------+

📌 Register DataFrame as a Temp View

# Register as temporary SQL view
df.createOrReplaceTempView("employee_view")

📊 Run SQL Queries on the Temp View

Example 1: Select All Records

result1 = spark.sql("SELECT * FROM employee_view")
result1.show()

✅ Output

+-------------+-----------+------+
|         Name| Department|Salary|
+-------------+-----------+------+
|Aamir Shahzad|Engineering|  5000|
|          Ali|      Sales|  4000|
|         Raza|  Marketing|  3500|
|          Bob|      Sales|  4200|
|         Lisa|Engineering|  6000|
+-------------+-----------+------+

Example 2: Filter by Department

result2 = spark.sql("""
  SELECT Name, Salary
  FROM employee_view
  WHERE Department = 'Sales'
""")
result2.show()

✅ Output

+-----+------+
| Name|Salary|
+-----+------+
|  Ali|  4000|
|  Bob|  4200|
+-----+------+

Example 3: Average Salary by Department

result3 = spark.sql("""
  SELECT Department, AVG(Salary) AS Avg_Salary
  FROM employee_view
  GROUP BY Department
""")
result3.show()

✅ Output

+-----------+----------+
| Department|Avg_Salary|
+-----------+----------+
|  Marketing|    3500.0|
|Engineering|    5500.0|
|      Sales|    4100.0|
+-----------+----------+

💡 Key Points

  • createTempView() is used to register a DataFrame as a temporary view.
  • You can use spark.sql() to run standard SQL queries on that view.
  • The view is session-scoped and will be removed when the Spark session ends.

🎥 Watch the Video Tutorial

Watch on YouTube

© 2025 Aamir Shahzad. All rights reserved.

Visit TechBrothersIT for more tutorials.