PySpark Tutorial : How to Use createGlobalTempView in PySpark | Share Views Across Sessions

How to Use createGlobalTempView() in PySpark | Step-by-Step Guide

Author: Aamir Shahzad

Published: March 2025

📘 Introduction

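The createGlobalTempView() method in PySpark registers a DataFrame as a global temporary view that any Spark session in the same Spark application can query. Unlike createTempView(), whose view is visible only to the session that created it, a global temporary view lives until the Spark application terminates. It is registered in the system-reserved global_temp database, so queries must reference it as global_temp.<view_name>.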

🧾 Sample Dataset

Name            Department     Salary
Aamir Shahzad   Engineering     5000
Ali             Sales           4000
Raza            Marketing       3500
Bob             Sales           4200
Lisa            Engineering     6000

🔧 Create DataFrame in PySpark

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("createGlobalTempViewExample").getOrCreate()

data = [
    ("Aamir Shahzad", "Engineering", 5000),
    ("Ali", "Sales", 4000),
    ("Raza", "Marketing", 3500),
    ("Bob", "Sales", 4200),
    ("Lisa", "Engineering", 6000)
]

columns = ["Name", "Department", "Salary"]
df = spark.createDataFrame(data, columns)
df.show()

✅ Expected Output

+-------------+-----------+------+
|         Name| Department|Salary|
+-------------+-----------+------+
|Aamir Shahzad|Engineering|  5000|
|          Ali|      Sales|  4000|
|         Raza|  Marketing|  3500|
|          Bob|      Sales|  4200|
|         Lisa|Engineering|  6000|
+-------------+-----------+------+

📌 Create a Global Temporary View

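# createOrReplaceGlobalTempView() overwrites any existing view with the same name,
# so this step is safe to re-run. createGlobalTempView() raises an error
# if a view with that name already exists.
df.createOrReplaceGlobalTempView("employee_global_view")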

📊 Query the Global Temp View in Current Session

result1 = spark.sql("SELECT * FROM global_temp.employee_global_view")
result1.show()

✅ Expected Output

+-------------+-----------+------+
|         Name| Department|Salary|
+-------------+-----------+------+
|Aamir Shahzad|Engineering|  5000|
|          Ali|      Sales|  4000|
|         Raza|  Marketing|  3500|
|          Bob|      Sales|  4200|
|         Lisa|Engineering|  6000|
+-------------+-----------+------+

🔁 Access the Global View from a New Session

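# SparkSession.builder.getOrCreate() would return the existing session,
# so use newSession() to get a separate session in the same Spark application.
new_session = spark.newSession()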

result2 = new_session.sql("SELECT * FROM global_temp.employee_global_view")
result2.show()

✅ Expected Output

+-------------+-----------+------+
|         Name| Department|Salary|
+-------------+-----------+------+
|Aamir Shahzad|Engineering|  5000|
|          Ali|      Sales|  4000|
|         Raza|  Marketing|  3500|
|          Bob|      Sales|  4200|
|         Lisa|Engineering|  6000|
+-------------+-----------+------+

📈 Aggregate Example: Average Salary by Department

result3 = spark.sql("""
  SELECT Department, AVG(Salary) AS Avg_Salary
  FROM global_temp.employee_global_view
  GROUP BY Department
""")
result3.show()

✅ Expected Output (row order may vary)

+-----------+----------+
| Department|Avg_Salary|
+-----------+----------+
|  Marketing|    3500.0|
|Engineering|    5500.0|
|      Sales|    4100.0|
+-----------+----------+

💡 Key Points

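  • createGlobalTempView() shares a view across all sessions of the same Spark application.
  • The view must be qualified with the global_temp database, for example global_temp.employee_global_view.
  • Global temp views last until the Spark application terminates or the view is dropped explicitly.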

© 2025 Aamir Shahzad. All rights reserved.