PySpark Tutorial: How to Use createGlobalTempView() in PySpark | Share Views Across Sessions

How to Use createGlobalTempView() in PySpark | Step-by-Step Guide

Author: Aamir Shahzad

Published: March 2025

📘 Introduction

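The createGlobalTempView() method registers a DataFrame as a global temporary view that every Spark session in the same application can query. A view created with createTempView() is visible only to the session that created it; a global temporary view lives in the system database global_temp and remains available until the Spark application terminates. If a global view with the same name already exists, createGlobalTempView() raises an error, while createOrReplaceGlobalTempView() overwrites it.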

🧾 Sample Dataset

Name            Department     Salary
Aamir Shahzad   Engineering     5000
Ali             Sales           4000
Raza            Marketing       3500
Bob             Sales           4200
Lisa            Engineering     6000

🔧 Create DataFrame in PySpark

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("createGlobalTempViewExample").getOrCreate()

data = [
    ("Aamir Shahzad", "Engineering", 5000),
    ("Ali", "Sales", 4000),
    ("Raza", "Marketing", 3500),
    ("Bob", "Sales", 4200),
    ("Lisa", "Engineering", 6000)
]

columns = ["Name", "Department", "Salary"]
df = spark.createDataFrame(data, columns)
df.show()

✅ Expected Output

+-------------+-----------+------+
|         Name| Department|Salary|
+-------------+-----------+------+
|Aamir Shahzad|Engineering|  5000|
|          Ali|      Sales|  4000|
|         Raza|  Marketing|  3500|
|          Bob|      Sales|  4200|
|         Lisa|Engineering|  6000|
+-------------+-----------+------+

📌 Create a Global Temporary View

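# createGlobalTempView() raises an error if the view name is already registered;
# createOrReplaceGlobalTempView() overwrites it, so this step is safe to re-run.
df.createOrReplaceGlobalTempView("employee_global_view")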

📊 Query the Global Temp View in the Current Session

result1 = spark.sql("SELECT * FROM global_temp.employee_global_view")
result1.show()

✅ Expected Output

+-------------+-----------+------+
|         Name| Department|Salary|
+-------------+-----------+------+
|Aamir Shahzad|Engineering|  5000|
|          Ali|      Sales|  4000|
|         Raza|  Marketing|  3500|
|          Bob|      Sales|  4200|
|         Lisa|Engineering|  6000|
+-------------+-----------+------+

Access the Global View from a New Session

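# SparkSession.builder.getOrCreate() would return the session that is already running,
# so spark.newSession() is used to get a separate session on the same SparkContext.
new_session = spark.newSession()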

result2 = new_session.sql("SELECT * FROM global_temp.employee_global_view")
result2.show()

✅ Expected Output

+-------------+-----------+------+
|         Name| Department|Salary|
+-------------+-----------+------+
|Aamir Shahzad|Engineering|  5000|
|          Ali|      Sales|  4000|
|         Raza|  Marketing|  3500|
|          Bob|      Sales|  4200|
|         Lisa|Engineering|  6000|
+-------------+-----------+------+

📈 Aggregate Example: Average Salary by Department

result3 = spark.sql("""
  SELECT Department, AVG(Salary) AS Avg_Salary
  FROM global_temp.employee_global_view
  GROUP BY Department
""")
result3.show()

✅ Expected Output

+-----------+----------+
| Department|Avg_Salary|
+-----------+----------+
|  Marketing|    3500.0|
|Engineering|    5500.0|
|      Sales|    4100.0|
+-----------+----------+

💡 Key Points

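  • createGlobalTempView() shares a view across all sessions of the same Spark application; other applications cannot see it.
  • The view must be qualified with the global_temp database, for example global_temp.employee_global_view.
  • Global temp views last until the Spark application terminates or the view is dropped explicitly.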

© 2025 Aamir Shahzad. All rights reserved.

Visit TechBrothersIT for more tutorials.