How to Use createGlobalTempView() in PySpark | Step-by-Step Guide
Author: Aamir Shahzad
Published: March 2025
📘 Introduction
The createGlobalTempView() function in PySpark registers a DataFrame as a global temporary view that can be accessed from every Spark session within the same Spark application. Unlike a view created with createTempView(), which is visible only to the session that created it, a global temporary view persists for the lifetime of the Spark application and is stored in the global_temp database.
🧾 Sample Dataset
Name           Department   Salary
Aamir Shahzad  Engineering  5000
Ali            Sales        4000
Raza           Marketing    3500
Bob            Sales        4200
Lisa           Engineering  6000
🔧 Create DataFrame in PySpark
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("createGlobalTempViewExample").getOrCreate()
data = [
    ("Aamir Shahzad", "Engineering", 5000),
    ("Ali", "Sales", 4000),
    ("Raza", "Marketing", 3500),
    ("Bob", "Sales", 4200),
    ("Lisa", "Engineering", 6000),
]
columns = ["Name", "Department", "Salary"]
df = spark.createDataFrame(data, columns)
df.show()
✅ Expected Output
+-------------+-----------+------+
|         Name| Department|Salary|
+-------------+-----------+------+
|Aamir Shahzad|Engineering|  5000|
|          Ali|      Sales|  4000|
|         Raza|  Marketing|  3500|
|          Bob|      Sales|  4200|
|         Lisa|Engineering|  6000|
+-------------+-----------+------+
📌 Create a Global Temporary View
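# createOrReplaceGlobalTempView() overwrites an existing view of the same name;
# createGlobalTempView() raises an error if the view already exists.
df.createOrReplaceGlobalTempView("employee_global_view")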
📊 Query the Global Temp View in Current Session
result1 = spark.sql("SELECT * FROM global_temp.employee_global_view")
result1.show()
✅ Expected Output
+-------------+-----------+------+
|         Name| Department|Salary|
+-------------+-----------+------+
|Aamir Shahzad|Engineering|  5000|
|          Ali|      Sales|  4000|
|         Raza|  Marketing|  3500|
|          Bob|      Sales|  4200|
|         Lisa|Engineering|  6000|
+-------------+-----------+------+
🔁 Access the Global View from a New Session
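# builder.getOrCreate() would return the session that is already running,
# so use newSession() to get a separate session in the same application.
new_session = spark.newSession()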
result2 = new_session.sql("SELECT * FROM global_temp.employee_global_view")
result2.show()
✅ Expected Output
+-------------+-----------+------+
|         Name| Department|Salary|
+-------------+-----------+------+
|Aamir Shahzad|Engineering|  5000|
|          Ali|      Sales|  4000|
|         Raza|  Marketing|  3500|
|          Bob|      Sales|  4200|
|         Lisa|Engineering|  6000|
+-------------+-----------+------+
📈 Aggregate Example: Average Salary by Department
result3 = spark.sql("""
SELECT Department, AVG(Salary) AS Avg_Salary
FROM global_temp.employee_global_view
GROUP BY Department
""")
result3.show()
✅ Expected Output
+-----------+----------+
| Department|Avg_Salary|
+-----------+----------+
|  Marketing|    3500.0|
|Engineering|    5500.0|
|      Sales|    4100.0|
+-----------+----------+
💡 Key Points
- createGlobalTempView() enables view sharing across sessions of the same Spark application.
- The view must be accessed using the global_temp database.
- Global temp views last until the Spark application terminates.