How to Use createGlobalTempView() in PySpark | Step-by-Step Guide
Author: Aamir Shahzad
Published: March 2025
Introduction
The createGlobalTempView() function in PySpark registers a DataFrame as a global temporary view that can be queried from any Spark session in the same application. Unlike a view created with createTempView(), which is visible only to the session that created it, a global temporary view persists for the lifetime of the Spark application and is stored in the global_temp database.
Sample Dataset
Name           Department   Salary
Aamir Shahzad  Engineering  5000
Ali            Sales        4000
Raza           Marketing    3500
Bob            Sales        4200
Lisa           Engineering  6000
Create DataFrame in PySpark
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("createGlobalTempViewExample").getOrCreate()
data = [
("Aamir Shahzad", "Engineering", 5000),
("Ali", "Sales", 4000),
("Raza", "Marketing", 3500),
("Bob", "Sales", 4200),
("Lisa", "Engineering", 6000)
]
columns = ["Name", "Department", "Salary"]
df = spark.createDataFrame(data, columns)
df.show()
✅ Expected Output
+-------------+-----------+------+
|         Name| Department|Salary|
+-------------+-----------+------+
|Aamir Shahzad|Engineering|  5000|
|          Ali|      Sales|  4000|
|         Raza|  Marketing|  3500|
|          Bob|      Sales|  4200|
|         Lisa|Engineering|  6000|
+-------------+-----------+------+
Create a Global Temporary View
# Overwrites any existing view of this name; createGlobalTempView() would raise an error instead
df.createOrReplaceGlobalTempView("employee_global_view")
Query the Global Temp View in Current Session
result1 = spark.sql("SELECT * FROM global_temp.employee_global_view")
result1.show()
✅ Expected Output
+-------------+-----------+------+
|         Name| Department|Salary|
+-------------+-----------+------+
|Aamir Shahzad|Engineering|  5000|
|          Ali|      Sales|  4000|
|         Raza|  Marketing|  3500|
|          Bob|      Sales|  4200|
|         Lisa|Engineering|  6000|
+-------------+-----------+------+
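Access the Global View from a New Session
# spark.newSession() creates a separate session in the same application;
# SparkSession.builder.getOrCreate() would only return the existing session.
new_session = spark.newSession()
result2 = new_session.sql("SELECT * FROM global_temp.employee_global_view")
result2.show()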
✅ Expected Output
+-------------+-----------+------+
|         Name| Department|Salary|
+-------------+-----------+------+
|Aamir Shahzad|Engineering|  5000|
|          Ali|      Sales|  4000|
|         Raza|  Marketing|  3500|
|          Bob|      Sales|  4200|
|         Lisa|Engineering|  6000|
+-------------+-----------+------+
Aggregate Example: Average Salary by Department
result3 = spark.sql("""
SELECT Department, AVG(Salary) AS Avg_Salary
FROM global_temp.employee_global_view
GROUP BY Department
""")
result3.show()
✅ Expected Output
+-----------+----------+
| Department|Avg_Salary|
+-----------+----------+
|  Marketing|    3500.0|
|Engineering|    5500.0|
|      Sales|    4100.0|
+-----------+----------+
Key Points
- createGlobalTempView() enables view sharing across sessions.
- The view must be accessed using the global_temp database.
- Global temp views last until the Spark application terminates.