PySpark Tutorial: How to Get a Column in PySpark Using Dot Notation or Square Brackets

How to Get a Column in PySpark using Dot or Square Brackets

In PySpark, you can access DataFrame columns using either dot notation or square brackets. This tutorial shows both methods with examples and outputs.

1. Create Spark Session

from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("Get Columns in PySpark") \
    .getOrCreate()

2. Create Sample DataFrame

data = [
    (1, "Aamir Shahzad", 35),
    (2, "Ali Raza", 30),
    (3, "Bob", 25),
    (4, "Lisa", 28)
]

columns = ["id", "name", "age"]

df = spark.createDataFrame(data, columns)
df.show()
+---+-------------+---+
| id|         name|age|
+---+-------------+---+
|  1|Aamir Shahzad| 35|
|  2|     Ali Raza| 30|
|  3|          Bob| 25|
|  4|         Lisa| 28|
+---+-------------+---+

3. Get a Column in PySpark

Method 1: Using Dot Notation

df.select(df.name).show()
+-------------+
|         name|
+-------------+
|Aamir Shahzad|
|     Ali Raza|
|          Bob|
|         Lisa|
+-------------+

Method 2: Using Square Brackets

df.select(df["name"]).show()
+-------------+
|         name|
+-------------+
|Aamir Shahzad|
|     Ali Raza|
|          Bob|
|         Lisa|
+-------------+
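
Both notations return the same pyspark.sql.Column object, so they are interchangeable in select, filter, and other expressions. A quick check (a minimal sketch; the printed class path assumes a standard PySpark install):

# Both access styles produce the same Column type
print(type(df.name))       # <class 'pyspark.sql.column.Column'>
print(type(df["name"]))    # <class 'pyspark.sql.column.Column'>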

4. Filter with Column Reference

Using Dot Notation

df.filter(df.age > 28).show()
+---+-------------+---+
| id|         name|age|
+---+-------------+---+
|  1|Aamir Shahzad| 35|
|  2|     Ali Raza| 30|
+---+-------------+---+

Using Square Brackets

df.filter(df["age"] == 30).show()
+---+--------+---+
| id|    name|age|
+---+--------+---+
|  2|Ali Raza| 30|
+---+--------+---+
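
Column references from either notation can also be combined in a single boolean expression. Note that PySpark uses & and | with parenthesized operands rather than Python's and/or. A minimal sketch mixing both styles:

# Rows where age > 25 AND id < 4; each comparison must be parenthesized
df.filter((df.age > 25) & (df["id"] < 4)).show()

With the sample data above, this returns the rows for Aamir Shahzad and Ali Raza.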

5. View All Column Names

df.columns
['id', 'name', 'age']
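
Because df.columns is a plain Python list, standard list operations apply. For example, a minimal sketch that guards a select with a membership check and drops one column by name:

# Check that the column exists before selecting it
if "name" in df.columns:
    df.select("name").show()

# Select every column except "id"
df.select([c for c in df.columns if c != "id"]).show()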

6. Summary

  • df.name is easier and more concise, great for simple column access.
  • df["name"] is more flexible and safer when column names include spaces or special characters.
  • df.columns returns a list of all column names.
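
To illustrate the second point: dot notation cannot express a column name that contains a space, while square brackets handle it without issue. A minimal sketch (the DataFrame and its "full name" column are hypothetical, created only for this example):

# Hypothetical DataFrame with a space in a column name
df2 = spark.createDataFrame([(1, "Aamir Shahzad")], ["id", "full name"])

# df2.full name                       # SyntaxError: dot notation cannot contain a space
df2.select(df2["full name"]).show()   # square brackets work fine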


Author: Aamir Shahzad

© 2024 PySpark Tutorials. All rights reserved.