PySpark Sorting Explained | ASC vs DESC | Handling NULLs with asc_nulls_first & desc_nulls_last | PySpark Tutorial

In this tutorial, you’ll learn how to sort PySpark DataFrames in ascending and descending order and how to control where NULL values land using asc_nulls_first(), asc_nulls_last(), desc_nulls_first(), and desc_nulls_last(). Each case is shown with a runnable example.

1️⃣ Create Spark Session

from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("SortingExamples").getOrCreate()

2️⃣ Create Sample Data

data = [
    (1, "Aamir", 50000),
    (2, "Ali", None),
    (3, "Bob", 45000),
    (4, "Lisa", 60000),
    (5, "Zara", None)
]
columns = ["id", "name", "salary"]
df = spark.createDataFrame(data, columns)
df.show()

3️⃣ Sort by ASC (default)

# Ascending order; in Spark, NULLs come first by default when sorting ascending
df.orderBy(col("salary").asc()).show()

4️⃣ Sort by ASC with NULLs First

# Explicitly place NULLs before all non-NULL values
df.orderBy(col("salary").asc_nulls_first()).show()

5️⃣ Sort by ASC with NULLs Last

# Push NULLs to the end while keeping ascending order
df.orderBy(col("salary").asc_nulls_last()).show()

6️⃣ Sort by DESC

# Descending order; in Spark, NULLs come last by default when sorting descending
df.orderBy(col("salary").desc()).show()

7️⃣ DESC with NULLs First

# Descending order with NULLs placed before all non-NULL values
df.orderBy(col("salary").desc_nulls_first()).show()

8️⃣ DESC with NULLs Last

# Descending order with NULLs placed after all non-NULL values
df.orderBy(col("salary").desc_nulls_last()).show()
