How to Display Data in PySpark Using show() Function | PySpark Tutorial for Beginners

How to Use show() Function in PySpark | Step-by-Step Guide

The show() function in PySpark allows you to display DataFrame contents in a readable tabular format. It’s ideal for quickly checking your data or debugging your transformations.

What is show() in PySpark?

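The show() function is a simple way to view rows from a DataFrame. By default, it prints the first 20 rows and truncates string values longer than 20 characters. It prints to the console and returns None, so it is meant for inspection rather than for passing data to further code.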

Common Use Cases

1. Show Default Rows (First 20 Rows)

df.show()

Displays the first 20 rows and truncates strings longer than 20 characters.

2. Show a Specific Number of Rows

df.show(5)

Displays only the first 5 rows of the DataFrame.

3. Show Full Column Content (No Truncation)

df.show(truncate=False)

Displays full content in each column, without cutting off long strings.

4. Truncate Column Content After N Characters

df.show(truncate=10)

Shortens any value longer than 10 characters so that it fits in 10 characters, which is useful for large text fields.
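
To make the truncation rule concrete, here is a plain-Python sketch of how a single cell appears to be shortened: values longer than the width keep their first width minus 3 characters and get "..." appended, while widths below 4 are cut without the dots. The helper name truncate_cell is hypothetical and not part of PySpark; it only approximates the behavior and does not reproduce column alignment.

```python
def truncate_cell(value: str, width: int = 20) -> str:
    """Approximate how show() shortens one cell when truncation is on.

    Hypothetical helper for illustration only; not a PySpark API.
    """
    # Values that already fit (or a non-positive width) are left unchanged.
    if width <= 0 or len(value) <= width:
        return value
    # Very small widths leave no room for the "..." marker.
    if width < 4:
        return value[:width]
    # Otherwise keep width - 3 characters and append "...".
    return value[: width - 3] + "..."


print(truncate_cell("PySpark DataFrame tutorial", 10))  # prints: PySpark...
print(truncate_cell("short", 10))                       # prints: short
```

Note that the result is at most the requested width, dots included, so truncate=10 leaves only 7 characters of the original text.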

5. Show Rows in Vertical Format

df.show(vertical=True)

Prints each row as its own record, with one column name and value per line. This is helpful for wide DataFrames that would not fit across the screen, or for debugging individual rows.

Summary of Options

  • df.show(): Shows the first 20 rows, truncating strings longer than 20 characters.
  • df.show(n): Shows the first n rows.
  • df.show(truncate=False): Shows full column content.
  • df.show(truncate=n): Truncates text after n characters.
  • df.show(vertical=True): Displays data vertically.

Author: Aamir Shahzad | PySpark Tutorial for Beginners