PySpark Tutorial: How to Use dtypes in PySpark
Get DataFrame Column Names and Types | Step-by-Step Guide
Learn how to use dtypes in PySpark to quickly inspect column names and their corresponding data types, a must-know feature for any data engineer or analyst.
📘 Introduction
Understanding the structure of your DataFrame is essential for building reliable data pipelines. The dtypes attribute returns a list of (column name, data type) tuples, one per column, so you can see each column and its type in a quick and readable format.
🔧 PySpark Code Example
from pyspark.sql import SparkSession
# Create SparkSession
spark = SparkSession.builder.appName("PySpark dtypes Example").getOrCreate()
# Sample data
data = [
    ("Aamir Shahzad", 85, 90.5, True),
    ("Ali Raza", 78, 83.0, False),
    ("Bob", 92, 95.2, True),
    ("Lisa", 80, 87.8, False),
]
# Define columns
columns = ["Name", "Math_Score", "Science_Score", "Passed"]
# Create DataFrame
df = spark.createDataFrame(data, columns)
# Show the DataFrame
df.show()
# Get column names and data types
print("Column Names and Data Types (dtypes):")
print(df.dtypes)
# Formatted output
print("\nFormatted Output of Column Data Types:")
for col_name, data_type in df.dtypes:
    print(f"Column: {col_name}, Type: {data_type}")
📊 Original DataFrame Output
+-------------+----------+-------------+------+
|         Name|Math_Score|Science_Score|Passed|
+-------------+----------+-------------+------+
|Aamir Shahzad|        85|         90.5|  true|
|     Ali Raza|        78|         83.0| false|
|          Bob|        92|         95.2|  true|
|         Lisa|        80|         87.8| false|
+-------------+----------+-------------+------+
📥 dtypes Attribute Output
[('Name', 'string'),
('Math_Score', 'bigint'),
('Science_Score', 'double'),
('Passed', 'boolean')]
✅ Formatted Column Type Output
Column: Name, Type: string
Column: Math_Score, Type: bigint
Column: Science_Score, Type: double
Column: Passed, Type: boolean
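Because dtypes is an ordinary Python list of (name, type) tuples, it can be reshaped with plain Python. The sketch below is one way to do that, not part of the original example: it hard-codes the pairs shown above so it runs without a Spark session, and the names `type_by_column` and `numeric_columns` are illustrative only.

```python
# The (name, type) pairs from the output above, copied here so the sketch
# runs without Spark. With a live DataFrame, use `dtypes = df.dtypes` instead.
dtypes = [
    ("Name", "string"),
    ("Math_Score", "bigint"),
    ("Science_Score", "double"),
    ("Passed", "boolean"),
]

# Turn the list into a dict for direct lookup by column name.
type_by_column = dict(dtypes)
print(type_by_column["Math_Score"])  # bigint

# Keep only the columns whose type is numeric in this sample.
numeric_columns = [
    name for name, data_type in dtypes if data_type in ("bigint", "double")
]
print(numeric_columns)  # ['Math_Score', 'Science_Score']
```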
💡 Why Use dtypes?
- Quickly inspect DataFrame schema during exploration and debugging.
- Useful in dynamic transformations where data types matter (e.g., casting).
- A more compact alternative to printSchema() when you just need names and types: it returns a list you can use in code instead of printing a tree.
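The casting use case mentioned above could look roughly like this. It is a minimal sketch under stated assumptions: `build_cast_exprs` is a hypothetical helper (not a PySpark function), and the dtypes list is hard-coded from the sample output so the block runs without a Spark session. It builds SQL expression strings that cast every column of one type to another and pass all other columns through unchanged.

```python
# The (name, type) pairs from the sample DataFrame, copied so the sketch
# runs without Spark. With a live DataFrame, use `dtypes = df.dtypes` instead.
dtypes = [
    ("Name", "string"),
    ("Math_Score", "bigint"),
    ("Science_Score", "double"),
    ("Passed", "boolean"),
]


def build_cast_exprs(dtypes, source_type, target_type):
    """Hypothetical helper: build one SQL expression string per column,
    casting columns of source_type to target_type and leaving the rest as-is."""
    exprs = []
    for col_name, data_type in dtypes:
        if data_type == source_type:
            # Backticks quote the column name in Spark SQL.
            exprs.append(f"CAST(`{col_name}` AS {target_type}) AS `{col_name}`")
        else:
            exprs.append(f"`{col_name}`")
    return exprs


cast_exprs = build_cast_exprs(dtypes, "bigint", "double")
for expr in cast_exprs:
    print(expr)
```

With a live DataFrame, the resulting strings can be passed to `df.selectExpr(*cast_exprs)`, which should yield a DataFrame where Math_Score is a double and the other columns keep their types.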