This post shows how to create a Spark Session in PySpark.
Imports
from pyspark.sql import SparkSession
Create the Spark Session
spark = SparkSession.builder.appName('pyspark_app_name').getOrCreate()
You can add any configs you wish during creation. Add them before the ".getOrCreate()" call.
You can see a full list of options in the Spark configuration documentation.
- .config("spark.sql.jsonGenerator.ignoreNullFields", "false")
- Keeps NULL fields when writing JSON output (by default Spark drops them)
- .config("spark.sql.parquet.int96RebaseModeInWrite", "CORRECTED")
- Writes legacy INT96 timestamps as-is, avoiding calendar-rebase errors for old dates in write operations
- .config("spark.sql.parquet.int96RebaseModeInRead", "CORRECTED")
- Reads legacy INT96 timestamps as-is, avoiding the same errors in read operations
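Putting it together, here is a sketch of a session with all three configs chained before ".getOrCreate()" (the app name is just a placeholder):

```python
from pyspark.sql import SparkSession

# Build a session with the configs listed above applied before .getOrCreate()
spark = (
    SparkSession.builder
    .appName('pyspark_app_name')  # placeholder app name
    # Keep NULL fields when writing JSON
    .config('spark.sql.jsonGenerator.ignoreNullFields', 'false')
    # Read/write legacy INT96 Parquet timestamps as-is
    .config('spark.sql.parquet.int96RebaseModeInWrite', 'CORRECTED')
    .config('spark.sql.parquet.int96RebaseModeInRead', 'CORRECTED')
    .getOrCreate()
)

# Confirm a config took effect
print(spark.conf.get('spark.sql.jsonGenerator.ignoreNullFields'))
```

Note that .getOrCreate() returns an existing session if one is already running in the process, so configs set here may not apply to a session created earlier.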