Let’s explore Spark DataFrames
DataFrame analysis in PySpark!
Configs: Initial configuration and Spark settings for your Jupyter notebook!
Basics: Read, write, and generate a sample DF! If you don’t have a big data set handy, build one quickly. Let’s get some standard reading, writing, and partitioning examples down.
Analysis: Let’s do some common DataFrame manipulations. I’ll show the column formatting issues and operations you’re most likely to run into. We’ll cover aggregations, grouping, and ordering.
Transformations: (coming soon!)
Joins: Join DataFrames!