PySpark: Compare Two Schemas

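To compare two DataFrame schemas in PySpark, we can use Python's set operations. A StructType schema iterates over its StructField objects, so converting each schema to a set and taking the difference in both directions shows which fields appear in only one of them.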

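def schema_diff(schema1, schema2):
    # Each schema (e.g. df.schema) iterates over its StructField objects.
    # Whole fields are compared, so a column whose type changed shows up in both sets.
    return {
        'fields_in_1_not_2': set(schema1) - set(schema2),
        'fields_in_2_not_1': set(schema2) - set(schema1),
    }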


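Because the set difference compares whole fields, a column that exists in both schemas with a different type is reported on both sides. If it helps to separate that case, the comparison can be keyed on column name instead. The following is only a minimal sketch: schema_diff_by_name and its result keys are hypothetical names, not part of PySpark, and it assumes each field exposes .name and .dataType the way StructField does.

```python
def schema_diff_by_name(schema1, schema2):
    """Compare two schemas by column name (hypothetical helper, not a PySpark API)."""
    # Assumption: iterating a schema yields fields with .name and .dataType,
    # as pyspark.sql.types.StructType does with its StructField objects.
    fields1 = {f.name: f for f in schema1}
    fields2 = {f.name: f for f in schema2}
    shared = fields1.keys() & fields2.keys()
    return {
        # Column names present on one side only.
        'only_in_1': sorted(fields1.keys() - fields2.keys()),
        'only_in_2': sorted(fields2.keys() - fields1.keys()),
        # Same column name on both sides, but a different data type.
        'type_changed': sorted(
            name for name in shared
            if fields1[name].dataType != fields2[name].dataType
        ),
    }
```

It would be called the same way as schema_diff, for example schema_diff_by_name(df1.schema, df2.schema), where df1 and df2 stand for any two DataFrames. This version ignores nullability; that is a simplification in the sketch, not a recommendation.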