DataFrame.distinct()
Returns a new DataFrame containing the distinct rows in the originating DataFrame.
df = df.distinct()
DataFrame.drop_duplicates(subset=None)
Returns a new DataFrame with duplicate rows removed, optionally only considering certain columns.
df = df.drop_duplicates()
df = df.drop_duplicates(["firstname", "lastname"])
DataFrame.dropna(how='any', thresh=None, subset=None)
Alias: DataFrame.na.drop(how='any', thresh=None, subset=None)
Returns a new DataFrame omitting rows with null values. DataFrame.dropna() and DataFrameNaFunctions.drop() are aliases of each other.
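Parameters:
how: 'any' drops a row if it contains any null value; 'all' drops a row only if all of its values are null.
thresh: defaults to None. If set to an integer, drops rows that have fewer than thresh non-null values, overriding how.
subset: defaults to None. An optional list of column names to consider.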
DataFrame.limit(number)
DataFrame.sort(*cols, **kwargs)
Alias: DataFrame.orderBy(*cols, **kwargs)
Column.asc()
F.asc(col)
Column.desc()
F.desc(col)