pyspark.pandas.groupby.GroupBy.mean
- GroupBy.mean(numeric_only=False)
Compute mean of groups, excluding missing values.
- Parameters
- numeric_only : bool, default False
Include only float, int, boolean columns.
New in version 3.4.0.
Changed in version 4.0.0.
- Returns
- pyspark.pandas.Series or pyspark.pandas.DataFrame
Examples
>>> import numpy as np
>>> import pyspark.pandas as ps
>>> df = ps.DataFrame({'A': [1, 1, 2, 1, 2],
...                    'B': [np.nan, 2, 3, 4, 5],
...                    'C': [1, 2, 1, 1, 2],
...                    'D': [True, False, True, False, True]})
Groupby one column and return the mean of the remaining columns in each group.
>>> df.groupby('A').mean().sort_index()
     B         C         D
A
1  3.0  1.333333  0.333333
2  4.0  1.500000  1.000000
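To make "excluding missing values" concrete, the following plain-Python sketch reproduces the arithmetic behind column B in the example. It only illustrates the semantics and is not how pandas-on-Spark computes the result. The helper name `group_means` is hypothetical, and unlike the library it simply omits any group whose values are all missing.

```python
import math
from collections import defaultdict


def group_means(keys, values):
    """Mean of `values` per key, skipping NaN entries (hypothetical helper)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for key, value in zip(keys, values):
        if math.isnan(value):
            continue  # a missing value adds to neither the sum nor the count
        totals[key] += value
        counts[key] += 1
    # Groups with no non-missing values never reach `counts` and are omitted.
    return {key: totals[key] / counts[key] for key in sorted(counts)}


# Columns A and B from the example above.
A = [1, 1, 2, 1, 2]
B = [float("nan"), 2, 3, 4, 5]

means = group_means(A, B)
print(means)  # {1: 3.0, 2: 4.0}
```

Group 1 has three rows, but only the two non-missing values (2 and 4) enter the mean, which is why the result is 3.0 rather than 2.0.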