Calculated Columns in Pandas

Create new columns in pandas

Pandas has two very useful methods, groupby and transform. In this TIL, I will demonstrate how to use them to create new columns from existing columns.

First, I create a new data frame.

import pandas as pd
df = pd.DataFrame({'city': ['London', 'London', 'Berlin', 'Berlin'], 'rent': [1000, 1400, 800, 1000]})

which looks like

     city  rent
0  London  1000
1  London  1400
2  Berlin   800
3  Berlin  1000

I will create a new column called total, which will hold the total rent of the corresponding city.

df['total'] = df.groupby('city')['rent'].transform('sum')

     city  rent  total
0  London  1000   2400
1  London  1400   2400
2  Berlin   800   1800
3  Berlin  1000   1800
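
To see why transform fits here, it helps to compare it with a plain aggregation. The following is a minimal sketch and not part of the original note; the names per_city and per_row are made up for illustration. Summing per group gives one value per city, while transform('sum') gives one value per original row, aligned to the original index, so it can be assigned directly as a column.

```python
import pandas as pd

df = pd.DataFrame({
    'city': ['London', 'London', 'Berlin', 'Berlin'],
    'rent': [1000, 1400, 800, 1000],
})

# Plain aggregation: one value per city, indexed by the city name.
# (per_city is a hypothetical name used only in this sketch.)
per_city = df.groupby('city')['rent'].sum()

# transform: one value per original row, sharing df's index.
# (per_row is a hypothetical name used only in this sketch.)
per_row = df.groupby('city')['rent'].transform('sum')

print(len(per_city))  # number of groups
print(len(per_row))   # number of rows in df
```

Because per_row shares its index with df, assigning it to df['total'] needs no merge or join back onto the original rows.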

Next, I will create a new column called percent, which will contain each rent's share of its city's total, expressed as a fraction.

df['percent'] = df['rent'] / df['total']

     city  rent  total   percent
0  London  1000   2400  0.416667
1  London  1400   2400  0.583333
2  Berlin   800   1800  0.444444
3  Berlin  1000   1800  0.555556
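
If the intermediate total column is not needed, the share can probably be computed in a single expression. This is a sketch under my own assumptions rather than the approach used above: the names share, percent_of_city, and city_sums are hypothetical. It also rescales the fraction to a 0-100 value and checks that the shares within each city add up to 1.

```python
import pandas as pd

df = pd.DataFrame({
    'city': ['London', 'London', 'Berlin', 'Berlin'],
    'rent': [1000, 1400, 800, 1000],
})

# Share of each rent within its city, without a helper column.
share = df['rent'] / df.groupby('city')['rent'].transform('sum')

# Hypothetical column name: the same share on a 0-100 scale,
# rounded to one decimal place.
df['percent_of_city'] = (share * 100).round(1)

# Sanity check: the shares within each city should sum to 1
# (up to floating-point error).
city_sums = share.groupby(df['city']).sum()

print(df)
print(city_sums)
```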

L Ma (2018). 'Calculated Columns in Pandas', Datumorphism, 05 April. Available at: https://datumorphism.leima.is/til/programming/pandas/pandas-new-column-from-other/.