Calculated Columns in Pandas
Create new columns in pandas
Pandas has two very useful methods, `groupby` and `transform`. In this TIL, I will demonstrate how to use them to create new columns from existing columns.
First, I create a new data frame.
import pandas as pd
df = pd.DataFrame({'city': ['London', 'London', 'Berlin', 'Berlin'], 'rent': [1000, 1400, 800, 1000]})
which looks like
| | city | rent |
|---|---|---|
| 0 | London | 1000 |
| 1 | London | 1400 |
| 2 | Berlin | 800 |
| 3 | Berlin | 1000 |
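Next, I create a new column called total, which holds the total rent of the city each row belongs to. Selecting the rent column before calling transform makes sure a single Series is assigned, even if the data frame later gains more columns.
df['total'] = df.groupby('city')['rent'].transform('sum')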
| | city | rent | total |
|---|---|---|---|
| 0 | London | 1000 | 2400 |
| 1 | London | 1400 | 2400 |
| 2 | Berlin | 800 | 1800 |
| 3 | Berlin | 1000 | 1800 |
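To see why `transform` fits here, it may help to compare it with a plain aggregation. The following is a minimal sketch that rebuilds the same data frame; the names `per_city`, `per_row`, and `mapped` are only illustrative and not part of the original post.

```python
import pandas as pd

df = pd.DataFrame({
    'city': ['London', 'London', 'Berlin', 'Berlin'],
    'rent': [1000, 1400, 800, 1000],
})

# Plain aggregation: collapses to one value per city, indexed by city.
per_city = df.groupby('city')['rent'].sum()

# transform: returns one value per original row, aligned with df's index,
# so the result can be assigned directly as a new column.
per_row = df.groupby('city')['rent'].transform('sum')

# Mapping the aggregated totals back onto the city column gives the same values.
mapped = df['city'].map(per_city)

print(len(per_city), len(per_row))
print(per_row.tolist())
```

The aggregated result has one entry per city, while the transformed result has one entry per row, which is what a new column needs.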
Next, I create a new column called percent, which contains each rent as a fraction of its city's total rent.
df['percent'] = df['rent']/df['total']
| | city | rent | total | percent |
|---|---|---|---|---|
| 0 | London | 1000 | 2400 | 0.416667 |
| 1 | London | 1400 | 2400 | 0.583333 |
| 2 | Berlin | 800 | 1800 | 0.444444 |
| 3 | Berlin | 1000 | 1800 | 0.555556 |
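As a quick sanity check, the fractions within each city should add up to 1. The sketch below rebuilds the data frame, computes the fraction without an intermediate total column, and verifies the sums. The names `city_total`, `check`, and the extra column `percent_100` are hypothetical and only used for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'city': ['London', 'London', 'Berlin', 'Berlin'],
    'rent': [1000, 1400, 800, 1000],
})

# Per-row city totals, kept in a temporary Series instead of a column.
city_total = df.groupby('city')['rent'].transform('sum')

# Fraction of the city's total rent contributed by each row.
df['percent'] = df['rent'] / city_total

# The fractions of each city should sum to 1 (up to floating point error).
check = df.groupby('city')['percent'].sum()
print(check)

# Multiply by 100 if an actual percentage is wanted (hypothetical column name).
df['percent_100'] = (df['percent'] * 100).round(1)
print(df)
```

Note that the values in the percent column are fractions between 0 and 1, so multiplying by 100 is needed if they should read as percentages.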
Planted by Lei Ma.
L Ma (2018). 'Calculated Columns in Pandas', Datumorphism, 05 April. Available at: https://datumorphism.leima.is/til/programming/pandas/pandas-new-column-from-other/.