Consider the following data frame:
a1, a2, a3, b1
1,1,2,4
3,4,2,1
3,6,2,9
...
I want to compute c1=a1/b1, c2=a2/b1, and c3=a3/b1 and append them as new columns on the right of the data frame.
With only three variables I can still write them out by hand (in R, mutate(c1=a1/b1, c2=a2/b1, c3=a3/b1)). I would like to know how to handle this in both Python (pandas) and R (dplyr) when the number of variables grows.
Thank you in advance.
In pandas, it can be done like this.
import pandas as pd
df1 = pd.DataFrame([[1, 1, 2, 4],
                    [3, 4, 2, 1],
                    [3, 6, 2, 9]],
                   columns=["a1", "a2", "a3", "b1"])
df2 = df1.div(df1["b1"], axis=0).drop("b1", axis=1)  # Divide each column by b1
df2.columns = df2.columns.map(lambda s: s.replace("a", "c"))  # Rename a1 -> c1, etc.
df3 = pd.concat([df1, df2], axis=1)  # Append to the right
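Since the question is about what happens when the number of variables grows, here is a minimal sketch of the same idea wrapped in a function that picks the numerator columns by name prefix. The function name add_ratio_columns and its parameters are hypothetical, made up for illustration, and the sketch assumes the numerator columns all start with "a". It swaps only the leading prefix, because s.replace("a", "c") would change every "a" in a longer column name.

```python
import pandas as pd

def add_ratio_columns(df, num_prefix="a", denom="b1", new_prefix="c"):
    """Append one ratio column for each column whose name starts with num_prefix."""
    # Select numerator columns by name, so the number of columns does not matter.
    num_cols = [c for c in df.columns if c.startswith(num_prefix)]
    # Divide each selected column row-wise by the denominator column.
    ratios = df[num_cols].div(df[denom], axis=0)
    # Swap only the leading prefix (a1 -> c1), leaving the rest of the name alone.
    ratios.columns = [new_prefix + c[len(num_prefix):] for c in num_cols]
    # Append the new columns to the right of the original frame.
    return pd.concat([df, ratios], axis=1)

df1 = pd.DataFrame([[1, 1, 2, 4], [3, 4, 2, 1], [3, 6, 2, 9]],
                   columns=["a1", "a2", "a3", "b1"])
df3 = add_ratio_columns(df1)
print(df3)
```

With more columns (a4, a5, ...) the call stays the same, as long as the naming assumption holds.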
The same thing in R looks like this.
library(dplyr)  # needed for rename_all
df1<-data.frame(a1=c(1,3,3), a2=c(1,4,6), a3=c(2,2,2), b1=c(4,1,9))
df2<-df1/df1$b1
df2$b1<-NULL
df2<-rename_all(df2, function(x)sub("a", "c", x))
df3<-cbind(df1,df2)
Anyway, that is how I would write it.
How about this?
Python
import pandas as pd
df=pd.DataFrame({'a1':[1,3,3],'a2':[1,4,6],'a3':[2,2,2],'b1':[4,1,9]})
df.join(df.loc[:, :'a3'].div(df['b1'], axis=0).add_prefix('d_'))
# a1 a2 a3 b1 d_a1 d_a2 d_a3
#0 1 1 2 4 0.250000 0.250000 0.500000
#1 3 4 2 1 3.000000 4.000000 2.000000
#2 3 6 2 9 0.333333 0.666667 0.222222
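The label slice :'a3' relies on the numerator columns being contiguous and ending at a3. If that cannot be assumed, one alternative (a sketch, not the only way) is to build the new columns by name and pass them to assign, which is close in spirit to dplyr's mutate. Treating every column other than b1 as a numerator is an assumption made for this example.

```python
import pandas as pd

df = pd.DataFrame({'a1': [1, 3, 3], 'a2': [1, 4, 6], 'a3': [2, 2, 2], 'b1': [4, 1, 9]})
# Assumption: every column except the denominator b1 is a numerator.
num_cols = [c for c in df.columns if c != 'b1']
# One keyword argument per new column; assign returns a new frame and leaves df unchanged.
out = df.assign(**{f'd_{c}': df[c] / df['b1'] for c in num_cols})
print(out)
```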
R (dplyr)
library(dplyr)
df<-data.frame(a1=c(1,3,3), a2=c(1,4,6), a3=c(2,2,2), b1=c(4,1,9))
df%>%mutate(d=df[1:(ncol(df)-1)]/df$b1)  # all columns except the last (b1), divided by b1
# a1 a2 a3 b1 d.a1 d.a2 d.a3
#1 1 1 2 4 0.2500000 0.2500000 0.5000000
#2 3 4 2 1 3.0000000 4.0000000 2.0000000
#3 3 6 2 9 0.3333333 0.6666667 0.2222222