I want to expand a column containing integers horizontally and convert it to 0/1

Asked 2 years ago, Updated 2 years ago, 65 views

I have the following data frame df.

id category
1   3
2   2
3   3
4   1
5   2
6   2

I would like to convert this into the following:

id category1 category2 category3
 1         0         0         1
 2         0         1         0
 3         0         0         1
 4         1         0         0
 5         0         1         0
 6         0         1         0

That is the result I want.
Currently I do it like this:

df <- tidyr::spread(df, category, category)  # horizontal expansion
df.id <- df[, 1]  # set aside, otherwise the id column would also be turned into 1
df.other <- df[, -1]
df.other[!is.na(df.other)] <- 1
df.other[is.na(df.other)] <- 0
df <- cbind(df.id, df.other)

Is there a smarter way to do this?
Please let me know.

r

2022-09-29 21:45

5 Answers

I don't know whether it counts as smart, but here is one way to do it with apply.

> df[paste('category', sort(unique(df$category)), sep='')] <- 0
> data.frame(t(apply(df, 1,
    function(r) {
      r[paste('category', r['category'], sep='')] <- 1
      r[-c(2)]
    })))

  id category1 category2 category3
1  1         0         0         1
2  2         0         1         0
3  3         0         0         1
4  4         1         0         0
5  5         0         1         0
6  6         0         1         0

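apply(df, 1, ...) processes the data frame row by row. The first line adds one zero-filled column per category value, the function sets the column matching each row's category to 1, and r[-c(2)] drops the original category column from the result.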
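As a loop-free variant of the same idea, here is a minimal base R sketch that builds the indicator columns in one step with outer(). This is an alternative technique rather than the answerer's code, and the names lv, dummies and result are made up for illustration.

```r
df <- data.frame(id = 1:6, category = c(3, 2, 3, 1, 2, 2))

# Distinct category values, sorted: one output column per value
lv <- sort(unique(df$category))

# outer() compares every row's category with every value in lv;
# adding 0 turns the resulting logical matrix into 0/1 numbers
dummies <- outer(df$category, lv, "==") + 0
colnames(dummies) <- paste0("category", lv)

# Put the id column in front of the indicator columns
result <- cbind(df["id"], dummies)
print(result)
```

Because the comparison is vectorised, no per-row function call is needed, which tends to matter once the data frame has many rows.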


2022-09-29 21:45

With {tidyverse}, would it look something like this?

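library(dplyr)  # provides %>%, mutate_each() and funs() (both deprecated in recent dplyr releases)

tidyr::spread(df, category, category) %>%
  mutate_each(funs(ifelse(is.na(.), 0, 1)), -id)  # NA becomes 0, anything else 1; id is left alone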
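With more recent tidyr releases, the same reshaping can be sketched with pivot_wider(), which fills the missing cells directly so no NA replacement step is needed. This assumes tidyr 1.1.0 or later (for names_sort and a scalar values_fill); the helper column value and the name wide are illustrative only.

```r
library(dplyr)
library(tidyr)

df <- data.frame(id = 1:6, category = c(3, 2, 3, 1, 2, 2))

wide <- df %>%
  mutate(value = 1) %>%          # helper column: 1 marks "this row has this category"
  pivot_wider(
    names_from   = category,     # one new column per category value
    values_from  = value,
    names_prefix = "category",   # category1, category2, ...
    names_sort   = TRUE,         # order the new columns by category value
    values_fill  = 0             # combinations that do not occur become 0 instead of NA
  )

print(wide)
```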


2022-09-29 21:45

You can use model.matrix from base R:

x <- data.frame(id = 1:6, category = c(3, 2, 3, 1, 2, 2))

Convert category to a factor:

x$category <- factor(x$category)

model.matrix(~ category - 1, data = x)

#   category1 category2 category3
# 1         0         0         1
# 2         0         1         0
# 3         0         0         1
# 4         1         0         0
# 5         0         1         0
# 6         0         1         0

The -1 removes the intercept, so that every level of category gets its own column.
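To make the effect of the -1 concrete, here is a small sketch comparing the column names with and without the intercept, and putting the id column back in front. The variable names with_intercept, no_intercept and result are chosen for illustration.

```r
x <- data.frame(id = 1:6, category = factor(c(3, 2, 3, 1, 2, 2)))

# With the intercept, the first level is absorbed into "(Intercept)"
with_intercept <- model.matrix(~ category, data = x)
print(colnames(with_intercept))

# Without the intercept, each level gets its own 0/1 column
no_intercept <- model.matrix(~ category - 1, data = x)
print(colnames(no_intercept))

# model.matrix() returns only the dummy columns, so bind id back on
result <- cbind(x["id"], no_intercept)
print(result)
```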


2022-09-29 21:45

I am the author of the makedummies package.
If you have more than one column you want to convert to dummy variables, try makedummies.

It was recently registered on CRAN.


2022-09-29 21:45

This task is called creating dummy variables, and packages such as dummy and caret have functions for it. My top recommendation is the makedummies package.

# install.packages("makedummies")
library(makedummies)

df$category <- as.factor(df$category)  # makedummies() expands factor columns horizontally into 0/1 columns
makedummies(df, basal_level = TRUE)    # basal_level = TRUE also creates the column for the first category

#   id category_1 category_2 category_3
# 1  1          0          0          1
# 2  2          0          1          0
# 3  3          0          0          1
# 4  4          1          0          0
# 5  5          0          1          0
# 6  6          0          1          0


2022-09-29 21:45



© 2024 OneMinuteCode. All rights reserved.