Conditionally deleting rows from a cross-tabulation in R

Asked 2 years ago, Updated 2 years ago, 32 views

I'd like to delete the rows where X*Y is less than 10 from the result of a cross-tabulation (table(X, Y)) in R. Is there a way to do it all at once?

r

2022-09-30 13:45

1 Answer

First, I'd like to confirm that the table function returns frequencies (how many rows in the original data have each combination of X and Y), not X times Y. Here's an example: the combination mpg = 10.4 and cyl = 8 occurs twice in the original data, so 2 is displayed. The product 10.4 * 8 = 83.2 does not appear.

library(tidyverse)
head(table(mtcars$mpg,mtcars$cyl))

       4 6 8
  10.4 0 0 2
  13.3 0 0 1
  14.3 0 0 1
  14.7 0 0 1
  15   0 0 1
  15.2 0 0 2
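
As a quick check of this point, the following base R sketch compares one table cell with a direct count of the matching rows. It uses only the built-in mtcars data; the names tab and direct_count are chosen here for illustration.

```r
# Cross-tabulate mpg against cyl, as in the example above
tab <- table(mtcars$mpg, mtcars$cyl)

# Count the rows with mpg 10.4 and cyl 8 directly
direct_count <- sum(mtcars$mpg == 10.4 & mtcars$cyl == 8)

# The table cell holds the same count, not the product 10.4 * 8
tab["10.4", "8"]
direct_count
```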

Am I correct in understanding that the values you want to check against the threshold of 10 are all frequencies? I'll assume yes and proceed.

library(tidyverse)
# Package containing the cross-tabulation function tabyl
library(janitor)
# Penguin dataset package (for this illustration)
library(palmerpenguins)
# For illustration purposes, the threshold is 40 instead of 10.
criteria1 <- 40
# TRUE for counts that meet the threshold
high_low <- function(x) { x >= criteria1 }

You can also filter the output of table, but janitor::tabyl is more convenient for subsequent processing like this, because it returns a data frame. For example, 44 penguins were observed on Biscoe Island in 2007.

tab1 <- tabyl(penguins, island, year)
tab1
    island 2007 2008 2009
    Biscoe   44   64   60
     Dream   46   34   44
 Torgersen   20   16   16

To keep the rows in which at least one count meets the threshold, use if_any. Torgersen Island is deleted.

tab2 <- tab1 %>% filter(if_any(c(2:ncol(.)), high_low))
tab2
 island 2007 2008 2009
 Biscoe   44   64   60
  Dream   46   34   44

To keep only the rows in which every count meets the threshold, use if_all. Dream and Torgersen islands are deleted.

tab3 <- tab1 %>% filter(if_all(c(2:ncol(.)), high_low))
tab3
 island 2007 2008 2009
 Biscoe   44   64   60
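
For comparison, the same two filters can be sketched in base R without janitor or dplyr. The matrix below is typed in by hand from the penguin counts in this answer, and the names counts, threshold, meets, keep_any and keep_all are illustrative rather than part of the original code.

```r
# Counts copied by hand from the answer (rows: island, columns: year)
counts <- matrix(
  c(44, 64, 60,
    46, 34, 44,
    20, 16, 16),
  nrow = 3, byrow = TRUE,
  dimnames = list(island = c("Biscoe", "Dream", "Torgersen"),
                  year = c("2007", "2008", "2009"))
)
threshold <- 40

# Logical matrix: TRUE where a cell meets the threshold
meets <- counts >= threshold

# Like if_any: keep rows with at least one qualifying cell
keep_any <- counts[rowSums(meets) > 0, , drop = FALSE]

# Like if_all: keep rows where every cell qualifies
keep_all <- counts[rowSums(meets) == ncol(counts), , drop = FALSE]

rownames(keep_any)  # "Biscoe" "Dream"
rownames(keep_all)  # "Biscoe"
```

The same row indexing should also work directly on the object returned by table(X, Y), since a two-way table can be indexed like a matrix.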

Alternatively, you can keep the rows and replace only the cells below the threshold with NA. This starts from the tab2 created earlier.

tab4 <- tab2
# Suppress warnings; the comparison below also covers the non-numeric island column
options(warn = -1)
tab4[tab4 < criteria1] <- NA
tab4
 island 2007 2008 2009
 Biscoe   44   64   60
  Dream   46   NA   44
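
A variant that avoids turning warnings off is to apply the replacement only to the numeric columns. This is a base R sketch under stated assumptions: the data frame is entered by hand from the two rows kept by the if_any filter, and tab2_manual, threshold, is_count and masked are hypothetical names.

```r
# Hand-entered copy of the filtered table
# (check.names = FALSE keeps the year column names as they are)
tab2_manual <- data.frame(
  island = c("Biscoe", "Dream"),
  `2007` = c(44, 46),
  `2008` = c(64, 34),
  `2009` = c(60, 44),
  check.names = FALSE
)
threshold <- 40

# Identify the count columns so the island column is never compared
is_count <- vapply(tab2_manual, is.numeric, logical(1))

# Replace counts below the threshold with NA, column by column
masked <- tab2_manual
masked[is_count] <- lapply(masked[is_count], function(col) {
  replace(col, col < threshold, NA)
})
masked
```

Only the Dream count for 2008 (34) falls below the threshold here, so it is the only cell that becomes NA.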

Choose whichever treatment suits your needs.


2022-09-30 13:45



© 2024 OneMinuteCode. All rights reserved.