I want to draw a histogram using R.
The variables covered are not sequential variables that make sense to the magnitude of the numbers, but industry codes that fall under the category of "name variables" (ex. Manufacturing: 01, Finance: 19, Food: 38 etc.).
Finally, I would like to make it a diagram where the frequency is displayed for each industry code, but since the x-axis is considered as an order variable, the number that is flying in the industry code (which does not exist in the first place) remains as a numerical value.
I don't know how to specify the options when drawing a histogram of the name variables, so please let me know.
r
From the Industry Code Table (Japan Standard Industrial Classification) create test data.
df<-data.frame(
"Industry" = c(
Banking, food and beverage retail, electricity, real estate transactions,
"Information service industry", "railway industry", "water industry", "food manufacturing industry"
),
"Industry Code" = c(
"62", "58", "33", "68", "39", "42", "36", "09"
)
)
# Sample from df["Industry Code"]
data<-df [sample(1:nrow(df), size=200, replace=T), "Industry Code"]
print(length(data))
# [1] 200
print (data[1:20])
# [1] "33" "39" "36" "33" "39" "36" "36" "68" "36" "58" "68" "09" "33" "09" "58"
# [16] "58" "39" "62" "58" "33"
Categorize the data with factor()
.
data<-factor(data,df[, "industry code"])
plot(data,ylim=c(0,35),xaxt="n")
text((1:(length(levels(data)))))))*1.2, par("usr")[3]-0.8,
labels=df[, "industry", srt=45, pos=2,xpd=T)
© 2024 OneMinuteCode. All rights reserved.