In bash, look up keywords in a file and remove the part a given string shares with them

Asked 1 year ago, Updated 1 year ago, 65 views

[Contents]
In bash, I have a file containing a list of keywords (they include Japanese text and spaces, one keyword per line). I would like to compare a given string (which includes Japanese text, spaces, hyphens, and so on) against that list and use the match to replace or otherwise modify the given string.

[Example]
List of keywords (hoge_list)

Chuo Ward, Tokyo
Meguro Ward, Tokyo
Suginami Ward, Tokyo
Yokohama City, Kanagawa Prefecture

Given string ($INPUT)

 "Suginami Ward, Tokyo Asagaya 1-2-3"

The part in common with a keyword in the list is "Suginami Ward, Tokyo", so I would like to remove that part and print the remainder:

Asagaya 1-2-3

Thank you for your cooperation.

linux bash shellscript

2022-09-30 19:26

3 Answers

Treating hoge_list as a dictionary, searching it for a match, and then deleting that match is rather cumbersome, so simplify the problem into these rules:

  • If the string starts with Chuo Ward, Tokyo, delete that prefix
  • If the string starts with Meguro Ward, Tokyo, delete that prefix
  • If the string starts with Suginami Ward, Tokyo, delete that prefix
  • If the string starts with Yokohama City, Kanagawa Prefecture, delete that prefix

sed is handy for this kind of match-and-delete:

s|^Chuo Ward, Tokyo||
s|^Meguro Ward, Tokyo||
s|^Suginami Ward, Tokyo||
s|^Yokohama City, Kanagawa Prefecture||

Save these lines in a file named hoge_list.sed, then run

 echo 'Suginami Ward, Tokyo Asagaya 1-2-3' | sed -f hoge_list.sed

and you get the remainder, still carrying the space that followed the keyword:

  Asagaya 1-2-3

Since the real hoge_list has quite a few lines, you will want to generate hoge_list.sed from it automatically.
Including that step, the shell script looks like this:

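#!/bin/bash

# Turn each keyword line into a sed command that deletes it as a prefix
sed -e 's/\(.*\)/s|^\1||/' hoge_list > hoge_list.sed

# Apply the generated script to stdin, then strip one leading space
sed -f hoge_list.sed | sed -e 's/^ //'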

Save this script as test.sh, make it executable, and run it to see the output:

$ echo 'Suginami Ward, Tokyo Asagaya 1-2-3' | ./test.sh
Asagaya 1-2-3
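One thing the generated script does not guard against is a keyword that contains regular-expression metacharacters (such as `.` or `*`) or the `|` used as the sed delimiter. The following is only a sketch of one way to handle that, assuming basic regular expressions as in the script above. The function names `make_prefix_script` and `strip_prefix` are hypothetical, and the list is written to a temporary file so the example runs on its own.

```shell
#!/bin/bash
# Hypothetical helper: print a sed script that deletes each keyword in
# file $1 as a literal prefix. BRE metacharacters and the '|' delimiter
# are backslash-escaped before the keyword is wrapped in s|^...||.
make_prefix_script() {
  sed -e 's/[][\.*^$|]/\\&/g' -e 's/.*/s|^&||/' "$1"
}

# Hypothetical helper: strip a listed prefix (and one following space)
# from each line of stdin. $1 is the keyword list file.
strip_prefix() {
  script=$(mktemp)
  make_prefix_script "$1" > "$script"
  sed -f "$script" | sed -e 's/^ //'
  rm -f "$script"
}

# Demo with a throwaway copy of the list from the question
list=$(mktemp)
trap 'rm -f "$list"' EXIT
printf '%s\n' 'Chuo Ward, Tokyo' 'Meguro Ward, Tokyo' \
  'Suginami Ward, Tokyo' 'Yokohama City, Kanagawa Prefecture' > "$list"
echo 'Suginami Ward, Tokyo Asagaya 1-2-3' | strip_prefix "$list"
```

With the list from the question the escaping step changes nothing. It only matters once a keyword contains one of the escaped characters.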


2022-09-30 19:26

I tried it with awk.

$ INPUT='Suginami Ward, Tokyo Asagaya 1-2-3'
$ awk -v input="$INPUT" '
    input ~ "^" $0 {
      sub("^" $0 "[ \t]*", "", input)
      exit
    }
    END { print input }
  ' hoge_list

Asagaya 1-2-3
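The awk command above treats each keyword as a regular expression. If the keywords should be compared literally instead, a variant based on `index` and `substr` might look like the sketch below. The function name `strip_prefix_awk` is hypothetical, and the list is written to a temporary file so the example is self-contained.

```shell
#!/bin/bash
# Hypothetical helper: print $1 with the first matching keyword from list
# file $2 removed from its start. The comparison is literal (index/substr),
# so regex metacharacters in a keyword have no special meaning.
# Note: awk -v still interprets backslash escapes in the input string.
strip_prefix_awk() {
  awk -v input="$1" '
    length($0) > 0 && index(input, $0) == 1 {
      input = substr(input, length($0) + 1)   # drop the keyword
      sub(/^[ \t]+/, "", input)               # drop the blanks after it
      exit                                    # END still runs after exit
    }
    END { print input }
  ' "$2"
}

# Demo with a throwaway copy of the list from the question
list=$(mktemp)
trap 'rm -f "$list"' EXIT
printf '%s\n' 'Chuo Ward, Tokyo' 'Meguro Ward, Tokyo' \
  'Suginami Ward, Tokyo' 'Yokohama City, Kanagawa Prefecture' > "$list"
strip_prefix_awk 'Suginami Ward, Tokyo Asagaya 1-2-3' "$list"
```

A string that starts with none of the keywords is printed unchanged, because `input` is never modified before the `END` rule runs.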


2022-09-30 19:26

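If you want to do it using only bash features, I think it would look like this:

#!/bin/bash
INPUT="Suginami Ward, Tokyo Asagaya 1-2-3"
shopt -s extglob                 # enable @(a|b) patterns
list=$(<hoge_list)               # read the whole keyword file
pattern=${list//$'\n'/|}         # join the lines with |
# left unquoted, so word splitting drops the leading space
echo ${INPUT#@($pattern)}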
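The extglob approach turns every keyword into part of a glob pattern, so characters such as `*` or `?` in a keyword would act as wildcards. As an alternative sketch that matches each keyword literally, the list can be read line by line and the prefix removed with a quoted expansion. The function name `strip_prefix_bash` is hypothetical, and the list is written to a temporary file so the example runs on its own.

```shell
#!/bin/bash
# Hypothetical helper: print $1 with the first keyword from list file $2
# that it starts with removed. Quoting "$kw" makes the match literal.
strip_prefix_bash() {
  input=$1
  while IFS= read -r kw || [ -n "$kw" ]; do
    [ -n "$kw" ] || continue              # skip blank lines
    case $input in
      "$kw"*)
        input=${input#"$kw"}              # remove the keyword
        input=${input#"${input%%[! ]*}"}  # remove the spaces after it
        break ;;
    esac
  done < "$2"
  printf '%s\n' "$input"
}

# Demo with a throwaway copy of the list from the question
list=$(mktemp)
trap 'rm -f "$list"' EXIT
printf '%s\n' 'Chuo Ward, Tokyo' 'Meguro Ward, Tokyo' \
  'Suginami Ward, Tokyo' 'Yokohama City, Kanagawa Prefecture' > "$list"
strip_prefix_bash 'Suginami Ward, Tokyo Asagaya 1-2-3' "$list"
```

Unlike the unquoted `echo`, this version removes only the spaces directly after the keyword and leaves any other whitespace in the string as it is.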


2022-09-30 19:26



© 2024 OneMinuteCode. All rights reserved.