How can I use regular expressions in Python to extract numbers together with their trailing units?

Asked 1 year ago, Updated 1 year ago, 405 views

Run Environment

Windows 10 Python 3.X

What I want to do

Using re.findall, I would like to extract each numeric value from the string below together with its unit (ms or MB) and collect the results in a list. I was able to extract the numbers, but not the units.

I want to capture the number and the unit that follows it at the same time, so that the result looks like the list below, but I don't know how to do that. It may be a rudimentary question, but I would appreciate any guidance.

 'Duration: 123.45 ms\tBuild Duration: 123 ms\tMemory Size: 123 MB\tMax Memory Used: 123 MB\tInit Duration: 1234.56 ms'
['123.45ms', '123ms', '123MB', '123MB', '1234.56ms']

What I tried

After learning about regular expressions, I was able to get numbers first.

 string='Duration: 123.45 ms\tBuild Duration: 123 ms\tMemory Size: 123 MB\tMax Memory Used: 123 MB\tInit Duration: 1234.56 ms'
stringlist=re.findall(r'\d+\.\d+|\d+',string)

Next, since I understood that $ matches at the end, I wrote the following, but the resulting list was empty.

 string='Duration: 123.45 ms\tBuild Duration: 123 ms\tMemory Size: 123 MB\tMax Memory Used: 123 MB\tInit Duration: 1234.56 ms'
stringlist=re.findall(r'\d+\.\d+ms$|\d+ms$',string)

python regular-expression

2022-12-17 17:24

2 Answers

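import re

string = 'Duration: 123.45 ms\tBuild Duration: 123 ms\tMemory Size: 123 MB\tMax Memory Used: 123 MB\tInit Duration: 1234.56 ms'
# Group 1 is the number, group 2 is everything up to the next tab or the end of the string.
pairs = re.findall(r'(\d+(?:\.\d+)?) (.+?)(?=\t|$)', string)
# findall returns (number, unit) tuples here; join each pair to drop the space.
stringlist = [number + unit for number, unit in pairs]
print(stringlist)

# ['123.45ms', '123ms', '123MB', '123MB', '1234.56ms']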

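When using re.split()

# Split on ':' or tab, keep every second field (the values), then remove the spaces.
stringlist = [s.replace(' ', '') for s in re.split(r':|\t', string)[1::2]]
print(stringlist)

# ['123.45ms', '123ms', '123MB', '123MB', '1234.56ms']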
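
As a note on why the attempt with $ in the question returned an empty list: $ matches only at the end of the whole string (or at the end of each line when re.MULTILINE is set), and the pattern leaves no room for the space between the number and the unit. The following is a minimal sketch of a findall-only alternative, assuming the unit is always ms or MB and is separated from the number by optional whitespace. The names failed, pairs, and values are illustrative and do not come from the question.

```python
import re

string = 'Duration: 123.45 ms\tBuild Duration: 123 ms\tMemory Size: 123 MB\tMax Memory Used: 123 MB\tInit Duration: 1234.56 ms'

# The attempt from the question: no space is allowed before "ms", and $ ties
# every alternative to the very end of the string, so nothing matches.
failed = re.findall(r'\d+\.\d+ms$|\d+ms$', string)

# Capture the number and the unit as two groups, allowing whitespace between
# them. With two groups, findall returns (number, unit) tuples.
pairs = re.findall(r'(\d+(?:\.\d+)?)\s*(ms|MB)', string)
values = [number + unit for number, unit in pairs]

print(failed)
print(values)
```

Because the units are listed explicitly, no end anchor is needed at all; the pattern simply stops after ms or MB.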


2022-12-18 00:13

Any approach would work, but I thought it would be better to use Match objects. (Contrary to what the question asks for, this does not use findall.)

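Assumptions about the input:

  • Numbers are unsigned
  • The leading 0 is not omitted (no values such as .5)
  • No scientific notation
  • Exactly one space between the number and the unit
  • Units are ms/MB only
  • The characters that follow the unit are not checked
  • The unit may be absent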
import re
string = 'Duration: 123.45 ms\tBuild Duration: 123 ms\tMemory Size: 123 MB\tMax Memory Used: 123 MB\tInit Duration: 1234.56 ms'
# m[1] is the number, m[2] is the unit (None when no unit follows the number).
stringlist = [m[1] + (m[2] or '') for m in re.finditer(r'(\d+(?:\.\d*)?)(?: (ms|MB))?', string)]
print(stringlist)

# ['123.45ms', '123ms', '123MB', '123MB', '1234.56ms']
  • The number and the unit can be retrieved separately (because each result is a Match object)
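
To illustrate retrieving the number and the unit separately, here is a small sketch that reads the two capture groups from each Match object. It assumes the same input format as above; the names pattern and records are illustrative, and converting the number with float is an added assumption about how the values would be used.

```python
import re

string = 'Duration: 123.45 ms\tBuild Duration: 123 ms\tMemory Size: 123 MB\tMax Memory Used: 123 MB\tInit Duration: 1234.56 ms'

# Group 1: the number. Group 2: the unit, optional, preceded by one space.
pattern = re.compile(r'(\d+(?:\.\d*)?)(?: (ms|MB))?')

# Build (number, unit) pairs; the unit is None when no unit follows the number.
records = [(float(m.group(1)), m.group(2)) for m in pattern.finditer(string)]

print(records)
# [(123.45, 'ms'), (123.0, 'ms'), (123.0, 'MB'), (123.0, 'MB'), (1234.56, 'ms')]
```

Keeping the number and the unit apart makes it straightforward to, for example, sum all ms values or filter by unit afterwards.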


2022-12-18 07:04


