I would like to extract only the necessary information from the body of an Outlook email.
The part I want to extract contains either a number (an integer part of up to 5 digits plus one decimal place) or a hyphen (-), and at the moment I can extract only the numbers.
I would appreciate it if you could show me how to write a regular expression that handles both cases.
import pandas as pd
import re
data = """Name:\r\nsatou:\r\nLast Score : 14686.5 points\r\nThis Score : 8992.5 points\r\n\r\nName:\r\ntanaka:\r\nLast Score : 778.5 points\r\nThis Score : 82.5 points\r\n\r\nName:\r\nsuzuki:\r\nLast Score : - points\r\nThis Score : 9.5 points\r\n\r\n"""
# [\d.]+ matches numeric scores only, so the suzuki record ("Last Score : - points") is skipped.
ptn = r"Name:\r\n(.*?):\r\nLast Score\s:\s*([\d.]+)\s*points\r\nThis Score\s:\s*([\d.]+)\s*(\w+)"
output_data = pd.DataFrame(re.findall(ptn, data, re.M | re.DOTALL))
output_data = output_data.rename(columns={0: "Name", 1: "Last Score", 2: "This Score", 3: "Unit"})
import pandas as pd
import re
# Keeps columns aligned when the real email body contains East Asian (full-width) characters.
pd.set_option('display.unicode.east_asian_width', True)
data = """Name:\r\nsatou:\r\nLast Score : 14686.5 points\r\nThis Score : 8992.5 points\r\n\r\nName:\r\ntanaka:\r\nLast Score : 778.5 points\r\nThis Score : 82.5 points\r\n\r\nName:\r\nsuzuki:\r\nLast Score : - points\r\nThis Score : 9.5 points\r\n\r\n"""
# Named groups become the column names; .+? accepts both a number and a lone hyphen.
output_data = pd.DataFrame([
    m.groupdict() for m in re.finditer(
        r'Name:\r\n(?P<name>.+?):\r\n'
        r'Last Score : *(?P<last_score>.+?) *points\r\n'
        r'This Score : *(?P<this_score>.+?) *'
        r'(?P<unit>\w+)(?=\r\n)',
        data)
])
print(output_data)
#
#      name  last_score  this_score    unit
# 0   satou     14686.5      8992.5  points
# 1  tanaka       778.5        82.5  points
# 2  suzuki           -         9.5  points
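The extracted scores are still strings, so the hyphen has to be handled before doing any arithmetic on them. The following is a minimal sketch of one way to do that using only the standard library. The helper name `to_score` and the dictionary keys are assumptions made for illustration and are not part of the original answer.

```python
def to_score(text):
    """Return the score as a float, or None when the field holds only '-'."""
    text = text.strip()
    if text == "-":
        return None
    return float(text)


# Rows shaped like the extraction result (the key names are hypothetical).
rows = [
    {"name": "satou", "last_score": "14686.5", "this_score": "8992.5"},
    {"name": "tanaka", "last_score": "778.5", "this_score": "82.5"},
    {"name": "suzuki", "last_score": "-", "this_score": "9.5"},
]

converted = [
    {
        "name": row["name"],
        "last_score": to_score(row["last_score"]),
        "this_score": to_score(row["this_score"]),
    }
    for row in rows
]

for row in converted:
    print(row)
```

If the data is already in a DataFrame, `pd.to_numeric(column, errors="coerce")` should give a similar result, with NaN in place of the hyphen.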
If the problem is that suzuki's "Last Score : - points" cannot be extracted by the regular expression, rewrite ([\d.]+)
so that it matches the minus sign as well as numbers.
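Before:
Score\s:\s*([\d.]+)\s*
After (also accepts a lone hyphen):
Score\s:\s*([\d.]+|-)\s*
Stricter variants that limit the value to at most 5 integer digits, first with a required decimal point and then with an optional single decimal digit:
Score\s:\s*(\d{1,5}\.\d?|-)\s*points
Score\s:\s*(\d{1,5}(?:\.\d)?|-)\s*points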
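To check the effect of the rewrite, the patterns above can be run against the sample data. This is only a sketch: the sample text is an English rendering of the email body that assumes the layout "Last Score : <value> points", and `build_pattern` is a hypothetical helper introduced here so that the value sub-pattern can be swapped.

```python
import re

# Sample email body (assumed layout; a real Outlook body may differ).
data = (
    "Name:\r\nsatou:\r\nLast Score : 14686.5 points\r\nThis Score : 8992.5 points\r\n\r\n"
    "Name:\r\ntanaka:\r\nLast Score : 778.5 points\r\nThis Score : 82.5 points\r\n\r\n"
    "Name:\r\nsuzuki:\r\nLast Score : - points\r\nThis Score : 9.5 points\r\n\r\n"
)


def build_pattern(value):
    """Hypothetical helper: wrap a value sub-pattern in the full record pattern."""
    return (
        r"Name:\r\n(.*?):\r\n"
        r"Last Score\s:\s*(" + value + r")\s*points\r\n"
        r"This Score\s:\s*(" + value + r")\s*(\w+)"
    )


# Numbers only: the record whose score is "-" is not matched.
original = re.findall(build_pattern(r"[\d.]+"), data)

# Numbers or a lone hyphen.
loose = re.findall(build_pattern(r"[\d.]+|-"), data)

# At most 5 integer digits, an optional single decimal digit, or a lone hyphen.
strict = re.findall(build_pattern(r"\d{1,5}(?:\.\d)?|-"), data)

print(len(original))  # 2
print(len(loose))     # 3
print(loose[2])       # ('suzuki', '-', '9.5', 'points')
print(strict == loose)  # True
```

The loose form also accepts strings such as "1.2.3", so the strict form is the safer choice when the value is known to have at most 5 integer digits and one decimal place.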