First question
import re
def commaParse(num):
    return re.findall(r'(?=(\d{3}))', num)
a = commaParse('100000000')
print(a)
Result: ['100', '000', '000', '000', '000', '000', '000']
Second question
import re
def commaParse(num):
    return re.findall(r'(?=\d{3})', num)
a = commaParse('100000000')
print(a)
Result: ['', '', '', '', '', '', '']
Here is my question. It may look trivial, but I could not find an answer no matter how much I searched.
The only difference between the two patterns is whether the lookahead contains capturing parentheses, (?=\d{3}) versus (?=(\d{3})), yet the results are very different, and I want to know why.
For the first pattern, since it is a positive lookahead, I expected ['100', '000', '000'], but I do not understand why I get the output above.
For the second pattern, I know that a successful lookahead match returns an empty string, so I expected ['', '', '']. Why are there more?
Positive lookahead
There are plenty of explanations online, so I will skip the details.
word(?=ahead)
This regular expression matches word, but only where word is immediately followed by ahead. The ahead part is checked but is not included in the match.
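As a small sketch of that behaviour (the sample text is made up for illustration), only the occurrence of word that is directly followed by ahead matches, and the match contains just word:

```python
import re

# Hypothetical sample text: the first "word" is followed by "ahead", the second is not.
text = "wordahead word"

# Collect the start index and matched text of every match.
matches = [(m.start(), m.group()) for m in re.finditer(r"word(?=ahead)", text)]
print(matches)  # [(0, 'word')]
```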
The regular expression you wrote has nothing in the position of word, that is, no pattern that is actually matched and consumed. Its intent is therefore unclear.
In (?=(\d{3})), however, the inner () parentheses are a capturing group. When a pattern contains a capturing group, re.findall returns the captured text instead of the whole match, which is why you get digits rather than empty strings. That is not the match you intended.
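The effect of a capturing group on re.findall can be seen in isolation with an ordinary, consuming pattern. This is only an illustrative sketch; the input string and variable names are assumptions, not part of the question:

```python
import re

digits = "123456"

# No capturing group: findall returns the whole text of each match.
whole = re.findall(r"\d\d\d", digits)

# One capturing group: findall returns only what the group captured.
grouped = re.findall(r"\d(\d)\d", digits)

print(whole)    # ['123', '456']
print(grouped)  # ['2', '5']
```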
Moreover, the text tested by a (?=...) lookahead is not consumed, so the same digits remain available to later matches.
As a result, contrary to your intention, there are seven matches.
Had each match consumed three digits, the search would resume after them and you would get three groups. Because the pattern contains no part that is actually matched and consumed, the search advances one position at a time and captures the three digits visible from each position: 100 at index 0, then 000 at each of indexes 1 through 6.
At each position, the match itself is the empty string immediately before those three digits. The digits are returned only because of the () capturing parentheses, so there are 7 results in total.
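These positions can be checked with re.finditer, which exposes both the zero-width match and the captured group. A minimal sketch, with variable names chosen here for illustration:

```python
import re

num = "100000000"

rows = []
for m in re.finditer(r"(?=(\d{3}))", num):
    # m.group(0) is the zero-width match itself (always empty here);
    # m.group(1) is the text captured inside the lookahead.
    rows.append((m.start(), m.group(0), m.group(1)))

for row in rows:
    print(row)
```

Each printed row holds the index, the empty match, and the three captured digits. Indexes 7 and 8 produce nothing because fewer than three digits remain there.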
This also explains why your second attempt, (?=\d{3}) without the capturing parentheses, returns 7 empty strings: the same seven zero-width matches occur, but there is no group to report.
If you simply want to match three digits at a time, as originally intended, \d{3} on its own is enough.
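As a sketch, here is the function from the question with the lookahead removed. The second call is an added illustration of one caveat: matching proceeds from the left, so any leftover digits at the end are dropped when the length is not a multiple of three.

```python
import re

def commaParse(num):
    # Each match consumes three digits, so matches do not overlap.
    return re.findall(r"\d{3}", num)

a = commaParse("100000000")
print(a)  # ['100', '000', '000']

# Seven digits: the final '0' cannot form a group of three and is not returned.
b = commaParse("1000000")
print(b)  # ['100', '000']
```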