String search with regular expressions on a pandas Series (Python 3.8.5)

Checking the data with type() shows <class 'pandas.core.series.Series'>.
I would like to run a string search on this data.

 match = Series_data[Series_data == "F:\\"]

and got a result (so far it looks right).

In practice the drive letter is not fixed: sometimes it is F, sometimes another letter.
So I'm wondering whether I can use a regular expression inside the [].

 match = Series_data[re.search(':\\', Series_data)]

This doesn't work; I get the following message:

 raise error("bad escape (end of pattern)",
 re.error: bad escape (end of pattern) at position 1

Is the regular expression malformed, or can regular expressions not be used here at all? I can't tell.

Thank you for your cooperation.

===========================================================================

Thank you to everyone who responded. I think I understand now. I tried

 match = Series_data[Series_data.str.contains(r':\\')]

and it worked.
It feels a bit odd that r':\\' still needs two backslashes even with the r prefix, but it works for now.
If I write r':\' instead, I get this error:

 SyntaxError: EOL while scanning string literal
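
The two observations above come from two separate layers of escaping: the Python string literal and the regular expression engine. The sketch below is only an illustration (the sample path is made up) of what each spelling actually hands to re.

```python
import re

# Layer 1: the Python literal. Without the r prefix, '\\' is ONE backslash.
plain = ':\\'   # two characters: colon, backslash
raw = r':\\'    # three characters: colon, backslash, backslash
print(len(plain), len(raw))  # 2 3

# Layer 2: the regex engine. A pattern that ends in a lone backslash is an
# incomplete escape, which is the error shown in the question.
try:
    re.compile(plain)
    message = None
except re.error as exc:
    message = str(exc)
print(message)  # bad escape (end of pattern) at position 1

# The raw form reaches the engine as colon + escaped backslash,
# i.e. "a colon followed by one literal backslash".
sample = 'F:\\Documents'  # hypothetical path, for illustration only
print(bool(re.search(raw, sample)))  # True

# re.escape builds a safe pattern from literal text, so the backslashes
# do not have to be counted by hand.
print(bool(re.search(re.escape(':\\'), sample)))  # True
```

r':\' fails earlier, at the Python level: a raw string literal cannot end in an odd number of backslashes, because the final backslash stops the closing quote from ending the literal.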

python python3 pandas regular-expression

2022-09-30 19:51

3 Answers

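If the strings are file paths, it may be easier to use / instead of \ as the separator, so that no backslash has to be escaped. Note that re.search takes a single string, not a Series, so the search goes through the .str accessor:

 match = Series_data[Series_data.str.contains(':/')]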
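
If the data already contains backslash paths, as in the question, one way to follow this suggestion is to normalise the separators first. This is a minimal sketch on a small made-up Series; the sample values and the name normalized are illustrative assumptions, not part of the original post.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
Series_data = pd.Series([
    'F:\\Documents\\a.pdf',
    'C:\\Projects\\b.sln',
    'data_path',
])

# Replace every backslash with a forward slash.
# regex=False makes '\\' (one backslash) a literal, not a regex pattern.
normalized = Series_data.str.replace('\\', '/', regex=False)

# With forward slashes, the search text needs no backslash escaping.
mask = normalized.str.contains(':/', regex=False)

# Index the ORIGINAL Series so the result keeps the original spelling.
match = Series_data[mask]
print(match)
```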

2022-09-30 19:51

pandas provides its own equivalents of the in operator and of regular-expression matching through the Series.str accessor.
They behave a little differently from their plain-Python counterparts.

import pandas as pd

s1 = pd.Series(['aaa', 'bbb', 'abb', 'ccc'])
s1[s1.str.match('.bb')]  # regex, anchored at the start of each string
# 1    bbb
# 2    abb
# dtype: object

s1[s1.str.contains('bb')]  # regex=False and other options are available
# 1    bbb
# 2    abb
# dtype: object

Reference:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.contains.html
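
To connect these methods to the drive-letter question, here is a minimal sketch. The sample values are made up, and the pattern [A-Za-z]:\\ is an assumption about what counts as a drive prefix (one ASCII letter, a colon, a backslash).

```python
import pandas as pd

# Hypothetical sample data, including a missing value.
paths = pd.Series([
    'F:\\Documents\\a.pdf',
    'C:\\Projects\\b.sln',
    '\\Windows\\Python.exe',
    None,
])

# Literal substring test: with regex=False the backslash needs no regex
# escaping. na=False counts missing values as "no match".
has_colon_backslash = paths.str.contains(':\\', regex=False, na=False)

# str.match is a regex test anchored at the start of each string, so this
# only accepts a drive prefix in the first three characters.
has_drive = paths.str.match(r'[A-Za-z]:\\', na=False)

print(paths[has_drive])
```

Without na=False, the missing entry would produce a missing value in the mask, which cannot be used directly for boolean indexing.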

2022-09-30 19:51

If you want to use re.search, you can apply it to each element with pandas.Series.apply.

import pandas as pd
import re

Series_data = pd.Series([
  'F:\\Documents\\Newsletters\\Summer 2018.pdf',
  'C:\\Projects\\apilibrary\\apilibrary.sln',
  '\\Windows\\Python.exe',
  'data_path'
])

>>> Series_data[Series_data.apply(lambda x: bool(re.search(r':\\', x)))]
0    F:\Documents\Newsletters\Summer 2018.pdf
1       C:\Projects\apilibrary\apilibrary.sln
dtype: object

You can also use os.path.splitdrive instead of re.search (ntpath.splitdrive on non-Windows operating systems).

import pandas as pd
import ntpath

>>> Series_data[Series_data.apply(lambda x: bool(ntpath.splitdrive(x)[0]))]
0    F:\Documents\Newsletters\Summer 2018.pdf
1       C:\Projects\apilibrary\apilibrary.sln
dtype: object

2022-09-30 19:51
