Defining a function that checks whether a string matches or contains specific strings

Asked 2 years ago, Updated 2 years ago, 254 views

For a data frame containing the URLs below, I would like to create a flag column by checking each URL against specific strings. There are four cases: an exact match, a partial match, the string 'None', and everything else. The original data frame is as follows:

df=pd.DataFrame({'full_url':[')https://www.ABCDec.jp/shop/default.aspx', 'https://www.google.com/search', 'http://search.yahoo.co.jp/', 'https://www.ABCDec.jp/abcdSHOP/', 'https://www.google.co.jp/', 'None', 'https://www.google.com','https://www.google.co.jp/kfjdyes8&2222', 'https://www.google.com/search&ghsysusie?=sieueu3304', 'https://search.yahoo.com/smerudy&wkwjs8736?=dunx']})

For this data frame:
■ If 'full_url' exactly matches one of the following, type = 'Natural Search'
 urls_G=['https://www.google.com', 'https://www.google.co.jp/','https://www.google.com/search']

■ If 'full_url' contains one of the following, type = 'Natural Search'
 urls_Y=['search.yahoo.co.jp', 'search.yahoo.com']

■ If 'full_url' is 'None', type = 'No Input'

■ In all other cases, type = 'Other'

I would like to add this flag as a new column named type.

The contents of the data frame I would like to create are as follows.

ans=pd.DataFrame({'full_url':[')https://www.ABCDec.jp/shop/default.aspx', 'https://www.google.com/search', 'http://search.yahoo.co.jp/', 'https://www.ABCDec.jp/abcdSHOP/', 'https://www.google.co.jp/', 'None', 'https://www.google.com','https://www.google.co.jp/kfjdyes8&2222', 'https://www.google.com/search&ghsysusie?=sieueu3304', 'https://search.yahoo.com/smerudy&wkwjs8736?=dunx'],
                   'type': ['Other', 'Natural Search', 'Natural Search', 'Other', 'Natural Search', 'No Input', 'Natural Search', 'Other', 'Other', 'Natural Search']})

I came up with the code below.

urls_G = ['https://www.google.com', 'https://www.google.co.jp/', 'https://www.google.com/search']

urls_Y=['search.yahoo.co.jp', 'search.yahoo.com']

def get_type(x):
    if x == 'None':
        return 'No Input'
    elif x in urls_G:  # exact string match
        return 'Natural Search'
    elif x.str.contains(urls_Y):  # partial string match (this line raises the error)
        return 'Natural Search'
    else:
        return 'Other'
        

df['type'] = df['full_url'].apply(get_type)

However, the partial-match branch raises the following error, and I am stuck.
> AttributeError: 'str' object has no attribute 'str'

I would appreciate it if you could advise me on how to deal with this error.
Thank you for your cooperation.

python pandas

2022-09-30 21:53

1 Answer

Instead of

elif x.str.contains(urls_Y):  # partial string match

write

elif urls_Y[0] in x or urls_Y[1] in x:  # partial string match

or something along those lines. The in operator checks whether one string contains another.
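As a rough sketch of how this fix might look in the complete function, the version below uses any() so the check does not depend on urls_Y having exactly two entries. The any() generalization and the 'No Input' capitalization (taken from the desired output in the question) are assumptions, not part of the original answer.

```python
import pandas as pd

urls_G = ['https://www.google.com', 'https://www.google.co.jp/',
          'https://www.google.com/search']
urls_Y = ['search.yahoo.co.jp', 'search.yahoo.com']


def get_type(x):
    """Classify a single URL string (x is a plain Python str)."""
    if x == 'None':
        return 'No Input'
    elif x in urls_G:  # exact match against the list
        return 'Natural Search'
    elif any(u in x for u in urls_Y):  # substring match against any entry
        return 'Natural Search'
    else:
        return 'Other'


# Same URLs as in the question.
df = pd.DataFrame({'full_url': [
    ')https://www.ABCDec.jp/shop/default.aspx',
    'https://www.google.com/search',
    'http://search.yahoo.co.jp/',
    'https://www.ABCDec.jp/abcdSHOP/',
    'https://www.google.co.jp/',
    'None',
    'https://www.google.com',
    'https://www.google.co.jp/kfjdyes8&2222',
    'https://www.google.com/search&ghsysusie?=sieueu3304',
    'https://search.yahoo.com/smerudy&wkwjs8736?=dunx',
]})

# apply() calls get_type once per element, passing each URL as a str.
df['type'] = df['full_url'].apply(get_type)
```

Written this way, adding another entry to urls_Y requires no change to the function.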

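'str' object has no attribute 'str'

This error means that x is already a plain Python string, so it has no .str attribute. apply passes the elements of the column to the function one at a time. The .str accessor exists only on the pandas Series itself.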
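If you would rather keep .str.contains, it can be called on the whole column instead of inside apply. The following is only a sketch of that alternative, not what the answer above proposes: combining numpy.select with Series.isin and a regex built from urls_Y is an assumption made for illustration, and the four sample URLs are a subset of those in the question.

```python
import re

import numpy as np
import pandas as pd

urls_G = ['https://www.google.com', 'https://www.google.co.jp/',
          'https://www.google.com/search']
urls_Y = ['search.yahoo.co.jp', 'search.yahoo.com']

df = pd.DataFrame({'full_url': [
    'https://www.google.com/search',
    'http://search.yahoo.co.jp/',
    'None',
    'https://www.google.co.jp/kfjdyes8&2222',
]})

col = df['full_url']  # a Series, so the .str accessor is available here

# Escape each entry so '.' is matched literally, then join with '|' (regex OR).
pattern = '|'.join(re.escape(u) for u in urls_Y)

# Conditions are checked in order; the first one that is True wins.
conditions = [
    col == 'None',
    col.isin(urls_G),                       # exact match
    col.str.contains(pattern, regex=True),  # partial match
]
choices = ['No Input', 'Natural Search', 'Natural Search']

df['type'] = np.select(conditions, choices, default='Other')
```

This version works on the whole column at once, while the apply version is closer to the code in the question.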


2022-09-30 21:53
