I want to clean up a string scraped with Scrapy using re.sub before saving it. The scraped string looks like Hong Gil-dong (Hong Gil-dong), and I wrote code to remove the trailing (Hong Gil-dong) part so that only Hong Gil-dong is saved. It works as expected in the Python shell, but when I run the crawl the value is not saved properly. Below is my pipeline code.
import re

class NameRegexPipeline(object):
    def process_item(self, item, spider):
        pattern = re.compile(r'\s\(.*\)$')
        for n in item['name']:
            item_re = re.sub(pattern, ' ', n)
            item['name'] = item_re
        return item
I can't see anything wrong with the code, but the value still isn't saved properly.
scrapy python
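The code above works as intended only if the item passed to process_item has the form item = {"name": ["Hong Gil-dong (Hong Gil-dong)"]}, that is, when item['name'] is a list containing the string.
For example, if item = {"name": "Hong Gil-dong (Hong Gil-dong)"}, where the value is a plain string, the for loop iterates over the string one character at a time, so the pattern never matches and you won't get the result you want. Check what value is actually being passed in.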
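As a rough illustration of the point above, here is a minimal sketch of a pipeline that accepts either form of item['name']. It is not taken from the question: the helper name clean_name is hypothetical, a plain dict stands in for the Scrapy item, and the replacement is an empty string (with surrounding whitespace stripped first) rather than the single space used in the original code. Whether the spider produces a string or a list depends on how the field is extracted; for instance, a selector's .get() returns a string while .getall() returns a list.

```python
import re

# Same pattern as in the question: one whitespace character followed by a
# parenthesised suffix at the end of the string.
PATTERN = re.compile(r'\s\(.*\)$')


def clean_name(value):
    """Hypothetical helper: strip a trailing ' (...)' suffix.

    Accepts either a single string or a list of strings, so the pipeline
    behaves the same regardless of how the spider filled the field.
    """
    if isinstance(value, str):
        # Handle the whole string at once instead of looping over characters.
        return PATTERN.sub('', value.strip())
    return [PATTERN.sub('', v.strip()) for v in value]


class NameRegexPipeline:
    def process_item(self, item, spider):
        item['name'] = clean_name(item['name'])
        return item


if __name__ == '__main__':
    pipeline = NameRegexPipeline()
    # A plain dict is used here as a stand-in for a Scrapy item.
    print(pipeline.process_item({'name': 'Hong Gil-dong (Hong Gil-dong)'}, None))
    print(pipeline.process_item({'name': ['Hong Gil-dong (Hong Gil-dong)']}, None))
```

With this sketch, a string input yields a cleaned string and a list input yields a list of cleaned strings. If only one form is ever expected, it may be simpler to fix the spider so that it always yields that form.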