Scraping automatically translates some of the text into English

I have a question about Selenium scraping.

I am currently writing a scraping script for the ↓ page.
https://www.gakujo.ne.jp/2022/company/baseinfo/22242/

environment
MacBook Pro Google Chrome Jupiter Notebook
Selenium

symptoms
A part of the headline ("profile") is automatically translated into English.
↓Parts acquired this time

script

driver.get('https://www.gakujo.ne.jp/2022/company/baseinfo/22242/')
company_profile=driver.find_element_by_css_selector('#ctl00_ContentPlaceHolder1_MyBaseInfoCtl_pnlProfile')
company_profile=company_profile.text
print(company_profile)

↓Results

profile
[Medical × People × Technology = 】] What is the attraction of companies that continue to provide new value to the medical industry?
HR talks "Pay attention here!"
"Leading company" challenges Japan's medical community
Operate [Private Medical Bureau] to fully support doctors and medical institutions
New proposals based on collective knowledge of medical xVR, medical x drone, medical x food, etc. (all of which are acquired in Japanese below)

見出しThe headline Profile and HR talks "Pay attention here!".

Supplementary
Depending on the page, the symptoms may vary depending on the page, such as the case where all text is acquired in English or the case where all text is acquired in Japanese.I have a question because I can't find a good solution even if I google it. (Do you think it has anything to do with Google translation or DeepL plug-in?)

If you have any knowledge, please give me some advice.

python python3 selenium selenium-webdriver

2022-09-30 21:55

1 Answers

I solved this problem by setting the language setting for the headless Chrome to English → Japanese.
reference:
[Selenium] Set the language setting for the headless Chrome to English→Japanese
Add the following options
options.add_argument('--lang=ja-JP')
Thank you.

2022-09-30 21:55

If you have any answers or tips

Popular Tags

python x 4647
android x 1593
java x 1494
javascript x 1427
c x 927
c++ x 878
ruby-on-rails x 696
php x 692
python3 x 685
html x 656

Popular Questions

1022 In Java servlet, when SHA-256 sends WW-Authenticate header for digest authentication, the client does not return the result.
571 rails db:create error: Could not find mysql2-0.5.4 in any of the sources
609 GDB gets version error when attempting to debug with the Presense SDK (IDE)
581 PHP ssh2_scp_send fails to send files as intended
572 Who developed the "avformat-59.dll" that comes with FFmpeg?