Only part of the data is written to CSV when scraping with Python

Hello, and thank you for reading.

I would like to ask a question about scraping with Python.

■ Question
I'm scraping a race card (the shutuba table) from the web, but only part of it is written to the CSV.
After scraping, only the header row of the table (column names such as horse name and frame number) was written to the CSV.
The actual data, such as the horses' names, stables, and odds, is not written to the CSV.

The page I am scraping and the code I used are shown below.
I would appreciate it if you could take a look.

import csv
from urllib.request import urlopen
from bs4 import BeautifulSoup

html=urlopen("http://race.netkeiba.com/?pid=race&id=c201605050211&mode=shutuba")
bsObj=BeautifulSoup(html, "html.parser")

table=bsObj.findAll("table", {"class": "race_table_01nk_tb_common shutuba_table"})[0]
rows=table.findAll("tr")

csvFile=open("shutuba.csv", 'wt', newline='', encoding='shift_jis')
writer=csv.writer(csvFile)
try:
for row in rows:
csvRow=[]
for cell in row.findAll (['th', 'td']):
csvRow.append(cell.get_text())
writer.writerow(csvRow)
finally:
csvFile.close()

When I run this from the terminal, a CSV file is created.
However, it contains only the top part of the table and none of the important information, such as the horses' names.

I suspect the problem is not in the scraping itself but in the part that writes the CSV.
That is only a guess, though, and nothing I have tried works,
so I would like to borrow your knowledge.

Could you please tell me what needs to be corrected in the code?

python web-scraping

2022-09-30 21:19

1 Answer

The for block is probably indented incorrectly.
Also, when I ran the code I got an error indicating that the encoding should be changed to utf-8.
The code below generates the CSV file.

import csv
from urllib.request import urlopen
from bs4 import BeautifulSoup


html=urlopen("http://race.netkeiba.com/?pid=race&id=c201605050211&mode=shutuba")
bsObj=BeautifulSoup(html, "html.parser")


table=bsObj.findAll("table", {"class": "race_table_01nk_tb_common shutuba_table"})[0]
rows=table.findAll("tr")

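# newline='' lets the csv module control line endings; utf-8 avoids the encoding error
csvFile=open("shutuba.csv", 'wt', newline='', encoding='utf-8')
writer=csv.writer(csvFile)
try:
    for row in rows:
        csvRow = []
        # Passing a list matches both header (th) and data (td) cells
        for cell in row.findAll(['th', 'td']):
            csvRow.append(cell.get_text())
        # Write once per table row, after all of its cells have been collected
        writer.writerow(csvRow)
finally:
    csvFile.close()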
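
To see why the position of writer.writerow matters, here is a minimal sketch. It uses plain lists in place of the scraped table rows and writes to an in-memory buffer instead of a file. The sample values and the helper name write_rows are made up for illustration and are not taken from the page. Calling writerow inside the inner loop emits one partial line per cell, while calling it after the inner loop emits one line per table row.

```python
import csv
import io

# Hypothetical stand-in for the cells of each <tr>; not real race data.
rows = [
    ["frame", "horse", "odds"],
    ["1", "Horse A", "3.5"],
]


def write_rows(rows, per_cell):
    """Write rows to an in-memory CSV and return the resulting lines."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    for row in rows:
        csv_row = []
        for cell in row:
            csv_row.append(cell)
            if per_cell:
                # Wrong: writes the partially built row after every cell.
                writer.writerow(csv_row)
        if not per_cell:
            # Right: writes the completed row once per table row.
            writer.writerow(csv_row)
    return buffer.getvalue().splitlines()


wrong = write_rows(rows, per_cell=True)
right = write_rows(rows, per_cell=False)

print(len(wrong), wrong)
print(len(right), right)
```

If the generated CSV has far more lines than the table has rows, with each line one cell longer than the previous one, the writerow call is most likely nested one level too deep.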

P.S.
To format code on Stack Overflow, select the code and click the {} button above the editor.
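
On the encoding error mentioned in this answer: one possible explanation for the original symptom, offered as a guess rather than a confirmed diagnosis, is that a data row contains a character that Shift_JIS cannot represent. In that case the write raises UnicodeEncodeError partway through, the finally block still closes the file, and only the rows written before the error end up in the CSV. The sketch below reproduces that behavior with made-up rows and an in-memory byte buffer. The helper name write_csv and the sample character are assumptions for illustration; I have not checked which character on the actual page triggers the error.

```python
import csv
import io

# Hypothetical rows; "\u00e9" (e with acute accent) is not representable in Shift_JIS.
rows = [
    ["frame", "horse"],
    ["1", "Horse A"],
    ["2", "Caf\u00e9"],
    ["3", "Horse C"],
]


def write_csv(rows, encoding):
    """Encode rows as CSV bytes; return (bytes written, error or None)."""
    raw = io.BytesIO()
    text = io.TextIOWrapper(raw, encoding=encoding, newline="")
    writer = csv.writer(text)
    error = None
    try:
        for row in rows:
            writer.writerow(row)
    except UnicodeEncodeError as exc:
        # Remember the failure instead of letting it propagate.
        error = exc
    finally:
        # Like close(), flush() pushes out the rows encoded so far.
        text.flush()
    return raw.getvalue(), error


sjis_bytes, sjis_error = write_csv(rows, "shift_jis")
utf8_bytes, utf8_error = write_csv(rows, "utf-8")

print("shift_jis:", sjis_bytes.count(b"\r\n"), "rows, error:", sjis_error)
print("utf-8:    ", utf8_bytes.count(b"\r\n"), "rows, error:", utf8_error)
```

With utf-8 every row is written. If the file must be opened in a tool that expects Shift_JIS, passing errors="replace" to open() is another option, at the cost of replacing the unsupported characters with "?".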

2022-09-30 21:19
