I want to crawl the KBO rankings with Python.

Asked 2 years ago, Updated 2 years ago, 44 views

from bs4 import BeautifulSoup
from urllib.request import urlopen

response = urlopen('https://www.koreabaseball.com/TeamRank/TeamRank.aspx')
soup = BeautifulSoup(response, 'html.parser')
# Collect the text of every <tbody> on the page, one block per table
data = ""
for tbody in soup.select("tbody"):
    data += tbody.get_text() + "\n"
print(data.replace(' ', ''))

This is the code I have so far. It prints the following:

1
NC
65
44
19
2
0.698
0
6 wins, 2 draws, 2 losses
5 wins
23-1-9
21-1-10


2
Doosan Bears
66
39
27
0
0.591
6.5
6 wins, 0 draws, 4 losses
1 loss
18-0-13
21-0-14


3
Kiwoom
68
38
30
0
0.559
8.5
3 wins, 0 draws, 7 losses
Three losses
23-0-11
15-0-19


4
KIA
64
35
29
0
0.547
9.5
6 wins, 0 draws, and 4 losses
Two wins
20-0-11
15-0-18


5
LG
66
35
30
1
0.538
10
5 wins, 1 draw, 4 losses
1 win
17-1-17
18-0-13


6
Samsung
66
34
32
0
0.515
11.5
6 wins, 0 draws, 4 losses
2
21-0-15
13-0-17


7
KT
66
32
33
1
0.492
13
Five wins so far
1
19-0-15
13-1-18


8
Lotte
64
31
33
0
0.484
13.5
5 wins, 0 draws, 5 losses
1 win
18-0-11
13-0-22


9
SK
67
23
44
0
0.343
23
0 walks and six wins and losses
3 wins
14-0-20
9-0-24


10
Hanwha
68
17
51
0
0.250
29.5
2 wins, 0 draws, 8 losses
7 losses
9-0-24
8-0-27



NC■6-3-05-4-02-3-02-1-15-2-08-1-12-1-07-2-07-2-044-19-2두산3-6-0■2-2-07-2-07-3-03-3-03-2-05-3-06-3-03-3-039-27-0키움4-5-02-2-0■4-5-06-3-04-5-03-3-03-4-06-3-06-0-038-30-0KIA3-2-02-7-05-4-0■1-2-04-5-04-5-06-1-04-2-06-1-035-29-0LG1-2-13-7-03-6-02-1-0■4-5-03-4-03-3-07-2-09-0-035-30-1삼성2-5-03-3-05-4-05-4-05-4-0■2-6-06-3-04-2-02-1-034-32-0KT1-8-12-3-03-3-05-4-04-3-06-2-0■2-7-03-0-06-3-032-33-1롯데1-2-03-5-04-3-01-6-03-3-03-6-07-2-0■3-3-06-3-031-33-0SK2-7-03-6-03-6-02-4-02-7-02-4-00-3-03-3-0■6-4-023-44-0한화2-7-03-3-00-6-01-6-00-9-01-2-03-6-03-6-04-6-0■17-51-0

How can I get it to come out like "1st - NC"?

python crawling

2022-09-20 20:55

1 Answer

# One row per team, 12 whitespace-separated fields per row:
# rank, team, games, wins, losses, draws, win pct, games behind,
# last 10 games (W = wins, D = draws, L = losses), streak, home, away
data = """1 NC 67 44 21 2 0.677 0 5W-1D-4L 2L 23-1-9 21-1-12
2 Doosan 68 40 28 0 0.588 5.5 6W-0D-4L 1L 19-0-14 21-0-14
3 KIA 66 37 29 0 0.561 7.5 6W-0D-4L 4W 22-0-11 15-0-18
4 Kiwoom 70 39 31 0 0.557 7.5 4W-0D-6L 1W 24-0-12 15-0-19
5 LG 68 36 31 1 0.537 9 6W-0D-4L 1W 17-1-17 19-0-14
6 KT 68 34 33 1 0.507 11 5W-1D-4L 2W 21-0-15 13-1-18
7 Samsung 68 34 34 0 0.500 11.5 4W-0D-6L 4L 21-0-15 13-0-19
8 Lotte 66 32 34 0 0.485 12.5 5W-0D-5L 1L 18-0-11 14-0-23
9 SK 68 24 44 0 0.353 21.5 6W-0D-4L 4W 14-0-20 10-0-24
10 Hanwha 69 17 52 0 0.246 29 1W-0D-9L 8L 9-0-25 8-0-27""".split()

# Within each group of 12 fields, index 0 is the rank and index 1 is the team name
rank = [val for idx, val in enumerate(data) if idx % 12 == 0]
team = [val for idx, val in enumerate(data) if idx % 12 == 1]

print(rank)
# ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
print(team)
# ['NC', 'Doosan', 'KIA', 'Kiwoom', 'LG', 'KT', 'Samsung', 'Lotte', 'SK', 'Hanwha']

# Pair each rank with its team name ('위' means "place", so '1위' is "1st place")
result = [r + '위 - ' + t for r, t in zip(rank, team)]

print(result)
# ['1위 - NC', '2위 - Doosan', '3위 - KIA', '4위 - Kiwoom', '5위 - LG', '6위 - KT', '7위 - Samsung', '8위 - Lotte', '9위 - SK', '10위 - Hanwha']

This gives a list of strings in the form "1위 - NC" (1st - NC).
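If you would rather not paste the text in by hand, the same pairing can probably be done straight from the parsed HTML by walking the table row by row instead of flattening everything with get_text(). The following is only a rough sketch: SAMPLE_HTML is a made-up, simplified stand-in for the ranking table (the real page has more columns and its markup may differ), and rank_lines is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for the ranking table's markup;
# the real page has more columns and may be structured differently.
SAMPLE_HTML = """
<table>
  <thead><tr><th>Rank</th><th>Team</th><th>Games</th></tr></thead>
  <tbody>
    <tr><td>1</td><td>NC</td><td>67</td></tr>
    <tr><td>2</td><td>Doosan</td><td>68</td></tr>
    <tr><td>3</td><td>KIA</td><td>66</td></tr>
  </tbody>
</table>
"""


def rank_lines(html):
    """Return strings like '1위 - NC' from the first <tbody> in html."""
    soup = BeautifulSoup(html, "html.parser")
    tbody = soup.select_one("tbody")  # first table body only
    if tbody is None:
        return []
    lines = []
    for row in tbody.select("tr"):
        # One list entry per cell, with surrounding whitespace stripped
        cells = [td.get_text(strip=True) for td in row.select("td")]
        if len(cells) >= 2:
            lines.append(cells[0] + "위 - " + cells[1])
    return lines


print(rank_lines(SAMPLE_HTML))
```

Judging by the output in the question, the first tbody on the page is the ranking table and the second is the head-to-head table, so passing the urlopen response from the question in place of SAMPLE_HTML should work as long as that still holds.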


2022-09-20 20:55
