Appending a new list inside an existing list

Asked 2 years ago, Updated 2 years ago, 182 views

I'm crawling a shopping mall website.

In the purchase section of each product page

there is a button called "Purchase quantity".

If the product is sold out, there is no button,

and if the product has multiple options (color or size), I get several different values.

I use find_all to look for td elements with class = "xxx",

and if there is more than one match, each one is added to the list.

When I finally write everything to an Excel file, I get the error ValueError: Length of values (36) does not match length of index (19).

I need all of the values when there are two or more, so even if that means putting a list inside the list, I have to keep them all.
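The length mismatch can be reproduced without any crawling. The following is a minimal sketch in plain Python, using made-up placeholder values and a hypothetical helper `column_lengths`; it only illustrates how one append per option makes one list longer than the others. As far as I can tell, pandas raises the ValueError above when a column's length differs from the number of rows.

```python
# Hypothetical sample data: 2 products, the first one has 3 options.
pages = [
    {"url": "url1", "image": "image1", "texts": ["text1-1", "text1-2", "text1-3"]},
    {"url": "url2", "image": "image2", "texts": ["text2-1"]},
]


def column_lengths(columns):
    """Return the length of each column, keyed by column name."""
    return {name: len(values) for name, values in columns.items()}


a, b, c = [], [], []
for page in pages:
    a.append(page["url"])      # one entry per product
    b.append(page["image"])    # one entry per product
    for text in page["texts"]:
        c.append(text)         # one entry per option, so c grows faster

lengths = column_lengths({"a": a, "b": b, "c": c})
print(lengths)  # {'a': 2, 'b': 2, 'c': 4}

# The columns can only form a table if all lengths are equal.
print(len(set(lengths.values())) == 1)  # False
```

The columns line up again as soon as every list receives exactly one entry per product.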

For example,

a = [ ]

b = [ ]

c = [ ]  # create lists to collect the values

a.append(url)  # only 1 value per product

b.append(image)  # only 1 value per product

c.append(text)  # 3 values for this product

If the result looks like this,

a = [url]

b = [image]

c = [text1,text2,text3]

the extra entries in c cause the problem.

If I use a loop, only the c list keeps growing, which seems to be what breaks things.

a = [url1,url2]

b = [image1,image2]

c = [[text1-1,text1-2,text1-3],text2-1]

I searched for a way to put a list inside the list like this, but I couldn't get it to work, so I'm posting a question.

Here is the code I started with:
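A nested result like the one above can be built by collecting the texts of one product first and then appending that collection as a single element. This is a minimal sketch with placeholder strings; the helper name `collect` is hypothetical and not part of the original code.

```python
def collect(texts):
    """Return the bare value for a single text, or a list for several."""
    return texts[0] if len(texts) == 1 else list(texts)


# Hypothetical texts found on two product pages.
found_per_product = [
    ["text1-1", "text1-2", "text1-3"],
    ["text2-1"],
]

c = []
for texts in found_per_product:
    # append adds its argument as ONE element, even if it is a list;
    # extend would add every item separately and make c longer again.
    c.append(collect(texts))

print(c)       # [['text1-1', 'text1-2', 'text1-3'], 'text2-1']
print(len(c))  # 2
```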

# ..omission
links = [ ]
thumbnail = [ ]
option = [ ]

    links.append(baseurl + element.find("a")["href"])
    thumbnail.append("http://" + baseurl + element.select('img')[0].get('src'))
    option_chk = soup3.find_all('td', {'class': 'option_txt'})

    if len(option_chk):
        for i in option_chk:
            option.append(i.string)

    else:
        option_chk == None
        option.append("out of stock")

    print(option)

This is the relevant part.

option_chk may be empty, or it may contain one or more values.

When I crawl 5 product pages and print the option list at the end, it comes out like this:

["Purchase quantity", "Purchase quantity", "Sold out", "Red", "Blue", "Yellow", "Purchase quantity"]

I think it would be fine if it came out like this instead:

["Purchase quantity", "Purchase quantity", "Sold out", ["Red", "Blue", "Yellow"], "Purchase quantity"]

However, a nested list could also be a problem when I export to Excel later,

so in the end I'd like to join the values like this:

['purchase quantity', 'purchase quantity', 'out of stock', 'red' + 'blue' + 'yellow', 'purchase quantity']
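One way to get a single cell value per product is `str.join` with "+" as the separator. The sketch below assumes the nested list shown earlier as input; the helper name `to_cell` is hypothetical.

```python
nested = [
    "Purchase quantity",
    "Purchase quantity",
    "Sold out",
    ["Red", "Blue", "Yellow"],
    "Purchase quantity",
]


def to_cell(entry, sep="+"):
    """Join a list of option names into one string; pass strings through."""
    if isinstance(entry, (list, tuple)):
        return sep.join(entry)
    return entry


cells = [to_cell(entry) for entry in nested]
print(cells)
# ['Purchase quantity', 'Purchase quantity', 'Sold out', 'Red+Blue+Yellow', 'Purchase quantity']
```

The result has one string per product, so its length matches the other columns.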

I'm a beginner, so this post got long while I tried to explain things as clearly as I could. I'd appreciate an answer. Thank you.

selenium crawling

2022-09-27 01:00

1 Answer

The error message and the code you posted don't seem to be related; I think the relevant part was cut out.

Still, going by the code you posted, you can fix it like this.

links = [ ]
thumbnail = [ ]
option = [ ]

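    links.append(baseurl + element.find("a")["href"])
    thumbnail.append("http://" + baseurl + element.select('img')[0].get('src'))
    option_chk = soup3.find_all('td', {'class': 'option_txt'})

    """
    if len(option_chk):
        for i in option_chk:
            option.append(i.string)

    else:
        option_chk == None
        option.append("out of stock")
    """
    # Join all option texts of this product into one string, e.g. "Red+Blue+Yellow"
    a = '+'.join([i.string for i in option_chk])
    # An empty string means no option cell was found on the page
    if not a:
        a = "Sold Out"
    # Exactly one entry per product keeps option the same length as links
    option.append(a)

    print(option)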
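One caveat with the join above: in BeautifulSoup, `.string` is None when a tag has no single string child, and `'+'.join` raises a TypeError on None. The sketch below is one possible guard, written in plain Python so it runs without the crawler; the helper name `join_options` and the sample data are hypothetical.

```python
def join_options(strings, sep="+", placeholder="Sold Out"):
    """Join the non-empty option texts; fall back to the placeholder."""
    cleaned = [s.strip() for s in strings if s and s.strip()]
    return sep.join(cleaned) if cleaned else placeholder


# Hypothetical results of [i.string for i in option_chk], one list per product.
per_product = [
    ["Purchase quantity"],
    [],                         # sold out: no matching td on the page
    ["Red", "Blue", "Yellow"],
    [None, " Large "],          # .string was None for one cell
]

option = [join_options(strings) for strings in per_product]
print(option)
# ['Purchase quantity', 'Sold Out', 'Red+Blue+Yellow', 'Large']
```

In the crawler itself, this would be called once per product with the strings taken from option_chk.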


2022-09-27 01:00
