python - How to use Beautiful Soup's find() instead of find_all() for better runtime
I am writing a web scraper using Python's bs4. I am trying to find the first image that has the attribute 'data-a-dynamic-image'. So far I have the code below, and it works. However, I would prefer to use only find(), not find_all(), because I only care about the first item on the page with that attribute. I don't want to use find_all() and waste time sifting through the entire webpage.
def siftimage(soup):
    try:
        # Walk every <img> on the page and return the first one with the attribute.
        for line in soup.find_all('img'):
            if line is not None:
                if line.has_attr('data-a-dynamic-image'):
                    return line['src']
    except:
        return 'no image '
This second function I made returns the result I want, but only if the first image on the page is the image I want; otherwise it returns nothing. It does, however, have the runtime I am looking for.
def siftimagetwo(soup):
    try:
        # Only looks at the very first <img>, whether or not it has the attribute.
        line = soup.find('img')
        if line.has_attr('data-a-dynamic-image'):
            return line['src']
    except:
        return 'no image '
I am looking for a way to get the functionality of the top script with the timing of the bottom script.
According to the official documentation, there is a way to search for custom data-* attributes.
You should try this:
line = soup.find('img', attrs={'data-a-dynamic-image': True})
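To make the suggestion concrete, here is a minimal sketch, assuming the bs4 package is installed. The sample HTML and the function name sift_image_fast are made up for illustration and are not from the question. A data-* name cannot be passed as a keyword argument because it is not a valid Python identifier, so it goes in the attrs dictionary; a value of True matches any tag that has the attribute, whatever its value.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page: the first <img> lacks the attribute, the next two have it.
SAMPLE_HTML = """
<html><body>
  <img src="logo.png">
  <img src="product.jpg" data-a-dynamic-image='{"product.jpg": [500, 500]}'>
  <img src="other.jpg" data-a-dynamic-image='{"other.jpg": [100, 100]}'>
</body></html>
"""


def sift_image_fast(soup):
    """Return the src of the first <img> that carries data-a-dynamic-image."""
    # find() returns the first matching tag (or None) instead of a list of all matches.
    tag = soup.find("img", attrs={"data-a-dynamic-image": True})
    if tag is None:
        return "no image"
    # .get() avoids a KeyError if the matching tag happens to have no src.
    return tag.get("src", "no image")


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
print(sift_image_fast(soup))

# A CSS attribute selector is another way to express the same search.
first_match = soup.select_one("img[data-a-dynamic-image]")
```

Checking for None explicitly also removes the need for the bare try/except used in the question, which would otherwise hide unrelated errors.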