Scrapy XPath returns an empty list although the tag and syntax are correct


In my parse function, here is the code I have written:

    hs = Selector(response)
    # Container element that should hold the search results
    links = hs.xpath(".//*[@id='requisitionlistinterface.listrequisition']")
    items = []
    for x in links:
        item = CrawlsiteItem()
        # Text of every descendant whose title contains the phrase
        item["title"] = x.xpath('.//*[contains(@title, "view job description")]/text()').extract()
        items.append(item)
    return items

However, title comes back as an empty list.

I am capturing the element with that id in links, and inside it I want the list of values from the elements whose title contains "view job description".

Please help me fix the error in my code.

If you make a curl request to the URL you provided, curl "https://cognizant.taleo.net/careersection/indapac_itbpo_ext_career/moresearch.ftl?lang=en", you will see that the site is quite different from the one you see in the browser. Your search matches the following <a> element, which does not have any text() to select:

    <a id="requisitionlistinterface.reqtitlelinkaction"
       title="view job description"
       href="#"
       onclick="javascript:setevent(event);requisition_openrequisitiondescription('requisitionlistinterface','actopenrequisitiondescription',_ftl_api.lstval('requisitionlistinterface', 'requisitionlistinterface.id5645', this),_ftl_api.intval('requisitionlistinterface', 'requisitionlistinterface.id5649', this));return ftlutil_followlink(this);">
    </a>

This is because the site loads the information to be displayed with an XHR request (you can see this in Chrome, for example), and the page is then updated dynamically with the returned information.
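The empty result can be reproduced offline. The following is a minimal sketch using only the Python standard library rather than Scrapy. The markup is a simplified stand-in for what the server returns: the onclick attribute is omitted, and the wrapping div is an assumption made so that there is a container to search in.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for the served markup (onclick omitted; the wrapper div is hypothetical).
html = (
    '<div id="requisitionlistinterface.listrequisition">'
    '<a id="requisitionlistinterface.reqtitlelinkaction" '
    'title="view job description" href="#"> </a>'
    '</div>'
)
root = ET.fromstring(html)

# Elements whose title attribute contains the phrase, similar to contains(@title, ...).
matches = [el for el in root.iter() if "view job description" in el.get("title", "")]

# Keep only non-blank text; the anchor holds nothing but whitespace.
titles = [el.text.strip() for el in matches if el.text and el.text.strip()]

print(len(matches), titles)
```

The anchor is matched, but it holds no job title until the browser's JavaScript fills it in, so there is nothing for the scraper to extract from this response.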

For the information you want to extract, you should find this XHR request (it is not hard, because it is the only one) and call it from your scraper. From the resulting dataset you can extract the required data; you just have to create a parsing algorithm that goes through the pipe-separated format, splits it into job postings, and extracts the information you need, such as position, id, date and location.
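As a rough illustration of such a parsing step, here is a sketch that assumes the response is a flat pipe-separated string with a fixed number of fields per posting. The field order, the function name parse_postings, and the sample data are all hypothetical; the real response layout has to be inspected by hand and the chunking adjusted accordingly.

```python
from typing import Dict, List

# Hypothetical field order; check the real XHR response before relying on it.
FIELDS = ["position", "id", "date", "location"]


def parse_postings(payload: str, fields: List[str] = FIELDS) -> List[Dict[str, str]]:
    """Split a flat pipe-separated payload into one dict per job posting."""
    values = [v.strip() for v in payload.split("|")]
    size = len(fields)
    postings = []
    # Walk the values in chunks of `size`; a trailing incomplete chunk is ignored.
    for start in range(0, len(values) - size + 1, size):
        chunk = values[start:start + size]
        postings.append(dict(zip(fields, chunk)))
    return postings


# Made-up sample data, only to show the shape of the result.
sample = "Analyst|101|2015-06-01|Chennai|Developer|102|2015-06-02|Pune"
for posting in parse_postings(sample):
    print(posting)
```

In a Scrapy spider, a function like this would be called on the body of the XHR response inside the callback, and each resulting dict copied into an item.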

