web scraping - How to identify a request's crucial information that needs to be sent?


I wanted to scrape fares from this website, which uses requests for autocompletion.

This is my code:

import scrapy
from scrapy.http import Request, FormRequest
import urllib


class CabforceSpider(scrapy.Spider):
    name = 'cabforce'
    start_urls = ['https://www.cabforce.com']
    complete_url = 'https://www.cabforce.com/v1/geo/autocomplete'

    def parse(self, response):
        payload = {
            'chnl': 'cforce',
            'complete': 'barcelona airport',
            'destination': 'barcelona'
        }
        return Request(
            self.complete_url,
            self.print_json,
            method='POST',
            body=urllib.urlencode(payload),
            headers={'X-Requested-With': 'XMLHttpRequest'})

    def print_json(self, response):
        print response.body

Unfortunately, the response looks like this:

{"status":"argumenterror","reason":"cannot validate input","description":null,"reasontype":2000,"details":[]} 

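How can I find out which information is missing and needs to be sent with the request? I thought it might be the JSESSIONID and the version, but I couldn't figure out how to send them. Thanks for any hints, and have a lovely day!

You do not need cookies to send the request. The problem is with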

body=urllib.urlencode(payload), 

This encodes the body in URL (form) format, but if you look at the body of the request your browser sends, you will see that it is a JSON body.

So the solution is to import json and change the line mentioned above to this one:

body=json.dumps(payload), 
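
To make the difference between the two encodings concrete, here is a minimal standalone sketch that serializes the same payload both ways. It assumes Python 3, where `urlencode` lives in `urllib.parse` rather than directly in `urllib` as in the Python 2 code above; the variable names `form_body` and `json_body` are only illustrative.

```python
import json
from urllib.parse import urlencode  # Python 3 location of Python 2's urllib.urlencode

# Same payload as in the spider above.
payload = {
    'chnl': 'cforce',
    'complete': 'barcelona airport',
    'destination': 'barcelona',
}

# Form encoding: key=value pairs joined with '&', spaces become '+'.
form_body = urlencode(payload)

# JSON encoding: a single JSON object, which is what the browser sends here.
json_body = json.dumps(payload)

print(form_body)  # chnl=cforce&complete=barcelona+airport&destination=barcelona
print(json_body)  # {"chnl": "cforce", "complete": "barcelona airport", "destination": "barcelona"}
```

Comparing these two strings with the request body shown in the browser's network tab is a quick way to see which format an endpoint expects.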

With that change, the spider returns the following result:

{"status":"ok","result":{"autocomplete":{"elements":[{"type":16,"description":"(bcn) - barcelona airport, barcelona, spain","location":{"lat":41.289545,"lng":2.072639},"raw":{"name":"(bcn) - barcelona airport","city":"barcelona","country":"spain"}},{"location":{"lat":41.3181887517739,"lng":2.07441323388724},"description":"barcelona airport hotel, plaza volatería, 3, el prat de llobregat, spain","raw":{"name":"barcelona airport hotel","city":"el prat de llobregat","country":"spain"},"type":4},{"location":{"lat":41.3176275,"lng":2.0249774},"description":"airport barcelona apartments, rafael casanova, 37, viladecans, spain","raw":{"name":"airport barcelona apartments","city":"viladecans","country":"spain"},"type":4}]}}} 
