Mechanize Python page download does not work with HTTPS

- September 15, 2010

I'm on Linux Mint 13 Xfce 32-bit, 3.2.0-7, with Python 2.7.3. I'm trying to read the source code of a webpage served over HTTPS. Here's my little program:

#!/usr/bin/env python
import mechanize

browser = mechanize.Browser()
browser.set_handle_robots(False)
browser.set_handle_equiv(False)
# Headers copied from a desktop Chrome session
browser.addheaders = [('User-Agent',
                       'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'),
                      ('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8'),
                      ('Accept-Encoding', 'gzip, deflate, sdch'),
                      ('Accept-Language', 'en-US,en;q=0.8,ru;q=0.6'),
                      ('Cache-Control', 'max-age=0'),
                      ('Connection', 'keep-alive')]

html = browser.open('https://scholar.google.com/citations?view_op=search_authors')
print html.read()

But instead of the source code of the page, I see this:

What's the problem and how do I fix it? I need to use mechanize, since I need to interact with the page later on.

Your code works for me. Just remove the line

('Accept-Encoding', 'gzip, deflate, sdch'),

so that you don't have to reverse the encoding afterwards. To clarify: you are getting the content, but you expect it in "clear text". You get clear text by not requesting gzipped content.
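
If you ever do need to keep the Accept-Encoding header, the body has to be decompressed by hand before it is readable. Here is a minimal sketch of that step, using only the standard library and no network access. The helper names decode_body and gzip_bytes are made up for this illustration and are not part of mechanize. The sample page is a stand-in for a real response, and only gzip is handled, not deflate or sdch.

```python
import gzip
import io


def decode_body(raw, content_encoding):
    """Return the body as plain bytes, undoing gzip if the server used it."""
    if content_encoding and content_encoding.strip().lower() == "gzip":
        return gzip.GzipFile(fileobj=io.BytesIO(raw)).read()
    return raw


def gzip_bytes(data):
    """Compress data the way a server would for Content-Encoding: gzip."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb") as f:
        f.write(data)
    return buf.getvalue()


# Stand-in for a real response body (assumption for this sketch).
page = b"<html><body>hello</body></html>"
compressed = gzip_bytes(page)

# Gzip streams start with the magic bytes 0x1f 0x8b, which is why
# printing the raw body shows binary junk instead of HTML.
looks_binary = compressed[:2] == b"\x1f\x8b"

restored = decode_body(compressed, "gzip")
```

In a real script, raw would be whatever html.read() returned, and content_encoding would come from the response's Content-Encoding header. Dropping the Accept-Encoding request header, as suggested above, avoids this step entirely.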
