urllib2.URLError: Reading Server Response Codes (Python)


I have a list of URLs. I'd like to see the server response code of each one and find out whether any are broken. I can read server errors (500) and broken links (404) okay, but the code breaks once a non-website is read (e.g. "notawebsite_broken.com"). I've searched around and have not found the answer... I hope you can help.

Here's the code:

import urllib2

# list of urls. the third url is not a website
urls = ["http://www.google.com","http://www.ebay.com/broken-link", "http://notawebsite_broken"]

# empty list to store the output
response_codes = []

# run "for" loop: get the server response code, and save the results to response_codes
for url in urls:
    try:
        connection = urllib2.urlopen(url)
        response_codes.append(connection.getcode())
        connection.close()
        print url, ' - ', connection.getcode()
    except urllib2.HTTPError as e:
        response_codes.append(e.getcode())
        print url, ' - ', e.getcode()

print response_codes

This gives an output of...

http://www.google.com  -  200
http://www.ebay.com/broken-link  -  404
Traceback (most recent call last):
  File "test.py", line 12, in <module>
    connection = urllib2.urlopen(url)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 404, in open
    response = self._open(req, data)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 422, in _open
    '_open', req)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1214, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1184, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 8] nodename nor servname provided, or not known>

Does anyone know a fix for this, or can anyone point me in the right direction?

You could use requests:

import requests

urls = ["http://www.google.com","http://www.ebay.com/broken-link", "http://notawebsite_broken"]

for u in urls:
    try:
        r = requests.get(u)
        print "{} {}".format(u, r.status_code)
    except Exception as e:
        print "{} {}".format(u, e)

Output:

http://www.google.com 200
http://www.ebay.com/broken-link 404
http://notawebsite_broken HTTPConnectionPool(host='notawebsite_broken', port=80): Max retries exceeded with url: /
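
If you would rather stay with the standard library, the traceback in the question points at the cause: the loop only catches HTTPError, while a host that cannot be resolved raises the more general URLError. Below is a minimal sketch of catching both. The function name get_status and its opener parameter are hypothetical, introduced here only so the lookup can be swapped out for testing, and the import fallback is an assumption so the same sketch runs on Python 3 as well as the Python 2 used in the question.

```python
try:  # Python 3 module layout
    from urllib.request import urlopen
    from urllib.error import HTTPError, URLError
except ImportError:  # Python 2, as used in the question
    from urllib2 import urlopen, HTTPError, URLError


def get_status(url, opener=urlopen):
    """Return the HTTP status code for url, or None if no server answered.

    `opener` is a hypothetical hook (defaults to urlopen) so the network
    call can be replaced in tests.
    """
    try:
        connection = opener(url)
    except HTTPError as e:
        # The server answered, but with an error status such as 404 or 500.
        # HTTPError is a subclass of URLError, so it must be caught first.
        return e.code
    except URLError:
        # No HTTP response at all (e.g. the host name could not be resolved).
        return None
    try:
        return connection.getcode()
    finally:
        connection.close()
```

Calling get_status(url) for each entry in urls and appending the result to response_codes should then record None for "http://notawebsite_broken" instead of stopping the loop. Returning None is just one choice here; storing the error's reason text would work equally well.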
