Notes on using Python's requests

Python's requests library is built on top of urllib3, but it is far more convenient to use, with an API about as concise as jQuery's.

API-style interface

Internally, all functionality is implemented by the Session class:

class Session(SessionRedirectMixin):
    def request(self, method, url,
        params=None,
        data=None,
        headers=None,
        cookies=None,
        files=None,
        auth=None,
        timeout=None,
        allow_redirects=True,
        proxies=None,
        hooks=None,
        stream=None,
        verify=None,
        cert=None):
        """Constructs a :class:`Request <Request>`, prepares it and sends it.
        Returns :class:`Response <Response>` object.

        :param method: method for the new :class:`Request` object.
        :param url: URL for the new :class:`Request` object.
        :param params: (optional) Dictionary or bytes to be sent in the query
            string for the :class:`Request`.
        :param data: (optional) Dictionary or bytes to send in the body of the
            :class:`Request`.
        :param headers: (optional) Dictionary of HTTP Headers to send with the
            :class:`Request`.
        :param cookies: (optional) Dict or CookieJar object to send with the
            :class:`Request`.
        :param files: (optional) Dictionary of 'filename': file-like-objects
            for multipart encoding upload.
        :param auth: (optional) Auth tuple or callable to enable
            Basic/Digest/Custom HTTP Auth.
        :param timeout: (optional) Float describing the timeout of the
            request.
        :param allow_redirects: (optional) Boolean. Set to True by default.
        :param proxies: (optional) Dictionary mapping protocol to the URL of
            the proxy.
        :param stream: (optional) whether to immediately download the response
            content. Defaults to ``False``.
        :param verify: (optional) if ``True``, the SSL cert will be verified.
            A CA_BUNDLE path can also be provided.
        :param cert: (optional) if String, path to ssl client cert file (.pem).
            If Tuple, ('cert', 'key') pair.
        """

Once it is wrapped in a module-level API, though, it feels very much like jQuery's ajax, e.g.:

import requests
requests.get()
requests.post()
requests.put()
...
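
To make the relationship between the module-level helpers and Session concrete, here is a minimal sketch of what a helper such as requests.get roughly does: open a throwaway Session and delegate to Session.request. The name simple_get is hypothetical and used only for illustration; it is not part of the library.

```python
import requests


def simple_get(url, **kwargs):
    """Hypothetical stand-in for requests.get: one short-lived Session per call."""
    # The context manager closes the session (and its connection pool) afterwards.
    with requests.Session() as session:
        return session.request("GET", url, **kwargs)
```

Because every call creates and discards a session, code that makes many requests to the same host is usually better off creating one Session explicitly and reusing it.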

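Automatic parameter encoding

Anyone who has used urllib knows that parameters have to be encoded with urllib.urlencode (urllib.parse.urlencode in Python 3); requests takes care of this automatically: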

requests.get('https://www.example.com/xxx', verify=False, params={'type': 'test'})
requests.post('https://www.example.com/xxx', verify=False, data={'type': '测试'})
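
To see what the automatic encoding produces, the requests can be prepared without being sent. This is a small sketch that uses requests.Request(...).prepare() purely for inspection; the variable names are illustrative.

```python
import requests

# Build the requests without sending them, so the encoding can be inspected offline.
get_req = requests.Request(
    "GET", "https://www.example.com/xxx", params={"type": "test"}
).prepare()
post_req = requests.Request(
    "POST", "https://www.example.com/xxx", data={"type": "测试"}
).prepare()

print(get_req.url)                       # params end up in the query string
print(post_req.body)                     # form-encoded body, UTF-8 percent-escaped
print(post_req.headers["Content-Type"])  # set automatically for dict data
```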

You can also POST a JSON body, e.g.:

import json
requests.post('https://www.example.com/xxx', verify=False, data=json.dumps({'type': '测试'}))
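
Note that passing a pre-serialized string through data= does not set a Content-Type header. Since requests 2.4.2 there is also a json= parameter that serializes the payload and sets the header. The sketch below compares the two without sending anything; the payload and variable names are illustrative.

```python
import json

import requests

url = "https://www.example.com/xxx"
payload = {"type": "测试"}

# json= serializes the dict and sets Content-Type: application/json.
via_json = requests.Request("POST", url, json=payload).prepare()
# data=json.dumps(...) sends the same JSON text but leaves Content-Type unset.
via_data = requests.Request("POST", url, data=json.dumps(payload)).prepare()

print(via_json.headers.get("Content-Type"))
print(via_data.headers.get("Content-Type"))
```

If the server checks the Content-Type header, the json= form is the safer choice.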

Parsing JSON results

As with jQuery's ajax, the response can be read as plain text or decoded from JSON automatically:

r = requests.get(...)
print(r.text)
print(r.json())

This is where Python's syntax really shines: you can read a value straight from a URL API, as if no network request had been made at all:

name = requests.get('https://www.example.com/api/xxx/', verify=False).json()['name']

File upload

Uploading files is just as convenient, and the requests module covers the common cases.

files = {'video': open('/tmp/test.video', 'rb')}  # open in binary mode
r = requests.post(url, files=files)

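A file can be given in several forms, described in the docstring as: Dictionary of 'name': file-like-objects (or {'name': ('filename', fileobj)}). If fileobj is a str or bytes value instead of a file object, requests uses it directly as the file content (older versions wrapped it in StringIO or BytesIO for you), which saves a lot of trouble: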

files = {'score': ('test.txt', '测试一下')}
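
As a quick check of the tuple form, the upload can be prepared without sending it and the multipart body inspected. This is a sketch only; the upload URL is a placeholder.

```python
import requests

# ('filename', content): the str content is used as the file body.
files = {"score": ("test.txt", "测试一下")}
req = requests.Request(
    "POST", "https://www.example.com/upload", files=files  # placeholder URL
).prepare()

print(req.headers["Content-Type"])         # multipart/form-data with a boundary
print(b'filename="test.txt"' in req.body)  # the filename appears in the part header
```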

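Cookies and sessions

cookies is a dict-like object, so you can call r.cookies.get() and r.cookies.get_dict(). Most importantly, when the same session is used for requests to the same domain, the cookies are sent along automatically: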

s = requests.Session()
r = s.get('https://www.example.com/login', verify=False)
r = s.post('https://www.example.com/login',
        data={'username': 'xxx', 'password': 'xxx', '_xsrf': s.cookies.get('_xsrf')},
        verify=False,
        headers=headers)
r = s.get('https://www.example.com/index', verify=False)
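
The cookie behaviour can be observed without a real login by placing a cookie in the session jar by hand and preparing requests through the session. This is a sketch under assumptions: the cookie value and the second host name are made up for illustration.

```python
import requests

s = requests.Session()
# Simulate a cookie that a previous response from www.example.com would have set.
s.cookies.set("_xsrf", "token", domain="www.example.com", path="/")

same_host = s.prepare_request(
    requests.Request("GET", "https://www.example.com/index")
)
other_host = s.prepare_request(
    requests.Request("GET", "https://other.example.org/")  # hypothetical host
)

print(same_host.headers.get("Cookie"))   # cookie attached for the matching domain
print(other_host.headers.get("Cookie"))  # nothing attached for a different domain
```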

Authentication

requests supports authentication schemes such as HTTPProxyAuth and HTTPDigestAuth.
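
A short sketch of how the auth classes from requests.auth are typically passed in. The URL and credentials are placeholders, and nothing is sent; the requests are only prepared so the resulting header can be inspected.

```python
import requests
from requests.auth import HTTPBasicAuth, HTTPDigestAuth

url = "https://www.example.com/protected"  # placeholder URL

# A (user, password) tuple is shorthand for HTTPBasicAuth.
basic = requests.Request("GET", url, auth=HTTPBasicAuth("user", "pass")).prepare()
shorthand = requests.Request("GET", url, auth=("user", "pass")).prepare()
print(basic.headers["Authorization"])

# Digest auth needs a server challenge first, so it only takes effect when
# the request is actually sent, e.g. requests.get(url, auth=digest).
digest = HTTPDigestAuth("user", "pass")
```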

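Handling request failures

Any network request can fail, so a safer way to write a request is to raise an exception whenever the response is a 4xx or 5xx error: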

try:
    r = requests.get(...)
    r.raise_for_status()
except requests.RequestException as e:
    print(e)
else:
    return r.json()
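
The fragment above can be wrapped into a self-contained function. This is a minimal sketch: the name fetch_json and the 5-second default timeout are assumptions made for illustration. A timeout is worth passing explicitly because requests does not apply one by default.

```python
import requests


def fetch_json(url, timeout=5.0):
    """Return the decoded JSON body, or None if the request fails."""
    try:
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()  # raises HTTPError for 4xx/5xx responses
        return r.json()
    except (requests.RequestException, ValueError) as e:
        # ValueError covers JSON decoding errors on older requests versions.
        print(f"request failed: {e}")
        return None
```

Returning None keeps the call site simple; callers that need to tell a timeout from an HTTP error should catch the specific exception classes instead.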

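Requesting a specific IP for a Host

Sometimes, to speed up requests, you can specify the IP address by hand instead of doing a DNS lookup every time. Because the HTTPS certificate will not match the IP address, certificate verification is disabled as well: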

requests.get('https://xx.xx.xx.xx', headers={'Host':'test.com'}, verify=False)

File download

With stream=True, large files can be downloaded as a stream:

def download_file(url, local_filename):
    # NOTE the stream=True parameter below
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with open(local_filename, 'wb') as f:
            for chunk in r.iter_content(chunk_size=1024 * 32):
                # If you have chunk encoded response uncomment if
                # and set chunk_size parameter to None.
                #if chunk:
                f.write(chunk)
                print('.', sep='', end='', flush=True)
    return local_filename

 

Posted 2017-02-23 17:08   Modified 2020-09-26 20:41


