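Lately
some friends,
after reading 小帥b's articles,
swiped all of 小帥b's memes
and then, in my WeChat,
spammed those memes like crazy to show off.
I just went "heh".
All I can say is:
get 'em.
Some other friends
read the articles without tapping "Like"
and still come nagging for the next update.
All 小帥b can say is:
keep getting 'em.
OK,
next we're going to play with a new library.
It's called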
Requests
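It's a notch more awesome than the one from last time, in "Python crawler 03: the library called Urllib that lets our Python pretend to be a browser".
After all, Requests is built on top of urllib3 (a third-party library, not the standard-library urllib),
so with it we can use less code
to mimic what a browser does.
Life is short,
so next up is
the right way to learn Python.
skr
Requests is not part of Python's standard library,
so we need to install it first.
Just install it with pip: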
pip install requests
Once it's installed, it's ready to use.
Now let's get a feel for requests.
Import the requests module
import requests
A GET request in one line
r = requests.get('https://api.github.com/events')
A POST request in one line
r = requests.post('https://httpbin.org/post', data = {'key':'value'})
All the other assorted HTTP request types
>>> r = requests.put('https://httpbin.org/put', data = {'key':'value'})
>>> r = requests.delete('https://httpbin.org/delete')
>>> r = requests.head('https://httpbin.org/get')
>>> r = requests.options('https://httpbin.org/get')
Want to send query parameters along?
>>> payload = {'key1': 'value1', 'key2': 'value2'}
>>> r = requests.get('https://httpbin.org/get', params=payload)
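Curious what `params` actually does to the URL? Here is a small sketch that builds the request locally and never sends it. The helper name `build_query_url` is made up for this example; `requests.Request(...).prepare()` is the library's own way to construct a request without going over the network.

```python
import requests


def build_query_url(base_url, params):
    """Return the final URL requests would use for a GET with these params."""
    # prepare() only builds the request object; nothing is sent.
    prepared = requests.Request("GET", base_url, params=params).prepare()
    return prepared.url


payload = {"key1": "value1", "key2": "value2"}
url = build_query_url("https://httpbin.org/get", payload)
print(url)
```

After a real request you can read the same thing from `r.url`.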
Pretend to be a browser by setting request headers
>>> url = 'https://api.github.com/some/endpoint'
>>> headers = {'user-agent': 'my-App/0.0.1'}
>>> r = requests.get(url, headers=headers)
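Why bother with the header at all? By default requests announces itself as `python-requests/<version>`, which some sites refuse. A rough sketch, assuming you want every request in a session to carry a browser-like User-Agent; the `BROWSER_UA` string is only an illustrative value, and the request is prepared locally rather than sent.

```python
import requests

# What requests calls itself when you set no headers.
default_ua = requests.utils.default_user_agent()

# Illustrative browser-style value; swap in whatever your real browser sends.
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/120.0.0.0 Safari/537.36"
)

# Headers set on a Session are merged into every request it makes.
session = requests.Session()
session.headers.update({"User-Agent": BROWSER_UA})

# Build the request without sending it, just to inspect the headers.
prepared = session.prepare_request(
    requests.Request("GET", "https://httpbin.org/headers")
)
sent_ua = prepared.headers["User-Agent"]

print(default_ua)
print(sent_ua)
```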
Get the response body as text
>>> import requests
>>> r = requests.get('https://api.github.com/events')
>>> r.text
'[{"repository":{"open_issues":0,"url":"https://github.com/...
>>> r.encoding
'utf-8'
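`r.text` is just `r.content` decoded with `r.encoding`, so a wrong guess gives you garbage characters. The sketch below shows fixing that by setting `r.encoding` yourself. To keep it runnable offline, the response is built by hand (the `make_response` helper is made up for this example); in real code `r` would come from `requests.get(...)`.

```python
import io

import requests


def make_response(body, encoding):
    """Hand-build a Response for illustration only (no network involved)."""
    response = requests.Response()
    response.status_code = 200
    response.raw = io.BytesIO(body)  # stands in for the network stream
    response.encoding = encoding     # normally guessed from the Content-Type header
    return response


body = "你好".encode("gbk")

# Pretend the charset was guessed as ISO-8859-1 while the page is really GBK.
r = make_response(body, "ISO-8859-1")
wrong = r.text       # decoded with the wrong charset

r.encoding = "gbk"   # tell requests the real charset
right = r.text       # r.text is re-decoded with the new encoding

print(repr(wrong))
print(right)
```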
Get the response body as bytes
>>> r.content
b'[{"repository":{"open_issues":0,"url":"https://github.com/...
Get the status code
>>> r = requests.get('https://httpbin.org/get')
>>> r.status_code
200
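A status code is only useful if you act on it. One possible pattern, sketched here, is `raise_for_status()`, which raises `requests.exceptions.HTTPError` for 4xx and 5xx responses. The helpers `check_status` and `fake_response` are made up for this example, and the responses are built by hand so nothing touches the network.

```python
import requests


def check_status(response):
    """Return True if the response is fine, False on a 4xx/5xx status."""
    try:
        response.raise_for_status()
    except requests.exceptions.HTTPError as exc:
        print("request failed:", exc)
        return False
    return True


def fake_response(status_code):
    """Hand-built Response for illustration; real ones come from requests.get()."""
    r = requests.Response()
    r.status_code = status_code
    r.url = "https://httpbin.org/status/%d" % status_code
    return r


ok = check_status(fake_response(200))
missing = check_status(fake_response(404))
broken = check_status(fake_response(500))
```

If you prefer names to numbers, `requests.codes.ok` is 200.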
Get the response headers
>>> r.headers
{
'content-encoding': 'gzip',
'transfer-encoding': 'chunked',
'connection': 'close',
'server': 'Nginx/1.0.4',
'x-runtime': '148ms',
'etag': '"e1ca502697e5c9317743dc078f67693f"',
'content-type': 'application/json'
}
Get the response body as JSON
>>> import requests
>>> r = requests.get('https://api.github.com/events')
>>> r.json()
[{'repository': {'open_issues': 0, 'url': 'https://github.com/...
Get the raw socket stream of the response
>>> r = requests.get('https://api.github.com/events', stream=True)
>>> r.raw
<urllib3.response.HTTPResponse object at 0x101194810>
>>> r.raw.read(10)
b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03'
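Note that `r.raw` hands back bytes exactly as they came off the wire (the `\x1f\x8b` above is a gzip header). For saving a download, iterating over `r.iter_content()` is usually handier, since it takes care of gzip and deflate for you. A minimal sketch: `save_stream` is a made-up helper, and the response is faked with an in-memory buffer so the example runs offline.

```python
import io
import os
import tempfile

import requests


def save_stream(response, path, chunk_size=8192):
    """Write a streamed response body to `path` chunk by chunk; return bytes written."""
    written = 0
    with open(path, "wb") as fh:
        for chunk in response.iter_content(chunk_size=chunk_size):
            fh.write(chunk)
            written += len(chunk)
    return written


# Offline stand-in for requests.get(url, stream=True); illustration only.
fake = requests.Response()
fake.status_code = 200
fake.raw = io.BytesIO(b"0123456789" * 1000)

target = os.path.join(tempfile.mkdtemp(), "download.bin")
size = save_stream(fake, target, chunk_size=1024)
print(size, "bytes written to", target)
```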
POST requests
When you want several values under a single key
>>> payload_tuples = [('key1', 'value1'), ('key1', 'value2')]
>>> r1 = requests.post('https://httpbin.org/post', data=payload_tuples)
>>> payload_dict = {'key1': ['value1', 'value2']}
>>> r2 = requests.post('https://httpbin.org/post', data=payload_dict)
>>> print(r1.text)
{ ... "form": { "key1": [ "value1", "value2" ] }, ...}
>>> r1.text == r2.text
True
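Why do the two payloads come out identical? Both get form-encoded into the same body. Here is a sketch that inspects the encoded body without sending anything; `encoded_form` is a made-up helper name.

```python
import requests

URL = "https://httpbin.org/post"


def encoded_form(data):
    """Return (body, content_type) that requests would send for this form data."""
    prepared = requests.Request("POST", URL, data=data).prepare()
    return prepared.body, prepared.headers["Content-Type"]


# A list of tuples and a dict with a list value encode to the same thing.
body_tuples, content_type = encoded_form([("key1", "value1"), ("key1", "value2")])
body_dict, _ = encoded_form({"key1": ["value1", "value2"]})

print(body_tuples)
print(content_type)
```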
Send JSON as the request body
>>> url = 'https://api.github.com/some/endpoint'
>>> payload = {'some': 'data'}
>>> r = requests.post(url, json=payload)
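So what is the difference between `json=` and `data=`? A quick sketch that prepares both requests locally and compares them. The httpbin URL is just a placeholder here, and nothing is sent.

```python
import json

import requests

payload = {"some": "data"}
url = "https://httpbin.org/post"  # placeholder; the request is never sent

# json= serializes the dict to JSON and sets the Content-Type for you.
as_json = requests.Request("POST", url, json=payload).prepare()

# data= form-encodes the very same dict instead.
as_form = requests.Request("POST", url, data=payload).prepare()

print(as_json.headers["Content-Type"], as_json.body)
print(as_form.headers["Content-Type"], as_form.body)
```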
Want to upload a file?
>>> url = 'https://httpbin.org/post'
>>> files = {'file': open('report.xls', 'rb')}
>>> r = requests.post(url, files=files)
>>> r.text
{ ... "files": { "file": "<censored...binary...data>" }, ...}
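The snippet above never closes the file it opens. A slightly tidier sketch uses a `with` block so the handle is closed once the request has been built. `build_upload` is a made-up helper, the file content is fake, and the request is only prepared, not sent.

```python
import os
import tempfile

import requests


def build_upload(url, path, field="file"):
    """Prepare (but do not send) a multipart upload of the file at `path`."""
    with open(path, "rb") as fh:
        # (filename, file object) lets you control the filename that is sent.
        files = {field: (os.path.basename(path), fh)}
        # The file is read while the request is prepared, inside the with block.
        return requests.Request("POST", url, files=files).prepare()


# Create a throwaway file so the example is self-contained.
demo_path = os.path.join(tempfile.mkdtemp(), "report.xls")
with open(demo_path, "wb") as fh:
    fh.write(b"fake spreadsheet bytes")

prepared = build_upload("https://httpbin.org/post", demo_path)
print(prepared.headers["Content-Type"])
```

When actually sending, the same idea applies: call `requests.post(url, files=files)` inside the `with` block.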
Read cookies from the response
>>> url = 'http://example.com/some/cookie/setting/url'
>>> r = requests.get(url)
>>> r.cookies['example_cookie_name']
'example_cookie_value'
Send cookies with the request
>>> url = 'https://httpbin.org/cookies'
>>> cookies = dict(cookies_are='working')
>>> r = requests.get(url, cookies=cookies)
>>> r.text
'{"cookies": {"cookies_are": "working"}}'
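Passing `cookies=` on every call gets old fast. A `requests.Session` keeps a cookie jar and attaches it to each request it makes. A small sketch, with the request prepared locally instead of sent so you can see the header:

```python
import requests

session = requests.Session()

# Put a cookie in the session's jar; cookies set by servers land here too.
session.cookies.set("cookies_are", "working")

# Build a request through the session without sending it.
prepared = session.prepare_request(
    requests.Request("GET", "https://httpbin.org/cookies")
)
cookie_header = prepared.headers["Cookie"]
print(cookie_header)
```

In real use you would simply call `session.get(url)` and let the session carry the cookies along.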
Set a timeout
>>> requests.get('https://github.com/', timeout=0.001)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
requests.exceptions.Timeout: HTTPConnectionPool(host='github.com', port=80): Request timed out. (timeout=0.001)
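A traceback is not much fun in a crawler, so you will probably want to catch the timeout. A rough sketch: `fetch` and `always_slow` are made-up names, and the `get` parameter exists only so the example can be run without a network. `timeout` accepts a single number or a `(connect, read)` tuple, in seconds.

```python
import requests


def fetch(url, timeout=(3.05, 10), get=requests.get):
    """GET `url`; return the response, or None if the request timed out."""
    try:
        return get(url, timeout=timeout)
    except requests.exceptions.Timeout:
        # Covers both ConnectTimeout and ReadTimeout.
        print("timed out:", url)
        return None


def always_slow(url, timeout):
    """Hypothetical stand-in for a server that never answers in time."""
    raise requests.exceptions.ConnectTimeout("simulated timeout")


result = fetch("https://github.com/", timeout=0.001, get=always_slow)
```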
Besides "awesome",
what else is there to say??