Building a translation feature by scraping Youdao with a Python crawler
Preparation
The crawler uses urllib, which ships with Python's standard library, so there is nothing to install with pip.
Find the request URL used by Youdao Translate.

The parameters that need to be sent are in the request's form data.
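As a minimal sketch of what sending form data means with urllib: the fields are collected in a dict, URL-encoded, and converted to bytes, which is the form urlopen() expects for a POST body. The fields below are a shortened, illustrative subset chosen for this sketch, not the full set the site expects.

```python
import urllib.parse

# Illustrative subset of the form fields (shortened for this sketch).
form = {
    'i': 'i love python',
    'from': 'AUTO',
    'to': 'AUTO',
    'doctype': 'json',
}

# urlencode() joins the pairs as "key=value&key=value" and escapes special
# characters (a space becomes "+"); encode() turns the string into bytes,
# which urlopen() requires for a POST body.
body = urllib.parse.urlencode(form).encode('utf-8')
print(body)  # b'i=i+love+python&from=AUTO&to=AUTO&doctype=json'
```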

Example
import urllib.request
import urllib.parse

url = 'http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule'
data = {}
data['i'] = 'i love python'
data['from'] = 'AUTO'
data['to'] = 'AUTO'
data['smartresult'] = 'dict'
data['client'] = 'fanyideskweb'
data['salt'] = '16057996372935'
data['sign'] = '0965172abb459f8c7a791df4184bf51c'
data['lts'] = '1605799637293'
data['bv'] = 'f7d97c24a497388db1420108e6c3537b'
data['doctype'] = 'json'
data['version'] = '2.1'
data['keyfrom'] = 'fanyi.web'
data['action'] = 'FY_BY_REALTlME'
data = urllib.parse.urlencode(data).encode('utf-8')
response = urllib.request.urlopen(url, data)
html = response.read().decode('utf-8')
print(html)
Running this returns an error with code 50. To fix it, remove the _o from the URL.

After removing it, the program runs successfully.
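The URL change can also be expressed in code. This is only a sketch of the edit described above; the variable names are illustrative, and it touches the path alone so the query string stays as it was.

```python
import urllib.parse

original = 'http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule'

parts = urllib.parse.urlsplit(original)
# Drop the "_o" suffix from the path only, leaving the query string untouched.
path = parts.path[:-2] if parts.path.endswith('_o') else parts.path
fixed = urllib.parse.urlunsplit(parts._replace(path=path))
print(fixed)  # http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule
```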

The raw result is still too cluttered, though, so it needs further cleanup.
Import json, convert the response into a dictionary, and pick out the translated text.
import urllib.request
import urllib.parse
import json

url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule'
data = {}
data['i'] = 'i love python'
data['from'] = 'AUTO'
data['to'] = 'AUTO'
data['smartresult'] = 'dict'
data['client'] = 'fanyideskweb'
data['salt'] = '16057996372935'
data['sign'] = '0965172abb459f8c7a791df4184bf51c'
data['lts'] = '1605799637293'
data['bv'] = 'f7d97c24a497388db1420108e6c3537b'
data['doctype'] = 'json'
data['version'] = '2.1'
data['keyfrom'] = 'fanyi.web'
data['action'] = 'FY_BY_REALTlME'
data = urllib.parse.urlencode(data).encode('utf-8')
response = urllib.request.urlopen(url, data)
html = response.read().decode('utf-8')
req = json.loads(html)
result = req['translateResult'][0][0]['tgt']
print(result)
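The filtering step can be tried without a network call. The sample string below is a hypothetical response, shaped only to match the lookup the code above performs (translateResult, then the first list, then the first item, then tgt). The errorCode field and the rule that a non-zero value means failure are assumptions based on the error 50 seen earlier; the real response may carry other fields. The helper name extract_translation is also made up for this sketch.

```python
import json

def extract_translation(raw):
    """Return the translated text from a JSON response string."""
    parsed = json.loads(raw)
    # Assumption: a non-zero errorCode marks a failed request.
    if parsed.get('errorCode', 0) != 0:
        raise ValueError(f"translation failed, errorCode={parsed['errorCode']}")
    # Same lookup as in the listing above.
    return parsed['translateResult'][0][0]['tgt']

# Hypothetical sample response, for illustration only.
sample = '{"errorCode": 0, "translateResult": [[{"src": "i love python", "tgt": "我爱python"}]]}'
print(extract_translation(sample))
```

Raising an exception on a failed request gives a clearer message than the KeyError that the bare lookup would produce when translateResult is missing.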
But this program can only translate the one hard-coded phrase, so it is useless after a single run. I therefore optimized it again.
import urllib.request
import urllib.parse
import json

def translate():
    centens = input('Enter the text to translate: ')
    url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule'
    head = {}  # add a request header so the request is not rejected as a crawler
    head['User-Agent'] = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36'
    data = {}  # send the form data with the request
    data['i'] = centens
    data['from'] = 'AUTO'
    data['to'] = 'AUTO'
    data['smartresult'] = 'dict'
    data['client'] = 'fanyideskweb'
    data['salt'] = '16057996372935'
    data['sign'] = '0965172abb459f8c7a791df4184bf51c'
    data['lts'] = '1605799637293'
    data['bv'] = 'f7d97c24a497388db1420108e6c3537b'
    data['doctype'] = 'json'
    data['version'] = '2.1'
    data['keyfrom'] = 'fanyi.web'
    data['action'] = 'FY_BY_REALTlME'
    data = urllib.parse.urlencode(data).encode('utf-8')
    req = urllib.request.Request(url, data, head)
    response = urllib.request.urlopen(req)
    html = response.read().decode('utf-8')
    req = json.loads(html)
    result = req['translateResult'][0][0]['tgt']
    # print(f'Chinese-English translation result: {result}')
    return result

t = translate()
print(f'Chinese-English translation result: {t}')
With that, the optimization is done, and it works reasonably well.
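One possible further step, sketched under assumptions: separating request construction from the network call lets the translation run in a loop and lets the request be inspected without contacting the site. The names build_request, translate_text, and main are hypothetical; the URL, the User-Agent string, and the form fields are copied from the listing above, and whether the site still accepts those fixed values is not verified here.

```python
import json
import urllib.parse
import urllib.request

URL = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule'
USER_AGENT = ('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
              '(KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36')

# Fixed form fields copied from the listing above.
BASE_FORM = {
    'from': 'AUTO', 'to': 'AUTO', 'smartresult': 'dict',
    'client': 'fanyideskweb', 'salt': '16057996372935',
    'sign': '0965172abb459f8c7a791df4184bf51c', 'lts': '1605799637293',
    'bv': 'f7d97c24a497388db1420108e6c3537b', 'doctype': 'json',
    'version': '2.1', 'keyfrom': 'fanyi.web', 'action': 'FY_BY_REALTlME',
}

def build_request(text):
    """Build the POST request for one piece of text (no network access)."""
    form = {'i': text, **BASE_FORM}
    body = urllib.parse.urlencode(form).encode('utf-8')
    return urllib.request.Request(URL, data=body,
                                  headers={'User-Agent': USER_AGENT})

def translate_text(text):
    """Send the request and return the translated text."""
    with urllib.request.urlopen(build_request(text)) as response:
        parsed = json.loads(response.read().decode('utf-8'))
    return parsed['translateResult'][0][0]['tgt']

def main():
    # Keep prompting until the user enters an empty line.
    while True:
        text = input('Enter text to translate (empty line to quit): ')
        if not text:
            break
        print(f'Translation result: {translate_text(text)}')

# Offline check: inspect the request without sending it.
req = build_request('i love python')
print(req.get_method(), req.full_url)
```

Calling main() starts the interactive loop; it is left out of the top level here so the sketch can be run without network access or keyboard input.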
