How to extract songs from Xiami (虾米): scraping a music site, step by step
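The flow for getting audio and lyrics from a music site:
- 1. Enter the song name and look up the matching song id on the site
- 2. Use the song id to download the lyrics (lyric)
  This is a simple URL concatenation
- 3. Use the song id to download the audio (mp3)
  First send a POST request that exchanges the id for the audio resource path,
  then send a GET request that fetches the audio resource itself
Four network requests cover everything: search for the song, get the lyrics, get the audio resource path, get the audio resource.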
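Note that all four network requests have to imitate a normal browser request:
- a GET request needs request headers configured
- a POST request needs request headers and a request body configured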
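The difference between the two request kinds can be sketched with the standard library alone, without sending anything. This is a minimal illustration, not the article's code: `BROWSER_HEADERS`, `build_get` and `build_post` are hypothetical names, the header set is shortened, and the body here is plain form data, whereas the article's POST body is encrypted first.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical, shortened header set; a real browser sends more fields.
BROWSER_HEADERS = {
    "Accept": "*/*",
    "Referer": "http://music.x.com/search/",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64)",
}


def build_get(url):
    """A GET request only carries the browser-like headers."""
    return Request(url, headers=BROWSER_HEADERS, method="GET")


def build_post(url, form):
    """A POST request carries the headers plus a form-encoded body."""
    body = urlencode(form).encode("utf-8")
    headers = dict(BROWSER_HEADERS)
    headers["Content-Type"] = "application/x-www-form-urlencoded"
    return Request(url, data=body, headers=headers, method="POST")
```

Both functions only build request objects, so they can be inspected (method, headers, body) before any traffic is sent.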
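Configure the Session.
There is also an encryption/decryption step; see the GitHub repo for details.
# Imports used by the snippets in this article
import json
import os
from http import cookiejar

import click
import requests

# Encrypyed is the encryption helper defined in the GitHub repo.
def __init__(self, timeout=60, cookie_path='.'):
    # Headers that imitate a normal browser request
    self.headers = {
        'Accept': '*/*',
        'Accept-Encoding': 'gzip,deflate,sdch',
        'Accept-Language': 'zh-CN,zh;q=0.8,gl;q=0.6,zh-TW;q=0.4',
        'Connection': 'keep-alive',
        'Content-Type': 'application/x-www-form-urlencoded',
        'Host': 'music.x.com',
        'Referer': 'http://music.x.com/search/',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'
    }
    # Session for the API calls: shared headers and a cookie jar
    self.session = requests.Session()
    self.session.headers.update(self.headers)
    self.session.cookies = cookiejar.LWPCookieJar(cookie_path)
    # Separate plain session for downloading the audio file
    self.download_session = requests.Session()
    self.timeout = timeout
    self.ep = Encrypyed()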
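The constructor above attaches an `LWPCookieJar` but does not show how the cookies get loaded or saved. A minimal sketch of one way to do that with the standard library follows. `open_cookie_jar` is a hypothetical helper, and it assumes the jar is given a file path rather than a directory such as `'.'`, since a directory cannot be written as a cookie file.

```python
import os
from http.cookiejar import LWPCookieJar


def open_cookie_jar(cookie_file):
    """Return a jar bound to cookie_file, loading saved cookies if present."""
    jar = LWPCookieJar(cookie_file)
    if os.path.isfile(cookie_file):
        # ignore_discard=True also restores session cookies
        jar.load(ignore_discard=True)
    return jar
```

After a run, `jar.save(ignore_discard=True)` writes the cookies back to the same file so that the next run can reuse them.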
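Wrap the POST request in a method
def post_request(self, url, params):
    """
    POST request
    :return: dict, or None when the API reports an error
    """
    # Encrypt the parameters before sending them as the request body
    data = self.ep.encrypted_request(params)
    resp = self.session.post(url, data=data, timeout=self.timeout)
    result = resp.json()
    if result['code'] != 200:
        click.echo('post_request error')
        return None  # callers have to handle this case
    else:
        return result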
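When the code is not 200, the method above falls through and yields `None`, which fails later once a caller indexes into the result. One alternative is to raise instead, sketched below. `ApiError` and `unwrap` are hypothetical names, and the only assumption about the response is the `code` field that the article's code already checks.

```python
class ApiError(RuntimeError):
    """Raised when the JSON body reports a non-200 code (hypothetical name)."""


def unwrap(result):
    """Return the decoded body unchanged, or raise if its code is not 200."""
    code = result.get("code")
    if code != 200:
        raise ApiError("API returned code {!r}".format(code))
    return result
```

With this shape, a failed request stops at the point of failure rather than surfacing later as a `TypeError` on `None`.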
def search(self, search_content, search_type, limit=9):
    """
    Search API
    :param search_content: the text to search for
    :param search_type: the search type
    :param limit: number of results to return
    :return: dict
    """
    url = 'http://music.x.com/weapi/xxx/get/web?csrf_token='
    params = {'s': search_content, 'type': search_type, 'offset': 0, 'sub': 'false', 'limit': limit}
    result = self.post_request(url, params)
    return result
Handle the search result:
result = self.search(song_name, search_type=1, limit=limit)
if result['result']['songCount'] <= 0:
    click.echo('Song {} not existed.'.format(song_name))
else:
    songs = result['result']['songs']
    if quiet:
        # Quiet mode: take the first hit without asking
        song_id, song_name = songs[0]['id'], songs[0]['name']
        song = Song(song_id=song_id, song_name=song_name, song_num=song_num)
        return song
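The lookup of the first hit can be isolated into a small function that tolerates a missing or empty result. This is a sketch only: `first_song` is a hypothetical helper, and the response shape (`result`, `songCount`, `songs`, `id`, `name`) is taken from the keys the snippet above reads, not from any API documentation.

```python
def first_song(result):
    """Return (id, name) of the top hit, or None when nothing matched."""
    body = (result or {}).get("result") or {}
    songs = body.get("songs") or []
    if body.get("songCount", 0) <= 0 or not songs:
        return None
    return songs[0]["id"], songs[0]["name"]
```

Because the function takes a plain dict, it can be exercised with a hand-written sample instead of a live response.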
Downloading the lyrics is simple:
lyricUrl = 'http://music.x.com/api/song/lyric/?id={}&lv=-1&csrf_token={}'.format(song_id, csrf)
lyricResponse = self.session.get(lyricUrl)
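The URL concatenation can also be done with `urllib.parse.urlencode`, which escapes the values. A minimal sketch, where `LYRIC_ENDPOINT` and `lyric_url` are hypothetical names and the endpoint and query parameters are the ones from the line above:

```python
from urllib.parse import urlencode

# Endpoint copied from the article's snippet
LYRIC_ENDPOINT = "http://music.x.com/api/song/lyric/"


def lyric_url(song_id, csrf=""):
    """Build the lyric URL for one song id."""
    query = urlencode({"id": song_id, "lv": -1, "csrf_token": csrf})
    return LYRIC_ENDPOINT + "?" + query
```

For plain numeric ids and an empty token this produces the same string as the `format` call; the two only differ when a value contains characters that need escaping.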
The response is JSON; pull the lyrics out of it:
lyricJSON = lyricResponse.json()
lyrics = lyricJSON['lrc']['lyric'].split("\n")
lyricList = []
for word in lyrics:
    # Each line is expected to look like "[mm:ss.xxx]text"
    time = word[1:6]
    name = word[11:]
    # Node is a small class holding one timestamp and one lyric line
    p = Node(time, name)
    lyricList.append(p)
json_string = json.dumps([node.__dict__ for node in lyricList], ensure_ascii = False, indent = 4)
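The fixed slices `word[1:6]` and `word[11:]` only work when every line has exactly the form `[mm:ss.xxx]text`. A regular expression is one way to make the parsing tolerate blank lines, metadata tags and shorter fractions. The sketch below is an alternative, not the article's method; `LRC_LINE` and `parse_lrc` are hypothetical names and the sample format is an assumption.

```python
import re

# One "[mm:ss]", "[mm:ss.xx]" or "[mm:ss.xxx]" tag at the start of a line
LRC_LINE = re.compile(r"^\[(\d{2}:\d{2})(?:\.\d{1,3})?\](.*)$")


def parse_lrc(text):
    """Turn LRC text into a list of {"time", "name"} dicts."""
    entries = []
    for line in text.split("\n"):
        match = LRC_LINE.match(line.strip())
        if match is None:
            continue  # blank line or a metadata tag such as [by:...]
        entries.append({"time": match.group(1), "name": match.group(2).strip()})
    return entries
```

The returned dicts use the same two fields as `Node`, so they can be passed straight to `json.dumps`.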
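Write it to a newly created local file
if not os.path.exists(folder):
    os.makedirs(folder)
fpath = os.path.join(folder, str(song_num) + '_' + song_name + '.json')
# ensure_ascii=False keeps non-ASCII lyrics as-is, so write the file as UTF-8
with open(fpath, 'w', encoding='utf-8') as text_file:
    text_file.write(json_string)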
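A song name can contain characters that are not valid in a file name, such as `/`. The following sketch wraps the write step in a function that replaces such characters first. `save_lyrics` is a hypothetical helper, and the set of replaced characters is an assumption based on common file-system restrictions.

```python
import json
import os
import re


def save_lyrics(folder, song_num, song_name, entries):
    """Write entries as UTF-8 JSON and return the path of the new file."""
    os.makedirs(folder, exist_ok=True)
    # Replace characters that common file systems reject in file names
    safe_name = re.sub(r'[\\/:*?"<>|]', "_", song_name)
    fpath = os.path.join(folder, "{}_{}.json".format(song_num, safe_name))
    with open(fpath, "w", encoding="utf-8") as text_file:
        json.dump(entries, text_file, ensure_ascii=False, indent=4)
    return fpath
```

Opening the file with an explicit `encoding="utf-8"` matters here because `ensure_ascii=False` leaves Chinese lyrics unescaped in the output.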
- First, get the audio resource path
url = 'http://music.x.com/weapi/song/enhance/player/url?csrf_token='
csrf = ''
params = {'ids': [song_id], 'br': bit_rate, 'csrf_token': csrf}
result = self.post_request(url, params)
# Song download URL
song_url = result['data'][0]['url']
# The song does not exist
if song_url is None:
    click.echo('Song {} is not available due to copyright issue.'.format(song_id))
else:
    return song_url
- Then fetch the audio resource
if not os.path.exists(fpath):
    resp = self.download_session.get(song_url, timeout=self.timeout, stream=True)
    # Fall back to 0 when the server sends no Content-Length header
    length = int(resp.headers.get('content-length', 0))
    label = 'Downloading {} {}kb'.format(song_name, int(length/1024))
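Download while watching the progress
    # Still inside the `if not os.path.exists(fpath):` block
    with click.progressbar(length=length, label=label) as progressbar:
        with open(fpath, 'wb') as song_file:
            for chunk in resp.iter_content(chunk_size=1024):
                if chunk:
                    song_file.write(chunk)
                    # Advance by the real chunk size; the last chunk may be shorter
                    progressbar.update(len(chunk))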
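The download loop can be separated from both the network and the progress bar, which makes it easy to check without a live connection. A minimal sketch follows. `copy_with_progress` is a hypothetical helper; in the article's setting, `chunks` would be `resp.iter_content(chunk_size=1024)` and `on_progress` would be `progressbar.update`.

```python
def copy_with_progress(chunks, out_file, on_progress):
    """Write each non-empty chunk and report its size; return total bytes."""
    written = 0
    for chunk in chunks:
        if not chunk:
            continue  # skip empty keep-alive chunks
        out_file.write(chunk)
        written += len(chunk)
        on_progress(len(chunk))
    return written
```

Reporting `len(chunk)` rather than a fixed 1024 keeps the progress total equal to the number of bytes actually written, even when the final chunk is shorter.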