Downloading NetEase Cloud Music songs as MP3

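Background

Writing bugs all day gets tiring too, and that is when some music helps you unwind.

On a big company's intranet you cannot install the NetEase Cloud Music desktop client directly. Playing in the web page all the time eats bandwidth and may also be monitored, so the idea was to download the songs first and then play them with a local music player.

For the local player I used to rely on an old version of TTPlayer (千千静听), but for various reasons it is no longer available. I downloaded many old versions and most of them barely work now; in some versions the playlist stops rendering once it holds a lot of songs and has been playing for a long time. So I found MusicPlayer2 on GitHub, which can download lyrics automatically for each song. It is good enough.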

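How it works

The script uses Python to drive chromedriver and open the real playlist, artist or album page (sending real cookies). It then walks the page DOM to collect each song's attributes (song ID, song name, artist name, album) and uses the song ID with the link music.163.com/song/media/…{songId}.mp3 to download the corresponding MP3 file. The MP3 quality is not guaranteed; it appears to be 128 kbps.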
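
The ID-to-URL step can be sketched in isolation. This is a minimal sketch, not part of the script itself: the helper names `extract_song_id` and `build_download_url` are made up for illustration, and the URL template is the one the script's `download()` function uses.

```python
import re
from typing import Optional

# Matches the song link found in each table row, e.g. ".../song?id=123456".
# The "?" is escaped so it is treated as a literal character.
SONG_ID_PATTERN = re.compile(r"song\?id=(\d+)")


def extract_song_id(href: str) -> Optional[str]:
    """Return the numeric song ID contained in a song link, or None."""
    match = SONG_ID_PATTERN.search(href)
    return match.group(1) if match else None


def build_download_url(song_id: str) -> str:
    """Build the MP3 URL from a song ID, using the template from download()."""
    return f"http://music.163.com/song/media/outer/url?id={song_id}.mp3"
```

Returning `None` for a link without a song ID lets the caller skip that row instead of failing on an empty match list.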

Talk is cheap. Show you my code

 #!/usr/bin/env python3
 # -*- coding: utf-8 -*-
 import json
 import logging
 import os
 import re
 import sys
 from datetime import datetime
 
 import requests
 from selenium import webdriver
 from selenium.webdriver.chrome.options import Options
 from selenium.webdriver.common.by import By
 
 # Edits the MP3 tags
 import eyed3
 
 # Concurrent downloads
 import threading
 import queue
 
 logger = None
 
 
 def init_webdriver():
     # Run Chrome headless so that no browser window is shown
     options = Options()
     options.add_argument('--headless')
     options.add_argument('--disable-gpu')
     return webdriver.Chrome(options=options)
 
 
 def init_logger():
     global logger
     logger = logging.getLogger()
     # Lowest level this logger passes on; handlers added later cannot log below it
     logger.setLevel(logging.INFO)
 
     # FileHandler: write the log to a dated file, creating ../logs if it is missing
     os.makedirs('../logs', exist_ok=True)
     log_file = '../logs/sys_%s.log' % datetime.strftime(datetime.now(), '%Y-%m-%d')
     file_handler = logging.FileHandler(filename=log_file, encoding='utf-8')
     # Lowest level for this handler
     file_handler.setLevel(logging.DEBUG)
     # Output format for this handler
     log_formatter = logging.Formatter('%(asctime)s [%(levelname)s]: %(message)s')
     file_handler.setFormatter(log_formatter)
 
     # StreamHandler: echo the log to sys.stdout
     stream_handler = logging.StreamHandler(sys.stdout)
     stream_handler.setFormatter(log_formatter)
     stream_handler.setLevel(logging.INFO)
 
     # Attach both handlers so every record goes to the file and to the console.
     # Note: if basicConfig was called on this logger earlier, it already created a
     # handler of its own, so the logger would then write to three outputs.
     logger.addHandler(file_handler)
     logger.addHandler(stream_handler)
 
 
 def start_craw(craw_driver: webdriver.Chrome, cookie_file_path: str, target_url: str):
     # Dispatch on the page type; an empty queue means there is nothing to download
     if not target_url:
         logger.error("Cannot crawl an empty link")
         return queue.Queue()
     if target_url.find('artist') != -1:
         return craw_artist(craw_driver, cookie_file_path, target_url)
     elif target_url.find('playlist') != -1:
         return craw_playlist(craw_driver, cookie_file_path, target_url)
     elif target_url.find('album') != -1:
         return craw_album(craw_driver, cookie_file_path, target_url)
     else:
         logger.error("Cannot tell whether the link is a playlist, artist or album page")
         return queue.Queue()
 
 
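 def craw_url(craw_driver: webdriver.Chrome, cookie_file_path: str, url: str):
     # Open the page once so that cookies can be set for its domain
     craw_driver.get(url)
     with open(cookie_file_path, "r", encoding="utf-8") as f:
         content = json.load(f)
     # Selenium only accepts the sameSite values 'Strict', 'Lax' and 'None'.
     # The 'no_restriction' and 'lax' keys are assumed export values of the plugin.
     same_site_map = {'unspecified': 'None', 'no_restriction': 'None', 'lax': 'Lax', 'strict': 'Strict'}
     for cookie in content:
         same_site = str(cookie.get('sameSite', 'unspecified')).lower()
         cookie['sameSite'] = same_site_map.get(same_site, 'None')
         # logger.info(cookie)
         craw_driver.add_cookie(cookie)
     # Reload so the page is rendered with the logged-in cookies
     craw_driver.refresh()
 
     # The song table lives inside the 'contentFrame' iframe
     craw_driver.switch_to.frame('contentFrame')
     return craw_driver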
 
 
 def craw_playlist(craw_driver: webdriver.Chrome, cookie_file_path: str, url: str):
     craw_result = craw_url(craw_driver, cookie_file_path, url)
 
     tr_list = craw_result.find_element(By.TAG_NAME, "tbody").find_elements(By.TAG_NAME, 'tr')
     logger.info("Total songs: %s" % len(tr_list))
 
     song_list = queue.Queue()
     # enumerate() supplies the song_index that download() reads
     for song_index, tr in enumerate(tr_list, start=1):
         td_list = tr.find_elements(By.TAG_NAME, 'td')
         song_link = td_list[1].find_element(By.TAG_NAME, 'a')
         href = song_link.get_property('href')
         # The '?' must be escaped, otherwise the pattern never matches the href
         result = re.findall(r'song\?id=(\d+)', href)
         if not result:
             logger.error("No song ID found in link: %s" % href)
             continue
 
         # The <b> text carries extra information, not just the song name
         temp_array = song_link.find_element(By.TAG_NAME, 'b').text.replace('\xa0', ' ').split("\n")
         # Drop the random obfuscating characters inserted in the middle
         if len(temp_array) > 1:
             del temp_array[1]
         song_name = ''.join(temp_array)
 
         artist = td_list[3].find_element(By.TAG_NAME, 'span').get_property('title').replace('\xa0', ' ')
         # Several artists written as "A/B" break the file path, so keep only the first one
         if artist.find("/") > 0:
             artist = artist.split("/")[0]
         # Final file name, formatted as "first artist - song name" (".mp3" is added in download())
         song_file_name = artist + " - " + song_name
         # logger.info('Song file name: ' + song_file_name)
 
         album = td_list[4].find_element(By.TAG_NAME, 'a').get_property('title').replace('\xa0', ' ')
 
         song_list.put({"id": result[0], "artist": artist, "song_name": song_name, "album": album,
                        "song_file_name": song_file_name, "song_index": song_index})
     craw_result.close()
     return song_list
 
 
 def craw_artist(craw_driver: webdriver.Chrome, cookie_file_path: str, url: str):
     craw_result = craw_url(craw_driver, cookie_file_path, url)
 
     artist = craw_result.find_element(By.ID, "artist-name").text.replace('\xa0', ' ')
     logger.info("Artist: %s" % artist)
 
     tr_list = craw_result.find_element(By.TAG_NAME, "tbody").find_elements(By.TAG_NAME, 'tr')
     logger.info("Total songs: %s" % len(tr_list))
 
     song_list = queue.Queue()
     for song_index, tr in enumerate(tr_list, start=1):
         td_list = tr.find_elements(By.TAG_NAME, 'td')
         song_link = td_list[1].find_element(By.TAG_NAME, 'a')
         href = song_link.get_property('href')
         # The '?' must be escaped, otherwise the pattern never matches the href
         result = re.findall(r'song\?id=(\d+)', href)
         if not result:
             logger.error("No song ID found in link: %s" % href)
             continue
 
         temp_array = song_link.find_element(By.TAG_NAME, 'b').text.replace('\xa0', ' ').split("\n")
         # Drop the random obfuscating characters inserted in the middle
         if len(temp_array) > 1:
             del temp_array[1]
         song_name = ''.join(temp_array)
 
         # Final file name, formatted as "artist - song name" (".mp3" is added in download())
         song_file_name = artist + " - " + song_name
         # logger.info('Song file name: ' + song_file_name)
 
         album = td_list[3].find_element(By.TAG_NAME, 'a').get_property('title').replace('\xa0', ' ')
 
         song_list.put({"id": result[0], "artist": artist, "song_name": song_name, "album": album,
                        "song_file_name": song_file_name, "song_index": song_index})
     craw_result.close()
     return song_list
 
 
 def craw_album(craw_driver: webdriver.Chrome, cookie_file_path: str, url: str):
     craw_result = craw_url(craw_driver, cookie_file_path, url)
 
     album = craw_result.find_element(By.CSS_SELECTOR, ".m-info .hd h2").text.replace('\xa0', ' ')
     logger.info("Album: %s" % album)
 
     artist = craw_result.find_elements(By.CSS_SELECTOR, ".m-info .intr")[0].find_element(By.TAG_NAME, "span")\
         .get_property('title').replace('\xa0', ' ')
     logger.info("Artist: %s" % artist)
 
     tr_list = craw_result.find_element(By.TAG_NAME, "tbody").find_elements(By.TAG_NAME, 'tr')
     logger.info("Total songs: %s" % len(tr_list))
 
     song_list = queue.Queue()
     for song_index, tr in enumerate(tr_list, start=1):
         td_list = tr.find_elements(By.TAG_NAME, 'td')
         href = td_list[1].find_element(By.TAG_NAME, 'a').get_property('href')
         # The '?' must be escaped, otherwise the pattern never matches the href
         result = re.findall(r'song\?id=(\d+)', href)
         if not result:
             logger.error("No song ID found in link: %s" % href)
             continue
 
         temp_array = td_list[1].find_element(By.TAG_NAME, 'b').text.replace('\xa0', ' ').split("\n")
         # Drop the random obfuscating characters inserted in the middle
         if len(temp_array) > 1:
             del temp_array[1]
         song_name = ''.join(temp_array)
 
         # Final file name, formatted as "artist - song name" (".mp3" is added in download())
         song_file_name = artist + " - " + song_name
         # logger.info('Song file name: ' + song_file_name)
 
         song_list.put({"id": result[0], "artist": artist, "song_name": song_name, "album": album,
                        "song_file_name": song_file_name, "song_index": song_index})
     craw_result.close()
     return song_list
 
 
 # Guards the shared counters that all download threads update
 counter_lock = threading.Lock()
 
 
 def download(song_queue: queue.Queue, file_save_path):
     global index, failed_count
     headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
                              'Chrome/114.0.0.0 Safari/537.36'}
     # exist_ok avoids an error when several threads create the directory at once
     os.makedirs(file_save_path, exist_ok=True)
     while True:
         try:
             # get_nowait() cannot block if another thread just took the last song
             song = song_queue.get_nowait()
         except queue.Empty:
             break
         with counter_lock:
             index = index + 1
 
         song_name = song['song_name']
         # Fall back to the running counter if the crawler supplied no index
         song_index = song.get('song_index', index)
         music_url = f'http://music.163.com/song/media/outer/url?id={song["id"]}.mp3'
         song_detail_url = f'https://music.163.com/#/song?id={song["id"]}'
         # logger.info(music_url)
 
         try:
             final_file_name = os.path.join(file_save_path, song['song_file_name'] + '.mp3')
             # Download only if the file is missing, to avoid fetching it twice
             if not os.path.exists(final_file_name):
                 logger.info("----->[{}] downloading [{}] by [{}]...".format(song_index, song_name, song['artist']))
                 response = requests.get(url=music_url, headers=headers, timeout=30)
                 response.raise_for_status()
                 with open(final_file_name, mode='wb') as f:
                     f.write(response.content)
             # Fill in any tag fields that are still empty
             audio_file = eyed3.load(final_file_name)
             if audio_file is not None and audio_file.tag:
                 if not audio_file.tag.title:
                     audio_file.tag.title = song['song_name']
                 if not audio_file.tag.artist:
                     audio_file.tag.artist = song['artist']
                 if not audio_file.tag.album:
                     audio_file.tag.album = song['album']
                 audio_file.tag.save()
 
             log_buffer = list()
             log_buffer.append("---------------------------------------------------------------------")
             log_buffer.append("Song {}, ID: {}".format(song_index, song["id"]))
             log_buffer.append('Title: ' + song['song_name'])
             log_buffer.append('Artist: ' + song['artist'])
             log_buffer.append('Album: ' + song['album'])
             log_buffer.append("[{}] {} finished".format(song_index, song['song_name']))
             logger.info('\n'.join(log_buffer))
         except Exception as e:
             # Some songs are paid or unlicensed and cannot be downloaded
             with counter_lock:
                 failed_count = failed_count + 1
             logger.info("---->Failed to fetch [{}] by [{}], detail page: {} download URL: {}".format(song_name, song['artist'], song_detail_url, music_url))
             failed_result.append("{}:{}".format(song_name, song_detail_url))
             logger.error(e)
 
 
 if __name__ == '__main__':
     init_logger()
     driver = init_webdriver()
     # Cookie file: paste all cookies exported as JSON by the EditThisCookie Chrome
     # extension into cookie.config, otherwise downloads may fail
     cookie_file = "cookie.config"
     # Directory where the downloaded songs are saved (backslashes must be escaped)
     save_path = "F:\\网易云音乐\\"
 
     # URL to crawl; playlist, artist and album pages are supported.
     # Note: only songs visible after the page opens in the browser can be crawled.
     # Some songs cannot be downloaded because of copyright or VIP restrictions,
     # and the audio quality is not guaranteed.
     # It is best to add songs to your own playlist first; keep it under 1000 songs.
     # Playlist link, e.g. https://music.163.com/#/my/m/music/playlist?id=37432514 (shows at most 1000 songs)
     # Artist page link, e.g. https://music.163.com/#/artist?id=2516 (shows at most 50 songs)
     # Album link, e.g. https://music.163.com/#/album?id=39483040
     music_url = 'https://music.163.com/#/artist?id=5354'
     music_list = start_craw(driver, cookie_file, music_url)
 
     # Download with 10 threads (pseudo-concurrent)
     thread_count = 0
     threads = list()
     index = 0
     failed_count = 0
     failed_result = list()
     while thread_count < 10:
         thread_count = thread_count + 1
         t = threading.Thread(target=download, args=(music_list, save_path,), name=f"DownloadThread[{thread_count}]")
         threads.append(t)
         t.start()
     for thread in threads:
         thread.join()
     logger.info('Total songs: {} Failed: {}'.format(index, failed_count))
     if failed_count > 0:
         logger.info("Songs that failed:")
         for failed_song in failed_result:
             logger.info(failed_song)
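
The thread loop above shares its counters through module-level globals. As an alternative, here is a minimal sketch of the same fan-out using `concurrent.futures.ThreadPoolExecutor` and a lock-protected counter object. It is not what the script does: `DownloadStats`, `download_all` and the `fetch` callable are hypothetical names introduced for illustration, and `fetch` stands in for the per-song download logic.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


class DownloadStats:
    """Thread-safe totals for finished and failed downloads."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.total = 0
        self.failed: List[str] = []

    def record(self, song_name: str, ok: bool) -> None:
        # The lock keeps the counter and the list consistent across threads.
        with self._lock:
            self.total += 1
            if not ok:
                self.failed.append(song_name)


def download_all(songs: Iterable[Dict[str, str]],
                 fetch: Callable[[Dict[str, str]], object],
                 workers: int = 10) -> DownloadStats:
    """Run fetch(song) for every song on a pool of worker threads."""
    stats = DownloadStats()

    def worker(song: Dict[str, str]) -> None:
        try:
            fetch(song)
            stats.record(song["song_name"], ok=True)
        except Exception:
            # Any exception counts as a failed download for this song.
            stats.record(song["song_name"], ok=False)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the iterator so every task finishes before returning.
        list(pool.map(worker, songs))
    return stats
```

Passing the download logic in as `fetch` keeps the counting separate from the network code, which also makes the counting easy to try out with a fake function.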

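Usage

Install Python

Install Python yourself by following the documentation, preferably version 3.8 or later; the details are not repeated here.

Use pip or your IDE to install the required packages: selenium, requests, and eyed3 (used to edit the tags of the downloaded MP3 files; if you do not need that, comment out the corresponding code).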
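
Before running the script, you can check that its third-party imports are available. The sketch below is an illustration only: `missing_modules` is a made-up helper, and the module list is taken from the script's import statements (selenium, requests, eyed3).

```python
import importlib.util
from typing import Iterable, List

# Third-party modules imported by the downloader script.
REQUIRED_MODULES = ("selenium", "requests", "eyed3")


def missing_modules(names: Iterable[str] = REQUIRED_MODULES) -> List[str]:
    """Return the module names that cannot be found in this Python environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Calling `missing_modules()` returns an empty list when everything is installed; any names it returns can be passed to `pip install`.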

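Prepare chromedriver

First download the chromedriver that matches your Chrome version (in theory other Chromium-based browsers work the same way, but I have not verified this).

chromedriver download page: chromedriver.storage.googleapis.com/index.html

Reference: zhuanlan.zhihu.com/p/110274934

The downloaded chromedriver is an exe executable. Put it anywhere on disk and add its location to the operating system's Path environment variable. For convenience, I simply placed it in the Python installation directory.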
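
To confirm that the Path setting took effect, you can ask Python where it finds the executable. This is a small sketch using the standard library's `shutil.which`; the wrapper name `find_executable` is made up for illustration.

```python
import shutil
from typing import Optional


def find_executable(name: str = "chromedriver") -> Optional[str]:
    """Return the full path of an executable found on PATH, or None if absent."""
    return shutil.which(name)
```

If `find_executable()` returns `None`, the directory holding chromedriver is not on Path yet, or the terminal was opened before the variable was changed.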

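Install the EditThisCookie extension

Install the EditThisCookie extension in Chrome. It is used to export your cookies, which the script needs when downloading; without them the script may lack permission to crawl some songs and the download fails.

Start downloading songs

Open a NetEase Cloud Music playlist, artist or album page and log in. Export the cookies as JSON with EditThisCookie and paste them over the contents of cookie.config, which sits in the same directory as NeteaseMusicDownloader.py. Then edit NeteaseMusicDownloader.py to set the local save path and the URL of the playlist, artist or album page.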
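
The exported JSON cannot always be passed to Selenium unchanged, because Selenium's `add_cookie` only accepts the `sameSite` values `Strict`, `Lax` and `None`. The following sketch shows one way to normalise the file first. The helper names are made up, and the lowercase values in `SAME_SITE_MAP` are assumptions about what the extension exports (the script itself only handles `unspecified` and `strict`).

```python
import json
from typing import Any, Dict, List

# Assumed export values on the left, Selenium's accepted values on the right.
SAME_SITE_MAP = {
    "unspecified": "None",
    "no_restriction": "None",
    "lax": "Lax",
    "strict": "Strict",
}


def normalize_cookies(raw_cookies: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return copies of the cookies with a sameSite value Selenium accepts."""
    normalized = []
    for cookie in raw_cookies:
        fixed = dict(cookie)  # copy, so the input list is left untouched
        value = str(fixed.get("sameSite", "unspecified")).lower()
        # Unknown values fall back to "None", like "unspecified" does.
        fixed["sameSite"] = SAME_SITE_MAP.get(value, "None")
        normalized.append(fixed)
    return normalized


def load_cookies(path: str) -> List[Dict[str, Any]]:
    """Read an exported cookie file (a JSON list) and normalise it."""
    with open(path, "r", encoding="utf-8") as f:
        return normalize_cookies(json.load(f))
```

Each dictionary returned by `load_cookies("cookie.config")` can then be handed to `driver.add_cookie(...)` after the page for that domain has been opened.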

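The local save path for the MP3 files is save_path in the main block; the directory is created automatically if it does not exist.

Set music_url to the real URL of the playlist, artist or album page.

Playlist links look like: music.163.com/#/my/m/musi…
Artist page links look like: music.163.com/#/artist?id…
Album page links look like: music.163.com/#/album?id=…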
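
The script decides which crawler to run by looking for the words artist, playlist or album in the URL. A slightly stricter variant is sketched below, under the assumption that the page type is always the last path segment of the part after `#`, as in the example links in the script's comments. `parse_page_url` is a hypothetical helper, not a function of the script.

```python
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlparse

SUPPORTED_KINDS = ("artist", "playlist", "album")


def parse_page_url(url: str) -> Optional[Tuple[str, str]]:
    """Return (page kind, id) for a supported page URL, or None otherwise."""
    # Everything after "#" is the fragment, e.g. "/artist?id=5354".
    fragment = urlparse(url).fragment
    path, _, query = fragment.partition("?")
    kind = path.rstrip("/").rsplit("/", 1)[-1]
    ids = parse_qs(query).get("id")
    if kind in SUPPORTED_KINDS and ids:
        return kind, ids[0]
    return None
```

A `None` result covers both unsupported page types and links without an `id` parameter, so a typo in music_url is caught before the browser is started.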

Run the script and it will crawl the songs and start downloading them automatically.

Tips

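Only songs that are visible once the page opens in the browser can be crawled; songs that are not shown cannot be downloaded automatically yet.
A playlist page shows at most 1000 songs by default and an artist page shows at most 50, so the rest cannot be downloaded. You can first use the mobile app or the website to save the songs you want into a separate playlist, then download that playlist.
The MP3 audio quality cannot be guaranteed for now. I am still looking into it, so make do with it.
Some songs cannot be downloaded because of copyright or VIP restrictions. At the end, the script prints their names and detail page URLs so you can get them yourself.
By default a song is saved as "first artist name - song name.mp3". To change this, adjust song_file_name in the code; it is built in three places (craw_playlist, craw_artist and craw_album).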
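
The naming rule from the last tip can be sketched as one helper. This is an illustration rather than the script's code: `build_file_name` is a made-up name, and replacing characters that Windows forbids in file names with `_` is an added assumption (the script only cuts the artist at the first `/`).

```python
import re

# Characters that Windows does not allow in file names.
INVALID_FILENAME_CHARS = re.compile(r'[\\/:*?"<>|]')


def build_file_name(artist: str, song_name: str) -> str:
    """Build "first artist - song name.mp3" with unsafe characters replaced."""
    # "A/B" lists several artists; keep only the first one, as the script does.
    first_artist = artist.split("/")[0].strip()
    raw_name = f"{first_artist} - {song_name}"
    return INVALID_FILENAME_CHARS.sub("_", raw_name).strip() + ".mp3"
```

Keeping the rule in a single function would also mean that a change to the naming format only has to be made once.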

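Before saving, the script checks whether the file already exists and skips the download if it does. A dropped connection halfway through a playlist is therefore not a problem; just run the script again.
Some songs may be saved with a garbled file name. I have not found the cause yet; try deleting the downloaded file from the save path and downloading it again.
The MP3 tags (title, contributing artists, album) very occasionally come out garbled. The cause is still unknown; if you notice it, fix the tags by hand.
I have only just started learning Python, so much of the code does not follow conventions. Please be kind.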
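
One caveat about the skip-if-exists behaviour mentioned in the tips: a file that was only partly written when the connection dropped also exists, so it would be skipped on the next run. A minimal sketch of one way around this is to write to a temporary file and rename it only when the write has finished. `save_if_missing` is a hypothetical helper, not part of the script.

```python
import os
import tempfile


def save_if_missing(path: str, content: bytes) -> bool:
    """Write content to path unless it already exists; return True if written."""
    if os.path.exists(path):
        return False
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    # Write into a temporary file in the same directory first.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(content)
        # The final name only appears once the whole file has been written.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return True
```

With this approach an interrupted run leaves no file under the final name, so rerunning the script downloads that song again instead of skipping it.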