The Hard Road Home: 12306 Remaining-Ticket Query with Alternative Suggestions, v2

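The first version of this tool queried remaining tickets and recommended alternatives, but it had two problems:

  • No ranking of alternatives: it listed alternatives but did not say which one was better.
  • No seat information (business class / first class / second class / hard seat / hard sleeper / standing): a ticket might be available, but not necessarily one that suits you (a cheap one), which can get expensive.

So over the past few days I updated the code.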

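First, the principle behind the ranking. If no ticket is available for the exact trip from our departure station to our destination on a given train, any ticket will still get us on board as long as its boarding station lies between the train's origin station and our departure station, and its arrival station lies between our departure station and the train's terminus. At worst we buy a few extra stations or pay the fare difference on board. The ranking goal is then to get home for the least money. How?

Pick a boarding station as close as possible to our departure station and an arrival station as close as possible to our destination, and prefer extending the ticket on board over buying extra stations, since the aim is to spend as little as possible.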
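The ordering described above can be sketched as a small standalone function. This is only an illustration: `rank_alternatives` and the station names are hypothetical, and the ordering mirrors the preference stated in the text (nearest boarding station first; shorter tickets that need an on-board extension before tickets that overshoot the destination) rather than comparing real prices.

```python
from itertools import product


def rank_alternatives(stops, origin, destination):
    """Return (board, alight) pairs, most preferred first.

    stops: station names in the order the train visits them.
    """
    i = stops.index(origin)
    j = stops.index(destination)
    if i >= j:
        raise ValueError("origin must come before destination")

    # Boarding candidates: our own station first, then stations further upstream.
    board = stops[:i + 1][::-1]
    # Arrival candidates: the destination, then shorter tickets (extend on board),
    # and only after those the stations beyond the destination.
    alight = stops[i + 1:j + 1][::-1] + stops[j + 1:]

    pairs = list(product(board, alight))
    pairs.remove((origin, destination))  # the direct ticket is not an alternative
    return pairs


if __name__ == "__main__":
    for pair in rank_alternatives(["A", "B", "C", "D", "E"], "B", "D"):
        print(pair)
```

Each pair would still need a remaining-ticket check before being offered; the function only fixes the order in which candidates are tried.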


The result looks like this:

[Figure: screenshot of the program output]

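While building 2.0, I also optimized the code:

  • City codes are read from a saved file instead of being fetched from the 12306 API, which is faster. (The file can be generated and saved with the Citys() function from the first version; it looks like this:)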

{ "北京北": "VAP", "北京东": "BOP", "北京": "BJP", "北京南": "VNP", "北京西": "BXP", "广州南": "IZQ", "重庆北": "CUW", "重庆": "CQW", "重庆南": "CRW", ... }

  • Added a retry mechanism to the crawler: if an API call fails, it is called again after a delay;
  • Added seat information;
  • Ranked the alternatives.
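The retry-after-delay idea from the list above can be factored into one helper. The sketch below is an assumption about how that might look, not the author's code: `call_with_retry` is a hypothetical name, and the defaults (20 attempts, 2 seconds) simply echo the values used in the listing. The `sleep` parameter exists only so the delay can be swapped out when experimenting.

```python
import time


def call_with_retry(fetch, attempts=20, delay=2.0, sleep=time.sleep):
    """Call fetch() until it succeeds or the attempts run out.

    Returns fetch()'s result, or None if every attempt raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception as exc:  # narrower than a bare `except:`
            print(f"request failed ({exc!r}), attempt {attempt}/{attempts}")
            if attempt < attempts:
                sleep(delay)  # wait before trying again
    return None
```

A caller would pass the request as a function, for example `call_with_retry(lambda: session.get(url, timeout=5).json())`, and treat a `None` result as "no data".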

The code:

# coding=utf-8
import requests
import urllib.parse as parse
import time
import json
import pretty_errors  # imported for its side effect: nicer tracebacks
import re
from fake_useragent import UserAgent

# Field offsets inside each '|'-separated record of the query result
TRAIN_NO = 2
TRAIN = 3
DEPARTURE_STATION = 6
TERMINUS = 7
DEPARTURE_TIME = 8
ARRIVAL_TIME = 9
DURATION = 10
IF_BOOK = 11
DATE = 13
FROM_STATION_NO = 16
TO_STATION_NO = 17
SEAT_TYPES = 35
# OTHER = 22
NO_SEAT = 26  # WZ
HARD_SEAT = 29  # A1
SECOND_SEAT = 30  # O
FIRST_SEAT = 31  # M
BUSINESS_SEAT = 32  # A9
HARD_SLEEPER = 28  # A3
SOFT_SLEEPER = 23  # A4

# City name -> station code
with open('citys.json', 'r', encoding='utf-8') as f:
    Citys = json.load(f)


def Time():
    """
    Get the current date.
    :return: (year, month, day) as zero-padded strings
    """
    list_time = list(time.localtime())
    year = str(list_time[0])
    month = str(list_time[1])
    day = str(list_time[2])
    if len(month) == 1:
        month = '0' + month
    if len(day) == 1:
        day = '0' + day
    return year, month, day


proxy = {'http': '113.125.128.4:8888'}


class Train:
    def __init__(self, from_station, to_station, train_date='-'.join(Time())):
        self.from_station = from_station
        self.to_station = to_station
        self.train_date = train_date
        self.url = 'https://kyfw.12306.cn/otn/leftTicket/queryA?leftTicketDTO.%s&leftTicketDTO.%s&leftTicketDTO.%s&purpose_codes=ADULT'
        self.headers = {'User-Agent': str(UserAgent().random)}
        self.session = requests.session()
        # Visit the query page once so the session picks up its cookies
        self.session.get(
            'https://kyfw.12306.cn/otn/leftTicket/init?linktypeid=dc&fs=杭州东,HGH&ts=太原南,TNV&date=2022-01-19&flag=N,N,Y',
            headers=self.headers, proxies=proxy, timeout=5)

    def station(self, train_number):
        """
        Find the stations where a ticket for this train may start and end.
        :return: (boarding candidates, arrival candidates), most preferred first
        """
        url = f'https://kyfw.12306.cn/otn/czxx/queryByTrainNo?' \
              f'{parse.urlencode({"train_no": train_number})}&' \
              f'{parse.urlencode({"from_station_telecode": Citys[self.from_station]})}&' \
              f'{parse.urlencode({"to_station_telecode": Citys[self.to_station]})}&' \
              f'{parse.urlencode({"depart_date": self.train_date})}'
        self.headers['User-Agent'] = str(UserAgent().random)
        content = self.session.get(url, headers=self.headers, proxies=proxy, timeout=5)
        content = content.content.decode('utf-8')
        data = json.loads(content)
        stations_data = data['data']['data']
        stations_data.sort(key=lambda x: int(x['station_no']))
        from_station_idx = int(
            list(filter(lambda x: self.from_station in x['station_name'], stations_data))[0]['station_no'])
        to_station_idx = int(
            list(filter(lambda x: self.to_station in x['station_name'], stations_data))[0]['station_no'])
        # Boarding: our station first, then stations further upstream
        from_station_buy = [station['station_name'] for station in stations_data[:from_station_idx]][::-1]
        # Arrival: the destination and stations before it (nearest first), then stations beyond it
        to_station_buy_1 = [station['station_name'] for station in stations_data[from_station_idx:to_station_idx]][::-1]
        to_station_buy_2 = [station['station_name'] for station in stations_data[to_station_idx:]]
        to_station_buy = to_station_buy_1 + to_station_buy_2
        return from_station_buy, to_station_buy

    def train(self):
        """
        Fetch train information.
        :return: one formatted table row per train
        """
        url = self.url % (parse.urlencode({"train_date": self.train_date}),
                          parse.urlencode({"from_station": Citys[self.from_station]}),
                          parse.urlencode({"to_station": Citys[self.to_station]}))
        self.headers['User-Agent'] = str(UserAgent().random)
        content = self.session.get(url, headers=self.headers, proxies=proxy, timeout=5)
        content = content.content.decode('utf-8')
        data = json.loads(content)
        dict_train = data['data']['result']
        dict_map = data['data']['map']
        res = []
        TRAINs = []
        for train in dict_train:
            train_split = train.split('|')
            if train_split[TRAIN] in TRAINs:
                continue
            print(train_split[TRAIN])
            prices = self.price(train_split[FROM_STATION_NO], train_split[TO_STATION_NO],
                                train_split[SEAT_TYPES], train_split[TRAIN_NO])
            from_station_buy, to_station_buy = self.station(train_split[TRAIN_NO])
            buy = []
            # Only look for alternatives when the direct ticket cannot be booked
            for from_station, to_station in [[x, y] for x in from_station_buy for y in to_station_buy]:
                if train_split[IF_BOOK] == 'N' and self.book_if(from_station, to_station, train_split[TRAIN]):
                    buy.append(f'{from_station}-{to_station}')

            def seat(idx, code):
                # "remaining / price", or '无' (none) when the seat class is unavailable
                left = train_split[idx]
                return f'{left} / {prices.get(code)}' if left and left != '无' else '无'

            train_str = [train_split[TRAIN],
                         dict_map[train_split[DEPARTURE_STATION]],
                         dict_map[train_split[TERMINUS]],
                         train_split[DEPARTURE_TIME],
                         train_split[ARRIVAL_TIME],
                         train_split[DURATION],
                         seat(BUSINESS_SEAT, 'A9'),
                         seat(FIRST_SEAT, 'M'),
                         seat(SECOND_SEAT, 'O'),
                         seat(SOFT_SLEEPER, 'A4'),
                         seat(HARD_SLEEPER, 'A3'),
                         seat(HARD_SEAT, 'A1'),
                         seat(NO_SEAT, 'WZ'),
                         'yes' if train_split[IF_BOOK] == 'Y' else 'no',
                         ', '.join(buy)]
            res.append('| ' + ' | '.join(train_str) + ' |')
            TRAINs.append(train_split[TRAIN])
        return res

    def book_if(self, from_station, to_station, train_number):
        """
        Check whether a ticket is available.
        :param from_station: boarding station name
        :param to_station: arrival station name
        :param train_number: train number, e.g. the value at offset TRAIN
        :return: True if the ticket can be booked
        """
        url = self.url % (parse.urlencode({"train_date": self.train_date}),
                          parse.urlencode({"from_station": Citys[from_station]}),
                          parse.urlencode({"to_station": Citys[to_station]}))
        train = []
        for _ in range(20):
            try:
                self.headers['User-Agent'] = str(UserAgent().random)
                content = self.session.get(url, headers=self.headers, proxies=proxy, timeout=5)
                content = content.content.decode('utf-8')
                data = json.loads(content)
                dict_train = data['data']['result']
                train = list(filter(lambda x: x.split('|')[TRAIN] == train_number, dict_train))
                if not train:
                    return False
                break
            except Exception:
                print('Remaining-ticket request failed, retrying')
                time.sleep(2)
        if not train:
            # Every attempt failed
            return False
        return train[0].split('|')[IF_BOOK] == 'Y'

    def price(self, from_station_no, to_station_no, seat_types, train_no):
        """
        Query ticket prices.
        :return: dict keyed by seat code (empty if the request kept failing)
        """
        url = f'https://kyfw.12306.cn/otn/leftTicket/queryTicketPrice?' \
              f'{parse.urlencode({"train_no": train_no})}&' \
              f'{parse.urlencode({"from_station_no": from_station_no})}&' \
              f'{parse.urlencode({"to_station_no": to_station_no})}&' \
              f'{parse.urlencode({"seat_types": seat_types})}&' \
              f'{parse.urlencode({"train_date": self.train_date})}'
        data = {}
        for _ in range(20):
            try:
                self.headers['User-Agent'] = str(UserAgent().random)
                content = self.session.get(url, headers=self.headers, proxies=proxy, timeout=5)
                content = content.content.decode('utf-8')
                data = json.loads(content)
                break
            except Exception:
                print('Price request failed, retrying')
                time.sleep(2)
        return data.get('data') or {}


if __name__ == '__main__':
    print('--------------------12306 query-----------------------')
    while True:
        from_station = input('Departure city: ') or '杭州'
        if from_station in Citys:
            break
    while True:
        to_station = input('Destination city: ') or '太原'
        if to_station in Citys:
            break
    pattern = re.compile(r'\d{4}-\d{2}-\d{2}')
    while True:
        date = input('Departure date (format: 2022-02-01, defaults to today): ')
        if not date or re.match(pattern, date):
            break
    if not date:
        date = '-'.join(Time())
    train = Train(from_station, to_station, date)
    information = train.train()
    print('-------------------------------------------------------')
    print('-------------------12306 query results------------------')
    print('-------------------------------------------------------')
    print('| Train | From | To | Departs | Arrives | Duration | Business | First class | Second class | Soft sleeper | Hard sleeper | Hard seat | Standing | Direct | Alternatives |')
    for info in information:
        print(info)
    print('-------------------------------------------------------')
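To make the field offsets used in the listing easier to follow, here is a minimal sketch that parses one '|'-separated record. It runs on a synthetic record, not on real 12306 output: `parse_record` and the train number `G0000` are made up for illustration, and the offsets are the ones the script assumes, which may change if the service changes its response format.

```python
# Offsets copied from the listing (assumed, not officially documented)
TRAIN = 3
DEPARTURE_TIME = 8
ARRIVAL_TIME = 9
IF_BOOK = 11
SECOND_SEAT = 30


def parse_record(record):
    """Split one result record and pick out a few fields by offset."""
    fields = record.split('|')
    second = fields[SECOND_SEAT]
    return {
        'train': fields[TRAIN],
        'departs': fields[DEPARTURE_TIME],
        'arrives': fields[ARRIVAL_TIME],
        'bookable': fields[IF_BOOK] == 'Y',
        # An empty field or '无' (none) means no second-class seats are listed
        'second_class': second if second and second != '无' else None,
    }


if __name__ == "__main__":
    # Build a synthetic 36-field record with only the fields of interest set
    fields = [''] * 36
    fields[TRAIN] = 'G0000'
    fields[DEPARTURE_TIME] = '08:00'
    fields[ARRIVAL_TIME] = '13:30'
    fields[IF_BOOK] = 'Y'
    fields[SECOND_SEAT] = '无'
    print(parse_record('|'.join(fields)))
```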


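Shanxi's epidemic-control policy has also been relaxed. Previously, anyone who had stayed in Haidian, Beijing had to spend 14 days in centralized quarantine on returning home. After the "malicious homecoming" incident, the policy was loosened, and travelers from Haidian now only need a nucleic acid test. The trip home is much smoother this time. Thumbs up to the country and to the Shanxi government!

The current version still has a problem, of course: the ranking is too rigid. It considers only how far the boarding and arrival stations are from the intended ones and ignores seat information. For example, when you could buy a few extra stations and sit the whole way, insisting on a shorter ticket and standing just to save a little money is not a great choice either.

So, in order to get home comfortably, work on the next version has begun…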
