2024 ByteDance Youth Training Camp (青训营) Problem-Solving Diary

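Entrance assessment: 【✍️ 入营考核】AI 加码,青训营 X 豆包MarsCode 技术训练营 (Juejin)

Deadline: 2024.10.29 23:59

Task:

Pick at least 15 algorithm problems from the designated problem set and solve them, with the following difficulty distribution:

  • at least 10 easy problems
  • at least 4 medium problems
  • at least 1 hard problem

Problem set: AI刷题 (Juejin)

Progress submission: bytedance.larkoffice.com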

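Easy Problems

1. Minimum number of steps from position x to y (2)

Each step's length equals the previous step's length -1, +0, or +1.

The step length grows to some maximum and then shrinks again; the first step must be 1 and the last step must be 1.

Method 1 (based on the AI assistant; the logic is wrong):

Idea:

  1. Compute the difference between y and x, diff = y - x. This is the distance that the sequence of steps has to cover.

  2. Initialize a variable step to 1, the current step length.

  3. Create a variable total_steps to count the steps taken, starting at 0.

  4. Loop, and in each iteration check:

    1. If diff - step >= 0, the current step length still fits: increment total_steps by 1 and update diff = diff - step.
    2. If diff - step < 0, the current step length is too large: decrease it by 1, i.e. step -= 1.
  5. End the loop once diff reaches 0 or below.

  6. Because the first and last steps must both be 1, return 2 if total_steps is less than 2; otherwise return total_steps.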

def solution(x_position, y_position):
    diff = abs(y_position - x_position)
    step = 1
    total_step = 0
    while diff > 0:
        if diff >= step:
            diff = diff - step
            step += 1
            total_step += 1
        else:
            step -= 1
    if total_step < 2:
        return 2
    # Please write your code here
    return total_step

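The output is wrong.

Problems:

  1. The code does not enforce the requirement that the last step must also be 1.
  2. Simply decrementing step when diff < step is wrong.

Method 2 (based on ChatGPT; does not match the problem statement):

Increasing the step every time and then forcing the last step to 1 is the wrong idea.

Improvement:

  1. Keep increasing the step length, accumulating the total distance current_sum and the step count total_steps, until current_sum is at least diff.
  2. If the accumulated distance overshoots the target and the overshoot is not even (so it cannot be absorbed by a symmetric decreasing tail), keep adding steps until it is.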
def solution(x_position, y_position):
    diff = abs(y_position - x_position)
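    step = 1 # step length
    total_step = 0 # number of steps
    current_sum = 0 # total distance covered
    while current_sum < diff:
        current_sum += step
        total_step += 1
        # the step length grows by the rule (+1)
        step += 1

    # once the accumulated distance passes the target, fine-tune the step count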
    while (current_sum - diff) % 2 != 0:
        current_sum += step
        total_step += 1
        step += 1
    return total_step

The output is 1 less than the expected result.

Method 3 (based on ChatGPT; wrong result):

def solution(x_position, y_position):
    diff = abs(y_position - x_position)
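    step = 1 # step length
    total_step = 0 # number of steps
    current_sum = 0 # total distance covered
    # keep adding steps until the distance is covered and the parity condition holds
    while current_sum < diff or (current_sum - diff) % 2 != 0:
        total_step += 1
        current_sum += step
        step += 1  # grow the step length
    return total_step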

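The loop keeps adding steps while current_sum is less than diff or (current_sum - diff) is not even.

In each iteration:

  • total_step += 1: one more step is taken.
  • current_sum += step: the current step length is added to the distance covered so far.
  • step += 1: the step length grows in preparation for the next step.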

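Method 4 (dynamic-programming idea, written as a BFS; did not produce a result):

Idea: dynamic programming

Use a cache (hash table, set, or array) to store computed results and avoid recomputing subproblems.

  1. State: (current_position, step_length), where current_position is the current position and step_length is the current step length.
  2. Initialization: start from position x with a first step of length 1.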
from collections import deque
def solution(x_position, y_position):
    # BFS initialization
    queue = deque([(x_position, 1, 0)])  # (current_position, step_length, steps)
    visited = {(x_position, 1)}  # set of (position, step_length) pairs; set((a, b)) would build {a, b} instead

    while queue:
        position, step_length, steps = queue.popleft()
        
        # Check if we reached the target with last step length as 1
        if position == y_position and step_length == 1:
            return steps
        
        # Next step length can be step_length - 1, step_length, step_length + 1
        for next_step_length in [step_length - 1, step_length, step_length + 1]:
            if next_step_length > 0:  # step length must be positive
                # Calculate new position
                new_position = position + next_step_length
                if (new_position, next_step_length) not in visited:
                    visited.add((new_position, next_step_length))
                    queue.append((new_position, next_step_length, steps + 1))
    
    return -1  # Should never reach here if y is reachable from x
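A minimal sketch of how the BFS idea can be made to terminate, under assumptions that are mine rather than the original attempt's: the search works on the distance d = |y - x| instead of raw positions, starts from a fictitious step length of 0 so that the first real step can only be 1, and discards any state that overshoots d. The function name min_steps_bfs is hypothetical.

```python
from collections import deque

def min_steps_bfs(x_position, y_position):
    # Work on the distance only, so the direction of travel does not matter.
    d = abs(y_position - x_position)
    if d == 0:
        return 0
    # State: (distance covered, length of the last step, steps taken).
    # A last step of 0 means "no step yet", which forces the first step to be 1.
    queue = deque([(0, 0, 0)])
    visited = {(0, 0)}
    while queue:
        covered, last, steps = queue.popleft()
        # Done when the distance is covered exactly and the last step was 1.
        if covered == d and last == 1:
            return steps
        for nxt in (last - 1, last, last + 1):
            new_covered = covered + nxt
            # Steps must be positive and must not overshoot the target.
            if nxt <= 0 or new_covered > d:
                continue
            state = (new_covered, nxt)
            if state not in visited:
                visited.add(state)
                queue.append((new_covered, nxt, steps + 1))
    return -1  # not reached for d >= 1, since d steps of length 1 always work

print(min_steps_bfs(12, 6))  # distance 6 -> steps 1, 2, 2, 1
```

Bounding the covered distance by d keeps the state space finite, which is what the unbounded position in the attempt above lacks.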

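Method 5 (following a CSDN answer; correct result):

计算位置x到y的最小步数_编程语言-CSDN问答

Answers 2 and 3 in that thread gave wrong results when I checked them.

The Go code from the first answer, converted to Python: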

def solution(x, y):
    d = abs(y - x)
    
    if d < 3:
        return d
    
    count = 2
    d -= 2
    sum_steps = 0
    step = 1
    
    while True:
        if step + 1 + sum_steps <= d:
            step += 1
            sum_steps += step
        
        if sum_steps > d:
            sum_steps -= step
            step -= 1
        
        d -= step
        count += 1
        
        if d == 0:
            break
            
    return count

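The result is correct.

Method 6 (following a Juejin post; correct result):

Solve the problem by studying how the step lengths change:

(1) Possible shapes of the step sequence

To minimize the number of steps, the step length should first grow as far as possible and then shrink.

For a sequence whose maximum step length is m, the steps can be:

  • Odd number of steps: [1, 2, 3, ..., m, ..., 3, 2, 1], which is 2m - 1 steps
  • Even number of steps: [1, 2, 3, ..., m, m, ..., 3, 2, 1], which is 2m steps

(2) Relating the number of steps to the distance

With an odd number of steps (N = 2m - 1):

  • Total distance S = 1 + 2 + 3 + ... + m + ... + 3 + 2 + 1 = m^2

With an even number of steps (N = 2m):

  • Total distance S = 1 + 2 + 3 + ... + m + m + ... + 3 + 2 + 1 = m(m + 1)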
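Both totals follow from the arithmetic series. A short derivation, using the same symbols S, m and N as above:

```latex
\begin{aligned}
N = 2m - 1:\quad S &= \sum_{k=1}^{m} k + \sum_{k=1}^{m-1} k
                    = \frac{m(m+1)}{2} + \frac{(m-1)m}{2} = m^2,\\
N = 2m:\quad S &= 2\sum_{k=1}^{m} k = m(m+1).
\end{aligned}
```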
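(3) Algorithm

Given the distance D = |y - x|, find the smallest m whose total distance S is at least D.

  • Initialize m = 1

  • Check:

    • If D <= m^2, the minimum number of steps is N = 2m - 1
    • If m^2 < D <= m(m + 1), the minimum number of steps is N = 2m
    • Otherwise, m += 1 and repeat the check.
    • (In both the odd and the even case, m is the largest step length in the sequence.)

In other words: given D, decide whether it falls in the range D <= m^2 or in the range m^2 < D <= m(m + 1). The former takes 2m - 1 steps and the latter 2m. If the distance reachable with the odd step count already covers D, the odd count is enough, because any overshoot can be removed by shortening some of the steps; if it does not, the even count is needed.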

def solution(x_position, y_position):
    d = abs(x_position - y_position)
    m = 1
    while True:
        if m * m >= d:
            return 2 * m -1
        elif m * m < d <= m * (m + 1):
            return 2 * m
        m += 1

The d = 0 case also has to be handled:

def solution(x_position, y_position):
    d = abs(x_position - y_position)
    if d == 0:
        return 0
    m = 1
    while True:
        if m * m >= d:
            return 2 * m -1
        elif m * m < d <= m * (m + 1):
            return 2 * m
        m += 1

Reference code:

def solution(x_position, y_position):
    D = abs(x_position - y_position)
    if D == 0:
        return 0

    m = 0
    while True:
        m += 1
        if D <= m * m:
            return 2 * m - 1
        elif D <= m * (m + 1):
            return 2 * m
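The loop over m can also be replaced by an integer square root. The sketch below is my own variant of the same case analysis, not code from the referenced posts; min_steps_closed_form is a hypothetical name and math.isqrt requires Python 3.8 or later.

```python
from math import isqrt

def min_steps_closed_form(x_position, y_position):
    d = abs(x_position - y_position)
    if d == 0:
        return 0
    m = isqrt(d)              # largest m with m * m <= d
    if m * m == d:
        return 2 * m - 1      # exactly the odd-length sequence 1..m..1
    if d <= m * (m + 1):
        return 2 * m          # fits within the even-length sequence 1..m,m..1
    return 2 * m + 1          # needs the next odd length, with peak m + 1

print(min_steps_closed_form(12, 6))
```

This runs in constant time, whereas the loop takes about sqrt(D) iterations; for the input sizes in this problem either is fine.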

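References

第一篇算法博客,计算x到y的最少步数

计算A到B的最小步数_博问_博客园

计算位置x到y的最小步数_编程语言-CSDN问答

2024-10-6青训营刷题笔记-计算位置x到y的最少步数 - 掘金

MarosCode题目——计算位置 x 到 y 的最少步数 - 掘金

字节笔试 9.25 3/4_牛客网

字节跳动2023秋招研发第五场笔试【后端方向】题解_牛客网

Summary

I did not expect to run into a problem this hard (for me) in the easy category. I had no idea how to approach it, and leaning on AI led me into quite a few pitfalls, so learning from other people's write-ups was the way forward. I later found out that it is a ByteDance written-test question, which explains the difficulty.

In the end I solved it by borrowing someone else's idea and worked hard to understand it; it is a very clever algorithm.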

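2. Minimal representation of a circular DNA sequence (4) (Circular Sequence, ACM/ICPC Seoul 2004, UVa1584)

Idea: reuse the usual find-the-minimum pattern. Take the original string as min, then compare each rotation against min and keep whichever is smaller. In each iteration, use slicing to rotate dna_sequence left by one position.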

def solution(dna_sequence):
    min = dna_sequence
    for i in range(len(dna_sequence) - 1):
        dna_sequence = dna_sequence[1::] + dna_sequence[0]
        if dna_sequence < min:
            min = dna_sequence
    return dna_sequence

if __name__ == "__main__":
    #  You can add more test cases here
    print(solution("ATCA") == "AATC")
    print(solution("CGAGTC") == "AGTCCG")
    print(solution("TCATGGAGTGCTCCTGGAGGCTGAGTCCATCTCCAGTAG") == "AGGCTGAGTCCATCTCCAGTAGTCATGGAGTGCTCCTGG")

It should return min:

def solution(dna_sequence):
    min = dna_sequence
    for i in range(len(dna_sequence) - 1):
        dna_sequence = dna_sequence[1::] + dna_sequence[0]
        if dna_sequence < min:
            min = dna_sequence
    return min

if __name__ == "__main__":
    #  You can add more test cases here
    print(solution("ATCA") == "AATC")
    print(solution("CGAGTC") == "AGTCCG")
    print(solution("TCATGGAGTGCTCCTGGAGGCTGAGTCCATCTCCAGTAG") == "AGGCTGAGTCCATCTCCAGTAGTCATGGAGTGCTCCTGG")

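I then suspected the for-loop bound and changed it to run len(dna_sequence) times, so that every rotation from the first to the last is compared. (The original string plus len - 1 rotations already covers every rotation, so this change is harmless rather than required.)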

def solution(dna_sequence):
    min = dna_sequence
    for i in range(len(dna_sequence)):
        dna_sequence = dna_sequence[1::] + dna_sequence[0]
        if dna_sequence < min:
            min = dna_sequence
    return min

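The second test case still printed False; the expected value in that test case was itself wrong.

After correcting it, the result is correct.

Another suggestion: do not name the variable min, since it shadows the built-in function; min_seq is a better name.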
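For comparison, the same idea can be written with the built-in min over all rotations, which also avoids the naming clash. This is a compact sketch rather than the platform's reference solution; smallest_rotation is a hypothetical name.

```python
def smallest_rotation(dna_sequence):
    # An empty sequence has no rotations to compare.
    if not dna_sequence:
        return dna_sequence
    # Rotation i moves the first i characters to the end.
    return min(dna_sequence[i:] + dna_sequence[:i]
               for i in range(len(dna_sequence)))

print(smallest_rotation("ATCA"))
print(smallest_rotation("CGAGTC"))
```

Like the loop version, this compares up to n strings of length n, which is fine for short sequences.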

References

环状序列(算法竞赛入门经典二)-阿里云开发者社区

环状序列字典序_将环形字符串按字典序排序的方法-CSDN博客

python 环形序列_环形字符串 python-CSDN博客

环状序列(ACM/ICPC Seoul 2004,UVa1584)_2004acm环状序列原题-CSDN博客

UVA1584 环状序列 Circular Sequence 题解(洛谷)

字典序最小问题(贪心)-腾讯云开发者社区-腾讯云

环状序列(Circular Sequence,ACM/ICPC Seoul 2004,UVa1584)题解_2004acm环状序列原题-CSDN博客

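3. Base32 encoding and decoding (13)

Encoding

  1. Convert the string to binary

Express the string foo as its sequence of ASCII codes, 102 111 111, then convert that sequence to binary: 01100110 01101111 01101111

Converting to ASCII in Python

ASCII code to character: chr(int), with codes 0-127 for ASCII

Character to ASCII code: ord(str), which takes a string of exactly one character. (ord also accepts non-ASCII characters and returns their Unicode code point, but this problem only needs ASCII input.)

Base conversion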

# -*- coding: UTF-8 -*-
 
# Filename : test.py
# author by : www.runoob.com
 
# Read a decimal number from the user
dec = int(input("Enter a number: "))
 
print("Decimal:", dec)
print("Binary:", bin(dec))
print("Octal:", oct(dec))
print("Hexadecimal:", hex(dec))

python3 test.py 
Enter a number: 5
Decimal: 5
Binary: 0b101
Octal: 0o5
Hexadecimal: 0x5

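The binary string comes back with a 0b prefix, which has to be removed for this problem.

Padding to eight binary digits:

Python strings have a zfill method that pads with zeros on the left, which is very handy here.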

n = "123"
s = n.zfill(5)
assert s=="00123"

In Python, assert is a statement that checks whether a condition is true and raises an AssertionError if it is false. Code such as assert s=="00123" states the expectation that s equals "00123".
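Putting the pieces together, the character-to-bits step can be sketched with format(), which produces zero-padded binary without a 0b prefix in a single call. The helper name to_bits is hypothetical, and ASCII input is assumed.

```python
def to_bits(text):
    # format(n, "08b") renders n in binary, left-padded with zeros to 8 digits.
    return "".join(format(ord(ch), "08b") for ch in text)

print(to_bits("foo"))

# Equivalent to stripping the prefix by slicing and then padding with zfill:
assert bin(102)[2:].zfill(8) == format(102, "08b")
```

Slicing with [2:] is safer than lstrip('0b') for removing the prefix, because lstrip removes any leading '0' and 'b' characters rather than the literal prefix (bin(0) would become an empty string).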

  2. If the number of bits is not a multiple of 5, append 0s at the end until it is

Ways to pad with zeros:

# Left-align the original string and pad with zeros on the right:
# str.ljust(width, '0')
'789'.ljust(32, '0')   # '78900000000000000000000000000000'

# Right-align the original string and pad with zeros on the left.
# Option 1: str.rjust(width, '0')
'798'.rjust(32, '0')   # '00000000000000000000000000000798'

# Option 2: str.zfill(width)
'123'.zfill(32)        # '00000000000000000000000000000123'

# Option 3: '%0Nd' % n (n must be an integer)
'%032d' % 89           # '00000000000000000000000000000089'

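  3. Split into groups of 5 bits

Splitting a string with slicing and a loop

A common approach is to slice in a loop: pick a step size, start at the beginning of the string, and cut off a fixed-length substring each time. For example: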

def split_string_by_length(string, length):
    return [string[i:i+length] for i in range(0, len(string), length)]

string = "Python is a powerful programming language."
length = 10
result = split_string_by_length(string, length)
print(result)
['Python is ', 'a powerful', ' programmi', 'ng languag', 'e.']

The online editor did not accept this form, so I rewrote it as an explicit loop:

five_bit = []
for i in range(0, len(eight_bin), 5):
    five_bit.append(eight_bin[i: i + 5])

Output: ['01100', '11001', '10111', '10110', '11110']

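  4. Convert each group to a decimal index, then to the corresponding character

Reference: Python3 字典 | 菜鸟教程

  5. Use the remainder of the original bit count divided by 40 to decide how many + characters to append

    1. Remainder 0: no + is appended
    2. Remainder 8: append 6 +
    3. Remainder 16: append 4 +
    4. Remainder 24: append 3 +
    5. Remainder 32: append 1 +

The output was wrong at first. On inspection, the length of the eight_bin string showed as 24 and I suspected hidden characters; the real cause was that the zero padding appended one bit to eight_bin while the variable name stayed the same. Lesson: when code modifies a value, assign the result to a new variable.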
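The five encoding steps can be condensed into one short function. This is a sketch under the assumptions used in this write-up: the 32-character alphabet is the index table from the solution code further down, padding uses '+', and the input is ASCII. The names ALPHABET, PLUS_FOR_REMAINDER and base32_encode are hypothetical.

```python
# Index -> character table, written as a string so that ALPHABET[i] is the character for index i.
ALPHABET = "9876543210mnbvcxzasdfghjklpoiuyt"
# Original bit count modulo 40 -> number of '+' to append.
PLUS_FOR_REMAINDER = {0: 0, 8: 6, 16: 4, 24: 3, 32: 1}

def base32_encode(raw):
    # Steps 1-2: 8 bits per character, then zero-pad on the right to a multiple of 5.
    bits = "".join(format(ord(ch), "08b") for ch in raw)
    padded = bits + "0" * (-len(bits) % 5)
    # Steps 3-4: each 5-bit group is an index into the alphabet.
    chars = "".join(ALPHABET[int(padded[i:i + 5], 2)]
                    for i in range(0, len(padded), 5))
    # Step 5: the '+' padding depends on the unpadded bit count.
    return chars + "+" * PLUS_FOR_REMAINDER[len(bits) % 40]

print(base32_encode("foo"))
```

The expression -len(bits) % 5 evaluates to 0 when the length is already a multiple of 5, so no extra group is added in that case.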

Decoding

  1. Strip the trailing plus signs and convert to binary
# Decoding
    # Build the decoding table (same character order as the encoding table)
    characters = ['9', '8', '7', '6', '5', '4', '3', '2', '1', '0', 
              'm', 'n', 'b', 'v', 'c', 'x', 'z', 'a', 's', 'd', 
              'f', 'g', 'h', 'j', 'k', 'l', 'p', 'o', 'i', 'u', 
              'y', 't']
    indices = list(range(0, 32))
    # Build the dictionary with a dict comprehension
    base32_dict = {characters[i]: indices[i] for i in range(len(characters))}
    # print(base32_dict)
    # Strip the trailing plus signs and convert to binary
    drop_plus = encodedStr.rstrip('+')
    # print(drop_plus)
    five_bins = ''
    for c in drop_plus:
        # look up the loop variable c, not the literal string 'c'
        index = base32_dict.get(c)
        five_bins += str(bin(index)).lstrip('0b').zfill(5)
    # print(five_bins)

Printing five_bins for the input "bljhy+++" gives

0110011001101111011011110

  2. Use the number of trailing plus signs to recover the original length and drop the zero padding

# 2. Use the number of trailing '+' to recover the original bit length
    plus_count = encodedStr.count('+')
    # number of '+' -> original bit count modulo 40
    rem_by_plus = {6: 8, 4: 16, 3: 24, 1: 32, 0: 0}
    rem_new = rem_by_plus[plus_count]
    # full 40-bit blocks plus the remainder implied by the padding
    len_eight_bins = len(five_bins) // 40 * 40 + rem_new
    # print(len_eight_bins)
    drop_end_zero = five_bins[:len_eight_bins]

This removes the zero padding at the end.

  3. Convert to decimal ASCII codes
# 3. Convert to decimal ASCII codes
    # Split the bit string into groups of eight
    eight_bits = []
    for i in range(0, len(drop_end_zero), 8):
        eight_bits.append(drop_end_zero[i: i + 8])
    print(eight_bits)
    result2 = ''
    for eight_bit in eight_bits:
        ascii = int(eight_bit, 2)
        print(ascii)
        result2 += chr(ascii) 
    print(result2)

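Printed separately, the pieces were correct, but the test still reported False because the return statement was wrong; I changed it to return result1 + ":" + result2.

The first two test cases now pass but the last one does not: the string to decode is "bljhy+++b0zj5+++", so the code has to split it into two encoded segments first.

Strictly speaking, a Base32 encoding should not look like that.

I moved the decoding logic into a function, split encodedStr into segments, decoded each one, and concatenated the results into the final result2. Note that a segment with no trailing plus signs cannot be split off this way, which may produce wrong results.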

# 解码
    def encode(s):
        # 定义解码字典
        characters = ['9', '8', '7', '6', '5', '4', '3', '2', '1', '0', 
                'm', 'n', 'b', 'v', 'c', 'x', 'z', 'a', 's', 'd', 
                'f', 'g', 'h', 'j', 'k', 'l', 'p', 'o', 'i', 'u', 
                'y', 't']
        indices = list(range(0, 32))
        # 使用字典推导式构建字典
        base32_dict = {characters[i]: indices[i] for i in range(len(characters))}
        # print(base32_dict)
        # 除去末尾的加号,转为二进制
        print(s)
        drop_plus = s.rstrip('+')
        print(drop_plus)
        five_bins = ''
        for c in drop_plus:
            index = base32_dict.get(c)
            if index is None:
                raise ValueError(f"字符 {c} 不是有效的 Base32 字符。")
            print(index)
            five_bins += str(bin(index)).lstrip('0b').zfill(5)
        print(five_bins)
        # 2. Use the '+' count of this segment to recover the original bit length, then drop the zero padding
        plus_count = s.count('+')
        # number of '+' -> original bit count modulo 40
        rem_by_plus = {6: 8, 4: 16, 3: 24, 1: 32, 0: 0}
        rem_new = rem_by_plus[plus_count]
        # full 40-bit blocks plus the remainder implied by the padding
        len_eight_bins = len(five_bins) // 40 * 40 + rem_new
        # print(len_eight_bins)
        drop_end_zero = five_bins[:len_eight_bins]
        print(drop_end_zero)
        # 3.转为十进制ASCII编号
        # 把字符串按八位分组
        eight_bits = []
        for i in range(0, len(drop_end_zero), 8):
            eight_bits.append(drop_end_zero[i: i + 8])
        print(eight_bits)
        result2 = ''
        for eight_bit in eight_bits:
            ascii = int(eight_bit, 2)
            print(ascii)
            result2 += chr(ascii) 
        print(result2)
        return result2
    # 分割多段base32编码
    # print(encodedStr)
    result = re.findall(r'\w+\+*', encodedStr)
    result = [s for s in result if s]  # 去除空字符串
    print("result:" + str(result))
    result2 = ""
    for s in result:
        result2 += encode(s)
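result = re.findall(r'\w+\+*', encodedStr)
  • re.findall() finds every substring matched by the regular expression and returns them as a list
  • \w+ matches one or more word characters
  • \+* matches zero or more consecutive + characters

All three test cases now pass.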
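The decoding steps can be condensed in the same way as the encoding. The sketch below assumes the same 32-character table and '+' padding; instead of mapping the '+' count back to a bit length, it keeps the largest whole number of bytes, which gives the same result for well-formed input. The names ALPHABET, INDEX, decode_segment and base32_decode are hypothetical.

```python
import re

ALPHABET = "9876543210mnbvcxzasdfghjklpoiuyt"
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}  # character -> 5-bit index

def decode_segment(segment):
    # Drop the '+' padding and expand every character to its 5-bit group.
    body = segment.rstrip("+")
    bits = "".join(format(INDEX[ch], "05b") for ch in body)
    # The original data is whole bytes, so leftover bits are zero padding.
    usable = len(bits) - len(bits) % 8
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, usable, 8))

def base32_decode(encoded):
    # A segment is a run of non-'+' characters followed by its '+' padding.
    segments = re.findall(r"[^+]+\+*", encoded)
    return "".join(decode_segment(s) for s in segments)

print(base32_decode("bljhy+++b0zj5+++"))
```

Because a segment without '+' padding always encodes a multiple of 40 bits, it stays byte-aligned when it runs into the next segment, so this variant does not depend on being able to split such segments apart.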

Complete code:

import re

def solution(rawStr, encodedStr):
    # 将字符串每个字符转为ASCII码,再转为八位二进制
    def toASCII_tobin(rawStr):
        res = ''
        for i in rawStr:
            res += str(bin(ord(i))).lstrip('0b').zfill(8)
        return res
    eight_bin = toASCII_tobin(rawStr)
    # print(eight_bin)
    # If the bit count is not a multiple of 5, append 0s until it is (adds nothing when it already is)
    eight_bin_zero = eight_bin + '0' * (-len(eight_bin) % 5)
    # 按5bit进行为一组进行分组
    five_bits = []
    for i in range(0, len(eight_bin_zero), 5):
        five_bits.append(eight_bin_zero[i: i + 5])
    # print(five_bits)
    # 建立Base32 的索引 - 字符转换表
    base32 = {
  0: '9', 1: '8', 2: '7', 3: '6', 4: '5', 5: '4', 6: '3', 7: '2', 
  8: '1', 9: '0', 10: 'm', 11: 'n', 12: 'b', 13: 'v', 14: 'c', 15: 'x', 
  16: 'z', 17: 'a', 18: 's', 19: 'd', 20: 'f', 21: 'g', 22: 'h', 23: 'j', 
  24: 'k', 25: 'l', 26: 'p', 27: 'o', 28: 'i', 29: 'u', 30: 'y', 31: 't'
}
    # 转十进制,再转为对应的字符
    dec_num_char = ''
    for five_bit in five_bits:
        dec_num = int(five_bit, 2)
        dec_num_char += base32.get(dec_num)
        # print(dec_num)
        # print(dec_num_char)
    # 判断末尾需要补 0 的数目
    rem = len(eight_bin) % 40
    # print(rem)
    result1 = dec_num_char
    if rem == 8:
        result1 += '+' * 6
    elif rem == 16:
        result1 += '+' * 4
    elif rem == 24:
        result1 += '+' * 3
    elif rem == 32:
        result1 += '+' * 1
    print(result1)
    # 解码
    def encode(s):
        # 定义解码字典
        characters = ['9', '8', '7', '6', '5', '4', '3', '2', '1', '0', 
                'm', 'n', 'b', 'v', 'c', 'x', 'z', 'a', 's', 'd', 
                'f', 'g', 'h', 'j', 'k', 'l', 'p', 'o', 'i', 'u', 
                'y', 't']
        indices = list(range(0, 32))
        # 使用字典推导式构建字典
        base32_dict = {characters[i]: indices[i] for i in range(len(characters))}
        # print(base32_dict)
        # 除去末尾的加号,转为二进制
        print(s)
        drop_plus = s.rstrip('+')
        print(drop_plus)
        five_bins = ''
        for c in drop_plus:
            index = base32_dict.get(c)
            if index is None:
                raise ValueError(f"字符 {c} 不是有效的 Base32 字符。")
            print(index)
            five_bins += str(bin(index)).lstrip('0b').zfill(5)
        print(five_bins)
        # 2. Use the '+' count of this segment to recover the original bit length, then drop the zero padding
        plus_count = s.count('+')
        # number of '+' -> original bit count modulo 40
        rem_by_plus = {6: 8, 4: 16, 3: 24, 1: 32, 0: 0}
        rem_new = rem_by_plus[plus_count]
        # full 40-bit blocks plus the remainder implied by the padding
        len_eight_bins = len(five_bins) // 40 * 40 + rem_new
        # print(len_eight_bins)
        drop_end_zero = five_bins[:len_eight_bins]
        print(drop_end_zero)
        # 3.转为十进制ASCII编号
        # 把字符串按八位分组
        eight_bits = []
        for i in range(0, len(drop_end_zero), 8):
            eight_bits.append(drop_end_zero[i: i + 8])
        print(eight_bits)
        result2 = ''
        for eight_bit in eight_bits:
            ascii = int(eight_bit, 2)
            print(ascii)
            result2 += chr(ascii) 
        print(result2)
        return result2
    # 分割多段base32编码
    # print(encodedStr)
    result = re.findall(r'\w+\+*', encodedStr)
    result = [s for s in result if s]  # 去除空字符串
    print("result:" + str(result))
    result2 = ""
    for s in result:
        result2 += encode(s)
    return result1 + ":" + result2

if __name__ == "__main__":
    #  You can add more test cases here
    print(solution("foo", "bljhy+++") == "bljhy+++:foo" )
    print(solution("foo", "b0zj5+++") == "bljhy+++:bar" )
    print(solution("The encoding process represents 40-bit groups of input bits as output strings of 8 encoded characters.  Proceeding from left to right, a 40-bit input group is formed by concatenating 5 8bit input groups. These 40 bits are then treated as 8 concatenated 5-bit groups, each of which is translated into a single character in the base 32 alphabet.  When a bit stream is encoded via the base 32 encoding, the bit stream must be presumed to be ordered with the most-significant- bit first. That is, the first bit in the stream will be the high- order bit in the first 8bit byte, the eighth bit will be the low- order bit in the first 8bit byte, and so on.", "bljhy+++b0zj5+++") == "maf3m164vlahyl60vlds9i6svuahmiod58l3mi6sbglhmodfcbz61b8vb0fj1162c0jjmi6d58jhb160vlk2mu89b0fj1il9b4ls9oogcak2mu89cvp25pncbuls9oo359i79lncbvjh1ln558ahzknsb4aj1lnscbj7917zc0jh3ln4bafhill9bll3yo09vashbu89cajs9id0buf21n89b5z61b8vb0fj1160vlk2mu89bul3yunz58fj3163vul3pln558a2s166vuj33knfbgj37u60vlds9v0928a3su89v4j29unf58dj5oogc8lsi17fv8sj3l093zk79kd0cals9knsbfz21p64vkz21id4b4p3ml89b4ls9c89bvjhiko8cashiknfbgs79v0vb0fj1162c0jjmi6d4zz3mkn6v9z3yla9cuf3sko158fj316fc0zhiiobb4p3ml89v4j21ol9b5z23pncbuh3m166v8zj5kn6casj5160vkz21p6458a37io459ld5168vak3zkn7bgp7i189muf3moa9b5z35pnf58lj1id4b4hs9pnd58shikoxbash116hv4zs9u61bfz35kndbfz63ba9bgj33oo5v4j3cn89caf3m167v4p79iofc0sh7o09vgpj3u89b0ss9i6sbgljmon4bzz21ol9b0ss9oosbasj5ln558ohsu6158p3zl09vgjj3u8vcvfhcod0blfh3kncczhs9kd0czz3bpnscvp7i17fv8zj1160cbh79u61bfz3bpnscvp79kd0czz3soa9caf3m16dcal3mknv58ohso6b58a3m16fv8ss9p60buf7p16xc0s3mia9b0fj1160vkz21p6458d3siddczz6zkd0czz35ynfbfh79u61bfz3mpn2v8p3z167v4p79uo0vah79kd458p3zl09vajjcn09vul31lns58a3su89v4j79u61bfz3bpnscvp79c67v4p79kdlcassk168vls79iox58jhinz+:foobar" )

Final version, with the debugging print statements removed:

import re

def solution(rawStr, encodedStr):
    # Convert each character to its ASCII code, then to 8-bit binary
    def toASCII_tobin(rawStr):
        res = ''
        for i in rawStr:
            res += str(bin(ord(i))).lstrip('0b').zfill(8)
        return res
    eight_bin = toASCII_tobin(rawStr)
    # If the bit count is not a multiple of 5, append 0s until it is
    # (-n % 5 is 0 when n is already a multiple of 5)
    eight_bin_zero = eight_bin + '0' * (-len(eight_bin) % 5)
    # Split into groups of 5 bits
    five_bits = []
    for i in range(0, len(eight_bin_zero), 5):
        five_bits.append(eight_bin_zero[i: i + 5])
    # Base32 index -> character table
    base32 = {
  0: '9', 1: '8', 2: '7', 3: '6', 4: '5', 5: '4', 6: '3', 7: '2', 
  8: '1', 9: '0', 10: 'm', 11: 'n', 12: 'b', 13: 'v', 14: 'c', 15: 'x', 
  16: 'z', 17: 'a', 18: 's', 19: 'd', 20: 'f', 21: 'g', 22: 'h', 23: 'j', 
  24: 'k', 25: 'l', 26: 'p', 27: 'o', 28: 'i', 29: 'u', 30: 'y', 31: 't'
}
    # Convert each group to decimal, then to its character
    dec_num_char = ''
    for five_bit in five_bits:
        dec_num = int(five_bit, 2)
        dec_num_char += base32.get(dec_num)
    # Decide how many '+' to append
    rem = len(eight_bin) % 40
    result1 = dec_num_char
    if rem == 8:
        result1 += '+' * 6
    elif rem == 16:
        result1 += '+' * 4
    elif rem == 24:
        result1 += '+' * 3
    elif rem == 32:
        result1 += '+' * 1
    # Decoding
    def decode(s):
        # Decoding table (same character order as the encoding table)
        characters = ['9', '8', '7', '6', '5', '4', '3', '2', '1', '0', 
                'm', 'n', 'b', 'v', 'c', 'x', 'z', 'a', 's', 'd', 
                'f', 'g', 'h', 'j', 'k', 'l', 'p', 'o', 'i', 'u', 
                'y', 't']
        base32_dict = {ch: i for i, ch in enumerate(characters)}
        # 1. Strip the trailing '+' and convert to binary
        drop_plus = s.rstrip('+')
        five_bins = ''
        for c in drop_plus:
            index = base32_dict.get(c)
            if index is None:
                raise ValueError(f"Character {c} is not a valid Base32 character.")
            five_bins += str(bin(index)).lstrip('0b').zfill(5)
        # 2. Use the '+' count of this segment to recover the original bit length
        plus_count = s.count('+')
        # number of '+' -> original bit count modulo 40
        rem_by_plus = {6: 8, 4: 16, 3: 24, 1: 32, 0: 0}
        rem_new = rem_by_plus[plus_count]
        # full 40-bit blocks plus the remainder implied by the padding
        len_eight_bins = len(five_bins) // 40 * 40 + rem_new
        drop_end_zero = five_bins[:len_eight_bins]
        # 3. Convert to decimal ASCII codes, eight bits at a time
        eight_bits = []
        for i in range(0, len(drop_end_zero), 8):
            eight_bits.append(drop_end_zero[i: i + 8])
        result2 = ''
        for eight_bit in eight_bits:
            ascii = int(eight_bit, 2)
            result2 += chr(ascii) 
        return result2
    # Split encodedStr into its Base32 segments
    result = re.findall(r'\w+\+*', encodedStr)
    result = [s for s in result if s]  # drop empty strings
    result2 = ""
    for s in result:
        result2 += decode(s)
    return result1 + ":" + result2

if __name__ == "__main__":
    #  You can add more test cases here
    print(solution("foo", "bljhy+++") == "bljhy+++:foo" )
    print(solution("foo", "b0zj5+++") == "bljhy+++:bar" )
    print(solution("The encoding process represents 40-bit groups of input bits as output strings of 8 encoded characters.  Proceeding from left to right, a 40-bit input group is formed by concatenating 5 8bit input groups. These 40 bits are then treated as 8 concatenated 5-bit groups, each of which is translated into a single character in the base 32 alphabet.  When a bit stream is encoded via the base 32 encoding, the bit stream must be presumed to be ordered with the most-significant- bit first. That is, the first bit in the stream will be the high- order bit in the first 8bit byte, the eighth bit will be the low- order bit in the first 8bit byte, and so on.", "bljhy+++b0zj5+++") == "maf3m164vlahyl60vlds9i6svuahmiod58l3mi6sbglhmodfcbz61b8vb0fj1162c0jjmi6d58jhb160vlk2mu89b0fj1il9b4ls9oogcak2mu89cvp25pncbuls9oo359i79lncbvjh1ln558ahzknsb4aj1lnscbj7917zc0jh3ln4bafhill9bll3yo09vashbu89cajs9id0buf21n89b5z61b8vb0fj1160vlk2mu89bul3yunz58fj3163vul3pln558a2s166vuj33knfbgj37u60vlds9v0928a3su89v4j29unf58dj5oogc8lsi17fv8sj3l093zk79kd0cals9knsbfz21p64vkz21id4b4p3ml89b4ls9c89bvjhiko8cashiknfbgs79v0vb0fj1162c0jjmi6d4zz3mkn6v9z3yla9cuf3sko158fj316fc0zhiiobb4p3ml89v4j21ol9b5z23pncbuh3m166v8zj5kn6casj5160vkz21p6458a37io459ld5168vak3zkn7bgp7i189muf3moa9b5z35pnf58lj1id4b4hs9pnd58shikoxbash116hv4zs9u61bfz35kndbfz63ba9bgj33oo5v4j3cn89caf3m167v4p79iofc0sh7o09vgpj3u89b0ss9i6sbgljmon4bzz21ol9b0ss9oosbasj5ln558ohsu6158p3zl09vgjj3u8vcvfhcod0blfh3kncczhs9kd0czz3bpnscvp7i17fv8zj1160cbh79u61bfz3bpnscvp79kd0czz3soa9caf3m16dcal3mknv58ohso6b58a3m16fv8ss9p60buf7p16xc0s3mia9b0fj1160vkz21p6458d3siddczz6zkd0czz35ynfbfh79u61bfz3mpn2v8p3z167v4p79uo0vah79kd458p3zl09vajjcn09vul31lns58a3su89v4j79u61bfz3bpnscvp79c67v4p79kdlcassk168vls79iox58jhinz+:foobar" )

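Result: all three test cases pass.

Possible improvements

Encoding and decoding are both fairly involved, so I ended up with a lot of variables along the way; the naming is probably not very consistent and the logic is not as lean as it could be.

  1. Writing encoding and decoding as two separate functions would cut down on the number of variable names and make the logic clearer.

References

青训营刷题-13 Base32编码和解码 - 掘金

Summary

The idea behind this problem is not hard; it only requires following the steps one by one. The implementation as a whole is fairly involved, though, and takes a good amount of time.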

4. DNA sequence edit distance (16)

Idea

Dynamic programming

Example (climbing stairs):

  • Solution:

    • Stages: split by the stair number, 0 to n.
    • State: dp[i] is the number of ways to reach stair i.
    • Transition: dp[i] = dp[i-1] + dp[i-2]
    • Initial conditions: dp[0] = 1, dp[1] = 1
    • Answer: dp[n]
class Solution:
    def climbStairs(self, n: int) -> int:
        # State (at least two entries, so that dp[1] exists even when n is 0)
        dp = [0 for _ in range(0, max(n, 1) + 1)]
        # Initial conditions:
        dp[0] = 1
        dp[1] = 1
        # Transition
        for i in range(2, n + 1):
            dp[i] = dp[i - 1] + dp[i - 2]
        return dp[n]

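Define a 2D array dp where dp[i][j] is the minimum number of operations needed to turn the first i characters of dna1 into the first j characters of dna2.

  • Initial state: dp[0][0] = 0, because turning an empty string into an empty string needs no operation. dp[i][0] = i, because turning the first i characters of dna1 into an empty string takes i deletions. dp[0][j] = j, because turning an empty string into the first j characters of dna2 takes j insertions.

  • Transition: if dna1[i-1] == dna2[j-1] (the i-th and j-th characters are equal), then dp[i][j] = dp[i-1][j-1], because the last characters match and no extra operation is needed. If dna1[i-1] != dna2[j-1], consider three operations:

    • Insert: dp[i][j] = dp[i][j-1] + 1
    • Delete: dp[i][j] = dp[i-1][j] + 1
    • Replace: dp[i][j] = dp[i-1][j-1] + 1

Notes:

  1. dp is initialized with row indices in [0, len(dna1) + 1), so every prefix of dna1, from 0 characters up to len(dna1) characters, can be converted toward dna2.
  2. If dna1[i-1] != dna2[j-1], take the minimum over delete, insert, and replace.

My code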

def solution(dna1, dna2):
    # Please write your code here
    # State
    # Rows follow the length of dna1, columns the length of dna2
    dp = [[0 for _ in range(len(dna2) + 1)] for _ in range(len(dna1) + 1)]
    # Initial conditions
    dp[0][0] = 0
    # dp[i][0] = i: column 0 equals the row index
    for i in range(1, len(dna1) + 1):
        dp[i][0] = i
    # dp[0][j] = j: row 0 equals the column index
    for j in range(1, len(dna2) + 1):
        dp[0][j] = j
    for i in range(len(dna1) + 1):
        for j in range(len(dna2) + 1):
            if dna1[i-1] == dna2[j-1]:
                dp[i][j] = dp[i-1][j-1]
            else:
                dp[i][j] = min(dp[i][j-1] + 1, dp[i-1][j] + 1, dp[i-1][j-1] + 1)
    return dp[len(dna1)][len(dna2)]

if __name__ == "__main__":
    #  You can add more test cases here
    print(solution("AGCTTAGC", "AGCTAGCT") == 2 )
    print(solution("AGCCGAGC", "GCTAGCT") == 4)

The output was wrong: the loops that fill dp must start at 1, because row 0 and column 0 are already initialized.

def solution(dna1, dna2):
    # Please write your code here
    # State
    # Rows follow the length of dna1, columns the length of dna2
    dp = [[0 for _ in range(len(dna2) + 1)] for _ in range(len(dna1) + 1)]
    # Initial conditions
    dp[0][0] = 0
    # dp[i][0] = i: column 0 equals the row index
    for i in range(1, len(dna1) + 1):
        dp[i][0] = i
    # dp[0][j] = j: row 0 equals the column index
    for j in range(1, len(dna2) + 1):
        dp[0][j] = j
    for i in range(1, len(dna1) + 1):
        for j in range(1, len(dna2) + 1):
            if dna1[i-1] == dna2[j-1]:
                dp[i][j] = dp[i-1][j-1]
            else:
                dp[i][j] = min(dp[i][j-1] + 1, dp[i-1][j] + 1, dp[i-1][j-1] + 1)
    return dp[len(dna1)][len(dna2)]

if __name__ == "__main__":
    #  You can add more test cases here
    print(solution("AGCTTAGC", "AGCTAGCT") == 2 )
    print(solution("AGCCGAGC", "GCTAGCT") == 4)

Correct.

Reference code

def solution(dna1, dna2):
    len1 = len(dna1)
    len2 = len(dna2)
    dp = [[0] * (len2 + 1) for _ in range(len1 + 1)]

    for i in range(1, len1 + 1):
        dp[i][0] = i

    for j in range(1, len2 + 1):
        dp[0][j] = j

    for i in range(1, len1 + 1):
        for j in range(1, len2 + 1):
            if dna1[i - 1] == dna2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                dp[i][j] = min(
                    dp[i - 1][j] + 1,  # delete
                    dp[i][j - 1] + 1,  # insert
                    dp[i - 1][j - 1] + 1  # replace
                )

    return dp[len1][len2]

# Test cases
print(solution("AGCTTAGC", "AGCTAGCT") == 2)
print(solution("AGCCGAGC", "GCTAGCT") == 4)
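Since row i of dp only depends on row i - 1, the table can be reduced to two rows. The sketch below is a space-saving variant of the same recurrence, not part of the original solution; edit_distance_two_rows is a hypothetical name.

```python
def edit_distance_two_rows(dna1, dna2):
    # prev holds dp[i-1][*]; it starts as row 0, where dp[0][j] = j.
    prev = list(range(len(dna2) + 1))
    for i, a in enumerate(dna1, start=1):
        curr = [i]  # dp[i][0] = i deletions
        for j, b in enumerate(dna2, start=1):
            if a == b:
                curr.append(prev[j - 1])          # characters match, no extra cost
            else:
                curr.append(1 + min(prev[j],      # delete
                                    curr[j - 1],  # insert
                                    prev[j - 1])) # replace
        prev = curr
    return prev[-1]

print(edit_distance_two_rows("AGCTTAGC", "AGCTAGCT"))
```

The running time is still proportional to len(dna1) * len(dna2); only the memory drops to one row of length len(dna2) + 1 plus the row being built.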

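Summary

I am not very familiar with dynamic programming, so I had no idea where to start with this problem. Dynamic programming seems well suited to questions of the form "what is the minimum number of steps from one state to another". The hard part here is defining the state and working out the transition; more practice with dynamic programming should make that come more naturally.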

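5. Find the unique number card

Idea

Use collections.Counter from the Python standard library to count the numbers in the list; the one that appears exactly once is the answer.

Counter usage example: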

from collections import Counter

c = Counter(['red', 'blue', 'red', 'green', 'blue', 'blue'])
print(c)  # Counter({'blue': 3, 'red': 2, 'green': 1}) 

Code

from collections import Counter
def solution(inp):
    # Edit your code here
    c = Counter(inp)
    # print(c)
    # e.g. c = Counter({1: 2, 2: 2, 3: 2, 5: 2, 4: 1})
    for key, count in c.items():
        if count == 1:
            # Return the first number that occurs exactly once
            return key
    # No number occurs exactly once
    return None

if __name__ == "__main__":
    # Add your test cases here

    print(solution([1, 1, 2, 2, 3, 3, 4, 5, 5]) == 4)
    print(solution([0, 1, 0, 1, 2]) == 2)
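If every other card appears exactly twice, as in both test cases above (this is my assumption about the problem statement, not something quoted from it), the unique card can also be found with XOR and no extra memory. The name find_unique_card is hypothetical.

```python
from functools import reduce
from operator import xor

def find_unique_card(cards):
    # x ^ x == 0 and x ^ 0 == x, so paired values cancel and the single one remains.
    # Only valid when every other value occurs an even number of times.
    return reduce(xor, cards, 0)

print(find_unique_card([1, 1, 2, 2, 3, 3, 4, 5, 5]))
```

The Counter version is the safer choice when that assumption is not guaranteed.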

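6. Number string formatting

Idea

  • Strip the useless leading zeros
  • Take the integer part and insert a comma every three digits from right to left

Unfinished code:

# Strip the leading zeros
drop_zero = s.lstrip('0')
# print(drop_zero)
# Split into the integer and fractional parts
parts = drop_zero.split('.')
# ['1294512', '12412']
# print(parts)
# write code here

  • I later learned that Python's string formatting can do this directly

Python 3, str.format:

print("{:,}".format(1234567))  # prints: 1,234,567

Python 3.6+, f-string:

large_number = 1000000
print(f"{large_number:,}")

Code

def solution(s: str) -> str:
    # Decimal number
    if '.' in s:
        return "{:,}".format(float(s))
    # Integer
    return "{:,}".format(int(s))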
    

if __name__ == '__main__':
    print(solution("1294512.12412") == '1,294,512.12412')
    print(solution("0000123456789.99") == '123,456,789.99')
    print(solution("987654321") == '987,654,321')
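Going through float can change the fractional part (for example, trailing zeros are dropped and long fractions are rounded). A sketch that finishes the string-based idea from the unfinished code and leaves the fraction untouched; format_number_string is a hypothetical name, and non-negative decimal input is assumed.

```python
def format_number_string(s):
    # Split once at the decimal point; dot and fraction are empty for integers.
    integer_part, dot, fraction = s.partition(".")
    # Remove leading zeros but keep a single "0" for inputs such as "0.5".
    integer_part = integer_part.lstrip("0") or "0"
    # Group only the integer part; the fraction is passed through unchanged.
    return "{:,}".format(int(integer_part)) + dot + fraction

print(format_number_string("0000123456789.99"))
```

Only the integer part is ever converted to a number, so precision of the fractional digits is not a concern.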

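7. Even sums from digit groups

Idea

  • Brute force, DP, or DFS
  • Three digits sum to an even number when they are two odd plus one even, or three even. Generalizing to any number of groups: the parity of the sum is determined by how many odd digits were picked, and the sum is even exactly when that count is even.
  • Working it out by hand, a tree structure would probably be easier.

Brute force:

itertools is a module in the Python standard library that provides efficient iterator tools. They work on iterables (lists, tuples, dicts, and so on) and build more complex iterators from them. The module suits combinations, permutations, and Cartesian products, and is more concise than hand-written loops.

Using an iterator avoids writing nested loops, which would not work anyway when the number of groups is not known in advance.

itertools.product(*iterables, repeat=1)

  • Purpose: generate the Cartesian product of the input iterables.
  • Usage:

import itertools
for p in itertools.product([1, 2], ['a', 'b']):
    print(p)

Output: (1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')

Code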

import itertools

def solution(numbers):
    # Build the list of possible choices for each group
    groups = [list(str(num)) for num in numbers]  # each group becomes a list of digit characters
    
    # Enumerate every combination
    all_combinations = itertools.product(*groups)  # *groups unpacks the lists as separate arguments to product
    
    # Count the combinations whose digit sum is even
    count = 0
    for combination in all_combinations:
        digit_sum = sum(int(digit) for digit in combination)  # sum of the chosen digits
        if digit_sum % 2 == 0:  # the digit sum is even
            count += 1
    
    return count

if __name__ == "__main__":
    # You can add more test cases here
    print(solution([123, 456, 789]) == 14)
    print(solution([123456789]) == 4)
    print(solution([14329, 7568]) == 10)
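The parity observation from the idea list also gives a counting solution that never enumerates the combinations. The sketch below is my own DP variant under the same reading of the problem (one digit is picked from each group); count_even_sums is a hypothetical name, and non-negative integers are assumed.

```python
def count_even_sums(numbers):
    # even_ways / odd_ways: number of partial selections with an even / odd digit sum.
    even_ways, odd_ways = 1, 0  # the empty selection has sum 0, which is even
    for num in numbers:
        digits = [int(ch) for ch in str(num)]
        odd_digits = sum(d % 2 for d in digits)
        even_digits = len(digits) - odd_digits
        # An even digit keeps the parity; an odd digit flips it.
        even_ways, odd_ways = (
            even_ways * even_digits + odd_ways * odd_digits,
            even_ways * odd_digits + odd_ways * even_digits,
        )
    return even_ways

print(count_even_sums([123, 456, 789]))
```

The work is proportional to the total number of digits, whereas the brute-force version grows with the product of the group sizes.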

References

刷题笔记-day01 blog.51cto.com

和为偶数的数字组合 深度优先搜索(dfs)-CSDN博客

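8. Creative title matching

(A wildcard is a string enclosed in a pair of {}; it can be replaced by 0 or more characters.)

Input: n = 4, template = "ad{xyz}cdc{y}f{x}e", titles = ["adcdcefdfeffe", "adcdcefdfeff", "dcdcefdfeffe", "adcdcfe"] Output: [True, False, False, True]

Idea

Regular expressions

  1. Convert template into a regular expression by turning each {xxx} into .*

    1. A wildcard {xxx} is matched by the regular expression \{.*?\}
    2. The ? makes the match non-greedy, so it consumes as few characters as possible and matches only the contents of a single {}.
    3. re.sub(pattern, replacement, string) performs the replacement

\ is the escape character

  2. Match the converted template against each title; this needs fullmatch

Supplement:

  • re.match tries to match a pattern at the start of the string; if the match does not succeed at the start, match() returns None.

Function signature

re.match(pattern, string, flags=0)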

#!/usr/bin/python
import re
 
line = "Cats are smarter than dogs"
 
matchObj = re.match( r'(.*) are (.*?) .*', line, re.M|re.I)
 
if matchObj:
   print "matchObj.group() : ", matchObj.group()
   print "matchObj.group(1) : ", matchObj.group(1)
   print "matchObj.group(2) : ", matchObj.group(2)
else:
   print "No match!!"

结果

matchObj.group() :  Cats are smarter than dogs
matchObj.group(1) :  Cats
matchObj.group(2) :  smarter
  • .:表示匹配任意单个字符(除了换行符 \n)。

*:表示前面的字符可以出现 0 次或更多次。

.*:任意字符(除换行符外)出现 0 次或更多次

Code

import re
def solution(n, template, titles):
    # Please write your code here
    result = ""
    # Convert the template into a regex by replacing each {xxx} with .*
    # ad{xyz}cdc{y}f{x}e -> ad.*cdc.*f.*e
    re_template = re.sub(r"{.*?}", ".*", template)
    # print(re_template)
    for title in titles:
        if re.fullmatch(re_template, title):
            result += "True,"
        else:
            result += "False,"
    # Drop the trailing ','
    result = result[0: -1]
    # print(result)
    return result
if __name__ == "__main__":
    #  You can add more test cases here
    testTitles1 = ["adcdcefdfeffe", "adcdcefdfeff", "dcdcefdfeffe", "adcdcfe"]
    testTitles2 = ["CLSomGhcQNvFuzENTAMLCqxBdj", "CLSomNvFuXTASzENTAMLCqxBdj", "CLSomFuXTASzExBdj", "CLSoQNvFuMLCqxBdj", "SovFuXTASzENTAMLCq", "mGhcQNvFuXTASzENTAMLCqx"]
    testTitles3 = ["abcdefg", "abefg", "efg"]

    print(solution(4, "ad{xyz}cdc{y}f{x}e", testTitles1) == "True,False,False,True" )
    print(solution(6, "{xxx}h{cQ}N{vF}u{XTA}S{NTA}MLCq{yyy}", testTitles2) == "False,False,False,False,False,True" )
    print(solution(3, "a{bdc}efg", testTitles3) == "True,True,False" )

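9. ## Merging Intervals for a Dot Counter

Approach

Sort the list in ascending order by the first element of each sublist. If the previous interval's end >= the next interval's start, merge them

Code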

def solution(inputArray):
    # Please write your code here
    # Sort by the first element of each sublist
    # [[1,4], [7, 10], [3, 5]] -> [[1, 4], [3, 5], [7, 10]]
    sorted_array = sorted(inputArray, key = lambda x: x[0])
    merged_array = []
    i = 0
    while i < len(inputArray) - 1:
        if inputArray[i][1] >= inputArray[i+1][0]:
            # Merge
            merged_array.append([inputArray[i][1], inputArray[i+1][0]])
            i += 2
        else:
            merged_array.append(inputArray[i])
            i += 1
    # Count the points
    count = 0
    for array in merged_array:
        for _ in range(array[0], array[1] + 1):
            count += 1
    return count

if __name__ == "__main__":
    #  You can add more test cases here
    testArray1 = [[1,4], [7, 10], [3, 5]]
    testArray2 = [[1,2], [6, 10], [11, 15]]

    print(solution(testArray1) == 7 )
    print(solution(testArray2) == 9 )

The output is False

  • The merge logic is wrong: it should be merged_array.append([inputArray[i][0], inputArray[i+1][1]]), not merged_array.append([inputArray[i][1], inputArray[i+1][0]])
  • The merge has to run over sorted_array, not inputArray

def solution(inputArray):
    # Please write your code here
    # Sort by the first element of each sublist
    # [[1,4], [7, 10], [3, 5]] -> [[1, 4], [3, 5], [7, 10]]
    sorted_array = sorted(inputArray, key = lambda x: x[0])
    merged_array = []
    i = 0
    while i < len(sorted_array) - 1:
        if sorted_array[i][1] >= sorted_array[i+1][0]:
            # Merge
            merged_array.append([sorted_array[i][0], sorted_array[i+1][1]])
            i += 2
        else:
            merged_array.append(sorted_array[i])
            i += 1
    # If the last interval was not merged, it should be added to merged_array
    # (left unfinished in this attempt)
    print(merged_array)
    # Count the points
    count = 0
    for array in merged_array:
        for _ in range(array[0], array[1] + 1):
            count += 1
    return count

if __name__ == "__main__":
    #  You can add more test cases here
    testArray1 = [[1,4], [7, 10], [3, 5]]
    testArray2 = [[1,2], [6, 10], [11, 15]]

    print(solution(testArray1) == 7 )
    print(solution(testArray2) == 9 )

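The code now has two problems:

  1. If the last interval does not need merging, it should be added to merged_array, but it never is, because the loop exits when i points at the second-to-last interval
  2. The merge logic itself is wrong. It is not enough to compare neighbouring pairs once and append the result to a new list, because a merged interval may still overlap the next one and has to be compared again

Revised merge logic:

If the input is not empty, first add the first interval to merged_array. Then walk through the sorted list and compare each interval with the last one in merged_array: merge if they overlap, append otherwise

I also found that the expected values of the samples in my code were wrong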

def solution(inputArray):
    # Please write your code here
    if not inputArray:
        return 0
    # Sort by the first element of each sublist
    # [[1,4], [7, 10], [3, 5]] -> [[1, 4], [3, 5], [7, 10]]
    sorted_array = sorted(inputArray, key = lambda x: x[0])
    # Merge overlapping intervals
    merged_array = []
    merged_array.append(sorted_array[0])
    for i in range(1, len(sorted_array)):
        if merged_array[-1][1] >= sorted_array[i][0]:
            # Merging does not have to append a new interval; updating the right bound of merged_array[-1] is enough
            merged_array[-1][1] = sorted_array[i][1]
        else:
            merged_array.append(sorted_array[i])
    print(merged_array)
    # Count the points
    count = 0
    for array in merged_array:
        for _ in range(array[0], array[1] + 1):
            count += 1
    return count

if __name__ == "__main__":
    #  You can add more test cases here
    testArray1 = [[1,4], [7, 10], [3, 5]]
    testArray2 = [[1,2], [6, 10], [11, 15]]
    testArray3 = [[1, 3], [2, 6], [8, 10]]

    print(solution(testArray1) == 9 )
    print(solution(testArray2) == 12 )
    print(solution(testArray3) == 9 )

The samples pass, but the submission fails

Test case input:
[WARN]    inputArray = [[6,18],[2,16],[12,16],[5,16],[8,10],[1,9],[7,21],[2,3],[7,21],[6,7],[1,24],[9,17],[1,4],[12,18],[2,17],[4,19],[9,22],[8,24],[13,21],[7,8],[19,22],[22,23],[6,14]]
[WARN] Your output:
[WARN]    23
[WARN] Expected output:
[WARN]    24

When merging, the right bound should be updated to the larger of the two right bounds

merged_array[-1][1] = max(merged_array[-1][1], sorted_array[i][1])
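A small case shows why the max matters: when an interval lies entirely inside the previous one, plain assignment would shrink the merged interval. The sketch below is a compact variant that counts points while sweeping, without storing the merged list; count_points is a hypothetical name, and it assumes every interval is an inclusive [start, end] pair of integers.

```python
def count_points(intervals):
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Close the previous merged interval before opening a new one.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlap: keep the larger right bound, never shrink it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


# [2, 3] is contained in [1, 10]; without max the right bound would drop to 3.
print(count_points([[1, 10], [2, 3]]))  # 10
```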

Final code:

def solution(inputArray):
    # Please write your code here
    if not inputArray:
        return 0
    # Sort by the first element of each sublist
    # [[1,4], [7, 10], [3, 5]] -> [[1, 4], [3, 5], [7, 10]]
    sorted_array = sorted(inputArray, key = lambda x: x[0])
    # Merge overlapping intervals
    merged_array = []
    merged_array.append(sorted_array[0])
    for i in range(1, len(sorted_array)):
        if merged_array[-1][1] >= sorted_array[i][0]:
            # Merging does not have to append a new interval; updating the right bound of merged_array[-1] is enough
            merged_array[-1][1] = max(merged_array[-1][1], sorted_array[i][1])
        else:
            merged_array.append(sorted_array[i])
    print(merged_array)
    # Count the points
    count = 0
    for array in merged_array:
        for _ in range(array[0], array[1] + 1):
            count += 1
    return count

if __name__ == "__main__":
    #  You can add more test cases here
    testArray1 = [[1,4], [7, 10], [3, 5]]
    testArray2 = [[1,2], [6, 10], [11, 15]]
    testArray3 = [[1, 3], [2, 6], [8, 10]]

    print(solution(testArray1) == 9 )
    print(solution(testArray2) == 12 )
    print(solution(testArray3) == 9 )


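10. ## Finding the Largest Full House (test samples are wrong)

Approach

First find all full houses (three of a kind plus a pair), check whether each one stays within the max limit, and then pick the largest. A helper function that compares two full houses would be useful

The test samples for this problem are wrong, so I am skipping it for now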

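11. ## Finding the Number That Makes Up More Than Half of an Integer Array

Approach

Exactly one number makes up more than half of the array, so it is enough to find the most frequent number. most_common() does this directly

most_common()

Returns a list of the n most common elements in the counter together with their counts. If n is omitted or None, most_common() returns all elements in the counter. Elements with equal counts are ordered by first appearance

from collections import Counter

list1 = ["a", "a", "a", "b", "c", "f", "g", "g", "c", "11", "g", "f", "10", "2"]
print(Counter(list1).most_common(3))
# Result: [('a', 3), ('g', 3), ('c', 2)]

# Swapping "c" and "f" changes the result
list2 = ["a", "a", "a", "b", "f", "c", "g", "g", "c", "11", "g", "f", "10", "2"]
print(Counter(list2).most_common(3))
# Result: [('a', 3), ('g', 3), ('f', 2)]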

For this problem most_common(1) is enough
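Because a strict majority is guaranteed, the same answer can also be found without a counter. This is a sketch of the Boyer-Moore majority vote, a different technique from the Counter approach used in this post; majority_element is a hypothetical name, and the result is only meaningful when some value really does occupy more than half of the array.

```python
def majority_element(array):
    candidate = None
    votes = 0
    for value in array:
        if votes == 0:
            # No surviving candidate: adopt the current value.
            candidate = value
            votes = 1
        elif value == candidate:
            votes += 1
        else:
            # A different value cancels one vote of the candidate.
            votes -= 1
    return candidate


print(majority_element([1, 3, 8, 2, 3, 1, 3, 3, 3]))  # 3
```

It makes a single pass and uses constant extra memory, while Counter stores one entry per distinct value.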

Code

from collections import Counter
def solution(array):
    # Edit your code here
    c = Counter(array)
    # print(c.most_common(1))
    # [(3, 5)]
    return c.most_common(1)[0][0]

if __name__ == "__main__":
    # Add your test cases here

    print(solution([1, 3, 8, 2, 3, 1, 3, 3, 3]) == 3)

Medium Problems

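12. ## Supplies on a Hiking Trip

Approach

  • Greedy algorithm + min-heap

First K days: Xiao R can stock up, since he can carry at most K portions of food. During this period he can buy the cheaper food in one go to cover the next few days, without visiting a supply station every day.

From day K onward: Xiao R eats 1 portion per day and can carry at most K portions, so on day K he must restock for that day and the days after it. From then on he has to make sure he has enough food every day, which means picking the cheapest food from the heap each day starting on day K.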

import heapq
def solution(n, k, data):
    # Edit your code here
    total_cost = 0
    min_heap = []
    for i in range(n):
        heapq.heappush(min_heap, data[i])
        # The heap holds the prices of k days
        if len(min_heap) > k:
            heapq.heappop(min_heap)
        # From day k onward, food has to be bought every day
        if i + 1 >= k:
            total_cost += min_heap[0]
    return total_cost

if __name__ == "__main__":
    # Add your test cases here

    print(solution(5, 2, [1, 2, 3, 3, 2]) == 9)
    print(solution(6, 3, [4, 1, 5, 2, 1, 3]) == 9)
    print(solution(4, 1,  [3, 2, 4, 1]) == 10)

This code only passes the first sample
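One likely reason is that heapq.heappop removes the smallest price, not the oldest one, so the heap does not represent the last k days. Another is that the first k - 1 days are never charged. A possible fix, sketched below under my own reading of the problem, is a sliding-window minimum: assuming food bought on day j can be eaten on any of days j to j + k - 1, each day's portion costs the cheapest price among the last k days. min_supply_cost is a hypothetical name.

```python
from collections import deque


def min_supply_cost(n, k, prices):
    window = deque()  # day indices whose prices are strictly increasing
    total = 0
    for day in range(n):
        # Drop days that are no cheaper than today: they can never be the minimum again.
        while window and prices[window[-1]] >= prices[day]:
            window.pop()
        window.append(day)
        # Drop the front if it fell out of the last k days.
        if window[0] <= day - k:
            window.popleft()
        # Today's portion is bought at the cheapest price in the window.
        total += prices[window[0]]
    return total


print(min_supply_cost(5, 2, [1, 2, 3, 3, 2]))  # 9
```

On the three samples listed in this post it returns 9, 9, and 10.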

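  • Dynamic programming

State: dp[i][j] is the minimum cost to end day i carrying j portions of food.

Initial state: dp[0][0] = 0

Target state: dp[N][0]

State update:

Suppose we are carrying j portions today. We can buy buy portions, where buy ranges from 0 to K - j.

The food bought plus the j portions already carried must be enough to get through the next day.

Each update adds buy * data[i] to the cost.

Transition:

The amount carried at the end of the next day is new_j = j + buy - 1 (Xiao R eats 1 portion while walking from day i to day i+1).

Try every purchase option and keep the cheapest: dp[i+1][new_j] = min(dp[i+1][new_j], dp[i][j] + buy * data[i])

Code explanation

  • dp = [[float('inf')] * (k + 1) for _ in range(n + 1)]

This creates an (n + 1) x (k + 1) two-dimensional list dp whose elements all start at positive infinity (float('inf')).

inf means infinity

Each row is one day, and the columns are the number of portions carried, from 0 to k:

[[0, 1, 2, 3, k],
 [0, 1, 2, 3, k],
 [0, 1, 2, 3, k]]

Code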

def solution(n, k, data):
    # Initialize the dp table; dp[i][j] is the minimum cost to end day i carrying j portions of food
    dp = [[float('inf')] * (k + 1) for _ in range(n + 1)]
    dp[0][0] = 0  # Day 0: no food, cost 0

    for i in range(n):
        for j in range(k + 1):
            if dp[i][j] == float('inf'):
                continue  # Skip unreachable states
            # Going from day i to day i+1 consumes one portion
            # We can buy `buy` portions, with buy ranging from 0 to K - j
            for buy in range(k - j + 1):  # Carrying j portions, at most (K - j) more can be bought
                if j + buy >= 1:  # At least one portion is needed to get through the next day
                    new_j = j + buy - 1  # Portions carried at the end of the next day
                    dp[i + 1][new_j] = min(dp[i + 1][new_j], dp[i][j] + buy * data[i])

    # Return the minimum cost of ending day n with 0 portions left
    return dp[n][0]


if __name__ == "__main__":
    # Test case
    print(solution(5, 2, [1, 2, 3, 3, 2]) == 9)  # prints True


13. ## Largest Rectangle Area

Approach

Brute force

Code

def solution(n, array):
    # Edit your code here
    r_max = 0
    for k in range(1, n + 1):
        for i in range(n):
            j = i + k - 1
            if j >= n:
                break
            # Compute min(h[i], h[i+1], ..., h[i+k-1])
            min_h = 0
            for m in range(i, j + 1):
                min_h = min(min_h, array[m])
            if k * min_h > r_max:
                r_max = k * min_h
    return r_max

if __name__ == "__main__":
    # Add your test cases here
    print(solution(5, [1, 2, 3, 4, 5]))
    print(solution(5, [1, 2, 3, 4, 5]) == 9)
    print(solution(6, [5, 4, 3, 2, 1, 6]) == 9)
    print(solution(4, [4, 4, 4, 4]) == 16)

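The result is wrong: min_h starts at 0, so min(min_h, array[m]) is always 0 and every area comes out as 0

Improvement:

  1. j is not needed. Bound i directly in the for loop and take the minimum of each slice
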
def solution(n, array):
    # Edit your code here
    r_max = 0
    for k in range(1, n + 1):
        for i in range(n - k + 1):
            # Compute min(h[i], h[i+1], ..., h[i+k-1])
            min_h = min(array[i: i + k])
            if k * min_h > r_max:
                r_max = k * min_h
    return r_max

if __name__ == "__main__":
    # Add your test cases here
    # print(solution(5, [1, 2, 3, 4, 5]))
    print(solution(5, [1, 2, 3, 4, 5]) == 9)
    print(solution(6, [5, 4, 3, 2, 1, 6]) == 9)
    print(solution(4, [4, 4, 4, 4]) == 16)
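The brute-force version above recomputes a slice minimum for every window, so it is roughly cubic in n. For larger inputs, a common alternative is a monotonic stack, sketched here as a different technique from the one used in this post. largest_rectangle is a hypothetical name, and it assumes the area of a window is its length times its minimum height, as in the code above.

```python
def largest_rectangle(heights):
    stack = []  # indices of bars with strictly increasing heights
    best = 0
    # A trailing 0 acts as a sentinel that flushes every bar left on the stack.
    for i, h in enumerate(list(heights) + [0]):
        while stack and heights[stack[-1]] >= h:
            height = heights[stack.pop()]
            # The popped bar is the minimum from just after the new stack top up to i - 1.
            left = stack[-1] + 1 if stack else 0
            best = max(best, height * (i - left))
        stack.append(i)
    return best


print(largest_rectangle([1, 2, 3, 4, 5]))  # 9
```

Each index is pushed and popped at most once, so the running time is linear in n.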

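14. ## Minimum Replacement Substring Length

Problem description

Xiao F has a special string that contains only the characters A, S, D, and F, and its length is always a multiple of 4. His task is to make the four characters A, S, D, and F occur equally often using as few replacements as possible. Find the minimum length of the substring that has to be replaced to achieve this.


Test samples

Sample 1:

Input: input = "ADDF" Output: 1

Sample 2:

Input: input = "ASAFASAFADDD" Output: 3

Sample 3:

Input: input = "SSDDFFFFAAAS" Output: 1

Sample 4:

Input: input = "AAAASSSSDDDDFFFF" Output: 0

Sample 5:

Input: input = "AAAADDDDAAAASSSS" Output: 4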

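Approach

Sliding window

A counter holds the number of occurrences of each letter outside the window

A left pointer left and a right pointer right delimit the substring to be replaced

For example, with 8 letters, the string is balanced when every letter occurs at most 2 times outside the window

Both pointers start at 0. While the string is unbalanced, right moves right to enlarge the substring. Once it is balanced, left moves right to check whether a shorter substring still balances the string. The minimum replacement length is updated throughout

If the string is unbalanced and right cannot move any further, the loop would never end. No shorter replacement substring can balance the string at that point, so break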
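The same idea can be written with a for loop over right, which removes the need for an explicit break. This is a minimal sketch, not the code submitted for the problem; min_replace_window and its alphabet parameter are hypothetical names.

```python
from collections import Counter


def min_replace_window(s, alphabet="ASDF"):
    n = len(s)
    target = n // 4
    outside = Counter(s)  # occurrences of each letter outside the window

    def balanced():
        # The window can be rewritten freely, so only the letters outside it matter.
        return all(outside[c] <= target for c in alphabet)

    if balanced():
        return 0
    best = n
    left = 0
    for right, ch in enumerate(s):
        outside[ch] -= 1  # s[right] enters the window
        while left <= right and balanced():
            best = min(best, right - left + 1)
            outside[s[left]] += 1  # s[left] leaves the window
            left += 1
    return best


print(min_replace_window("ASAFASAFADDD"))  # 3
```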

Code

My code

def solution(input):
    # Please write your code here
    # Helper that checks whether the string is currently balanced
    def is_balanced(d: dict, avg: int):
        for v in d.values():
            if v > avg:
                return False
        return True
    d = {'A': 0, 'S': 0, 'D': 0, 'F': 0}
    for c in input:
        d[c] += 1
    n = len(input)
    avg = n // 4
    # Already balanced: nothing to replace, so the minimum replacement length is 0
    if is_balanced(d, avg):
        return 0
    left = 0
    right = 0
    ans = n
    d[input[left]] -= 1
    while left <= right and right < n:
        if is_balanced(d, avg):
            ans = min(ans, right - left + 1)
            d[input[left]] += 1
            left += 1
        # right may only move when the string is unbalanced and right < n - 1
        elif right < n - 1:
            right += 1
            d[input[right]] -= 1
        else:
            break
    return ans

if __name__ == "__main__":
    #  You can add more test cases here
    print(solution("ADDF") == 1 )
    print(solution("ASAFASAFADDD") == 3)
    print(solution("AAAASSSSDDDDFFFF") == 0)

Reference code:

import unittest
from typing import Dict


class Solution:
    def isBalanced(self, d: Dict, avg: int):
        for v in d.values():
            if v > avg:
                return False

        return True

    def balancedString(self, s: str) -> int:
        n = len(s)
        avg = n // 4
        d = {"Q": 0, "W": 0, "E": 0, "R": 0}
        for c in s:
            d[c] += 1

        if self.isBalanced(d, avg):
            return 0

        left = 0
        right = 0
        ans = n
        d[s[0]] -= 1
        while left <= right and right < n:
            if self.isBalanced(d, avg):
                ans = min(ans, right - left + 1)
                d[s[left]] += 1
                left += 1
            elif right < n - 1:
                right += 1
                d[s[right]] -= 1
            else:
                break

        return ans


def test(testObj: unittest.TestCase, s: str, expected: int) -> None:
    so = Solution()
    actual = so.balancedString(s)
    testObj.assertEqual(actual, expected)


class TestClass(unittest.TestCase):
    def test_1(self):
        test(self, "QWER", 0)

    def test_2(self):
        test(self, "QQWE", 1)

    def test_3(self):
        test(self, "QQQW", 2)

    def test_4(self):
        test(self, "WWEQERQWQWWRWWERQWEQ", 4)


if __name__ == "__main__":
    unittest.main()

"""
409ms, 68.21%
"""


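15. ## Bytedance Tree Problem (not done)

I had no idea how to approach it and found no existing solution, and the AI's code did not produce the correct result either, so I skipped it

16. ## Restoring the Original String

Approach

It looks like the task is to delete the repeated strings at the end

Brute force: find a substring at the end that repeats the one just before it, and delete it

Code

My code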

from queue import Queue
def solution(str1):
    q = Queue()
    q.put(str1)
    ans = str1
    while not q.empty():
        s1 = q.get()
        len_s1 = len(s1)
        for i in range(1, len_s1 // 2):
            t = s1
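            # s2 is the last i characters
            s2 = t[len_s1 - i:]
            # s3 is the i characters just before the last i
            s3 = t[len_s1 - 2 * i: len_s1 - i]
            # If they are equal, the last i characters repeat the previous ones, so drop them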
            if s2 == s3:
                q.put(t[: len_s1 - i])
                ans = t[: len_s1 - i]
    # If a string is taken from the queue and nothing is put back, it has no repetition; return the result
    return ans

  • Change for i in range(1, len_s1 // 2): to for i in range(1, len_s1 // 2 + 1):

from queue import Queue
def solution(str1):
    q = Queue()
    q.put(str1)
    ans = str1
    while not q.empty():
        s1 = q.get()
        len_s1 = len(s1)
        for i in range(1, len_s1 // 2 + 1):
            t = s1
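            # s2 is the last i characters
            s2 = t[len_s1 - i:]
            # s3 is the i characters just before the last i
            s3 = t[len_s1 - 2 * i: len_s1 - i]
            # If they are equal, the last i characters repeat the previous ones, so drop them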
            if s2 == s3:
                q.put(t[: len_s1 - i])
                ans = t[: len_s1 - i]
    # If a string is taken from the queue and nothing is put back, it has no repetition; return the result
    return ans

if __name__ == "__main__":
    # Add your test cases here

    print(solution("abbabbbabb") == "ab")
    print(solution("abbbabbbb") == "ab")
    print(
        solution(
            "jiabanbananananiabanbananananbananananiabanbananananbananananbananananbanananan"
        )
        == "jiaban"
    )
    print(
        solution(
            "selectecttectelectecttectcttectselectecttectelectecttectcttectectelectecttectcttectectcttectectcttectectcttect"
        )
        == "select"
    )
    print(
        solution(
            "discussssscussssiscussssscussssdiscussssscussssiscussssscussssiscussssscussss"
        )
        == "discus"
    )

Reference code:

from queue import Queue

def solution(str1):
    q = Queue()
    q.put(str1)
    ans = str1
    while not q.empty():
        s1 = q.get()
        len_s1 = len(s1)
        for i in range(1, len_s1 // 2 + 1):
            t = s1
            s2 = t[len_s1 - i:]  # the last i characters
            s3 = t[len_s1 - 2 * i: len_s1 - i]  # the i characters starting at len_s1 - 2*i
            if s2 == s3:
                t = s1[:len_s1 - i]  # drop the repeated tail
                q.put(t)
                if len(t) < len(ans):
                    ans = t
    return ans

if __name__ == "__main__":
    print(solution("abbabbbabb") == "ab")  # Expected output: True
    print(solution("abbbabbbb") == "ab")  # Expected output: True
    print(solution("jiabanbananananiabanbananananbananananiabanbananananbananananbananananbanananan") == "jiaban")  # Expected output: True
    print(solution("selectecttectelectecttectcttectselectecttectelectecttectcttectectelectecttectcttectectcttectectcttectectcttect") == "select")  # Expected output: True
    print(solution("discussssscussssiscussssscussssdiscussssscussssiscussssscussssiscussssscussss") == "discus")  # Expected output: True
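Both versions can put the same prefix into the queue many times, since different deletion orders often lead to the same string. A possible refinement, sketched below as my own variation rather than part of the reference, is to remember which strings were already queued. restore_shortest is a hypothetical name.

```python
from collections import deque


def restore_shortest(s):
    seen = {s}          # strings that were already queued
    queue = deque([s])
    best = s
    while queue:
        cur = queue.popleft()
        if len(cur) < len(best):
            best = cur
        n = len(cur)
        for i in range(1, n // 2 + 1):
            # The last i characters repeat the i characters before them.
            if cur[n - i:] == cur[n - 2 * i:n - i]:
                shorter = cur[:n - i]
                if shorter not in seen:
                    seen.add(shorter)
                    queue.append(shorter)
    return best


print(restore_shortest("abbbabbbb"))  # ab
```

Every queued string is a prefix of the input, so at most len(s) distinct strings are ever processed.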


Hard Problems

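17. ## Maximum UCC Substring Count (not done)

With at most m operations, each inserting, deleting, or replacing a single character, find the maximum number of UCC substrings

Sample 1:

Input: m = 3, s = "UCUUCCCCC" Output: 3

Sample 2:

Input: m = 6, s = "U" (with "ccucc" inserted) Output: 2

Sample 3:

Input: m = 2, s = "UCCUUU" Output: 2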

18. ## Binary Sum

Code

Brute-force solution:

def solution(binary1, binary2):
    # Please write your code here
    print(int(binary1, 2) + int(binary2, 2))
    return str(int(binary1, 2) + int(binary2, 2))

if __name__ == "__main__":
    #  You can add more test cases here
    print(solution("101", "110") == "11")
    print(solution("111111", "10100") == "83")
    print(solution("111010101001001011", "100010101001") == "242420")
    print(solution("111010101001011", "10010101001") == "31220")

Python's built-ins do the heavy lifting here: convert both strings to integers and add them. I am not sure what the time complexity is
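To make the cost explicit, the conversion can also be done by hand. The sketch below keeps each number as a list of decimal digits and doubles it once per input bit, which is quadratic in the length of the binary string. It is an illustration under my own assumptions, not what int() does internally; all three function names are hypothetical.

```python
def binary_to_decimal_digits(bits):
    digits = [0]  # decimal digits, least significant first
    for bit in bits:
        carry = int(bit)
        # digits = digits * 2 + bit, done digit by digit
        for i in range(len(digits)):
            value = digits[i] * 2 + carry
            digits[i] = value % 10
            carry = value // 10
        if carry:
            digits.append(carry)
    return digits


def add_decimal_digits(a, b):
    result = []
    carry = 0
    for i in range(max(len(a), len(b))):
        value = carry + (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
        result.append(value % 10)
        carry = value // 10
    if carry:
        result.append(carry)
    return result


def binary_sum(binary1, binary2):
    total = add_decimal_digits(
        binary_to_decimal_digits(binary1), binary_to_decimal_digits(binary2)
    )
    return "".join(str(d) for d in reversed(total))


print(binary_sum("101", "110"))  # 11
```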

That's a wrap 💐💐💐