Interview Coding Problem (Huawei): Edit Distance


Description

Given two words word1 and word2, find the minimum number of operations required to convert word1 to word2.

You have the following 3 operations permitted on a word:

  1. Insert a character
  2. Delete a character
  3. Replace a character

Example 1:

Input: word1 = "horse", word2 = "ros"
Output: 3
Explanation: 
horse -> rorse (replace 'h' with 'r')
rorse -> rose (remove 'r')
rose -> ros (remove 'e')

Example 2:

Input: word1 = "intention", word2 = "execution"
Output: 5
Explanation: 
intention -> inention (remove 't')
inention -> enention (replace 'i' with 'e')
enention -> exention (replace 'n' with 'x')
exention -> exection (replace 'n' with 'c')
exection -> execution (insert 'u')

Approach

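This is a dynamic programming problem. The Zhihu column referenced in the post on the 0/1 knapsack problem explains it in detail. (Unfortunately, I only came across that column after the exam, so I could not solve this one during the test :) It seems I failed every question that had any real difficulty.)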

In most cases, dp[i][j] is related in some way to dp[i-1][j], dp[i][j-1], and dp[i-1][j-1].

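Let dp[i][j] be the minimum number of operations needed to convert the first i characters of word1 into the first j characters of word2. In other words, the pair (i, j) describes the state of the two strings.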
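One consequence of this definition is that the boundary states are known up front: turning a prefix of length i into the empty string takes i deletions, and building a prefix of length j from the empty string takes j insertions. A minimal sketch of just that step, where the class name `EditDistanceBounds` and the helper `initTable` are hypothetical names chosen for illustration:

```java
class EditDistanceBounds {
    // Hypothetical helper: allocates the table and fills only the boundary states.
    static int[][] initTable(String word1, String word2) {
        int m = word1.length(), n = word2.length();
        int[][] dp = new int[m + 1][n + 1];
        for (int i = 0; i <= m; i++) {
            dp[i][0] = i; // delete all i characters of the word1 prefix
        }
        for (int j = 0; j <= n; j++) {
            dp[0][j] = j; // insert all j characters of the word2 prefix
        }
        return dp;
    }

    public static void main(String[] args) {
        int[][] dp = initTable("horse", "ros");
        System.out.println(dp[5][0]); // 5
        System.out.println(dp[0][3]); // 3
    }
}
```

All interior cells are still 0 at this point; they are the ones the recurrence has to fill.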


Sometimes the meaning of the array is not easy to find, and you have to work it out for yourself.


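  1. If the i-th character of word1 and the j-th character of word2 are equal (word1.charAt(i-1) and word2.charAt(j-1) in the code), no operation is needed, so dp[i][j] = dp[i-1][j-1]; the minimum edit distance does not change.
  2. If the two characters differ, an adjustment is required. There are three possible operations and we choose one of them. Each adds 1 to the edit distance, giving the following relations:
  • Replace the i-th character of word1 with the j-th character of word2: dp[i][j] = dp[i-1][j-1] + 1, that is, one more operation on top of the result for prefixes of lengths i-1 and j-1;
  • Insert a character equal to the j-th character of word2 at the end of word1: dp[i][j] = dp[i][j-1] + 1;
  • Delete the i-th character of word1: dp[i][j] = dp[i-1][j] + 1.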


When the characters differ, the recurrence is dp[i][j] = min(dp[i-1][j-1], dp[i][j-1], dp[i-1][j]) + 1; when they are equal, dp[i][j] = dp[i-1][j-1].
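The recurrence can also be read top-down, which some readers find easier to check against the three cases. The following is only a sketch under that reading, not the submitted solution: the class `EditDistanceMemo` and its method `solve` are hypothetical names, and the memo table uses -1 to mark states that have not been computed yet.

```java
import java.util.Arrays;

class EditDistanceMemo {
    private final String a;
    private final String b;
    private final int[][] memo; // memo[i][j] caches solve(i, j); -1 means "not computed"

    EditDistanceMemo(String word1, String word2) {
        a = word1;
        b = word2;
        memo = new int[a.length() + 1][b.length() + 1];
        for (int[] row : memo) {
            Arrays.fill(row, -1);
        }
    }

    // Minimum operations to turn the first i chars of a into the first j chars of b.
    int solve(int i, int j) {
        if (i == 0) return j; // only insertions remain
        if (j == 0) return i; // only deletions remain
        if (memo[i][j] != -1) return memo[i][j];

        int result;
        if (a.charAt(i - 1) == b.charAt(j - 1)) {
            result = solve(i - 1, j - 1);        // characters match: no operation
        } else {
            int replace = solve(i - 1, j - 1);
            int insert = solve(i, j - 1);
            int delete = solve(i - 1, j);
            result = Math.min(replace, Math.min(insert, delete)) + 1;
        }
        memo[i][j] = result;
        return result;
    }

    static int minDistance(String word1, String word2) {
        return new EditDistanceMemo(word1, word2).solve(word1.length(), word2.length());
    }

    public static void main(String[] args) {
        System.out.println(minDistance("horse", "ros"));           // 3
        System.out.println(minDistance("intention", "execution")); // 5
    }
}
```

Because it recurses, this version can go as deep as the combined length of the two words, so the bottom-up table is usually the safer choice for long inputs.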

class Solution {
    public int minDistance(String word1, String word2) {
        int s1Len = word1.length(), s2Len = word2.length();
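        int[][] dp = new int[s1Len+1][s2Len+1];
        // Compute the boundary values.
        // Each boundary value can only be derived from the previous boundary value + 1,
        // because when one string has length 0, the only way to reach the other is to add one character at a time.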
        dp[0][0] = 0;
        for(int i = 1; i<s1Len+1; i++){
            dp[i][0] = dp[i-1][0] + 1;
        }
        for(int i = 1; i<s2Len+1; i++){
            dp[0][i] = dp[0][i-1] + 1;
        }

        // Fill in the table
        for(int i = 1; i<s1Len+1; i++){
            for(int j = 1; j<s2Len+1; j++){
                if(word1.charAt(i-1) == word2.charAt(j-1)){
                    dp[i][j] = dp[i-1][j-1];
                }else{
                    dp[i][j] = Math.min(Math.min(dp[i-1][j-1], dp[i-1][j]), dp[i][j-1])+1;
                }
            }
        }
        return dp[s1Len][s2Len];
    }
}
Runtime: 5 ms, faster than 84.06% of Java online submissions for Edit Distance.
Memory Usage: 42 MB, less than 5.88% of Java online submissions for Edit Distance.
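
Since each row of the table depends only on the row above it and on the cell to its left, one possible way to reduce memory is to keep just two rows instead of the full table. This is a sketch under that assumption, with `EditDistanceTwoRows` as a hypothetical class name; I have not measured whether it changes the memory figure reported above.

```java
class EditDistanceTwoRows {
    static int minDistance(String word1, String word2) {
        int m = word1.length(), n = word2.length();
        int[] prev = new int[n + 1]; // plays the role of dp[i-1][*]
        int[] curr = new int[n + 1]; // plays the role of dp[i][*]

        for (int j = 0; j <= n; j++) {
            prev[j] = j; // row 0: build the word2 prefix by j insertions
        }
        for (int i = 1; i <= m; i++) {
            curr[0] = i; // column 0: delete all i characters
            for (int j = 1; j <= n; j++) {
                if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
                    curr[j] = prev[j - 1];
                } else {
                    curr[j] = Math.min(prev[j - 1], Math.min(prev[j], curr[j - 1])) + 1;
                }
            }
            // Swap the rows so that prev holds the row just computed.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[n];
    }

    public static void main(String[] args) {
        System.out.println(minDistance("horse", "ros"));           // 3
        System.out.println(minDistance("intention", "execution")); // 5
    }
}
```

The trade-off is that the full table is no longer available afterwards, so the actual sequence of edits cannot be reconstructed from it.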

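Finding the meaning of the array is crucial to solving the problem, because it determines the core of dynamic programming: the recurrence relation.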