MD5算法逆向快速分析

4,066 阅读16分钟

1. MD5基本原理

MD5即Message-Digest Algorithm 5(信息-摘要算法5),主要用于确保信息一致性。MD5原理简要概述为以512位分组来处理输入的信息,且每一分组又被划分为16个32位子分组,经过了一系列的处理后,算法的输出由四个32位分组组成,将这四个32位分组级联后将生成一个128位散列值。

2. 算法流程

下面是一个so库中调用标准的MD5算法加密的实例

F5转换成伪c代码,如下

一个标准的MD5算法,首先会按顺序进行MD5Init()、MD5Update()、MD5Final(),最后会计算出一个32bits的16进制字符串,然后如果有需要的话会进行以16进制的形式输出字符串(sprintf(str_tmp, "%02x", decrypt[v1])),否则就是看起来是乱码的形式。

  1. MD5Init()

这步是初始化变量,初始的128位值为初始链接变量,这些参数用于第一轮的运算,以大端字节序来表示,他们分别为: A=0x01234567,B=0x89ABCDEF,C=0xFEDCBA98,D=0x76543210。 (每一个变量给出的数值是高字节存于内存低地址,低字节存于内存高地址,即大端字节序。在程序中变量A、B、C、D的值分别为0x67452301,0xEFCDAB89,0x98BADCFE,0x10325476)

F5转换成伪c代码后,如下

基本上,在一般的so中看到初始化为0x67452301,0xEFCDAB89,0x98BADCFE,0x10325476这些值的基本可以判断使用了MD5

  1. MD5Update() 、 MD5Final()

每一分组的算法流程如下:

第一分组需要将上面四个链接变量复制到另外四个变量中:A到a,B到b,C到c,D到d。从第二分组开始的变量为上一分组的运算结果,即A = a, B = b, C = c, D = d。

主循环有四轮(MD4只有三轮),每轮循环都很相似。第一轮进行16次操作。每次操作对a、b、c和d中的其中三个作一次非线性函数运算,然后将所得结果加上第四个变量,文本的一个子分组和一个常数。再将所得结果向左环移一个不定的数,并加上a、b、c或d中之一。最后用该结果取代a、b、c或d中之一。

以下是每次操作中用到的四个非线性函数(每轮一个)。

F( X ,Y ,Z ) = ( X & Y ) | ( (~X) & Z )

G( X ,Y ,Z ) = ( X & Z ) | ( Y & (~Z) )

H( X ,Y ,Z ) =X ^ Y ^ Z

I( X ,Y ,Z ) =Y ^ ( X | (~Z) )

(&是与(And),|是或(Or),~是非(Not),^是异或(Xor))

这四个函数的说明:如果X、Y和Z的对应位是独立和均匀的,那么结果的每一位也应是独立和均匀的。

F是一个逐位运算的函数。即,如果X,那么Y,否则Z。函数H是逐位奇偶操作符。

假设Mj表示消息的第j个子分组(从0到15),常数ti是4294967296*abs( sin(i) )的整数部分,i取值从1到64,单位是弧度。(4294967296=232) 现定义:

FF(a ,b ,c ,d ,Mj ,s ,ti ) 操作为 a = b + ( (a + F(b,c,d) + Mj + ti) << s)

GG(a ,b ,c ,d ,Mj ,s ,ti ) 操作为 a = b + ( (a + G(b,c,d) + Mj + ti) << s)

HH(a ,b ,c ,d ,Mj ,s ,ti) 操作为 a = b + ( (a + H(b,c,d) + Mj + ti) << s)

II(a ,b ,c ,d ,Mj ,s ,ti) 操作为 a = b + ( (a + I(b,c,d) + Mj + ti) << s)

注意:“<<”表示循环左移位,不是左移位。

这四轮(共64步)是:

第一轮

FF(a ,b ,c ,d ,M0 ,7 ,0xd76aa478 )

FF(d ,a ,b ,c ,M1 ,12 ,0xe8c7b756 )

FF(c ,d ,a ,b ,M2 ,17 ,0x242070db )

FF(b ,c ,d ,a ,M3 ,22 ,0xc1bdceee )

FF(a ,b ,c ,d ,M4 ,7 ,0xf57c0faf )

FF(d ,a ,b ,c ,M5 ,12 ,0x4787c62a )

FF(c ,d ,a ,b ,M6 ,17 ,0xa8304613 )

FF(b ,c ,d ,a ,M7 ,22 ,0xfd469501)

FF(a ,b ,c ,d ,M8 ,7 ,0x698098d8 )

FF(d ,a ,b ,c ,M9 ,12 ,0x8b44f7af )

FF(c ,d ,a ,b ,M10 ,17 ,0xffff5bb1 )

FF(b ,c ,d ,a ,M11 ,22 ,0x895cd7be )

FF(a ,b ,c ,d ,M12 ,7 ,0x6b901122 )

FF(d ,a ,b ,c ,M13 ,12 ,0xfd987193 )

FF(c ,d ,a ,b ,M14 ,17 ,0xa679438e )

FF(b ,c ,d ,a ,M15 ,22 ,0x49b40821 )

第二轮

GG(a ,b ,c ,d ,M1 ,5 ,0xf61e2562 )

GG(d ,a ,b ,c ,M6 ,9 ,0xc040b340 )

GG(c ,d ,a ,b ,M11 ,14 ,0x265e5a51 )

GG(b ,c ,d ,a ,M0 ,20 ,0xe9b6c7aa )

GG(a ,b ,c ,d ,M5 ,5 ,0xd62f105d )

GG(d ,a ,b ,c ,M10 ,9 ,0x02441453 )

GG(c ,d ,a ,b ,M15 ,14 ,0xd8a1e681 )

GG(b ,c ,d ,a ,M4 ,20 ,0xe7d3fbc8 )

GG(a ,b ,c ,d ,M9 ,5 ,0x21e1cde6 )

GG(d ,a ,b ,c ,M14 ,9 ,0xc33707d6 )

GG(c ,d ,a ,b ,M3 ,14 ,0xf4d50d87 )

GG(b ,c ,d ,a ,M8 ,20 ,0x455a14ed )

GG(a ,b ,c ,d ,M13 ,5 ,0xa9e3e905 )

GG(d ,a ,b ,c ,M2 ,9 ,0xfcefa3f8 )

GG(c ,d ,a ,b ,M7 ,14 ,0x676f02d9 )

GG(b ,c ,d ,a ,M12 ,20 ,0x8d2a4c8a )

第三轮

HH(a ,b ,c ,d ,M5 ,4 ,0xfffa3942 )

HH(d ,a ,b ,c ,M8 ,11 ,0x8771f681 )

HH(c ,d ,a ,b ,M11 ,16 ,0x6d9d6122 )

HH(b ,c ,d ,a ,M14 ,23 ,0xfde5380c )

HH(a ,b ,c ,d ,M1 ,4 ,0xa4beea44 )

HH(d ,a ,b ,c ,M4 ,11 ,0x4bdecfa9 )

HH(c ,d ,a ,b ,M7 ,16 ,0xf6bb4b60 )

HH(b ,c ,d ,a ,M10 ,23 ,0xbebfbc70 )

HH(a ,b ,c ,d ,M13 ,4 ,0x289b7ec6 )

HH(d ,a ,b ,c ,M0 ,11 ,0xeaa127fa )

HH(c ,d ,a ,b ,M3 ,16 ,0xd4ef3085 )

HH(b ,c ,d ,a ,M6 ,23 ,0x04881d05 )

HH(a ,b ,c ,d ,M9 ,4 ,0xd9d4d039 )

HH(d ,a ,b ,c ,M12 ,11 ,0xe6db99e5 )

HH(c ,d ,a ,b ,M15 ,16 ,0x1fa27cf8 )

HH(b ,c ,d ,a ,M2 ,23 ,0xc4ac5665 )

第四轮

II(a ,b ,c ,d ,M0 ,6 ,0xf4292244 )

II(d ,a ,b ,c ,M7 ,10 ,0x432aff97 )

II(c ,d ,a ,b ,M14 ,15 ,0xab9423a7 )

II(b ,c ,d ,a ,M5 ,21 ,0xfc93a039 )

II(a ,b ,c ,d ,M12 ,6 ,0x655b59c3 )

II(d ,a ,b ,c ,M3 ,10 ,0x8f0ccc92 )

II(c ,d ,a ,b ,M10 ,15 ,0xffeff47d )

II(b ,c ,d ,a ,M1 ,21 ,0x85845dd1 )

II(a ,b ,c ,d ,M8 ,6 ,0x6fa87e4f )

II(d ,a ,b ,c ,M15 ,10 ,0xfe2ce6e0 )

II(c ,d ,a ,b ,M6 ,15 ,0xa3014314 )

II(b ,c ,d ,a ,M13 ,21 ,0x4e0811a1 )

II(a ,b ,c ,d ,M4 ,6 ,0xf7537e82 )

II(d ,a ,b ,c ,M11 ,10 ,0xbd3af235 )

II(c ,d ,a ,b ,M2 ,15 ,0x2ad7d2bb )

II(b ,c ,d ,a ,M9 ,21 ,0xeb86d391 )

所有这些完成之后,将a、b、c、d分别在原来基础上再加上A、B、C、D。 即a = a + A,b = b + B,c = c + C,d = d + D

关键的伪代码如下:

void __fastcall MD5Transform(unsigned int *state, unsigned __int8 *block)
{
  unsigned int *v2; // r4@1
  unsigned int v3; // r5@1
  unsigned int v4; // r6@1
  unsigned int v5; // ST14_4@1
  unsigned int v6; // r7@1
  int v7; // r1@1
  int v8; // r1@1
  int v9; // r2@1
 ------skip--------
  int v132; // r0@1
  int v133; // r3@1
  unsigned int v134; // r5@1
  unsigned int v135; // r3@1
  unsigned int x[64]; // [sp+48h] [bp-118h]@1

  v2 = state;
  v3 = state[1];
  v4 = *state;
  v5 = state[2];
  v6 = state[3];
  MD5Decode(x, block, 0x40u);
  v7 = __ROR4__(x[0] - 680876936 + v4 + (v6 & ~v3 | v5 & v3), 25);
  v8 = v7 + v3;
  v9 = __ROR4__(x[1] - 389564586 + v6 + (v5 & ~v8 | v3 & v8), 20);
  v10 = v9 + v8;
  v11 = __ROR4__(x[2] + 606105819 + v5 + (v8 & v10 | v3 & ~v10), 15);
  v12 = v11 + v10;
  v13 = __ROR4__(v3 + x[3] - 1044525330 + (v10 & v12 | v8 & ~v12), 10);
  v14 = v13 + v12;
  v15 = __ROR4__(v8 + x[4] - 176418897 + (v12 & v14 | v10 & ~v14), 25);
  v16 = v15 + v14;
  v17 = __ROR4__(v10 + x[5] + 1200080426 + (v14 & v16 | v12 & ~v16), 20);
  v18 = v17 + v16;
  v19 = __ROR4__(v12 + x[6] - 1473231341 + (v16 & v18 | v14 & ~v18), 15);
  v20 = v19 + v18;
  v21 = __ROR4__(v14 + x[7] - 45705983 + (v18 & v20 | v16 & ~v20), 10);
  v22 = v21 + v20;
  v23 = __ROR4__(v16 + x[8] + 1770035416 + (v20 & v22 | v18 & ~v22), 25);
  v24 = v23 + v22;
  v25 = __ROR4__(v18 + x[9] - 1958414417 + (v22 & v24 | v20 & ~v24), 20);
  v26 = v25 + v24;
  v27 = __ROR4__(v20 + x[10] - 42063 + (v24 & v26 | v22 & ~v26), 15);
  v28 = v27 + v26;
  v29 = __ROR4__(v22 + x[11] - 1990404162 + (v26 & v28 | v24 & ~v28), 10);
  v30 = v29 + v28;
  v31 = __ROR4__(v24 + x[12] + 1804603682 + (v28 & v30 | v26 & ~v30), 25);
  v32 = v31 + v30;
  v33 = __ROR4__(x[13] - 40341101 + v26 + (v30 & v32 | v28 & ~v32), 20);
  v34 = v33 + v31 + v30;
  v35 = __ROR4__(x[14] - 1502002290 + v28 + ((v31 + v30) & v34 | ~v34 & v30), 15);
  v36 = v35 + v34;
  v37 = __ROR4__(x[15] + 1236535329 + v30 + (v34 & v36 | ~v36 & (v31 + v30)), 10);
  v38 = v37 + v36;
  v39 = __ROR4__(x[1] - 165796510 + v32 + (~v34 & v36 | v34 & v38), 27);
  v40 = v39 + v38;
  v41 = __ROR4__(x[6] - 1069501632 + v34 + (~v36 & v38 | v36 & v40), 23);
  v42 = v41 + v40;
  v43 = __ROR4__(v36 + x[11] + 643717713 + (v40 & ~v38 | v38 & v42), 18);
  v44 = v43 + v42;
  v45 = __ROR4__(v38 + x[0] - 373897302 + (v42 & ~v40 | v40 & v44), 12);
  v46 = v45 + v44;
  v47 = __ROR4__(v40 + x[5] - 701558691 + (v44 & ~v42 | v42 & v46), 27);
  v48 = v47 + v46;
  v49 = __ROR4__(v42 + x[10] + 38016083 + (v46 & ~v44 | v44 & v48), 23);
  v50 = v49 + v48;
  v51 = __ROR4__(v44 + x[15] - 660478335 + (v48 & ~v46 | v46 & v50), 18);
  v52 = v51 + v50;
  v53 = __ROR4__(v46 + x[4] - 405537848 + (v50 & ~v48 | v48 & v52), 12);
  v54 = v53 + v52;
  v55 = __ROR4__(v48 + x[9] + 568446438 + (v52 & ~v50 | v50 & v54), 27);
  v56 = v55 + v54;
  v57 = __ROR4__(v50 + x[14] - 1019803690 + (v54 & ~v52 | v52 & v56), 23);
  v58 = v57 + v56;
  v59 = __ROR4__(v52 + x[3] - 187363961 + (v56 & ~v54 | v54 & v58), 18);
  v60 = v59 + v58;
  v61 = __ROR4__(v54 + x[8] + 1163531501 + (v58 & ~v56 | v56 & v60), 12);
  v62 = v61 + v60;
  v63 = __ROR4__(v56 + x[13] - 1444681467 + (v60 & ~v58 | v58 & v62), 27);
  v64 = v63 + v62;
  v65 = __ROR4__(v58 + x[2] - 51403784 + (v62 & ~v60 | v60 & v64), 23);
  v66 = v65 + v64;
  v67 = __ROR4__(x[7] + 1735328473 + v60 + (v64 & ~v62 | v62 & v66), 18);
  v68 = v67 + v66;
  v69 = __ROR4__(x[12] - 1926607734 + v62 + (v66 & ~v64 | v64 & v68), 12);
  v70 = v69 + v68;
  v71 = __ROR4__(x[5] - 378558 + v64 + (v68 ^ v66 ^ v70), 28);
  v72 = v71 + v70;
  v73 = __ROR4__(x[8] - 2022574463 + v66 + (v70 ^ v68 ^ v72), 21);
  v74 = v73 + v72;
  v75 = __ROR4__(x[11] + 1839030562 + v68 + (v72 ^ v70 ^ v74), 16);
  v76 = v75 + v74;
  v77 = __ROR4__(x[14] - 35309556 + v70 + (v74 ^ v72 ^ v76), 9);
  v78 = v77 + v76;
  v79 = __ROR4__((v76 ^ v74 ^ v78) + x[1] - 1530992060 + v72, 28);
  v80 = v79 + v78;
  v81 = __ROR4__((v78 ^ v76 ^ v80) + x[4] + 1272893353 + v74, 21);
  v82 = v81 + v80;
  v83 = __ROR4__((v80 ^ v78 ^ v82) + x[7] - 155497632 + v76, 16);
  v84 = v83 + v82;
  v85 = __ROR4__((v82 ^ v80 ^ v84) + x[10] - 1094730640 + v78, 9);
  v86 = v85 + v84;
  v87 = __ROR4__((v84 ^ v82 ^ v86) + x[13] + 681279174 + v80, 28);
  v88 = v87 + v86;
  v89 = __ROR4__((v86 ^ v84 ^ v88) + x[0] - 358537222 + v82, 21);
  v90 = v89 + v88;
  v91 = __ROR4__((v88 ^ v86 ^ v90) + x[3] - 722521979 + v84, 16);
  v92 = v91 + v90;
  v93 = __ROR4__((v90 ^ v88 ^ v92) + x[6] + 76029189 + v86, 9);
  v94 = v93 + v92;
  v95 = __ROR4__((v92 ^ v90 ^ v94) + x[9] - 640364487 + v88, 28);
  v96 = v95 + v94;
  v97 = __ROR4__((v94 ^ v92 ^ v96) + x[12] - 421815835 + v90, 21);
  v98 = v97 + v96;
  v99 = __ROR4__(v92 + x[15] + 530742520 + (v96 ^ v94 ^ v98), 16);
  v100 = v99 + v98;
  v101 = __ROR4__(v94 + x[2] - 995338651 + (v98 ^ v96 ^ v100), 9);
  v102 = v101 + v100;
  v103 = __ROR4__(x[0] - 198630844 + v96 + ((~v98 | v102) ^ v100), 26);
  v104 = v103 + v102;
  v105 = __ROR4__(x[7] + 1126891415 + v98 + ((~v100 | v104) ^ v102), 22);
  v106 = v105 + v104;
  v107 = __ROR4__(x[14] - 1416354905 + v100 + ((~v102 | v106) ^ v104), 17);
  v108 = v107 + v106;
  v109 = __ROR4__(v102 + x[5] - 57434055 + ((~v104 | v108) ^ v106), 11);
  v110 = v109 + v108;
  v111 = __ROR4__(x[12] + 1700485571 + v104 + ((~v106 | v110) ^ v108), 26);
  v112 = v111 + v110;
  v113 = __ROR4__(x[3] - 1894986606 + v106 + ((~v108 | v112) ^ v110), 22);
  v114 = v113 + v112;
  v115 = __ROR4__(x[10] - 1051523 + v108 + ((~v110 | v114) ^ v112), 17);
  v116 = v115 + v114;
  v117 = __ROR4__(x[1] - 2054922799 + v110 + ((~v112 | v116) ^ v114), 11);
  v118 = v117 + v116;
  v119 = __ROR4__(x[8] + 1873313359 + v112 + ((~v114 | v118) ^ v116), 26);
  v120 = v119 + v118;
  v121 = __ROR4__(x[15] - 30611744 + v114 + ((~v116 | v120) ^ v118), 22);
  v122 = v121 + v120;
  v123 = __ROR4__(x[6] - 1560198380 + v116 + ((~v118 | v122) ^ v120), 17);
  v124 = v123 + v122;
  v125 = __ROR4__(x[13] + 1309151649 + v118 + ((~v120 | v124) ^ v122), 11);
  v126 = v125 + v124;
  v127 = __ROR4__(x[4] - 145523070 + v120 + ((~v122 | v126) ^ v124), 26);
  v128 = v127 + v126;
  v129 = __ROR4__(x[11] - 1120210379 + v122 + ((~v124 | v128) ^ v126), 22);
  v130 = v129 + v128;
  v131 = __ROR4__(x[2] + 718787259 + v124 + ((~v126 | v130) ^ v128), 17);
  v132 = v131 + v130;
  v133 = __ROR4__(x[9] - 343485551 + v126 + ((~v128 | v132) ^ v130), 11);
  *v2 += v128;
  v134 = v2[3];
  v2[1] += v132 + v133;
  v135 = v2[2];
  v2[3] = v134 + v130;
  v2[2] = v135 + v132;
}

然后用下一分组数据继续运行以上算法。

最后的输出是a、b、c和d的级联

3. IDA动态调试

在MD5Update()上下个断点,发现进去之前的参数是

MD5("[hbzFZ9dew0]@YWROdW09MWJkdXNzPWJpcnRoZGF5PTIwMTgtMDUtMTFjaGFubmVsPWJhaWR1emh1c2hvdWN1aWQ9OEY3ODcxMzNERTdDOTQ1NUZEMkI5MTc3MTFDNzEwMzN8NDAzNTU2NzYwMDk0MzUzZGlnRmxhZz0xaG90PTBpc0FJPTBvdnVsYXRpb25UaW1lPTBwZXJpb2Q9MnBuPTEwcHJlZ1N0PTNxdWVyeT3lrp3lrp3og47orrBybj0xMHRva2VuPTFfODA1ZmE3MWI2NzU3MDZjYjNmYmNiZjBjYTFjMzc3OTB0eXBlPTB2Yz0yMzZ2b2ljZT0w") = d7166568860b42b3dcb13c695e0fc5b0

出来之后变成了

最后经过sprintf格式化输出之后会变成

4. Python的MD5加密

import hashlib

m = hashlib.md5()
m.update("123".encode("utf-8"))
result = m.hexdigest() # m.digest()返回的是类似于上面未经过sprintf格式化之前的样子,16进制的格式
print (result)

注意:MD5Update()执行两次实际上是一个拼接效果(比如md5.update(md5.update("123".encode("utf-8")))实际上是等于md5.update("123123".encode("utf-8"))),所以为了避免这种干扰我们每次都应该重新实例化,即MD5Init()

5. Java层中的MD5加密 & Inspeckage中hash功能揭秘

下面是一个apk反编译后的例子:

在逆向存在在Java层中的使用java.security.MessageDigest类的MD5加密,可以直接使用Inspeckage来查看MD5加密前后字符串区别

  • http://repo.xposed.info/module/mobi.acpm.inspeckage

其源码HashHook.java中是这样实现MD5字符串hook的

首先hook java.security.MessageDigest.getInstance,获取是什么hash类型

然后 hook java.security.MessageDigest.update,获取update之前的字符串

最后hook java.security.MessageDigest.digest,获取格式化输出之前的字符

总结

从上可知,我们可以先分析是不是存在MD5Init时候需要初始化的值,然后判断是否存在MD5加密,然后hook MD5Update()的前后输入输出字符串即可。

另外可以用下面参数快速测试下是不是MD5算法:

MD5("") = d41d8cd98f00b204e9800998ecf8427e

MD5("a") = 0cc175b9c0f1b6a831c399e269772661

MD5("abc") = 900150983cd24fb0d6963f7d28e17f72

参考

  • https://baike.baidu.com/item/MD5/212708?fr=aladdin