debug_info_off(u4) 0x21b insns_size(u4) 0x4
insns ushort[insns_size] 1070 0001 0000 000e
• Opcode 0x70 is invoke-direct. Its format is:
invoke-direct {vD, vE, vF, vG, vA}, meth@CCCC
B: argument word count (4 bits)
C: method index (16 bits)
D..G, A: argument registers (4 bits each)
Layout:
B|A|op CCCC G|F|E|D
[B=5] op {vD, vE, vF, vG, vA}, meth@CCCC
[B=5] op {vD, vE, vF, vG, vA}, type@CCCC
[B=4] op {vD, vE, vF, vG}, kind@CCCC
[B=3] op {vD, vE, vF}, kind@CCCC
[B=2] op {vD, vE}, kind@CCCC
[B=1] op {vD}, kind@CCCC
[B=0] op {}, kind@CCCC
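Here B=1, D=0, and CCCC=0x0001, so the call passes a single argument register (v0) and the referenced method belongs to Ljava/lang/Object;.
From the analysis above, the instruction is: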
|0000: invoke-direct {v0}, Ljava/lang/Object;.
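The field extraction used for the invoke-direct above can be sketched in C. This is a minimal illustration that assumes the B|A|op CCCC G|F|E|D layout shown earlier; the names Insn35c and decode35c are hypothetical and do not come from the Dalvik sources.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of one instruction in the B|A|op CCCC G|F|E|D layout. */
typedef struct {
    uint8_t  op;          /* opcode: low byte of the first code unit */
    uint8_t  a;           /* A: fifth argument register (only used when B=5) */
    uint8_t  b;           /* B: argument word count */
    uint16_t cccc;        /* CCCC: method index */
    uint8_t  d, e, f, g;  /* D..G: argument registers */
} Insn35c;

/* Hypothetical helper: split three 16-bit code units into their fields. */
static Insn35c decode35c(const uint16_t units[3]) {
    Insn35c in;
    in.op   = (uint8_t)(units[0] & 0xff);
    in.a    = (uint8_t)((units[0] >> 8) & 0x0f);
    in.b    = (uint8_t)((units[0] >> 12) & 0x0f);
    in.cccc = units[1];
    in.d    = (uint8_t)(units[2] & 0x0f);
    in.e    = (uint8_t)((units[2] >> 4) & 0x0f);
    in.f    = (uint8_t)((units[2] >> 8) & 0x0f);
    in.g    = (uint8_t)((units[2] >> 12) & 0x0f);
    assert(in.b <= 5);    /* the layout only defines B=0..5 */
    return in;
}
```

Feeding it the units 1070 0001 0000 should give op=0x70, B=1, A=0, CCCC=0x0001 and D=0, which matches the values read off by hand.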
2) 01 09 c8 02
method_idx_diff is 0x1
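The four bytes 01 09 c8 02 hold three uleb128 values in a row (method_idx_diff, access_flags, code_off). The sketch below decodes them with a loop-based reader; read_uleb128, EncodedMethod and read_encoded_method are hypothetical names chosen for illustration, and the loop is a simplified alternative to the unrolled reader in the appendix, not the Dalvik implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read one uleb128 value from buf and advance *pos. */
static uint32_t read_uleb128(const uint8_t *buf, size_t len, size_t *pos) {
    uint32_t result = 0;
    int shift = 0;
    for (;;) {
        assert(*pos < len);                 /* do not read past the buffer */
        uint8_t byte = buf[(*pos)++];
        result |= (uint32_t)(byte & 0x7f) << shift;
        if ((byte & 0x80) == 0)             /* high bit clear: last byte */
            break;
        shift += 7;
        assert(shift < 35);                 /* at most five bytes for 32 bits */
    }
    return result;
}

/* The three uleb128 fields of an encoded_method, in file order. */
typedef struct {
    uint32_t method_idx_diff;
    uint32_t access_flags;
    uint32_t code_off;
} EncodedMethod;

static EncodedMethod read_encoded_method(const uint8_t *buf, size_t len,
                                         size_t *pos) {
    EncodedMethod m;
    m.method_idx_diff = read_uleb128(buf, len, pos);
    m.access_flags    = read_uleb128(buf, len, pos);
    m.code_off        = read_uleb128(buf, len, pos);
    return m;
}
```

Under this decoding the bytes give method_idx_diff = 0x1, access_flags = 0x9 (ACC_PUBLIC 0x1 combined with ACC_STATIC 0x8) and code_off = 0x148, since c8 02 expands to 0x48 | (0x02 << 7).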
registers_size(u2) 0x3 ins_size(u2) 0x1 outs_size(u2) 0x2 tries_size(u2) 0 debug_info_off(u4) 0x220 insns_size(u4) 0x8
insns ushort[insns_size] 0062 0000 011a 000c 206e 0000 0010 000e
• Opcode 0x62 is sget-object. Its format is:
sget-object vAA, field@BBBB
A: value register or pair; may be source or dest (8 bits)
B: static field reference index (16 bits)
Layout:
AA|op BBBB
Here AA=0 and BBBB=0x0000; the field is out.
• Opcode 0x1a is const-string. Its format is:
const-string vAA, string@BBBB
A: destination register (8 bits)
B: string index
Layout:
AA|op BBBB
Here AA=1 and BBBB=0x000c; the corresponding string is: test!
• Opcode 0x6e is invoke-virtual. Its format and layout are the same as for invoke-direct:
B|A|op CCCC G|F|E|D
[B=5] op {vD, vE, vF, vG, vA}, meth@CCCC
[B=5] op {vD, vE, vF, vG, vA}, type@CCCC
[B=4] op {vD, vE, vF, vG}, kind@CCCC
[B=3] op {vD, vE, vF}, kind@CCCC
[B=2] op {vD, vE}, kind@CCCC
[B=1] op {vD}, kind@CCCC
[B=0] op {}, kind@CCCC
Here B=2, A=0, E=1, D=0, and CCCC=0x0000; the method is println of Ljava/io/PrintStream;.
• Opcode 0x0e is return-void.
From the analysis above, the instructions are:
sget-object v0, Ljava/lang/System;.out:Ljava/io/PrintStream; // field@0000
const-string v1, "test!" // string@000c
invoke-virtual {v0, v1}, Ljava/io/PrintStream;.println:(Ljava/lang/String;)V // method@0000
return-void
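Splitting an insns array into instructions, as done by hand above, depends only on how many 16-bit code units each opcode occupies. The sketch below covers just the five opcodes that appear in this walkthrough; insn_width and split_insns are hypothetical helpers, and any other opcode is reported as unsupported rather than guessed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Width in 16-bit code units of the opcodes seen in this walkthrough.
 * Returns 0 for anything this sketch does not handle. */
static size_t insn_width(uint8_t op) {
    switch (op) {
    case 0x0e: return 1;  /* return-void */
    case 0x1a: return 2;  /* const-string   (AA|op BBBB) */
    case 0x62: return 2;  /* sget-object    (AA|op BBBB) */
    case 0x6e: return 3;  /* invoke-virtual (B|A|op CCCC G|F|E|D) */
    case 0x70: return 3;  /* invoke-direct  (B|A|op CCCC G|F|E|D) */
    default:   return 0;
    }
}

/* Stores the starting offset (in code units) of each instruction in starts.
 * Returns the instruction count, or -1 on an unknown opcode, a truncated
 * instruction, or when starts is too small. */
static int split_insns(const uint16_t *insns, size_t n,
                       size_t *starts, size_t max_starts) {
    size_t pc = 0;
    int count = 0;
    assert(insns != NULL && starts != NULL);
    while (pc < n) {
        size_t w = insn_width((uint8_t)(insns[pc] & 0xff));
        if (w == 0 || pc + w > n || (size_t)count >= max_starts)
            return -1;
        starts[count++] = pc;
        pc += w;
    }
    return count;
}
```

For the eight units 0062 0000 011a 000c 206e 0000 0010 000e this yields four instructions starting at offsets 0, 2, 4 and 7, in line with the listing above.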
encoded_method
Name  Format  Description
method_idx_diff  uleb128  index into the method_ids list for the identity of this method (includes the name and descriptor), represented as a difference from the index of previous element in the list. The index of the first element in a list is represented directly.
access_flags  uleb128  access flags for the method (public, final, etc.). See "access_flags Definitions" for details.
code_off  uleb128  offset from the start of the file to the code structure for this method, or 0 if this method is either abstract or native. The offset should be to a location in the data section. The format of the data is specified by "code_item" below.

encoded_field
Name  Format  Description
field_idx_diff  uleb128  index into the field_ids list for the identity of this field (includes the name and descriptor), represented as a difference from the index of previous element in the list. The index of the first element in a list is represented directly.
access_flags  uleb128  access flags for the field (public, final, etc.). See "access_flags Definitions" for details.

code_item
Name  Format  Description
registers_size  ushort  the number of registers used by this code
ins_size  ushort  the number of words of incoming arguments to the method that this code is for
outs_size  ushort  the number of words of outgoing argument space required by this code for method invocation
tries_size  ushort  the number of try_items for this instance. If non-zero, then these appear as the tries array just after the insns in this instance.
debug_info_off  uint  offset from the start of the file to the debug info (line numbers + local variable info) sequence for this code, or 0 if there simply is no information. The offset, if non-zero, should be to a location in the data section. The format of the data is specified by "debug_info_item" below.
insns_size  uint  size of the instructions list, in 16-bit code units
insns  ushort[insns_size]  actual array of bytecode. The format of code in an insns array is specified by the companion document "Bytecode for the Dalvik VM". Note that though this is defined as an array of ushort, there are some internal structures that prefer four-byte alignment. Also, if this happens to be in an endian-swapped file, then the swapping is only done on individual ushorts and not on the larger internal structures.
padding  ushort (optional) = 0  two bytes of padding to make tries four-byte aligned. This element is only present if tries_size is non-zero and insns_size is odd.
tries  try_item[tries_size] (optional)  array indicating where in the code exceptions may be caught and how to handle them. Elements of the array must be non-overlapping in range and in order from low to high address. This element is only present if tries_size is non-zero.
handlers  encoded_catch_handler_list (optional)  bytes representing a list of lists of catch types and associated handler addresses. Each try_item has a byte-wise offset into this structure. This element is only present if tries_size is non-zero.
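The fixed part of a code_item (four ushort fields followed by two uint fields, 16 bytes in total) and the padding rule can be sketched as below. The sketch assumes a little-endian file, which is the standard DEX byte order; CodeItemHeader, read_code_item_header and tries_offset are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-size header of a code_item, in file order. */
typedef struct {
    uint16_t registers_size;
    uint16_t ins_size;
    uint16_t outs_size;
    uint16_t tries_size;
    uint32_t debug_info_off;
    uint32_t insns_size;      /* counted in 16-bit code units */
} CodeItemHeader;

enum { CODE_ITEM_HEADER_BYTES = 16 };

/* Little-endian readers (assumption: the file is not endian-swapped). */
static uint16_t rd16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}
static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static CodeItemHeader read_code_item_header(const uint8_t *p) {
    CodeItemHeader h;
    assert(p != NULL);
    h.registers_size = rd16(p + 0);
    h.ins_size       = rd16(p + 2);
    h.outs_size      = rd16(p + 4);
    h.tries_size     = rd16(p + 6);
    h.debug_info_off = rd32(p + 8);
    h.insns_size     = rd32(p + 12);
    return h;
}

/* Byte offset of the tries array from the start of the code_item,
 * or 0 when tries_size is zero and no tries array exists. */
static uint32_t tries_offset(const CodeItemHeader *h) {
    uint32_t off;
    if (h->tries_size == 0)
        return 0;
    off = CODE_ITEM_HEADER_BYTES + 2u * h->insns_size;
    if (h->insns_size & 1)    /* odd code-unit count: two padding bytes */
        off += 2;
    return off;
}
```

With the second method's values (registers_size 0x3, ins_size 0x1, outs_size 0x2, tries_size 0, debug_info_off 0x220, insns_size 0x8) there is no tries array, so tries_offset returns 0.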
Appendix 1: The LEB128 algorithm
The algorithm is as follows:
DEX_INLINE int readUnsignedLeb128(const u1** pStream) {
    const u1* ptr = *pStream;
    int result = *(ptr++);

    if (result > 0x7f) {                // high bit of the first byte is set
        int cur = *(ptr++);             // read the second byte
        // value so far: low 7 bits of the first byte plus 7 bits of the second
        result = (result & 0x7f) | ((cur & 0x7f) << 7);
        if (cur > 0x7f) {               // high bit of the second byte is set
            cur = *(ptr++);             // read the third byte
            result |= (cur & 0x7f) << 14;   // add 7 bits of the third byte
            if (cur > 0x7f) {           // high bit of the third byte is set
                cur = *(ptr++);
                result |= (cur & 0x7f) << 21;   // add 7 bits of the fourth byte
                if (cur > 0x7f) {       // high bit of the fourth byte is set
                    /*
                     * Note: We don't check to see if cur is out of
                     * range here, meaning we tolerate garbage in the
                     * high four-order bits.
                     */
                    cur = *(ptr++);
                    result |= cur << 28;    // add the fifth byte (unmasked; only its low 4 bits fit in 32 bits)
                }
            }
        }
    }

    *pStream = ptr;
    return result;
}

/*
 * Reads a signed LEB128 value. The sign is taken from the highest payload
 * bit of the last byte; the >> used here is an arithmetic (sign-preserving)
 * shift.
 */
DEX_INLINE int readSignedLeb128(const u1** pStream) {
    const u1* ptr = *pStream;
    int result = *(ptr++);

    if (result <= 0x7f) {
        result = (result << 25) >> 25;
    } else {
        int cur = *(ptr++);