Ghidra 中的表格
Ghidra 是一个软件逆向工程平台,具有强大的基于 Java 的扩展系统。
¥Ghidra is a software reverse engineering platform with a robust Java-based extension system.
SheetJS 是一个用于从电子表格读取和写入数据的 JavaScript 库。
¥SheetJS is a JavaScript library for reading and writing data from spreadsheets.
完整演示 使用 SheetJS 从 Ghidra 脚本导出数据。我们将创建一个扩展,通过 Ghidra.js[^1] 集成加载 V8 JavaScript 引擎,并使用 SheetJS 库将位域表从 Apple Numbers 导出到 XLSX 工作簿。
¥The Complete Demo uses SheetJS to export data from a Ghidra script. We'll create an extension that loads the V8 JavaScript engine through the Ghidra.js[^1] integration and uses the SheetJS library to export a bitfield table from Apple Numbers to a XLSX workbook.
此演示由 SheetJS 用户在以下部署中进行了测试:
¥This demo was tested by SheetJS users in the following deployments:
架构 | Ghidra | 日期 |
---|---|---|
darwin-arm | 11.1.2 | 2024-10-13 |
集成详情
¥Integration Details
Ghidra 原生支持用 Java 运行的脚本。JS 扩展脚本需要具有 Java 绑定的 JavaScript 引擎。
¥Ghidra natively supports scripts that are run in Java. JS extension scripts require a JavaScript engine with Java bindings.
Ghidra.js[^1] 是 RhinoJS、GraalJS 和 V8 的 Ghidra 集成。当前版本使用 Javet V8 绑定。
¥Ghidra.js[^1] is a Ghidra integration for RhinoJS, GraalJS and V8. The current version uses the Javet V8 binding.
加载 SheetJS 脚本
¥Loading SheetJS Scripts
可以使用 require
在 Ghidra.js 脚本中加载 SheetJS NodeJS 模块:
¥The SheetJS NodeJS module can be
loaded in Ghidra.js scripts using require
:
const XLSX = require("xlsx");
SheetJS NodeJS 模块必须安装在 Ghidra 脚本路径中的文件夹中!
¥SheetJS NodeJS modules must be installed in a folder in the Ghidra script path!
位字段和工作表
¥Bitfields and Sheets
二进制文件格式通常使用位字段来紧凑地存储一组布尔(真或假)标志。例如,在 XLSB 文件格式中,BrtRowHdr
记录 [^2] 对 行属性 进行编码。位偏移量 91-96 被解释为标记行是否隐藏或是否折叠的标志。
¥Binary file formats commonly use bitfields to compactly store a set of Boolean
(true or false) flags. For example, in the XLSB file format, the BrtRowHdr
record[^2] encodes row properties. Bit offsets
91-96 are interpreted as flags marking if a row is hidden or if it is collapsed.
程序集实现
¥Assembly Implementation
解析位域的函数通常按顺序测试每个位:
¥Functions that parse bitfields typically test each bit sequentially:
CASE_1c
41 0f ba e5 1c BT R13D,0x1c
73 69 JNC CASE_1d
;; .... Do some work here (bit offset 28)
CASE_1d
41 0f ba e5 1d BT R13D,0x1d
73 69 JNC CASE_1e
;; .... Do some work here (bit offset 29)
汇编由以下 TypeScript 代码片段近似:
¥The assembly is approximated by the following TypeScript snippet:
/* R13 is a 64-bit register */
declare let R13: BigInt;
/* NOTE: asm R13D is technically a live binding */
let R13D: number = Number(R13 & 0xFFFFFFFFn);
if((R13D >> 28) & 1) {
// .... Do some work here (bit offset 28)
}
if((R13D >> 29) & 1) {
// .... Do some work here (bit offset 29)
}
对象数组
¥Array of Objects
位掩码或位偏移可以与 JavaScript 对象中的描述配对。
¥A bitmask or bit offset can be paired with a description in a JavaScript object.
例如,在 BrtRowHdr
记录中,位偏移量 92 表示行是隐藏的(如果设置了位)还是可见的(如果未设置位)。偏移量和描述可以作为对象中的字段存储:
¥For example, in the BrtRowHdr
record, bit offset 92 indicates whether the row
is hidden (if the bit is set) or visible (if the bit is not set). The offset and
description can be stored as fields in an object:
const metadata_92 = { Offset: 92, Description: "Hidden flag" };
每个对象都可以存储在一个数组中:
¥Each object can be stored in an array:
const metadata = [
{ Offset: 91, Description: "Collapsed flag" },
{ Offset: 92, Description: "Hidden flag" },
// ...
];
这是 "对象数组"。SheetJS json_to_sheet
方法 [^3] 可以从数组生成 SheetJS 工作表对象:
¥This is an "Array of Objects".
The SheetJS json_to_sheet
method[^3] can generate a SheetJS worksheet object
from the array:
const ws = XLSX.utils.json_to_sheet(metadata);
SheetJS book_new
方法 [^4] 生成一个 SheetJS 工作簿对象,可以使用 writeFile
方法 [^5] 将其写入文件系统:
¥The SheetJS book_new
method[^4] generates a SheetJS workbook object that can
be written to the filesystem using the writeFile
method[^5]:
const wb = XLSX.utils.book_new(ws, "Offsets");
XLSX.utils.writeFile(wb, "SheetJSGhidra.xlsx");
Java 绑定
¥Java Binding
Ghidra.js 公开了许多用于与 Ghidra 交互的全局变量,包括:
¥Ghidra.js exposes a number of globals for interacting with Ghidra, including:
-
currentProgram
:有关已加载程序的信息。¥
currentProgram
: information about the loaded program. -
JavaHelper
:用于加载类的 Java 助手。¥
JavaHelper
: Java helper to load classes.
Ghidra.js 自动将实例方法桥接到 Java 方法调用。它还处理插件和文件扩展名的详细信息。
¥Ghidra.js automatically bridges instance methods to Java method calls. It also handles the plugin and file extension details.
启动反编译器
¥Launching the Decompiler
ghidra.app.decompiler.DecompInterface
是反编译器的主要 Java 接口。在 Ghidra.js 中,JavaHelper.getClass
将加载类。
¥ghidra.app.decompiler.DecompInterface
is the primary Java interface to the
decompiler. In Ghidra.js, JavaHelper.getClass
will load the class.
Java
import ghidra.app.script.GhidraScript;
import ghidra.app.decompiler.DecompInterface;
import ghidra.program.model.listing.Program;
public class SheetZilla extends GhidraScript {
@Override public void run() throws Exception {
DecompInterface ifc = new DecompInterface();
boolean success = ifc.openProgram(currentProgram);
/* ... do work here ... */
}
}
Ghidra.js
const DecompInterface = JavaHelper.getClass('ghidra.app.decompiler.DecompInterface');
const decompiler = new DecompInterface();
decompiler.openProgram(currentProgram);
识别功能
¥Identifying a Function
符号表实例的 getGlobalSymbols
方法将返回与给定名称匹配的符号数组:
¥The getGlobalSymbols
method of a symbol table instance will return an array of
symbols matching the given name:
/* name of function to find */
const fname = 'MyMethod';
/* find symbols matching the name */
const fsymbs = currentProgram.getSymbolTable().getGlobalSymbols(fname);
/* get first result */
const fsymb = fsymbs[0];
函数管理器实例的 getFunctionAt
方法将获取地址并返回对函数的引用:
¥The getFunctionAt
method of a function manager instance will take an address
and return a reference to a function:
/* get address */
const faddr = fsymb.getAddress();
/* find function */
const fn = currentProgram.getFunctionManager().getFunctionAt(faddr);
反编译函数
¥Decompiling a Function
decompileFunction
方法尝试反编译引用的函数:
¥The decompileFunction
method attempts to decompile the referenced function:
/* decompile function */
const decomp = decompiler.decompileFunction(fn, 10000, null);
反编译后,可以检索反编译的 C 代码:
¥Once decompiled, it is possible to retrieve the decompiled C code:
/* get generated C code */
const src = decomp.getDecompiledFunction().getC();
完整演示
¥Complete Demo
在此演示中,我们将检查 Apple Numbers 14.2 的 TSTables
框架内的 _TSTCellToCellStorage
方法。此特定方法处理单元格序列化为 NUMBERS 文件格式。
¥In this demo, we will inspect the _TSTCellToCellStorage
method within the
TSTables
framework of Apple Numbers 14.2. This particular method handles
serialization of cells to the NUMBERS file format.
实现有许多块,看起来像以下脚本:
¥The implementation has a number of blocks which look like the following script:
if(flags >> 0x0d & 1) {
const field = "numberFormatID";
const current_value = cell[field];
// ... check if current_value is set, do other stuff
}
根据位偏移量和字段名称,我们将生成以下行:
¥Based on the bit offset and the field name, we will generate the following row:
const mask = 1 << 0x0d; // = 8192 = 0x2000
const name = "number format ID";
const row = { Mask: "0x" + mask.toString(16), "Internal Name": name };
将为每个块生成行,并导出最终数据集。
¥Rows will be generated for each block and the final dataset will be exported.
系统设置
¥System Setup
-
安装 Ghidra、Xcode 和 Apple Numbers。
¥Install Ghidra, Xcode, and Apple Numbers.
Installation Notes (click to show)
On macOS, Ghidra was installed using Homebrew:
brew install --cask ghidra
-
将基本 Ghidra 文件夹添加到 PATH 变量。以下 shell 命令添加到当前
zsh
或bash
会话的路径:¥Add the base Ghidra folder to the PATH variable. The following shell command adds to the path for the current
zsh
orbash
session:
export PATH="$PATH":$(dirname $(realpath `which ghidraRun`))
-
全局安装
ghidra.js
:¥Install
ghidra.js
globally:
npm install -g ghidra.js
如果安装因权限问题而失败,请使用 root 用户安装:
¥If the install fails with a permissions issue, install with the root user:
sudo npm install -g ghidra.js
程序准备
¥Program Preparation
-
创建一个临时文件夹来保存 Ghidra 项目:
¥Create a temporary folder to hold the Ghidra project:
mkdir -p /tmp/sheetjs-ghidra
-
将
TSTables
框架复制到当前目录:¥Copy the
TSTables
framework to the current directory:
cp /Applications/Numbers.app/Contents/Frameworks/TSTables.framework/Versions/Current/TSTables .
-
通过提取框架的
x86_64
部分创建 "thin" 二进制文件:¥Create a "thin" binary by extracting the
x86_64
part of the framework:
lipo TSTables -thin x86_64 -output TSTables.macho
上次测试此演示时,无头分析器不支持 Mach-O 胖二进制文件。lipo
创建支持一种架构的新二进制文件。
¥When this demo was last tested, the headless analyzer did not support Mach-O fat
binaries. lipo
creates a new binary with support for one architecture.
-
分析程序:
¥Analyze the program:
$(dirname $(realpath `which ghidraRun`))/support/analyzeHeadless /tmp/sheetjs-ghidra Numbers -import TSTables.macho
此过程可能需要一段时间并打印许多 Java 堆栈跟踪。可以忽略错误。
¥This process may take a while and print a number of Java stacktraces. The errors can be ignored.
SheetJS 集成
¥SheetJS Integration
-
¥Download
sheetjs-ghidra.js
:
curl -LO https://xlsx.nodejs.cn/ghidra/sheetjs-ghidra.js
-
¥Install the SheetJS NodeJS module:
npm i --save https://cdn.sheetjs.com/xlsx-0.20.3/xlsx-0.20.3.tgz
-
运行脚本:
¥Run the script:
$(dirname $(realpath `which ghidraRun`))/support/analyzeHeadless /tmp/sheetjs-ghidra Numbers -process TSTables.macho -noanalysis -scriptPath `pwd` -postScript sheetjs-ghidra.js
-
打开生成的
SheetJSGhidraTSTCell.xlsx
电子表格。¥Open the generated
SheetJSGhidraTSTCell.xlsx
spreadsheet.
[^1]: 该项目没有网站。源存储库 是公开可用的。
¥The project does not have a website. The source repository is publicly available.
[^2]: BrtRowHdr
在 MS-XLSB
规范 中定义
¥BrtRowHdr
is defined in the MS-XLSB
specification
[^3]: 见 json_to_sheet
于 "实用工具"
¥See json_to_sheet
in "Utilities"
[^4]: 见 book_new
于 "实用工具"
[^5]: 见 writeFile
于 "写入文件"