
Sheets in Ghidra

Ghidra is a software reverse engineering platform with a robust Java-based extension system.

SheetJS is a JavaScript library for reading and writing data from spreadsheets.

The Complete Demo uses SheetJS to export data from a Ghidra script. We'll create an extension that loads the V8 JavaScript engine through the Ghidra.js[^1] integration and uses the SheetJS library to export a bitfield table from Apple Numbers to an XLSX workbook.

Tested Deployments

This demo was tested by SheetJS users in the following deployments:

| Architecture | Ghidra | Date       |
| darwin-arm   | 11.1.2 | 2024-10-13 |

Integration Details

Ghidra natively supports scripts written in Java. JS extension scripts require a JavaScript engine with Java bindings.

Ghidra.js[^1] is a Ghidra integration for RhinoJS, GraalJS and V8. The current version uses the Javet V8 binding.

Loading SheetJS Scripts

The SheetJS NodeJS module can be loaded in Ghidra.js scripts using require:

Loading SheetJS scripts in Ghidra.js
const XLSX = require("xlsx");

SheetJS NodeJS modules must be installed in a folder in the Ghidra script path!

Bitfields and Sheets

Binary file formats commonly use bitfields to compactly store a set of Boolean (true or false) flags. For example, in the XLSB file format, the BrtRowHdr record[^2] encodes row properties. Bit offsets 91-96 are interpreted as flags marking whether a row is hidden or whether it is collapsed.
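To make the idea of a bit offset concrete, here is a minimal sketch of testing one flag in a byte buffer. It assumes bit offset 0 is the least significant bit of the first byte; the testBit helper and the 13-byte sample record are hypothetical and are not part of SheetJS or Ghidra.js.

```javascript
/* Test a single flag in a byte buffer.
   Assumption: bit offset 0 is the least significant bit of byte 0. */
function testBit(bytes, offset) {
  const byteIndex = offset >> 3; // which byte holds the bit
  const bitIndex = offset & 7;   // position of the bit within that byte
  return ((bytes[byteIndex] >> bitIndex) & 1) === 1;
}

/* hypothetical 13-byte record body with only bit offset 92 set */
const record = new Uint8Array(13);
record[11] = 1 << 4; // 92 = 11 * 8 + 4

console.log(testBit(record, 91)); // false
console.log(testBit(record, 92)); // true
```

Under this bit ordering, offset 92 lands in byte 11 at bit 4, so the check reduces to a shift and a mask on one byte.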

Assembly Implementation

Functions that parse bitfields typically test each bit sequentially:

x86_64 sample assembly with mnemonics
CASE_1c
41 0f ba e5 1c BT R13D,0x1c
73 69 JNC CASE_1d

;; .... Do some work here (bit offset 28)

CASE_1d
41 0f ba e5 1d BT R13D,0x1d
73 69 JNC CASE_1e

;; .... Do some work here (bit offset 29)

The assembly is approximated by the following TypeScript snippet:

Approximate TypeScript
/* R13 is a 64-bit register */
declare let R13: bigint;
/* NOTE: asm R13D is technically a live binding */
let R13D: number = Number(R13 & 0xFFFFFFFFn);

if((R13D >> 28) & 1) {
  // .... Do some work here (bit offset 28)
}

if((R13D >> 29) & 1) {
  // .... Do some work here (bit offset 29)
}
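The sequential tests above can also be written as a loop. The following sketch lists which bit offsets are set in a 32-bit value; the setBitOffsets helper and the sample flags value are hypothetical illustrations, not code recovered from any binary.

```javascript
/* Return the bit offsets that are set in a 32-bit value */
function setBitOffsets(value) {
  const offsets = [];
  for (let bit = 0; bit < 32; ++bit) {
    /* unsigned shift so that bit 31 is handled like any other bit */
    if ((value >>> bit) & 1) offsets.push(bit);
  }
  return offsets;
}

/* sample value with bit offsets 28 and 29 set */
const flags = (1 << 28) | (1 << 29);
console.log(setBitOffsets(flags)); // [ 28, 29 ]
```

Each loop iteration corresponds to one BT / JNC pair in the assembly listing.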

Array of Objects

A bitmask or bit offset can be paired with a description in a JavaScript object.

For example, in the BrtRowHdr record, bit offset 92 indicates whether the row is hidden (if the bit is set) or visible (if the bit is not set). The offset and description can be stored as fields in an object:

Sample metadata for BrtRowHdr offset 92
const metadata_92 = { Offset: 92, Description: "Hidden flag" };

Each object can be stored in an array:

Array of sample metadata for BrtRowHdr
const metadata = [
{ Offset: 91, Description: "Collapsed flag" },
{ Offset: 92, Description: "Hidden flag" },
// ...
];
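Because each entry is a plain object, extra columns can be derived with ordinary array methods before the data is exported. This sketch adds the byte index and the bit position within that byte for each offset. The Byte and Bit field names are hypothetical, and the arithmetic assumes offsets count from the least significant bit of the first byte.

```javascript
const metadata = [
  { Offset: 91, Description: "Collapsed flag" },
  { Offset: 92, Description: "Hidden flag" },
];

/* derive hypothetical "Byte" and "Bit" columns from each offset */
const located = metadata.map(row => ({
  ...row,
  Byte: row.Offset >> 3, // index of the byte that holds the bit
  Bit: row.Offset & 7    // position of the bit within that byte
}));

console.log(located[0]);
// { Offset: 91, Description: 'Collapsed flag', Byte: 11, Bit: 3 }
```

Every key added here would become another column once the array is converted to a worksheet.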

This is an "Array of Objects". The SheetJS json_to_sheet method[^3] can generate a SheetJS worksheet object from the array:

Generating a worksheet from the metadata
const ws = XLSX.utils.json_to_sheet(metadata);

The SheetJS book_new method[^4] generates a SheetJS workbook object that can be written to the filesystem using the writeFile method[^5]:

Exporting the worksheet to file
const wb = XLSX.utils.book_new(ws, "Offsets");
XLSX.writeFile(wb, "SheetJSGhidra.xlsx");
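To picture what ends up in the exported sheet, it helps to know that json_to_sheet by default writes the object keys as a header row, followed by one row per object. The sketch below imitates that layout in plain JavaScript so it can be run without the library. The previewRows helper is hypothetical and only approximates the default behavior.

```javascript
/* Hypothetical helper: preview the grid that an array of objects maps to.
   Header row comes from the object keys, then one row per object. */
function previewRows(data) {
  const headers = [];
  for (const row of data) {
    for (const key of Object.keys(row)) {
      if (!headers.includes(key)) headers.push(key);
    }
  }
  return [headers, ...data.map(row => headers.map(h => row[h]))];
}

const metadata = [
  { Offset: 91, Description: "Collapsed flag" },
  { Offset: 92, Description: "Hidden flag" },
];

console.log(previewRows(metadata));
// [ [ 'Offset', 'Description' ], [ 91, 'Collapsed flag' ], [ 92, 'Hidden flag' ] ]
```

In the exported workbook, the "Offsets" worksheet would hold the same three rows.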

Java Binding

Ghidra.js exposes a number of globals for interacting with Ghidra, including:

  • currentProgram: information about the loaded program.

  • JavaHelper: Java helper to load classes.

Ghidra.js automatically bridges instance methods to Java method calls. It also handles the plugin and file extension details.

Launching the Decompiler

ghidra.app.decompiler.DecompInterface is the primary Java interface to the decompiler. In Ghidra.js, JavaHelper.getClass will load the class.

Java

Launch decompiler process in Java (snippet)
import ghidra.app.script.GhidraScript;
import ghidra.app.decompiler.DecompInterface;
import ghidra.program.model.listing.Program;

public class SheetZilla extends GhidraScript {
  @Override public void run() throws Exception {
    DecompInterface ifc = new DecompInterface();
    boolean success = ifc.openProgram(currentProgram);
    /* ... do work here ... */
  }
}

Ghidra.js

Launch decompiler process in Ghidra.js
const DecompInterface = JavaHelper.getClass('ghidra.app.decompiler.DecompInterface');
const decompiler = new DecompInterface();
decompiler.openProgram(currentProgram);

Identifying a Function

The getGlobalSymbols method of a symbol table instance will return an array of symbols matching the given name:

/* name of function to find */
const fname = 'MyMethod';

/* find symbols matching the name */
const fsymbs = currentProgram.getSymbolTable().getGlobalSymbols(fname);

/* get first result */
const fsymb = fsymbs[0];

The getFunctionAt method of a function manager instance will take an address and return a reference to a function:

/* get address */
const faddr = fsymb.getAddress();

/* find function */
const fn = currentProgram.getFunctionManager().getFunctionAt(faddr);
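The two lookups can be combined into one helper that also covers the case where no symbol matches. This is a sketch, not part of Ghidra.js: findFunction is a hypothetical name, and it assumes, as in the snippets above, that the symbol lookup result behaves like a JavaScript array. The stubProgram object is a stand-in for trying the helper outside Ghidra and is not a real Program.

```javascript
/* Hypothetical helper: look up a function by name, or return null */
function findFunction(program, name) {
  const symbols = program.getSymbolTable().getGlobalSymbols(name);
  if (!symbols || symbols.length === 0) return null; // no matching symbol
  const addr = symbols[0].getAddress();
  return program.getFunctionManager().getFunctionAt(addr) || null;
}

/* stand-in object for running outside Ghidra; not a real Program */
const stubProgram = {
  getSymbolTable: () => ({
    getGlobalSymbols: (n) => (n === "MyMethod" ? [{ getAddress: () => 0x1000 }] : [])
  }),
  getFunctionManager: () => ({
    getFunctionAt: (a) => (a === 0x1000 ? { getName: () => "MyMethod" } : null)
  })
};

console.log(findFunction(stubProgram, "MyMethod").getName()); // MyMethod
console.log(findFunction(stubProgram, "Missing"));            // null
```

Inside a Ghidra.js script the call would presumably be findFunction(currentProgram, fname), with a null result signalling that the name was not found.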

Decompiling a Function

The decompileFunction method attempts to decompile the referenced function:

/* decompile function */
const decomp = decompiler.decompileFunction(fn, 10000, null);

Once decompiled, it is possible to retrieve the decompiled C code:

/* get generated C code */
const src = decomp.getDecompiledFunction().getC();
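Since the decompiled code is a string, bit tests can be pulled out of it with a regular expression. The sketch below runs against a made-up excerpt. Real decompiler output will be shaped differently, and the pattern, the check function and the currencyFormatID field are all hypothetical, so the expression would need to be adapted to the actual text.

```javascript
/* hypothetical excerpt; real decompiler output will look different */
const src = `
if ((flags >> 0xd & 1) != 0) { check(cell, "numberFormatID"); }
if ((flags >> 0xe & 1) != 0) { check(cell, "currencyFormatID"); }
`;

/* capture the shift amount and the first quoted string that follows it */
const pattern = />>\s*(0x[0-9a-f]+)\s*&\s*1\)[^"]*"([^"]+)"/gi;

const found = [];
for (const m of src.matchAll(pattern)) {
  found.push({ offset: parseInt(m[1], 16), field: m[2] });
}

console.log(found);
// [ { offset: 13, field: 'numberFormatID' }, { offset: 14, field: 'currencyFormatID' } ]
```

Each match pairs a bit offset with a field name, which is the information needed to build one spreadsheet row.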

Complete Demo

In this demo, we will inspect the _TSTCellToCellStorage method within the TSTables framework of Apple Numbers 14.2. This particular method handles serialization of cells to the NUMBERS file format.

The implementation has a number of blocks which look like the following script:

if(flags >> 0x0d & 1) {
const field = "numberFormatID";
const current_value = cell[field];
// ... check if current_value is set, do other stuff
}

Based on the bit offset and the field name, we will generate the following row:

const mask = 1 << 0x0d; // = 8192 = 0x2000
const name = "number format ID";
const row = { Mask: "0x" + mask.toString(16), "Internal Name": name };
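The row construction can be wrapped in a small function so it can be repeated for every block. This is a sketch under assumptions: makeRow is a hypothetical helper, the camel-case splitting rule is a guess at how "numberFormatID" becomes "number format ID", and the mask arithmetic only covers bit offsets 0 through 31.

```javascript
/* Hypothetical helper: build one spreadsheet row from a bit offset and field name */
function makeRow(offset, field) {
  /* >>> 0 keeps bit offset 31 from producing a negative number */
  const mask = (1 << offset) >>> 0;
  const name = field
    .split(/(?<=[a-z])(?=[A-Z])/) // split where lowercase meets uppercase
    .map(w => (w === w.toUpperCase() ? w : w.toLowerCase())) // keep acronyms like "ID"
    .join(" ");
  return { Mask: "0x" + mask.toString(16), "Internal Name": name };
}

console.log(makeRow(0x0d, "numberFormatID"));
// { Mask: '0x2000', 'Internal Name': 'number format ID' }
```

Collecting the results of makeRow in an array produces an array of objects that can be passed to json_to_sheet.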

Rows will be generated for each block and the final dataset will be exported.

System Setup

  1. Install Ghidra, Xcode, and Apple Numbers.

Installation Notes

On macOS, Ghidra was installed using Homebrew:

brew install --cask ghidra

  2. Add the base Ghidra folder to the PATH variable. The following shell command adds to the path for the current zsh or bash session:

export PATH="$PATH":$(dirname $(realpath `which ghidraRun`))

  3. Install ghidra.js globally:

npm install -g ghidra.js

If the install fails with a permissions issue, install with the root user:

sudo npm install -g ghidra.js

Program Preparation

  1. Create a temporary folder to hold the Ghidra project:

mkdir -p /tmp/sheetjs-ghidra

  2. Copy the TSTables framework binary to the current directory:

cp /Applications/Numbers.app/Contents/Frameworks/TSTables.framework/Versions/Current/TSTables .

  3. Create a "thin" binary by extracting the x86_64 part of the framework:

lipo TSTables -thin x86_64 -output TSTables.macho

When this demo was last tested, the headless analyzer did not support Mach-O fat binaries. lipo creates a new binary with support for one architecture.

  4. Analyze the program:

$(dirname $(realpath `which ghidraRun`))/support/analyzeHeadless /tmp/sheetjs-ghidra Numbers -import TSTables.macho

This process may take a while and print a number of Java stack traces. The errors can be ignored.

SheetJS Integration

  1. Download sheetjs-ghidra.js:

curl -LO https://xlsx.nodejs.cn/ghidra/sheetjs-ghidra.js

  2. Install the SheetJS NodeJS module:

npm i --save https://cdn.sheetjs.com/xlsx-0.20.3/xlsx-0.20.3.tgz

  3. Run the script:

$(dirname $(realpath `which ghidraRun`))/support/analyzeHeadless /tmp/sheetjs-ghidra Numbers -process TSTables.macho -noanalysis -scriptPath `pwd` -postScript sheetjs-ghidra.js

  4. Open the generated SheetJSGhidraTSTCell.xlsx spreadsheet.

[^1]: The project does not have a website. The source repository is publicly available.

[^2]: BrtRowHdr is defined in the MS-XLSB specification.

[^3]: See json_to_sheet in "Utilities".

[^4]: See book_new in "Utilities".

[^5]: See writeFile in "Writing Files".