
Data Processing with MuJS

MuJS is a C89-compatible embeddable JS engine.

SheetJS is a JavaScript library for reading and writing data from spreadsheets.

This demo uses MuJS and SheetJS to pull data from a spreadsheet and print CSV rows. It shows how to load SheetJS in a MuJS context and process spreadsheets from a C program.

The "Integration Example" section includes a complete command-line tool for reading data from files.

The MuJS engine has a number of bugs that affect the parsing of XLSX, XLML and other XML and plaintext file formats. If the software does not need to support legacy systems or architectures, it is strongly recommended to use a modern engine such as Duktape.

Integration Details

Many MuJS functions are not documented. The explanation was verified against version 1.3.4.

Initialize MuJS

A MuJS engine instance is created with js_newstate:

js_State *J = js_newstate(NULL, NULL, 0);

Error Messages

A special report callback should be used to display error messages. This report function is used in the official examples:

static void report(js_State *J, const char *msg) { fprintf(stderr, "REPORT MSG: %s\n", msg); }

The js_setreport function attaches the reporter to the engine:

js_setreport(J, report);

Global

MuJS does not expose a global variable. The global object can be obtained from a reference to this in an unbound function. The following snippet will be evaluated:

/* create global object */
var global = (function(){ return this; }).call(null);

In MuJS, js_dostring evaluates code stored in C strings:

/* create `global` variable (same snippet as above, passed as a C string) */
js_dostring(J, "var global = (function() { return this; }).call(null);");

Console

MuJS has no built-in method to print data. The official examples define the following print method:

static void jsB_print(js_State *J) {
  /* index 0 is `this`; arguments start at index 1 */
  int i = 1, top = js_gettop(J);
  for (; i < top; ++i) {
    const char *s = js_tostring(J, i);
    if (i > 1) putchar(' ');
    /* fputs, unlike puts, does not append a newline after each argument */
    fputs(s, stdout);
  }
  putchar('\n');
  js_pushundefined(J);
}

This function can be exposed in the JS engine by using js_newcfunction to add the function to the engine and js_setglobal to bind it to a name:

js_newcfunction(J, jsB_print, "print", 0);
js_setglobal(J, "print");

After adding print to the engine, the following JS snippet will create a console object with a log method:

/* create a fake `console` from the `print` function defined above */
var console = { log: function(x) { print(x); } };

In MuJS, js_dostring evaluates code stored in C strings:

js_dostring(J, "var console = { log: print };");

Load SheetJS Scripts

SheetJS Standalone scripts can be parsed and evaluated in a C context.

The shim and main library can be loaded with the MuJS js_dofile method. It reads scripts from the filesystem and evaluates them in the MuJS context:

/* load scripts */
js_dofile(J, "shim.min.js");
js_dofile(J, "xlsx.full.min.js");

Reading Files

MuJS does not expose a method to pass raw byte arrays into the engine. Instead, the raw data should be encoded in Base64.

Reading File Bytes

File bytes can be read using standard C library methods. The example defines a method read_file with the following signature:

/* Read data from filesystem
 * `filename` - path to file
 * `sz` - pointer to size_t
 * return value is a pointer to the start of the file data
 *   (NULL if the file cannot be opened)
 * the length of the data will be written to `sz`
 */
char *read_file(const char *filename, size_t *sz);

File Reader Implementation

This function uses standard C API methods.

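/* -------------------- */
/* read file from filesystem */

#include <stdio.h>
#include <stdlib.h>

static char *read_file(const char *filename, size_t *sz) {
  FILE *f;
  long fsize;
  char *buf;

  f = fopen(filename, "rb");
  if(!f) return NULL;

  /* seek to the end to measure the file, then rewind */
  fseek(f, 0, SEEK_END); fsize = ftell(f); fseek(f, 0, SEEK_SET);
  if(fsize < 0) { fclose(f); return NULL; }

  buf = (char *)malloc(fsize > 0 ? (size_t)fsize : 1);
  if(!buf) { fclose(f); return NULL; }

  /* `*sz` receives the number of bytes actually read */
  *sz = fread((void *) buf, 1, (size_t)fsize, f);
  fclose(f);
  return buf;
}

/* -------------------- */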

The example program will accept an argument and read the specified file:

/* read file */
size_t dlen; char *dbuf = read_file(argv[1], &dlen);

Base64 String

The example defines a method Base64_encode with the following signature:

/* Encode data with Base64
 * `dst` - start of output buffer
 * `src` - start of input data
 * `len` - number of bytes to encode
 * return value is the number of bytes written to `dst`,
 *   including the terminating NUL
 */
int Base64_encode(char *dst, const char *src, int len);

Base64 Encoder Implementation

The method mirrors the TypeScript implementation:

/* -------------------- */
/* base64 encoder */

const char Base64_map[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

static int Base64_encode(char *dst, const char *src, int len) {
  unsigned char c1 = 0, c2 = 0, c3 = 0;
  char *p = dst;
  int i = 0;

  /* full 3-byte groups: stop before a trailing partial group */
  for(; i + 2 < len;) {
    c1 = src[i++];
    *p++ = Base64_map[(c1 >> 2)];

    c2 = src[i++];
    *p++ = Base64_map[((c1 & 3) << 4) | (c2 >> 4)];

    c3 = src[i++];
    *p++ = Base64_map[((c2 & 15) << 2) | (c3 >> 6)];
    *p++ = Base64_map[c3 & 0x3F];
  }

  /* 1 or 2 leftover bytes: emit the partial group and pad with '=' */
  if(i < len) {
    c1 = src[i++];
    *p++ = Base64_map[(c1 >> 2)];
    if(i == len) {
      *p++ = Base64_map[(c1 & 3) << 4];
      *p++ = '=';
    } else {
      c2 = src[i++];
      *p++ = Base64_map[((c1 & 3) << 4) | (c2 >> 4)];
      *p++ = Base64_map[(c2 & 15) << 2];
    }
    *p++ = '=';
  }

  /* the returned count includes this terminating NUL */
  *p++ = '\0';
  return (int)(p - dst);
}

/* -------------------- */
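
A quick way to sanity-check a Base64 encoder is to compare its output against the well-known vectors "Man", "Ma" and "M", which exercise the no-padding, one-pad and two-pad cases. The sketch below is not the demo's encoder: b64_sketch and b64_check_vectors are hypothetical names, and the stand-in takes unsigned input and returns the length without the terminating NUL.

```c
#include <stddef.h>
#include <string.h>

static const char b64_map[] =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* hypothetical stand-in encoder: writes a NUL-terminated Base64 string to
 * `dst` and returns the number of characters written, excluding the NUL */
static size_t b64_sketch(char *dst, const unsigned char *src, size_t len) {
  char *p = dst;
  size_t i = 0;

  /* each full 3-byte group maps to 4 characters */
  for(; i + 2 < len; i += 3) {
    *p++ = b64_map[src[i] >> 2];
    *p++ = b64_map[((src[i] & 3) << 4) | (src[i+1] >> 4)];
    *p++ = b64_map[((src[i+1] & 15) << 2) | (src[i+2] >> 6)];
    *p++ = b64_map[src[i+2] & 0x3F];
  }

  /* 1 or 2 leftover bytes are padded with '=' to a 4-character group */
  if(i < len) {
    *p++ = b64_map[src[i] >> 2];
    if(i + 1 == len) {
      *p++ = b64_map[(src[i] & 3) << 4];
      *p++ = '=';
    } else {
      *p++ = b64_map[((src[i] & 3) << 4) | (src[i+1] >> 4)];
      *p++ = b64_map[(src[i+1] & 15) << 2];
    }
    *p++ = '=';
  }

  *p = '\0';
  return (size_t)(p - dst);
}

/* returns 1 when all three padding cases produce the expected text */
static int b64_check_vectors(void) {
  char out[16];
  b64_sketch(out, (const unsigned char *)"Man", 3);
  if(strcmp(out, "TWFu") != 0) return 0; /* no padding */
  b64_sketch(out, (const unsigned char *)"Ma", 2);
  if(strcmp(out, "TWE=") != 0) return 0; /* one pad character */
  b64_sketch(out, (const unsigned char *)"M", 1);
  if(strcmp(out, "TQ==") != 0) return 0; /* two pad characters */
  return 1;
}
```

The same three inputs can be fed to any other encoder to confirm that a trailing partial group is padded rather than read past the end of the input.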

Typically, C code will read files and encode the data as Base64 strings. The intermediate string is approximately 33% longer than the original data (every 3 raw bytes map to 4 Base64 characters).

/* base64 encode the file */
int sz = ((dlen + 2) / 3) * 4 + 1;
char *b64 = malloc(sz+1);
sz = Base64_encode(b64, dbuf, dlen);
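
The buffer size arithmetic can be isolated in a small helper. This is a minimal sketch: b64_buffer_size is a hypothetical name, not part of the demo, and it returns the same quantity as the first line of the snippet above (Base64 characters plus one byte for the terminating NUL).

```c
#include <stddef.h>

/* hypothetical helper: bytes needed to hold the Base64 text for `n` raw
 * bytes plus a terminating NUL. Every group of up to 3 input bytes becomes
 * 4 output characters, so the group count is rounded up. */
static size_t b64_buffer_size(size_t n) {
  return ((n + 2) / 3) * 4 + 1;
}
```

For example, 1000 raw bytes form 334 groups, which gives 1336 Base64 characters and a 1337-byte buffer, about 33.6% more than the input.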

Passing Strings

The Base64 string can be added to the engine using js_pushlstring. After adding it to the engine, js_setglobal can bind the value to the name buf:

/* create `buf` global from the data */
js_pushlstring(J, b64, sz);
js_setglobal(J, "buf");

SheetJS Operations

In this example, the goal is to pull the first worksheet and generate CSV rows.

XLSX.read[^1] parses the Base64 string and returns a SheetJS workbook object:

/* parse file */
js_dostring(J, "var wb = XLSX.read(buf, {type: 'base64'});");

The SheetNames property[^2] is an array of the sheet names in the workbook. The first sheet name can be obtained with the following JS snippet:

var first_sheet_name = wb.SheetNames[0];

The Sheets property[^3] is an object whose keys are sheet names and whose corresponding values are worksheet objects.

var first_sheet = wb.Sheets[first_sheet_name];

The sheet_to_csv utility function[^4] generates a CSV string from the sheet:

var csv = XLSX.utils.sheet_to_csv(first_sheet);

C integration code

In this example, the console.log method will print the generated CSV:

/* print CSV from first worksheet */
js_dostring(J, "var ws = wb.Sheets[wb.SheetNames[0]]");
js_dostring(J, "console.log(XLSX.utils.sheet_to_csv(ws));");
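
The snippet above hard-codes the first worksheet. If a different worksheet is wanted, one option is to build the script string in C before evaluating it. The following is a sketch under stated assumptions: build_sheet_script and SHEET_SCRIPT_MAX are hypothetical names that do not appear in the demo, and sprintf is used instead of snprintf to stay within C89.

```c
#include <stdio.h>
#include <string.h>

/* hypothetical minimum buffer size: the fixed text is 36 characters, which
 * leaves room for any int index and the terminating NUL */
#define SHEET_SCRIPT_MAX 64

/* hypothetical helper: writes JS code that selects worksheet `idx` into
 * `dst`. Returns the script length, or -1 if the index is negative or the
 * buffer is smaller than SHEET_SCRIPT_MAX. */
static int build_sheet_script(char *dst, size_t cap, int idx) {
  if(idx < 0 || cap < SHEET_SCRIPT_MAX) return -1;
  return sprintf(dst, "var ws = wb.Sheets[wb.SheetNames[%d]];", idx);
}
```

The resulting string would be passed to js_dostring in place of the hard-coded line. An index past the end of SheetNames is not detected here; in that case ws would be undefined on the JS side.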

Integration Example

Tested Deployments

This demo was tested in the following deployments:

Architecture  Version  Date
darwin-x64    1.3.4    2024-05-25
darwin-arm    1.3.4    2024-05-23
win10-x64     1.3.4    2024-06-20
win11-arm     1.3.4    2024-06-20
linux-x64     1.3.4    2024-04-21
linux-arm     1.3.4    2024-05-25

MuJS distributions do not include native Windows projects. The win10-x64 and win11-arm tests were run entirely within Windows Subsystem for Linux.

When building in WSL, libreadline-dev must be installed using apt:

sudo apt-get install libreadline-dev

  1. Make a project directory:

mkdir sheetjs-mu
cd sheetjs-mu

  2. Build the MuJS library from source:

curl -LO https://mujs.com/downloads/mujs-1.3.4.zip
unzip mujs-1.3.4.zip
cd mujs-1.3.4
make release
cd ..

  3. Copy the mujs.h header file and libmujs.a library to the project folder:

cp mujs-1.3.4/build/release/libmujs.a mujs-1.3.4/mujs.h .

  4. Download SheetJSMu.c:

curl -LO https://xlsx.nodejs.cn/mujs/SheetJSMu.c

  5. Build the application:

gcc -o SheetJSMu SheetJSMu.c -L. -lmujs -lm -lc -std=c89 -Wall

  6. Download the SheetJS Standalone script, shim script and test file. Move all three files to the project directory:

curl -LO https://cdn.sheetjs.com/xlsx-0.20.3/package/dist/shim.min.js
curl -LO https://cdn.sheetjs.com/xlsx-0.20.3/package/dist/xlsx.full.min.js
curl -LO https://xlsx.nodejs.cn/pres.xlsb

  7. Run the application:

./SheetJSMu pres.xlsb

If successful, the app will print the contents of the first sheet as CSV rows.

[^1]: See read in "Reading Files"

[^2]: See "Workbook Object"

[^3]: See "Workbook Object"

[^4]: See sheet_to_csv in "Utilities"