Data Processing in MuJS
MuJS is a C89-compatible embeddable JS engine.
SheetJS is a JavaScript library for reading and writing data from spreadsheets.
This demo uses MuJS and SheetJS to pull data from a spreadsheet and print CSV rows. We'll explore how to load SheetJS in a MuJS context and process spreadsheets from a C program.
The "Integration Example" section includes a complete command-line tool for reading data from files.
The MuJS engine has a number of bugs that affect parsing in XLSX, XLML and other XML and plaintext file formats. If software does not need to support legacy systems or architectures, it is strongly recommended to use a modern engine such as Duktape.
Integration Details
Many MuJS functions are not documented. The explanation was verified against version 1.3.4.
Initialize MuJS
A MuJS engine instance is created with `js_newstate`:
js_State *J = js_newstate(NULL, NULL, 0);
Error Messages
A special `report` callback should be used to display error messages. This report function is used in official examples:
static void report(js_State *J, const char *msg) { fprintf(stderr, "REPORT MSG: %s\n", msg); }
The `js_setreport` function attaches the reporter to the engine:
js_setreport(J, report);
Global
MuJS does not expose a `global` variable. It can be obtained from a reference to `this` in an unbound function. The following snippet will be evaluated:
/* create global object */
var global = (function(){ return this; }).call(null);
In MuJS, `js_dostring` evaluates code stored in C strings:
/* create `global` variable */
js_dostring(J, "var global = (function() { return this; }).call(null);");
Console
MuJS has no built-in method to print data. The official examples define the following `print` method:
static void jsB_print(js_State *J) {
  int i = 1, top = js_gettop(J);
  for (; i < top; ++i) {
    const char *s = js_tostring(J, i);
    if (i > 1) putchar(' ');
    /* `fputs` does not append a newline, so all arguments stay on one line */
    fputs(s, stdout);
  }
  putchar('\n');
  js_pushundefined(J);
}
This function can be exposed in the JS engine by using `js_newcfunction` to add the function to the engine and `js_setglobal` to bind it to a name:
js_newcfunction(J, jsB_print, "print", 0);
js_setglobal(J, "print");
After adding `print` to the engine, the following JS snippet will create a `console` object with a `log` method:
/* create a fake `console` from the `print` function added above */
var console = { log: function(x) { print(x); } };
In MuJS, `js_dostring` evaluates code stored in C strings:
js_dostring(J, "var console = { log: print };");
Load SheetJS Scripts
SheetJS Standalone scripts can be parsed and evaluated in a C context.
The shim and main library can be loaded with the MuJS `js_dofile` method. It reads scripts from the filesystem and evaluates them in the MuJS context:
/* load scripts */
js_dofile(J, "shim.min.js");
js_dofile(J, "xlsx.full.min.js");
Reading Files
MuJS does not expose a method to pass raw byte arrays into the engine. Instead, the raw data should be encoded in Base64.
Reading File Bytes
File bytes can be read using standard C library functions. The example defines a function `read_file` with the following signature:
/* Read data from filesystem
 * `filename` - path to the file
* `sz` - pointer to size_t
* return value is a pointer to the start of the file data
* the length of the data will be written to `sz`
*/
char *read_file(const char *filename, size_t *sz);
File Reader Implementation
This function uses standard C API methods.
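/* -------------------- */
/* read file from filesystem */
static char *read_file(const char *filename, size_t *sz) {
  long fsize;
  char *buf;
  FILE *f = fopen(filename, "rb");
  if(!f) return NULL;
  /* seek to the end to determine the file size, then rewind */
  fseek(f, 0, SEEK_END); fsize = ftell(f); fseek(f, 0, SEEK_SET);
  if(fsize < 0) { fclose(f); return NULL; }
  buf = (char *)malloc(fsize > 0 ? (size_t)fsize : 1);
  if(!buf) { fclose(f); return NULL; }
  /* `fread` returns the number of bytes actually read */
  *sz = fread((void *) buf, 1, (size_t)fsize, f);
  fclose(f);
  return buf;
}
/* -------------------- */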
The example program will accept an argument and read the specified file:
/* read file */
size_t dlen; char *dbuf = read_file(argv[1], &dlen);
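The call above assumes that an argument was supplied and that the file could be read. A minimal sketch of guarding both cases follows. The names `slurp` and `load_input` and the numeric error codes are hypothetical and are not part of the demo; `slurp` is a stand-in for the `read_file` helper.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the `read_file` helper described above */
static char *slurp(const char *filename, size_t *sz) {
  long fsize;
  char *buf;
  FILE *f = fopen(filename, "rb");
  if(!f) return NULL;
  fseek(f, 0, SEEK_END); fsize = ftell(f); fseek(f, 0, SEEK_SET);
  if(fsize < 0) { fclose(f); return NULL; }
  buf = (char *)malloc(fsize > 0 ? (size_t)fsize : 1);
  if(!buf) { fclose(f); return NULL; }
  *sz = fread(buf, 1, (size_t)fsize, f);
  fclose(f);
  return buf;
}

/* Hypothetical wrapper: returns 0 on success, 1 if no filename was given,
 * 2 if the file could not be read. On success the caller frees `*out`. */
static int load_input(int argc, char **argv, char **out, size_t *len) {
  if(argc < 2) {
    fprintf(stderr, "usage: %s <spreadsheet>\n", argc > 0 ? argv[0] : "SheetJSMu");
    return 1;
  }
  *out = slurp(argv[1], len);
  if(!*out) {
    fprintf(stderr, "could not read %s\n", argv[1]);
    return 2;
  }
  return 0;
}
```

In a `main` function, a nonzero result from `load_input(argc, argv, &dbuf, &dlen)` could be returned directly as the exit status before any engine setup happens.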
Base64 String
The example defines a function `Base64_encode` with the following signature:
/* Encode data with Base64
* `dst` - start of output buffer
* `src` - start of input data
* `len` - number of bytes to encode
* return value is the number of bytes
*/
int Base64_encode(char *dst, const char *src, int len);
Base64 Encoder Implementation
The method mirrors the TypeScript implementation:
/* -------------------- */
/* base64 encoder */
const char Base64_map[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
static int Base64_encode(char *dst, const char *src, int len) {
  unsigned char c1 = 0, c2 = 0, c3 = 0;
  char *p = dst;
  int i = 0;
  /* encode each complete 3-byte group as 4 characters */
  for(; i + 2 < len;) {
    c1 = src[i++];
    *p++ = Base64_map[(c1 >> 2)];
    c2 = src[i++];
    *p++ = Base64_map[((c1 & 3) << 4) | (c2 >> 4)];
    c3 = src[i++];
    *p++ = Base64_map[((c2 & 15) << 2) | (c3 >> 6)];
    *p++ = Base64_map[c3 & 0x3F];
  }
  /* encode the 1 or 2 leftover bytes and pad with `=` */
  if(i < len) {
    c1 = src[i++];
    *p++ = Base64_map[(c1 >> 2)];
    if(i == len) {
      *p++ = Base64_map[(c1 & 3) << 4];
      *p++ = '=';
    } else {
      c2 = src[i++];
      *p++ = Base64_map[((c1 & 3) << 4) | (c2 >> 4)];
      *p++ = Base64_map[(c2 & 15) << 2];
    }
    *p++ = '=';
  }
  /* terminate the string; the terminator is not counted in the return value */
  *p = '\0';
  return (int)(p - dst);
}
/* -------------------- */
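One way to gain confidence in a Base64 encoder is to compare its output against the test vectors listed in RFC 4648. The sketch below is illustrative only: `b64_encode` and `b64_self_check` are hypothetical names, and the compact encoder is a separate stand-in rather than the demo's `Base64_encode`.

```c
#include <stddef.h>
#include <string.h>

/* Standard Base64 alphabet (RFC 4648, section 4) */
static const char b64_alphabet[] =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical compact encoder used only for this check.
 * Writes a NUL-terminated string to `dst` and returns its length. */
static size_t b64_encode(char *dst, const unsigned char *src, size_t len) {
  size_t i = 0, o = 0;
  /* complete 3-byte groups */
  while(len - i >= 3) {
    dst[o++] = b64_alphabet[src[i] >> 2];
    dst[o++] = b64_alphabet[((src[i] & 3) << 4) | (src[i+1] >> 4)];
    dst[o++] = b64_alphabet[((src[i+1] & 15) << 2) | (src[i+2] >> 6)];
    dst[o++] = b64_alphabet[src[i+2] & 63];
    i += 3;
  }
  /* 1 or 2 leftover bytes, padded with '=' */
  if(len - i == 1) {
    dst[o++] = b64_alphabet[src[i] >> 2];
    dst[o++] = b64_alphabet[(src[i] & 3) << 4];
    dst[o++] = '='; dst[o++] = '=';
  } else if(len - i == 2) {
    dst[o++] = b64_alphabet[src[i] >> 2];
    dst[o++] = b64_alphabet[((src[i] & 3) << 4) | (src[i+1] >> 4)];
    dst[o++] = b64_alphabet[(src[i+1] & 15) << 2];
    dst[o++] = '=';
  }
  dst[o] = '\0';
  return o;
}

/* Returns 1 if every RFC 4648 test vector encodes as expected, 0 otherwise */
static int b64_self_check(void) {
  static const char *plain[]   = { "", "f", "fo", "foo", "foob", "fooba", "foobar" };
  static const char *encoded[] = { "", "Zg==", "Zm8=", "Zm9v", "Zm9vYg==", "Zm9vYmE=", "Zm9vYmFy" };
  char out[16];
  size_t k;
  for(k = 0; k < sizeof(plain) / sizeof(plain[0]); ++k) {
    size_t n = b64_encode(out, (const unsigned char *)plain[k], strlen(plain[k]));
    if(n != strlen(encoded[k]) || strcmp(out, encoded[k]) != 0) return 0;
  }
  return 1;
}
```

The vectors cover input lengths that leave zero, one and two leftover bytes, which are the cases where padding bugs tend to show up.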
Typically C code will read files and encode to Base64 strings. The intermediate string length is approximately 33% larger than the original length (3 raw bytes are mapped to 4 Base64 characters).
/* base64 encode the file */
int sz = ((dlen + 2) / 3) * 4 + 1;
char *b64 = malloc(sz+1);
sz = Base64_encode(b64, dbuf, dlen);
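The sizing arithmetic can be isolated in small helpers. This is a sketch under the assumption that the output is padded and NUL-terminated; `b64_encoded_len` and `b64_buffer_size` are hypothetical names that do not appear in the demo.

```c
#include <stddef.h>

/* Hypothetical helper: number of Base64 characters for `n` raw bytes,
 * including `=` padding and excluding the NUL terminator.
 * Every group of up to 3 input bytes becomes exactly 4 characters. */
static size_t b64_encoded_len(size_t n) {
  return ((n + 2) / 3) * 4;
}

/* Bytes to allocate: encoded characters plus one for the NUL terminator */
static size_t b64_buffer_size(size_t n) {
  return b64_encoded_len(n) + 1;
}
```

For a 1,000,000-byte file the helpers give 1,333,336 characters and a 1,333,337-byte buffer, which matches the roughly 33% overhead.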
Passing Strings
The Base64 string can be added to the engine using `js_pushlstring`. After adding to the engine, `js_setglobal` can bind the variable to the name `buf`:
/* create `buf` global from the data */
js_pushlstring(J, b64, sz);
js_setglobal(J, "buf");
SheetJS Operations
In this example, the goal is to pull the first worksheet and generate CSV rows.
`XLSX.read`[^1] parses the Base64 string and returns a SheetJS workbook object:
/* parse file */
js_dostring(J, "var wb = XLSX.read(buf, {type: 'base64'});");
The `SheetNames` property[^2] is an array of the sheet names in the workbook. The first sheet name can be obtained with the following JS snippet:
var first_sheet_name = wb.SheetNames[0];
The `Sheets` property[^3] is an object whose keys are sheet names and whose corresponding values are worksheet objects.
var first_sheet = wb.Sheets[first_sheet_name];
The `sheet_to_csv` utility function[^4] generates a CSV string from the sheet:
var csv = XLSX.utils.sheet_to_csv(first_sheet);
C integration code
In this example, the `console.log` method will print the generated CSV:
/* print CSV from first worksheet */
js_dostring(J, "var ws = wb.Sheets[wb.SheetNames[0]]");
js_dostring(J, "console.log(XLSX.utils.sheet_to_csv(ws));");
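The snippets above always select the first worksheet. If a different worksheet is wanted, the JS source can be assembled in C before evaluation. The following is a minimal sketch: `build_csv_script` and `CSV_SCRIPT_MAX` are hypothetical names, and the code assumes the `wb` and `console` globals were created as described earlier.

```c
#include <stdio.h>

/* Enough room for the fixed template plus any `int` rendered in decimal */
#define CSV_SCRIPT_MAX 128

/* Hypothetical helper: writes a JS snippet that prints worksheet `idx` as CSV.
 * `out` must have room for CSV_SCRIPT_MAX bytes.
 * Returns 0 on success, -1 if `idx` is negative. */
static int build_csv_script(char *out, int idx) {
  if(idx < 0) return -1;
  sprintf(out, "console.log(XLSX.utils.sheet_to_csv(wb.Sheets[wb.SheetNames[%d]]));", idx);
  return 0;
}
```

The resulting string could then be passed to `js_dostring` in place of the two fixed snippets. `sprintf` is used instead of `snprintf` to stay within C89, so the buffer size must be chosen to fit the longest possible output.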
Integration Example
This demo was tested in the following deployments:
Architecture | Version | Date |
---|---|---|
darwin-x64 | 1.3.4 | 2024-05-25 |
darwin-arm | 1.3.4 | 2024-05-23 |
win10-x64 | 1.3.4 | 2024-06-20 |
win11-arm | 1.3.4 | 2024-06-20 |
linux-x64 | 1.3.4 | 2024-04-21 |
linux-arm | 1.3.4 | 2024-05-25 |
MuJS distributions do not include native Windows projects. The `win10-x64` and `win11-arm` tests were run entirely within Windows Subsystem for Linux.
When building in WSL, `libreadline-dev` must be installed using `apt`:
sudo apt-get install libreadline-dev
- Make a project directory:
mkdir sheetjs-mu
cd sheetjs-mu
- Build the MuJS static library from source:
curl -LO https://mujs.com/downloads/mujs-1.3.4.zip
unzip mujs-1.3.4.zip
cd mujs-1.3.4
make release
cd ..
- Copy the `mujs.h` header file and `libmujs.a` library to the project folder:
cp mujs-1.3.4/build/release/libmujs.a mujs-1.3.4/mujs.h .
- Download `SheetJSMu.c`:
curl -LO https://xlsx.nodejs.cn/mujs/SheetJSMu.c
- Build the application:
gcc -o SheetJSMu SheetJSMu.c -L. -lmujs -lm -lc -std=c89 -Wall
- Download the SheetJS Standalone script, shim script and test file. Move all three files to the project directory:
curl -LO https://cdn.sheetjs.com/xlsx-0.20.3/package/dist/shim.min.js
curl -LO https://cdn.sheetjs.com/xlsx-0.20.3/package/dist/xlsx.full.min.js
curl -LO https://xlsx.nodejs.cn/pres.xlsb
- Run the application:
./SheetJSMu pres.xlsb
If successful, the app will print the contents of the first sheet as CSV rows.
[^1]: See `read` in "Reading Files"
[^2]: See "Workbook Object"
[^3]: See "Workbook Object"
[^4]: See `sheet_to_csv` in "Utilities"