Data Processing with JE

In a production application, it is strongly recommended to use a binding for a C engine like JavaScript::Duktape.

JE is a pure-Perl JavaScript engine.

SheetJS is a JavaScript library for reading and writing data from spreadsheets.

This demo uses JE and SheetJS to pull data from a spreadsheet and print CSV rows. We'll explore how to load SheetJS in a JE context and process spreadsheets from Perl scripts.

The "Complete Example" section includes a complete script for reading data from XLS files, printing CSV rows, and writing FODS workbooks.

Integration Details

The SheetJS ExtendScript build can be parsed and evaluated in a JE context.

The engine deviates from ES3. Modifying prototypes can fix some behavior:

Required shim to support JE

The shim implements the following features:

  • simple String charCodeAt (the method is missing in JE)
  • Number charCodeAt (to work around a string split bug)
  • String match (to work around a bug when there are no matches)

/* String#charCodeAt is missing */
var string = "";
for(var i = 0; i < 256; ++i) string += String.fromCharCode(i);
String.prototype.charCodeAt = function(n) {
  var result = string.indexOf(this.charAt(n));
  if(result == -1) throw this.charAt(n);
  return result;
};

/* workaround for String split bug */
Number.prototype.charCodeAt = function(n) { return this + 48; };

/* String#match bug with empty results */
String.prototype.old_match = String.prototype.match;
String.prototype.match = function(p) {
  var result = this.old_match(p);
  return (Array.isArray(result) && result.length == 0) ? null : result;
};
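
The lookup-table idea behind the charCodeAt shim can be checked in isolation. The following is a minimal sketch meant to run in a standard engine such as Node.js, not in JE itself. The helper name `lookupCharCode` is hypothetical and exists only for this illustration; it mirrors the shim logic without modifying `String.prototype`.

```javascript
// Build the same 256-character lookup string that the shim uses.
var table = "";
for (var i = 0; i < 256; ++i) table += String.fromCharCode(i);

// Hypothetical standalone version of the shim logic.
// It looks up the character's position in the table instead of
// relying on a native charCodeAt.
function lookupCharCode(str, n) {
  var result = table.indexOf(str.charAt(n));
  if (result == -1) throw new Error("character outside 0-255: " + str.charAt(n));
  return result;
}

// Compare the lookup against the native method for every code in 0..255.
var mismatches = 0;
for (var c = 0; c < 256; ++c) {
  var ch = String.fromCharCode(c);
  if (lookupCharCode(ch, 0) !== ch.charCodeAt(0)) ++mismatches;
}
console.log(mismatches);
```

Two limits of this approach are worth keeping in mind: characters above code 255 cause an exception, and in a standard engine an out-of-range index yields 0 (because `charAt` returns an empty string) where the native method would return NaN.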

When loading the ExtendScript build, the BOM must be removed:

use JE;
use File::Slurp;
my $je = JE->new; ## JavaScript engine instance

## Load SheetJS source
my $src = read_file('xlsx.extendscript.js', { binmode => ':raw' });
$src =~ s/^\xEF\xBB\xBF//; ## remove UTF8 BOM
my $XLSX = $je->eval($src);

Reading Files

Data should be passed as Base64 strings:

use File::Slurp;
use MIME::Base64 qw( encode_base64 );

## Set up conversion method
$je->eval(<<'EOF');
function sheetjsparse(data) {
  try {
    return XLSX.read(String(data), {type: "base64", WTF:1});
  } catch(e) { return String(e); }
}
EOF

## Read file
my $raw_data = encode_base64(read_file($ARGV[0], { binmode => ':raw' }), "");

## Call method with data
my $return_val = $je->method(sheetjsparse => $raw_data);
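
The wrapper above returns either the parsed workbook or the error text as a string, so the caller has to tell the two apart. The following sketch shows that pattern in plain JavaScript. `standInRead`, `parseOrMessage`, and `isWorkbook` are hypothetical names used only for this illustration; `standInRead` is a stand-in for `XLSX.read` and merely checks that the input looks like Base64.

```javascript
// Hypothetical stand-in for XLSX.read (NOT the real parser).
// It throws when the input contains characters outside the Base64 alphabet.
function standInRead(data, opts) {
  if (!/^[A-Za-z0-9+\/]*={0,2}$/.test(data)) throw new Error("bad base64");
  return { SheetNames: ["Sheet1"], Sheets: {} };
}

// Same shape as sheetjsparse: return the parsed object or the error text.
function parseOrMessage(data) {
  try {
    return standInRead(String(data), { type: "base64" });
  } catch (e) { return String(e); }
}

// A caller can distinguish success from failure by the result type.
function isWorkbook(result) {
  return typeof result === "object" && result !== null;
}

console.log(isWorkbook(parseOrMessage("AAAA")));
console.log(parseOrMessage("not base64!"));
```

Returning the message instead of letting the exception propagate keeps the failure visible as an ordinary value, at the cost of requiring a type check before the result is used as a workbook.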

Writing Files

Due to bugs in data interchange, it is strongly recommended to use a simple format like .fods:

use File::Slurp;

## Set up conversion method
$je->eval(<<'EOF');
function sheetjswrite(wb) {
  try {
    return XLSX.write(wb, { WTF:1, bookType: "fods", type: "string" });
  } catch(e) { return String(e); }
}
EOF

## Generate file
my $fods = $je->method(sheetjswrite => $workbook);

## Write to filesystem
write_file("SheetJE.fods", $fods);

Complete Example

Tested Deployments

This demo was tested in the following deployments:

Architecture  Version  Date
darwin-x64    0.066    2024-06-29
darwin-arm    0.066    2024-05-25
linux-x64     0.066    2024-06-29
linux-arm     0.066    2024-05-25

  1. Install JE and File::Slurp through CPAN:

cpan install JE File::Slurp

There were permissions errors in some test runs:

mkdir /Library/Perl/5.30/File: Permission denied at /System/Library/Perl/5.30/ExtUtils/Install.pm line 489.

On macOS, the commands should be run through sudo:

sudo cpan install JE File::Slurp
  2. Download the SheetJS ExtendScript build:

curl -LO https://cdn.sheetjs.com/xlsx-0.20.3/package/dist/xlsx.extendscript.js
  3. Download the demo SheetJE.pl:

curl -LO https://xlsx.nodejs.cn/perl/SheetJE.pl
  4. Download the test file and run:

curl -LO https://xlsx.nodejs.cn/cd.xls
perl SheetJE.pl cd.xls

After a short wait, the contents will be displayed in CSV form. The script will also generate the spreadsheet SheetJE.fods, which can be opened in LibreOffice.