流导出
许多平台提供写入文件的方法。这些方法通常期望在写入之前生成整个文件。大型工作簿文件可能超出平台特定的大小限制。
¥Many platforms offer methods to write files. These methods typically expect the entire file to be generated before writing. Large workbook files may exceed platform-specific size limits.
一些平台还提供 "streaming" 或 "incremental" 方法。这些方法可以接受小块数据并逐步写入文件系统,而不是一次写入整个文件。
¥Some platforms also offer a "streaming" or "incremental" approach. Instead of writing the entire file at once, these methods can accept small chunks of data and incrementally write to the filesystem.
流式写入 演示包括实时浏览器演示和不支持 SheetJS 流的平台的说明。
¥The Streaming Write demo includes live browser demos and notes for platforms that do not support SheetJS streams.
此功能在版本 0.20.3
中进行了扩展。强烈推荐给 升级到最新版本。
¥This feature was expanded in version 0.20.3
. It is strongly recommended to
upgrade to the latest version.
流基础
¥Streaming Basics
SheetJS 流使用 NodeJS 推送流 API。强烈建议查看官方 NodeJS "流" 文档 [^1]。
¥SheetJS streams use the NodeJS push streams API. It is strongly recommended to review the official NodeJS "Stream" documentation[^1].
Historical Note (click to show)
NodeJS push streams were introduced in 2012. The text streaming methods to_csv
and to_html
are supported in NodeJS v0.10 and later while the object streaming
method to_json
is supported in NodeJS v0.12 and later.
The first SheetJS streaming write function, to_csv
, was introduced in 2017. It
used and still uses the battle-tested NodeJS streaming API.
Years later, browser vendors opted to standardize a different stream API.
For maximal compatibility, the library uses NodeJS push streams.
NodeJS ECMAScript 模块支持
¥NodeJS ECMAScript Module Support
在 CommonJS 模块中,库可以使用 require
加载 stream
模块。SheetJS 库将在适用的情况下加载流支持。
¥In CommonJS modules, libraries can load the stream
module using require
.
SheetJS libraries will load streaming support where applicable.
由于 ESM 限制,库无法自由导入 stream
模块。
¥Due to ESM limitations, libraries cannot freely import the stream
module.
原始规范仅支持顶层导入:
¥The original specification only supported top-level imports:
import { Readable } from 'stream';
如果模块不可用,脚本无法正常失败或忽略错误。
¥If a module is unavailable, there is no way for scripts to gracefully fail or ignore the error.
规范补丁为该问题添加了两种不同的解决方案:
¥Patches to the specification added two different solutions to the problem:
-
"动态导入" 将抛出可由库处理的错误。动态导入将污染不使用基于 Promise 的方法的 API。
¥"dynamic imports" will throw errors that can be handled by libraries. Dynamic imports will taint APIs that do not use Promise-based methods.
/* Readable will be undefined if stream cannot be imported */
const Readable = await (async() => {
try {
return (await import("stream"))?.Readable;
} catch(e) { /* silently ignore error */ }
})();
-
"导入映射" 控制模块解析,允许库用户手动分流不受支持的模块。
¥"import maps" control module resolution, allowing library users to manually shunt unsupported modules.
这些补丁是在浏览器采用 ESM 后发布的!许多浏览器和其他平台支持顶层导入,但不支持补丁。
¥These patches were released after browsers adopted ESM! A number of browsers and other platforms support top-level imports but do not support the patches.
由于 ESM 问题,强烈建议尽可能使用 CommonJS!
¥Due to ESM woes, it is strongly recommended to use CommonJS when possible!
为了最大限度地支持平台,SheetJS 库公开了一种特殊的 set_readable
方法来提供 Readable
实现:
¥For maximal platform support, SheetJS libraries expose a special set_readable
method to provide a Readable
implementation:
import { stream as SheetJStream } from 'xlsx';
import { Readable } from 'stream';
SheetJStream.set_readable(Readable);
工作表导出
¥Worksheet Export
工作表导出方法接受 SheetJS 工作表对象。
¥The worksheet export methods accept a SheetJS worksheet object.
CSV 导出
¥CSV Export
以 "逗号分隔值" (CSV) 格式导出工作表数据
¥Export worksheet data in "Comma-Separated Values" (CSV)
var csvstream = XLSX.stream.to_csv(ws, opts);
to_csv
创建 NodeJS 文本流。这些选项镜像非流式 sheet_to_csv
方法。
¥to_csv
creates a NodeJS text stream. The options mirror the non-streaming
sheet_to_csv
method.
以下 NodeJS 脚本获取 https://xlsx.nodejs.cn/pres.numbers 并将 CSV 行流式传输到终端。
¥The following NodeJS script fetches https://xlsx.nodejs.cn/pres.numbers and streams CSV rows to the terminal.
- CommonJS
- ESM
const XLSX = require("xlsx");
(async() => {
var ab = await (await fetch("https://xlsx.nodejs.cn/pres.numbers")).arrayBuffer()
var wb = XLSX.read(ab);
var ws = wb.Sheets[wb.SheetNames[0]];
XLSX.stream.to_csv(ws).pipe(process.stdout);
})();
import { read, stream } from "xlsx";
import { Readable } from "stream";
stream.set_readable(Readable);
var ab = await (await fetch("https://xlsx.nodejs.cn/pres.numbers")).arrayBuffer()
var wb = read(ab);
var ws = wb.Sheets[wb.SheetNames[0]];
stream.to_csv(ws).pipe(process.stdout);
JSON 导出
¥JSON Export
将工作表数据导出到 "数组的数组" 或 "对象数组"
¥Export worksheet data to "Arrays of Arrays" or "Arrays of Objects"
var jsonstream = XLSX.stream.to_json(ws, opts);
to_json
创建 NodeJS 对象流。这些选项镜像非流式 sheet_to_json
方法。
¥to_json
creates a NodeJS object stream. The options mirror the non-streaming
sheet_to_json
method.
以下 NodeJS 脚本获取 https://xlsx.nodejs.cn/pres.numbers 并将 JSON 行流式传输到终端。Transform
[^2] 流从对象流生成文本。
¥The following NodeJS script fetches https://xlsx.nodejs.cn/pres.numbers and
streams JSON rows to the terminal. A Transform
[^2] stream generates text from
the object streams.
- CommonJS
- ESM
const XLSX = require("xlsx")
const { Transform } = require("stream");
/* this Transform stream converts JS objects to text */
var conv = new Transform({writableObjectMode:true});
conv._transform = function(obj, e, cb){ cb(null, JSON.stringify(obj) + "\n"); };
(async() => {
var ab = await (await fetch("https://xlsx.nodejs.cn/pres.numbers")).arrayBuffer()
var wb = XLSX.read(ab);
var ws = wb.Sheets[wb.SheetNames[0]];
XLSX.stream.to_json(ws, {raw: true}).pipe(conv).pipe(process.stdout);
})();
import { read, stream } from "xlsx";
import { Readable, Transform } from "stream";
stream.set_readable(Readable);
/* this Transform stream converts JS objects to text */
var conv = new Transform({writableObjectMode:true});
conv._transform = function(obj, e, cb){ cb(null, JSON.stringify(obj) + "\n"); };
var ab = await (await fetch("https://xlsx.nodejs.cn/pres.numbers")).arrayBuffer()
var wb = read(ab);
var ws = wb.Sheets[wb.SheetNames[0]];
stream.to_json(ws, {raw: true}).pipe(conv).pipe(process.stdout);
HTML 导出
¥HTML Export
将工作表数据导出到 HTML TABLE
¥Export worksheet data to HTML TABLE
var htmlstream = XLSX.stream.to_html(ws, opts);
to_html
创建 NodeJS 文本流。这些选项镜像非流式 sheet_to_html
方法。
¥to_html
creates a NodeJS text stream. The options mirror the non-streaming
sheet_to_html
method.
以下 NodeJS 脚本获取 https://xlsx.nodejs.cn/pres.numbers 并将 HTML TABLE 行流式传输到终端。
¥The following NodeJS script fetches https://xlsx.nodejs.cn/pres.numbers and streams HTML TABLE rows to the terminal.
- CommonJS
- ESM
const XLSX = require("xlsx");
(async() => {
var ab = await (await fetch("https://xlsx.nodejs.cn/pres.numbers")).arrayBuffer()
var wb = XLSX.read(ab);
var ws = wb.Sheets[wb.SheetNames[0]];
XLSX.stream.to_html(ws).pipe(process.stdout);
})();
import { read, stream } from "xlsx";
import { Readable } from "stream";
stream.set_readable(Readable);
var ab = await (await fetch("https://xlsx.nodejs.cn/pres.numbers")).arrayBuffer()
var wb = read(ab);
var ws = wb.Sheets[wb.SheetNames[0]];
stream.to_html(ws).pipe(process.stdout);
工作簿导出
¥Workbook Export
工作簿导出方法接受 SheetJS 工作簿对象。
¥The workbook export methods accept a SheetJS workbook object.
XLML 导出
¥XLML Export
将工作簿数据导出到 SpreadsheetML2003 XML 文件
¥Export workbook data to SpreadsheetML2003 XML files
var xlmlstream = XLSX.stream.to_xlml(wb, opts);
to_xlml
创建 NodeJS 文本流。这些选项使用 xlml
书籍类型镜像非流式 write
方法。
¥to_xlml
creates a NodeJS text stream. The options mirror the non-streaming
write
method using the xlml
book type.
以下 NodeJS 脚本获取 https://xlsx.nodejs.cn/pres.numbers 并将 SpreadsheetML2003 工作簿写入 SheetJStream.xml.xls
:
¥The following NodeJS script fetches https://xlsx.nodejs.cn/pres.numbers and
writes a SpreadsheetML2003 workbook to SheetJStream.xml.xls
:
- CommonJS
- ESM
const XLSX = require("xlsx"), fs = require("fs");
(async() => {
var ab = await (await fetch("https://xlsx.nodejs.cn/pres.numbers")).arrayBuffer()
var wb = XLSX.read(ab);
XLSX.stream.to_xlml(wb).pipe(fs.createWriteStream("SheetJStream.xml.xls"));
})();
import { read, stream } from "xlsx";
import { Readable } from "stream";
stream.set_readable(Readable);
import { createWriteStream } from "fs";
var ab = await (await fetch("https://xlsx.nodejs.cn/pres.numbers")).arrayBuffer()
var wb = read(ab);
stream.to_xlml(wb).pipe(createWriteStream("SheetJStream.xml.xls"));
[^1]: 请参阅 NodeJS 文档中的 "流"。
¥See "Stream" in the NodeJS documentation.
[^2]: 请参阅 NodeJS 文档中的 Transform
。
¥See Transform
in the NodeJS documentation.