Skip to main content

流导出

许多平台提供写入文件的方法。这些方法通常期望在写入之前生成整个文件。大型工作簿文件可能超出平台特定的大小限制。

¥Many platforms offer methods to write files. These methods typically expect the entire file to be generated before writing. Large workbook files may exceed platform-specific size limits.

一些平台还提供 "streaming" 或 "incremental" 方法。这些方法可以接受小块数据并逐步写入文件系统,而不是一次写入整个文件。

¥Some platforms also offer a "streaming" or "incremental" approach. Instead of writing the entire file at once, these methods can accept small chunks of data and incrementally write to the filesystem.

流式写入 演示包括实时浏览器演示和不支持 SheetJS 流的平台的说明。

¥The Streaming Write demo includes live browser demos and notes for platforms that do not support SheetJS streams.

此功能在版本 0.20.3 中进行了扩展。强烈推荐给 升级到最新版本

¥This feature was expanded in version 0.20.3. It is strongly recommended to upgrade to the latest version.

流基础

¥Streaming Basics

SheetJS 流使用 NodeJS 推送流 API。强烈建议查看官方 NodeJS "流" 文档 [^1]。

¥SheetJS streams use the NodeJS push streams API. It is strongly recommended to review the official NodeJS "Stream" documentation[^1].

Historical Note (click to show)

NodeJS push streams were introduced in 2012. The text streaming methods to_csv and to_html are supported in NodeJS v0.10 and later while the object streaming method to_json is supported in NodeJS v0.12 and later.

The first SheetJS streaming write function, to_csv, was introduced in 2017. It used and still uses the battle-tested NodeJS streaming API.

Years later, browser vendors opted to standardize a different stream API.

For maximal compatibility, the library uses NodeJS push streams.

NodeJS ECMAScript 模块支持

¥NodeJS ECMAScript Module Support

在 CommonJS 模块中,库可以使用 require 加载 stream 模块。SheetJS 库将在适用的情况下加载流支持。

¥In CommonJS modules, libraries can load the stream module using require. SheetJS libraries will load streaming support where applicable.

由于 ESM 限制,库无法自由导入 stream 模块。

¥Due to ESM limitations, libraries cannot freely import the stream module.

ECMAScript 模块限制

原始规范仅支持顶层导入​​:

¥The original specification only supported top-level imports:

import { Readable } from 'stream';

如果模块不可用,脚本无法正常失败或忽略错误。

¥If a module is unavailable, there is no way for scripts to gracefully fail or ignore the error.


规范补丁为该问题添加了两种不同的解决方案:

¥Patches to the specification added two different solutions to the problem:

  • "动态导入" 将抛出可由库处理的错误。动态导入将污染不使用基于 Promise 的方法的 API。

    ¥"dynamic imports" will throw errors that can be handled by libraries. Dynamic imports will taint APIs that do not use Promise-based methods.

/* Readable will be undefined if stream cannot be imported */
const Readable = await (async() => {
try {
return (await import("stream"))?.Readable;
} catch(e) { /* silently ignore error */ }
})();
  • "导入映射" 控制模块解析,允许库用户手动分流不受支持的模块。

    ¥"import maps" control module resolution, allowing library users to manually shunt unsupported modules.

这些补丁是在浏览器采用 ESM 后发布的!许多浏览器和其他平台支持顶层导入​​,但不支持补丁。

¥These patches were released after browsers adopted ESM! A number of browsers and other platforms support top-level imports but do not support the patches.


由于 ESM 问题,强烈建议尽可能使用 CommonJS!

¥Due to ESM woes, it is strongly recommended to use CommonJS when possible!

为了最大限度地支持平台,SheetJS 库公开了一种特殊的 set_readable 方法来提供 Readable 实现:

¥For maximal platform support, SheetJS libraries expose a special set_readable method to provide a Readable implementation:

SheetJS NodeJS ESM streaming support
import { stream as SheetJStream } from 'xlsx';
import { Readable } from 'stream';

SheetJStream.set_readable(Readable);

工作表导出

¥Worksheet Export

工作表导出方法接受 SheetJS 工作表对象。

¥The worksheet export methods accept a SheetJS worksheet object.

CSV 导出

¥CSV Export

以 "逗号分隔值" (CSV) 格式导出工作表数据

¥Export worksheet data in "Comma-Separated Values" (CSV)

var csvstream = XLSX.stream.to_csv(ws, opts);

to_csv 创建 NodeJS 文本流。这些选项镜像非流式 sheet_to_csv 方法。

¥to_csv creates a NodeJS text stream. The options mirror the non-streaming sheet_to_csv method.

以下 NodeJS 脚本获取 https://xlsx.nodejs.cn/pres.numbers 并将 CSV 行流式传输到终端。

¥The following NodeJS script fetches https://xlsx.nodejs.cn/pres.numbers and streams CSV rows to the terminal.

Streaming CSV Print Example
const XLSX = require("xlsx");

(async() => {
var ab = await (await fetch("https://xlsx.nodejs.cn/pres.numbers")).arrayBuffer()
var wb = XLSX.read(ab);
var ws = wb.Sheets[wb.SheetNames[0]];
XLSX.stream.to_csv(ws).pipe(process.stdout);
})();

JSON 导出

¥JSON Export

将工作表数据导出到 "数组的数组" 或 "对象数组"

¥Export worksheet data to "Arrays of Arrays" or "Arrays of Objects"

var jsonstream = XLSX.stream.to_json(ws, opts);

to_json 创建 NodeJS 对象流。这些选项镜像非流式 sheet_to_json 方法。

¥to_json creates a NodeJS object stream. The options mirror the non-streaming sheet_to_json method.

以下 NodeJS 脚本获取 https://xlsx.nodejs.cn/pres.numbers 并将 JSON 行流式传输到终端。Transform[^2] 流从对象流生成文本。

¥The following NodeJS script fetches https://xlsx.nodejs.cn/pres.numbers and streams JSON rows to the terminal. A Transform[^2] stream generates text from the object streams.

Streaming Objects Print Example
const XLSX = require("xlsx")
const { Transform } = require("stream");

/* this Transform stream converts JS objects to text */
var conv = new Transform({writableObjectMode:true});
conv._transform = function(obj, e, cb){ cb(null, JSON.stringify(obj) + "\n"); };

(async() => {
var ab = await (await fetch("https://xlsx.nodejs.cn/pres.numbers")).arrayBuffer()
var wb = XLSX.read(ab);
var ws = wb.Sheets[wb.SheetNames[0]];
XLSX.stream.to_json(ws, {raw: true}).pipe(conv).pipe(process.stdout);
})();

HTML 导出

¥HTML Export

将工作表数据导出到 HTML TABLE

¥Export worksheet data to HTML TABLE

var htmlstream = XLSX.stream.to_html(ws, opts);

to_html 创建 NodeJS 文本流。这些选项镜像非流式 sheet_to_html 方法。

¥to_html creates a NodeJS text stream. The options mirror the non-streaming sheet_to_html method.

以下 NodeJS 脚本获取 https://xlsx.nodejs.cn/pres.numbers 并将 HTML TABLE 行流式传输到终端。

¥The following NodeJS script fetches https://xlsx.nodejs.cn/pres.numbers and streams HTML TABLE rows to the terminal.

Streaming HTML Print Example
const XLSX = require("xlsx");

(async() => {
var ab = await (await fetch("https://xlsx.nodejs.cn/pres.numbers")).arrayBuffer()
var wb = XLSX.read(ab);
var ws = wb.Sheets[wb.SheetNames[0]];
XLSX.stream.to_html(ws).pipe(process.stdout);
})();

工作簿导出

¥Workbook Export

工作簿导出方法接受 SheetJS 工作簿对象。

¥The workbook export methods accept a SheetJS workbook object.

XLML 导出

¥XLML Export

将工作簿数据导出到 SpreadsheetML2003 XML 文件

¥Export workbook data to SpreadsheetML2003 XML files

var xlmlstream = XLSX.stream.to_xlml(wb, opts);

to_xlml 创建 NodeJS 文本流。这些选项使用 xlml 书籍类型镜像非流式 write 方法。

¥to_xlml creates a NodeJS text stream. The options mirror the non-streaming write method using the xlml book type.

以下 NodeJS 脚本获取 https://xlsx.nodejs.cn/pres.numbers 并将 SpreadsheetML2003 工作簿写入 SheetJStream.xml.xls

¥The following NodeJS script fetches https://xlsx.nodejs.cn/pres.numbers and writes a SpreadsheetML2003 workbook to SheetJStream.xml.xls:

Streaming XLML Write Example
const XLSX = require("xlsx"), fs = require("fs");

(async() => {
var ab = await (await fetch("https://xlsx.nodejs.cn/pres.numbers")).arrayBuffer()
var wb = XLSX.read(ab);
XLSX.stream.to_xlml(wb).pipe(fs.createWriteStream("SheetJStream.xml.xls"));
})();

[^1]: 请参阅 NodeJS 文档中的 "流"

¥See "Stream" in the NodeJS documentation.

[^2]: 请参阅 NodeJS 文档中的 Transform

¥See Transform in the NodeJS documentation.