Skip to main content

HTTP 下载

SheetJS 是一个用于从电子表格读取和写入数据的 JavaScript 库。

¥SheetJS is a JavaScript library for reading and writing data from spreadsheets.

许多 JavaScript API(包括 XMLHttpRequestfetch)允许脚本下载电子表格以供进一步处理。

¥A number of JavaScript APIs, including XMLHttpRequest and fetch, allow scripts to download spreadsheets for further processing.

此演示使用各种 API 和封装器库下载工作簿并将原始二进制数据传递给 SheetJS 库。

¥This demo uses various APIs and wrapper libraries to download workbooks and pass raw binary data to SheetJS libraries.

  • "浏览器演示" 完全在 Web 浏览器中运行。将在 Web 浏览器中下载和解析测试工作簿。

    ¥"Browser Demos" run entirely within the web browser. A test workbook will be downloaded and parsed in the web browser.

  • "NodeJS 演示" 在 NodeJS 和其他服务器端平台中运行。

    ¥"NodeJS Demos" run in NodeJS and other server-side platforms.

这个演示主要是下载文件。其他演示涵盖其他 HTTP 用例:

¥This demo focuses on downloading files. Other demos cover other HTTP use cases:

第三方主机和二进制数据

AWS 等第三方云平台可能会通过以 UTF-8 字符串对请求和响应进行编码来破坏原始二进制下载。

¥Third-party cloud platforms such as AWS may corrupt raw binary downloads by encoding requests and responses in UTF-8 strings.

对于 AWS,在 API Gateway 控制台的 "二进制媒体类型" 部分中,应添加 "application/vnd.ms-excel" 类型,以确保 AWS Lambda 函数可以将文件发送到客户端。

¥For AWS, in the "Binary Media Types" section of the API Gateway console, the "application/vnd.ms-excel" type should be added to ensure that AWS Lambda functions functions can send files to clients.

二进制数据

¥Binary Data

最有趣的电子表格文件是包含表示无效 UTF-8 字符的字节序列的二进制数据。

¥Most interesting spreadsheet files are binary data that contain byte sequences that represent invalid UTF-8 characters.

API 通常提供选项来控制如何解释下载的数据。arraybuffer 响应类型通常强制将数据呈现为 ArrayBuffer 对象,该对象可以使用 SheetJS read 方法 [^1] 进行解析。

¥APIs generally provide options to control how downloaded data is interpreted. The arraybuffer response type usually forces the data to be presented as an ArrayBuffer object which can be parsed with the SheetJS read method[^1].

下面的例子展示了使用 fetch 下载文件的数据流:

¥The following example shows the data flow using fetch to download files:

/* download data into an ArrayBuffer object */
const res = await fetch("https://xlsx.nodejs.cn/pres.numbers");
const ab = await res.arrayBuffer(); // recover data as ArrayBuffer

/* parse file */
const wb = XLSX.read(ab);

浏览器演示

¥Browser Demos

当页面被访问时,https://xlsx.nodejs.cn/pres.numbers 将被获取并解析。旧表将替换为内容与第一个工作表匹配的表。SheetJS sheet_to_html 方法 [^2] 创建 HTML 表。

¥When the page is accessed, https://xlsx.nodejs.cn/pres.numbers will be fetched and parsed. The old table will be replaced with a table whose contents match the first worksheet. The SheetJS sheet_to_html method[^2] creates the HTML table.

测试部署

每个浏览器演示都在以下环境中进行了测试:

¥Each browser demo was tested in the following environments:

浏览器日期
Chrome 1262024-06-19
Safari 17.32024-06-19

XMLHttp 请求

¥XMLHttpRequest

对于下载数据,arraybuffer 响应类型生成 ArrayBuffer,可以将其视为 Uint8Array 并馈送到 SheetJS read 方法。对于旧版浏览器,应指定选项 type: "array"

¥For downloading data, the arraybuffer response type generates an ArrayBuffer that can be viewed as an Uint8Array and fed to the SheetJS read method. For legacy browsers, the option type: "array" should be specified:

/* set up an async GET request */
var req = new XMLHttpRequest();
req.open("GET", url, true);
req.responseType = "arraybuffer";

req.onload = function(e) {
/* parse the data when it is received */
var data = new Uint8Array(req.response);
var workbook = XLSX.read(data, {type:"array"});
/* DO SOMETHING WITH workbook HERE */
};
req.send();
Live Download demo (click to show)

This demo uses XMLHttpRequest to fetch https://xlsx.nodejs.cn/pres.numbers and show the data in an HTML table.

Result
Loading...
Live Editor
function SheetJSXHRDL() {
  const [__html, setHTML] = React.useState("");

  /* Fetch and update HTML */
  React.useEffect(() => { (async() => {
    /* Fetch file */
    const req = new XMLHttpRequest();
    req.open("GET", "https://xlsx.nodejs.cn/pres.numbers", true);
    req.responseType = "arraybuffer";
    req.onload = e => {
      /* Parse file */
      const wb = XLSX.read(new Uint8Array(req.response));
      const ws = wb.Sheets[wb.SheetNames[0]];

      /* Generate HTML */
      setHTML(XLSX.utils.sheet_to_html(ws));
    };
    req.send();
  })(); }, []);

  return ( <div dangerouslySetInnerHTML={{ __html }}/> );
}

fetch

为了下载数据,Response#arrayBuffer 解析为 ArrayBuffer,可以将其转换为 Uint8Array 并传递给 SheetJS read 方法:

¥For downloading data, Response#arrayBuffer resolves to an ArrayBuffer that can be converted to Uint8Array and passed to the SheetJS read method:

fetch(url).then(function(res) {
/* get the data as a Blob */
if(!res.ok) throw new Error("fetch failed");
return res.arrayBuffer();
}).then(function(ab) {
/* parse the data when it is received */
var data = new Uint8Array(ab);
var workbook = XLSX.read(data, {type:"array"});

/* DO SOMETHING WITH workbook HERE */
});
Live Download demo (click to show)

This demo uses fetch to download https://xlsx.nodejs.cn/pres.numbers and show the data in an HTML table.

Result
Loading...
Live Editor
function SheetJSFetchDL() {
  const [__html, setHTML] = React.useState("");

  /* Fetch and update HTML */
  React.useEffect(() => { (async() => {
    /* Fetch file */
    const res = await fetch("https://xlsx.nodejs.cn/pres.numbers");
    const ab = await res.arrayBuffer();

    /* Parse file */
    const wb = XLSX.read(ab);
    const ws = wb.Sheets[wb.SheetNames[0]];

    /* Generate HTML */
    setHTML(XLSX.utils.sheet_to_html(ws));
  })(); }, []);

  return ( <div dangerouslySetInnerHTML={{ __html }}/> );
}

jQuery

jQuery 是一个 JavaScript 库,其中包含用于执行 "Ajax" 网络请求的辅助程序。jQuery.ajax ($.ajax) 不支持开箱即用的二进制数据 [^3]。定制 ajaxTransport 可以添加支持。

¥jQuery is a JavaScript library that includes helpers for performing "Ajax" network requests. jQuery.ajax ($.ajax) does not support binary data out of the box[^3]. A custom ajaxTransport can add support.

SheetJS 用户报告在 IE10 中成功使用 jquery.binarytransport.js[^4]。

¥SheetJS users have reported success with jquery.binarytransport.js[^4] in IE10.

在包含主要的 jquery.jsjquery.binarytransport.js 脚本后,$.ajax 将支持 dataType: "binary"processData: false

¥After including the main jquery.js and jquery.binarytransport.js scripts, $.ajax will support dataType: "binary" and processData: false.

实时下载演示

¥Live Download Demo

在 GET 请求中,默认行为是返回 Blob 对象。传递 responseType: "arraybuffer" 在 IE10 中返回正确的 ArrayBuffer 对象:

¥In a GET request, the default behavior is to return a Blob object. Passing responseType: "arraybuffer" returns a proper ArrayBuffer object in IE10:

$.ajax({
type: "GET", url: "https://xlsx.nodejs.cn/pres.numbers",

/* suppress jQuery post-processing */
processData: false,

/* use the binary transport */
dataType: "binary",

/* pass an ArrayBuffer in the callback */
responseType: "arraybuffer",

success: function (ab) {
/* at this point, ab is an ArrayBuffer */
var wb = XLSX.read(ab);

/* do something with workbook here */
var ws = wb.Sheets[wb.SheetNames[0]];
var html = XLSX.utils.sheet_to_html(ws);
$("#out").html(html);
}
});

封装库

¥Wrapper Libraries

fetch 随浏览器一起发布之前,有各种封装器库来简化 XMLHttpRequest。由于 fetch 的限制,这些库仍然相关。

¥Before fetch shipped with browsers, there were various wrapper libraries to simplify XMLHttpRequest. Due to limitations with fetch, these libraries are still relevant.

axios

axios 提供了一个基于 Promise 的接口。将 responseType 设置为 arraybuffer 可确保返回类型是 ArrayBuffer。结果的 data 属性可以传递给 SheetJS read 方法:

¥axios presents a Promise based interface. Setting responseType to arraybuffer ensures the return type is an ArrayBuffer. The data property of the result can be passed to the SheetJS read method:

async function workbook_dl_axios(url) {
const res = await axios(url, {responseType:'arraybuffer'});
const workbook = XLSX.read(res.data);
return workbook;
}
Live Download demo (click to show)

This demo uses axios to download https://xlsx.nodejs.cn/pres.numbers and show the data in an HTML table.

If the live demo shows a message

ReferenceError: axios is not defined

please refresh the page. This is a known bug in the documentation generator.

Result
Loading...
Live Editor
function SheetJSAxiosDL() {
  const [__html, setHTML] = React.useState("");

  /* Fetch and update HTML */
  React.useEffect(() => { (async() => {
    if(typeof axios != "function") return setHTML("ReferenceError: axios is not defined");
    /* Fetch file */
    const res = await axios("https://xlsx.nodejs.cn/pres.numbers", {responseType: "arraybuffer"});

    /* Parse file */
    const wb = XLSX.read(res.data);
    const ws = wb.Sheets[wb.SheetNames[0]];

    /* Generate HTML */
    setHTML(XLSX.utils.sheet_to_html(ws));
  })(); }, []);

  return ( <div dangerouslySetInnerHTML={{ __html }}/> );
}

superagent

superagent 是一个带有 "流畅的界面" 的网络请求库。使用 "arraybuffer" 调用 responseType 方法将确保最终响应对象是 ArrayBuffer

¥superagent is a network request library with a "Fluent Interface". Calling the responseType method with "arraybuffer" will ensure the final response object is an ArrayBuffer:

/* set up an async GET request with superagent */
superagent
.get(url)
.responseType('arraybuffer')
.end(function(err, res) {
/* parse the data when it is received */
var data = new Uint8Array(res.body);
var workbook = XLSX.read(data, {type:"array"});

/* DO SOMETHING WITH workbook HERE */
});
Live Download demo (click to show)

This demo uses superagent to download https://xlsx.nodejs.cn/pres.numbers and show the data in an HTML table.

If the live demo shows a message

ReferenceError: superagent is not defined

please refresh the page. This is a known bug in the documentation generator.

Result
Loading...
Live Editor
function SheetJSSuperAgentDL() {
  const [__html, setHTML] = React.useState("");

  /* Fetch and update HTML */
  React.useEffect(() => { (async() => {
    if(typeof superagent == "undefined" || typeof superagent.get != "function")
      return setHTML("ReferenceError: superagent is not defined");
    /* Fetch file */
    superagent
      .get("https://xlsx.nodejs.cn/pres.numbers")
      .responseType("arraybuffer")
      .end((err, res) => {
        /* Parse file */
        const wb = XLSX.read(res.body);
        const ws = wb.Sheets[wb.SheetNames[0]];

        /* Generate HTML */
        setHTML(XLSX.utils.sheet_to_html(ws));
      });
  })(); }, []);

  return ( <div dangerouslySetInnerHTML={{ __html }}/> );
}

NodeJS 演示

¥NodeJS Demos

这些示例展示了如何在 NodeJS 中下载数据。

¥These examples show how to download data in NodeJS.

HTTPS 获取

¥HTTPS GET

https 模块为 HTTPS GET 请求提供底层 get 方法:

¥The https module provides a low-level get method for HTTPS GET requests:

SheetJSHTTPSGet.js
var https = require("https"), XLSX = require("xlsx");

https.get('https://xlsx.nodejs.cn/pres.xlsx', function(res) {
var bufs = [];
res.on('data', function(chunk) { bufs.push(chunk); });
res.on('end', function() {
var buf = Buffer.concat(bufs);
var wb = XLSX.read(buf);
/* print the first worksheet to console */
var ws = wb.Sheets[wb.SheetNames[0]];
console.log(XLSX.utils.sheet_to_csv(ws));
});
});
测试部署

本 demo 在以下环境下进行了测试:

¥This demo was tested in the following environments:

NodeJS日期解决方法
0.10.482024-06-21NODE_TLS_REJECT_UNAUTHORIZED
0.12.182024-06-21NODE_TLS_REJECT_UNAUTHORIZED
4.9.12024-06-21NODE_TLS_REJECT_UNAUTHORIZED
6.17.12024-06-21NODE_TLS_REJECT_UNAUTHORIZED
8.17.02024-06-21NODE_TLS_REJECT_UNAUTHORIZED
10.24.12024-06-21
12.22.122024-06-21
14.21.32024-06-21
16.20.22024-06-21
18.20.32024-06-21
20.15.02024-06-21
22.3.02024-06-21

NODE_TLS_REJECT_UNAUTHORIZED 解决方法将值设置为 '0'

¥The NODE_TLS_REJECT_UNAUTHORIZED workaround sets the value to '0':

Legacy NodeJS Certificate has Expired Bypass (prepend to script)
process.env.NODE_TLS_REJECT_UNAUTHORIZED = '0';
Complete Example (click to show)
  1. Install the NodeJS module
npm i --save https://cdn.sheetjs.com/xlsx-0.20.3/xlsx-0.20.3.tgz
  1. Copy the SheetJSHTTPSGet.js code snippet to a file SheetJSHTTPSGet.js

  2. Run the script:

node SheetJSHTTPSGet.js

If successful, the script will print CSV contents of the test file.

For older versions of NodeJS, the script will fail due to a certificate error. The error can be suppressed by prepending the following line to the script:

SheetJSHTTPSGet.js (add to top)
process.env.NODE_TLS_REJECT_UNAUTHORIZED = '0';

It is strongly encouraged to upgrade to a newer NodeJS version!

fetch

NodeJS 16.15.0 中引入了对 fetch 的实验性支持。它在 NodeJS LTS 版本 22 中被认为是稳定的。

¥Experimental support for fetch was introduced in NodeJS 16.15.0. It will be considered stable in NodeJS LTS version 22.

fetch 实现具有与浏览器版本相同的返回类型:

¥The fetch implementation has the same return types as the browser version:

async function parse_from_url(url) {
const res = await fetch(url);
if(!res.ok) throw new Error("fetch failed");
const ab = await res.arrayBuffer();
const workbook = XLSX.read(ab);
return workbook;
}
测试部署

本 demo 在以下环境下进行了测试:

¥This demo was tested in the following environments:

NodeJS日期
18.20.32024-06-21
20.15.02024-06-21
22.3.02024-06-21
Complete Example (click to show)
  1. Install the NodeJS module
npm i --save https://cdn.sheetjs.com/xlsx-0.20.3/xlsx-0.20.3.tgz
  1. Save the following to SheetJSFetch.js:
SheetJSFetch.js
var XLSX = require("xlsx");

async function parse_from_url(url) {
const res = await fetch(url);
if(!res.ok) throw new Error("fetch failed");
const ab = await res.arrayBuffer();
const workbook = XLSX.read(ab);
return workbook;
}

(async() => {
const wb = await parse_from_url('https://xlsx.nodejs.cn/pres.numbers');
/* print the first worksheet to console */
var ws = wb.Sheets[wb.SheetNames[0]];
console.log(XLSX.utils.sheet_to_csv(ws));
})();
  1. Run the script:
node SheetJSFetch.js

If successful, the script will print CSV contents of the test file.

封装库

¥Wrapper Libraries

NodeJS 的最新版本原生支持 fetch。在将 fetch 支持添加到平台之前,第三方模块封装了原生 API。

¥The latest releases of NodeJS support fetch natively. Before fetch support was added to the platform, third party modules wrapped the native APIs.

request

request 已被弃用,仅应在旧部署中使用。

¥request has been deprecated and should only be used in legacy deployments.

设置选项 encoding: null 传递原始缓冲区:

¥Setting the option encoding: null passes raw buffers:

SheetJSRequest.js
var XLSX = require('xlsx'), request = require('request');
var url = 'https://xlsx.nodejs.cn/pres.xlsx';

/* call `request` with the option `encoding: null` */
request(url, {encoding: null}, function(err, res, data) {
if(err || res.statusCode !== 200) return;

/* if the request was successful, parse the data */
var wb = XLSX.read(data);

/* print the first worksheet to console */
var ws = wb.Sheets[wb.SheetNames[0]];
console.log(XLSX.utils.sheet_to_csv(ws));
});
测试部署

本 demo 在以下环境下进行了测试:

¥This demo was tested in the following environments:

NodeJS日期解决方法
0.10.482024-06-21NODE_TLS_REJECT_UNAUTHORIZED
0.12.182024-06-21NODE_TLS_REJECT_UNAUTHORIZED
4.9.12024-06-21NODE_TLS_REJECT_UNAUTHORIZED
6.17.12024-06-21NODE_TLS_REJECT_UNAUTHORIZED
8.17.02024-06-21NODE_TLS_REJECT_UNAUTHORIZED
10.24.12024-06-21
12.22.122024-06-21
14.21.32024-06-21
16.20.22024-06-21
18.20.32024-06-21
20.15.02024-06-21
22.3.02024-06-21

NODE_TLS_REJECT_UNAUTHORIZED 解决方法将值设置为 '0'

¥The NODE_TLS_REJECT_UNAUTHORIZED workaround sets the value to '0':

Legacy NodeJS Certificate has Expired Bypass (prepend to script)
process.env.NODE_TLS_REJECT_UNAUTHORIZED = '0';
Complete Example (click to show)
  1. Install the NodeJS module
npm i --save https://cdn.sheetjs.com/xlsx-0.20.3/xlsx-0.20.3.tgz request@2.88.2
  1. Copy the SheetJSRequest.js code snippet to a file SheetJSRequest.js

  2. Run the script:

node SheetJSRequest.js

If successful, the script will print CSV contents of the test file.

For older versions of NodeJS, the script will fail due to a certificate error. The error can be suppressed by prepending the following line to the script:

SheetJSRequest.js (add to top)
process.env.NODE_TLS_REJECT_UNAUTHORIZED = '0';

It is strongly encouraged to upgrade to a newer NodeJS version!

axios

responseType"arraybuffer" 时,axios 实际上捕获的是 NodeJS Buffer 中的数据。SheetJS read 方法处理 NodeJS Buffer 对象:

¥When the responseType is "arraybuffer", axios actually captures the data in a NodeJS Buffer. The SheetJS read method handles NodeJS Buffer objects:

SheetJSAxios.js
const XLSX = require("xlsx"), axios = require("axios");

async function workbook_dl_axios(url) {
const res = await axios(url, {responseType:'arraybuffer'});
/* at this point, res.data is a Buffer */
const workbook = XLSX.read(res.data);
return workbook;
}
测试部署

本 demo 在以下环境下进行了测试:

¥This demo was tested in the following environments:

NodeJSAxios日期
10.24.10.28.12024-06-21
12.22.121.7.22024-06-21
14.21.31.7.22024-06-21
16.20.21.7.22024-06-21
18.20.31.7.22024-06-21
20.15.01.7.22024-06-21
22.3.01.7.22024-06-21
Complete Example (click to show)
  1. Install the NodeJS module
npm i --save https://cdn.sheetjs.com/xlsx-0.20.3/xlsx-0.20.3.tgz axios@1.7.2
  1. Save the following to SheetJSAxios.js:
SheetJSAxios.js
const XLSX = require("xlsx"), axios = require("axios");

async function workbook_dl_axios(url) {
const res = await axios(url, {responseType:'arraybuffer'});
/* at this point, res.data is a Buffer */
const workbook = XLSX.read(res.data);
return workbook;
}

(async() => {
const wb = await workbook_dl_axios('https://xlsx.nodejs.cn/pres.numbers');
/* print the first worksheet to console */
var ws = wb.Sheets[wb.SheetNames[0]];
console.log(XLSX.utils.sheet_to_csv(ws));
})();
  1. Run the script:
node SheetJSAxios.js

If successful, the script will print CSV contents of the test file.

其他平台

¥Other Platforms

其他演示展示了特殊平台上的网络操作:

¥Other demos show network operations in special platforms:

[^1]: 见 read 于 "读取文件"

¥See read in "Reading Files"

[^2]: 见 sheet_to_html 于 "实用工具"

¥See sheet_to_html in "Utilities"

[^3]: 请参阅 jQuery 官方文档中的 dataTypejQuery.ajax

¥See dataType in jQuery.ajax in the official jQuery documentation.

[^4]: 详细信息请参见 官方 jquery.binarytransport.js 存储库

¥See the official jquery.binarytransport.js repo for more details.