Skip to main content

HTML

HTML 是在网络中渲染数据的常见格式。虽然一般读取函数(XLSX.readXLSX.readFile)可以解析 HTML 字符串,写入函数(XLSX.writeXLSX.writeFile)可以生成 HTML 字符串,但本节中的实用程序函数可以使用 DOM 功能。

¥HTML is a common format for presenting data in the web. While the general read functions (XLSX.read and XLSX.readFile) can parse HTML strings and the write functions (XLSX.write and XLSX.writeFile) can generate HTML strings, the utility functions in this section can use DOM features.

SheetJS CE 主要关注数据和数字格式。

¥SheetJS CE primarily focuses on data and number formatting.

SheetJS 专业版 支持 HTML 格式和 HTML 表格实用程序中的 CSS 文本和单元格样式。

¥SheetJS Pro supports CSS text and cell styles in the HTML format and HTML table utilities.

HTML 表格输出

¥HTML Table Output

在 HTML 表格中显示工作表数据

¥Display worksheet data in a HTML table

var html = XLSX.utils.sheet_to_html(ws, opts);

作为 writeFile HTML 类型的替代方案,XLSX.utils.sheet_to_html 也生成 HTML 输出。该函数接受一个选项参数:

¥As an alternative to the writeFile HTML type, XLSX.utils.sheet_to_html also produces HTML output. The function takes an options argument:

选项名称默认描述
idTABLE 元素指定 id 属性
editablefalse如果为 true,则为每个 TD 设置 contenteditable="true"
header覆盖标头
footer覆盖页脚

示例文件 pres.numbers 开始:

¥Starting from the sample file pres.numbers:

Result
Loading...
Live Editor
function SheetJSHTML() {
  const url = "https://xlsx.nodejs.cn/pres.numbers";
  const [__html, setHTML] = React.useState("");
  React.useEffect(() => { (async() => {
    /* download file and parse */
    const wb = XLSX.read(await (await fetch(url)).arrayBuffer());
    /* get the first worksheet */
    const ws = wb.Sheets[wb.SheetNames[0]];

    /* generate HTML */
    const html = XLSX.utils.sheet_to_html(ws);

    setHTML(html);
  })(); }, []);
  return ( <>
    <b>XLSX.utils.sheet_to_html(ws)</b>
    <div dangerouslySetInnerHTML={{__html}}/>
  </> );
}

实现细节

¥Implementation Details

生成的表将包含每个 TD 元素的特殊数据属性:

¥The generated table will include special data attributes on each TD element:

属性描述
data-t覆盖 单元类型
data-v覆盖单元格值
data-z覆盖 数字格式

外部单元链接将被写入为封装单元内容的 A 标记。

¥External cell links will be written as A tags wrapping the cell contents.

HTML 表格输入

¥HTML Table Input

创建新工作表

¥Create New Sheet

从 TABLE 元素创建工作表或工作簿

¥Create a worksheet or workbook from a TABLE element

var ws = XLSX.utils.table_to_sheet(elt, opts);
var wb = XLSX.utils.table_to_book(elt, opts);

XLSX.utils.table_to_sheet 采用表 DOM 元素并返回类似于输入表的工作表。数字被解析。所有其他数据都将存储为字符串。

¥XLSX.utils.table_to_sheet takes a table DOM element and returns a worksheet resembling the input table. Numbers are parsed. All other data will be stored as strings.

XLSX.utils.table_to_book 根据工作表生成最小工作簿。

¥XLSX.utils.table_to_book produces a minimal workbook based on the worksheet.

这两个函数都接受选项参数:

¥Both functions accept options arguments:

选项名称默认描述
raw如果为 true,则每个单元格都将保存原始字符串
dateNF菌群移植 14在字符串输出中使用指定的日期格式
cellDatesfalse将日期存储为 d 类型(默认为 n
sheetRows0如果 >0,则读取表的前 sheetRows
displayfalse如果为 true,隐藏的行和单元格将不会被解析
UTCfalse如果为 true,则日期将被解释为 UTC **

UTC 选项在 "日期" 中解释

¥UTC option is explained in "Dates"

在 Web 浏览器中将表导出到电子表格文件涉及 3 个步骤:"找到表格"、"生成工作簿对象" 和 "导出到文件"。

¥Exporting a table to a spreadsheet file in the web browser involves 3 steps: "find the table", "generate a workbook object", and "export to file".

例如,如果 HTML 表的 id 属性设置为 sheetjs

¥For example, if the HTML table has id attribute set to sheetjs:



<table id="sheetjs">
<tr><th>名称</th><th>索引</th></tr>
<tr><td>巴拉克奥巴马</td><td>44</td></tr>
<tr><td>唐纳德·特朗普</td><td>45</td></tr>
<tr><td>约瑟夫·拜登</td><td>46</td></tr>
</table>


document.getElementById("sheetjs") 是对该表的实时引用。

¥document.getElementById("sheetjs") is a live reference to the table.

/* find the table element in the page */
var tbl = document.getElementById('sheetjs');
/* create a workbook */
var wb = XLSX.utils.table_to_book(tbl);
/* export to file */
XLSX.writeFile(wb, "SheetJSTable.xlsx");
Demo (click to hide)

此 HTML 表的 id 设置为 sheetjs

¥This HTML table has id set to sheetjs:

名称索引
巴拉克奥巴马44
唐纳德·特朗普45
约瑟夫·拜登46
Result
Loading...
Live Editor
function SheetJSExportTable() { return ( <button onClick={() => {
  /* find the table element in the page */
  var tbl = document.getElementById('sheetjs');
  /* create a workbook */
  var wb = XLSX.utils.table_to_book(tbl);
  /* export to file */
  XLSX.writeFile(wb, "SheetJSTable.xlsx");
}}><b>Export XLSX!</b></button> ); }

添加到工作表

¥Add to Sheet

将数据从 TABLE 元素添加到现有工作表

¥Add data from a TABLE element to an existing worksheet

XLSX.utils.sheet_add_dom(ws, elt, opts);

XLSX.utils.sheet_add_dom 采用表 DOM 元素并更新现有工作表对象。它遵循与 table_to_sheet 相同的过程并接受选项参数:

¥XLSX.utils.sheet_add_dom takes a table DOM element and updates an existing worksheet object. It follows the same process as table_to_sheet and accepts an options argument:

选项名称默认描述
raw如果为 true,则每个单元格都将保存原始字符串
dateNF菌群移植 14在字符串输出中使用指定的日期格式
cellDatesfalse将日期存储为 d 类型(默认为 n
sheetRows0如果 >0,则读取表的前 sheetRows
displayfalse如果为 true,隐藏的行和单元格将不会被解析
UTCfalse如果为 true,则日期将被解释为 UTC **

UTC 选项在 "日期" 中解释

¥UTC option is explained in "Dates"

origin 预计为以下之一:

¥origin is expected to be one of:

origin描述
(单元格对象)使用指定单元格(单元格对象)
(字符串)使用指定单元格(A1 型单元格)
(数量 >= 0)从指定行的第一列开始(0 索引)
-1从第一列开始附加到工作表底部
(默认)从单元格 A1 开始

sheet_add_dom 的一个常见用例涉及将多个表添加到单个工作表中,通常每个表之间有一些空白行:

¥A common use case for sheet_add_dom involves adding multiple tables to a single worksheet, usually with a few blank rows in between each table:

/* get "table1" and create worksheet */
const table1 = document.getElementById('table1');
const ws = XLSX.utils.table_to_sheet(table1);

/* get "table2" and append to the worksheet */
const table2 = document.getElementById('table2');
XLSX.utils.sheet_add_dom(ws, table2, {origin: -1});
Multi-table Export Example (click to show)

This demo creates a worksheet that should look like the screenshot below:

Multi-Table Export in Excel

The create_gap_rows helper function expands the worksheet range, adding blank rows between the data tables.

Result
Loading...
Live Editor
function MultiTable() {
  const headers = ["Table 1", "Table2", "Table 3"];

  /* Callback invoked when the button is clicked */
  const xport = React.useCallback(async () => {
    /* This function creates gap rows */
    function create_gap_rows(ws, nrows) {
      var ref = XLSX.utils.decode_range(ws["!ref"]);       // get original range
      ref.e.r += nrows;                                    // add to ending row
      ws["!ref"] = XLSX.utils.encode_range(ref);           // reassign row
    }

    /* first table */
    const ws = XLSX.utils.aoa_to_sheet([[headers[0]]]);
    XLSX.utils.sheet_add_dom(ws, document.getElementById('table1'), {origin: -1});
    create_gap_rows(ws, 1); // one row gap after first table

    /* second table */
    XLSX.utils.sheet_add_aoa(ws, [[headers[1]]], {origin: -1});
    XLSX.utils.sheet_add_dom(ws, document.getElementById('table2'), {origin: -1});
    create_gap_rows(ws, 2); // two rows gap after second table

    /* third table */
    XLSX.utils.sheet_add_aoa(ws, [[headers[2]]], {origin: -1});
    XLSX.utils.sheet_add_dom(ws, document.getElementById('table3'), {origin: -1});

    /* create workbook and export */
    const wb = XLSX.utils.book_new();
    XLSX.utils.book_append_sheet(wb, ws, "Export");
    XLSX.writeFile(wb, "SheetJSMultiTablexport.xlsx");
  });

  return ( <>
    <button onClick={xport}><b>Export XLSX!</b></button><br/><br/>
    <b>{headers[0]}</b><br/>
    <table id="table1">
      <tr><td>A2</td><td>B2</td></tr>
      <tr><td>A3</td><td>B3</td></tr>
    </table>
    <b>{headers[1]}</b><br/>
    <table id="table2">
      <tr><td>A6</td><td>B6</td><td>C6</td></tr>
      <tr><td>A7</td><td>B7</td><td>C7</td></tr>
    </table>
    <br/>
    <b>{headers[2]}</b><br/>
    <table id="table3">
      <tr><td>A11</td><td>B11</td></tr>
      <tr><td>A12</td><td>B12</td></tr>
    </table>
  </> );
}

HTML 字符串

¥HTML Strings

从 HTML 字符串创建工作表或工作簿

¥Create a worksheet or workbook from HTML string

table_to_book / table_to_sheet / sheet_add_dom 作用于 HTML DOM 元素。从 HTML 字符串开始,有两种解析方法:

¥table_to_book / table_to_sheet / sheet_add_dom act on HTML DOM elements. Starting from an HTML string, there are two parsing approaches:

A) 桌上幻象:创建一个 DIV,其 innerHTML 设置为 HTML 字符串,使用 DOM 元素生成工作表,然后删除该 DIV:

¥A) Table Phantasm: create a DIV whose innerHTML is set to the HTML string, generate worksheet using the DOM element, then remove the DIV:

/* create element from the source */
var elt = document.createElement("div");
elt.innerHTML = html_source;
document.body.appendChild(elt);

/* generate worksheet */
var ws = XLSX.utils.table_to_sheet(elt.getElementsByTagName("TABLE")[0]);

/* remove element */
document.body.removeChild(elt);
Phantasm Demo (click to show)

The html variable in the demo is an editable HTML string

Result
Loading...
Live Editor
function SheetJSTablePhantasm() {
  /* HTML stored as a string */
  const html = `\
<table>
  <tr><th>Name</th><th>Index</th></tr>
  <tr><td>Barack Obama</td><td>44</td></tr>
  <tr><td>Donald Trump</td><td>45</td></tr>
  <tr><td>Joseph Biden</td><td>46</td></tr>
</table>
`;
  return ( <>
    <button onClick={() => {
      /* create element from the source */
      var elt = document.createElement("div");
      elt.innerHTML = html;
      document.body.appendChild(elt);

      /* generate workbook */
      var tbl = elt.getElementsByTagName("TABLE")[0];
      var wb = XLSX.utils.table_to_book(tbl);

      /* remove element */
      document.body.removeChild(elt);

      /* generate file */
      XLSX.writeFile(wb, "SheetJSTablePhantasm.xlsx");
    }}><b>Export XLSX!</b></button>
    <pre><b>HTML:</b><br/>{html}</pre>
  </>);
}

B) 原始 HTML:使用 XLSX.read 以与 CSV 相同的方式读取文本。

¥B) Raw HTML: use XLSX.read to read the text in the same manner as CSV.

var wb = XLSX.read(html_source, { type: "string" });
var ws = wb.Sheets[wb.SheetNames[0]];
Raw HTML Demo (click to show)

The html variable in the demo is an editable HTML string

Result
Loading...
Live Editor
function SheetJSRawHTMLToXLSX() {
  /* HTML stored as a string */
  const html = `\
<table>
  <tr><th>Name</th><th>Index</th></tr>
  <tr><td>Barack Obama</td><td>44</td></tr>
  <tr><td>Donald Trump</td><td>45</td></tr>
  <tr><td>Joseph Biden</td><td>46</td></tr>
</table>
`;
  return ( <>
    <button onClick={() => {
      /* read HTML string */
      var wb = XLSX.read(html, {type: "string"});

      /* generate file */
      XLSX.writeFile(wb, "SheetJSRawHTML.xlsx");
    }}><b>Export XLSX!</b></button>
    <pre><b>HTML:</b><br/>{html}</pre>
  </>);
}

值覆盖

¥Value Override

当指定 raw: true 选项时,解析器将生成文本单元格。当未指定该选项或将其设置为 false 时,解析器将尝试解释每个 TD 元素的文本。

¥When the raw: true option is specified, the parser will generate text cells. When the option is not specified or when it is set to false, the parser will try to interpret the text of each TD element.

要覆盖特定单元格的转换,可以将以下数据属性添加到各个 TD 元素:

¥To override the conversion for a specific cell, the following data attributes can be added to the individual TD elements:

属性描述
data-t覆盖 单元类型
data-v覆盖单元格值
data-z覆盖 数字格式

例如:

¥For example:

<!-- Parser interprets value as `new Date("2012-12-03")` default date format -->
<td>2012-12-03</td>

<!-- String cell "2012-12-03" -->
<td data-t="s">2012-12-03</td>

<!-- Numeric cell with the correct date code and General format -->
<td data-t="n" data-v="41246">2012-12-03</td>

<!-- Traditional Excel Date 2012-12-03 with style yyyy-mm-dd -->
<td data-t="n" data-v="41246" data-z="yyyy-mm-dd">2012-12-03</td>
HTML Value Examples (click to hide)
Result
Loading...
Live Editor
function SheetJSHTMLValueOverride() {
  /* HTML stored as a string */
  const html = `\
<table>
  <tr><th>Cell</th><th>data-t</th><th>data-v</th><th>data-z</th></tr>
  <tr><td>2012-12-03</td><td/><td/><td/></tr>
  <tr><td data-t="s">2012-12-03</td><td>s</td><td/><td/></tr>
  <tr><td data-t="n" data-v="41246">2012-12-03</td><td>n</td><td>41246</td><td/></tr>
  <tr><td data-t="n" data-v="41246" data-z="yyyy-mm-dd">2012-12-03</td><td>n</td><td>41246</td><td>yyyy-mm-dd</td></tr>
</table>
`;
  return ( <>
    <button onClick={() => {
      /* create element from the source */
      var elt = document.createElement("div");
      elt.innerHTML = html;
      document.body.appendChild(elt);

      /* generate workbook */
      var tbl = elt.getElementsByTagName("TABLE")[0];
      var wb = XLSX.utils.table_to_book(tbl);

      /* remove element */
      document.body.removeChild(elt);

      /* generate file */
      XLSX.writeFile(wb, "SheetJSHTMLValueOverride.xlsx");
    }}><b>Export XLSX!</b></button>
    <pre><b>HTML String:</b><br/>{html}<br/><b>TABLE:</b></pre>
    <div dangerouslySetInnerHTML={{__html: html}}/>
  </>);
}

合成 DOM

¥Synthetic DOM

table_to_book / table_to_sheet / sheet_add_dom 作用于 HTML DOM 元素。传统上,包括 NodeJS 在内的服务器端环境中没有 DOM。

¥table_to_book / table_to_sheet / sheet_add_dom act on HTML DOM elements. Traditionally there is no DOM in server-side environments including NodeJS.

服务器端处理的最简单方法是自动化无头 Web 浏览器。"浏览器自动化" 涵盖了一些浏览器。

¥The simplest approach for server-side processing is to automate a headless web browser. "Browser Automation" covers some browsers.

一些生态系统提供与 SheetJS 兼容的类似 DOM 的框架。"合成 DOM" 演示中包含示例

¥Some ecosystems provide DOM-like frameworks that are compatible with SheetJS. Examples are included in the "Synthetic DOM" demo