Skip to main content

使用 PostgreSQL 的工作表

PostgreSQL(俗称 "Postgres")是一个开源对象关系数据库系统。

¥PostgreSQL (colloquially referenced as "Postgres") is an open source object-relational database system.

SheetJS 是一个用于从电子表格读取和写入数据的 JavaScript 库。

¥SheetJS is a JavaScript library for reading and writing data from spreadsheets.

该演示使用 SheetJS 在电子表格和 PostgreSQL 数据库之间交换数据。我们将探讨如何将数据库中的表保存到电子表格以及如何将电子表格中的数据添加到数据库中。

¥This demo uses SheetJS to exchange data between spreadsheets and PostgreSQL databases. We'll explore how to save tables from a database to spreadsheets and how to add data from spreadsheets into a database.

强烈建议将 PostgreSQL 与查询构建器或 ORM 一起使用。

¥It is strongly recommended to use PostgreSQL with a query builder or ORM.

虽然可以直接生成 SQL 语句,但存在许多微妙的细节和陷阱。经过实际检验的解决方案通常可以针对 SQL 注入和其他漏洞提供缓解措施。

¥While it is possible to generate SQL statements directly, there are many subtle details and pitfalls. Battle-tested solutions generally provide mitigations against SQL injection and other vulnerabilities.

测试部署

本 demo 在以下环境下进行了测试:

¥This demo was tested in the following environments:

Postgres连接器库日期
16.2.1pg(8.11.4)2024-03-31
15.6pg(8.11.4)2024-03-31
14.11pg(8.11.4)2024-03-31

集成详情

¥Integration Details

SheetJS NodeJS 模块 可以在连接到 PostgreSQL 数据库的 NodeJS 脚本中加载。

¥The SheetJS NodeJS module can be loaded in NodeJS scripts that connect to PostgreSQL databases.

该演示使用 pg 连接器模块 [^1],但相同的机制适用于其他 PostgreSQL 库。

¥This demo uses the pg connector module[^1], but the same mechanics apply to other PostgreSQL libraries.

导出数据

¥Exporting Data

Client#query 返回一个解析为结果集的 Promise。结果的 rows 属性是一个对象数组。

¥Client#query returns a Promise that resolves to a result set. The rows property of the result is an array of objects.

SheetJS json_to_sheet 方法 [^2] 可以从对象数组生成工作表对象 [^3]:

¥The SheetJS json_to_sheet method[^2] can generate a worksheet object[^3] from the array of objects:

const table_name = "Tabeller1"; // name of table

/* fetch all data from specified table */
const res = await client.query(`SELECT * FROM ${table_name}`);

/* generate a SheetJS worksheet object from the data */
const worksheet = XLSX.utils.json_to_sheet(res.rows);

可以使用实用函数 [^4] 从工作表构建工作簿对象。可以使用 SheetJS writeFile 方法 [^5] 导出工作簿:

¥A workbook object can be built from the worksheet using utility functions[^4]. The workbook can be exported using the SheetJS writeFile method[^5]:

/* create a new workbook and add the worksheet */
const wb = XLSX.utils.book_new();
XLSX.utils.book_append_sheet(wb, worksheet, "Sheet1");

/* export workbook to XLSX */
XLSX.writeFile(wb, "SheetJSPGExport.xlsx");

导入数据

¥Importing Data

SheetJS sheet_to_json 函数 [^6] 接受一个工作表对象并生成一个对象数组。

¥The SheetJS sheet_to_json function[^6] takes a worksheet object and generates an array of objects.

必须从对象手动生成查询。假设对象中的字段名称与列标题匹配,则循环可以生成 INSERT 个查询。

¥Queries must be manually generated from the objects. Assuming the field names in the object match the column headers, a loop can generate INSERT queries.

PostgreSQL 不允许使用可变列名的参数化查询

¥PostgreSQL does not allow parameterized queries with variable column names

INSERT INTO table_name (?) VALUES (?);
-- ---------------------^ variable column names are not valid

查询是手动生成的。为了帮助防止 SQL 注入漏洞,pg-format[^7] 模块对标识符和字段进行转义。

¥Queries are generated manually. To help prevent SQL injection vulnerabilities, the pg-format[^7] module escapes identifiers and fields.

/* generate an array of arrays from the worksheet */
const aoo = XLSX.utils.sheet_to_json(ws);

const table_name = "Blatte1"; // name of table

/* loop through the data rows */
for(let row of aoo) {

/* generate format helper strings */
const ent = Object.entries(row);
const Istr = Array.from({length: entries.length}, ()=>"%I").join(", ");
const Lstr = Array.from({length: entries.length}, ()=>"%L").join(", ");

/* generate INSERT statement */
let query = format.withArray(
`INSERT INTO %I (${Istr}) VALUES(${Lstr})`,
[ table_name, ...ent.map(x => x[0]), ...ent.map(x => x[1]) ]
);

/* execute INSERT statement */
await client.query(query);
}

创建表

¥Creating a Table

可以扫描对象数组以确定列名称和类型。有了名称和类型,就可以编写 CREATE TABLE 查询。

¥The array of objects can be scanned to determine column names and types. With the names and types, a CREATE TABLE query can be written.

Implementation Details (click to show)

The aoo_to_pg_table function:

  • scans each row object to determine column names and types
  • drops and creates a new table with the determined column names and types
  • loads the entire dataset into the new table
/* create table and load data given an array of objects and a PostgreSQL client */
async function aoo_to_pg_table(client, aoo, table_name) {
/* define types that can be converted (e.g. boolean can be stored in float) */
const T_FLOAT = ["float8", "boolean"];
const T_BOOL = ["boolean"];

/* types is a map from column headers to Knex schema column type */
const types = {};

/* names is an ordered list of the column header names */
const names = [];

/* loop across each row object */
aoo.forEach(row =>
/* Object.entries returns a row of [key, value] pairs */
Object.entries(row).forEach(([k,v]) => {

/* If this is first occurrence, mark unknown and append header to names */
if(!types[k]) { types[k] = ""; names.push(k); }

/* skip null and undefined values */
if(v == null) return;

/* check and resolve type */
switch(typeof v) {
/* change type if it is empty or can be stored in a float */
case "number": if(!types[k] || T_FLOAT.includes(types[k])) types[k] = "float8"; break;
/* change type if it is empty or can be stored in a boolean */
case "boolean": if(!types[k] || T_BOOL.includes(types[k])) types[k] = "boolean"; break;
/* no other type can hold strings */
case "string": types[k] = "text"; break;
default: types[k] = "text"; break;
}
})
);

/* Delete table if it exists in the DB */
const query = format("DROP TABLE IF EXISTS %I;", table_name);
await client.query(query);

/* Create table */
{
const entries = Object.entries(types);
const Istr = entries.map(e => format(`%I ${e[1]}`, e[0])).join(", ");
let query = format.withArray(`CREATE TABLE %I (${Istr});`, [ table_name ]);
await client.query(query);
}

/* Insert each row */
for(let row of aoo) {
const ent = Object.entries(row);
const Istr = Array.from({length: ent.length}, ()=>"%I").join(", ");
const Lstr = Array.from({length: ent.length}, ()=>"%L").join(", ");
let query = format.withArray(
`INSERT INTO %I (${Istr}) VALUES (${Lstr});`,
[ table_name, ...ent.map(x => x[0]), ...ent.map(x => x[1]) ]
);
await client.query(query);
}

return client;
}

完整示例

¥Complete Example

  1. 安装并启动 PostgreSQL 服务器。

    ¥Install and start the PostgreSQL server.

Installation Notes (click to show)

On macOS, install the postgresql formula with Homebrew:

brew install postgresql@16

The last few lines of the installer explain how to start the database:

Or, if you don't want/need a background service you can just run:
LC_ALL="C" /usr/local/opt/postgresql@16/bin/postgres -D /usr/local/var/postgresql@16

Run the command to start a local database instance.

  1. 删除任何名为 SheetJSPG 的现有数据库:

    ¥Drop any existing database with the name SheetJSPG:

dropdb SheetJSPG

如果服务器在其他地方运行,或者用户名与当前用户不同,则命令行标志可以覆盖默认值。

¥If the server is running elsewhere, or if the username is different from the current user, command-line flags can override the defaults.

选项解释
-h HOSTNAME服务器名称
-p PORT指定端口号
-U USERNAME指定用户名
  1. 使用 createdb 命令创建一个空的 SheetJSPG 数据库:

    ¥Create an empty SheetJSPG database using the createdb command:

createdb SheetJSPG

createdb 支持与 dropdb 相同的 -h-p-U 标志。

¥createdb supports the same -h, -p, and -U flags as dropdb.

连接器测试

¥Connector Test

  1. 创建项目文件夹:

    ¥Create a project folder:

mkdir sheetjs-pg
cd sheetjs-pg
npm init -y
  1. 安装 pg 连接器模块:

    ¥Install the pg connector module:

npm i --save pg@8.11.4
  1. 将以下示例代码块保存到 PGTest.js

    ¥Save the following example codeblock to PGTest.js:

PGTest.js
const pg = require("pg");
const client = new pg.Client({
database:"SheetJSPG",
host: "127.0.0.1", // localhost
port: 5432,
//user: "",
//password: ""
});
(async() => {

await client.connect();
const res = await client.query('SELECT $1::text as message', ['Hello world!']);
console.log(res.rows[0].message); // Hello world!
await client.end();

})();
  1. 编辑新的 PGTest.js 脚本并修改代码块中高亮的行以反映数据库部署设置。

    ¥Edit the new PGTest.js script and modify the highlighted lines from the codeblock to reflect the database deployment settings.

代码块中的设置与 macOS Homebrew PostgreSQL 服务器的默认配置匹配。对于其他部署:

¥The settings in the codeblock match the default configuration for macOS Homebrew PostgreSQL server. For other deployments:

  • 如果服务器未在你的计算机上运行,请将 hostport 设置为正确的主机名和端口号。

    ¥If the server is not running on your computer, set host and port to the correct host name and port number.

  • 如果服务器需要不同的用户名和密码,请取消注释 userpassword 行并将这些值替换为用户名和密码。

    ¥If the server expects a different username and password, uncomment the user and password lines and replace the values with the username and password.

  1. 运行脚本:

    ¥Run the script:

node PGTest.js

它应该打印 Hello world!

¥It should print Hello world!

如果输出不是 Hello world! 或者存在错误,请将问题报告给 pg 连接器项目以进行进一步诊断。

¥If the output is not Hello world! or if there is an error, please report the issue to the pg connector project for further diagnosis.

添加 SheetJS

¥Add SheetJS

  1. 安装依赖:

    ¥Install dependencies:

npm i --save https://cdn.sheetjs.com/xlsx-0.20.3/xlsx-0.20.3.tgz pg-format@1.0.4
  1. 下载 SheetJSPG.js

    ¥Download SheetJSPG.js:

curl -LO https://xlsx.nodejs.cn/postgresql/SheetJSPG.js

该脚本将:

¥This script will:

  • 读取并解析测试文件 pres.numbers

    ¥read and parse the test file pres.numbers

  • 创建与本地 PostgreSQL 服务器上的 SheetJSPG 数据库的连接

    ¥create a connection to the SheetJSPG database on a local PostgreSQL server

  • 将第一个工作表中的数据加载到名为 Presidents 的表中

    ¥load data from the first worksheet into a table with name Presidents

  • 断开并重新连接到数据库

    ¥disconnect and reconnect to the database

  • 从表 Presidents 转储数据

    ¥dump data from the table Presidents

  • 将数据集导出到 SheetJSPG.xlsx

    ¥export the dataset to SheetJSPG.xlsx

  1. 编辑 SheetJSPG.js 脚本。

    ¥Edit the SheetJSPG.js script.

该脚本定义了一个 opts 对象:

¥The script defines an opts object:

SheetJSPG.js (configuration lines)
const XLSX = require("xlsx");
const opts = {
database:"SheetJSPG",
host: "127.0.0.1", // localhost
port: 5432,
//user: "",
//password: ""
};

代码块中的设置与 macOS Homebrew PostgreSQL 服务器的默认配置匹配。对于其他部署:

¥The settings in the codeblock match the default configuration for macOS Homebrew PostgreSQL server. For other deployments:

  • 如果服务器未在你的计算机上运行,请将 hostport 设置为正确的主机名和端口号。

    ¥If the server is not running on your computer, set host and port to the correct host name and port number.

  • 如果服务器需要不同的用户名和密码,请取消注释 userpassword 行并将这些值替换为用户名和密码。

    ¥If the server expects a different username and password, uncomment the user and password lines and replace the values with the username and password.

  1. 获取示例文件 pres.numbers

    ¥Fetch the example file pres.numbers:

curl -L -O https://xlsx.nodejs.cn/pres.numbers
  1. 运行脚本:

    ¥Run the script:

node SheetJSPG.js
  1. 验证结果:

    ¥Verify the result:

  • SheetJSPGExport.xlsx 可以在电子表格应用中打开或在终端中测试

    ¥SheetJSPGExport.xlsx can be opened in a spreadsheet app or tested in the terminal

npx xlsx-cli SheetJSPGExport.xlsx
  • 可以使用 psql 命令行工具查询数据库服务器。

    ¥The database server can be queried using the psql command line tool.

如果服务器在本地运行,则命令将为:

¥If the server is running locally, the command will be:

psql SheetJSPG -c 'SELECT * FROM "Presidents";'

psql 支持与 dropdbcreatedb 相同的 -h-p-U 标志。

¥psql supports the same -h, -p, and -U flags as dropdb and createdb.

[^1]: 请参阅 pg 官方网站 了解更多信息。

¥See the official pg website for more info.

[^2]: 见 json_to_sheet 于 "实用工具"

¥See json_to_sheet in "Utilities"

[^3]: 详细信息请参见 "SheetJS 数据模型" 中的 "Sheet 对象"

¥See "Sheet Objects" in "SheetJS Data Model" for more details.

[^4]: 有关 book_newbook_append_sheet 的详细信息,请参阅 "工作簿助手" 于 "实用工具"

¥See "Workbook Helpers" in "Utilities" for details on book_new and book_append_sheet.

[^5]: 见 writeFile 于 "写入文件"

¥See writeFile in "Writing Files"

[^6]: 见 sheet_to_json 于 "实用工具"

¥See sheet_to_json in "Utilities"

[^7]: pg-format 包可在公共 NPM 注册表中获取。尽管该项目被标记为已弃用,但官方 pg 网站仍然推荐 pg-format

¥The pg-format package is available on the public NPM registry. Even though the project is marked as deprecated, the official pg website still recommends pg-format