使用 PostgreSQL 的工作表
PostgreSQL(俗称 "Postgres")是一个开源对象关系数据库系统。
¥PostgreSQL (colloquially referenced as "Postgres") is an open source object-relational database system.
SheetJS 是一个用于从电子表格读取和写入数据的 JavaScript 库。
¥SheetJS is a JavaScript library for reading and writing data from spreadsheets.
该演示使用 SheetJS 在电子表格和 PostgreSQL 数据库之间交换数据。我们将探讨如何将数据库中的表保存到电子表格以及如何将电子表格中的数据添加到数据库中。
¥This demo uses SheetJS to exchange data between spreadsheets and PostgreSQL databases. We'll explore how to save tables from a database to spreadsheets and how to add data from spreadsheets into a database.
强烈建议将 PostgreSQL 与查询构建器或 ORM 一起使用。
¥It is strongly recommended to use PostgreSQL with a query builder or ORM.
虽然可以直接生成 SQL 语句,但存在许多微妙的细节和陷阱。经过实际检验的解决方案通常可以针对 SQL 注入和其他漏洞提供缓解措施。
¥While it is possible to generate SQL statements directly, there are many subtle details and pitfalls. Battle-tested solutions generally provide mitigations against SQL injection and other vulnerabilities.
本 demo 在以下环境下进行了测试:
¥This demo was tested in the following environments:
Postgres | 连接器库 | 日期 |
---|---|---|
16.2.1 | pg (8.11.4 ) | 2024-03-31 |
15.6 | pg (8.11.4 ) | 2024-03-31 |
14.11 | pg (8.11.4 ) | 2024-03-31 |
集成详情
¥Integration Details
SheetJS NodeJS 模块 可以在连接到 PostgreSQL 数据库的 NodeJS 脚本中加载。
¥The SheetJS NodeJS module can be loaded in NodeJS scripts that connect to PostgreSQL databases.
该演示使用 pg
连接器模块 [^1],但相同的机制适用于其他 PostgreSQL 库。
¥This demo uses the pg
connector module[^1], but the same mechanics apply to
other PostgreSQL libraries.
导出数据
¥Exporting Data
Client#query
返回一个解析为结果集的 Promise。结果的 rows
属性是一个对象数组。
¥Client#query
returns a Promise that resolves to a result set. The rows
property of the result is an array of objects.
SheetJS json_to_sheet
方法 [^2] 可以从对象数组生成工作表对象 [^3]:
¥The SheetJS json_to_sheet
method[^2] can generate a worksheet object[^3] from
the array of objects:
const table_name = "Tabeller1"; // name of table
/* fetch all data from specified table */
const res = await client.query(`SELECT * FROM ${table_name}`);
/* generate a SheetJS worksheet object from the data */
const worksheet = XLSX.utils.json_to_sheet(res.rows);
可以使用实用函数 [^4] 从工作表构建工作簿对象。可以使用 SheetJS writeFile
方法 [^5] 导出工作簿:
¥A workbook object can be built from the worksheet using utility functions[^4].
The workbook can be exported using the SheetJS writeFile
method[^5]:
/* create a new workbook and add the worksheet */
const wb = XLSX.utils.book_new();
XLSX.utils.book_append_sheet(wb, worksheet, "Sheet1");
/* export workbook to XLSX */
XLSX.writeFile(wb, "SheetJSPGExport.xlsx");
导入数据
¥Importing Data
SheetJS sheet_to_json
函数 [^6] 接受一个工作表对象并生成一个对象数组。
¥The SheetJS sheet_to_json
function[^6] takes a worksheet object and generates
an array of objects.
必须从对象手动生成查询。假设对象中的字段名称与列标题匹配,则循环可以生成 INSERT
个查询。
¥Queries must be manually generated from the objects. Assuming the field names
in the object match the column headers, a loop can generate INSERT
queries.
PostgreSQL 不允许使用可变列名的参数化查询
¥PostgreSQL does not allow parameterized queries with variable column names
INSERT INTO table_name (?) VALUES (?);
-- ---------------------^ variable column names are not valid
查询是手动生成的。为了帮助防止 SQL 注入漏洞,pg-format
[^7] 模块对标识符和字段进行转义。
¥Queries are generated manually. To help prevent SQL injection vulnerabilities,
the pg-format
[^7] module escapes identifiers and fields.
/* generate an array of arrays from the worksheet */
const aoo = XLSX.utils.sheet_to_json(ws);
const table_name = "Blatte1"; // name of table
/* loop through the data rows */
for(let row of aoo) {
/* generate format helper strings */
const ent = Object.entries(row);
const Istr = Array.from({length: entries.length}, ()=>"%I").join(", ");
const Lstr = Array.from({length: entries.length}, ()=>"%L").join(", ");
/* generate INSERT statement */
let query = format.withArray(
`INSERT INTO %I (${Istr}) VALUES(${Lstr})`,
[ table_name, ...ent.map(x => x[0]), ...ent.map(x => x[1]) ]
);
/* execute INSERT statement */
await client.query(query);
}
创建表
¥Creating a Table
可以扫描对象数组以确定列名称和类型。有了名称和类型,就可以编写 CREATE TABLE
查询。
¥The array of objects can be scanned to determine column names and types. With
the names and types, a CREATE TABLE
query can be written.
Implementation Details (click to show)
The aoo_to_pg_table
function:
- scans each row object to determine column names and types
- drops and creates a new table with the determined column names and types
- loads the entire dataset into the new table
/* create table and load data given an array of objects and a PostgreSQL client */
async function aoo_to_pg_table(client, aoo, table_name) {
/* define types that can be converted (e.g. boolean can be stored in float) */
const T_FLOAT = ["float8", "boolean"];
const T_BOOL = ["boolean"];
/* types is a map from column headers to Knex schema column type */
const types = {};
/* names is an ordered list of the column header names */
const names = [];
/* loop across each row object */
aoo.forEach(row =>
/* Object.entries returns a row of [key, value] pairs */
Object.entries(row).forEach(([k,v]) => {
/* If this is first occurrence, mark unknown and append header to names */
if(!types[k]) { types[k] = ""; names.push(k); }
/* skip null and undefined values */
if(v == null) return;
/* check and resolve type */
switch(typeof v) {
/* change type if it is empty or can be stored in a float */
case "number": if(!types[k] || T_FLOAT.includes(types[k])) types[k] = "float8"; break;
/* change type if it is empty or can be stored in a boolean */
case "boolean": if(!types[k] || T_BOOL.includes(types[k])) types[k] = "boolean"; break;
/* no other type can hold strings */
case "string": types[k] = "text"; break;
default: types[k] = "text"; break;
}
})
);
/* Delete table if it exists in the DB */
const query = format("DROP TABLE IF EXISTS %I;", table_name);
await client.query(query);
/* Create table */
{
const entries = Object.entries(types);
const Istr = entries.map(e => format(`%I ${e[1]}`, e[0])).join(", ");
let query = format.withArray(`CREATE TABLE %I (${Istr});`, [ table_name ]);
await client.query(query);
}
/* Insert each row */
for(let row of aoo) {
const ent = Object.entries(row);
const Istr = Array.from({length: ent.length}, ()=>"%I").join(", ");
const Lstr = Array.from({length: ent.length}, ()=>"%L").join(", ");
let query = format.withArray(
`INSERT INTO %I (${Istr}) VALUES (${Lstr});`,
[ table_name, ...ent.map(x => x[0]), ...ent.map(x => x[1]) ]
);
await client.query(query);
}
return client;
}
完整示例
¥Complete Example
-
安装并启动 PostgreSQL 服务器。
¥Install and start the PostgreSQL server.
Installation Notes (click to show)
On macOS, install the postgresql
formula with Homebrew:
brew install postgresql@16
The last few lines of the installer explain how to start the database:
Or, if you don't want/need a background service you can just run:
LC_ALL="C" /usr/local/opt/postgresql@16/bin/postgres -D /usr/local/var/postgresql@16
Run the command to start a local database instance.
-
删除任何名为
SheetJSPG
的现有数据库:¥Drop any existing database with the name
SheetJSPG
:
dropdb SheetJSPG
如果服务器在其他地方运行,或者用户名与当前用户不同,则命令行标志可以覆盖默认值。
¥If the server is running elsewhere, or if the username is different from the current user, command-line flags can override the defaults.
选项 | 解释 |
---|---|
-h HOSTNAME | 服务器名称 |
-p PORT | 指定端口号 |
-U USERNAME | 指定用户名 |
-
使用
createdb
命令创建一个空的SheetJSPG
数据库:¥Create an empty
SheetJSPG
database using thecreatedb
command:
createdb SheetJSPG
createdb
支持与 dropdb
相同的 -h
、-p
和 -U
标志。
¥createdb
supports the same -h
, -p
, and -U
flags as dropdb
.
连接器测试
¥Connector Test
-
创建项目文件夹:
¥Create a project folder:
mkdir sheetjs-pg
cd sheetjs-pg
npm init -y
-
安装
pg
连接器模块:¥Install the
pg
connector module:
npm i --save pg@8.11.4
-
将以下示例代码块保存到
PGTest.js
:¥Save the following example codeblock to
PGTest.js
:
const pg = require("pg");
const client = new pg.Client({
database:"SheetJSPG",
host: "127.0.0.1", // localhost
port: 5432,
//user: "",
//password: ""
});
(async() => {
await client.connect();
const res = await client.query('SELECT $1::text as message', ['Hello world!']);
console.log(res.rows[0].message); // Hello world!
await client.end();
})();
-
编辑新的
PGTest.js
脚本并修改代码块中高亮的行以反映数据库部署设置。¥Edit the new
PGTest.js
script and modify the highlighted lines from the codeblock to reflect the database deployment settings.
代码块中的设置与 macOS Homebrew PostgreSQL 服务器的默认配置匹配。对于其他部署:
¥The settings in the codeblock match the default configuration for macOS Homebrew PostgreSQL server. For other deployments:
-
如果服务器未在你的计算机上运行,请将
host
和port
设置为正确的主机名和端口号。¥If the server is not running on your computer, set
host
andport
to the correct host name and port number. -
如果服务器需要不同的用户名和密码,请取消注释
user
和password
行并将这些值替换为用户名和密码。¥If the server expects a different username and password, uncomment the
user
andpassword
lines and replace the values with the username and password.
-
运行脚本:
¥Run the script:
node PGTest.js
它应该打印 Hello world!
¥It should print Hello world!
如果输出不是 Hello world!
或者存在错误,请将问题报告给 pg
连接器项目以进行进一步诊断。
¥If the output is not Hello world!
or if there is an error, please report the
issue to the pg
connector project for further diagnosis.
添加 SheetJS
¥Add SheetJS
-
安装依赖:
¥Install dependencies:
npm i --save https://cdn.sheetjs.com/xlsx-0.20.3/xlsx-0.20.3.tgz pg-format@1.0.4
-
下载
SheetJSPG.js
:¥Download
SheetJSPG.js
:
curl -LO https://xlsx.nodejs.cn/postgresql/SheetJSPG.js
该脚本将:
¥This script will:
-
读取并解析测试文件
pres.numbers
¥read and parse the test file
pres.numbers
-
创建与本地 PostgreSQL 服务器上的
SheetJSPG
数据库的连接¥create a connection to the
SheetJSPG
database on a local PostgreSQL server -
将第一个工作表中的数据加载到名为
Presidents
的表中¥load data from the first worksheet into a table with name
Presidents
-
断开并重新连接到数据库
¥disconnect and reconnect to the database
-
从表
Presidents
转储数据¥dump data from the table
Presidents
-
将数据集导出到
SheetJSPG.xlsx
¥export the dataset to
SheetJSPG.xlsx
-
编辑
SheetJSPG.js
脚本。¥Edit the
SheetJSPG.js
script.
该脚本定义了一个 opts
对象:
¥The script defines an opts
object:
const XLSX = require("xlsx");
const opts = {
database:"SheetJSPG",
host: "127.0.0.1", // localhost
port: 5432,
//user: "",
//password: ""
};
代码块中的设置与 macOS Homebrew PostgreSQL 服务器的默认配置匹配。对于其他部署:
¥The settings in the codeblock match the default configuration for macOS Homebrew PostgreSQL server. For other deployments:
-
如果服务器未在你的计算机上运行,请将
host
和port
设置为正确的主机名和端口号。¥If the server is not running on your computer, set
host
andport
to the correct host name and port number. -
如果服务器需要不同的用户名和密码,请取消注释
user
和password
行并将这些值替换为用户名和密码。¥If the server expects a different username and password, uncomment the
user
andpassword
lines and replace the values with the username and password.
-
获取示例文件
pres.numbers
:¥Fetch the example file
pres.numbers
:
curl -L -O https://xlsx.nodejs.cn/pres.numbers
-
运行脚本:
¥Run the script:
node SheetJSPG.js
-
验证结果:
¥Verify the result:
-
SheetJSPGExport.xlsx
可以在电子表格应用中打开或在终端中测试¥
SheetJSPGExport.xlsx
can be opened in a spreadsheet app or tested in the terminal
npx xlsx-cli SheetJSPGExport.xlsx
-
可以使用
psql
命令行工具查询数据库服务器。¥The database server can be queried using the
psql
command line tool.
如果服务器在本地运行,则命令将为:
¥If the server is running locally, the command will be:
psql SheetJSPG -c 'SELECT * FROM "Presidents";'
psql
支持与 dropdb
和 createdb
相同的 -h
、-p
和 -U
标志。
¥psql
supports the same -h
, -p
, and -U
flags as dropdb
and createdb
.
[^1]: 请参阅 pg
官方网站 了解更多信息。
¥See the official pg
website for more info.
[^2]: 见 json_to_sheet
于 "实用工具"
¥See json_to_sheet
in "Utilities"
[^3]: 详细信息请参见 "SheetJS 数据模型" 中的 "Sheet 对象"。
¥See "Sheet Objects" in "SheetJS Data Model" for more details.
[^4]: 有关 book_new
和 book_append_sheet
的详细信息,请参阅 "工作簿助手" 于 "实用工具"。
¥See "Workbook Helpers" in "Utilities" for details on book_new
and book_append_sheet
.
[^5]: 见 writeFile
于 "写入文件"
¥See writeFile
in "Writing Files"
[^6]: 见 sheet_to_json
于 "实用工具"
¥See sheet_to_json
in "Utilities"
[^7]: pg-format
包可在公共 NPM 注册表中获取。尽管该项目被标记为已弃用,但官方 pg
网站仍然推荐 pg-format
¥The pg-format
package is available on the public NPM registry. Even though the project is marked as deprecated, the official pg
website still recommends pg-format