Bug#1100123: RFP: pg-parquet -- Copy to/from Parquet in S3 or Azure Blob Storage from within PostgreSQL
Package: wnpp
Severity: wishlist
X-Debbugs-Cc: hiro@torproject.org, debian-rust@lists.debian.org
* Package name    : pg-parquet
  Version         : 0.3.0
  Upstream Contact: https://github.com/CrunchyData/
* URL             : https://github.com/CrunchyData/pg_parquet/
* License         : PostgreSQL
  Programming Lang: Rust
  Description     : Copy to/from Parquet in S3 or Azure Blob Storage from within PostgreSQL
pg_parquet is a PostgreSQL extension that allows you to read and write
Parquet files, located in S3 or on the file system, from PostgreSQL
via COPY TO/FROM commands. It depends on the Apache Arrow project to
read and write Parquet files and on the pgrx project to extend
PostgreSQL's COPY command.
----
This is something we're looking at using inside Tor to archive
long-term data from PostgreSQL into object storage. See
https://gitlab.torproject.org/tpo/tpa/team/-/issues/41416 and related.
Another alternative is a "foreign data wrapper" like:
https://github.com/pgspider/parquet_s3_fdw
We're not absolutely sure which one is best; we might need both.