Re: Bug#675106: ITP: pgbulkload -- A high speed data loading utility for PostgreSQL



>>>>> Alexander Kuznetsov <acca@cpan.org> writes:

[…]

	(Some wording fixes and suggestions.)

 > Description     : A high speed data loading utility for PostgreSQL
 > pg_bulkload is designed to load huge amount of data to a database.
 > You can choose whether database constraints are checked and how many errors are

	If “You can…” here starts a new paragraph, there ought to be
	an empty (“.”) line before it.  If not, the line break here
	comes earlier than necessary.

 > ignored during the loading. For example, you can skip integrity checks for
 > performance when you copy data from another database to PostgreSQL. On the
 > other hand, you can enable constraint checks when loading unclean data.
 > .

	Are “constraint checks” different from “integrity checks” in the
	above?  Unless they are, it should rather be, e.g.:

   … For example, you can skip integrity checks for performance when you
   copy data from another database to PostgreSQL, or have them in place
   when loading potentially unclean data.

 > The original goal of pg_bulkload was an faster alternative of COPY command in

   … was /a/ faster…

	Or, perhaps: … was to provide a faster…

 > PostgreSQL, but version 3.0 or later has some ETL features like input data
 > validation and data transformation with filter functions.
 > .

   … but as of version 3.0 some ETL features… were added.

	And what's ETL, BTW?

 > In version 3.1, pg_bulkload can convert the load data into the binary file
 > which can be used as an input file of pg_bulkload. If you check whether

	Perhaps:

   As of version 3.1, pg_bulkload can dump the preprocessed data into a
   binary file, allowing for…

	(The purpose should be mentioned here.  Is this meant to improve
	the performance of repeated later “bulkloads”, for instance?)

 > the load data is valid when converting it into the binary file, you can skip
 > the check when loading it from the binary file to a table. Which would reduce
 > the load time itself. Also in version 3.1, parallel loading works
 > more effectively than before.

	s/effectively/efficiently/.  But the whole sentence makes little
	sense, as the earlier versions weren't packaged for Debian.

-- 
FSF associate member #7257

