Re: Invalid UTF-8 byte? (was: Re: utf)

To: debian-user@lists.debian.org
Subject: Re: Invalid UTF-8 byte? (was: Re: utf)
From: Greg Wooledge <wooledg@eeg.ccf.org>
Date: Tue, 3 Apr 2018 15:47:57 -0400
Message-id: <20180403194757.3gc6mij7rzexplg6@eeg.ccf.org>
Mail-followup-to: debian-user@lists.debian.org
In-reply-to: <20180403213642.af986be396a72c67a6a210f3@freenet.de>
References: <92aa2f6d-d39f-61a6-311b-f0c45b00b9c9@gmx.com> <201804020837.54725.rhkramer@gmail.com> <20180403004328.f49e19cbe32cfd5773b9e5e7@freenet.de> <201804030743.02707.rhkramer@gmail.com> <20180403135833.3156da4df8b9e11298ae6306@freenet.de> <20180403141407.aeb42cf877a09a753571a810@freenet.de> <20180403123208.GA26421@tuxteam.de> <20180403213642.af986be396a72c67a6a210f3@freenet.de>

On Tue, Apr 03, 2018 at 09:36:42PM +0200, Michael Lange wrote:
> From what I have understood I think the OP should certainly at least,
> whatever the files they want to include exactly look like and whichever
> byte they choose as delimiter, scan the file first for such a byte and if
> it is actually found replace it with either an empty string or
> (probably better) some sort of "tag" before applying the contents to the
> new database. This way they could at least be sure that their chosen
> delimiter does not split one record into halves.

Or abort the program with an error message.
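
As a rough illustration of the "scan first, then abort" approach, here is a
minimal Python sketch. It assumes NUL is the chosen delimiter and that each
file is small enough to read into memory; the function names are made up
for this example and do not come from the OP's program.

```python
import sys

DELIMITER = b"\0"  # assumption: NUL is the chosen delimiter byte


def find_delimiter(data: bytes, delimiter: bytes = DELIMITER) -> int:
    """Return the offset of the first delimiter in data, or -1 if absent."""
    return data.find(delimiter)


def check_file(path: str, delimiter: bytes = DELIMITER) -> None:
    """Exit with an error message if the file already contains the delimiter."""
    with open(path, "rb") as f:  # binary mode: we care about bytes, not text
        offset = find_delimiter(f.read(), delimiter)
    if offset != -1:
        # sys.exit with a string prints it to stderr and exits with status 1
        sys.exit(f"{path}: delimiter {delimiter!r} found at offset {offset}")
```

Calling check_file() on every input file before importing anything means a
bad file stops the run before any record gets split in the wrong place.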

> I have no idea what these "text files" look like of course. It just seemed
> - to me - that the fact that the null byte cannot ever be part of a file
> name might make it slightly more appropriate for this purpose than other
> candidate bytes. Of course, it depends...

NUL bytes are an excellent choice for delimiter in lots of situations.
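
One reason is that in UTF-8 the byte 0x00 only ever encodes U+0000 itself;
it never shows up inside a multi-byte sequence. A small Python sketch of
NUL-delimited records, assuming each record is a bytes object (the helper
names are hypothetical):

```python
def join_records(records, delimiter=b"\0"):
    """Join records with the delimiter, refusing any record that contains it."""
    for index, record in enumerate(records):
        if delimiter in record:
            raise ValueError(f"record {index} contains the delimiter")
    return delimiter.join(records)


def split_records(blob, delimiter=b"\0"):
    """Split a joined blob back into its records."""
    return blob.split(delimiter)


# Records may contain newlines and non-ASCII text without any trouble.
records = ["naïve café".encode("utf-8"), b"line one\nline two"]
blob = join_records(records)
```

Whether this fits still depends on what the OP's data actually looks like.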

Of course, we still need the OP to tell us what the actual situation is.
