
Re: Invalid UTF-8 byte? (was: Re: utf)



On Tue, Apr 03, 2018 at 09:36:42PM +0200, Michael Lange wrote:
> From what I have understood I think the OP should certainly at least,
> whatever the files they want to include exactly look like and whichever
> byte they choose as delimiter, scan the file first for such a byte and if
> it is actually found replace it with either an empty string or
> (probably better) some sort of "tag" before applying the contents to the
> new database. This way they could at least be sure that their chosen
> delimiter does not split one record into halves.

Or abort the program with an error message.
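
A minimal sketch of both options (replace with a tag, or abort), in
Python. The delimiter byte, the tag, and the function name are
assumptions for illustration only; we still don't know what the OP's
files look like.

```python
DELIM = b"\0"     # assumed delimiter byte; the OP has not told us theirs
TAG = b"<NUL>"    # hypothetical replacement tag, pick one that suits the data


def check_or_replace(data: bytes, abort: bool = True) -> bytes:
    """Return data that can safely be joined with DELIM.

    With abort=True, raise an error if the delimiter occurs in the data.
    With abort=False, replace every occurrence with TAG instead.
    """
    pos = data.find(DELIM)
    if pos == -1:
        return data                      # delimiter not present, nothing to do
    if abort:
        raise ValueError(f"delimiter byte found at offset {pos}")
    return data.replace(DELIM, TAG)
```

Note that the replacement is only reversible if the tag itself cannot
occur in the original data, which is one more reason aborting may be
the safer default.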

> I have no idea what these "text files" look like of course. It just seemed
> - to me - that the fact that the null byte cannot ever be part of a file
> name might make it slightly more appropriate for this purpose than other
> candidate bytes. Of course, it depends...

NUL bytes are an excellent choice for delimiter in lots of situations.
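
For instance, a list of file names can be packed into one NUL-terminated
blob and split apart again without ambiguity, even when the names
contain spaces or newlines. This is only a sketch in Python; pack() and
unpack() are hypothetical helper names, not anything the OP has shown us.

```python
def pack(records):
    """Join byte strings into one blob, terminating each with a NUL byte."""
    for r in records:
        if b"\0" in r:
            raise ValueError("record contains a NUL byte")
    return b"".join(r + b"\0" for r in records)


def unpack(blob):
    """Split a NUL-terminated blob back into the original byte strings."""
    if not blob:
        return []
    if not blob.endswith(b"\0"):
        raise ValueError("blob is not NUL-terminated")
    # Drop the final terminator so split() does not yield a trailing empty item.
    return blob[:-1].split(b"\0")


names = [b"plain.txt", b"with space.txt", b"with\nnewline.txt"]
assert unpack(pack(names)) == names
```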

Of course, we still need the OP to tell us what the actual situation is.

