
Re: sha256sum --text generating blank spaces and hyphens?



On 4/26/23, Andy Smith <andy@strugglers.net> wrote:
> If you're referring to the space and then the file name ("-" in case
> of stdin) on the end, you can just select only the first output up
> to whitespace with e.g. awk:
>
>     _SHA256=$(printf '%s' "${_TXT}" | sha256sum | awk '{print $1}')

 Yes, you could, but I am trying to find out why this is happening
instead of truncating the string where a space appears, because I don't
think that would be safe.
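
 A minimal sketch of taking the digest without depending on where a
space happens to fall, assuming sha256sum reads from stdin and prints
the digest followed by a separator and "-". The sample string and the
variable _OUT are only illustrative. It strips everything from the
first space on and then checks that what is left really is 64
hexadecimal characters:

```shell
#!/bin/sh
# Hypothetical sample input; any string works.
_TXT='some crazy long name'

# Capture the whole line that sha256sum prints for stdin.
_OUT=$(printf '%s' "${_TXT}" | sha256sum)

# Remove the longest suffix that starts with a space, leaving the digest.
_SHA256=${_OUT%% *}

# Sanity check: a SHA-256 digest is exactly 64 lowercase hex characters.
if printf '%s\n' "${_SHA256}" | grep -Eq '^[0-9a-f]{64}$'; then
    printf '%s\n' "${_SHA256}"
else
    printf 'unexpected sha256sum output: %s\n' "${_OUT}" >&2
    exit 1
fi
```

 The explicit check means a change in the output format would be
reported instead of silently producing a wrong file name.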

> These web sites can change their URLs at any time you know, so it
> may not be worth trying to replicate their structure locally.

 Yes, I know, and my way of dealing with such issues is:

 a) including the date and time of the download in the name of its web
log ...
 b) once the data file is downloaded, say a pdf file of an old book or
some publication, OCRing all the metadata in the front and back pages
of the book: the actual title, ISBN, publishing date ...

On 4/26/23, tomas@tuxteam.de <tomas@tuxteam.de> wrote:
>>  a) encode the string name as base64
>>  b) calculate the sha256sum of §a
>
> Why the detour over base64?

 Because I would like to include the three strings in the file descriptor:
 a) the crazy long name
 b) its base64 representation
 c) the sha256sum of §b, which is the one used for the file name and
for the log of the download.
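
 The three strings can be produced roughly like this. This is only a
sketch: the sample name and the "name/base64/sha256" record layout are
made up for illustration, and the record is printed to stdout rather
than written anywhere.

```shell
#!/bin/sh
# Hypothetical long name; not taken from any real URL.
_NAME='A very long title: with spaces, commas & other "funky" characters.pdf'

# a) the original name is kept verbatim in _NAME.
# b) its base64 representation (tr removes the line wrapping base64 may add).
_B64=$(printf '%s' "${_NAME}" | base64 | tr -d '\n')

# c) the SHA-256 digest of b); awk keeps only the first field, the digest.
_SHA256=$(printf '%s' "${_B64}" | sha256sum | awk '{print $1}')

# Keep the original extension: everything after the last dot.
_EXT=${_NAME##*.}

# Print the three strings as one record, then the resulting file name.
printf 'name:   %s\nbase64: %s\nsha256: %s\n' "${_NAME}" "${_B64}" "${_SHA256}"
printf 'file:   %s.%s\n' "${_SHA256}" "${_EXT}"
```

 Since §b can be decoded back to §a, and §c can be recomputed from §b,
each of the three strings can be checked against the other two.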

 I would like to make this scheme "fool (and fail) proof" as they say.
There is no way on earth that a file system could mess with all three
aspects of it.

>>  c) use §b as file name (of course, leaving the original extension as it
>> is)
>
> Why the extension? DOS nostalgia?

 The local copies should represent the web URLs as closely as possible
in order to minimize "what came from where" kinds of confusion. Also,
from the same URL you would then download the corresponding pdf file
with exactly the same name, the only difference being the extension.

>> // __ $_SHA256:
>> |7d5895cb24ab49692a8ad495e036074fec8e61b22040544f02a9b69c926dbdeb  -|
>
>
> I only see harmless hexadecimal chars there.
>
>>  I am trying to avoid funky characters and sha256sum --text still
>> generates them!?!
>
> Where are there "funky chars"?

 This is the first time I have seen blank spaces and hyphens in a text
segment's sum. Those characters might be confusing.

> Besides, I don't think --text does what you think it does. Quoting
> the manpage:
>
>   "Note: There is no difference between binary mode and text
>    mode on GNU systems."

 Thank you. I was playing with different options to see if that was
the reason I was getting those white spaces and hyphens at the end.

 Why is that happening? How could it be avoided? Could you set the
characters used in the representation of a sum?

 lbrtchx

