
Re: sha256sum --text generating blank spaces and hyphens?



On 4/27/23 01:04, Nicolas George wrote:
David Christensen (12023-04-26):
My suggestion assumes that the URL => hash => content mapping is saved
somehow.

That is an assumption that needed to be made explicit from the start.

For example, save the content in a file named after the hash and
save the URL in a file whose name is the hash plus a suffix. Finding a
document by URL then becomes a grep(1) invocation.
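
A minimal sketch of that layout, in Python rather than shell, might look
like the following. The ".url" suffix, the function names, and the use of
SHA-256 via hashlib are assumptions made for illustration; the post does
not specify any of them. The lookup is a linear scan over every URL file,
which is the same work a grep(1) over those files would do.

```python
import hashlib
import tempfile
from pathlib import Path

URL_SUFFIX = ".url"  # hypothetical suffix; the post does not name one


def save_document(store: Path, url: str, content: bytes) -> str:
    """Write content to a file named after its SHA-256 hex digest,
    and the URL to a sibling file named digest + URL_SUFFIX."""
    digest = hashlib.sha256(content).hexdigest()
    (store / digest).write_bytes(content)
    (store / (digest + URL_SUFFIX)).write_text(url + "\n", encoding="utf-8")
    return digest


def find_by_url(store: Path, url: str) -> list[str]:
    """Return the digests whose URL file holds exactly this URL.
    Reads every URL file, so the cost grows with the number of documents."""
    hits = []
    for url_file in sorted(store.glob("*" + URL_SUFFIX)):
        if url_file.read_text(encoding="utf-8").strip() == url:
            hits.append(url_file.name[: -len(URL_SUFFIX)])
    return hits


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp)
        digest = save_document(store, "https://example.org/a", b"hello\n")
        print(digest, find_by_url(store, "https://example.org/a"))
```

Going from hash to content is a single file open; going from URL to hash
means touching every URL file in the directory.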

This is not very efficient.


Please see the OP, step (d).


You are free to propose better solutions.


On 4/26/23 21:02, David Christensen wrote:

> Things get more interesting when you approach the problem as a database.
> Save the content wherever and put the metadata into a table -- content
> hash (primary key), URL, download timestamp, author, subject, title,
> keywords, etc.  Create fully inverted indexes.  Create a search engine.
> Create a spider.  Implementation could range from a CSV/TSV flat-file
> and shell/P* scripts, to a desktop database/UI, to a LAMP stack, and
> beyond (NoSQL, N-tier).  There are distributed file sharing systems
> based on such ideas.


David

