
Re: trying to parse lines from an awkwardly formatted HAR file ...



On Sat, Mar 23, 2024 at 11:55:04AM -0400, Greg Wooledge wrote:
> On Sat, Mar 23, 2024 at 09:54:05AM -0500, Albretch Mueller wrote:
> >  1) That HAR file is not properly formatted. Instead of
> > "attribute":value pairs in the standard way, they have used front
> > slash + quote pairs (instead of just quotes) erratically all around
> > the file. That is why you can't use jq.
> 
> That is not what I see in the file which I pasted here.

Further investigation:

https://google.com/search?q=what+is+a+HAR+file

  https://www.keycdn.com/support/what-is-a-har-file
  Jan 12, 2023 — A HAR file is primarily used for identifying
  performance issues, such as bottlenecks and slow load times, and page
  rendering problems.

  https://en.wikipedia.org/wiki/HAR_(file_format)
  The HTTP Archive format, or HAR, is a JSON-formatted archive file
  format for logging of a web browser's interaction with a site.
  ...
  This document was never published by the Web Performance Working Group
  and has been abandoned.

So, putting these together, it looks like you are taking a file that
was intended to be used for diagnosing browser/network performance
issues, and attempting to use this in place of a downloadable index
of documents from archive.org.

Furthermore, whatever method you are using to *create* this HAR file
is questionable, since apparently you aren't even getting a properly
formatted file in the end.
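
Since a HAR file is supposed to be plain JSON, a quick sanity check is
to parse it and pull out the request URLs. The following is only a
sketch, not a diagnosis of your file: the function name har_urls is
made up for illustration, and the second decoding pass rests on my
assumption that the stray slash + quote pairs are really backslash-
escaped quotes, which is what you would see if the whole JSON document
had been saved as one quoted string.

```python
import json


def har_urls(text):
    """Return the request URLs found in HAR text.

    Raises ValueError (json.JSONDecodeError) if the text is not JSON,
    and KeyError if it is JSON but lacks the log/entries/request layout.
    """
    data = json.loads(text)
    # Assumption: if the first pass yields a string, the JSON was wrapped
    # in quotes with its inner quotes escaped, so decode it once more.
    if isinstance(data, str):
        data = json.loads(data)
    # HAR keeps one entry per request under log.entries.
    return [entry["request"]["url"] for entry in data["log"]["entries"]]
```

Calling har_urls(open("your-file.har").read()) (the file name is a
placeholder) either returns the list of URLs or fails with an error
that points at where the file stops being valid JSON.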

This tells me we're deep inside an X-Y problem.  The original goal is
possibly something like "I want an index of all the books about this
Greek dude".  Maybe start from there, and see what answers you get.

