Here's a hint to get you started. It should be pretty easy to extend; if
anything is unclear, let me know.
For starters, run your "gunk" through jq like this:
$ echo '{"index":"prod-h-006","fields":{"identifier":"bub_gb_O2EAAAAAMAAJ","title":"Die Wissenschaft vom subjectiven Geist","creator":["Karl Rosenkranz", "Mr. ABC123"],"collection":["europeanlibraries", "americana"],"year":1843,"language":["German"],"item_size":797368506},"_score":[50.629513]}' | jq .
{
  "index": "prod-h-006",
  "fields": {
    "identifier": "bub_gb_O2EAAAAAMAAJ",
    "title": "Die Wissenschaft vom subjectiven Geist",
    "creator": [
      "Karl Rosenkranz",
      "Mr. ABC123"
    ],
    "collection": [
      "europeanlibraries",
      "americana"
    ],
    "year": 1843,
    "language": [
      "German"
    ],
    "item_size": 797368506
  },
  "_score": [
    50.629513
  ]
}
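With the structure laid out like that, each value can be reached by its path. A minimal sketch, using a cut-down copy of the record (the variable names json, year and first_creator are only for illustration):

```shell
# Cut-down copy of the record, single-quoted so the shell leaves it alone.
json='{"fields":{"identifier":"bub_gb_O2EAAAAAMAAJ","creator":["Karl Rosenkranz","Mr. ABC123"],"year":1843}}'

# .fields.year walks into the nested object; numbers print as-is.
year=$(printf '%s\n' "$json" | jq '.fields.year')

# .fields.creator[0] picks the first array element; -r drops the JSON quotes.
first_creator=$(printf '%s\n' "$json" | jq -r '.fields.creator[0]')

printf '%s\n%s\n' "$year" "$first_creator"
```

Without -r, string results come back JSON-quoted, which is usually not what you want in a delimited text line.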
Then start building your output like this:
$ echo '{"index":"prod-h-006","fields":{"identifier":"bub_gb_O2EAAAAAMAAJ","title":"Die Wissenschaft vom subjectiven Geist","creator":["Karl Rosenkranz", "Mr. ABC123"],"collection":["europeanlibraries", "americana"],"year":1843,"language":["German"],"item_size":797368506},"_score":[50.629513]}' | jq '.fields.identifier + "|" + .fields.title'
jq is an amazing tool; it's a full-fledged programming language. You just need to keep
concatenating the pieces of your desired output. You might even find you can do everything you want inside a jq script instead of what you're doing now. Consider writing a jq script whose first line is #!/usr/bin/jq -f
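A rough sketch of what such a script could look like, carrying the concatenation a bit further. The file name fields.jq, the choice of fields, and the "; " separator between creators are my assumptions, not something from your data; adjust to taste:

```shell
# Write a small jq script; the file name fields.jq is made up for this example.
cat > fields.jq <<'EOF'
#!/usr/bin/jq -f
# identifier|title|creators|year, one line per input record
.fields
| .identifier + "|" + .title
  + "|" + (.creator | join("; "))
  + "|" + (.year | tostring)
EOF

# Cut-down copy of the sample record.
record='{"fields":{"identifier":"bub_gb_O2EAAAAAMAAJ","title":"Die Wissenschaft vom subjectiven Geist","creator":["Karl Rosenkranz","Mr. ABC123"],"year":1843}}'

# -r prints raw text rather than a JSON-quoted string; -f reads the filter from the file.
out=$(printf '%s\n' "$record" | jq -r -f fields.jq)
printf '%s\n' "$out"
```

join flattens the creator array and tostring turns the numeric year into a string so it can be concatenated. If you chmod +x the file, you can also run it directly as ./fields.jq wherever jq lives at /usr/bin/jq; with only -f in the first line, the output will then be JSON-quoted.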
Hope this gets you on the right path!
Michael Grant

From: tomas@tuxteam.de
Sent: Friday, March 22, 2024 23:44
To: Albretch Mueller
Cc: debian-user
Subject: Re: trying to parse lines from an awkwardly formatted HAR file ...

On Sat, Mar 23, 2024 at 12:53:24AM -0500, Albretch Mueller wrote:
> out of a HAR file containing lots of obfuscating js cr@p and all kinds of
> nonsense I was able to extract line looking like:

It's not "js cr@p", It is called JSON. And there's a spec for it.

[...]

> I have tried substring substitution, sed et tr to no avail.

You might have a lot of fun trying to parse JSON with sed and tr. If you
are serious about it, you should try a proper parser and extractor. I'd
recommend jq [1], available in Debian under the same-named package.

I have written a few shell scripts reaching into the innards of

You'll have to wrap your brain around it, but in the time you have
implemented a parser for js in "sed and tr" (you might need a dash of
"proper programming language" around that, some luck and a ton of elbow
grease) you might have wrapped your brain like 16 times around jq (or
some other appropriate tool).

Cheers
--
tomás