Re: sed:awk:perl::rock:paper:chainsaw [was Re: Using .XCompose]

To: debian-user@lists.debian.org
Subject: Re: sed:awk:perl::rock:paper:chainsaw [was Re: Using .XCompose]
From: David Christensen <dpchrist@holgerdanske.com>
Date: Sun, 19 Jul 2020 14:19:32 -0700
Message-id: <f8555c17-568c-cc00-9112-6711354e4149@holgerdanske.com>
In-reply-to: <1365305252.2275163.1595155599884@mail.yahoo.com>
References: <143551610.2618526.1594013699409.ref@mail.yahoo.com> <143551610.2618526.1594013699409@mail.yahoo.com> <alpine.DEB.2.21.2007061804280.26486@azone.org> <alpine.DEB.2.21.2007070628500.6440@azone.org> <885539452.3315628.1594115314827@mail.yahoo.com> <alpine.DEB.2.21.2007092139232.19897@azone.org> <908961137.4769747.1594370585219@mail.yahoo.com> <1104871163.4762905.1594370616864@mail.yahoo.com> <alpine.DEB.2.21.2007110839180.28476@azone.org> <alpine.DEB.2.21.2007111021500.7787@azone.org> <520110585.417671.1594565493490@mail.yahoo.com> <alpine.DEB.2.21.2007152324040.10414@azone.org> <1370813785.2444121.1594924286284@mail.yahoo.com> <alpine.DEB.2.21.2007180917350.17731@azone.org> <14958976-05a6-eb53-c0f4-e5857fbcb974@holgerdanske.com> <1365305252.2275163.1595155599884@mail.yahoo.com>

One last point I missed in my previous post -- big files. Many Perl functions and/or libraries expect to do everything in RAM. This becomes a problem when I want to compress, encrypt, save, and checksum 14 GiB system drive images using a live drive and a Perl program in computers without large amounts of memory (64 GiB?). So, I ended up writing a Perl program that is truly a script -- it formulates and invokes gzip(1), ccrypt(1), md5sum(1), and sha256sum(1) commands as required. Traditional Bourne shell scripts are ideally suited for such tasks, and the syntax can be cleaner than Perl in simple cases.
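As a rough illustration of the shell approach described above, the sketch below streams an image through gzip(1), optionally through ccrypt(1), saves the result, and then checksums the saved file. Every tool reads and writes in small buffers, so memory use stays flat regardless of image size. The function name archive_image and the CCRYPT_KEYFILE variable are made up for this example and are not taken from the program described above; the encryption branch assumes ccrypt's filter mode with its -e (encrypt) and -k (keyfile) options.

```shell
#!/bin/sh
# Sketch only: archive_image and CCRYPT_KEYFILE are hypothetical names.
# Usage: archive_image SOURCE_IMAGE DESTINATION_FILE
archive_image() {
    src=$1
    dst=$2
    if [ -n "${CCRYPT_KEYFILE:-}" ]; then
        # gzip and ccrypt both run as filters, so the image is never held in RAM.
        # Note: plain sh reports only the exit status of the last pipeline stage.
        gzip -c "$src" | ccrypt -e -k "$CCRYPT_KEYFILE" > "$dst" || return 1
    else
        gzip -c "$src" > "$dst" || return 1
    fi
    # Checksum the saved file; both tools read it sequentially.
    md5sum "$dst" > "$dst.md5" || return 1
    sha256sum "$dst" > "$dst.sha256" || return 1
}
```

The saved file can be verified later with `md5sum -c` and `sha256sum -c` against the two checksum files.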



On 2020-07-19 03:46, Ajith R wrote:

I seem to recall that he puts Perl at the top of the
heap, and notes that Perl compatible regular expressions (PCRE) are
available via libraries in other programming languages.


Thanks for confirming that I didn't make a wrong choice. Programs that claim to use PCRE don't support everything that Perl does.

My use of regular expressions is primarily via grep/egrep and Perl. I am curious to see what Erlang offers.

I wanted to clean many documents (Wikipedia dump) to analyse the Malayalam content. As I was not comfortable with scripting, I was looking for some program that could remove the foreign-language text from the files. As I could find none that could do the job, I had to use a Perl script with the line below (among others):

s/[^\p{Block: Malayalam}\p{Block: Basic_Latin}\p{Block: General_Punctuation}\s]//g; # remove characters outside the specified unicode blocks.

As of now, the simple substitute command of perl is sufficient for my requirements. Even that one command appears powerful compared to others.

That sounds like you need lexing and parsing, followed by your desired processing. In simple cases, Perl regular expressions can accomplish two or three of those tasks. But as complexity grows, you will need more and more code. Writing a lexer/parser in any language is a non-trivial task. I have used the Perl 'LWP' library to parse HTML 4 pages, but I have not tried to parse HTML 5 and/or Wikipedia pages. I would look for a library:


https://metacpan.org/search?q=parse

https://metacpan.org/search?q=wikipedia

I spent a little time with Raku (formerly Perl 6). AIUI improving regular expressions was a design goal of Perl 6, and features were added specifically for parsing. There is a book dedicated to the subject. I have a friend who put some time into this area, and he seemed impressed:


https://www.apress.com/us/book/9781484232279


David
