
Re: sed:awk:perl::rock:paper:chainsaw [was Re: Using .XCompose]



Hello Ajith,

Tom Browder suggests taking a look at Raku (née Perl6), and I concur.
While I don't know Malayalam at all, I can write the regex code below
with ease:

> #all code below using the Raku REPL:
> say '0123456789'.chars;
10
> say $/ if '0123456789' ~~ /  \d+ /;
「0123456789」

> #now with Bengali digits:
> say '০১২৩৪৫৬৭৮৯'.chars;
10
> say $/ if '০১২৩৪৫৬৭৮৯' ~~ /  \d+ /;
「০১২৩৪৫৬৭৮৯」

> #now with Malayalam digits:
> say '൦൧൨൩൪൫൬൭൮൯'.chars;
10
> say $/ if '൦൧൨൩൪൫൬൭൮൯' ~~ /  \d+ /;
「൦൧൨൩൪൫൬൭൮൯」
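
For comparison, the filtering Ajith describes (keep Malayalam, English and
punctuation, delete everything else) can also be sketched outside Raku. The
block below is only a minimal sketch in Python using the standard library,
not a definitive implementation. It makes several assumptions that are not
in the original mail: "Malayalam" is taken to mean the Unicode Malayalam
block (U+0D00..U+0D7F) rather than the Script property, "English" is taken
to mean plain ASCII, zero-width joiners are kept, and the names keep_char
and strip_other_scripts are hypothetical.

```python
import unicodedata

# Assumption: "Malayalam" means the Unicode Malayalam block, U+0D00..U+0D7F.
MALAYALAM_BLOCK = range(0x0D00, 0x0D80)

# Assumption: zero-width non-joiner/joiner are kept, since some Malayalam
# text relies on them.
JOINERS = {"\u200c", "\u200d"}


def keep_char(ch: str) -> bool:
    """Return True for characters the filter should keep."""
    if ch.isascii():
        # ASCII letters, digits, punctuation, spaces and newlines all stay.
        return True
    if ord(ch) in MALAYALAM_BLOCK or ch in JOINERS:
        return True
    # Non-ASCII punctuation: every general category starting with "P"
    # (Pc, Pd, Ps, Pe, Pi, Pf, Po), e.g. curly quotes or the danda.
    return unicodedata.category(ch).startswith("P")


def strip_other_scripts(text: str) -> str:
    """Delete every character that keep_char() rejects."""
    return "".join(ch for ch in text if keep_char(ch))


if __name__ == "__main__":
    # Latin, Malayalam and Bengali mixed; only the Bengali word is removed.
    print(strip_other_scripts("hello, മലയാളം বাংলা!"))
```

Because this works on blocks, it would need adjusting for a true
script-based match, which is what the Raku regex engine offers directly.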


More info here:

https://www.nntp.perl.org/group/perl.perl6.users/2020/06/msg8828.html
https://www.nntp.perl.org/group/perl.perl6.users/2020/06/msg8845.html

HTH, Bill.



On Sun, Jul 19, 2020 at 4:36 AM Ajith R <ajithramayyan@yahoo.co.in> wrote:
>
> Hi,
>
> > First, there is a somewhat specific question about unspecified
> > substitutions. For all I know about these substitutions, you might
> > actually need XSLT to do them properly.
>
> The substitution that I had in mind requires referring to characters based on their Unicode properties, such as script or block.
>
> > I think you should absolutely use perl if it makes you happy.
> > Unix has a pretty interesting collection of various small tools (which
> > "do one thing and do it well" as you may hear), and shells facilitate
> > hooking up their outputs and inputs. Almost as if they were made to do
> > just that.
>
> I don't subscribe to using a tool for the sake of happiness. With my limited knowledge, I want to select one that is adequate to do the job.
> The substitution that I wanted in many text files was deleting all text other than Malayalam, English and punctuation. This required a program that could match characters based on their Unicode block or script property. I didn't find anything to suggest that sed could do that. Maybe I didn't search properly.
> Did I miss a utility (including sed) that can do the kind of substitution I mentioned above?
>
> Thanks,
> ajith
>

