
Re: Alternate delimiters (for sed) above decimal 127? (was Re: sed question)



On Sat 07 Dec 2019 at 09:27:59 (-0500), rhkramer@gmail.com wrote:
> On Saturday, December 07, 2019 07:20:35 AM The Wanderer wrote:
> > Yep - using '/' is only a standard convention, it's not required. When
> > writing an s-expression which I know will be passed a path, I generally
> > use '@' myself; that A: is conveniently typable on the keyboard, B: is a
> > comparatively rare character to find in either path or filename, C:
> > doesn't have special meaning as part of a regular expression, and D:
> > unlike most of the other characters that fit the other criteria, isn't
> > treated specially by most shells that I know of.
> > 
> > 
> > `~!#$&*()={}|\;"'<> are all treated specially by bash, in at least some
> > circumstances. (Assuming I haven't mixed anything up.)
> > 
> > $^*()+[]|\.? are treated specially as part of a regular expression.
> > 
> > !%&()_+=-:;'",./? are comparatively common in paths and/or filenames.
> > 
> > As far as I can see, at least on my keyboard, that pretty much just
> > leaves @. It does still sometimes occur in paths and filenames, so it's
> > not really ideal, but it's probably less common there than any of the
> > non-special-meaning others.
> 
> I'm not the OP, but thanks for the explanation / discussion.
> 
> I just have a wild idea / question.  Those are (iirc) all ASCII characters, 
> (basically 7 bits) (yes, I know they are in an 8 bit byte), I wonder if SED 
> (and AWK) could use something in, well, is it called the 2nd code page (I 
> forget), but some character like the degree symbol (which, iirc, is something 
> like 240 octal?).  Also, although I haven't used it in a very long time, it 
> seems there is (or was) a means to do something like type <alt>240 to actually 
> enter the degree sign.

I won't speak to awk, but sed requires the delimiter to be a single
byte.  A "penalty" for using UTF-8 throughout the system is that only
characters with the top bit clear (ASCII, 0-127) fit in a single byte;
a byte with the top bit set is always part of a multibyte character.
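
To make that concrete, here is a rough Python sketch (my own
illustration, not anything taken from sed itself) that prints the UTF-8
bytes of a few candidate delimiters. The helper names utf8_bytes and
is_single_byte are made up for this example. The degree sign is U+00B0
(decimal 176, octal 260), and in UTF-8 it becomes two bytes, both with
the top bit set, whereas '@' and '/' stay as one byte each.

```python
def utf8_bytes(ch):
    """Return the UTF-8 encoding of a one-character string as a list of ints."""
    return list(ch.encode("utf-8"))


def is_single_byte(ch):
    """True if the character occupies exactly one byte in UTF-8."""
    return len(ch.encode("utf-8")) == 1


# Candidate delimiters: two ASCII characters and the degree sign.
for ch in ("@", "/", "\u00b0"):
    encoded = utf8_bytes(ch)
    # Show each byte in hex and flag whether its top bit (0x80) is set.
    detail = ", ".join(
        "0x%02X (top bit %s)" % (b, "set" if b & 0x80 else "clear")
        for b in encoded
    )
    print("%r: %d byte(s): %s" % (ch, len(encoded), detail))
```

So, assuming a UTF-8 locale, anything above decimal 127 is out as a
single-byte delimiter, and you are back to choosing from the ASCII
characters discussed above.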

> Oh, hmm, <alt>240 doesn't do it, maybe something has changed (or, more likely, 
> I'm mis-remembering ;-)

If *my* memory serves, isn't that how characters used to be entered on
M$ systems? Anyway, you can make things a lot easier for yourself
by defining key sequences in a way that makes sense to you. For example,
I use the degree sign ° quite often and type it with three keystrokes:
<CapsLk> <o> <o>
I rely on the defaults (wherever they originate) as much as possible
(they seem to make sense), but I add quite a lot more, and have
endeavoured to make VCs (virtual consoles) and X behave similarly:

https://lists.debian.org/debian-user/2019/07/msg00926.html

Cheers,
David.

