Re: Deleting some regexp/simple expression from lots of files in a secure way



Thanks for this really complete answer; I really appreciate it. Thank you
for the time you put into it. I learnt a lot from it.

Boyd Stephen Smith Jr. wrote:
> On Friday 14 May 2010 12:52:45 Merciadri Luca wrote:
>   
>> "Boyd Stephen Smith Jr." <bss@iguanasuicide.net> writes:
>>     
>>> On Friday 14 May 2010 12:04:42 Merciadri Luca wrote:
>>>       
>>>> I have many text files (actually .tex files) which contain some
>>>> sequence or regexp (it depends on the files) that I would like to
>>>> remove. Is there a commandline/GUI for doing this massive edit?
>>>>         
>>> (sed -i -e "s/$regexp//" "$file") for a single file.  (GNU sed only.)
>>>
>>> (find $dir -type f -exec sed -i -e "s/$regexp//" {} \;) for all files in
>>> a directory.
>>>       
>> I am using the second command. The problem is that, for one set of
>> files (that I have selected, no problem for this), I have to use a
>> really simple expression: I need to find all the occurrences of
>> `\paragraph{}' and replace them with nothing (i.e. with `'). I know
>> regexps, but replacing `$regexp' with `\paragraph{}' gives error
>> messages. Any idea? Thanks.
>>     
>
> First you need a (basic) regular expression (BRE) that matches "\paragraph{}".  
> The '\' is a BRE special character, so it needs to be escaped.  Also, the "{}" 
> is a bit troublesome with find/-exec, so we will match it using the construct 
> "[{][}]".
>
> The definitive documentation for regular expressions is the Single UNIX 
> Specification, Version 3 -- Base Definitions, Chapter 9.  I don't actually 
> like (man 7 regex) for this.
>
> This gives us the regex "\\paragraph[{][}]".  Now, we need to get that regular 
> expression to sed.  (find $dir -type f -exec sed -i -e "s/\\paragraph[{][}]//" 
> {} \;) won't work: inside double quotes the shell treats "\\" as an escaped 
> '\', so during Quote Removal one of the '\'s is dropped and neither find nor 
> sed "sees" it.
>
> The shell does a *lot* of processing on the text you type before it reaches 
> the command you are invoking.  Single UNIX Specification, Version 3 -- Shell 
> and Utilities, Chapter 2 is the core documentation, but some shells are much 
> more featureful.
>
> We can either use (find $dir -type f -exec sed -i -e 's/\\paragraph[{][}]//' 
> {} \;) OR--my preference--(regex='\\paragraph[{][}]'; find $dir -type f -exec 
> sed -i -e "s/$regex//" {} \;) to make sure sed gets that important '\'.
>
> Also, I left it out, but you may want the "g" flag to the "s"ubstitute command 
> in sed.  Otherwise, only one occurrence of the regex will get eliminated per 
> line.
>   
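
For the archive, here is a minimal sketch of how I understand the points
above. It is only an illustration, not a definitive recipe: it assumes GNU
sed (for -i), and the scratch directory and the file name sample.tex are made
up for the demonstration. It first prints what the shell hands over in the
double-quoted and single-quoted cases, then runs the variable-based command
with the "g" flag on a throwaway file.

```shell
#!/bin/sh
# Sketch only: the scratch directory and sample.tex are hypothetical test data.
dir=$(mktemp -d)
printf '%s\n' 'A\paragraph{}B\paragraph{}C' 'no match here' > "$dir/sample.tex"

# Inside double quotes the shell reduces \\ to \, so only one backslash is left.
double="s/\\paragraph[{][}]//"
# Inside single quotes nothing is altered, so both backslashes survive.
single='s/\\paragraph[{][}]//'
printf 'double quotes -> %s\n' "$double"
printf 'single quotes -> %s\n' "$single"

# Variable-based form from the quoted message, with the "g" flag added so
# that every occurrence on a line is removed, not just the first one.
regex='\\paragraph[{][}]'
find "$dir" -type f -exec sed -i -e "s/$regex//g" {} \;

# Show the edited file, then clean up the scratch directory.
result=$(cat "$dir/sample.tex")
printf '%s\n' "$result"
rm -rf "$dir"
```

With the "g" flag, the first line of the sample file ends up as ABC; without
it, only the first \paragraph{} on that line would be removed. On real files
it is probably wise to try this on a copy first, since -i edits in place.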


-- 
Merciadri Luca
See http://www.student.montefiore.ulg.ac.be/~merciadri/
I use PGP. If there is an incompatibility problem with your mail
client, please contact me.


First deserve, then desire.
