Re: Removing duplicate text

To: Tony Baldwin <photodharma@gmail.com>
Cc: Nyizsnyik Ferenc <nyizsa@gmail.com>, debian-user@lists.debian.org
Subject: Re: Removing duplicate text
From: Wilfred Zegwaard <wilfred.zegwaard@gmail.com>
Date: Mon, 25 May 2009 14:53:53 +0200
Message-id: <1243256033.3157.6.camel@localhost>
In-reply-to: <4A19867F.6070109@gmail.com>
References: <1243183272.8938.1.camel@localhost> <4A197BD2.9050608@gmail.com> <20090524191951.063b5b47@debian> <4A19867F.6070109@gmail.com>

I tried uniq, but it didn't solve the problem: the output file was
identical to the input. So I took the LaTeX route (it's a LaTeX file).
I started RefTeX to see which parts were identical, jumped to each
duplicate with RefTeX, and deleted it by hand. It's not foolproof, but I
think most (if not all) of the duplicate text is gone now.
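
A likely reason uniq gave the same file back: it only compares adjacent
lines, so repeats separated by other text survive. Below is a minimal
sketch of an order-preserving alternative with awk, assuming the
duplicates match line for line. The file names and sample contents are
made up for illustration.

```shell
# Hypothetical sample file: the duplicates are NOT on adjacent lines.
printf '%s\n' 'alpha' 'beta' 'alpha' 'gamma' 'beta' > sample.txt

# uniq only collapses runs of identical adjacent lines,
# so for this input the output is the same as the input.
uniq sample.txt uniq_out.txt

# awk prints a line only the first time it is seen,
# keeping the original order of the remaining lines.
awk '!seen[$0]++' sample.txt > dedup_out.txt

cat dedup_out.txt
```

On a real LaTeX file this would also drop legitimately repeated lines
(blank lines, \end{itemize} and so on), so it's worth diffing the result
against the original before trusting it.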

I also saw this pointer:

http://www.ibm.com/developerworks/linux/library/l-tiptex6.html

I didn't try it.

Wilfred

On Sun, 24-05-2009 at 13:40 -0400, Tony Baldwin wrote:
> Nyizsnyik Ferenc wrote:
> > On Sun, 24 May 2009 12:54:42 -0400
> > Tony Baldwin <photodharma@gmail.com> wrote:
> > 
> >> Wilfred Zegwaard wrote:
> >>> Dear users,
> >>>
> >>> I've got a textfile with a lot of duplicate text. How do I remove
> >>> it? I'm using Emacs 21.
> >>>
> >>> Wilfred
> >>>
> >>>
> >> That seems like a very general question, and one which has a
> >> plethora of answers.
> > 
> > Exactly; here is a simple one:
> > uniq -u infile outfile
> 
> Wow, that's cool.  I had never seen uniq before.
> I just read the man page and tried it out.
> Wouldn't he just want
> uniq infile outfile
> since the -u option will remove all instances
> of a repeated element, whereas uniq f1 f2
> will leave one instance of said repeating element?
> 
> I did
> for i in 1 2 3 4 5 6 7 8 9; do echo banana >> banana; done
> then echo "orange you glad I didn't say banana? " >> banana
> then
> uniq -u banana orange
> cat orange
> orange you glad I didn't say banana?
> 
> then did
> uniq banana orange  # no -u
> cat orange
> banana
> orange you glad I didn't say banana?
> 
> so it kept one banana.
> 
> /tony
> 
> 
> -- 
> http://www.photodharma.com
> art & photos | tony baldwin
> 
>
