
Re: piping find to zip -- with spaces in path



On Tue, 11 Jan 2011 22:02:48 -0500
Doug <dmcgarrett@optonline.net> wrote:

> On 01/11/2011 08:46 PM, Robert Blair Mason Jr. wrote:
> > On Tue, 11 Jan 2011 14:53:33 -0700
> > Bob Proulx<bob@proulx.com>  wrote:
> >
> >> Robert Blair Mason Jr. wrote:
> >>> Rob Owens<rowens@ptd.net>  wrote:
> >>>> I tried this and it successfully creates myfile.zip:
> >>>>
> >>>> find ./ -iname "*.jpg" -print | zip myfile -@
> >>>>
> >>>> But it fails if there are spaces in the path or filename.  How can I
> >>>> make it work with spaces?
> >>> I think the best way would be to quote them in the pipe:
> >>>
> >>> find ./ -iname "*.jpg" -printf "'%p'\n" | zip myfile -@
> >> But that fails when the filename contains a quote character.
> >>
> >>    John's Important File
> >>
> >> Using zero terminated strings (zstrings) are best for handling
> >> arbitrary data in filenames.
> >>
> >> Real Unix(TM) users never put [^[:ascii:]] characters in file names.
> >>
> >> Bob
> > True.  Underscores are _wonderful_ things.  But remember, Linux is Not Unix!
> >
> > Unfortunately for the OP, i don't *think* zip accepts zstrings.  Perhaps a script to just remove all of the non-ascii characters in the filename of all files in the current directory?
> >
> > Random tangent, but pascal strings are often more efficient from a programming standpoint than c-style strings...
> >
> I didn't prune anything because I can't figure out where to do it 
> without losing any semblance of coherence, but anyway:
> The comment about real Unixers not using ascii characters:  what about 
> urls?  They come from the Unix world, and are
> full of underscores and question marks and equal signs.  Then there are 
> emails, all of which require the @ sign.  Not
> complaining, just asking.
> 
> --doug
> 

Well, to be technical, all of those characters *ARE* ASCII.  They're just not in the alphanumeric subset.

The underscore is often lumped in with the 'safe' characters anyway, since C allows it in identifiers.  The C Programming Language strikes again!  Generally, what people mean by 'non-ascii character' in this context is any character which might have a special meaning to the user's shell.
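
To make "special meaning to the shell" a bit more concrete, here is a rough sketch using Python's shlex.quote, which leaves a name alone when it contains nothing a POSIX shell would treat specially and single-quotes it otherwise.  The file names are made up for illustration (the last one is Bob's example from earlier in the thread), and this only shows which names need care; it says nothing about what zip itself accepts.

```python
import shlex

# Hypothetical file names, chosen only for illustration.
names = ["my_file.jpg", "my file.jpg", "John's Important File"]

for name in names:
    # shlex.quote returns the name unchanged when it contains nothing
    # the shell treats specially, and a single-quoted form otherwise.
    quoted = shlex.quote(name)
    print(f"{name!r:28} needs quoting: {quoted != name}")
```

The underscore version passes through untouched, while the names with a space or an apostrophe come back quoted.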

When you see a question mark and/or equals signs in a URL, however, they're usually special characters telling whatever CGI scripting engine the site is running (usually one of the 3Ps: Perl, Python, PHP) to pass these variables as 'arguments' to the page.  So they're not part of the file name on the actual server hard disk, just the URL.

A practical example:  Googling 'question mark in URL' returns (to me, i'm feeling lucky) http://www.abestweb.com/forums/showthread.php?t=55417 .  Which shows us this:
 - Protocol HTTP, site www.abestweb.com
 - File /forums/showthread.php
 - Pass parameter t=55417 to showthread.php
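
The same breakdown can be reproduced with Python's standard urllib.parse module.  This is only a sketch of how a URL splits into its parts; it says nothing about how that particular site actually processes the request.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.abestweb.com/forums/showthread.php?t=55417"

# Split the URL into protocol, site, file path and query string.
parts = urlsplit(url)

# Turn the query string ("t=55417") into a dict of parameter lists.
params = parse_qs(parts.query)

print(parts.scheme)  # protocol
print(parts.netloc)  # site
print(parts.path)    # file on the server
print(params)        # parameters passed to the script
```

Everything after the question mark ends up in the parameters, not in the path.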

This way, the site doesn't need to make a separate HTML file to show each thread; it can have a single PHP file that takes the thread id (t) as a variable when you request it, and then look up this thread's postings in the database and generate HTML from that.  Much more efficient.

If you request just showthread.php, you get a valid web page back, but the site says that you did not specify a thread.  Handling threads with a script is also nicer because it fails much more gracefully when you don't specify a thread or the thread is non-existent (a notice box detailing the issue instead of a page not found).
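
As a very rough sketch of that idea (in Python rather than PHP, and with a made-up in-memory dict standing in for the forum's database), a single handler might look something like this.  The names show_thread and THREADS and the notice texts are all hypothetical; the real site's code will certainly differ.

```python
from html import escape
from urllib.parse import parse_qs

# Hypothetical stand-in for the forum's database: thread id -> postings.
THREADS = {55417: ["first post", "a reply"]}

def show_thread(query_string, threads=THREADS):
    """Return HTML for one thread, or a notice if it can't be shown."""
    params = parse_qs(query_string)
    if "t" not in params:
        # Plain "showthread.php" with no thread id: still a valid page.
        return "<p>Notice: no thread specified.</p>"
    try:
        thread_id = int(params["t"][0])
    except ValueError:
        return "<p>Notice: invalid thread id.</p>"
    posts = threads.get(thread_id)
    if posts is None:
        return "<p>Notice: no such thread.</p>"
    # Generate the page from the stored postings.
    return "<ul>" + "".join(f"<li>{escape(p)}</li>" for p in posts) + "</ul>"

print(show_thread("t=55417"))
print(show_thread(""))
print(show_thread("t=1"))
```

One function covers every thread, and the missing or unknown cases come back as a notice instead of a page-not-found error.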

Hope this helps.

-- 
rbmj

