
Re: Re: Please review changed man-file of w3m



Justin B Rye wrote:
> I'm re-merging some frayed threads here, skimming through and just
> answering the parts that still seem to be relevant.

I'm passing the new w3m(1) draft through "col -b".  Really to make this process
work I should be proposing revisions to some _input_ format like POD or Docbook
that then gets converted into a man page.
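
For anyone wanting to reproduce the plain-text draft quoted below, here is a minimal sketch of the "col -b" step. The helper name strip_overstrike is made up for illustration, and the sed fallback is only my approximation of what col does for simple bold and underline overstrikes.

```shell
# Strip nroff-style overstrikes (X, backspace, X for bold; _, backspace, X for
# underline) so that a formatted man page can be quoted as plain text.
strip_overstrike() {
    if command -v col >/dev/null 2>&1; then
        col -b
    else
        # Fallback for systems without col: delete every
        # "character + backspace" pair, keeping the last character written.
        bs=$(printf '\b')
        sed "s/.${bs}//g"
    fi
}

# Bold "NAME" as nroff would emit it for a terminal.
printf 'N\bNA\bAM\bME\bE\n' | strip_overstrike
```

In practice the input would come from something like "nroff -man w3m.1 | strip_overstrike", where w3m.1 stands for wherever the draft's source happens to live.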

> W3M(1)									W3M(1)
> 
> 
> 
> NAME
>        W3M - a text based web browser and pager
         ^^^
I think *here* we're meant to be talking about the command, "w3m".
 
> SYNOPSIS
>        w3m [OPTIONS|...  [ file | URL ]...
                     ^
Should that first "|" be a "]"?

I suppose if it's "optionS" then you don't need to say "..."...

> DESCRIPTION
>        W3M  has  been  developed  as a text based web client. [...]


The "has been developed" is a bit off grammatically, but more
importantly, it's talking about the development process when the
reader only cares about the successful result: it *is* a text based
web client.

>                                                                It displays HTML
>        documents stored local or on remote systems.  [...]

"It displays HTML" makes it sound as if it can only "view source".

"Local" would have to be "locally", but really this should say
something like:

         W3M is a text based browser which can display local or remote web pages
         as well as other documents.

(Also avoiding the repetition of "web" and leaving open the possibility
that it might also be a gopher/NNTP/whatever browser.)

>                                                       It renders HTML tables and
>        frames.	W3M  ignores JavaScript and Cascading Style Sheets. [...]

Cosmetic surgery:

                                     Its rendering engine can process HTML tables
         and frames, but it ignores JavaScript and Cascading Style Sheets.

>                                                                       It accepts
>        plain text from files or from standard input, serving as a  "pager"  in
>        much the same manner as "more" or "less".

This idea that it handles either HTML via HTTPS (browser mode) or plain text via
the commandline (pager mode) is simply wrong - it can also handle "w3m foo.html"
and browsing to http://example.org/foo.txt.  Of course, you're more *likely* to
hand it random text files on the commandline and web pages via a URL, but that's
a fact about how users work, not w3m!

                                                                           W3M can
         also serve as a general purpose browser and pager (like "more" or "less")
         for text files named as arguments, passed on standard input, or accessed
         via the net.

This still needs work since I'm not sure how close we should get to hinting at
the existence of "w3m ./" and "w3m gopher://foo".
 
>        W3M  organizes  content   in  buffers and tabs, allowing easy navigation
>        between.   Having the extra package with  w3m-img installed,  W3M  shows
>        graphics  within a page or in a new buffer. Whenever W3M's capabilities
>        to render HTML do no meet your needs, the target URL can be handed over
>        to a graphic browser by a single command.

You've added a few good ideas, but it needs a bit of reanglicisation.

         W3M  organizes  its content in  buffers or tabs, allowing easy navigation
         between. With the w3m-img extension installed, W3M can display inline
         graphics in web pages. And whenever W3M's HTML rendering capabilities
         do not meet your needs, the target URL can be handed over to a graphical
         browser with a single command.

(Talking in terms of the w3m-img "extension" should make this less
Debian-specific and easier to pass upstream.)
 
>        For help with runtime options, press "H" while running W3M.
> 
> 
> ARGUMENTS
>        If given one argument or more arguments, W3M works like a browser. Hav-
>        ing recieved a URL, the respective content is worked out  according  to
>        the  MIME type. With relative or absolute paths as argument, W3M relies
>        on filenames to display the content in an adequate manner.

Several English problems here, but more importantly, the number of command line
arguments is not what determines whether it acts like a browser.

         When given one or more command line arguments, W3M will attempt to work
         out the MIME type before deciding how to handle it.  Web URLs provide
         W3M with content-type headers; for relative or absolute file system
         paths, W3M relies on filenames.
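
To make "relies on filenames" concrete, here is a rough sketch of an extension-based lookup. This is not w3m's actual logic (which presumably consults its mime.types files); the function name guess_type, the handful of extensions, and the text/plain default are all assumptions for illustration.

```shell
# Hypothetical helper: map a file name to a MIME type by its extension alone.
guess_type() {
    case "$1" in
        *.html|*.htm) echo "text/html" ;;
        *.txt)        echo "text/plain" ;;
        *.png)        echo "image/png" ;;
        *)            echo "text/plain" ;;   # assumed default for unknown names
    esac
}

guess_type foo.html
guess_type README
```

The point is only that a local path carries no content-type header, so the name is all there is to go on.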
 
>        With no argument, W3M expects data from STDIN. Input given by the  user
>        may be necessary the control how these data processed further.

Try:
         With no argument, W3M expects data from STDIN, and unless assisted by
	 the user will have to use the default of "text/plain".
 
>        With  no  explicite targets and no stream of data from STDIN, W3M exits
>        with usage information.

No it doesn't!

         If provided with no target (and no fallback target - see for instance
         option -v below), W3M will exit with usage information.
 
> OPTIONS
>        Options are introduced with a dash and may take an argument

Groff distinguishes between "dash", "hyphen", and "minus", but insists for
reasons I have never understood that the ASCII "-" glyph used for options is
"minus".  See "http://manpages.debian.org/cgi-bin/man.cgi?query=groff_char".
There's even a Lintian error for using the wrong one:
	"https://lintian.debian.org/tags/hyphen-used-as-minus-sign.html"

It's cruel to expect users to understand this (especially given that in strict
Unicode terms, 0x2D = "-" is "hyphen-minus" and the true "minus sign" is a
completely separate glyph, 0x2212 = "−"), so I would suggest:

         Options must each be introduced by a single "-" character and may in
	 some cases take a parameter.
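
For the curious, the two characters really are distinct at the byte level. A small sketch, assuming the script itself is saved as UTF-8; hex_bytes is just an illustrative helper name.

```shell
# Print the bytes of a string as lowercase hex, with no separators.
hex_bytes() {
    printf '%s\n' "$(printf '%s' "$1" | od -An -tx1 | tr -d ' \n')"
}

hex_bytes '-'    # ASCII hyphen-minus, U+002D: one byte
hex_bytes '−'    # true minus sign, U+2212: three bytes in UTF-8
```

In the man page source itself, the option character is conventionally written as "\-" so that groff emits the ASCII glyph users can actually type.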

> 
>    Options to select predefined configurations and resources

That doesn't really work, but the category makes sense.  Maybe:

     Options overriding W3M resources (normally in ~/.w3m/; see FILES)

And I would be inclined to make it one of the last subsections.

>        -config file
> 	      use file instead of the default config file

(This works if "file" is appropriately highlighted)

>        -bookmark file
> 	      specifies another bookmark file to be used

For consistency:
  	      use file instead of the default bookmark.html file

> 
>        -reqlog
> 	      log headers of HTTP communication in file ~/.w3m/request.log

              use a request.log file to log network connection headers

>        -debug DO NOT USE

Why is it in this set?  Is it just trailing behind -reqlog?  I suppose
this last one fits, though:

>        -o option=value
> 	      modify one configuration item with an explicitely given value

A reminder: it's "explicit", not "explicite".  And why not just say:
         -o option=value
  	      explicitly set one configuration item

Or if we're being encyclopaedic
  	      explicitly set one configuration item (per invocation of -o)
but we're allowed to leave a few details to the MANUAL.html...

Now, alphabetise them:

     Options overriding W3M resources (normally in ~/.w3m/; see FILES)

         -bookmark file
  	      use file instead of the default bookmark.html file

         -config file
  	      use file instead of the default config file

         -debug
              DO NOT USE
 
         -o option=value
  	      explicitly set one configuration item

         -reqlog
              use a request.log file to log network connection headers
 
>    Tuning W3M / Options to tune Program-User-Interaction /  Man-Machine-Inter-
>        face

Almost anything could be "tuning its interface"; what these have in
common is that they affect the persistent Textual User Interface.

An alternative approach would be to move this section so that it's
 1) web-browser-mode options
 2) text-pager-mode options
 3) generic textual-UI options

>        -title use the buffer name as terminal title string.
> 
>        -title=TERM
> 	      use  the	buffer name as terminal title string. TERM style title
> 	      configuration is used
> 	      Implementation not verified

I think the "=screen" test verifies that it works as it says; we just happen to
be short of use cases.

>        -no-mouse
> 	      do not use mouse

Or "deactivate mouse support", since "using the mouse" means pushing
it around my desk.

> 
>        -num   display each line's number

Strictly speaking this also affects material sent to STDOUT, but I'll
ignore that for now.
 
>        -M     monochrome display
> 
>        -W     toggle wrapping in searches
> 
>        -X     do not initialize/deinitilize the terminal
                                         ^a
> 	      Implementation not verified

I think this is another one where we've verified it to the limits of
my curiosity.  At least compared to the following two mysteries:

> 
>        -ppc N width of N pixels per character. Range of 4.0 to	32.0,  default
> 	      8.0.  Larger values will make tables narrower.
> 	      Implementation not verified
> 
>        -ppl N height of N pixels per line. Range of 4.0 to 64.0.
> 	      Implementation not verified

Okay, a whole section I'm mostly happy with.  But alpha-sorting (I'm
going to assume it's aAbBcC):

     Textual User Interface tuning options:

         (possibly import "-B" here)

         -M 
              monochrome display

         -no-mouse
 	      deactivate mouse support

         -num 
              display each line's number

         -N
              distribute multiple command line arguments to tabs.
              By default, a stack of buffers is used.

         -ppc N
              width of N pixels per character. The range is 4.0 to 32.0, default
              8.0 - larger values will make tables narrower.
              (Implementation not verified)
 
         -ppl N
              height of N pixels per line. The range is 4.0 to 64.0.
              (Implementation not verified)

         -title, -title=TERM
              use the buffer name as terminal title string.  TERM if specified
 	      sets the title configuration style as for that TERM.

         (possibly import "-v" here)

         -W
              toggle wrapping in searches

         -X 
              do not initialize/deinitialize the terminal

(I have imported -N from the following section; I also consider the
possibility of importing +<N>.)
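
Since plain "sort" won't give the aAbBcC order I assumed, here is one way it could be produced mechanically. It is a sketch only: sort_aAbB is a made-up name, and it deliberately says nothing about where digits such as -4 and -6 should go.

```shell
# Sort lines so that each lowercase letter comes immediately before its
# uppercase form (a A b B c C ...), comparing character by character.
sort_aAbB() {
    while IFS= read -r opt; do
        # Build a sort key in which the desired order matches plain byte order.
        key=$(printf '%s' "$opt" | LC_ALL=C tr \
            'aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ' \
            'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz')
        printf '%s\t%s\n' "$key" "$opt"
    done | LC_ALL=C sort | cut -f2
}

printf '%s\n' -W -num -N -M -no-mouse -X -title -ppc -ppl | sort_aAbB
```

This reproduces the order used in the list above: -M, -no-mouse, -num, -N, -ppc, -ppl, -title, -W, -X.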

>    Command line options for a browser-like usage

You mean something like "web browsing behaviour options".

>        -F     render frames
> 
>        -N     distribute  the contents passed with multiple command line argu-
> 	      ments to tabs. By default, a stack of buffers is used.

This isn't specific to HTTP, or web pages... yes, it's a feature that
first appeared in web browsers, but now in effect it's a basic TUI
feature and deserves a place in the preceding Section.

> 
>        -cookie, -no-cookie
> 	      accept and use cookies, neither accept nor use cookies

What's the difference between accepting and using cookies, anyway?

> 
>        -graph, -no-graph
> 	      use or do not use graphic characters for borders of  frames  and
> 	      tables

Almost a TUI feature, but web-specific enough to stay here.

>        -no-proxy
> 	      do not use proxy
> 
>        -4     IPv4 only (equivalent to -o dns_order=4)
> 
>        -6     IPv6 only (equivalent to -o dns_order=6)
> 
>        -m     Internet message mode
> 	      Implementation not verified

If it can't be used on email then maybe we should be more specific and
call it "USENET message mode", but it's still too hazy to say.

Again these mostly just need to be sorted.  Uh, which end do numbers
go?

         Internet browser mode options

         (possibly import "-cols N" here?)

         -cookie, -no-cookie
 	      support (or do not support) the use of HTTP cookies
 
         -F
              render HTML frames
 
         -graph, -no-graph
 	      support (or do not support) the use of graphical characters for
              drawing HTML table and frame borders

         (possibly import "-header string" here?)

         -m
              Internet message mode, taking message headers into account to
	      determine content-type.
 	      (Implementation not verified)

         -no-proxy
	      do not use a proxy
 
         (just conceivably import "-post file" here?)

         -4
              IPv4 only (equivalent to -o dns_order=4)
 
         -6
              IPv6 only (equivalent to -o dns_order=6)
 
>    Command line options for a pager-like usage

These are "text-handling" options, but yes, here treating it as "web
browser mode" versus "text pager mode" makes sense.

>        -r     ignore  underline  or  bolding  markup  constructions  that  use
> 	      backspace (e.g. in nroff)

Now I need to look up which way round I eventually decided it was.
 
>        -s     squeeze multiple blank lines into one
> 
>        -t N   set tab width to N columns. No effect on STDOUT

This raises the issue that "pager-like usage" shouldn't have a STDOUT,
but -r and -s are general text-processing options that work on
material sent to STDOUT as well as within the "pager".  Oh well.

         Text pager mode options

         (possibly import "-l N" here?)

         -r
                use caret notation to display special escape characters in text
                (such as ANSI escapes or nroff-style backspaces) instead of
                processing them as colored or otherwise highlighted text.
 
         -s
                squeeze multiple blank lines into one
  
         -t N
                set tab width to N columns. No effect on text sent to STDOUT.


>    Options to control treatment of data input and output
>        -I charset
> 	      user defined character encoding of input data
> 
>        -O charset
> 	      user defined character encoding of output data
> 
>        -T type
> 	      explicit characterization of input data by MIME type

All fair enough.
 
>    Command line options for a special startup
>
>        -v     allow start with no defined input via STDIN, file or URL

That sounds as if it means it's getting undefined input from one of
those sources.

         -v
                with no other target defined, use the built-in W3M
                start-up page

>        -B     start with default bookmark file ~/.w3m/bookmark.html

It doesn't necessarily use that one; "-B -bookmark" effectively
cancels out, so "w3m -B bookmark /etc/fstab" does the same thing as
"w3m /etc/fstab".

         -B
                with no other target defined, use the bookmark page.
 
Those are the two "special targeting" options; the rest are "special
exit" options.


>        -show-option
> 	      show all available config options
> 
>        -help  show a summary of compiled-in and runtime options
> 
>        -version
> 	      show the version of W3M ()

"()"?

Maybe we could put -v and -B in the generic-TUI-options section as

         -B
                special start-up: with no other target defined, use the
                bookmark page.
         [...]
         -v
                special start-up: with no other target defined, use the
                built-in W3M start-up page

and then merge the special-exits into the following section.
 
>    Options for instant data requests

"Non-persistent data output mode options"?  That's terrible, but it's
the best I can think of at the moment.

>        -dump  dump rendered page into STDOUT
> 
>        -dump_source
> 	      dump the page's source code into STDOUT
> 
>        -dump_head
> 	      dump response of a HEAD request for a URL into STDOUT
> 
>        -dump_both
> 	      dump HEAD, and source code for a URL into STDOUT
> 
>        -dump_extra
> 	      dump HEAD, source code, and extra information  for  a  URL  into
> 	      STDOUT ()

"()"?

Merging the two sections would add -help, -show-option and -version
on the end.

>    Miscelleneous command line options
     Miscellaneous

>        +N     go  to  line  N;	only effective for N larger than the number of
> 	      lines in the terminal

Maybe in "generic TUI options"?  (Make sure the highlighting is enough
to let readers distinguish "-N" from "+N", i.e. "+<N>".)

>        -cols N
> 	      combined with -dump, HTML input is rendered to lines of N  char-
> 	      acters. Affects only STDOUT

Not necessarily combined with -dump; it also works on any HTML
rendered into a pipe.  Oh, interestingly, even "w3m -v", even though
that was never HTML.  But I think that corner-case is obscure enough
to ignore!

If all the others get rehoused, this could live in "web browsing mode"
(just because it deals with the HTML rendering engine).
 
>        -l N   number  of  lines preserved internally when recieving plain text
> 	      from STDIN (default 10 000)

(It's "-ceiv-".)  Wouldn't this fit nicely in "text pager mode"?

>        -post file
> 	      use POST method with file content
> 	      Implementation not verified

Now fully verified!  But hard to find a use for.  This could just
about go under "web browsing mode".
  
>        -header string
> 	      APPEND string to the HTTP(S)  request.  Expected	to  match  the
> 	      header syntax  Variable: Value

I think this could go in the "web browsing mode" section.

> EXAMPLES
>        Pager-like usage of W3M | W3M as a text pager
> 	      Combine snippets of html code and preview the page
> 	      $ cat header.html footer.html | w3m -T text/html fP

I think that line-final fP indicates an nroff error.

So the "text pager" example renders a webpage?  That would be less
obtrusive if you had a second example.  Maybe something like

              Compare two files using "tabbed pager" mode
              $ w3m -N config.old config

>        Browser-like usage of W3M | W3M as an HTML renderer
> 	      Display web content in monochromous terminals

Just "in a monochrome terminal" (another kind of hardware I'm glad to
have finally thrown away).

> 	      $ w3m -M http://w3m.sourceforge.net
> 	      Display embedded graphics
> 	      w3m -o auto_image=TRUE http://w3m.sourceforge.net
              ^
You're being inconsistent about the leading prompt-style $.
 
>        Filter-like usage of W3M

Different indent.  Another nroff error?
 
>        Convert	an  HTML  file	to  a plain text file with a defined length of
>        lines

Just say "a defined line length".

>        $ w3m -dump -cols 40 foo.html >foo.txt
> 
>        Convert an HTML to plain text and append the contained links
> 	w3m  -dump  -o	display_link_number=1	http://w3m.sourceforge.net   >
>        index.txt

Useless use of -dump!  Nice example, though.

         Convert a web page to text with appended links
         $ w3m -o display_link_number=1 http://w3m.sourceforge.net > index.txt
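
The reason -dump is redundant there is that the output is redirected. I assume w3m makes roughly the following check at start-up, though I haven't read the source; output_mode is a hypothetical stand-in, not anything w3m provides.

```shell
# Hypothetical stand-in for the decision a program makes about its output:
# a terminal on stdout means interactive use, anything else means plain output.
output_mode() {
    if [ -t 1 ]; then
        echo "interactive"
    else
        echo "dump"
    fi
}

output_mode | cat     # stdout is a pipe here, so this prints "dump"
```

Run bare in a terminal, the same function reports "interactive", which is the case where -dump would actually change anything.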
 
>        Conversion of file format and character encoding
>        $ cat foo.html | w3m -dump -T text/html -I EUC-JP -O UTF-8 >foo.txt

Ditto (and a UUOC, but I think you can get away with one).

(And the file should be called "フ.html"!)
 
>        Special startups
> 	      Start with no input, suitable as predefined command to configure
> 	      in window manager menus w3m -v
> 	      Start with a preferred set of bookmark for special purposes
> 	      w3m -B -bookmark links-1.html

As I said, "-B -bookmark" effectively cancels out.

        Crazy stuff we don't want to have to explain
             Usenet newsreader mode (requires a server)
             $ w3m -m nntp://$NEWSSERVER/debian_curiosa/
             ...

At this point I was idly considering an ERRORS section, and
considering the fact that
	     
	     $ w3m http://example.org/nonesuch >/dev/null

exits happily despite getting a 404.  But wait, apparently the IETF
set up a server on example.org so that page gets a 200 response!  Now
it doesn't work as an example - we would have to use something like
"w3m http://ietf.org/nonesuch >/dev/null".

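If an ERRORS section did appear, a script author's workaround might look something like this. It is a sketch under two assumptions: that "w3m -dump_head" prints the HTTP status line first, and that a 2xx code is what counts as success. http_ok is a made-up helper name.

```shell
# Read HTTP response headers on stdin; succeed only if the status line
# (assumed to be the first line) carries a 2xx code.
http_ok() {
    # Drop carriage returns, then take the second field of the first line.
    code=$(tr -d '\r' | awk 'NR == 1 { print $2 }')
    case "$code" in
        2??) return 0 ;;
        *)   return 1 ;;
    esac
}

# Offline demonstration with a canned header:
printf 'HTTP/1.1 404 Not Found\r\n' | http_ok || echo "request failed"
```

With a network connection the intended use would be "w3m -dump_head http://ietf.org/nonesuch | http_ok".
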
  ENVIRONMENT

  If it finds a variable WWW_HOME in the environment, W3M will use its contents
  as a last fallback target, so that

             $ export WWW_HOME=http://localhost/index.html
             $ w3m

  goes to the local web server instead of exiting with a usage message.
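
  One caveat worth spelling out: the variable has to be in the environment,
  not just set in the shell. A quick sketch, where "sh -c" merely stands in
  for w3m and show_child_env is a made-up helper:

```shell
# "sh -c" stands in for w3m here: it reports what a child process sees.
show_child_env() {
    sh -c 'printf "%s\n" "${WWW_HOME:-unset}"'
}

unset WWW_HOME
WWW_HOME=http://localhost/index.html     # shell variable only
show_child_env                           # prints "unset"

export WWW_HOME                          # now part of the environment
show_child_env                           # prints the URL
```

  The one-shot form "WWW_HOME=http://localhost/index.html w3m" also works,
  since a prefix assignment is passed to that single command.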

> FILES
>        ~/.w3m/config
> 	      user defined configuration file, overrides $/etc/w3m/config
                                             ;
>        ~/.w3m/keymap
> 	      user defined key bindings; overrides default key bindings
> 
>        ~/.w3m/menu
> 	      user defined menu, overrides default menu
                               ;
>        ~/.w3m/mouse
> 	      user defined mouse settings
> 
>        ~/.w3m/cookie
> 	      cookie jar; written on exit, read on launch
> 
>        ~/.w3m/history
> 	      browser history - visited files and URLs
> 
>        ~/.w3m/passwd
> 	      password and username file
> 
>        ~/.w3m/pre_form
> 	      contains predefined values to fill recurrent HTML forms
> 
>        ~/.w3m/mailcap
> 	      external viewer configuration file
> 
>        ~/.w3m/mime.types
> 	      MIME types file
> 
> NOTES
>        This is the W3M 0.5.3 Release.
> 
> SEE ALSO
>        README and example files are to be found in the doc directory  of  your
>        W3M  installation.  Recent  information	about  W3M may be found on the
>        project's web pages at http://w3m.sourceforge.net
> 
> ACKNOWLEDGMENTS
>        W3M has incorporated code from several sources.	Users have contributed
>        patches and suggestions over time.
> 
> AUTHOR
>        Akinori ITO <aito@fw.ipsj.or.jp>
> 
> 
> 
> 4th Berkeley Distribution	  2014-10-31				W3M(1)

For a while there I didn't think I'd make it to here today.
-- 
JBR	with qualifications in linguistics, experience as a Debian
	sysadmin, and probably no clue about this particular package

