
Re: [debian-reference manual] bullet point sign and its content misplaced



On Wed, 24 Jun 2020 Finn wrote:
Charles Curley wrote:

Before you do that, are you sure what you see isn't Firefox's reaction
to buggy HTML? Have you run the code through an HTML validator?

Thanks for the reminder. The HTML validator shows only one error,
concerning the "width" attribute[1].

[1]:
https://validator.w3.org/check?uri=https%3A%2F%2Fwww.debian.org%2Fdoc%2Fmanuals%2Fdebian-reference%2Fch10.en.html%23_removable_storage_device&charset=%28detect+automatically%29&doctype=Inline&group=0

I see this:

 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
 <html xmlns="http://www.w3.org/1999/xhtml">
                                     ^^^^^

and I see a lot of empty elements written in combined start-end form,
like this,

  <tagname attribs />

but without any space preceding the forward slash at the end of the tag.

And by "a lot", I mean one dozen and two hundred:

  $ grep -o '<[^<]*[^[:blank:]]/>' ch10.en.html | wc -l # NB[1]
  212

  $ grep -o '<[^<]*[^[:blank:]]/>' ch10.en.html |
  >  grep -o "^<[[:alpha:]]*" |
  >  tr -d "<" | sort | uniq -c
     67 a
     15 br
     54 col
      2 hr
     67 img
      5 link
      2 meta
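
For comparison, the same pattern can be flipped to count the tags that
do have a blank before the slash. A minimal sketch, run against a
made-up sample file rather than ch10.en.html (the file contents and the
variable names are assumptions for illustration only):

```shell
# Hypothetical sample input; not taken from ch10.en.html.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<p>one<br/>two<br />three</p>
<img src="kumquat.gif"/>
<hr /><col/>
EOF

# Empty-element tags with no blank before the slash (same pattern as above).
tight=$(grep -o '<[^<]*[^[:blank:]]/>' "$sample" | wc -l | tr -d ' ')

# Empty-element tags with a blank immediately before the slash.
spaced=$(grep -o '<[^<]*[[:blank:]]/>' "$sample" | wc -l | tr -d ' ')

echo "tight=$tight spaced=$spaced"
rm -f "$sample"
```

The caveat in note 1 applies here too: this counts text patterns, not
parsed elements.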

In ye olde book about xhtml[2], it is written:

 Section 16.3.3  Handling Empty Elements

  In XML, and thus XHTML, every tag must have a corresponding end
  tag---even those that aren't allowed to contain other tags or
  content. Accordingly, XHTML expects the line break to appear as
  <br></br> in your document. Ugh.

  Fortunately, there is an acceptable alternative: include a slash
  before the closing bracket of the tag to indicate its ending (e.g.,
  <br />). If the tag has attributes, the slash comes after all the
  attributes so that an image could be defined as:

      <img src="kumquat.gif" />

  While this notation may seem foreign and annoying to an HTML
  author, it actually serves a useful purpose. Any XHTML element that
  has no content can be written this way. Thus, an empty paragraph
  can be written as <p />, and an empty table cell can be written as
  <td />. This is a handy way to mark empty table cells.

  Clever as it may seem, writing empty tags in this abbreviated way
  may confuse HTML browsers. To avoid compatibility problems, you can
  fool the HTML browsers by placing a space before the forward slash
  in an empty element using the XHTML version of its end tag. For
  example, use <br />, with a space between the "br" and '/', instead
  of the XHTML equivalents <br/> and <br></br>. Table 16-1 contains
  all of the empty HTML tags, expressed in their acceptable XHTML
  (transitional DTD) forms.

 Table 16-1. *HTML empty tags in XHTML format*

  <area />     <base />   <basefont />
  <br />       <col />    <frame />
  <hr />       <img />    <input />
  <isindex />  <link />   <meta />
  <param />
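
The book's advice could be applied mechanically. A rough sketch with
sed, using a made-up input string (the string and the variable name are
assumptions); like the grep above, it treats the markup as plain text
and would also rewrite any "/>" inside comments or scripts, so it is an
illustration rather than a safe fix:

```shell
# Insert one space before "/>" wherever the preceding character is not
# already a blank. The input string is a made-up example.
fixed=$(printf '%s\n' '<br/><img src="kumquat.gif"/><hr />' |
  sed 's|\([^[:blank:]]\)/>|\1 />|g')
echo "$fixed"
```

Tags that already have the space, such as the <hr /> in the example,
are left untouched.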


NOTES

1. "Every time you attempt to parse HTML with regular expressions, the
    unholy child weeps the blood of virgins, and Russian hackers pwn
    your webapp."
   stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454

2. Musciano and Kennedy 2007, "HTML & XHTML: The Definitive Guide" (6th ed.)


--
Firstly, you must always implicitly obey orders, without attempting to
form any opinion of your own respecting their propriety. Secondly, you
must consider every man your enemy who speaks ill of your king; and
thirdly, you must hate a Frenchman, as you do the devil. --H. Nelson

