
Bug#467082: marked as done (bibtex2html: Accents lexing/parsing)



Your message dated Tue, 19 Aug 2008 21:47:03 +0000
with message-id <E1KVZ2d-0004qU-Nx@ries.debian.org>
and subject line Bug#467082: fixed in bibtex2html 1.92-1
has caused the Debian Bug report #467082,
regarding bibtex2html: Accents lexing/parsing
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
467082: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=467082
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: bibtex2html
Version: 1.91-1
Severity: minor


Hi,
while using bib2bib I discovered that the õ character was not handled, so
I added it to latex_accents.mll.
I also made the following changes to it:
- I added other Latin-1 diacritics (Ç, Ã, etc.).
- I removed the "\\I" "letters": to my knowledge only \i exists, and its
  purpose is to remove the dot above the "i". There is no need for a \I, as
  "I" already lacks this dot.
- I added "\\i}" because the lexer was not able to handle entries like:
 author = {Col{\"\i}n},
 for instance. The first "{" is consumed by next_char, but once "\\"" has
 been lexed, quote_char does not know about "\\i}", hence my addition.
- I also added the "{I}" char.
I hope I did not misinterpret the inner workings of latex_accents.mll; see
the attached diff.
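
To make the list of accepted spellings concrete, here is a minimal sketch in
plain OCaml. It is not ocamllex and not the code of latex_accents.mll: the
helper names starts_with, count_spaces and i_form_length are made up for
illustration. It only mirrors the alternation
('i'|"{i}"|"\\i" space+|"{\\i}"|"\\i}") that the diff adds to each accent
rule, and reports how many characters the matching spelling would consume.

```ocaml
(* Hypothetical helper: does [s] contain [prefix] at offset [pos]? *)
let starts_with s pos prefix =
  let n = String.length prefix in
  pos + n <= String.length s && String.sub s pos n = prefix

(* Number of consecutive spaces in [s] starting at offset [pos]. *)
let rec count_spaces s pos =
  if pos < String.length s && s.[pos] = ' ' then 1 + count_spaces s (pos + 1)
  else 0

(* Length of the "i" spelling found at [pos], i.e. right after an accent
   command such as \", or None if no spelling matches. The alternatives
   cannot overlap, so the order of the tests does not matter. *)
let i_form_length s pos =
  if starts_with s pos "{\\i}" then Some 4
  else if starts_with s pos "\\i}" then Some 3
  else if starts_with s pos "{i}" then Some 3
  else if starts_with s pos "\\i" && count_spaces s (pos + 2) > 0 then
    Some (2 + count_spaces s (pos + 2))
  else if starts_with s pos "i" then Some 1
  else None
```

For the entry above, the text left after the opening "{" and the \" command
is \i}n, so i_form_length "Col{\\\"\\i}n" 6 selects the "\\i}" alternative,
the one that was missing before the patch.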

On that note, I also discovered that fields like:
author = {Tr{\" e}ma and Cl{\' e}s},
were not correctly matched by a regex condition. One of the causes seems to
be that latex_accents.mll does not take inner spaces into account. Other
experiments also seem to point to something in condition_lexer and/or
bibtex_lexer, although I'm far from sure.
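
One way the inner spaces could be tolerated is sketched below as a
preprocessing pass in plain OCaml. This is only an assumption about a
possible approach, not what latex_accents.mll does and not necessarily how
upstream addressed it; the names is_accent_char and strip_accent_spaces are
hypothetical. The pass drops the spaces that directly follow one of the
accent commands \" \' \` \^ \~, so that {\" e} reaches the lexer as {\"e}.

```ocaml
(* The characters that form an accent command when preceded by a backslash. *)
let is_accent_char c = c = '"' || c = '\'' || c = '`' || c = '^' || c = '~'

(* Remove the spaces that directly follow an accent command.
   All other characters, including spaces between words, are kept. *)
let strip_accent_spaces s =
  let n = String.length s in
  let buf = Buffer.create n in
  (* Offset of the first non-space character at or after [i]. *)
  let rec skip i = if i < n && s.[i] = ' ' then skip (i + 1) else i in
  let rec go i =
    if i >= n then ()
    else if s.[i] = '\\' && i + 1 < n then begin
      (* Copy the backslash and the character it escapes as a pair. *)
      Buffer.add_char buf s.[i];
      Buffer.add_char buf s.[i + 1];
      go (if is_accent_char s.[i + 1] then skip (i + 2) else i + 2)
    end else begin
      Buffer.add_char buf s.[i];
      go (i + 1)
    end
  in
  go 0;
  Buffer.contents buf
```

Applied to the field above, the pass turns Tr{\" e}ma and Cl{\' e}s into
Tr{\"e}ma and Cl{\'e}s while leaving the spaces around "and" alone.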

I got very confused between the OCaml escaping of characters, the escaping
I had to do in my shell, the escaping in the regex, and all the lexers,
so I will not attempt to touch it and will trust upstream here :-)

-- System Information:
Debian Release: lenny/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: i386 (i686)

Kernel: Linux 2.6.22
Locale: LANG=fr_FR.UTF-8, LC_CTYPE=fr_FR.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages bibtex2html depends on:
ii  ocaml-base-nox [ocaml-base-no 3.10.0-13  Runtime system for ocaml bytecode 
ii  perl                          5.8.8-12   Larry Wall's Practical Extraction 
ii  texlive-base                  2007-13    TeX Live: Essential programs and f

bibtex2html recommends no packages.

-- no debconf information
--- latex_accents.mll.backup	2008-02-22 19:09:59.000000000 +0100
+++ latex_accents.mll	2008-02-22 20:03:46.000000000 +0100
@@ -37,7 +37,13 @@
   | '{'                           { next_char lexbuf }
   | '}'                           { next_char lexbuf }
   | 'ç' { add_string "&ccedil;" ; next_char lexbuf }
+  | 'Ç' { add_string "&Ccedil;" ; next_char lexbuf }
   | 'ñ' { add_string "&ntilde;"; next_char lexbuf }
+  | 'Ñ' { add_string "&Ntilde;"; next_char lexbuf }
+  | 'ã' { add_string "&atilde;"; next_char lexbuf }
+  | 'Ã' { add_string "&Atilde;"; next_char lexbuf }
+  | 'õ' { add_string "&otilde;"; next_char lexbuf }
+  | 'Õ' { add_string "&Otilde;"; next_char lexbuf }
   | 'ä' { add_string "&auml;"; next_char lexbuf }
   | 'ö' { add_string "&ouml;"; next_char lexbuf }
   | 'ü' { add_string "&uuml;"; next_char lexbuf }
@@ -90,25 +96,27 @@
 | '`'                { left_accent lexbuf }
 | '^'                { hat lexbuf }
 | "c{c}"             { add_string "&ccedil;" ; next_char lexbuf }
+| "c{C}"             { add_string "&Ccedil;" ; next_char lexbuf }
 | 'v'                { czech lexbuf }
-| ("~n"|"~{n}")      { add_string "&ntilde;"; next_char lexbuf  }
+| '~'                { tilde lexbuf }
 |  _                 { add_string "\\" ; add lexbuf ; next_char lexbuf  }
 | eof                { add_string "\\" }
 
 (* called when we have seen  "\\\""  *)
 and quote_char = parse
-  ('a'|"{a}")                   { add_string "&auml;" ; next_char lexbuf }
-| ('o'|"{o}")                   { add_string "&ouml;" ; next_char lexbuf }
-| ('u'|"{u}")                   { add_string "&uuml;" ; next_char lexbuf }
-| ('e'|"{e}")                   { add_string "&euml;" ; next_char lexbuf }
-| ('A'|"{A}")                   { add_string "&Auml;" ; next_char lexbuf }
-| ('O'|"{O}")                   { add_string "&Ouml;" ; next_char lexbuf }
-| ('U'|"{U}")                   { add_string "&Uuml;" ; next_char lexbuf }
-| ('E'|"{E}")                   { add_string "&Euml;" ; next_char lexbuf }
-| ("\\i" space+|"{\\i}")        { add_string "&iuml;" ; next_char lexbuf }
-| ('I'|"\\I" space+|"{\\I}")    { add_string "&Iuml;" ; next_char lexbuf }
-| _                             { add_string "\\\"" ; add lexbuf }
-| eof                           { add_string "\\\"" }
+  ('a'|"{a}")   { add_string "&auml;" ; next_char lexbuf }
+| ('o'|"{o}")   { add_string "&ouml;" ; next_char lexbuf }
+| ('u'|"{u}")   { add_string "&uuml;" ; next_char lexbuf }
+| ('e'|"{e}")   { add_string "&euml;" ; next_char lexbuf }
+| ('A'|"{A}")   { add_string "&Auml;" ; next_char lexbuf }
+| ('O'|"{O}")   { add_string "&Ouml;" ; next_char lexbuf }
+| ('U'|"{U}")   { add_string "&Uuml;" ; next_char lexbuf }
+| ('E'|"{E}")   { add_string "&Euml;" ; next_char lexbuf }
+| ('i'|"{i}"|"\\i" space+|"{\\i}"|"\\i}")        
+                { add_string "&iuml;" ; next_char lexbuf }
+| ('I'|"{I}")   { add_string "&Iuml;" ; next_char lexbuf }
+| _             { add_string "\\\"" ; add lexbuf }
+| eof           { add_string "\\\"" }
 
 (* called when we have seen  "\\'"  *)
 and right_accent = parse
@@ -120,9 +128,10 @@
 | ('O'|"{O}")   { add_string "&Oacute;" ; next_char lexbuf }
 | ('U'|"{U}")   { add_string "&Uacute;" ; next_char lexbuf }
 | ('E'|"{E}")   { add_string "&Eacute;" ; next_char lexbuf }
-| ('\'')   { add_string "&rdquo;" ; next_char lexbuf }
-| ('i'|"\\i" space+|"{\\i}") { add_string "&iacute;" ; next_char lexbuf }
-| ('I'|"\\I" space+|"{\\I}") { add_string "&Iacute;" ; next_char lexbuf }
+| ('\'')        { add_string "&rdquo;" ; next_char lexbuf }
+| ('i'|"{i}"|"\\i" space+|"{\\i}"|"\\i}") 
+                { add_string "&iacute;" ; next_char lexbuf }
+| ('I'|"{I}")   { add_string "&Iacute;" ; next_char lexbuf }
 | _             { add_string "\\'" ; add lexbuf ; next_char lexbuf }
 | eof           { add_string "\\'" }
 
@@ -136,12 +145,14 @@
 | ('O'|"{O}")   { add_string "&Ograve;" ; next_char lexbuf }
 | ('U'|"{U}")   { add_string "&Ugrave;" ; next_char lexbuf }
 | ('E'|"{E}")   { add_string "&Egrave;" ; next_char lexbuf }
-| ('`')   { add_string "&ldquo;" ; next_char lexbuf }
-| ('i'|"\\i" space+ |"{\\i}") { add_string "&igrave;" ; next_char lexbuf }
-| ('I'|"\\I" space+ |"{\\I}") { add_string "&Igrave;" ; next_char lexbuf }
+| ('`')         { add_string "&ldquo;" ; next_char lexbuf }
+| ('i'|"{i}"|"\\i" space+ |"{\\i}"|"\\i}") 
+                { add_string "&igrave;" ; next_char lexbuf }
+| ('I'|"{I}")   { add_string "&Igrave;" ; next_char lexbuf }
 | _             { add_string "\\`" ; add lexbuf ; next_char lexbuf }
 | eof           { add_string "\\`" }
 
+(* called when we have seen "\\^"  *)
 and hat = parse
   ('a'|"{a}")   { add_string "&acirc;" ; next_char lexbuf }
 | ('o'|"{o}")   { add_string "&ocirc;" ; next_char lexbuf }
@@ -151,18 +162,32 @@
 | ('O'|"{O}")   { add_string "&Ocirc;" ; next_char lexbuf }
 | ('U'|"{U}")   { add_string "&Ucirc;" ; next_char lexbuf }
 | ('E'|"{E}")   { add_string "&Ecirc;" ; next_char lexbuf }
-| ('i'|"\\i" space+ |"{\\i}") { add_string "&icirc;" ; next_char lexbuf }
-| ('I'|"\\I" space+ |"{\\I}") { add_string "&Icirc;" ; next_char lexbuf }
+| ('i'|"{i}"|"\\i" space+ |"{\\i}"|"\\i}") 
+                { add_string "&icirc;" ; next_char lexbuf }
+| ('I'|"{I}")   { add_string "&Icirc;" ; next_char lexbuf }
 | _             { add_string "\\^" ; add lexbuf ; next_char lexbuf }
 |  eof          { add_string "\\^" }
 
+(* called when we have seen "\\~"  *)
+and tilde = parse
+  ('a'|"{a}")   { add_string "&atilde;" ; next_char lexbuf }
+| ('o'|"{o}")   { add_string "&otilde;" ; next_char lexbuf }
+| ('A'|"{A}")   { add_string "&Atilde;" ; next_char lexbuf }
+| ('O'|"{O}")   { add_string "&Otilde;" ; next_char lexbuf }
+| ('n'|"{n}")   { add_string "&ntilde;" ; next_char lexbuf }
+| ('N'|"{N}")   { add_string "&Ntilde;" ; next_char lexbuf }
+| _             { add_string "\\~" ; add lexbuf ; next_char lexbuf }
+|  eof          { add_string "\\~" }
+
+(* called when we have seen "\\v"  *)
 and czech = parse
   ('r'|"{r}")   { add_string "&#X0159;" ; next_char lexbuf }
 | ('R'|"{R}")   { add_string "&#X0158;" ; next_char lexbuf }
 | ('s'|"{s}")   { add_string "&#X0161;" ; next_char lexbuf }
 | ('S'|"{S}")   { add_string "&#X0160;" ; next_char lexbuf }
-| ('i'|"\\i" space+ |"{\\i}") { add_string "&#X012D;" ; next_char lexbuf }
-| ('I'|"\\I" space+ |"{\\I}") { add_string "&#X012C;" ; next_char lexbuf }
+| ('i'|"{i}"|"\\i" space+ |"{\\i}"|"\\i}") 
+                { add_string "&#X012D;" ; next_char lexbuf }
+| ('I'|"{I}")   { add_string "&#X012C;" ; next_char lexbuf }
 | _             { add_string "\\^" ; add lexbuf ; next_char lexbuf }
 |  eof          { add_string "\\^" }
 

--- End Message ---
--- Begin Message ---
Source: bibtex2html
Source-Version: 1.92-1

We believe that the bug you reported is fixed in the latest version of
bibtex2html, which is due to be installed in the Debian FTP archive:

bibtex2html_1.92-1.diff.gz
  to pool/main/b/bibtex2html/bibtex2html_1.92-1.diff.gz
bibtex2html_1.92-1.dsc
  to pool/main/b/bibtex2html/bibtex2html_1.92-1.dsc
bibtex2html_1.92-1_all.deb
  to pool/main/b/bibtex2html/bibtex2html_1.92-1_all.deb
bibtex2html_1.92.orig.tar.gz
  to pool/main/b/bibtex2html/bibtex2html_1.92.orig.tar.gz



A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to 467082@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Ralf Treinen <treinen@debian.org> (supplier of updated bibtex2html package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@debian.org)


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Format: 1.8
Date: Tue, 12 Aug 2008 01:43:30 +0200
Source: bibtex2html
Binary: bibtex2html
Architecture: source all
Version: 1.92-1
Distribution: unstable
Urgency: low
Maintainer: Debian OCaml Maintainers <debian-ocaml-maint@lists.debian.org>
Changed-By: Ralf Treinen <treinen@debian.org>
Description: 
 bibtex2html - filters BibTeX files and translates them to HTML
Closes: 467082
Changes: 
 bibtex2html (1.92-1) unstable; urgency=low
 .
   * New upstream version. This release fixes a bug with accent parsing
     and conversion (closes: Bug#467082).
   * Adapted patch 03_charset to new upstream version.
   * Standards-Version 3.8.0  (no change).
Checksums-Sha1: 
 74f3f3f8c5cf159ea2641af084624386458e56b4 1511 bibtex2html_1.92-1.dsc
 37b95ed2d9427f0289939d46af6839453db60794 69800 bibtex2html_1.92.orig.tar.gz
 beaeff49cf9c8c732ed811587618c2924309bd37 11709 bibtex2html_1.92-1.diff.gz
 7c0cf5734293d946807cba5d6c65c0eab35ef7c4 135772 bibtex2html_1.92-1_all.deb
Checksums-Sha256: 
 935bcedb8f6ca00e1f3e79a6824891a45900c694c7bbf1090084fbc8bc76c2aa 1511 bibtex2html_1.92-1.dsc
 3410acb7c01871a48fb4b483a3d93ade49e7fde2ce6d2c19daa3733c734caaea 69800 bibtex2html_1.92.orig.tar.gz
 32ef2f635c3a36ea705cafe2b08258e611fa20f1692888503ef1cfaad0d7d6c5 11709 bibtex2html_1.92-1.diff.gz
 d5709fee96f43eaf97e51b1d46514f2003439787fd69c463501d25f1f612e011 135772 bibtex2html_1.92-1_all.deb
Files: 
 3d25a0a26813dc11f60bd55a6d58f99e 1511 tex optional bibtex2html_1.92-1.dsc
 9d69980f595be02a79a96a851d79bb88 69800 tex optional bibtex2html_1.92.orig.tar.gz
 736bc45e0bb5e60fae66fe80255a0521 11709 tex optional bibtex2html_1.92-1.diff.gz
 7164f919a7f48894c1c1abc5eec4149e 135772 tex optional bibtex2html_1.92-1_all.deb

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)

iD8DBQFIqzzttzWmSeC6BMERApzLAJ9Rx+35YbWJpl4OebrKU7BQ8ELBzQCg+G59
ymVQzKkCw7WjXS2Mpvfp7aA=
=gjTF
-----END PGP SIGNATURE-----



--- End Message ---
