
Re: Output of dpkg-scanpackages as XML



William Ballard wrote on 05/01/2005 22:42:
On Wed, Jan 05, 2005 at 04:32:07PM -0500, William Ballard wrote:

 echo '</Long-Description></entry></packages>'

         ^^^

Should have closed the CDATA tag here.  The short description
tag should probably be wrapped in CDATA too.  If any package
descriptions contain "]]>", it'll break it.
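
One common way around the "]]>" problem is to split the terminator across two adjacent CDATA sections. The following is only a minimal sketch of that idea; the function name cdata_wrap is hypothetical and is not part of the script under discussion.

```shell
#!/bin/bash
# Hypothetical helper: wrap the first argument in a CDATA section.
# Every "]]>" in the text is rewritten as "]]" + "]]><![CDATA[" + ">",
# so the terminator never appears inside a single section.
cdata_wrap() {
  printf '<![CDATA[%s]]>' "$(printf '%s' "$1" | sed 's/]]>/]]]]><![CDATA[>/g')"
}

cdata_wrap 'a description containing ]]> in the middle'
echo
```

An XML parser concatenates the adjacent sections, so the original text comes back unchanged.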

I was able to successfully turn the sarge/contrib (i386) Packages file into a valid XML file with the following modified version of your script. It is still definitely a hack, though, especially the way it escapes non-ASCII characters.

Since it contains a few long lines, I attached it.  It's under 1k in size.

hth

cu,
sven
#!/bin/bash

PACKAGES=$1
CAT=cat
if [[ ! -f ${PACKAGES} ]]; then
	echo "${PACKAGES} not found"
	exit 1
fi
# Decompress on the fly if the Packages file is gzipped.
if file "${PACKAGES}" | grep -q gzip ; then
	CAT=zcat
fi

echo '<packages><entry>'
${CAT} ${PACKAGES} \
  | grep-dctrl . \
  | sed -r \
    -e 's/&/\&amp;/g;s/</\&lt;/g;s/>/\&gt;/g;s/ñ/\&#241;/g;s/é/\&#233;/g;s/í/\&#237;/g' \
    -e 's/(Description): (.+)/<\1><Short-Description>\2<\/Short-Description><Long-Description><CDATA>/' \
    -e 's/^([^: ]+): (.+)/<\1><CDATA>\2<\/CDATA><\/\1>/' \
    -e 's/^$/<\/CDATA><\/Long-Description><\/Description><\/entry><entry>/' \
  | head -n-1
echo '</CDATA></Long-Description></Description></entry></packages>'
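
To see what the sed expressions do to a single stanza, they can be tried on a small sample without grep-dctrl. This is only a sketch: the function name stanza_to_xml and the sample stanza are made up for illustration, the non-ASCII replacements and the surrounding <entry> handling are left out, and -E is used in place of -r (GNU sed accepts both).

```shell
#!/bin/bash
# Hypothetical helper: a reduced version of the sed expressions,
# reading one control stanza on stdin.
stanza_to_xml() {
  sed -E \
    -e 's/&/\&amp;/g;s/</\&lt;/g;s/>/\&gt;/g' \
    -e 's/^(Description): (.+)/<\1><Short-Description>\2<\/Short-Description><Long-Description><CDATA>/' \
    -e 's/^([^: ]+): (.+)/<\1><CDATA>\2<\/CDATA><\/\1>/' \
    -e 's/^$/<\/CDATA><\/Long-Description><\/Description><\/entry>/'
}

# Made-up sample stanza; the trailing blank line closes the entry.
printf 'Package: a&b\nDescription: short\n long\n\n' | stanza_to_xml
```

The "&" replacement has to come first; otherwise the "&" introduced by "&lt;" and "&gt;" would itself be escaped a second time.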
