
Re: Ideas for a dh-privacy-helper



Quoting Bastien Roucariès (2021-09-02 17:53:18)
> A few years ago I created the privacy-breach lintian checks in
> order to detect trackers in our documentation.
> 
> I think we are losing the battle here.
> 
> I believe that we need better tools than sed in order to fix this kind 
> of problem.
> 
> I have some ideas, like:
> - read the HTML tree
> - convert the HTML tree's DOM representation to an XML serialization
>   (so-called XHTML5 or polyglot)
> - apply XSLT 2 rules to this XHTML5 to fix the privacy breach
> 
> The problem is which tools to use...
> 
> I would like to use JavaScript for this kind of transformation, but
> nodejs does not compile on armel, and for saxon-ce I need gwt, which
> is not in Debian...
>
> I could use saxon2, but it would need Java.

Perl is famous for its text juggling features, and sloppy parsing of
HTML can be done e.g. with HTML::HTML5::Parser (i.e. Debian package
libhtml-html5-parser-perl).

Also, debhelper itself is written in Perl, so it is likely easier to
integrate plugins written in Perl as well.  If Perl is an option at all,
obviously...
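
As a rough illustration of the quoted steps (read the HTML, find the
references that trigger remote fetches, rewrite them), here is a minimal
sketch.  It is written in Python with only the standard library, not in
Perl or XSLT, and it rewrites the token stream instead of building a
DOM, so treat it as one possible shape for such a helper and not as the
proposed design.  The names PrivacyFilter, FETCHING_TAGS and is_remote
are hypothetical, as are the example URLs, and the policy (drop remote
scripts, strip remote src attributes) is an assumption that covers far
less than lintian's privacy-breach checks.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Assumed policy: tags whose "src" attribute makes a browser fetch a
# resource as soon as the page is opened.
FETCHING_TAGS = {"script", "img", "iframe"}


def is_remote(url):
    """True for http(s) and protocol-relative URLs (//host/path)."""
    parts = urlsplit(url or "")
    return parts.scheme in ("http", "https") or (
        not parts.scheme and bool(parts.netloc)
    )


class PrivacyFilter(HTMLParser):
    """Re-emit HTML while neutralising remote "src" references."""

    def __init__(self):
        # Keep character references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []        # pieces of the rewritten document
        self.removed = []    # remote URLs that were neutralised
        self._skip_script = False

    def handle_starttag(self, tag, attrs):
        src = dict(attrs).get("src")
        if tag in FETCHING_TAGS and is_remote(src):
            self.removed.append(src)
            if tag == "script":
                # Drop the whole element, up to its closing tag.
                self._skip_script = True
                return
            # For other tags keep the element but remove the reference.
            attrs = [(k, v) for k, v in attrs if k != "src"]
        self.out.append(self._render(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        # Self-closing form: no end tag will follow, so never stay in
        # skip mode afterwards.
        self.handle_starttag(tag, attrs)
        self._skip_script = False

    def handle_endtag(self, tag):
        if tag == "script" and self._skip_script:
            self._skip_script = False
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._skip_script:
            self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

    @staticmethod
    def _render(tag, attrs):
        # Attribute values arrive decoded, so escape them again.
        parts = [tag]
        for key, value in attrs:
            if value is None:
                parts.append(key)
            else:
                parts.append(f'{key}="{escape(value, quote=True)}"')
        return "<" + " ".join(parts) + ">"


page = (
    "<p>Tom &amp; Jerry</p>"
    '<script src="https://tracker.example/t.js"></script>'
    '<script src="local.js"></script>'
    '<img src="//cdn.example/logo.png" alt="logo">'
)
f = PrivacyFilter()
f.feed(page)
f.close()
result = "".join(f.out)
print(result)
print(f.removed)
```

A real helper would more likely replace a remote reference with a
packaged local copy than drop it; that lookup is left out here because
it depends on which packages ship the resource.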

I am sure Python/Ruby/PHP/Haskell/Scheme/Rust/etc. folks will argue that
their pet language is the right one for the task as well: I think it
will help the conversation if you clarify what you are open to and what
your constraints are.

E.g. do you mean that it *must* be JavaScript when you mention that?  Or 
are you perhaps asking if someone else wants to take over the challenge 
from you, so it does not matter how it is done?


 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/

 [x] quote me freely  [ ] ask before reusing  [ ] keep private
