
Re: [GsoC] your latest commit of tasks_udd.py



Hi,

On Wed, Aug 5, 2015 at 7:56 PM, Andreas Tille <andreas@an3as.eu> wrote:
>
> Hi Akshita,
>
> On Wed, Aug 05, 2015 at 06:10:22PM +0530, Akshita Jha wrote:
> > > The new version now has 'ü' instead of '&#252;' and 'á' instead of
> > > '&#225;'.  The result in the browser is the same but it looks better
> > > when reading the code.  I wonder whether it makes sense to strip those
> > > diffs from the test code by simply dropping in class 'published' all
> > > non-ASCII signs in the new code and all "&#[0-9]+;" in the old code.  It
> > > might be a bit dangerous to overlook a real diff but as far as I can see
> > > currently it simply spoils the diff.
> >
> > As you said, I am not sure if it is indeed a good idea to not check for the
> > difference in class 'published'. But if you say so, we can implement it
> > temporarily.
>
> Maybe you could implement an option "-sloppy" which ignores this kind of
> stuff.

This is done. The class 'published' is no longer considered when computing the diff.
Usage: ./test_output.py <optional sloppy flag (-s)> <Blend> <optional lang>
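
For comparison, the narrower idea quoted above (drop numeric character
references from the old output and non-ASCII characters from the new one)
could look roughly like the sketch below. This is only an illustration: the
helper name sloppy_text is hypothetical and is not taken from test_output.py,
which skips the class 'published' altogether.

```python
import re

# Hypothetical helper for illustration; not part of test_output.py.
# Matches numeric character references such as "&#252;" or "&#225;".
NUMERIC_ENTITY = re.compile(r"&#[0-9]+;")


def sloppy_text(text):
    """Drop numeric character references and all non-ASCII characters."""
    without_entities = NUMERIC_ENTITY.sub("", text)
    return "".join(ch for ch in without_entities if ord(ch) < 128)


# Old output uses entities, new output uses the literal characters.
old = "M&#252;ller and &#225;"
new = "Müller and á"
print(sloppy_text(old) == sloppy_text(new))
```

Both sides lose the accented letters completely, so a real change to such a
letter would go unnoticed. That is the risk mentioned in the quoted text.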

> Also in the end we have some fake-diff by
>
> -<data> : <start_tag> : span <attr> : ('class', 'navpara') <data> : <start_tag> : a <attr> : ('href', 'bio-ngs.ru.html') <attr> : ('title', 'Russian') <attr> : ('hreflang', 'ru') <attr> : ('lang', 'ru')        <attr> : ('rel', 'alternate') <data> : Русский (Russkij) <end_tag> : a
> +<data> : <start_tag> : span <attr> : ('class', 'navpara') <data> : <start_tag> : a <attr> : ('href', 'bio-ngs.ru.html') <attr> : ('title', 'Russian') <attr> : ('hreflang', 'ru') <attr> : ('lang', 'ru')        <attr> : ('rel', 'alternate') <data> : Русск ий (Russkij) <end_tag> : a
>

The diff is caused by a space: 'Русский (Russkij)' versus
'Русск ий (Russkij)'. Should this be ignored? If we ignore the space here,
we might need to ignore whitespace in the contents of <data> every time.
I don't think that will cause any problems, though. We could ignore this
diff only when the '-s' flag is used. What do you think?
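
A minimal sketch of what ignoring whitespace in <data> contents only in
sloppy mode could look like. The function name data_equal and its sloppy
parameter are assumptions made for illustration; they do not come from
test_output.py.

```python
def data_equal(old, new, sloppy=False):
    """Compare two <data> strings; in sloppy mode ignore all whitespace.

    Hypothetical helper for illustration only.
    """
    if sloppy:
        # str.split() without arguments splits on any Unicode whitespace,
        # so joining the pieces removes every space, tab and newline.
        old = "".join(old.split())
        new = "".join(new.split())
    return old == new


old = "Русский (Russkij)"
new = "Русск ий (Russkij)"
print(data_equal(old, new))               # strict comparison
print(data_equal(old, new, sloppy=True))  # whitespace-insensitive comparison
```

Removing all whitespace also makes "a b" equal to "ab", so this variant can
hide a genuine difference in spacing. Keeping it behind the flag limits that
risk to sloppy runs.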

> We could also ignore:
>
> -<data> : <start_tag> : address <data> : Last update: Fri, 31 Jul 2015 12:13:09 -0000 <end_tag> : address
> +<data> : <start_tag> : address <data> : Last update: Wed, 05 Aug 2015 13:07:39 -0000 <end_tag> : address
>

This is also ignored, irrespective of the '-s' flag: this diff is never computed.
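
One way to never compute that diff is to filter the timestamp lines out
before diffing. The sketch below assumes the parsed output is available as a
list of strings and that the timestamp line always contains "Last update:".
The function names are hypothetical, and difflib from the standard library
stands in for whatever diff the test script really uses.

```python
import difflib


def strip_last_update(lines):
    """Return the lines without any 'Last update:' timestamp entries."""
    return [line for line in lines if "Last update:" not in line]


def diff_ignoring_timestamp(old_lines, new_lines):
    """Unified diff of two line lists with timestamp lines removed first."""
    return list(
        difflib.unified_diff(
            strip_last_update(old_lines),
            strip_last_update(new_lines),
            lineterm="",
        )
    )


old = [
    "<data> : <start_tag> : address <data> : Last update: Fri, 31 Jul 2015 12:13:09 -0000 <end_tag> : address",
    "<data> : unchanged line",
]
new = [
    "<data> : <start_tag> : address <data> : Last update: Wed, 05 Aug 2015 13:07:39 -0000 <end_tag> : address",
    "<data> : unchanged line",
]
print(diff_ignoring_timestamp(old, new))
```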

--
Regards,
Akshita Jha
