
Re: can someone replicate this cut and paste bug?



Darac Marjal wrote:
>songbird wrote:

>>   when cutting and pasting from evince
>> to a terminal the process is translating
>> a "um" into a "mm"  which is a significant
>> change for a technical document.
>>
>>   the source document used was downloaded from:
>>
>> "http://www.nature.com/srep/2013/130425/srep01732/pdf/srep01732.pdf"
>>
>>   search for the phrase
>>
>> "particle size between 200–500 um"
>>
>>   and copy and paste it to any text terminal,
>> vi'd document, or even LibreOffice doc.  all
>> do the same thing and translate it from um to
>> mm.
>>=20
>>   so if someone can replicate this it would
>> help.  if not, then i'd be stumped because
>> i have no familiarity with cut and paste
>> underpinnings...
>
> I suspect the problem is the source document. What you're reading as
> "200-500 μm", is actually (if I can 'translate' to HTML for clarity)
> "200-500 <font family="symbol">m</font>m". That is, the first character
> of the units is actually a lower-case M, but shown in a greek or symbol
> font, such that it is rendered like a Mu.
>
> I suspect that, if you want to fix this, you either need to convert the
> PDF with a sensible converter (try the pdf tools in debian;
> poppler-utils for example) or talk to the authors of the paper and see
> if you can get a copy of the source (maybe there was a Word or TeX
> document that it was written up in).
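
  to make the suspected mechanism concrete, here is a minimal
sketch in Python.  it is not how evince or poppler actually work;
the (font, text) runs, the function names and the tiny mapping
table are all made up for illustration.  the only outside fact it
relies on is that the Symbol font draws the character code for
"m" as a Greek mu.

```python
# Hypothetical model of the suspected problem: the PDF stores a plain "m"
# and relies on the Symbol font to draw it as a Greek mu.

# Partial Symbol-font mapping, for illustration only (not a full table).
SYMBOL_TO_UNICODE = {
    "a": "\u03b1",  # alpha
    "b": "\u03b2",  # beta
    "m": "\u03bc",  # mu
    "p": "\u03c0",  # pi
}


def render(runs):
    """Return what the reader sees: glyphs chosen by each run's font."""
    out = []
    for font, text in runs:
        if font == "Symbol":
            text = "".join(SYMBOL_TO_UNICODE.get(ch, ch) for ch in text)
        out.append(text)
    return "".join(out)


def naive_copy(runs):
    """Return what a font-blind copy would give: the raw character codes."""
    return "".join(text for _font, text in runs)


# Assumed layout of the phrase in the document, following Darac's guess.
runs = [("Times", "200\u2013500 "), ("Symbol", "m"), ("Times", "m")]

print(render(runs))      # what is shown on screen
print(naive_copy(runs))  # what lands in the clipboard
```

  if the document really is built this way, the first line printed
is the "um" reading and the second is the "mm" reading that ends up
in the terminal.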

  i don't have access to the original document
other than the link posted above.

  i hope it doesn't happen in other documents as 
i think scientific people who are copying and 
pasting from PDF docs may be in for a rather rude 
mistranslation of units.

  would the debian-med folks be better at hunting
this down?

  i can verify that in the Times New Roman font i'm
using in LibreOffice, pasting a µ works correctly,
but when i hit return at the end of the line it
translates the initial µ into a capital M.  this
happens on the first line on the page only.  after
that line the µ's work correctly.

  this is just odd...  ok, i give up for now, but if
someone can point this out to a scientific text document
person (maybe debian-med or debian-edu) who copies and
pastes a lot that would be good as then they could be
made aware some kind of strange effect is going on.
i need to get some sleep.  :)

  thanks, 

  songbird

