Is it safe to let translators translate C format strings?
Hi all,
I found an ugly format bug in hex-a-hop which resulted from passing
a translatable string (_("> Continue game 1 (1% complete) <"))
as the first argument of printf (I introduced the error myself).
It resulted in "> Continue game 1 (1�omplete) <" because "% c" was
interpreted as a conversion specification. (I thought the space flag
would produce a leading space and not garbage bytes, but let's ignore
this ...)
The fix was simple: use "%s" as the format string and pass the
translated string as the second argument.
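
A minimal sketch of that fix. The _() macro here is only a stand-in so
that the example builds without libintl, and format_literal is a
hypothetical helper name, not something from hex-a-hop:

```c
#include <stdio.h>

/* Stand-in for gettext's _() macro (assumption: a real program would
   include <libintl.h> and map _() to gettext()). */
#define _(msgid) (msgid)

/* Formats a message that carries no conversions of its own.
   The message is passed as data for "%s", so a '%' inside it is
   copied literally instead of being parsed by printf.
   Returns whatever snprintf returns. */
int format_literal(char *buf, size_t size, const char *msg)
{
    return snprintf(buf, size, "%s", msg);
}
```

Called as format_literal(buf, sizeof buf, _("> Continue game 1 (1% complete) <")),
the buffer receives the text unchanged, including "1% complete".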
But now I wonder: Is it safe to write the following?
printf(_("Hello World: %s"), a_string);
The translation of "Hello World: %s" could contain multibyte
characters. Isn't it possible that there is an encoding in which the
translation contains dangerous byte sequences such as %s (even if the
translator didn't use the 7-bit character % at all)?
I know that this cannot happen with UTF-8, as every byte of a multibyte
sequence has the 8th bit set. But there are more encodings ...
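
To make the concern concrete, here is a small sketch that counts the
bytes printf would see as '%' (0x25). The two sample strings are my own
illustrations: a UTF-8 text with umlauts, and the katakana letter A in
ISO-2022-JP, whose first byte is 0x25 as far as I can tell. Note that
ISO-2022-JP is a stateful 7-bit encoding and, as far as I know, is not
normally used as a locale charset, so this only shows that such byte
patterns exist, not that gettext would actually deliver them.

```c
#include <stddef.h>

/* "Gruesse: %s" with u-umlaut and sharp s encoded in UTF-8.
   The literal is split so that 'e' is not read as a hex digit. */
static const char utf8_sample[] = "Gr\xc3\xbc\xc3\x9f" "e: %s";

/* Katakana A in ISO-2022-JP (assumed byte sequence):
   ESC $ B, then 0x25 0x22, then ESC ( B. No '%' was typed. */
static const char iso2022jp_sample[] = "\x1b$B%\"\x1b(B";

/* Counts the bytes equal to 0x25, i.e. what printf treats as '%'. */
size_t count_percent_bytes(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if ((unsigned char)*s == 0x25)
            ++n;
    }
    return n;
}
```

For the UTF-8 sample the only 0x25 byte is the intended "%s". The
ISO-2022-JP sample also contains one, although it holds no percent
sign at the character level.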
I tried to work around this by using "%s" as the format string where
possible, but in the given example that is not possible (unless I parse
all conversion specifications myself instead of asking printf to do so).
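
Parsing the conversions myself could look roughly like the sketch
below: it compares the sequence of conversion characters in the
original and in the translation before the translation is handed to
printf. The names next_conversion and same_conversions are
hypothetical. The sketch is deliberately incomplete: it works on bytes,
it ignores differences in flags, width, precision and length modifiers,
and it rejects "*" widths, positional arguments such as %1$s, and %n.

```c
#include <string.h>

/* Finds the next conversion at or after *p, skipping literal "%%".
   Returns its conversion character and moves *p past it.
   Returns '\0' when no conversion is left, '?' when one is malformed
   or unsupported. */
static char next_conversion(const char **p)
{
    const char *s = *p;
    while ((s = strchr(s, '%')) != NULL) {
        ++s;                                  /* skip '%' */
        s += strspn(s, "-+ #0");              /* flags */
        s += strspn(s, "0123456789");         /* field width */
        if (*s == '.') {                      /* precision */
            ++s;
            s += strspn(s, "0123456789");
        }
        s += strspn(s, "hlLjzt");             /* length modifiers */
        if (*s == '\0' || strchr("diouxXeEfFgGaAcsp%", *s) == NULL)
            return '?';
        if (*s != '%') {
            *p = s + 1;
            return *s;
        }
        ++s;                                  /* literal "%%" */
    }
    return '\0';
}

/* Returns 1 if msgstr has the same conversion characters in the same
   order as msgid, and 0 otherwise. */
int same_conversions(const char *msgid, const char *msgstr)
{
    for (;;) {
        char a = next_conversion(&msgid);
        char b = next_conversion(&msgstr);
        if (a == '?' || a != b)
            return 0;
        if (a == '\0')
            return 1;
    }
}
```

A caller could fall back to the untranslated string whenever
same_conversions(msgid, _(msgid)) returns 0.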
Jens