std::time_put<wchar_t> is broken with UTF-8 locales
>Submitter-Id: net
>Originator: Roger Leigh
>Organization: Debian
>Confidential: no
>Synopsis: std::time_put<wchar_t> is broken with UTF-8 locales
>Severity: non-critical
>Priority: low
>Category: libstdc++
>Class: sw-bug
>Release: 3.4.4 20041113 (prerelease) (Debian 3.4.3-1) (Debian testing/unstable)
>Environment:
System: Linux whinlatter 2.6.9 #7 Mon Oct 25 23:49:41 BST 2004 i686 GNU/Linux
Architecture: i686
host: i486-pc-linux-gnu
build: i486-pc-linux-gnu
target: i486-pc-linux-gnu
configured with: ../src/configure -v --enable-languages=c,c++,java,f77,pascal,objc,ada,treelang --prefix=/usr --libexecdir=/usr/lib --with-gxx-include-dir=/usr/include/c++/3.4 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --program-suffix=-3.4 --enable-__cxa_atexit --enable-libstdcxx-allocator=mt --enable-clocale=gnu --enable-libstdcxx-debug --enable-java-gc=boehm --enable-java-awt=gtk --disable-werror i486-linux
>Description:
Using the example code below, this is the output for an en_GB
UTF-8 locale:
$ ./date3
asctime: Thu Nov 25 17:39:15 2004
strftime: Thu 25 Nov 2004 17:39:15 GMT
std::time_put<char>: Thu 25 Nov 2004 17:39:15 GMT
std::time_put<wchar_t>: Thu 25 Nov 2004 17:39:15 GMT
Output from running under other locales is shown below. Note that
all of these locales were generated with UTF-8 as their codeset.
$ LANG=fr_FR LC_ALL=fr_FR ./date3
asctime: Thu Nov 25 17:40:05 2004
strftime: jeu 25 nov 2004 17:40:05 GMT
std::time_put<char>: jeu 25 nov 2004 17:40:05 GMT
std::time_put<wchar_t>: jeu 25 nov 2004 17:40:05 GMT
$ LANG=de_DE LC_ALL=de_DE ./date3
asctime: Thu Nov 25 17:40:29 2004
strftime: Do 25 Nov 2004 17:40:29 GMT
std::time_put<char>: Do 25 Nov 2004 17:40:29 GMT
std::time_put<wchar_t>: Do 25 Nov 2004 17:40:29 GMT
$ LANG=ru_RU LC_ALL=ru_RU ./date3
asctime: Fri Nov 26 00:14:07 2004
strftime: Птн 26 Ноя 2004 00:14:07
std::time_put<char>: Птн 26 Ноя 2004 00:14:07
std::time_put<wchar_t>: B= 26 >O 2004 00:14:07
(For GCC 3.3.5)
$ LANG=ru_RU LC_ALL=ru_RU ./date3
asctime: Thu Nov 25 17:50:55 2004
strftime: Чтв 25 Ноя 2004 17:50:55
std::time_put<char>: Чтв 25 Ноя 2004 17:50:55
std::time_put<wchar_t>:
With GCC 3.4, Russian (Cyrillic) text is not output correctly through
wide streams, although this is an improvement over GCC 3.3.5, which
failed to output anything at all. Non-ASCII characters are still not
written as proper UTF-8 on the wide stream.
>How-To-Repeat:
The following code should illustrate the problem. Please note that I
have only tested on GNU/Linux with GNU libc and UTF-8 locales. The
problem might be specific to GNU/Linux, and it might also affect
other codesets.
#include <ctime>
#include <iostream>
#include <locale>
#include <string>

int main()
{
    // Use the environment's locale globally and on both streams.
    std::locale::global(std::locale(""));
    std::cout.imbue(std::locale());
    std::wcout.imbue(std::locale());

    // Get current time.
    std::time_t simpletime = std::time(0);

    // Break down time (localtime_r is POSIX, not standard C++).
    std::tm brokentime;
    localtime_r(&simpletime, &brokentime);

    // Normalise.
    std::mktime(&brokentime);

    std::cout << "asctime: " << std::asctime(&brokentime);

    // Print with strftime(3).
    char buffer[40];
    std::strftime(buffer, sizeof buffer, "%c", &brokentime);
    std::cout << "strftime: " << buffer << '\n';

    // Try again, but use proper locale facets...
    const std::time_put<char>& tp =
        std::use_facet<std::time_put<char> >(std::cout.getloc());
    std::string pattern("std::time_put<char>: %c\n");
    // data() + size() avoids dereferencing end(), which is undefined.
    tp.put(std::cout, std::cout, std::cout.fill(),
           &brokentime, pattern.data(), pattern.data() + pattern.size());

    // And again, but using wchar_t...
    const std::time_put<wchar_t>& wtp =
        std::use_facet<std::time_put<wchar_t> >(std::wcout.getloc());
    std::wstring wpattern(L"std::time_put<wchar_t>: %c\n");
    wtp.put(std::wcout, std::wcout, std::wcout.fill(),
            &brokentime, wpattern.data(), wpattern.data() + wpattern.size());

    return 0;
}
>Fix:
None at this time, sorry.
Regards,
Roger