
Endianness in UTF-16 encoding


I have a curious endianness problem. I get different results with the Sun
JVM and gij. Consider this simple program:

import java.io.*;

public class Test {
    public static void main(String[] args) throws java.io.IOException {
        OutputStreamWriter o = new OutputStreamWriter(System.out, "UTF-16");
        o.write("Hello!");
        o.flush();
    }
}

According to Sun's API docs (Charset class), the UTF-16 encoding is supposed
to default to big-endian. This is also what I get when running with Sun's JVM:

00000000: feff 0048 0065 006c 006c 006f 0021       ...H.e.l.l.o.!

But when I run the same program (still Sun-compiled) with gij, I get
little-endian output:

00000000: fffe 4800 6500 6c00 6c00 6f00 2100       ..H.e.l.l.o.!.
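
The two dumps differ from the very first two bytes, the byte order mark: FE FF announces big-endian, FF FE little-endian. A minimal sketch of that check in Java, assuming nothing beyond the leading bytes shown in the dumps (the class and method names are made up for illustration):

```java
public class BomCheck {
    // Hypothetical helper: names the byte order implied by a leading UTF-16 BOM.
    static String byteOrder(byte[] data) {
        if (data.length >= 2) {
            int b0 = data[0] & 0xFF;
            int b1 = data[1] & 0xFF;
            if (b0 == 0xFE && b1 == 0xFF) return "big-endian";
            if (b0 == 0xFF && b1 == 0xFE) return "little-endian";
        }
        return "no BOM";
    }

    public static void main(String[] args) {
        // First four bytes of each dump above.
        byte[] sun = {(byte) 0xFE, (byte) 0xFF, 0x00, 0x48};
        byte[] gij = {(byte) 0xFF, (byte) 0xFE, 0x48, 0x00};
        System.out.println(byteOrder(sun)); // big-endian
        System.out.println(byteOrder(gij)); // little-endian
    }
}
```

Both outputs are valid UTF-16 as far as a BOM-aware reader is concerned; they only fail a byte-for-byte comparison.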

This difference causes the test suite of one of my packages to fail. I'm
running on i386.
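
As a possible stopgap for the test suite, the byte order could be named explicitly instead of relying on the plain "UTF-16" default. The sketch below assumes that "UTF-16BE" emits no byte order mark of its own, which is what the Charset docs describe, so the mark is written by hand. I have not verified this under gij, and the class and method names are hypothetical:

```java
import java.nio.charset.StandardCharsets;

public class ExplicitOrder {
    // Hypothetical helper: big-endian UTF-16 with a hand-written BOM in front.
    static byte[] encodeBigEndian(String text) {
        // UTF-16BE is assumed to write no BOM when encoding.
        byte[] body = text.getBytes(StandardCharsets.UTF_16BE);
        byte[] out = new byte[body.length + 2];
        out[0] = (byte) 0xFE;
        out[1] = (byte) 0xFF;
        System.arraycopy(body, 0, out, 2, body.length);
        return out;
    }

    public static void main(String[] args) {
        byte[] out = encodeBigEndian("Hello!");
        StringBuilder hex = new StringBuilder();
        for (byte b : out) {
            hex.append(String.format("%02x", b & 0xFF));
        }
        System.out.println(hex); // feff00480065006c006c006f0021
    }
}
```

This only sidesteps the difference; it does not explain why the plain "UTF-16" name behaves differently under gij.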

What confuses me is that I checked the Classpath source for
OutputStreamWriter, and it does the right thing, i.e., big-endian:

Is there a bug somewhere? And where should I look for it?

