Re: ftell, fgetpos, etc.
On Thu, 20 Dec 2007 15:48:34 +1100, Alex Samad wrote:
> On Wed, Dec 19, 2007 at 11:39:58PM -0500, Douglas A. Tutty wrote:
>> On Wed, Dec 19, 2007 at 06:09:54PM +0000, Hendrik Boom wrote:
>> > I need to write code that creates, reads, and writes a random-access binary
>> > file, said binary file to be readable and writable on several machines,
>> > which may have different byte sex, but will certainly have different
>> > native word size (32 vs 64 bit). Addresses of positions in the file
>> > *will* have to be written into the file.
>> >
>> > The machines on which this software will have to run presently use
>> > Debian or Debian-derived Linux distributions. (386, AMD64, maemo).
>> >
>> > Now I know how to handle different byte sex (use shifts and masks to
>> > decompose data and recompose it in the chosen file-format -- anyone have a
>> > better method?).
>> >
>> > What I don't know is how to seek around the file in a machine-independent
>> > manner, and avoid future headaches.
>> >
>> > I can certainly hack up something that works for now, and will have to be
>> > replaced if the files to be handled ever get huge. But I'd like to know
>> > if there's a recommended way of doing it.
>> >
>> > As far as I can tell, the two regimes available are
>> >
>> > (a) use fgetpos and fsetpos
>> > This will presumably do random access to anything the machine's file
>> > system will handle, but the disk addresses I get from fgetpos are
>> > unlikely to be usable on another system.
>> > (b) use ftell and fseek
>> > Now these will solve the problem as long as my files stay small.
>> > They provide byte counts from the start of the file, which are
>> > semantically independent of the platform, but are just long int, which,
>> > last I heard, was 32 bits almost everywhere (and, because of the sign
>> > bit, are limited to 31 bits in practice).
>> >
>> > Is there something else available? Is there another way to use the tools
>> > I have already mentioned? Is there a clean way to move to 64-bit
>> > relatively system-independent disk addresses? Is there a standard way?
> Not knowing the full requirements for this, but why not create a server app
> that sits on an amd64 machine and create clients that can be on any machine,
> then define the protocol and transfer info via TCP/UDP?
>
> With multiple machines accessing the same file, you are going to have
> contention and concurrency problems as well.
>
The problem I'm asking about is a lot more low-level than that. I'm
interested in what standard (or at least common) library functions I can
use that behave transparently in 32- or 64-bit mode depending on the
machine they are compiled and run on. Well, I suppose 64-bit mode is OK
on 32-bit machines, since long long is pretty widely implemented nowadays,
and the length of disk addresses is more important than the length of
efficient integers.
If necessary I'll use #ifdef to choose code depending on the platform, but
that's ugly if the masters of C and Unix have defined a neat way to do it.
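One candidate for such a neat way, as far as I can tell, is the POSIX pair
fseeko and ftello, which take and return an off_t instead of a long. On
glibc, defining _FILE_OFFSET_BITS to 64 before any system header makes off_t
64 bits wide on 32-bit targets as well. A minimal sketch, assuming the file
format expresses positions as uint64_t byte counts from the start of the
file; the wrapper names seek_abs and tell_abs are made up for illustration:

```c
/* Both macros must precede every system header. */
#define _FILE_OFFSET_BITS 64      /* glibc: make off_t 64 bits on 32-bit targets */
#define _POSIX_C_SOURCE 200809L   /* expose fseeko and ftello */

#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical wrapper: seek to an absolute 64-bit byte offset.
   Returns 0 on success, -1 on failure. */
static int seek_abs(FILE *fp, uint64_t pos)
{
    assert(sizeof(off_t) >= 8);          /* fails if the macro had no effect */
    if (pos > (uint64_t)INT64_MAX)       /* off_t is signed */
        return -1;
    return fseeko(fp, (off_t)pos, SEEK_SET);
}

/* Hypothetical wrapper: report the current position as a 64-bit
   byte offset. Returns 0 on success, -1 on failure. */
static int tell_abs(FILE *fp, uint64_t *pos)
{
    off_t here = ftello(fp);

    if (here < 0)
        return -1;
    *pos = (uint64_t)here;
    return 0;
}
```

With this arrangement the calling code should not need any #ifdef of its
own: the same source compiles on the 32-bit and the 64-bit machines, and
only the uint64_t values ever reach the file.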
-- hendrik
P.S. I am planning to implement protocols on top of this stuff -- sort of
a mix between distributed revision control and data bases. Disconnected
operation with later sync will be essential. But that's not what I'm
asking about here.
>> >
>>
>> To me, a huge file is one which is too big to just load into memory
>> to facilitate the random access. To do random access on a huge file,
>> the speed limit will be the drive access rather than any algorithm you
>> choose, or language for that matter.
>>
>> To be machine independent yet have a pointer always longer than 32 bits,
>> you'll have to write or import a multi-integer data type so that, for
>> example, if you decide that you need a 128-bit integer (for future
>> growth), then you have a function that handles them, then the file seek
>> sections take that to work on, using your imported library to do any
>> math required.
>>
>> However, for current OSs, I think the filesystem is limited to 64 bits
>> for both 32-bit and 64-bit versions (at least in linux for ext2/3).
>>
>> In any event, this is trivial to set up with a language that is more
>> machine independent than standard C. If I were you, I'd prototype it in
>> Python and if it wasn't computationally fast enough, I'd re-implement it
>> in Ada.
>>
>> My answer is vague since your info is a bit vague. What is the purpose
>> of this and what are the parameters?
>>
>> Doug.