Re: ftell, fgetpos, etc.

To: debian-user@lists.debian.org
Subject: Re: ftell, fgetpos, etc.
From: Hendrik Boom <hendrik@topoi.pooq.com>
Date: Sat, 22 Dec 2007 03:13:59 +0000 (UTC)
Message-id: <fkhvdn$e74$1@ger.gmane.org>
References: <fkbmpi$uaj$1@ger.gmane.org> <20071220043958.GA7112@titan.hooton> <20071220044834.GH12092@samad.com.au>

On Thu, 20 Dec 2007 15:48:34 +1100, Alex Samad wrote:

> On Wed, Dec 19, 2007 at 11:39:58PM -0500, Douglas A. Tutty wrote:
>> On Wed, Dec 19, 2007 at 06:09:54PM +0000, Hendrik Boom wrote:
>> > I need to write code that creates, reads, and writes a random-access binary
>> > file, said binary file to be readable and writable on several machines,
>> > which may have different byte sex, but will certainly have different
>> > native word size (32 vs 64 bit).  Addresses of positions in the file
>> > *will* have to be written into the file.
>> > 
>> > The machines on which this software will have to run presently use
>> > Debian or Debian-derived Linux distributions. (386, AMD64, maemo).
>> > 
>> > Now I know how to handle different byte sex (use shifts and masks to
>> > decompose data and recompose it in the chosen file-format -- anyone have a
>> > better method?).
>> > 
>> > What I don't know is how to seek around the file in a machine-independent
>> > manner, and avoid future headaches.
>> > 
>> > I can certainly hack up something that works for now, and will have to be
>> > replaced if the files to be handled ever get huge.  But I'd like to know
>> > if there's a recommended way of doing it.
>> > 
>> > As far as I can tell, the two regimes available are
>> > 
>> > (a) use fgetpos and fsetpos
>> >   This will presumably do random access to anything the machine's file
>> >   system will handle, but the disk addresses I get from fgetpos are
>> >   unlikely to be usable on another system.
>> > (b) use ftell and fseek
>> >   Now these will solve the problem as long as my files stay small.
>> >   They provide byte counts from the start of the file, which are
>> >   semantically independent of the platform, but are just long int, which,
>> >   last I heard, was 32 bits almost everywhere (and, because of the sign
>> >   bit, are limited to 31 bits in practice).
>> > 
>> > Is there something else available?  Is there another way to use the tools
>> > I have already mentioned?  Is there a clean way to move to 64-bit
>> > relatively system-independent disk addresses?  Is there a standard way?
> I don't know the full requirements for this, but why not create a server
> app that sits on an amd64 machine and create clients that can be on any
> machine, then define the protocol and transfer info via tcp/udp?
> 
> With multiple machines accessing the same file you are going to have
> contention problems and concurrency problems as well.
> 

The problem I'm asking about is a lot more low-level than that.  I'm
interested in what standard (or at least common) library functions I can
use that behave transparently in 32- or 64-bit mode depending on the
machine they are compiled and run on.  Well, I suppose 64-bit mode is OK
on 32-bit machines, since long long is pretty widely implemented nowadays,
and the length of disk addresses is more important than the length of
efficient integers.
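
For the addresses stored inside the file, here is a minimal sketch of the
shifts-and-masks approach widened to 64 bits.  The names put_u64 and
get_u64 are made up for illustration, and the big-endian byte order is
only an assumption; any fixed order would do as long as every machine
uses the same one.

```c
#include <stdint.h>

/* Hypothetical helper: store a 64-bit file address as 8 bytes,
   most significant byte first, regardless of the host byte order. */
static void put_u64(unsigned char buf[8], uint64_t v)
{
    int i;
    for (i = 7; i >= 0; i--) {
        buf[i] = (unsigned char)(v & 0xff);  /* lowest 8 bits */
        v >>= 8;                             /* move on to the next byte */
    }
}

/* Hypothetical helper: rebuild the address from the same 8 bytes. */
static uint64_t get_u64(const unsigned char buf[8])
{
    uint64_t v = 0;
    int i;
    for (i = 0; i < 8; i++)
        v = (v << 8) | buf[i];
    return v;
}
```

The buffer would be written with fwrite and read back with fread, so the
bytes in the file never depend on the native word size or byte order.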

If necessary I'll use #ifdef to choose code depending on the platform, but
that's ugly if the masters of C and Unix have defined a neat way to do it.
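
One candidate that might avoid the #ifdef is the POSIX pair fseeko and
ftello, which take and return off_t instead of long.  The sketch below
assumes glibc, where defining _FILE_OFFSET_BITS as 64 before any #include
requests a 64-bit off_t even on 32-bit targets.  These are POSIX
functions, not ISO C, and the wrapper names seek_abs and tell_abs are
made up for illustration.

```c
/* Must come before every #include (assumes glibc large-file support). */
#define _FILE_OFFSET_BITS 64

#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical wrapper: seek to an absolute byte position read from
   the file.  Returns 0 on success, -1 on failure. */
static int seek_abs(FILE *fp, uint64_t pos)
{
    assert(fp != NULL);
    if (pos > (uint64_t)INT64_MAX)   /* would not fit a signed 64-bit off_t */
        return -1;
    return fseeko(fp, (off_t)pos, SEEK_SET);
}

/* Hypothetical wrapper: fetch the current byte position.
   Returns 0 on success, -1 on failure. */
static int tell_abs(FILE *fp, uint64_t *pos)
{
    off_t here;
    assert(fp != NULL);
    here = ftello(fp);
    if (here < 0)
        return -1;
    *pos = (uint64_t)here;
    return 0;
}
```

The positions are still plain byte counts from the start of the file, as
with ftell and fseek, so they should mean the same thing on every
machine.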

-- hendrik

P.S.  I am planning to implement protocols on top of this stuff -- sort of
a mix between distributed revision control and data bases.  Disconnected
operation with later sync will be essential.  But that's not what I'm
asking about here.

>> >  
>> 
>> To me, a huge file is one which is too big to just load into memory
>> to facilitate the random access.  To do random access on a huge file,
>> the speed limit will be the drive access rather than any algorithm you
>> choose, or language for that matter.  
>> 
>> To be machine independent yet have a pointer always longer than 32 bits,
>> you'll have to write or import a multi-word integer data type so that, for
>> example, if you decide that you need a 128-bit integer (for future
>> growth), then you have a function that handles them, then the file seek
>> sections take that to work on, using your imported library to do any
>> math required.
>> 
>> However, for current OSs, I think the filesystem is limited to 64 bits
>> for both 32-bit and 64-bit versions (at least in linux for ext2/3).  
>> 
>> In any event, this is trivial to set up with a language that is more
>> machine independent than standard C.  If I were you, I'd prototype it in
>> Python and if it wasn't computationally fast enough, I'd re-implement it
>> in Ada.
>> 
>> My answer is vague since your info is a bit vague.  What is the purpose
>> of this and what are the parameters?
>> 
>> Doug.

References:
- ftell, fgetpos, etc.
  - From: Hendrik Boom <hendrik@topoi.pooq.com>
- Re: ftell, fgetpos, etc.
  - From: "Douglas A. Tutty" <dtutty@porchlight.ca>
- Re: ftell, fgetpos, etc.
  - From: Alex Samad <alex@samad.com.au>
