
Bug#1088132: RFP: ratarmount -- Access large archives as a filesystem efficiently



Package: wnpp
Severity: wishlist

Package name : ratarmount
Version : 1.0.0
URL : https://github.com/mxmlnkn/ratarmount/

Ratarmount collects all file positions inside a TAR so that it can
easily jump to and read from any file without extracting it. It then
mounts the TAR using fusepy for read access, just like archivemount. In
contrast to libarchive, on which archivemount is based, random access
and true seeking are supported. And in contrast to tarindexer, which
also collects file positions for random access, ratarmount offers easy
access via FUSE and support for compressed TARs.
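
To illustrate the idea of collecting file positions, here is a minimal
sketch using only Python's standard tarfile module. It is not
ratarmount's actual implementation: the helper names build_index,
read_member, and make_demo_tar are hypothetical, the index is kept in a
plain dict, and only uncompressed TARs are handled.

```python
import io
import tarfile
from typing import BinaryIO, Dict, Tuple


def build_index(tar_stream: BinaryIO) -> Dict[str, Tuple[int, int]]:
    """Map each regular file's name to (data offset, size) inside the TAR."""
    index: Dict[str, Tuple[int, int]] = {}
    # mode="r:" restricts tarfile to uncompressed archives.
    with tarfile.open(fileobj=tar_stream, mode="r:") as archive:
        for member in archive:
            if member.isfile():
                index[member.name] = (member.offset_data, member.size)
    return index


def read_member(tar_stream: BinaryIO, index: Dict[str, Tuple[int, int]], name: str) -> bytes:
    """Seek straight to a file's data and read it, without scanning the archive."""
    offset, size = index[name]
    tar_stream.seek(offset)
    return tar_stream.read(size)


def make_demo_tar() -> io.BytesIO:
    """Build a small in-memory TAR so the sketch can run without any files."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, payload in [("a.txt", b"alpha"), ("dir/b.txt", b"bravo")]:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    stream = make_demo_tar()
    index = build_index(stream)
    print(read_member(stream, index, "dir/b.txt"))
```

The archive is scanned once to build the index; afterwards each read is
a single seek plus a read of the recorded size.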

Capabilities:
- Random Access: Care was taken to achieve fast random access inside
  compressed streams for bzip2, gzip, xz, and zstd and inside TAR
  files by building indices containing seek points.
- Highly Parallelized: By default, all cores are used for parallelized
  algorithms such as the gzip, bzip2, and xz decoders. This can
  yield huge speedups on most modern processors but requires more
  main memory. It can be controlled or completely turned off using
  the -P <cores> option.
- Recursive Mounting: Ratarmount will also mount TARs inside TARs inside
  TARs, ... recursively into folders of the same name, which is useful
  for the 1.31 TB ImageNet data set.
- Mount Compressed Files: You may also mount files with one of the
  supported compression schemes. Even if these files do not contain a
  TAR, you can leverage ratarmount's true seeking capabilities when
  opening the mounted uncompressed view of such a file.
- Read-Only Bind Mounting: Folders may be mounted read-only to other
  folders for use cases like merging a backup TAR with newer versions
  of those files residing in a normal folder.
- Union Mounting: Multiple TARs, compressed files, and bind-mounted
  folders can be mounted under the same mountpoint.
- Write Overlay: A folder can be specified as write overlay. All changes
  below the mountpoint will be redirected to this folder, and deletions
  are tracked so that all changes can be applied back to the archive.
- Remote Files and Folders: A remote archive or whole folder structure
  can be mounted similar to tools like sshfs thanks to the
  filesystem_spec project. These can be specified with URIs as explained
  in the "Remote Files" section of the upstream README. Supported remote
  protocols include: FTP, HTTP, HTTPS, SFTP, SSH, Git, GitHub, S3,
  Samba v2 and v3, Dropbox, ... Many of these are very experimental and
  may be slow. Please open a feature request if further backends are
  desired.
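
The interplay of union mounting and the write overlay can be pictured
with the following conceptual sketch. It is not ratarmount's code: the
UnionView class is hypothetical, dicts stand in for archives and for
the overlay folder, and the rule that later mount sources take
precedence over earlier ones is an assumption made for illustration.

```python
from typing import Dict, List, Optional, Set


class UnionView:
    """Toy model of a union mount with a write overlay (hypothetical)."""

    def __init__(self, layers: List[Dict[str, bytes]]) -> None:
        self.layers = layers                  # read-only sources, in mount order
        self.overlay: Dict[str, bytes] = {}   # stands in for the overlay folder
        self.deleted: Set[str] = set()        # tracked deletions

    def read(self, path: str) -> Optional[bytes]:
        if path in self.deleted:
            return None
        if path in self.overlay:
            return self.overlay[path]
        # Assumption: the source given last wins when names collide.
        for layer in reversed(self.layers):
            if path in layer:
                return layer[path]
        return None

    def write(self, path: str, data: bytes) -> None:
        # Changes never touch the read-only layers.
        self.deleted.discard(path)
        self.overlay[path] = data

    def delete(self, path: str) -> None:
        # Record the deletion instead of modifying any layer.
        self.overlay.pop(path, None)
        self.deleted.add(path)


if __name__ == "__main__":
    view = UnionView([{"a.txt": b"old", "b.txt": b"b"}, {"a.txt": b"new"}])
    print(view.read("a.txt"))
    view.delete("a.txt")
    print(view.read("a.txt"))
```

Because writes and deletions are recorded separately from the sources,
they could later be replayed against the archive, which is the property
the Write Overlay item describes.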

