
Bug#228471: RFP: libvstr1 -- A fast and secure string/buffer library for C



Package: wnpp
Severity: wishlist

* Package name    : libvstr1
  Version         : 1.0.11
  Upstream Author : James Antill <james@and.org>
* URL             : http://www.and.org/vstr/
* License         : LGPL
  Description     : A fast and secure string/buffer library for C

vstr is not used by any major packages as yet, but it looks very
promising.  I am using it in my own projects.



From the web page:

Vstr is a string library designed to work optimally with
readv()/writev() for input/output. This means that, for instance, you
can readv() data to the end of the string and writev() data from the
beginning of the string without having to allocate or move memory. It
also means that the library is completely happy with data that
contains multiple zero bytes.
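
To make the readv()/writev() point concrete, here is a minimal sketch
of a gather write in plain POSIX C. It does not use the Vstr API; the
helper name write_blocks() and the three sample blocks are made up for
illustration. Each block carries an explicit length, so the zero bytes
in the middle block are written like any other data.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper, not part of Vstr: write three separately stored
 * blocks with a single writev() call.  Returns whatever writev()
 * returns (bytes written, or -1 on error). */
static ssize_t write_blocks(int fd)
{
    static char head[] = "HEAD";
    static char body[] = { 'a', '\0', 'b', '\0' };  /* embedded zero bytes */
    static char tail[] = "TAIL\n";
    struct iovec iov[3];

    assert(fd >= 0);

    /* Lengths are explicit, so no block needs a terminating NUL. */
    iov[0].iov_base = head; iov[0].iov_len = strlen(head);
    iov[1].iov_base = body; iov[1].iov_len = sizeof body;
    iov[2].iov_base = tail; iov[2].iov_len = strlen(tail);

    /* One system call; the blocks are never joined in user space. */
    return writev(fd, iov, 3);
}
```

A real caller would also have to handle short writes, which this
sketch leaves out.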

This design constraint means that, unlike most string libraries, Vstr
doesn't have an internal representation of the string where everything
can be accessed from a single (char *) pointer in C. Instead, the
internal representation is of multiple "blocks" or nodes, each carrying
some of the data for the string. This model also means that as a string
gets bigger, the Vstr memory usage only goes up linearly and involves
no inherent copying. Other string libraries typically increase the
space for the string via realloc(), so their memory usage can be triple
the required size and growing can require a complete copy of the string.
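
The block model described above can be sketched as a linked list of
fixed-size nodes. This is only an illustration of the idea, not Vstr's
actual data structure: the names blockstr, block and BLOCK_SIZE, and
the 64-byte node size, are all assumptions. The point is that appending
only ever fills the last node or allocates a new one, so bytes already
stored are never moved.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 64           /* illustrative node capacity (assumed) */

struct block {
    struct block *next;
    size_t used;                /* bytes of data[] in use */
    char data[BLOCK_SIZE];
};

struct blockstr {
    struct block *head, *tail;
    size_t len;                 /* total bytes stored */
    size_t nblocks;             /* nodes allocated */
};

/* Append len bytes.  Existing nodes are never reallocated or copied.
 * Returns 0 on success, -1 on allocation failure (bytes appended
 * before the failure stay in the string). */
static int blockstr_append(struct blockstr *s, const void *buf, size_t len)
{
    const char *p = buf;

    assert(s != NULL);
    while (len > 0) {
        struct block *b = s->tail;
        size_t room, n;

        if (b == NULL || b->used == BLOCK_SIZE) {
            b = calloc(1, sizeof *b);
            if (b == NULL)
                return -1;
            if (s->tail != NULL)
                s->tail->next = b;
            else
                s->head = b;
            s->tail = b;
            s->nblocks++;
        }
        room = BLOCK_SIZE - b->used;
        n = len < room ? len : room;
        memcpy(b->data + b->used, p, n);
        b->used += n;
        s->len += n;
        p += n;
        len -= n;
    }
    return 0;
}

/* Release every node and reset the string to empty. */
static void blockstr_free(struct blockstr *s)
{
    struct block *b = s->head;

    while (b != NULL) {
        struct block *next = b->next;
        free(b);
        b = next;
    }
    s->head = s->tail = NULL;
    s->len = 0;
    s->nblocks = 0;
}
```

With this layout the memory in use is at most one partly filled node
above what the data needs, in contrast to a realloc() scheme that may
hold both the old and the new buffer while it copies.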

It also means that adding, substituting or moving data anywhere in the
string can be optimized a lot, to require O(1) copying instead of
O(n). Speaking of O(1), it's worth remembering that if you have a Vstr
string with caching, it is O(1) to get all the data to the writev()
system call (the cat example on the web page shows this: the write
call is always constant time). As well as having features directly
related to doing I/O well, it contains functions for:

    * a printf-like function that is fully ISO 9899:1999 (C99)
      compliant, also having %m as standard and POSIX i18n parameter
      number modifiers. It also allows gcc-warning-compatible custom
      format specifiers (and includes pre-written custom format
      specifiers for IPv4 and IPv6 addresses, Vstr strings and more).
    * splitting strings into parameter/record chunks (a la Perl).
    * substituting data in a Vstr string.
    * moving data from one Vstr string to another (or within a Vstr
      string).
    * comparing strings (without regard for case, or taking into
      account version information).
    * searching for data in strings (with or without regard for case).
    * counting spans of data in a string (the equivalent of strspn()
      in ISO C).
    * converting data in a Vstr string (e.g. deleting/substituting
      unprintable characters or making a Vstr string
      lowercase/uppercase).
    * parsing data from a Vstr string (e.g. numbers, or IPv4
      addresses).
    * easily parsing and wrapping outgoing data in netstrings, for
      fast and simple (and hence less error-prone) network
      communication.
    * caching aspects of data about a Vstr string, to both simplify
      and speed up use of the string.
    * having empty data as part of the string. This is somewhat useful
      for representing file transfers as a string, as you can
      represent the file data as empty data in the string.
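
For readers who have not met netstrings, mentioned in the list above:
a netstring is the payload length in decimal, a colon, the payload
bytes, and a trailing comma. The sketch below encodes one in plain C.
It is not the Vstr interface; netstring_encode() is a hypothetical
helper written only to show the wire format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of Vstr: encode len bytes of data as a
 * netstring ("<len>:<data>,") into out.  Returns the number of bytes
 * written, or 0 if out is too small.  The result is not NUL-terminated,
 * and the payload may itself contain zero bytes. */
static size_t netstring_encode(char *out, size_t outsz,
                               const void *data, size_t len)
{
    int hdr;

    assert(out != NULL && (data != NULL || len == 0));

    /* Length prefix and colon, e.g. "12:". */
    hdr = snprintf(out, outsz, "%zu:", len);
    if (hdr < 0 || (size_t)hdr >= outsz)
        return 0;
    /* Need room for len payload bytes plus the trailing comma. */
    if (outsz - (size_t)hdr <= len)
        return 0;

    if (len > 0)
        memcpy(out + hdr, data, len);
    out[(size_t)hdr + len] = ',';
    return (size_t)hdr + len + 1;
}
```

Because the length comes first, a receiver knows how many bytes to read
before it sees any payload, which is what keeps parsing simple.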

It also has a number of functions for exporting data from a Vstr
string so you can easily use data generated with the Vstr outside of
the library.

The other unusual aspect of the Vstr string library is that it
attaches a notion of a locale to the string configuration rather than
setting it globally (as POSIX and pretty much everything else does).
This means that you can do network I/O in the C locale and user I/O in
the user's locale.

-- System Information:
Debian Release: testing/unstable
Architecture: i386
Kernel: Linux wistful 2.4.23 #1 Mon Dec 15 21:16:43 EST 2003 i686
Locale: LANG=en_AU, LC_CTYPE=en_AU



