Bug#1001226: ITP: golang-github-segmentio-ksuid -- K-Sortable Globally Unique IDs



Package: wnpp
Severity: wishlist
Owner: Anthony Fok <foka@debian.org>

* Package name    : golang-github-segmentio-ksuid
  Version         : 1.0.4-1
  Upstream Author : Segment (https://segment.com/)
* URL             : https://github.com/segmentio/ksuid
* License         : Expat
  Programming Lang: Go
  Description     : K-Sortable Globally Unique IDs

 ksuid is an efficient, comprehensive, battle-tested Go library for
 generating and parsing a specific kind of globally unique identifier
 called a *KSUID*. This library serves as its reference implementation.
 .
 What is a KSUID?
 .
 KSUID stands for K-Sortable Unique IDentifier. It is a kind of globally
 unique identifier similar to an RFC 4122 UUID
 (https://en.wikipedia.org/wiki/Universally_unique_identifier), built
 from the ground-up to be "naturally" sorted by generation timestamp
 without any special type-aware logic.
 .
 In short, running a set of KSUIDs through the UNIX sort command will
 result in a list ordered by generation time.
 .
 Why use KSUIDs?
 .
 There are numerous methods for generating unique identifiers, so why
 KSUID?
 .
  1. Naturally ordered by generation time
  2. Collision-free, coordination-free, dependency-free
  3. Highly portable representations
 .
 Even if only one of these properties is important to you, KSUID is a
 great choice! :) Many projects chose to use KSUIDs *just* because the
 text representation is copy-and-paste friendly.
 .
 1. Naturally Ordered By Generation Time
 .
 Unlike the more ubiquitous UUIDv4, a KSUID contains a timestamp
 component that allows KSUIDs to be loosely sorted by generation time. This
 is not a strong guarantee (an invariant) as it depends on wall clocks,
 but is still incredibly useful in practice. Both the binary and text
 representations will sort by creation time without any special sorting
 logic.
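 .
 As a rough illustration, the following Go sketch sorts three KSUID
 strings taken from the CLI examples in this description using plain
 string comparison only. The helper name sortByText is made up for this
 example and is not part of the ksuid API.
 .
```go
package main

import (
	"fmt"
	"sort"
)

// sortByText returns a sorted copy of ids using ordinary byte-wise string
// comparison; no KSUID-specific logic is involved.
func sortByText(ids []string) []string {
	out := append([]string(nil), ids...)
	sort.Strings(out)
	return out
}

func main() {
	// Sample IDs from the CLI examples, deliberately out of order.
	ids := []string{
		"0uk1Hbc9dQ9pxyTqJ93IUrfhdGq", // timestamp 107611700
		"0ujtsYcgvSTl8PAuAdqWYSMnLOv", // timestamp 107608047
		"0ujzPyRiIAffKhBux4PvQdDqMHY", // timestamp 107610780
	}
	for _, id := range sortByText(ids) {
		fmt.Println(id)
	}
}
```
 .
 The sorted order follows the embedded timestamps: the ID stamped
 107608047 comes first and the one stamped 107611700 comes last.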
 .
 2. Collision-free, Coordination-free, Dependency-free
 .
 While RFC 4122 UUIDv1s *do* include a time component, there aren't
 enough bytes of randomness to provide strong protection against
 collisions (duplicates). With such a low amount of entropy, it is
 feasible for a malicious party to guess generated IDs, creating a
 problem for systems whose security is, implicitly or explicitly,
 sensitive to an adversary guessing identifiers.
 .
 To fit into a 64-bit number space, Snowflake IDs
 (https://blog.twitter.com/2010/announcing-snowflake) and their derivatives
 require coordination to avoid collisions, which significantly increases
 the deployment complexity and operational burden.
 .
 A KSUID includes 128 bits of pseudorandom data ("entropy"). This number
 space is 64 times larger than the 122 bits used by the well-accepted RFC
 4122 UUIDv4 standard. The additional timestamp component can be
 considered "bonus entropy" which further decreases the probability of
 collisions, to the point of physical infeasibility in any practical
 implementation.
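 .
 The "64 times larger" figure is simply 2^128 / 2^122 = 2^6. A small Go
 sketch of that arithmetic follows; spaceRatio is a hypothetical helper
 written for this example.
 .
```go
package main

import (
	"fmt"
	"math/big"
)

// spaceRatio returns 2^a / 2^b for a >= b, that is, how many times larger
// an a-bit number space is than a b-bit one.
func spaceRatio(a, b uint) *big.Int {
	one := big.NewInt(1)
	num := new(big.Int).Lsh(one, a)
	den := new(big.Int).Lsh(one, b)
	return num.Quo(num, den)
}

func main() {
	// 128 random bits in a KSUID versus 122 random bits in a UUIDv4.
	fmt.Println(spaceRatio(128, 122)) // 2^6 = 64
}
```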
 .
 3. Highly Portable Representations
 .
 The text *and* binary representations are lexicographically sortable,
 which allows them to be dropped into systems which do not natively
 support KSUIDs and retain their time-ordered property.
 .
 The text representation is an alphanumeric base62 encoding, so it "fits"
 anywhere alphanumeric strings are accepted. No delimiters are used, so
 stringified KSUIDs won't be inadvertently truncated or tokenized when
 interpreted by software that is designed for human-readable text, a
 common problem for the text representation of RFC 4122 UUIDs.
 .
 How do KSUIDs work?
 .
 Binary KSUIDs are 20 bytes: a 32-bit unsigned integer UTC timestamp and a
 128-bit randomly generated payload. The timestamp uses big-endian
 encoding, to support lexicographic sorting. The timestamp epoch is
 adjusted to May 13th, 2014, providing over 100 years of life. The
 payload is generated by a cryptographically-strong pseudorandom number
 generator.
 .
 The text representation is always 27 characters, encoded in alphanumeric
 base62 that will lexicographically sort by timestamp.
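 .
 A minimal sketch of such an encoding, assuming the 20 bytes are treated
 as one big-endian integer and written in base62 with the digits ordered
 0-9, A-Z, a-z (ASCII order, which is what keeps the text sortable). It
 uses math/big for clarity and is not the library's tuned implementation;
 encodeBase62 and encodeHex are hypothetical names.
 .
```go
package main

import (
	"encoding/hex"
	"fmt"
	"math/big"
)

// alphabet lists the base62 digits in ascending ASCII order (assumed).
const alphabet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

// encodedLen is the fixed length of the text representation.
const encodedLen = 27

// encodeBase62 encodes raw (expected to be 20 bytes) as a zero-padded,
// 27-character base62 string.
func encodeBase62(raw []byte) string {
	n := new(big.Int).SetBytes(raw) // big-endian interpretation
	base := big.NewInt(62)
	rem := new(big.Int)
	out := make([]byte, encodedLen)
	// Fill digits from least to most significant; once n reaches zero the
	// remaining positions become '0', which pads the result on the left.
	for i := encodedLen - 1; i >= 0; i-- {
		n.DivMod(n, base, rem)
		out[i] = alphabet[rem.Int64()]
	}
	return string(out)
}

// encodeHex is a convenience wrapper taking the raw bytes as hex text.
func encodeHex(rawHex string) (string, error) {
	raw, err := hex.DecodeString(rawHex)
	if err != nil {
		return "", err
	}
	if len(raw) != 20 {
		return "", fmt.Errorf("want 20 bytes, got %d", len(raw))
	}
	return encodeBase62(raw), nil
}

func main() {
	// Raw value taken from the "ksuid -f inspect" example.
	s, err := encodeHex("0669F7EFB5A1CD34B5F99D1154FB6853345C9735")
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```
 .
 Because 2^160 is smaller than 62^27, every 20-byte value fits in 27
 digits, and the fixed width plus ASCII-ordered digits is what makes
 string comparison agree with numeric comparison.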
 .
 High Performance
 .
 This library is designed to be used in code paths that are performance
 critical. Its code has been tuned to eliminate all non-essential
 overhead. The KSUID type is derived from a fixed-size array, which
 eliminates the additional reference chasing and allocation involved in a
 variable-width type.
 .
 The API provides an interface for use in code paths which are sensitive
 to allocation. For example, the Append method can be used to parse the
 text representation and replace the contents of a KSUID value without
 additional heap allocation.
 .
 All public package-level "pure" functions are concurrency-safe, protected
 by a global mutex. For hot loops that generate a large number of KSUIDs
 from a single goroutine, the Sequence type is provided to elide the
 potential contention.
 .
 By default, out of an abundance of caution, the cryptographically-secure
 PRNG is used to generate the random bits of a KSUID. This can be relaxed
 in extremely performance-critical code using the included FastRander
 type. FastRander uses the standard PRNG with a seed generated by the
 cryptographically-secure PRNG.
 .
 *NOTE: While there is no evidence that FastRander will increase
 the probability of a collision, it shouldn't be used in scenarios
 where uniqueness is important to security, as there is an increased
 chance the generated IDs can be predicted by an adversary.*
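 .
 The general idea of seeding a fast PRNG from the secure one can be
 sketched as follows. This is an assumption-laden illustration built on
 crypto/rand and math/rand, not the actual FastRander code; newFastReader
 is a hypothetical helper.
 .
```go
package main

import (
	crand "crypto/rand"
	"encoding/binary"
	"fmt"
	mrand "math/rand"
)

// newFastReader returns a non-cryptographic PRNG whose 64-bit seed is read
// from the cryptographically secure source.
func newFastReader() (*mrand.Rand, error) {
	var seed [8]byte
	if _, err := crand.Read(seed[:]); err != nil {
		return nil, err
	}
	src := mrand.NewSource(int64(binary.BigEndian.Uint64(seed[:])))
	return mrand.New(src), nil
}

func main() {
	r, err := newFastReader()
	if err != nil {
		panic(err)
	}
	// Fill a 16-byte buffer, the size of a KSUID payload.
	payload := make([]byte, 16)
	if _, err := r.Read(payload); err != nil {
		panic(err)
	}
	fmt.Printf("%X\n", payload)
}
```
 .
 Note that a *rand.Rand built this way is not safe for concurrent use
 without extra locking, and its output is predictable to anyone who
 learns the seed, which is the trade-off the note above warns about.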
 .
 Battle Tested
 .
 This code has been used in production at Segment for several years,
 across a diverse array of projects. Trillions upon trillions of KSUIDs
 have been generated in some of Segment's most performance-critical,
 large-scale distributed systems.
 .
 Plays Well With Others
 .
 Designed to be integrated with other libraries, the KSUID type
 implements many standard library interfaces, including:
 .
  * fmt.Stringer
  * database/sql.Scanner and database/sql/driver.Valuer
  * encoding.BinaryMarshaler and encoding.BinaryUnmarshaler
  * encoding.TextMarshaler and encoding.TextUnmarshaler
    (encoding/json friendly!)
 .
 Command Line Tool
 .
 This package comes with a command-line tool ksuid, useful for generating
 KSUIDs as well as inspecting the internal components of existing KSUIDs.
 Machine-friendly output is provided for scripting use cases.
 .
 Given a Go build environment, it can be installed with the command:
 .
   $ go install github.com/segmentio/ksuid/cmd/ksuid@latest
 .
 CLI Usage Examples
 .
 Generate a KSUID
 .
   $ ksuid
   0ujsswThIGTUYm2K8FjOOfXtY1K
 .
 Generate 4 KSUIDs
 .
   $ ksuid -n 4
   0ujsszwN8NRY24YaXiTIE2VWDTS
   0ujsswThIGTUYm2K8FjOOfXtY1K
   0ujssxh0cECutqzMgbtXSGnjorm
   0ujsszgFvbiEr7CDgE3z8MAUPFt
 .
 Inspect the components of a KSUID
 .
   $ ksuid -f inspect 0ujtsYcgvSTl8PAuAdqWYSMnLOv
 .
   REPRESENTATION:
 .
     String: 0ujtsYcgvSTl8PAuAdqWYSMnLOv
        Raw: 0669F7EFB5A1CD34B5F99D1154FB6853345C9735
 .
   COMPONENTS:
 .
          Time: 2017-10-09 21:00:47 -0700 PDT
     Timestamp: 107608047
       Payload: B5A1CD34B5F99D1154FB6853345C9735
 .
 Generate a KSUID and inspect its components
 .
   $ ksuid -f inspect
 .
   REPRESENTATION:
 .
     String: 0ujzPyRiIAffKhBux4PvQdDqMHY
        Raw: 066A029C73FC1AA3B2446246D6E89FCD909E8FE8
 .
   COMPONENTS:
 .
          Time: 2017-10-09 21:46:20 -0700 PDT
     Timestamp: 107610780
       Payload: 73FC1AA3B2446246D6E89FCD909E8FE8
 .
 Inspect a KSUID with template formatted inspection output
 .
   $ ksuid -f template -t '{{ .Time }}: {{ .Payload }}' 0ujtsYcgvSTl8PAuAdqWYSMnLOv
   2017-10-09 21:00:47 -0700 PDT: B5A1CD34B5F99D1154FB6853345C9735
 .
 Inspect multiple KSUIDs with template formatted output
 .
   $ ksuid -f template -t '{{ .Time }}: {{ .Payload }}' $(ksuid -n 4)
   2017-10-09 21:05:37 -0700 PDT: 304102BC687E087CC3A811F21D113CCF
   2017-10-09 21:05:37 -0700 PDT: EAF0B240A9BFA55E079D887120D962F0
   2017-10-09 21:05:37 -0700 PDT: DF0761769909ABB0C7BB9D66F79FC041
   2017-10-09 21:05:37 -0700 PDT: 1A8F0E3D0BDEB84A5FAD702876F46543
 .
 Generate KSUIDs and output JSON using template formatting
 .
   $ ksuid -f template -t '{ "timestamp": "{{ .Timestamp }}", "payload": "{{ .Payload }}", "ksuid": "{{.String}}"}' -n 4
   { "timestamp": "107611700", "payload": "9850EEEC191BF4FF26F99315CE43B0C8", "ksuid": "0uk1Hbc9dQ9pxyTqJ93IUrfhdGq"}
   { "timestamp": "107611700", "payload": "CC55072555316F45B8CA2D2979D3ED0A", "ksuid": "0uk1HdCJ6hUZKDgcxhpJwUl5ZEI"}
   { "timestamp": "107611700", "payload": "BA1C205D6177F0992D15EE606AE32238", "ksuid": "0uk1HcdvF0p8C20KtTfdRSB9XIm"}
   { "timestamp": "107611700", "payload": "67517BA309EA62AE7991B27BB6F2FCAC", "ksuid": "0uk1Ha7hGJ1Q9Xbnkt0yZgNwg3g"}
 .
 Implementations for other languages
 .
  * Python: svix-ksuid (https://github.com/svixhq/python-ksuid/)
  * Ruby: ksuid-ruby (https://github.com/michaelherold/ksuid-ruby)
  * Java: ksuid (https://github.com/ksuid/ksuid)
  * Rust: rksuid (https://github.com/nharring/rksuid)
  * dotNet: Ksuid.Net (https://github.com/JoyMoe/Ksuid.Net)
 .
 License
 .
 ksuid source code is available under an MIT License (/LICENSE.md).


Reason for packaging:
 Prerequisite of Glow (https://github.com/charmbracelet/glow)

