Bug#636017: ITP: tran[s[lit]] -- transcribe between character scripts (Cyrillic <-> Latin, etc)

To: Debian Bug Tracking System <submit@bugs.debian.org>
Subject: Bug#636017: ITP: tran[s[lit]] -- transcribe between character scripts (Cyrillic <-> Latin, etc)
From: Adam Borowski <kilobyte@angband.pl>
Date: Sat, 30 Jul 2011 12:18:03 +0200
Message-id: <20110730101803.7812.38494.reportbug@orthanc.angband.pl>
Reply-to: Adam Borowski <kilobyte@angband.pl>, 636017@bugs.debian.org

Package: wnpp
Severity: wishlist
Owner: Adam Borowski <kilobyte@angband.pl>

* Package name    : tran? trans? translit?
  Upstream Author : Adam Borowski <kilobyte@angband.pl>
* URL             : https://github.com/kilobyte/tran
* License         : GPL
  Programming Lang: Perl
  Description     : transcribe between character scripts (Cyrillic <-> Latin, etc)

This is a tool for romanization / cyrillization / greekization / etc. of text.
It converts character scripts rather than encodings.  For example, it can
turn "Debian" into "Дэбян" or "Δεβιαν".
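
To illustrate the difference between converting scripts and converting
encodings, here is a minimal Python sketch.  The table and the function names
are hypothetical and cover only the letters of the example above; they are not
taken from the tool, which is written in Perl and tries to do more than a
one-to-one letter mapping.

```python
# Hypothetical, tiny Latin -> Greek table covering only the letters in
# "Debian"; it is an illustration, not the tool's actual data.
LATIN_TO_GREEK = {
    "D": "Δ", "e": "ε", "b": "β", "i": "ι", "a": "α", "n": "ν",
}


def to_greek(text):
    """Script conversion: replace each character via the table.

    Characters missing from the table are passed through unchanged.
    """
    return "".join(LATIN_TO_GREEK.get(ch, ch) for ch in text)


def reencode(text, encoding):
    """Encoding conversion: the bytes change, the characters do not."""
    return text.encode(encoding).decode(encoding)


if __name__ == "__main__":
    print(to_greek("Debian"))
    print(reencode("Debian", "utf-16"))
```

The first call changes which letters the text is written in; the second only
changes how the same letters are stored as bytes.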

Currently supported scripts:
* latin
* ascii (i.e., dropping accents)
* fullwidth (doublewidth ascii for most of us)
* cyrillic
* greek
* devanagari
* katakana
* hiragana
* hangul
and more are coming.  Unicode has, for example, 13 fancy sets of letters for
mathematical purposes (fraktur, double-struck, etc.); these are not supported
yet because of a problem in glibc[1].  There are also circled/boxed letters,
etc.  Not to mention all the remaining scripts in Unicode and ConScript.
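
As a rough idea of what the "ascii" and "fullwidth" targets in the list above
involve, here is a short Python sketch.  It is an assumption about one possible
approach, not the tool's implementation: accents are dropped through Unicode
decomposition, and fullwidth forms are reached by a fixed code point offset.
The function names are made up for this example.

```python
import unicodedata

# Fullwidth forms U+FF01..U+FF5E mirror ASCII U+0021..U+007E at this offset.
FULLWIDTH_OFFSET = 0xFEE0


def strip_accents(text):
    """Drop combining marks after canonical decomposition (NFD).

    Letters without a decomposition (for example "ł") are left untouched,
    so a real tool would need extra hand-written mappings.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def to_fullwidth(text):
    """Map printable ASCII to fullwidth forms; space becomes U+3000."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == " ":
            out.append("\u3000")
        elif 0x21 <= cp <= 0x7E:
            out.append(chr(cp + FULLWIDTH_OFFSET))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(strip_accents("café"))
    print(to_fullwidth("Debian"))
```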

It tries to do transcription rather than mere transliteration, but is still
pretty naive and doesn't go far into realms of phonetic accuracy.

I named this project ~six years ago "tran" which is probably way too
generic.  I guess "translit" might be a bit better.

There is a similar tool in Debian: libtext-unidecode-perl, but it can go
only one way, targets basic ASCII rather than Latin, and fails to preserve
non-letter characters like frames.


[1]. towlower(0x1D400) and friends don't work.  This needs to be either
fixed in glibc or worked around with hand-crafted case conversions.
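
A hand-crafted case conversion of the kind mentioned in [1] could look
roughly like the following Python sketch.  It handles only the Mathematical
Bold alphabet, where the capitals (U+1D400..U+1D419) and small letters
(U+1D41A..U+1D433) are contiguous; the function name is hypothetical and the
tool itself may do this differently.

```python
# Code points of MATHEMATICAL BOLD CAPITAL A and MATHEMATICAL BOLD SMALL A.
BOLD_UPPER_A = 0x1D400
BOLD_LOWER_A = 0x1D41A
ALPHABET_SIZE = 26


def math_bold_lower(ch):
    """Lowercase a Mathematical Bold capital; return anything else unchanged."""
    cp = ord(ch)
    if BOLD_UPPER_A <= cp < BOLD_UPPER_A + ALPHABET_SIZE:
        return chr(cp - BOLD_UPPER_A + BOLD_LOWER_A)
    return ch


if __name__ == "__main__":
    print(math_bold_lower("\U0001D400"))
```

Other mathematical alphabets have gaps (for instance, the italic small h is
encoded separately as U+210E), so a plain offset like this would not be
enough for all of them.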
