
Bug#1118359: ITP: python-ftfy -- Fixes mojibake and other Unicode text problems



Package: wnpp
Severity: wishlist
Owner: Edward Betts <edward@4angle.com>
X-Debbugs-Cc: debian-devel@lists.debian.org, debian-python@lists.debian.org

* Package name    : python-ftfy
  Version         : 6.3.1
  Upstream Author : Robyn Speer <rspeer@arborelia.net>
* URL             : https://github.com/rspeer/python-ftfy
* License         : Apache-2.0
  Programming Lang: Python
  Description     : Fixes mojibake and other Unicode text problems

  This library automatically repairs mojibake, that is, text garbled by being
  decoded with the wrong character encoding, along with related encoding
  problems. It analyzes strings to find characters that were decoded
  incorrectly and reconstructs the intended Unicode text. This includes undoing
  multiple layers of encoding errors, handling curly quote characters, and
  decoding HTML entities that appear outside proper HTML contexts, even when
  they use unusual capitalization. The library is designed to avoid unnecessary
  or incorrect changes to text that is already correctly encoded. It helps
  restore readability to text that has been mangled while passing through
  databases, spreadsheets, or output from web sources. It does not attempt to
  detect encodings from scratch; instead it focuses on repairing commonly
  encountered forms of corrupted Unicode text.
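
For readers unfamiliar with the problem, the following is a minimal sketch of
how mojibake arises and how one layer of it can be reversed, using only the
Python standard library. It is not ftfy's algorithm: the function names
(make_mojibake, undo_one_layer, repair) are hypothetical, and it assumes the
damage came from UTF-8 bytes being read as Windows-1252. ftfy itself applies
heuristics to decide whether a change is safe, which this sketch does not.

```python
def make_mojibake(text: str) -> str:
    """Simulate the damage: UTF-8 bytes wrongly decoded as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def undo_one_layer(text: str) -> str:
    """Reverse one UTF-8-as-Windows-1252 layer, or return text unchanged."""
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text does not look like this kind of mojibake, so leave it alone.
        return text


def repair(text: str) -> str:
    """Apply undo_one_layer until the text stops changing."""
    while True:
        fixed = undo_one_layer(text)
        if fixed == text:
            return text
        text = fixed


original = "café"
broken = make_mojibake(original)        # "cafÃ©"
double_broken = make_mojibake(broken)   # two layers of the same mistake

print(broken)
print(repair(broken))
print(repair(double_broken))
print(repair(original))                 # already correct, left unchanged
```

The try/except is what keeps correctly encoded text intact here: "café"
encoded as Windows-1252 is not valid UTF-8, so the reverse step fails and the
input is returned as-is. A naive loop like this can still alter text that
legitimately contains sequences such as "Ã©", which is the kind of false
positive a dedicated library is meant to avoid.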

I plan to maintain this package as part of the Python team.

