Bug#1118359: ITP: python-ftfy -- Fixes mojibake and other Unicode text problems
Package: wnpp
Severity: wishlist
Owner: Edward Betts <edward@4angle.com>
X-Debbugs-Cc: debian-devel@lists.debian.org, debian-python@lists.debian.org
* Package name    : python-ftfy
  Version         : 6.3.1
  Upstream Author : Robyn Speer <rspeer@arborelia.net>
* URL             : https://github.com/rspeer/python-ftfy
* License         : Apache-2.0
  Programming Lang: Python
  Description     : Fixes mojibake and other Unicode text problems
This library automatically repairs mojibake: text that was garbled because it
was decoded with the wrong character encoding. It analyzes strings to identify
and correct cases where characters were incorrectly decoded, reconstructing
the intended Unicode text. This includes
fixing multiple layers of encoding errors, handling curly quote characters,
and decoding HTML entities that are outside of proper HTML contexts, even with
unusual capitalization. The library is designed to avoid making unnecessary or
incorrect changes to text that is already correctly encoded. It helps restore
readability to text that has been damaged by data handling and transfer
processes, such as those involving databases, spreadsheets, or output from
web sources. It does not attempt to detect encodings from scratch; instead it
focuses on repairing commonly encountered forms of corrupted Unicode text.
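
As a rough illustration of the kind of damage described above, the following
standard-library-only Python sketch produces mojibake and then reverses it.
It is not ftfy's implementation: the helper names make_mojibake and
undo_mojibake are hypothetical, the wrong encoding is assumed to be Latin-1,
and the real library relies on heuristics to decide when a repair is safe.

```python
import html


def make_mojibake(text: str, wrong_encoding: str = "latin-1") -> str:
    """Simulate the corruption: UTF-8 bytes decoded with the wrong encoding."""
    return text.encode("utf-8").decode(wrong_encoding)


def undo_mojibake(text: str, wrong_encoding: str = "latin-1") -> str:
    """Reverse one layer of the corruption (hypothetical helper).

    If the text cannot be re-encoded with the wrong encoding, or the
    resulting bytes are not valid UTF-8, the input is returned unchanged,
    so correctly encoded text is left alone.
    """
    try:
        return text.encode(wrong_encoding).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


original = "café"
broken = make_mojibake(original)      # 'cafÃ©'
repaired = undo_mojibake(broken)      # 'café'

# Two layers of corruption need two repair passes.
twice_broken = make_mojibake(make_mojibake(original))
twice_repaired = undo_mojibake(undo_mojibake(twice_broken))

# HTML entities outside an HTML context can be decoded with the stdlib.
unescaped = html.unescape("&lt;3")    # '<3'
```

The try/except fallback is what keeps already-correct strings such as "café"
untouched in this sketch; it is a much cruder safeguard than the checks a
dedicated library would apply.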
I plan to maintain this package as part of the Python team.