
Re: pragma supplementation-page



I've now pushed out a new version, with an initial demonstration
of the modification process.  It still has some bugs.  For example, this...

  {{{
  ## <<SomeMacro()>>
  }}}

... will cause it to leave internal data in its output, because it's confused
by the nested block/comment/macro syntax.
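
As a rough illustration of the kind of state tracking that bug calls for,
here is a minimal Python sketch that only reports macros found outside
{{{ }}} blocks and ## comment lines.  The function name and regexp are
hypothetical, not taken from the repo, and inline {{{...}}} in the middle
of a line is not handled:

```python
import re

# Hypothetical pattern for MoinMoin-style macro calls such as <<Include(acpid)>>
MACRO = re.compile(r"<<(\w+)\((.*?)\)>>")


def find_macros(text):
    """Return (line_number, macro_name) pairs, skipping {{{ }}} blocks and ## comments."""
    found = []
    in_block = False
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if in_block:
            # Stay in the block until a line contains the closing marker
            if "}}}" in stripped:
                in_block = False
            continue
        if stripped.startswith("{{{"):
            # A block that opens and closes on the same line is skipped in one go
            in_block = "}}}" not in stripped[3:]
            continue
        if stripped.startswith("##"):
            # Comment line: any macro inside it is not live markup
            continue
        for match in MACRO.finditer(line):
            found.append((number, match.group(1)))
    return found
```

With this approach the "## <<SomeMacro()>>" line inside the block above is
never inspected, so nothing is substituted there.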

You pointed out that discussion links are often part of translation headers,
so this demo focusses on standardising those headers.  It currently touches
over 4,800 pages - a quick eyeball suggests half of those are just bugs,
but that still means the next bulk change will be >2,000 translation headers.

To run the demo:

1. cd /path/to/mediawiki-recommendations
2. git pull
3. make your wiki dump available to the repo:
   ln -s /path/to/wiki/dump data
   (/path/to/wiki/dump should be the directory that has a "pages" subdirectory)
4. populate the cache directory:
   make cache
   (does a lot of random disk access, may be slow)
5. run the demo:
   make fix-headers
6. compare:
   for FILE in fix-headers/* ; do diff -Naur cache/"$(basename "$FILE")" "$FILE" ; done | less

There should be 4865 pages in fix-headers, and the diff should be just under
60,000 lines looking mostly like this:

--- cache/acpid 2011-07-02 13:40:41.000000000 +0100
+++ fix-headers/acpid   2025-09-12 18:42:16.984544888 +0100
@@ -1,5 +1,8 @@
 #language en
-~-[[DebianWiki/EditorGuide#translation|Translation(s)]]: English - [[it/acpid|Italiano]] -~
+#pragma supplementation-page on
+##TAG:TRANSLATION-HEADER-START
+~-[[DebianWiki/EditorGuide#translation|Translation(s)]]: [[acpid|English]] - [[it/acpid|Italiano]]-~
+##TAG:TRANSLATION-HEADER-END
 ----

--- cache/it(2f)acpid   2011-07-02 13:39:53.000000000 +0100
+++ fix-headers/it(2f)acpid     2025-09-12 18:42:22.542231757 +0100
@@ -1,5 +1,6 @@
 #language it
-~-[[DebianWiki/EditorGuide#translation|Translation(s)]]: [[acpid|English]] - Italiano -~
+#pragma supplementation-page on
+<<Include(acpid, ,from="^##TAG:TRANSLATION-HEADER-START",to="^##TAG:TRANSLATION-HEADER-END")>>
 ----
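
The two hunks above can be approximated with the following Python sketch.
It is an illustration under stated assumptions, not the code in the repo:
the function name, the language-prefix regexp and the idea of keying on the
page name are all hypothetical, and it does not rewrite the links inside the
English header the way the demo does (e.g. "English" to [[acpid|English]]):

```python
import re

HEADER_PREFIX = "~-[[DebianWiki/EditorGuide#translation|Translation(s)]]:"
# Assumption: translations live under a prefix such as "it/" or "pt_BR/"
LANG_PREFIX = re.compile(r"^[a-z]{2}(_[A-Za-z]{2})?/")


def rewrite_header(page_name, text):
    """Tag the header on an English page, or replace it with an Include on a translation."""
    out = []
    for line in text.splitlines():
        if not line.startswith(HEADER_PREFIX):
            out.append(line)
            continue
        out.append("#pragma supplementation-page on")
        match = LANG_PREFIX.match(page_name)
        if match:
            # Translation: pull the tagged header in from the English page
            english = page_name[match.end():]
            out.append(
                '<<Include(%s, ,from="^##TAG:TRANSLATION-HEADER-START",'
                'to="^##TAG:TRANSLATION-HEADER-END")>>' % english
            )
        else:
            # English page: keep the header, wrapped in the two tag comments
            out.append("##TAG:TRANSLATION-HEADER-START")
            out.append(line)
            out.append("##TAG:TRANSLATION-HEADER-END")
    return "\n".join(out) + "\n"
```

Pages without a recognisable header line pass through unchanged, which is
the behaviour you want for anything that needs handling by hand.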
 
My next step will be to go through and fix the bugs, but I'm busy this weekend
and have already spent more time on this than I meant to, so I may not
get back to you until the second half of next week.

Given the number of changes coming up, I'd be particularly interested to hear
what sort of review tools people would like.  Should I post the whole diff to
some pastebin?  Or just a list of modified files?

...

The above talks about changes that can be automated, but this has also revealed
a bunch of issues that need to be changed by hand.  The rest of this e-mail
contains my notes about manual work for the "translation headers" bulk change.

The following pages are translations whose <<Include>> has now been deleted.
I intend to delete these manually:

 * fr/Debian_GNU/kFreeBSD_why
 * pl/Debian_GNU/kFreeBSD_why
 * pt_BR/Teams/Welcome/Translators

The following pages do not use <<Include>>, but appear to be translations of
now-deleted pages.  I intend to delete these manually:

 * fr/AptOnCd
 * el/DebianInstaller/Loader
 * es/DebianInstaller/Loader
 * fr/DebianInstaller/Loader
 * zh_CN/DebianInstaller/Loader
 * fr/DeviceManagement
 * it/DeviceManagement
 * fr/HowToIdentifyADevice/PC_Card
 * ru/HowToIdentifyADevice/PC_Card
 * es/Orca
 * fr/Orca
 * it/Teams/Welcome/Translators
 * fr/ekiga
 * ru/ekiga

The following pages are translations with URLs that do not match the English
URL.  I intend to rename them and leave a redirect behind:

 * de/USB-DVBTStick (translation of USB-DVBT Stick)
 * es/Bienvenida (translation of Welcome)
 * es/BitTorrent/Transmission (translation of Transmission)
 * fr/Chroot (translation of chroot)
 * fr/EeePC/HowTo/Install (translation of DebianEeePC/HowTo/Install)
 * fr/GrubLegacyConfiguration (translation of GrubConfiguration)
 * fr/Karaoke (translation of MIDI)
 * fr/Lamp (translation of LaMp)
 * fr/SecurityChecklist (translation of SecurityManagement)
 * fr/ShellIntroduction (translation of CommandLineInterface)
 * fr/USB-DVBTStick (translation of USB-DVBT Stick)
 * fr/WhatIsSecurity (translation of SecurityManagement)
 * pl/Wiadomości (translation of News)
 * ru/Fluxbox (translation of FluxBox)
 * ru/Systemd (translation of systemd)
 * sk/Lamp (translation of LaMp)
 * zh_CN/Debian-GNU (translation of Debian_GNU)
 * zh_CN/LAMP (translation of LaMp)
 * zh_CN/ShellIntroduction (translation of CommandLineInterface)
 * zh_CN/Systemd (translation of systemd)

The language name "Norsk" is shared by "nn/FrontPage" and "no/KVM".
Those are the only two pages in those namespaces, so I don't have
enough evidence to suggest an official spelling.  This could cause a problem
in future if e.g. someone creates "no/FrontPage", but for now I intend to
simply call them both "Norsk".

The page "cn/EeePC" is the only page in the "cn" namespace, and
the pages "zh_CN/HelpDebian" and "zh_CN/chipo" use "#language cn".
I'm fairly sure they're using "cn" as a synonym for "zh_CN",
so I intend to manually edit the #language lines, rename the URL and
update links.

Translation headers contain blocks like [[lang_code/SomePage|Language name]].
I intend to standardise these by picking the most common name for each
language:

 * ar    العربية
 * be    Belarusian
 * bn    বাংলা
 * ca    Català
 * cs    Česky
 * da    Dansk
 * de    Deutsch
 * el    Ελληνικά
 * en    English
 * eo    Esperanto
 * es    Español
 * fa    فارسی (Persian)
 * fi    Suomi
 * fr    Français
 * he    עברית (Hebrew)
 * hu    Magyar
 * id    Indonesia
 * it    Italiano
 * ja    日本語 (Nihongo)
 * ko    한국어
 * ms    Melayu
 * nb    Norsk bokmål
 * nl    Nederlands
 * no    Norsk
 * pl    Polski
 * pt    Português
 * pt_br Português (Brasil)
 * pt_pt Portuguese (Portugal)
 * ro    Română
 * ru    Русский
 * se    Svenska
 * si    සිංහල-(Sinhala)
 * sk    Slovak
 * sr    српски
 * sv    Svenska
 * ta    தமிழ் (Tamil)
 * te    తెలుగు
 * tr    Türkçe
 * uk    Українська
 * vi    tiếng Việt
 * zh_cn 简体中文
 * zh_hk 中文
 * zh_tw 繁體中文
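
To show how a table like this could drive the standardised headers, here is
a small Python sketch.  It is only an illustration: the function names are
made up, only a handful of entries from the table are reproduced, and it
assumes every translation lives at lang_code/EnglishPageName (which is what
the renames listed earlier are meant to achieve):

```python
# A few entries copied from the table above; keys are lower-cased codes
LANGUAGE_NAMES = {
    "de": "Deutsch",
    "en": "English",
    "fr": "Français",
    "it": "Italiano",
    "pt_br": "Português (Brasil)",
    "zh_cn": "简体中文",
}


def header_link(lang_code, english_page):
    """Build one [[target|Language name]] block for a translation header."""
    name = LANGUAGE_NAMES[lang_code.lower()]
    if lang_code.lower() == "en":
        target = english_page
    else:
        # Keep the code as given (e.g. "pt_BR"), since page names are case-sensitive
        target = f"{lang_code}/{english_page}"
    return f"[[{target}|{name}]]"


def translation_header(english_page, lang_codes):
    """Join the per-language links into a header line like the one in the diff."""
    links = " - ".join(header_link(code, english_page) for code in lang_codes)
    return f"~-[[DebianWiki/EditorGuide#translation|Translation(s)]]: {links}-~"
```

For example, translation_header("acpid", ["en", "it"]) reproduces the new
header line shown in the acpid diff earlier in this e-mail.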

The following pages have /Discussion links that do not point to /Discussion.
Some point to a language-neutral discussion page (usually ../Discussion),
some point to a broader discussion page (e.g. ../ConventionsDiscussion).
I intend to leave these alone when converting /Discussion links
to supplementation-pages:

 * ../ConventionsDiscussion
  * DebianWiki/Browsing
  * DebianWiki/Contact
  * DebianWiki/DealingWithSpam
  * DebianWiki/EditorQuickStart
  * DebianWiki/Engine
  * DebianWiki/LicencingTerms
  * DebianWiki/Privacy
  * fr/DebianWiki/Contact
  * id/DebianWiki/EditorQuickStart
  * pt_BR/DebianWiki/Contact
  * pt_BR/DebianWiki/EditorQuickStart
  * ru/DebianWiki/Contact
  * ru/DebianWiki/EditorQuickStart
  * ru/DebianWiki/Engine
  * zh_CN/DebianWiki/Contact
  * zh_CN/DebianWiki/EditorQuickStart
 * ../Discussion
  * fa/DebianInstall
  * ru/DebianInstall
 * /../Discussion
  * fr/InstallingDebianOn/MSI/GS70/jessie
 * /ConventionsDiscussion
  * el/DebianWiki
  * pt_BR/DebianWiki
 * /DebianWannaBuildInfrastructureDiscussion
  * DebianWannaBuildInfrastructureOnOneServer
  * SetupBuildServiceForWanna-build
 * :/Discussion:Discussion
  * LDAP
 * DebianDesktop/Discussion
  * es/DebianDesktop
  * fr/DebianDesktop
  * sv/DebianDesktop
 * DebianDevelopment/Discussion
  * ar/DebianDevelopment
  * bn/DebianDevelopment
  * es/DebianDevelopment
  * fr/DebianDevelopment
  * id/DebianDevelopment
  * ja/DebianDevelopment
  * ms/DebianDevelopment
  * pt_PT/DebianDevelopment
  * ru/DebianDevelopment
  * sv/DebianDevelopment
 * DebianEeePCGerman/Accessibility/Discussion
  * de/EeePC/Accessibility
 * DebianEvents/Discussion
  * es/DebianEvents
  * fr/DebianEvents
  * uk/DebianEvents
 * DebianInstall/Discussion
  * el/DebianInstaller/Loader
  * id/DebianInstaller
  * zh_CN/DebianInstaller
 * DebianInstaller/Loader/Discussion
  * fr/DebianInstaller/Loader
 * DebianIntroduction/Discussion
  * da/DebianIntroduction
  * eo/DebianIntroduction
 * DebianWiki/Content/Discussion
  * DebianWiki/de/Content
  * es/DebianWiki/Content
  * fr/DebianWiki/Content
  * pt_BR/DebianWiki/Content
  * ru/DebianWiki/Content
  * uk/DebianWiki/Content
 * DebianWiki/ConventionsDiscussion
  * DebianWiki
  * DebianWiki/Administration
  * DebianWiki/TranslationNamespace
  * DebianWikiIsNotGFDL
  * de/DebianWiki/EditorQuickStart
  * es/DebianWiki
  * es/DebianWiki/Contact
  * es/DebianWiki/EditorGuide
  * es/DebianWiki/EditorQuickStart
  * fr/DebianWiki/EditorGuide
  * id/DebianWiki/EditorGuide
  * ja/DebianWiki
  * pt_PT/DebianWiki
  * ru/DebianWiki/EditorGuide
  * sysstat
  * uk/DebianWiki/Contact
  * uk/DebianWiki/EditorGuide
  * uk/DebianWiki/EditorQuickStart
 * DesktopEnvironment/Discussion
  * he/DesktopEnvironment
  * id/DesktopEnvironment
  * ta/DesktopEnvironment
 * Discussion
  * HalfLife2
 * FTP/Discussion
  * uk/FTP
 * Games/CannonSmash/Discussion
  * de/Games/CannonSmash
  * it/Games/CannonSmash
 * Games/Checklist/Discussion
  * fr/Games/Checklist
 * Games/TheBattleForWesnoth/Discussion
  * de/Games/TheBattleForWesnoth
 * GnuPG/Discussion
  * GnuPG/Discussion
  * de/GnuPG
 * Handheld/Discussion
  * uk/Handheld
 * IrcClients/Discussion
  * uk/IrcClients
 * MySql/Discussion
  * de/MySql
  * es/MySql
  * fr/MySql
  * ru/MySql
 * OfficeApplication/Discussion
  * fr/OfficeApplication
 * Openbox/Discussion
  * es/Openbox
  * ru/Openbox
 * Promote/Discussion
  * es/Promote
  * es/Promote/ButtonsAndBanners
 * Promote/PrintAdvertising/Discussion
  * es/Promote/PrintAdvertising
 * Promote/SocialNetworking/Discussion
  * es/Promote/SocialNetworking
 * Promote/Video/Discussion
  * Promote/Videos
 * QuickPackageManagementPolish/Discussion
  * pl/QuickPackageManagement
 * SVN/Discussion
  * Subversion
  * fr/Subversion
 * XMPP/Discussion
  * XMPP/Discussion
  * de/XMPP
 * es/Discussion
  * es/LVM
 * idesk/Discussion
  * fr/idesk
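
A list like the one above can be produced by a check along these lines.
This is a hedged Python sketch, not the script that generated the list:
the function names are hypothetical, and it assumes a link counts as
"pointing to /Discussion" only if it is the relative link "/Discussion" or
the page's own name followed by "/Discussion":

```python
from collections import defaultdict


def is_own_discussion(page_name, link_target):
    """True if the link goes to this page's own /Discussion subpage."""
    return link_target in ("/Discussion", f"{page_name}/Discussion")


def group_unusual_links(links):
    """Group pages by discussion target, keeping only targets to leave alone.

    `links` is an iterable of (page_name, link_target) pairs.
    """
    groups = defaultdict(list)
    for page, target in links:
        if not is_own_discussion(page, target):
            groups[target].append(page)
    # Sort both levels so the output is stable between runs
    return {target: sorted(pages) for target, pages in sorted(groups.items())}
```

Anything that passes is_own_discussion() is a candidate for conversion to a
supplementation-page; everything else lands in a group and is left untouched.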

Some people have two homepages - one in English, one in their native language.
See e.g. LunaJernberg and sv/LunaJernberg.  IMHO, this is a strong argument
in favour of putting the language code at the end of the URL in MediaWiki
(i.e. User:LunaJernberg and User:LunaJernberg/sv)
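
For illustration only, the suffix scheme could be derived mechanically from
the current page names.  The Python sketch below uses a hypothetical
function name and assumes language prefixes look like "sv/" or "pt_BR/";
it is not a proposal for the actual migration code:

```python
import re

# Assumption: a language prefix is two lower-case letters, optionally "_XX"
LANG_PREFIX = re.compile(r"^([a-z]{2}(?:_[A-Za-z]{2})?)/(.+)$")


def mediawiki_user_page(moin_page):
    """Map a homepage name to a User: page with the language code as a suffix."""
    match = LANG_PREFIX.match(moin_page)
    if match:
        lang, name = match.groups()
        return f"User:{name}/{lang}"
    # No language prefix: treat it as the base (English) homepage
    return f"User:{moin_page}"
```

With the suffix form, both homepages sit under the same User: page, so the
translation is a subpage of the original rather than a separate tree.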

The following pages need to be handled manually.  I intend to edit them myself:

 * JohnPinkerton
 * es/JohnPinkerton

FrontPage and its translations use <<Include(template/AllLangs)>> for their
translations.  I don't expect to touch them in this job, but we'll presumably
need to do something about it before migrating.

If you speak Korean, please adapt the header at the top of DefaultTemplate
into the following pages.  I'm fairly sure some lines at the top of them are
translation headers, but I don't speak Korean and any regexp that catches them
would have an unacceptable number of false positives:

 * DebianYeeloong/HowTo/Install
 * ko/DebianYeeloong/HowTo/Install
 * DebianYeeloong
 * ko/DebianYeeloong

Please adapt the header at the top of DefaultTemplate into any of the following
pages you think is appropriate.  Or if you don't want these pages to be treated
as translations in MediaWiki, please reply to this e-mail with an explanation.

 * pl/About
 * es/Apache
 * nl/Aptitude
 * ArmHardFloatPort
  * id/ArmHardFloatPort
  * zh_CN/ArmHardFloatPort
 * es/BSP/2004/11/LatinAmerica
 * Backup
  * fr/Backup
 * Brasil
 * de/CDDVD
 * fr/CUPS
 * CVS
 * de/Community
 * DDTP
  * zh_CN/DDTP
 * fr/DVD
 * DebianCustomCD
 * DebianDay/2007
  * es/DebianDay/2007
  * pt/DebianDay/2007
 * fr/DebianEdu
 * DebianEdu/BeforeGettingStarted
  * es/DebianEdu/BeforeGettingStarted
 * DebianEdu/HowTo
  * fr/DebianEdu/HowTo
 * ru/DebianEeePC/HowTo/UpgradeBIOS
 * DebianHosting
  * pl/DebianHosting
 * DebianIPv6
  * cs/DebianIPv6
 * DebianInstaller/Arm
 * DebianInstaller/Arm/OtherPlatforms
 * DebianInstaller/ReleaseAnnounce
  * ja/DebianInstaller/ReleaseAnnounce
 * ko/DebianInstaller/Team
 * DebianMed
  * fr/DebianMed
 * DebianPackagingHandbook
 * fr/DebianPolicy
 * DebianReference/Network
  * pt_BR/DebianReference/Network
 * pl/DebianSystem
 * DebianUkrainian
  * uk/DebianUkrainian
 * DebianYeeloong/HowTo/Xorg
  * ko/DebianYeeloong/HowTo/Xorg
 * DebianWomen/History
  * pt_BR/DebianWomen/History
 * DebianWomen/Index
  * fr/DebianWomen/Index
  * it/DebianWomen/Index
  * pt_BR/DebianWomen/Index
 * DebianWomen/Profiles
  * pt_BR/DebianWomen/Profiles
 * DenyHosts
 * es/Devel
  * fr/Devel
 * ru/Dictionary
 * DontBreakDebian/Discussion
  * es/DontBreakDebian/Discussion
 * cn/EeePC
 * de/EeePC/Bugs
 * ru/EeePC/FAQ
 * de/EeePC/HowTo/SplashyWithDmcryptAndStandardGrub
  * fr/EeePC/HowTo/SplashyWithDmcryptAndStandardGrub
 * de/EeePC/HowTo/UpgradeBIOS
 * de/EeePC/Models
 * fr/EeePC/Software/Productivity
 * de/EeePC/Status
 * de/EeePC/Todo
  * fr/EeePC/Todo
 * de/Firewalls
 * FixMe
  * fr/FixMe
 * ar/FortunesDebianHints
 * FreedomBox/Hardware/OrangePiZero
  * es/FreedomBox/Hardware/OrangePiZero
 * FreedomBox/Hardware/VirtualBox
  * es/FreedomBox/Hardware/VirtualBox
 * FreedomBox/HardwareRequirements
  * fr/FreedomBox/HardwareRequirements
 * FreedomBox/Maker
  * es/FreedomBox/Maker
 * FreedomBox/Manual/Coquelicot
  * es/FreedomBox/Manual/Coquelicot
 * FreedomBox/Manual/Developer
  * es/FreedomBox/Manual/Developer
 * FreedomBox/Plinth
  * es/FreedomBox/Plinth
 * FreedomBox/Portal
  * de/FreedomBox/Portal
  * es/FreedomBox/Portal
  * it/FreedomBox/Portal
  * ru/FreedomBox/Portal
  * uk/FreedomBox/Portal
 * FreedomBox/PrivacyAtHome
  * fr/FreedomBox/PrivacyAtHome
 * FreedomBox/UserExperience
  * es/FreedomBox/UserExperience
 * bn/GCC
 * Glossary
  * ja/Glossary
 * fr/Gnome3
  * ru/Gnome3
 * uk/GnuPG
 * fr/HowToGetABacktrace
 * ja/InstallingDebianOn/Apple/MacBookPro/11-1
 * InstallingDebianOn/PageFragments/AdditionalSuggestions
  * it/InstallingDebianOn/PageFragments/AdditionalSuggestions
  * pt_BR/InstallingDebianOn/PageFragments/AdditionalSuggestions
  * uk/InstallingDebianOn/PageFragments/AdditionalSuggestions
 * InstallingDebianOn/PageFragments/Philosophy
  * de/InstallingDebianOn/PageFragments/Philosophy
  * fr/InstallingDebianOn/PageFragments/Philosophy
  * it/InstallingDebianOn/PageFragments/Philosophy
  * pt_BR/InstallingDebianOn/PageFragments/Philosophy
  * ru/InstallingDebianOn/PageFragments/Philosophy
  * uk/InstallingDebianOn/PageFragments/Philosophy
 * fi/IntroDebianPackaging
  * fr/IntroDebianPackaging
  * ro/IntroDebianPackaging
 * KdeDebTasks
  * es/KdeDebTasks
 * Knoppix
  * fr/Knoppix
 * be/L10n
 * L10n/Coordination
  * zh_TW/L10n/Coordination
 * L10n/Dutch
  * nl/L10n/Dutch
 * tr/L10n/Turkish
 * bn/LaMp
 * LennyIllustratedInstall/Discussion
  * es/LennyIllustratedInstall/Discussion
 * LunaJernberg
  * sv/LunaJernberg
 * ManfredLichtenstern
  * de/ManfredLichtenstern
 * MediaWiki
  * es/MediaWiki
 * Mobile
 * pt_BR/Multiarch
 * Multiarch/FAQ/i386-amd64-kernel
  * it/Multiarch/FAQ/i386-amd64-kernel
 * NewInLenny
  * el/NewInLenny
 * el/News
  * ms/News
  * pl/News
 * OCaml
  * fr/OCaml
 * OpenRating/Categories
  * es/OpenRating/Categories
 * ja/Packaging
 * Profanity
  * de/Profanity
 * el/ProjectNews
 * kr/ProjectNews/HowToContribute
 * Prosody
  * de/Prosody
 * ReproducibleBuilds/About
  * ja/ReproducibleBuilds/About
 * ko/Salsa/Doc
 * Samba
 * eo/Screen
 * Teams/Guidelines
 * WiFi/Discussion
  * fr/WiFi/Discussion
 * ru/WikiTag
 * chipo
  * zh_CN/chipo
 * netconf
  * pt_BR/netconf
 * fr/owncloud
  * owncloud
 * qa.debian.org
 * de/screen
  * screen
 * it/sound
  * sound
 * fr/unclutter
  * unclutter
 * ДляНачала
 * ru/ПавлоРудий
  * ПавлоРудий

