Package: pandoc
Version: 2.17.1.1-1.1
Severity: important
Tags: security upstream patch
Control: found -1 2.2.1-3
Control: found -1 2.9.2.1-1
X-Debbugs-Cc: guilhem@debian.org
Hi,
The following vulnerability was published for pandoc.
CVE-2023-35936[0]:
| Starting in version 1.13 and prior to version 3.1.4, Pandoc is
| susceptible to an arbitrary file write vulnerability, which can be
| triggered by providing a specially crafted image element in the input
| when generating files using the `--extract-media` option or outputting
| to PDF format. This vulnerability allows an attacker to create or
| overwrite arbitrary files on the system, depending on the privileges of
| the process running pandoc. It only affects systems that pass untrusted
| user input to pandoc and allow pandoc to be used to produce a PDF or
| with the `--extract-media` option. […] Note that the `--sandbox`
| option, which only affects IO done by readers and writers themselves,
| does not block this vulnerability.
While backporting the upstream fix to buster (LTS), I discovered that it
was incomplete. I reported the finding upstream, where it was promptly
fixed in 3.1.6 [1]. A separate CVE ID, CVE-2023-38745 [2], was assigned
for this.
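For illustration, here is a minimal sketch of why a single round of
unescaping is not enough. It is a simplified model and not pandoc's
actual code: unescapeOnce is a hypothetical stand-in for
Network.URI.unEscapeString, and looksSafe is a hypothetical reduction of
the ".." check in insertMedia. The file name is the double-encoded one
used in the upstream test case.

```haskell
import Data.Char (chr, digitToInt, isHexDigit)
import Data.List (isInfixOf)

-- Hypothetical stand-in for Network.URI.unEscapeString: decodes one layer
-- of percent-encoding and leaves malformed escapes untouched.
unescapeOnce :: String -> String
unescapeOnce ('%':a:b:rest)
  | isHexDigit a && isHexDigit b =
      chr (16 * digitToInt a + digitToInt b) : unescapeOnce rest
unescapeOnce (c:rest) = c : unescapeOnce rest
unescapeOnce [] = []

-- Hypothetical reduction of the first fix: accept a name when it has no
-- ".." after a single round of unescaping.
looksSafe :: String -> Bool
looksSafe name = not (".." `isInfixOf` unescapeOnce name)

main :: IO ()
main = do
  let name  = ".lua+%252f%252e%252e%252f%252e%252e%252fb%252elua"
      once  = unescapeOnce name  -- what the name check inspects
      twice = unescapeOnce once  -- what a second unescape before writing yields
  putStrLn once                  -- .lua+%2f%2e%2e%2f%2e%2e%2fb%2elua
  putStrLn twice                 -- .lua+/../../b.lua
  print (looksSafe name)         -- True: the check passes
  print (".." `isInfixOf` twice) -- True: the written path still traverses
```

The name passes the check after one decoding step, yet a later decoding
step turns it into a path containing "/../". This is why the follow-up
fix also rejects any unescaped name or extension that still contains a
'%'.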
The Security Team decided not to issue a DSA for these vulnerabilities,
but given they're about to be patched in buster it makes sense to patch
other suites, too. Please consider MR !3 for unstable:
https://salsa.debian.org/haskell-team/pandoc/-/merge_requests/3 .
debdiff attached for convenience.
I've also prepared (and tested) a fix for bullseye [3], which I'm planning
to submit to -pu once sid is patched. I'm also planning to rebuild the
targeted fix for bookworm and submit it to s-pu. Let me know if you
object :-)
Cheers,
--
Guilhem.
[0] https://security-tracker.debian.org/tracker/CVE-2023-35936
https://github.com/jgm/pandoc/security/advisories/GHSA-xj5q-fv23-575g
[1] https://github.com/jgm/pandoc/commit/eddedbfc14916aa06fc01ff04b38aeb30ae2e625
[2] https://security-tracker.debian.org/tracker/CVE-2023-38745
https://nvd.nist.gov/vuln/detail/CVE-2023-38745
[3] https://salsa.debian.org/lts-team/packages/pandoc/-/compare/debian%2F2.9.2.1-1...debian%2Fbullseye?from_project_id=22949&straight=false
diffstat for pandoc-2.17.1.1 pandoc-2.17.1.1
changelog | 9 +
patches/CVE-2023-35936.patch | 205 +++++++++++++++++++++++++++++++++++++++++++
patches/CVE-2023-38745.patch | 98 ++++++++++++++++++++
patches/series | 2
4 files changed, 314 insertions(+)
diff -Nru pandoc-2.17.1.1/debian/changelog pandoc-2.17.1.1/debian/changelog
--- pandoc-2.17.1.1/debian/changelog 2022-11-19 14:13:51.000000000 +0100
+++ pandoc-2.17.1.1/debian/changelog 2023-07-21 20:22:42.000000000 +0200
@@ -1,3 +1,12 @@
+pandoc (2.17.1.1-1.2) unstable; urgency=high
+
+ * Non-maintainer upload.
+ * Cherry-pick upstream fixes for CVE-2023-35936 from 3.1.4 release. (Closes:
+ #-1)
+ * Cherry-pick upstream fix for CVE-2023-38745 from 3.1.6 release.
+
+ -- Guilhem Moulin <guilhem@debian.org> Fri, 21 Jul 2023 20:22:42 +0200
+
pandoc (2.17.1.1-1.1) unstable; urgency=low
* Non-maintainer upload.
diff -Nru pandoc-2.17.1.1/debian/patches/CVE-2023-35936.patch pandoc-2.17.1.1/debian/patches/CVE-2023-35936.patch
--- pandoc-2.17.1.1/debian/patches/CVE-2023-35936.patch 1970-01-01 01:00:00.000000000 +0100
+++ pandoc-2.17.1.1/debian/patches/CVE-2023-35936.patch 2023-07-21 20:22:42.000000000 +0200
@@ -0,0 +1,205 @@
+From: John MacFarlane <jgm@berkeley.edu>
+Date: Tue, 20 Jun 2023 13:50:13 -0700
+Subject: Fix a security vulnerability in MediaBag and
+ T.P.Class.IO.writeMedia.
+
+This vulnerability, discovered by Entroy C, allows users to write
+arbitrary files to any location by feeding pandoc a specially crafted
+URL in an image element. The vulnerability is serious for anyone
+using pandoc to process untrusted input.
+
+Origin: https://github.com/jgm/pandoc/commit/5e381e3878b5da87ee7542f7e51c3c1a7fd84b89
+Origin: https://github.com/jgm/pandoc/commit/54561e9a6667b36a8452b01d2def9e3642013dd6
+Origin: https://github.com/jgm/pandoc/commit/df4f13b262f7be5863042f8a5a1c365282c81f07
+Origin: https://github.com/jgm/pandoc/commit/fe62da61dfd33e6b4c0c03895c528a47a0405bf7
+Origin: https://github.com/jgm/pandoc/commit/5246f02f0bb9c176a6d2f6e3d0c03407d8a67445
+Bug: https://github.com/jgm/pandoc/security/advisories/GHSA-xj5q-fv23-575g
+Bug-Debian: https://security-tracker.debian.org/tracker/CVE-2023-35936
+---
+ pandoc.cabal | 2 ++
+ src/Text/Pandoc/Class/IO.hs | 12 ++++++------
+ src/Text/Pandoc/MediaBag.hs | 27 ++++++++++++++++-----------
+ test/Tests/MediaBag.hs | 37 +++++++++++++++++++++++++++++++++++++
+ test/test-pandoc.hs | 2 ++
+ 5 files changed, 63 insertions(+), 17 deletions(-)
+ create mode 100644 test/Tests/MediaBag.hs
+
+diff --git a/pandoc.cabal b/pandoc.cabal
+index 52506e3..c5129a8 100644
+--- a/pandoc.cabal
++++ b/pandoc.cabal
+@@ -791,6 +791,7 @@ test-suite test-pandoc
+ tasty-lua >= 1.0 && < 1.1,
+ tasty-quickcheck >= 0.8 && < 0.11,
+ text >= 1.1.1.0 && < 2.1,
++ temporary >= 1.1 && < 1.4,
+ time >= 1.5 && < 1.14,
+ xml >= 1.3.12 && < 1.4,
+ zip-archive >= 0.2.3.4 && < 0.5
+@@ -800,6 +801,7 @@ test-suite test-pandoc
+ Tests.Lua
+ Tests.Lua.Module
+ Tests.Shared
++ Tests.MediaBag
+ Tests.Readers.LaTeX
+ Tests.Readers.HTML
+ Tests.Readers.JATS
+diff --git a/src/Text/Pandoc/Class/IO.hs b/src/Text/Pandoc/Class/IO.hs
+index 5d4dbc7..5043266 100644
+--- a/src/Text/Pandoc/Class/IO.hs
++++ b/src/Text/Pandoc/Class/IO.hs
+@@ -49,7 +49,7 @@ import Network.HTTP.Client.Internal (addProxy)
+ import Network.HTTP.Client.TLS (mkManagerSettings)
+ import Network.HTTP.Types.Header ( hContentType )
+ import Network.Socket (withSocketsDo)
+-import Network.URI (unEscapeString)
++import Network.URI (URI(..), parseURI, unEscapeString)
+ import System.Directory (createDirectoryIfMissing)
+ import System.Environment (getEnv)
+ import System.FilePath ((</>), takeDirectory, normalise)
+@@ -120,11 +120,11 @@ newUniqueHash = hashUnique <$> liftIO Data.Unique.newUnique
+
+ openURL :: (PandocMonad m, MonadIO m) => Text -> m (B.ByteString, Maybe MimeType)
+ openURL u
+- | Just u'' <- T.stripPrefix "data:" u = do
+- let mime = T.takeWhile (/=',') u''
+- let contents = UTF8.fromString $
+- unEscapeString $ T.unpack $ T.drop 1 $ T.dropWhile (/=',') u''
+- return (decodeLenient contents, Just mime)
++ | Just (URI{ uriScheme = "data:",
++ uriPath = upath }) <- parseURI (T.unpack u) = do
++ let (mime, rest) = break (== ',') $ unEscapeString upath
++ let contents = UTF8.fromString $ drop 1 rest
++ return (decodeLenient contents, Just (T.pack mime))
+ | otherwise = do
+ let toReqHeader (n, v) = (CI.mk (UTF8.fromText n), UTF8.fromText v)
+ customHeaders <- map toReqHeader <$> getsCommonState stRequestHeaders
+diff --git a/src/Text/Pandoc/MediaBag.hs b/src/Text/Pandoc/MediaBag.hs
+index df71ff8..45b74b5 100644
+--- a/src/Text/Pandoc/MediaBag.hs
++++ b/src/Text/Pandoc/MediaBag.hs
+@@ -28,12 +28,14 @@ import Data.Data (Data)
+ import qualified Data.Map as M
+ import Data.Maybe (fromMaybe, isNothing)
+ import Data.Typeable (Typeable)
++import Network.URI (unEscapeString)
+ import System.FilePath
+ import Text.Pandoc.MIME (MimeType, getMimeTypeDef, extensionFromMimeType)
+ import Data.Text (Text)
+ import qualified Data.Text as T
+ import Data.Digest.Pure.SHA (sha1, showDigest)
+-import Network.URI (URI (..), parseURI)
++import Network.URI (URI (..), parseURI, isURI)
++import Data.List (isInfixOf)
+
+ data MediaItem =
+ MediaItem
+@@ -52,9 +54,12 @@ newtype MediaBag = MediaBag (M.Map Text MediaItem)
+ instance Show MediaBag where
+ show bag = "MediaBag " ++ show (mediaDirectory bag)
+
+--- | We represent paths with /, in normalized form.
++-- | We represent paths with /, in normalized form. Percent-encoding
++-- is not resolved.
+ canonicalize :: FilePath -> Text
+-canonicalize = T.replace "\\" "/" . T.pack . normalise
++canonicalize fp
++ | isURI fp = T.pack fp
++ | otherwise = T.replace "\\" "/" . T.pack . normalise $ fp
+
+ -- | Delete a media item from a 'MediaBag', or do nothing if no item corresponds
+ -- to the given path.
+@@ -77,22 +82,22 @@ insertMedia fp mbMime contents (MediaBag mediamap) =
+ , mediaContents = contents
+ , mediaMimeType = mt }
+ fp' = canonicalize fp
++ fp'' = unEscapeString $ T.unpack fp'
+ uri = parseURI fp
+- newpath = if isRelative fp
++ newpath = if isRelative fp''
+ && isNothing uri
+- && ".." `notElem` splitDirectories fp
+- then T.unpack fp'
++ && not (".." `isInfixOf` fp'')
++ then fp''
+ else showDigest (sha1 contents) <> "." <> ext
+- fallback = case takeExtension fp of
+- ".gz" -> getMimeTypeDef $ dropExtension fp
+- _ -> getMimeTypeDef fp
++ fallback = case takeExtension fp'' of
++ ".gz" -> getMimeTypeDef $ dropExtension fp''
++ _ -> getMimeTypeDef fp''
+ mt = fromMaybe fallback mbMime
+- path = maybe fp uriPath uri
++ path = maybe fp'' (unEscapeString . uriPath) uri
+ ext = case takeExtension path of
+ '.':e -> e
+ _ -> maybe "" T.unpack $ extensionFromMimeType mt
+
+-
+ -- | Lookup a media item in a 'MediaBag', returning mime type and contents.
+ lookupMedia :: FilePath
+ -> MediaBag
+diff --git a/test/Tests/MediaBag.hs b/test/Tests/MediaBag.hs
+new file mode 100644
+index 0000000..b44232b
+--- /dev/null
++++ b/test/Tests/MediaBag.hs
+@@ -0,0 +1,37 @@
++{-# LANGUAGE OverloadedStrings #-}
++module Tests.MediaBag (tests) where
++
++import Test.Tasty
++import Test.Tasty.HUnit
++-- import Tests.Helpers
++import Text.Pandoc.Class (extractMedia, fillMediaBag, runIOorExplode)
++import System.IO.Temp (withTempDirectory)
++import Text.Pandoc.Shared (inDirectory)
++import System.FilePath
++import Text.Pandoc.Builder as B
++import System.Directory (doesFileExist, copyFile)
++
++tests :: [TestTree]
++tests = [
++ testCase "test fillMediaBag & extractMedia" $
++ withTempDirectory "." "extractMediaTest" $ \tmpdir -> inDirectory tmpdir $ do
++ copyFile "../../test/lalune.jpg" "moon.jpg"
++ let d = B.doc $
++ B.para (B.image "../../test/lalune.jpg" "" mempty) <>
++ B.para (B.image "moon.jpg" "" mempty) <>
++ B.para (B.image "data://image/png;base64,cHJpbnQgImhlbGxvIgo=;.lua+%2f%2e%2e%2f%2e%2e%2fa%2elua" "" mempty) <>
++ B.para (B.image "data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" "" mempty)
++ runIOorExplode $ do
++ fillMediaBag d
++ extractMedia "foo" d
++ exists1 <- doesFileExist ("foo" </> "moon.jpg")
++ assertBool "file in directory is not extracted with original name" exists1
++ exists2 <- doesFileExist ("foo" </> "f9d88c3dbe18f6a7f5670e994a947d51216cdf0e.jpg")
++ assertBool "file above directory is not extracted with hashed name" exists2
++ exists3 <- doesFileExist ("foo" </> "2a0eaa89f43fada3e6c577beea4f2f8f53ab6a1d.lua")
++ exists4 <- doesFileExist "a.lua"
++ assertBool "data uri with malicious payload gets written outside of destination dir"
++ (exists3 && not exists4)
++ exists5 <- doesFileExist ("foo" </> "d5fceb6532643d0d84ffe09c40c481ecdf59e15a.gif")
++ assertBool "data uri with gif is not properly decoded" exists5
++ ]
+diff --git a/test/test-pandoc.hs b/test/test-pandoc.hs
+index fcb157f..7d622eb 100644
+--- a/test/test-pandoc.hs
++++ b/test/test-pandoc.hs
+@@ -51,6 +51,7 @@ import qualified Tests.Writers.RST
+ import qualified Tests.Writers.AnnotatedTable
+ import qualified Tests.Writers.TEI
+ import qualified Tests.Writers.Markua
++import qualified Tests.MediaBag
+ import Text.Pandoc.Shared (inDirectory)
+
+ tests :: FilePath -> TestTree
+@@ -58,6 +59,7 @@ tests pandocPath = testGroup "pandoc tests"
+ [ Tests.Command.tests
+ , testGroup "Old" (Tests.Old.tests pandocPath)
+ , testGroup "Shared" Tests.Shared.tests
++ , testGroup "MediaBag" Tests.MediaBag.tests
+ , testGroup "Writers"
+ [ testGroup "Native" Tests.Writers.Native.tests
+ , testGroup "ConTeXt" Tests.Writers.ConTeXt.tests
diff -Nru pandoc-2.17.1.1/debian/patches/CVE-2023-38745.patch pandoc-2.17.1.1/debian/patches/CVE-2023-38745.patch
--- pandoc-2.17.1.1/debian/patches/CVE-2023-38745.patch 1970-01-01 01:00:00.000000000 +0100
+++ pandoc-2.17.1.1/debian/patches/CVE-2023-38745.patch 2023-07-21 20:22:42.000000000 +0200
@@ -0,0 +1,98 @@
+From: John MacFarlane <jgm@berkeley.edu>
+Date: Thu, 20 Jul 2023 09:26:38 -0700
+Subject: Fix new variant of the vulnerability in CVE-2023-35936.
+
+Guilhem Moulin noticed that the fix to CVE-2023-35936 was incomplete.
+An attacker could get around it by double-encoding the malicious
+extension to create or override arbitrary files.
+
+ $ echo '' >b.md
+ $ .cabal/bin/pandoc b.md --extract-media=bar
+ <p><img
+ src="bar/2a0eaa89f43fada3e6c577beea4f2f8f53ab6a1d.lua+%2f%2e%2e%2f%2e%2e%2fb%2elua" /></p>
+ $ cat b.lua
+ print "hello"
+ $ find bar
+ bar/
+ bar/2a0eaa89f43fada3e6c577beea4f2f8f53ab6a1d.lua+
+
+This commit adds a test case for this more complex attack and fixes
+the vulnerability. (The fix is quite simple: if the URL-unescaped
+filename or extension contains a '%', we just use the sha1 hash of the
+contents as the canonical name, just as we do if the filename contains
+'..'.)
+
+Origin: https://github.com/jgm/pandoc/commit/eddedbfc14916aa06fc01ff04b38aeb30ae2e625
+Bug-Debian: https://security-tracker.debian.org/tracker/CVE-2023-38745
+---
+ src/Text/Pandoc/Class/IO.hs | 2 ++
+ src/Text/Pandoc/MediaBag.hs | 7 ++++---
+ test/Tests/MediaBag.hs | 12 +++++++++++-
+ 3 files changed, 17 insertions(+), 4 deletions(-)
+
+diff --git a/src/Text/Pandoc/Class/IO.hs b/src/Text/Pandoc/Class/IO.hs
+index 5043266..b3f2a32 100644
+--- a/src/Text/Pandoc/Class/IO.hs
++++ b/src/Text/Pandoc/Class/IO.hs
+@@ -222,6 +222,8 @@ writeMedia :: (PandocMonad m, MonadIO m)
+ -> m ()
+ writeMedia dir (fp, _mt, bs) = do
+ -- we normalize to get proper path separators for the platform
++ -- we unescape URI encoding, but given how insertMedia
++ -- is written, we shouldn't have any % in a canonical media name...
+ let fullpath = normalise $ dir </> unEscapeString fp
+ liftIOError (createDirectoryIfMissing True) (takeDirectory fullpath)
+ logIOError $ BL.writeFile fullpath bs
+diff --git a/src/Text/Pandoc/MediaBag.hs b/src/Text/Pandoc/MediaBag.hs
+index 45b74b5..e02fc1a 100644
+--- a/src/Text/Pandoc/MediaBag.hs
++++ b/src/Text/Pandoc/MediaBag.hs
+@@ -87,16 +87,17 @@ insertMedia fp mbMime contents (MediaBag mediamap) =
+ newpath = if isRelative fp''
+ && isNothing uri
+ && not (".." `isInfixOf` fp'')
++ && '%' `notElem` fp''
+ then fp''
+- else showDigest (sha1 contents) <> "." <> ext
++ else showDigest (sha1 contents) <> ext
+ fallback = case takeExtension fp'' of
+ ".gz" -> getMimeTypeDef $ dropExtension fp''
+ _ -> getMimeTypeDef fp''
+ mt = fromMaybe fallback mbMime
+ path = maybe fp'' (unEscapeString . uriPath) uri
+ ext = case takeExtension path of
+- '.':e -> e
+- _ -> maybe "" T.unpack $ extensionFromMimeType mt
++ '.':e | '%' `notElem` e -> '.':e
++ _ -> maybe "" (\x -> '.':T.unpack x) $ extensionFromMimeType mt
+
+ -- | Lookup a media item in a 'MediaBag', returning mime type and contents.
+ lookupMedia :: FilePath
+diff --git a/test/Tests/MediaBag.hs b/test/Tests/MediaBag.hs
+index b44232b..c27a29b 100644
+--- a/test/Tests/MediaBag.hs
++++ b/test/Tests/MediaBag.hs
+@@ -19,7 +19,7 @@ tests = [
+ let d = B.doc $
+ B.para (B.image "../../test/lalune.jpg" "" mempty) <>
+ B.para (B.image "moon.jpg" "" mempty) <>
+- B.para (B.image "data://image/png;base64,cHJpbnQgImhlbGxvIgo=;.lua+%2f%2e%2e%2f%2e%2e%2fa%2elua" "" mempty) <>
++ B.para (B.image "data:image/png;base64,cHJpbnQgImhlbGxvIgo=;.lua+%2f%2e%2e%2f%2e%2e%2fa%2elua" "" mempty) <>
+ B.para (B.image "data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" "" mempty)
+ runIOorExplode $ do
+ fillMediaBag d
+@@ -34,4 +34,14 @@ tests = [
+ (exists3 && not exists4)
+ exists5 <- doesFileExist ("foo" </> "d5fceb6532643d0d84ffe09c40c481ecdf59e15a.gif")
+ assertBool "data uri with gif is not properly decoded" exists5
++ -- double-encoded version:
++ let e = B.doc $
++ B.para (B.image "data:image/png;base64,cHJpbnQgInB3bmVkIgo=;.lua+%252f%252e%252e%252f%252e%252e%252fb%252elua" "" mempty)
++ runIOorExplode $ do
++ fillMediaBag e
++ extractMedia "bar" e
++ exists6 <- doesFileExist ("bar" </> "772ceca21a2751863ec46cb23db0e7fc35b9cff8.png")
++ exists7 <- doesFileExist "b.lua"
++ assertBool "data uri with double-encoded malicious payload gets written outside of destination dir"
++ (exists6 && not exists7)
+ ]
diff -Nru pandoc-2.17.1.1/debian/patches/series pandoc-2.17.1.1/debian/patches/series
--- pandoc-2.17.1.1/debian/patches/series 2022-08-13 16:27:42.000000000 +0200
+++ pandoc-2.17.1.1/debian/patches/series 2023-07-21 20:22:42.000000000 +0200
@@ -2,3 +2,5 @@
020220531~9aff861.patch
2001_templates_avoid_privacy_breach.patch
2002_program_package_hint.patch
+CVE-2023-35936.patch
+CVE-2023-38745.patch