
Bug#1041976: pandoc: CVE-2023-35936



Package: pandoc
Version: 2.17.1.1-1.1
Severity: important
Tags: security upstream patch
Control: found -1 2.2.1-3
Control: found -1 2.9.2.1-1
X-Debbugs-Cc: guilhem@debian.org

Hi,

The following vulnerability was published for pandoc.

CVE-2023-35936[0]:
| Starting in version 1.13 and prior to version 3.1.4, Pandoc is
| susceptible to an arbitrary file write vulnerability, which can be
| triggered by providing a specially crafted image element in the input
| when generating files using the `--extract-media` option or outputting
| to PDF format.  This vulnerability allows an attacker to create or
| overwrite arbitrary files on the system, depending on the privileges of
| the process running pandoc.  It only affects systems that pass untrusted
| user input to pandoc and allow pandoc to be used to produce a PDF or
| with the `--extract-media` option.  […]  Note that the `--sandbox`
| option, which only affects IO done by readers and writers themselves,
| does not block this vulnerability.

While backporting the upstream fix to buster (LTS), I discovered that it
was incomplete.  I reported the finding upstream, who promptly fixed it in
3.1.6 [1].  A separate CVE ID, CVE-2023-38745 [2], was assigned for this.
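
To illustrate why the first fix could be bypassed, here is a minimal
sketch of the double-encoding problem described in the CVE-2023-38745
patch attached below.  It is not pandoc's code: `unescapeOnce` is a
simplified, ASCII-only stand-in for Network.URI's `unEscapeString`, and
`safeAfterFirstFix` / `safeAfterSecondFix` are hypothetical names that
only model the ".." and '%' checks, leaving out the other conditions
in `insertMedia`.

```haskell
import Data.Char (chr, digitToInt, isHexDigit)
import Data.List (isInfixOf)

-- Decode one layer of percent-encoding (assumption: simplified,
-- ASCII-only stand-in for Network.URI.unEscapeString).
unescapeOnce :: String -> String
unescapeOnce ('%':a:b:rest)
  | isHexDigit a && isHexDigit b =
      chr (16 * digitToInt a + digitToInt b) : unescapeOnce rest
unescapeOnce (c:rest) = c : unescapeOnce rest
unescapeOnce []       = []

-- Models the 3.1.4-style check: look for ".." after a single decode.
safeAfterFirstFix :: String -> Bool
safeAfterFirstFix name = not (".." `isInfixOf` unescapeOnce name)

-- Models the 3.1.6-style check: additionally refuse any name that
-- still contains '%' after decoding, since a later decode could turn
-- it into path separators or "..".
safeAfterSecondFix :: String -> Bool
safeAfterSecondFix name =
  let decoded = unescapeOnce name
  in not (".." `isInfixOf` decoded) && '%' `notElem` decoded

-- Hypothetical double-encoded name, shaped like the one in the patch.
payload :: String
payload = "x.lua+%252f%252e%252e%252f%252e%252e%252fb%252elua"

main :: IO ()
main = do
  putStrLn (unescapeOnce payload)                 -- x.lua+%2f%2e%2e%2f%2e%2e%2fb%2elua
  putStrLn (unescapeOnce (unescapeOnce payload))  -- x.lua+/../../b.lua
  print (safeAfterFirstFix payload)               -- True: wrongly accepted
  print (safeAfterSecondFix payload)              -- False: rejected
```

The name passes a check made after one decode, but a second decode at
write time (as `writeMedia` does with `unEscapeString`) turns it into a
path that climbs out of the target directory.  Rejecting any leftover
'%' closes that gap without having to decode repeatedly.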

The Security Team decided not to issue a DSA for these vulnerabilities,
but given they're about to be patched in buster it makes sense to patch
other suites, too.  Please consider MR !3 for unstable:
https://salsa.debian.org/haskell-team/pandoc/-/merge_requests/3 .
debdiff attached for convenience.

I've also prepared (and tested) a fix for bullseye [3] which I'm planning
to submit to -pu once sid is patched.  Also planning to rebuild the
targeted fix for bookworm and submit it to s-pu.  Let me know if you
object :-)

Cheers,
-- 
Guilhem.

[0] https://security-tracker.debian.org/tracker/CVE-2023-35936
    https://github.com/jgm/pandoc/security/advisories/GHSA-xj5q-fv23-575g
[1] https://github.com/jgm/pandoc/commit/eddedbfc14916aa06fc01ff04b38aeb30ae2e625
[2] https://security-tracker.debian.org/tracker/CVE-2023-38745
    https://nvd.nist.gov/vuln/detail/CVE-2023-38745
[3] https://salsa.debian.org/lts-team/packages/pandoc/-/compare/debian%2F2.9.2.1-1...debian%2Fbullseye?from_project_id=22949&straight=false
diffstat for pandoc-2.17.1.1 pandoc-2.17.1.1

 changelog                    |    9 +
 patches/CVE-2023-35936.patch |  205 +++++++++++++++++++++++++++++++++++++++++++
 patches/CVE-2023-38745.patch |   98 ++++++++++++++++++++
 patches/series               |    2 
 4 files changed, 314 insertions(+)

diff -Nru pandoc-2.17.1.1/debian/changelog pandoc-2.17.1.1/debian/changelog
--- pandoc-2.17.1.1/debian/changelog	2022-11-19 14:13:51.000000000 +0100
+++ pandoc-2.17.1.1/debian/changelog	2023-07-21 20:22:42.000000000 +0200
@@ -1,3 +1,12 @@
+pandoc (2.17.1.1-1.2) unstable; urgency=high
+
+  * Non-maintainer upload.
+  * Cherry-pick upstream fixes for CVE-2023-35936 from 3.1.4 release. (Closes:
+    #-1)
+  * Cherry-pick upstream fix for CVE-2023-38745 from 3.1.6 release.
+
+ -- Guilhem Moulin <guilhem@debian.org>  Fri, 21 Jul 2023 20:22:42 +0200
+
 pandoc (2.17.1.1-1.1) unstable; urgency=low
 
   * Non-maintainer upload.
diff -Nru pandoc-2.17.1.1/debian/patches/CVE-2023-35936.patch pandoc-2.17.1.1/debian/patches/CVE-2023-35936.patch
--- pandoc-2.17.1.1/debian/patches/CVE-2023-35936.patch	1970-01-01 01:00:00.000000000 +0100
+++ pandoc-2.17.1.1/debian/patches/CVE-2023-35936.patch	2023-07-21 20:22:42.000000000 +0200
@@ -0,0 +1,205 @@
+From: John MacFarlane <jgm@berkeley.edu>
+Date: Tue, 20 Jun 2023 13:50:13 -0700
+Subject: Fix a security vulnerability in MediaBag and
+ T.P.Class.IO.writeMedia.
+
+This vulnerability, discovered by Entroy C, allows users to write
+arbitrary files to any location by feeding pandoc a specially crafted
+URL in an image element.  The vulnerability is serious for anyone
+using pandoc to process untrusted input.
+
+Origin: https://github.com/jgm/pandoc/commit/5e381e3878b5da87ee7542f7e51c3c1a7fd84b89
+Origin: https://github.com/jgm/pandoc/commit/54561e9a6667b36a8452b01d2def9e3642013dd6
+Origin: https://github.com/jgm/pandoc/commit/df4f13b262f7be5863042f8a5a1c365282c81f07
+Origin: https://github.com/jgm/pandoc/commit/fe62da61dfd33e6b4c0c03895c528a47a0405bf7
+Origin: https://github.com/jgm/pandoc/commit/5246f02f0bb9c176a6d2f6e3d0c03407d8a67445
+Bug: https://github.com/jgm/pandoc/security/advisories/GHSA-xj5q-fv23-575g
+Bug-Debian: https://security-tracker.debian.org/tracker/CVE-2023-35936
+---
+ pandoc.cabal                |  2 ++
+ src/Text/Pandoc/Class/IO.hs | 12 ++++++------
+ src/Text/Pandoc/MediaBag.hs | 27 ++++++++++++++++-----------
+ test/Tests/MediaBag.hs      | 37 +++++++++++++++++++++++++++++++++++++
+ test/test-pandoc.hs         |  2 ++
+ 5 files changed, 63 insertions(+), 17 deletions(-)
+ create mode 100644 test/Tests/MediaBag.hs
+
+diff --git a/pandoc.cabal b/pandoc.cabal
+index 52506e3..c5129a8 100644
+--- a/pandoc.cabal
++++ b/pandoc.cabal
+@@ -791,6 +791,7 @@ test-suite test-pandoc
+                   tasty-lua         >= 1.0     && < 1.1,
+                   tasty-quickcheck  >= 0.8     && < 0.11,
+                   text              >= 1.1.1.0 && < 2.1,
++                  temporary         >= 1.1     && < 1.4,
+                   time              >= 1.5     && < 1.14,
+                   xml               >= 1.3.12  && < 1.4,
+                   zip-archive       >= 0.2.3.4 && < 0.5
+@@ -800,6 +801,7 @@ test-suite test-pandoc
+                   Tests.Lua
+                   Tests.Lua.Module
+                   Tests.Shared
++                  Tests.MediaBag
+                   Tests.Readers.LaTeX
+                   Tests.Readers.HTML
+                   Tests.Readers.JATS
+diff --git a/src/Text/Pandoc/Class/IO.hs b/src/Text/Pandoc/Class/IO.hs
+index 5d4dbc7..5043266 100644
+--- a/src/Text/Pandoc/Class/IO.hs
++++ b/src/Text/Pandoc/Class/IO.hs
+@@ -49,7 +49,7 @@ import Network.HTTP.Client.Internal (addProxy)
+ import Network.HTTP.Client.TLS (mkManagerSettings)
+ import Network.HTTP.Types.Header ( hContentType )
+ import Network.Socket (withSocketsDo)
+-import Network.URI (unEscapeString)
++import Network.URI (URI(..), parseURI, unEscapeString)
+ import System.Directory (createDirectoryIfMissing)
+ import System.Environment (getEnv)
+ import System.FilePath ((</>), takeDirectory, normalise)
+@@ -120,11 +120,11 @@ newUniqueHash = hashUnique <$> liftIO Data.Unique.newUnique
+ 
+ openURL :: (PandocMonad m, MonadIO m) => Text -> m (B.ByteString, Maybe MimeType)
+ openURL u
+- | Just u'' <- T.stripPrefix "data:" u = do
+-     let mime     = T.takeWhile (/=',') u''
+-     let contents = UTF8.fromString $
+-                     unEscapeString $ T.unpack $ T.drop 1 $ T.dropWhile (/=',') u''
+-     return (decodeLenient contents, Just mime)
++ | Just (URI{ uriScheme = "data:",
++              uriPath = upath }) <- parseURI (T.unpack u) = do
++     let (mime, rest) = break (== ',') $ unEscapeString upath
++     let contents = UTF8.fromString $ drop 1 rest
++     return (decodeLenient contents, Just (T.pack mime))
+  | otherwise = do
+      let toReqHeader (n, v) = (CI.mk (UTF8.fromText n), UTF8.fromText v)
+      customHeaders <- map toReqHeader <$> getsCommonState stRequestHeaders
+diff --git a/src/Text/Pandoc/MediaBag.hs b/src/Text/Pandoc/MediaBag.hs
+index df71ff8..45b74b5 100644
+--- a/src/Text/Pandoc/MediaBag.hs
++++ b/src/Text/Pandoc/MediaBag.hs
+@@ -28,12 +28,14 @@ import Data.Data (Data)
+ import qualified Data.Map as M
+ import Data.Maybe (fromMaybe, isNothing)
+ import Data.Typeable (Typeable)
++import Network.URI (unEscapeString)
+ import System.FilePath
+ import Text.Pandoc.MIME (MimeType, getMimeTypeDef, extensionFromMimeType)
+ import Data.Text (Text)
+ import qualified Data.Text as T
+ import Data.Digest.Pure.SHA (sha1, showDigest)
+-import Network.URI (URI (..), parseURI)
++import Network.URI (URI (..), parseURI, isURI)
++import Data.List (isInfixOf)
+ 
+ data MediaItem =
+   MediaItem
+@@ -52,9 +54,12 @@ newtype MediaBag = MediaBag (M.Map Text MediaItem)
+ instance Show MediaBag where
+   show bag = "MediaBag " ++ show (mediaDirectory bag)
+ 
+--- | We represent paths with /, in normalized form.
++-- | We represent paths with /, in normalized form.  Percent-encoding
++-- is not resolved.
+ canonicalize :: FilePath -> Text
+-canonicalize = T.replace "\\" "/" . T.pack . normalise
++canonicalize fp
++  | isURI fp = T.pack fp
++  | otherwise = T.replace "\\" "/" . T.pack . normalise $ fp
+ 
+ -- | Delete a media item from a 'MediaBag', or do nothing if no item corresponds
+ -- to the given path.
+@@ -77,22 +82,22 @@ insertMedia fp mbMime contents (MediaBag mediamap) =
+                              , mediaContents = contents
+                              , mediaMimeType = mt }
+         fp' = canonicalize fp
++        fp'' = unEscapeString $ T.unpack fp'
+         uri = parseURI fp
+-        newpath = if isRelative fp
++        newpath = if isRelative fp''
+                        && isNothing uri
+-                       && ".." `notElem` splitDirectories fp
+-                     then T.unpack fp'
++                       && not (".." `isInfixOf` fp'')
++                     then fp''
+                      else showDigest (sha1 contents) <> "." <> ext
+-        fallback = case takeExtension fp of
+-                        ".gz" -> getMimeTypeDef $ dropExtension fp
+-                        _     -> getMimeTypeDef fp
++        fallback = case takeExtension fp'' of
++                        ".gz" -> getMimeTypeDef $ dropExtension fp''
++                        _     -> getMimeTypeDef fp''
+         mt = fromMaybe fallback mbMime
+-        path = maybe fp uriPath uri
++        path = maybe fp'' (unEscapeString . uriPath) uri
+         ext = case takeExtension path of
+                 '.':e -> e
+                 _ -> maybe "" T.unpack $ extensionFromMimeType mt
+ 
+-
+ -- | Lookup a media item in a 'MediaBag', returning mime type and contents.
+ lookupMedia :: FilePath
+             -> MediaBag
+diff --git a/test/Tests/MediaBag.hs b/test/Tests/MediaBag.hs
+new file mode 100644
+index 0000000..b44232b
+--- /dev/null
++++ b/test/Tests/MediaBag.hs
+@@ -0,0 +1,37 @@
++{-# LANGUAGE OverloadedStrings #-}
++module Tests.MediaBag (tests) where
++
++import Test.Tasty
++import Test.Tasty.HUnit
++-- import Tests.Helpers
++import Text.Pandoc.Class (extractMedia, fillMediaBag, runIOorExplode)
++import System.IO.Temp (withTempDirectory)
++import Text.Pandoc.Shared (inDirectory)
++import System.FilePath
++import Text.Pandoc.Builder as B
++import System.Directory (doesFileExist, copyFile)
++
++tests :: [TestTree]
++tests = [
++  testCase "test fillMediaBag & extractMedia" $
++      withTempDirectory "." "extractMediaTest" $ \tmpdir -> inDirectory tmpdir $ do
++        copyFile "../../test/lalune.jpg" "moon.jpg"
++        let d = B.doc $
++                  B.para (B.image "../../test/lalune.jpg" "" mempty) <>
++                  B.para (B.image "moon.jpg" "" mempty) <>
++                  B.para (B.image "data://image/png;base64,cHJpbnQgImhlbGxvIgo=;.lua+%2f%2e%2e%2f%2e%2e%2fa%2elua" "" mempty) <>
++                  B.para (B.image "data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" "" mempty)
++        runIOorExplode $ do
++          fillMediaBag d
++          extractMedia "foo" d
++        exists1 <- doesFileExist ("foo" </> "moon.jpg")
++        assertBool "file in directory is not extracted with original name" exists1
++        exists2 <- doesFileExist ("foo" </> "f9d88c3dbe18f6a7f5670e994a947d51216cdf0e.jpg")
++        assertBool "file above directory is not extracted with hashed name" exists2
++        exists3 <- doesFileExist ("foo" </> "2a0eaa89f43fada3e6c577beea4f2f8f53ab6a1d.lua")
++        exists4 <- doesFileExist "a.lua"
++        assertBool "data uri with malicious payload gets written outside of destination dir"
++          (exists3 && not exists4)
++        exists5 <- doesFileExist ("foo" </> "d5fceb6532643d0d84ffe09c40c481ecdf59e15a.gif")
++        assertBool "data uri with gif is not properly decoded" exists5
++  ]
+diff --git a/test/test-pandoc.hs b/test/test-pandoc.hs
+index fcb157f..7d622eb 100644
+--- a/test/test-pandoc.hs
++++ b/test/test-pandoc.hs
+@@ -51,6 +51,7 @@ import qualified Tests.Writers.RST
+ import qualified Tests.Writers.AnnotatedTable
+ import qualified Tests.Writers.TEI
+ import qualified Tests.Writers.Markua
++import qualified Tests.MediaBag
+ import Text.Pandoc.Shared (inDirectory)
+ 
+ tests :: FilePath -> TestTree
+@@ -58,6 +59,7 @@ tests pandocPath = testGroup "pandoc tests"
+         [ Tests.Command.tests
+         , testGroup "Old" (Tests.Old.tests pandocPath)
+         , testGroup "Shared" Tests.Shared.tests
++        , testGroup "MediaBag" Tests.MediaBag.tests
+         , testGroup "Writers"
+           [ testGroup "Native" Tests.Writers.Native.tests
+           , testGroup "ConTeXt" Tests.Writers.ConTeXt.tests
diff -Nru pandoc-2.17.1.1/debian/patches/CVE-2023-38745.patch pandoc-2.17.1.1/debian/patches/CVE-2023-38745.patch
--- pandoc-2.17.1.1/debian/patches/CVE-2023-38745.patch	1970-01-01 01:00:00.000000000 +0100
+++ pandoc-2.17.1.1/debian/patches/CVE-2023-38745.patch	2023-07-21 20:22:42.000000000 +0200
@@ -0,0 +1,98 @@
+From: John MacFarlane <jgm@berkeley.edu>
+Date: Thu, 20 Jul 2023 09:26:38 -0700
+Subject: Fix new variant of the vulnerability in CVE-2023-35936.
+
+Guilhem Moulin noticed that the fix to CVE-2023-35936 was incomplete.
+An attacker could get around it by double-encoding the malicious
+extension to create or override arbitrary files.
+
+    $ echo '![](data://image/png;base64,cHJpbnQgImhlbGxvIgo=;.lua+%252f%252e%252e%252f%252e%252e%252fb%252elua)' >b.md
+    $ .cabal/bin/pandoc b.md --extract-media=bar
+    <p><img
+    src="bar/2a0eaa89f43fada3e6c577beea4f2f8f53ab6a1d.lua+%2f%2e%2e%2f%2e%2e%2fb%2elua" /></p>
+    $ cat b.lua
+    print "hello"
+    $ find bar
+    bar/
+    bar/2a0eaa89f43fada3e6c577beea4f2f8f53ab6a1d.lua+
+
+This commit adds a test case for this more complex attack and fixes
+the vulnerability.  (The fix is quite simple: if the URL-unescaped
+filename or extension contains a '%', we just use the sha1 hash of the
+contents as the canonical name, just as we do if the filename contains
+'..'.)
+
+Origin: https://github.com/jgm/pandoc/commit/eddedbfc14916aa06fc01ff04b38aeb30ae2e625
+Bug-Debian: https://security-tracker.debian.org/tracker/CVE-2023-38745
+---
+ src/Text/Pandoc/Class/IO.hs |  2 ++
+ src/Text/Pandoc/MediaBag.hs |  7 ++++---
+ test/Tests/MediaBag.hs      | 12 +++++++++++-
+ 3 files changed, 17 insertions(+), 4 deletions(-)
+
+diff --git a/src/Text/Pandoc/Class/IO.hs b/src/Text/Pandoc/Class/IO.hs
+index 5043266..b3f2a32 100644
+--- a/src/Text/Pandoc/Class/IO.hs
++++ b/src/Text/Pandoc/Class/IO.hs
+@@ -222,6 +222,8 @@ writeMedia :: (PandocMonad m, MonadIO m)
+            -> m ()
+ writeMedia dir (fp, _mt, bs) = do
+   -- we normalize to get proper path separators for the platform
++  -- we unescape URI encoding, but given how insertMedia
++  -- is written, we shouldn't have any % in a canonical media name...
+   let fullpath = normalise $ dir </> unEscapeString fp
+   liftIOError (createDirectoryIfMissing True) (takeDirectory fullpath)
+   logIOError $ BL.writeFile fullpath bs
+diff --git a/src/Text/Pandoc/MediaBag.hs b/src/Text/Pandoc/MediaBag.hs
+index 45b74b5..e02fc1a 100644
+--- a/src/Text/Pandoc/MediaBag.hs
++++ b/src/Text/Pandoc/MediaBag.hs
+@@ -87,16 +87,17 @@ insertMedia fp mbMime contents (MediaBag mediamap) =
+         newpath = if isRelative fp''
+                        && isNothing uri
+                        && not (".." `isInfixOf` fp'')
++                       && '%' `notElem` fp''
+                      then fp''
+-                     else showDigest (sha1 contents) <> "." <> ext
++                     else showDigest (sha1 contents) <> ext
+         fallback = case takeExtension fp'' of
+                         ".gz" -> getMimeTypeDef $ dropExtension fp''
+                         _     -> getMimeTypeDef fp''
+         mt = fromMaybe fallback mbMime
+         path = maybe fp'' (unEscapeString . uriPath) uri
+         ext = case takeExtension path of
+-                '.':e -> e
+-                _ -> maybe "" T.unpack $ extensionFromMimeType mt
++                '.':e | '%' `notElem` e -> '.':e
++                _ -> maybe "" (\x -> '.':T.unpack x) $ extensionFromMimeType mt
+ 
+ -- | Lookup a media item in a 'MediaBag', returning mime type and contents.
+ lookupMedia :: FilePath
+diff --git a/test/Tests/MediaBag.hs b/test/Tests/MediaBag.hs
+index b44232b..c27a29b 100644
+--- a/test/Tests/MediaBag.hs
++++ b/test/Tests/MediaBag.hs
+@@ -19,7 +19,7 @@ tests = [
+         let d = B.doc $
+                   B.para (B.image "../../test/lalune.jpg" "" mempty) <>
+                   B.para (B.image "moon.jpg" "" mempty) <>
+-                  B.para (B.image "data://image/png;base64,cHJpbnQgImhlbGxvIgo=;.lua+%2f%2e%2e%2f%2e%2e%2fa%2elua" "" mempty) <>
++                  B.para (B.image "data:image/png;base64,cHJpbnQgImhlbGxvIgo=;.lua+%2f%2e%2e%2f%2e%2e%2fa%2elua" "" mempty) <>
+                   B.para (B.image "data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" "" mempty)
+         runIOorExplode $ do
+           fillMediaBag d
+@@ -34,4 +34,14 @@ tests = [
+           (exists3 && not exists4)
+         exists5 <- doesFileExist ("foo" </> "d5fceb6532643d0d84ffe09c40c481ecdf59e15a.gif")
+         assertBool "data uri with gif is not properly decoded" exists5
++        -- double-encoded version:
++        let e = B.doc $
++                  B.para (B.image "data:image/png;base64,cHJpbnQgInB3bmVkIgo=;.lua+%252f%252e%252e%252f%252e%252e%252fb%252elua" "" mempty)
++        runIOorExplode $ do
++          fillMediaBag e
++          extractMedia "bar" e
++        exists6 <- doesFileExist ("bar" </> "772ceca21a2751863ec46cb23db0e7fc35b9cff8.png")
++        exists7 <- doesFileExist "b.lua"
++        assertBool "data uri with double-encoded malicious payload gets written outside of destination dir"
++          (exists6 && not exists7)
+   ]
diff -Nru pandoc-2.17.1.1/debian/patches/series pandoc-2.17.1.1/debian/patches/series
--- pandoc-2.17.1.1/debian/patches/series	2022-08-13 16:27:42.000000000 +0200
+++ pandoc-2.17.1.1/debian/patches/series	2023-07-21 20:22:42.000000000 +0200
@@ -2,3 +2,5 @@
 020220531~9aff861.patch
 2001_templates_avoid_privacy_breach.patch
 2002_program_package_hint.patch
+CVE-2023-35936.patch
+CVE-2023-38745.patch


