[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: git-annex security update ready for testing and review



Hi,

[reducing CC list]

Thank you very much for the hint on the checksum verification commit
(2fb3722ce), it was really the bit missing. I've added that to the patch
series and rerolled the rest of the patches to add the Verify type (which is
now called VerifyConfig, but we don't have the Verification types yet).

I'm now more confident the patchset is complete. There are one tiny bit
I'm still slightly unsure of. In Command.Reinject.perform, there was a
`boolSystem "mv"` call lying around that was turned into a `moveFile`
some time between the jessie version and 2fb3722ce. I figured this was
the last instance of such an "mv" call and that moveFile does what it's
supposed to do in the jessie version. So to avoid any compiler mishaps,
I figured I would just use moveFile there but I'm not certain of the
implications.

I'm also wondering if there are reproducers for those vulnerabilities so
that I can test the new packages to see if they actually fix the
problems.

So I've uploaded the test packages to my repository again:

https://people.debian.org/~anarcat/debian/jessie-lts/

This time, testing would be greatly appreciated. And of course, a review
of the patchset would be great as well.

Thanks!

A.

-- 
For every complex problem, there is an answer that is clear, simple -
and wrong.
                        - H.L. Mencken
>From 2fb3722ce993f01adb3ea9d5f8855c3e6a99c249 Mon Sep 17 00:00:00 2001
From: Joey Hess <joeyh@joeyh.name>
Date: Thu, 1 Oct 2015 15:54:37 -0400
Subject: [PATCH] Do verification of checksums of annex objects downloaded from
 remotes.

* When annex objects are received into git repositories, their checksums are
  verified then too.
* To get the old, faster, behavior of not verifying checksums, set
  annex.verify=false, or remote.<name>.annex-verify=false.
* setkey, rekey: These commands also now verify that the provided file
  matches the key, unless annex.verify=false.
* reinject: Already verified content; this can now be disabled by
  setting annex.verify=false.

recvkey and reinject already did verification, so removed now duplicate
code from them. fsck still does its own verification, which is ok since it
does not use getViaTmp, so verification doesn't happen twice when using fsck
--from.
---
 Annex/Content.hs                              | 63 ++++++++++++++++---
 Command/Get.hs                                | 20 +++---
 Command/Move.hs                               |  3 +-
 Command/ReKey.hs                              |  2 +-
 Command/RecvKey.hs                            | 52 +++------------
 Command/Reinject.hs                           | 28 +++------
 Command/SetKey.hs                             |  2 +-
 Command/Sync.hs                               |  4 +-
 Command/TestRemote.hs                         |  6 +-
 Command/TransferKey.hs                        |  3 +-
 Command/TransferKeys.hs                       |  3 +-
 Remote/Git.hs                                 |  2 +-
 Types/GitConfig.hs                            |  4 ++
 debian/changelog                              | 14 +++++
 doc/git-annex-reinject.mdwn                   |  4 +-
 doc/git-annex-setkey.mdwn                     |  3 +-
 doc/git-annex.mdwn                            |  6 ++
 ..._2fa9445619032a378264de8b59958c60._comment | 17 +++++
 18 files changed, 137 insertions(+), 99 deletions(-)
 create mode 100644 doc/todo/checksum_verification_on_transfer/comment_3_2fa9445619032a378264de8b59958c60._comment

(This patch was backported with a small modification to avoir pulling
the changeset introducing Utility/FileSize.hs, and to use the old name of verifyKeyContent (fsckContent).)

Index: b/Annex/Content.hs
===================================================================
--- a/Annex/Content.hs	2018-08-30 16:20:32.951273846 -0400
+++ b/Annex/Content.hs	2018-08-30 16:20:32.943273729 -0400
@@ -16,6 +16,7 @@ module Annex.Content (
 	getViaTmpChecked,
 	getViaTmpUnchecked,
 	prepGetViaTmpChecked,
+	Verify(..),
 	prepTmp,
 	withTmp,
 	checkDiskSpace,
@@ -57,6 +58,9 @@ import Annex.Link
 import Annex.Content.Direct
 import Annex.ReplaceFile
 import Utility.LockFile
+import qualified Types.Remote
+import qualified Types.Backend
+import qualified Backend
 
 {- Checks if a given key's content is currently present. -}
 inAnnex :: Key -> Annex Bool
@@ -207,19 +211,19 @@ lockContent key a = do
 {- Runs an action, passing it a temporary filename to get,
  - and if the action succeeds, moves the temp file into 
  - the annex as a key's content. -}
-getViaTmp :: Key -> (FilePath -> Annex Bool) -> Annex Bool
+getViaTmp :: Verify -> Key -> (FilePath -> Annex Bool) -> Annex Bool
 getViaTmp = getViaTmpChecked (return True)
 
 {- Like getViaTmp, but does not check that there is enough disk space
  - for the incoming key. For use when the key content is already on disk
  - and not being copied into place. -}
-getViaTmpUnchecked :: Key -> (FilePath -> Annex Bool) -> Annex Bool
+getViaTmpUnchecked :: Verify -> Key -> (FilePath -> Annex Bool) -> Annex Bool
 getViaTmpUnchecked = finishGetViaTmp (return True)
 
-getViaTmpChecked :: Annex Bool -> Key -> (FilePath -> Annex Bool) -> Annex Bool
-getViaTmpChecked check key action = 
+getViaTmpChecked :: Verify -> Annex Bool -> Key -> (FilePath -> Annex Bool) -> Annex Bool
+getViaTmpChecked v check key action = 
 	prepGetViaTmpChecked key False $
-		finishGetViaTmp check key action
+		finishGetViaTmp v check key action
 
 {- Prepares to download a key via a tmp file, and checks that there is
  - enough free disk space.
@@ -245,16 +249,22 @@ prepGetViaTmpChecked key unabletoget get
 		, return unabletoget
 		)
 
-finishGetViaTmp :: Annex Bool -> Key -> (FilePath -> Annex Bool) -> Annex Bool
-finishGetViaTmp check key action = do
+finishGetViaTmp :: Verify -> Annex Bool -> Key -> (FilePath -> Annex Bool) -> Annex Bool
+finishGetViaTmp v check key action = do
 	tmpfile <- prepTmp key
 	ifM (action tmpfile <&&> check)
-		( do
-			moveAnnex key tmpfile
-			logStatus key InfoPresent
-			return True
-		-- the tmp file is left behind, in case caller wants
-		-- to resume its transfer
+		( ifM (verifyKeyContent v key tmpfile)
+			( do
+				moveAnnex key tmpfile
+				logStatus key InfoPresent
+				return True
+			, do
+				warning "verification of content failed"
+				liftIO $ nukeFile tmpfile
+				return False
+			)
+		-- On transfer failure, the tmp file is left behind, in case
+		-- caller wants to resume its transfer
 		, return False
 		)
 
@@ -275,6 +285,39 @@ withTmp key action = do
 	liftIO $ nukeFile tmp
 	return res
 
+{- Verifies that a file is the expected content of a key.
+ -
+ - Most keys have a known size, and if so, the file size is checked.
+ - This is not expensive, so is always done.
+ -
+ - When the key's backend allows verifying the content (eg via checksum),
+ - it is checked. This is an expensive check, so configuration can prevent
+ - it, for either a particular remote or always.
+ -}
+verifyKeyContent :: Verify -> Key -> FilePath -> Annex Bool
+verifyKeyContent v k f = verifysize <&&> verifycontent
+  where
+	verifysize = case Types.Key.keySize k of
+		Nothing -> return True
+		Just size -> do
+			size' <- liftIO $ catchDefaultIO 0 $
+				fromIntegral . fileSize <$> getFileStatus f
+			return (size' == size)
+	verifycontent = ifM (shouldVerify v)
+		( case Types.Backend.fsckKey =<< Backend.maybeLookupBackendName (Types.Key.keyBackendName k) of
+			Nothing -> return True
+			Just verifier -> verifier k f
+		, return True
+		)
+
+data Verify = AlwaysVerify | RemoteVerify Remote | DefaultVerify
+
+shouldVerify :: Verify -> Annex Bool
+shouldVerify AlwaysVerify = return True
+shouldVerify DefaultVerify = annexVerify <$> Annex.getGitConfig
+shouldVerify (RemoteVerify r) = shouldVerify DefaultVerify
+	<&&> pure (remoteAnnexVerify (Types.Remote.gitconfig r))
+
 {- Checks that there is disk space available to store a given key,
  - in a destination (or the annex) printing a warning if not. -}
 checkDiskSpace :: Maybe FilePath -> Key -> Integer -> Annex Bool
Index: b/Command/Get.hs
===================================================================
--- a/Command/Get.hs	2018-08-30 16:20:32.951273846 -0400
+++ b/Command/Get.hs	2018-08-30 16:20:32.943273729 -0400
@@ -53,17 +53,16 @@ start' expensivecheck from key afile = s
 		next a
 
 perform :: Key -> AssociatedFile -> CommandPerform
-perform key afile = stopUnless (getViaTmp key $ getKeyFile key afile) $
+perform key afile = stopUnless (getKey key afile) $
 	next $ return True -- no cleanup needed
 
 {- Try to find a copy of the file in one of the remotes,
  - and copy it to here. -}
-getKeyFile :: Key -> AssociatedFile -> FilePath -> Annex Bool
-getKeyFile key afile dest = getKeyFile' key afile dest
-	=<< Remote.keyPossibilities key
+getKey :: Key -> AssociatedFile -> Annex Bool
+getKey key afile = getKey' key afile =<< Remote.keyPossibilities key
 
-getKeyFile' :: Key -> AssociatedFile -> FilePath -> [Remote] -> Annex Bool
-getKeyFile' key afile dest = dispatch
+getKey' :: Key -> AssociatedFile -> [Remote] -> Annex Bool
+getKey' key afile = dispatch
   where
 	dispatch [] = do
 		showNote "not available"
@@ -87,6 +86,9 @@ getKeyFile' key afile dest = dispatch
 		| Remote.hasKeyCheap r =
 			either (const False) id <$> Remote.hasKey r key
 		| otherwise = return True
-	docopy r = download (Remote.uuid r) key afile noRetry $ \p -> do
-		showAction $ "from " ++ Remote.name r
-		Remote.retrieveKeyFile r key afile dest p
+	docopy r witness = getViaTmp (RemoteVerify r) key $ \dest ->
+		download (Remote.uuid r) key afile noRetry
+			(\p -> do
+				showAction $ "from " ++ Remote.name r
+				Remote.retrieveKeyFile r key afile dest p
+			) witness
Index: b/Command/Move.hs
===================================================================
--- a/Command/Move.hs	2018-08-30 16:20:32.951273846 -0400
+++ b/Command/Move.hs	2018-08-30 16:20:32.947273787 -0400
@@ -158,7 +158,8 @@ fromPerform src move key afile = ifM (in
 	go = notifyTransfer Download afile $ 
 		download (Remote.uuid src) key afile noRetry $ \p -> do
 			showAction $ "from " ++ Remote.name src
-			getViaTmp key $ \t -> Remote.retrieveKeyFile src key afile t p
+			getViaTmp (RemoteVerify src) key $ \t ->
+				Remote.retrieveKeyFile src key afile t p
 	dispatch _ False = stop -- failed
 	dispatch False True = next $ return True -- copy complete
 	dispatch True True = do -- finish moving
Index: b/Command/ReKey.hs
===================================================================
--- a/Command/ReKey.hs	2018-08-30 16:20:32.951273846 -0400
+++ b/Command/ReKey.hs	2018-08-30 16:20:32.947273787 -0400
@@ -49,7 +49,7 @@ perform file oldkey newkey = do
 {- Make a hard link to the old key content (when supported),
  - to avoid wasting disk space. -}
 linkKey :: Key -> Key -> Annex Bool
-linkKey oldkey newkey = getViaTmpUnchecked newkey $ \tmp -> do
+linkKey oldkey newkey = getViaTmpUnchecked DefaultVerify newkey $ \tmp -> do
 	src <- calcRepo $ gitAnnexLocation oldkey
 	liftIO $ ifM (doesFileExist tmp)
 		( return True
Index: b/Command/RecvKey.hs
===================================================================
--- a/Command/RecvKey.hs	2018-08-30 16:20:32.951273846 -0400
+++ b/Command/RecvKey.hs	2018-08-30 16:20:32.947273787 -0400
@@ -16,9 +16,6 @@ import Utility.Rsync
 import Logs.Transfer
 import Command.SendKey (fieldTransfer)
 import qualified CmdLine.GitAnnexShell.Fields as Fields
-import qualified Types.Key
-import qualified Types.Backend
-import qualified Backend
 
 cmd :: [Command]
 cmd = [noCommit $ command "recvkey" paramKey seek
@@ -28,8 +25,12 @@ seek :: CommandSeek
 seek = withKeys start
 
 start :: Key -> CommandStart
-start key = fieldTransfer Download key $ \_p ->
-	ifM (getViaTmp key go)
+start key = fieldTransfer Download key $ \_p -> do
+	-- Always verify content when a direct mode repo is sending a file,
+	-- as the file could change while being transferred.
+	fromdirect <- isJust <$> Fields.getField Fields.direct
+	let verify = if fromdirect then AlwaysVerify else DefaultVerify
+	ifM (getViaTmp verify key go)
 		( do
 			-- forcibly quit after receiving one key,
 			-- and shutdown cleanly
@@ -41,44 +42,4 @@ start key = fieldTransfer Download key $
 	go tmp = do
 		opts <- filterRsyncSafeOptions . maybe [] words
 			<$> getField "RsyncOptions"
-		ok <- liftIO $ rsyncServerReceive (map Param opts) tmp
-
-		-- The file could have been received with permissions that
-		-- do not allow reading it, so this is done before the
-		-- directcheck.
-		freezeContent tmp
-
-		if ok
-			then ifM (isJust <$> Fields.getField Fields.direct)
-				( directcheck tmp
-				, return True
-				)
-			else return False
-	{- If the sending repository uses direct mode, the file
-	 - it sends could be modified as it's sending it. So check
-	 - that the right size file was received, and that the key/value
-	 - Backend is happy with it. -}
-	directcheck tmp = do
-		oksize <- case Types.Key.keySize key of
-		        Nothing -> return True
-		        Just size -> do
-				size' <- fromIntegral . fileSize
-					<$> liftIO (getFileStatus tmp)
-				return $ size == size'
-		if oksize
-			then case Backend.maybeLookupBackendName (Types.Key.keyBackendName key) of
-				Nothing -> do
-					warning "recvkey: received key from direct mode repository using unknown backend; cannot check; discarding"
-					return False
-				Just backend -> maybe (return True) runfsck
-					(Types.Backend.fsckKey backend)
-			else do
-				warning "recvkey: received key with wrong size; discarding"
-				return False
-	  where
-		runfsck check = ifM (check key tmp)
-			( return True
-			, do
-				warning "recvkey: received key from direct mode repository seems to have changed as it was transferred; discarding"
-				return False
-			)
+		liftIO $ rsyncServerReceive (map Param opts) tmp
Index: b/Command/Reinject.hs
===================================================================
--- a/Command/Reinject.hs	2018-08-30 16:20:32.951273846 -0400
+++ b/Command/Reinject.hs	2018-08-30 16:20:32.947273787 -0400
@@ -11,8 +11,6 @@ import Common.Annex
 import Command
 import Logs.Location
 import Annex.Content
-import qualified Command.Fsck
-import qualified Backend
 
 cmd :: [Command]
 cmd = [command "reinject" (paramPair "SRC" "DEST") seek
@@ -35,27 +33,19 @@ start (src:dest:[])
 start _ = error "specify a src file and a dest file"
 
 perform :: FilePath -> FilePath -> Key -> CommandPerform
-perform src dest key = do
-	{- Check the content before accepting it. -}
-	v <- Backend.getBackend dest key
-	case v of
-		Nothing -> stop
-		Just backend ->
-			ifM (Command.Fsck.checkKeySizeOr reject key src
-				<&&> Command.Fsck.checkBackendOr reject backend key src)
-				( do
-					unlessM move $ error "mv failed!"
-					next $ cleanup key
-				, error "not reinjecting"
-				)
+perform src _dest key = ifM move
+	( next $ cleanup key
+	, error "failed"
+	)
   where
-	-- the file might be on a different filesystem,
+	-- The file might be on a different filesystem,
 	-- so mv is used rather than simply calling
-	-- moveToObjectDir; disk space is also
-	-- checked this way.
-	move = getViaTmp key $ \tmp ->
-		liftIO $ boolSystem "mv" [File src, File tmp]
-	reject = const $ return "wrong file?"
+	-- moveToObjectDir; disk space is also checked this way,
+	-- and the file's content is verified to match the key.
+	move = getViaTmp DefaultVerify key $ \tmp ->
+		liftIO $  catchBoolIO $ do
+ 			moveFile src tmp
+ 			return True
 
 cleanup :: Key -> CommandCleanup
 cleanup key = do
Index: b/Command/Sync.hs
===================================================================
--- a/Command/Sync.hs	2018-08-30 16:20:32.951273846 -0400
+++ b/Command/Sync.hs	2018-08-30 16:20:53.639578339 -0400
@@ -25,7 +25,7 @@ import qualified Remote.Git
 import Config
 import Annex.Wanted
 import Annex.Content
-import Command.Get (getKeyFile')
+import Command.Get (getKey')
 import qualified Command.Move
 import Logs.Location
 import Annex.Drop
@@ -367,7 +367,7 @@ syncFile rs f k = do
 		)
 	get have = commandAction $ do
 		showStart "get" f
-		next $ next $ getViaTmp k $ \dest -> getKeyFile' k (Just f) dest have
+		next $ next $ getKey' k (Just f) have
 
 	wantput r
 		| Remote.readonly r || remoteAnnexReadOnly (Remote.gitconfig r) = return False
Index: b/Command/TestRemote.hs
===================================================================
--- a/Command/TestRemote.hs	2018-08-30 16:20:32.951273846 -0400
+++ b/Command/TestRemote.hs	2018-08-30 16:20:32.947273787 -0400
@@ -153,7 +153,7 @@ test st r k =
 		Just b -> case fsckKey b of
 			Nothing -> return True
 			Just fscker -> fscker k (key2file k)
-	get = getViaTmp k $ \dest ->
+	get = getViaTmp (RemoteVerify r) k $ \dest ->
 		Remote.retrieveKeyFile r k Nothing dest nullMeterUpdate
 	store = Remote.storeKey r k Nothing nullMeterUpdate
 	remove = Remote.removeKey r k
@@ -167,10 +167,10 @@ testUnavailable st r k =
 	, check (`notElem` [Right True, Right False]) "checkPresent" $
 		Remote.checkPresent r k
 	, check (== Right False) "retrieveKeyFile" $
-		getViaTmp k $ \dest ->
+		getViaTmp (RemoteVerify r) k $ \dest ->
 			Remote.retrieveKeyFile r k Nothing dest nullMeterUpdate
 	, check (== Right False) "retrieveKeyFileCheap" $
-		getViaTmp k $ \dest ->
+		getViaTmp (RemoteVerify r) k $ \dest ->
 			Remote.retrieveKeyFileCheap r k dest
 	]
   where
Index: b/Command/TransferKey.hs
===================================================================
--- a/Command/TransferKey.hs	2018-08-30 16:20:32.951273846 -0400
+++ b/Command/TransferKey.hs	2018-08-30 16:20:32.947273787 -0400
@@ -51,7 +51,8 @@ toPerform remote key file = go Upload fi
 fromPerform :: Remote -> Key -> AssociatedFile -> CommandPerform
 fromPerform remote key file = go Upload file $
 	download (uuid remote) key file forwardRetry $ \p ->
-		getViaTmp key $ \t -> Remote.retrieveKeyFile remote key file t p
+		getViaTmp (RemoteVerify remote) key $
+			\t -> Remote.retrieveKeyFile remote key file t p
 
 go :: Direction -> AssociatedFile -> (NotifyWitness -> Annex Bool) -> CommandPerform
 go direction file a = notifyTransfer direction file a >>= liftIO . exitBool
Index: b/Command/TransferKeys.hs
===================================================================
--- a/Command/TransferKeys.hs	2018-08-30 16:20:32.951273846 -0400
+++ b/Command/TransferKeys.hs	2018-08-30 16:20:32.947273787 -0400
@@ -43,7 +43,8 @@ start = do
 				return ok
 		| otherwise = notifyTransfer direction file $
 			download (Remote.uuid remote) key file forwardRetry $ \p ->
-				getViaTmp key $ \t -> Remote.retrieveKeyFile remote key file t p
+				getViaTmp (RemoteVerify remote) key $ \t -> 
+					Remote.retrieveKeyFile remote key file t p
 
 runRequests
 	:: Handle
Index: b/Remote/Git.hs
===================================================================
--- a/Remote/Git.hs	2018-08-30 16:20:32.951273846 -0400
+++ b/Remote/Git.hs	2018-08-30 16:20:32.947273787 -0400
@@ -481,7 +481,7 @@ copyToRemote r key file p
 				ensureInitialized
 				runTransfer (Transfer Download u key) file noRetry $ const $
 					Annex.Content.saveState True `after`
-						Annex.Content.getViaTmpChecked (liftIO checksuccessio) key
+						Annex.Content.getViaTmpChecked (Annex.Content.RemoteVerify r) (liftIO checksuccessio) key
 							(\d -> rsyncOrCopyFile params object d p)
 			)
 
Index: b/Types/GitConfig.hs
===================================================================
--- a/Types/GitConfig.hs	2018-08-30 16:20:32.951273846 -0400
+++ b/Types/GitConfig.hs	2018-08-30 16:20:32.947273787 -0400
@@ -53,6 +53,7 @@ data GitConfig = GitConfig
 	, annexListen :: Maybe String
 	, annexStartupScan :: Bool
 	, annexHardLink :: Bool
+	, annexVerify :: Bool
 	, coreSymlinks :: Bool
 	, gcryptId :: Maybe String
 	}
@@ -89,6 +90,7 @@ extractGitConfig r = GitConfig
 	, annexListen = getmaybe (annex "listen")
 	, annexStartupScan = getbool (annex "startupscan") True
 	, annexHardLink = getbool (annex "hardlink") False
+	, annexVerify = getbool (annex "verify") True
 	, coreSymlinks = getbool "core.symlinks" True
 	, gcryptId = getmaybe "core.gcrypt-id"
 	}
@@ -113,6 +115,7 @@ data RemoteGitConfig = RemoteGitConfig
 	, remoteAnnexIgnore :: Bool
 	, remoteAnnexSync :: Bool
 	, remoteAnnexReadOnly :: Bool
+	, remoteAnnexVerify :: Bool
 	, remoteAnnexTrustLevel :: Maybe String
 	, remoteAnnexStartCommand :: Maybe String
 	, remoteAnnexStopCommand :: Maybe String
@@ -148,6 +151,7 @@ extractRemoteGitConfig r remotename = Re
 	, remoteAnnexIgnore = getbool "ignore" False
 	, remoteAnnexSync = getbool "sync" True
 	, remoteAnnexReadOnly = getbool "readonly" False
+	, remoteAnnexVerify = getbool "verify" True
 	, remoteAnnexTrustLevel = notempty $ getmaybe "trustlevel"
 	, remoteAnnexStartCommand = notempty $ getmaybe "start-command"
 	, remoteAnnexStopCommand = notempty $ getmaybe "stop-command"
Index: b/doc/git-annex.mdwn
===================================================================
--- a/doc/git-annex.mdwn	2018-08-30 16:20:32.951273846 -0400
+++ b/doc/git-annex.mdwn	2018-08-30 16:20:32.947273787 -0400
@@ -1587,6 +1587,12 @@ Here are all the supported configuration
   This both prevents git-annex sync from pushing changes, and prevents
   storing or removing files from read-only remote.
 
+* `remote.<name>.annex-verify`, `annex.verify`
+
+  By default, git-annex will verify the checksums of objects downloaded
+  from remotes. If you trust a remote and don't want the overhead
+  of these checksums, you can set this to `false`.
+
 * `remote.<name>.annexUrl`
 
   Can be used to specify a different url than the regular `remote.<name>.url`
Index: b/doc/todo/checksum_verification_on_transfer/comment_3_2fa9445619032a378264de8b59958c60._comment
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ b/doc/todo/checksum_verification_on_transfer/comment_3_2fa9445619032a378264de8b59958c60._comment	2018-08-30 16:20:32.947273787 -0400
@@ -0,0 +1,17 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""status update"""
+ date="2015-10-01T19:17:38Z"
+ content="""
+Checksum verification is now done for all downloads, unless disabled via
+annex.verify=false.
+
+When an object is uploaded to a regular git remote, checksum verification
+also also done. (For a local directory, git-annex runs a download from the
+perspective of the remote, so we get it for free, and when git-annex-shell
+recvkey is used, it checksums the data it receives and compares it with the
+key.)
+
+For uploads to special remotes, no checksum verification is done yet.
+Leaving this todo item open because of that gap in the coverage.
+"""]]
From: Joey Hess <id@joeyh.name>
Date: Mon, 18 Jun 2018 15:38:25 -0400
Subject: limit url downloads to whitelisted schemes

backported from 28720c795ff57a55b48e56d15f9b6bcb977f48d9

Security fix! Allowing any schemes, particularly file: and
possibly others like scp: allowed file exfiltration by anyone who had
write access to the git repository, since they could add an annexed file
using such an url, or using an url that redirected to such an url,
and wait for the victim to get it into their repository and send them a copy.

* Added annex.security.allowed-url-schemes setting, which defaults
  to only allowing http and https URLs. Note especially that file:/
  is no longer enabled by default.

* Removed annex.web-download-command, since its interface does not allow
  supporting annex.security.allowed-url-schemes across redirects.
  If you used this setting, you may want to instead use annex.web-options
  to pass options to curl.

With annex.web-download-command removed, nearly all url accesses in
git-annex are made via Utility.Url via http-client or curl. http-client
only supports http and https, so no problem there.
(Disabling one and not the other is not implemented.)

Used curl --proto to limit the allowed url schemes.
wget only supports http https ftp, so does not need any limiting.

Note that this will cause git annex fsck --from web to mark files using
a disallowed url scheme as not being present in the web. That seems
acceptable; fsck --from web also does that when a web server is not available.

quvi was not changed; it only supports a hardcoded set of urls, which
are http, not file urls.

This does not address any external special remotes that might download
an url themselves. Current thinking is all external special remotes will
need to be audited for this problem, although many of them will use
http libraries that only support http and not curl's menagarie.

The related problem of accessing private localhost and LAN urls is not
addressed by this commit.

This commit was sponsored by Brett Eisenberg on Patreon.

(patch backported from Debian stretch 6.20170101-1+deb9u2)

---

Index: b/Annex/Content.hs
===================================================================
--- a/Annex/Content.hs	2018-08-30 15:53:46.419628462 -0400
+++ b/Annex/Content.hs	2018-08-30 15:53:46.415628403 -0400
@@ -596,18 +596,10 @@ saveState nocommit = doSideAction $ do
 
 {- Downloads content from any of a list of urls. -}
 downloadUrl :: [Url.URLString] -> FilePath -> Annex Bool
-downloadUrl urls file = go =<< annexWebDownloadCommand <$> Annex.getGitConfig
+downloadUrl urls file = go
   where
-	go Nothing = Url.withUrlOptions $ \uo ->
+	go = Url.withUrlOptions $ \uo ->
 		anyM (\u -> Url.download u file uo) urls
-	go (Just basecmd) = liftIO $ anyM (downloadcmd basecmd) urls
-	downloadcmd basecmd url =
-		boolSystem "sh" [Param "-c", Param $ gencmd url basecmd]
-			<&&> doesFileExist file
-	gencmd url = massReplace
-		[ ("%file", shellEscape file)
-		, ("%url", shellEscape url)
-		]
 
 {- Copies a key's content, when present, to a temp file.
  - This is used to speed up some rsyncs. -}
Index: b/Annex/Url.hs
===================================================================
--- a/Annex/Url.hs	2018-08-30 15:53:46.419628462 -0400
+++ b/Annex/Url.hs	2018-08-30 15:53:46.415628403 -0400
@@ -30,6 +30,7 @@ getUrlOptions = mkUrlOptions
 	<$> getUserAgent
 	<*> headers
 	<*> options
+	<*> (annexAllowedUrlSchemes <$> Annex.getGitConfig)
   where
 	headers = do
 		v <- annexHttpHeadersCommand <$> Annex.getGitConfig
Index: b/Types/GitConfig.hs
===================================================================
--- a/Types/GitConfig.hs	2018-08-30 15:53:46.419628462 -0400
+++ b/Types/GitConfig.hs	2018-08-30 15:53:46.415628403 -0400
@@ -21,6 +21,9 @@ import Types.Distribution
 import Types.Availability
 import Types.NumCopies
 import Utility.HumanTime
+import Utility.Url (Scheme, mkScheme)
+
+import qualified Data.Set as S
 
 {- Main git-annex settings. Each setting corresponds to a git-config key
  - such as annex.foo -}
@@ -42,7 +45,6 @@ data GitConfig = GitConfig
 	, annexDebug :: Bool
 	, annexWebOptions :: [String]
 	, annexQuviOptions :: [String]
-	, annexWebDownloadCommand :: Maybe String
 	, annexCrippledFileSystem :: Bool
 	, annexLargeFiles :: Maybe String
 	, annexFsckNudge :: Bool
@@ -53,6 +55,8 @@ data GitConfig = GitConfig
 	, annexListen :: Maybe String
 	, annexStartupScan :: Bool
 	, annexHardLink :: Bool
+	, annexAllowedUrlSchemes :: S.Set Scheme
+	, annexAllowedHttpAddresses :: String
 	, annexVerify :: Bool
 	, coreSymlinks :: Bool
 	, gcryptId :: Maybe String
@@ -78,7 +82,6 @@ extractGitConfig r = GitConfig
 	, annexDebug = getbool (annex "debug") False
 	, annexWebOptions = getwords (annex "web-options")
 	, annexQuviOptions = getwords (annex "quvi-options")
-	, annexWebDownloadCommand = getmaybe (annex "web-download-command")
 	, annexCrippledFileSystem = getbool (annex "crippledfilesystem") False
 	, annexLargeFiles = getmaybe (annex "largefiles")
 	, annexFsckNudge = getbool (annex "fscknudge") True
@@ -90,6 +93,11 @@ extractGitConfig r = GitConfig
 	, annexListen = getmaybe (annex "listen")
 	, annexStartupScan = getbool (annex "startupscan") True
 	, annexHardLink = getbool (annex "hardlink") False
+	, annexAllowedUrlSchemes = S.fromList $ map mkScheme $
+		maybe ["http", "https", "ftp"] words $
+			getmaybe (annex "security.allowed-url-schemes")
+        , annexAllowedHttpAddresses = fromMaybe "" $
+		getmaybe (annex "security.allowed-http-addresses")
 	, annexVerify = getbool (annex "verify") True
 	, coreSymlinks = getbool "core.symlinks" True
 	, gcryptId = getmaybe "core.gcrypt-id"
Index: b/Utility/Url.hs
===================================================================
--- a/Utility/Url.hs	2018-08-30 15:53:46.419628462 -0400
+++ b/Utility/Url.hs	2018-08-30 15:53:46.415628403 -0400
@@ -12,6 +12,9 @@
 module Utility.Url (
 	URLString,
 	UserAgent,
+	Scheme,
+	mkScheme,
+	allowedScheme,
 	UrlOptions,
 	mkUrlOptions,
 	check,
@@ -30,6 +33,7 @@ import Data.Default
 import qualified Data.CaseInsensitive as CI
 import qualified Data.ByteString as B
 import qualified Data.ByteString.UTF8 as B8
+import qualified Data.Set as S
 
 import qualified Build.SysConfig
 
@@ -39,6 +43,15 @@ type Headers = [String]
 
 type UserAgent = String
 
+newtype Scheme = Scheme (CI.CI String)
+	deriving (Eq, Ord)
+
+mkScheme :: String -> Scheme
+mkScheme = Scheme . CI.mk
+
+fromScheme :: Scheme -> String
+fromScheme (Scheme s) = CI.original s
+
 data UrlOptions = UrlOptions
 	{ userAgent :: Maybe UserAgent
 	, reqHeaders :: Headers
@@ -48,19 +61,21 @@ data UrlOptions = UrlOptions
 #else
 	, applyRequest :: forall m. Request m -> Request m
 #endif
+	, allowedSchemes :: S.Set Scheme
 	}
 
 instance Default UrlOptions
   where
 	def = UrlOptions Nothing [] [] id
+		(S.fromList $ map mkScheme ["http", "https", "ftp"])
 
-mkUrlOptions :: Maybe UserAgent -> Headers -> [CommandParam] -> UrlOptions
-mkUrlOptions useragent reqheaders reqparams =
-	UrlOptions useragent reqheaders reqparams applyrequest
+mkUrlOptions :: Maybe UserAgent -> Headers -> [CommandParam] -> S.Set Scheme -> UrlOptions
+mkUrlOptions defuseragent reqheaders reqparams allowedschemes =
+	UrlOptions defuseragent reqheaders reqparams applyrequest allowedschemes
   where
 	applyrequest = \r -> r { requestHeaders = requestHeaders r ++ addedheaders }
 	addedheaders = uaheader ++ otherheaders
-	uaheader = case useragent of
+	uaheader = case defuseragent of
 		Nothing -> []
 		Just ua -> [(hUserAgent, B8.fromString ua)]
 	otherheaders = map toheader reqheaders
@@ -77,6 +92,28 @@ addUserAgent uo ps = case userAgent uo o
 	-- --user-agent works for both wget and curl commands
 	Just ua -> ps ++ [Param "--user-agent", Param ua] 
 
+checkPolicy :: UrlOptions -> URI -> a -> IO a -> IO a
+checkPolicy uo u onerr a
+	| allowedScheme uo u = a
+	| otherwise = do
+		hPutStrLn stderr $
+			"Configuration does not allow accessing " ++ show u
+		hFlush stderr
+		return onerr
+
+curlSchemeParams :: UrlOptions -> [CommandParam]
+curlSchemeParams uo = 
+	[ Param "--proto"
+	, Param $ intercalate "," ("-all" : schemelist)
+	]
+  where
+	schemelist = map fromScheme $ S.toList $ allowedSchemes uo
+
+allowedScheme :: UrlOptions -> URI -> Bool
+allowedScheme uo u = uscheme `S.member` allowedSchemes uo
+  where
+	uscheme = mkScheme $ takeWhile (/=':') (uriScheme u)
+
 {- Checks that an url exists and could be successfully downloaded,
  - also checking that its size, if available, matches a specified size. -}
 checkBoth :: URLString -> Maybe Integer -> UrlOptions -> IO Bool
@@ -96,7 +133,7 @@ check url expected_size = go <$$> exists
  - also returning its size if available. -}
 exists :: URLString -> UrlOptions -> IO (Bool, Maybe Integer)
 exists url uo = case parseURIRelaxed url of
-	Just u -> case parseUrl (show u) of
+	Just u -> checkPolicy uo u dne' $ case parseUrl (show u) of
 		Just req -> existsconduit req `catchNonAsync` const dne
 		-- http-conduit does not support file:, ftp:, etc urls,
 		-- so fall back to reading files and using curl.
@@ -115,14 +152,15 @@ exists url uo = case parseURIRelaxed url
 			| otherwise -> dne
 	Nothing -> dne
   where
-	dne = return (False, Nothing)
+	dne = return dne'
+	dne' = (False, Nothing)
 
 	curlparams = addUserAgent uo $
 		[ Param "-s"
 		, Param "--head"
 		, Param "-L", Param url
 		, Param "-w", Param "%{http_code}"
-		] ++ concatMap (\h -> [Param "-H", Param h]) (reqHeaders uo) ++ (reqParams uo)
+		] ++ concatMap (\h -> [Param "-H", Param h]) (reqHeaders uo) ++ (reqParams uo) ++ curlSchemeParams uo
 
 	extractlencurl s = case lastMaybe $ filter ("Content-Length:" `isPrefixOf`) (lines s) of
 		Just l -> case lastMaybe $ words l of
@@ -178,20 +216,25 @@ downloadQuiet = download' True
 download' :: Bool -> URLString -> FilePath -> UrlOptions -> IO Bool
 download' quiet url file uo = 
 	case parseURIRelaxed url of
-		Just u
-			| uriScheme u == "file:" -> do
+		Just u -> checkPolicy uo u False $
+			if uriScheme u == "file:" then do
 				-- curl does not create destination file
 				-- for an empty file:// url, so pre-create
 				writeFile file ""
 				curl
-			| otherwise -> ifM (inPath "wget") (wget , curl)
+			else ifM (inPath "wget") (wget , curl)
 		_ -> return False
   where
 	headerparams = map (\h -> Param $ "--header=" ++ h) (reqHeaders uo)
 	wget = go "wget" $ headerparams ++ quietopt "-q" ++ wgetparams
 	{- Regular wget needs --clobber to continue downloading an existing
 	 - file. On Android, busybox wget is used, which does not
-	 - support, or need that option. -}
+	 - support, or need that option.
+	 -
+	 - wget only supports https, http, and ftp, not file, which are
+	 - all always allowed, so its url schemes do not need to be
+	 - limited.
+         -}
 #ifndef __ANDROID__
 	wgetparams = [Params "--clobber -c -O"]
 #else
@@ -202,7 +245,7 @@ download' quiet url file uo =
 	 - the remainder to download as the whole file,
 	 - and not indicating how much percent was
 	 - downloaded before the resume. -}
-	curl = go "curl" $ headerparams ++ quietopt "-s" ++
+	curl = go "curl" $ headerparams ++ quietopt "-s" ++ curlSchemeParams uo ++
 		[Params "-f -L -C - -# -o"]
 	go cmd opts = boolSystem cmd $
 		addUserAgent uo $ reqParams uo++opts++[File file, File url]
Index: b/doc/git-annex.mdwn
===================================================================
--- a/doc/git-annex.mdwn	2018-08-30 15:53:46.419628462 -0400
+++ b/doc/git-annex.mdwn	2018-08-30 15:53:46.415628403 -0400
@@ -1694,13 +1694,20 @@ Here are all the supported configuration
   If set, the command is run and each line of its output is used as a HTTP
   header. This overrides annex.http-headers.
 
-* `annex.web-download-command`
+* `annex.security.allowed-url-schemes`
 
-  Use to specify a command to run to download a file from the web.
-  (The default is to use wget or curl.)
+  List of URL schemes that git-annex is allowed to download content from.
+  The default is "http https ftp".
 
-  In the command line, %url is replaced with the url to download,
-  and %file is replaced with the file that it should be saved to.
+  Think very carefully before changing this; there are security
+  implications. For example, if it's changed to allow "file" URLs, then
+  anyone who can get a commit into your git-annex repository could
+  `git-annex addurl` a pointer to a private file located outside that
+  repository, possibly causing it to be copied into your repository
+  and transferred on to other remotes, exposing its content.
+
+  Some special remotes support their own domain-specific URL
+  schemes; those are not affected by this configuration setting.
 
 * `annex.secure-erase-command`
 
From: Joey Hess <id@joeyh.name>
Date: Mon, 18 Jun 2018 17:40:50 -0400
Subject: block url downloads by default

git-annex will refuse to download content from the web, to prevent
accidental exposure of data on private webservers on localhost and the
LAN. This can be overridden with the
annex.security.allowed-http-addresses setting.

This is the simplest possible fix for the security hole. A better fix
has been developed for newer versions of git-annex but would be a lot of
work to backport, and perhaps too big a diff.

There are several sets of git-annex users who will be impacted
in different ways by this:

* Users who have a git-annex repository but don't use the web special
  remote. Unaffected.

* Users who have a git-annex repository that is for private use only.
  They will have to read enough docs to find the setting to allow
  git annex addurl to work again.

* Users who have a git-annex repositry that is shared with people they
  don't fully trust.
  They will not be able to use the web special remote with this version
  of git-annex. They'll have to upgrade.

The S3, glacier, and webdav special remotes are still allowed to
download from the web. There are other potential attacks involving the
web server they connect to redirecting to a local private web server,
and tricking them from downloading content from it which then leaks back
to the attacker. Those attacks are not addressed here, but they also
seem fairly unlikely. Further analysis is needed; preliminary analysis
of glacier-cli, for example, suggests it does not follow redirects and
so is not vulnerable to such attacks.

(patch backported from Debian stretch 6.20170101-1+deb9u2)

---

Index: b/Annex/Quvi.hs
===================================================================
--- a/Annex/Quvi.hs	2018-08-30 14:45:54.339694327 -0400
+++ b/Annex/Quvi.hs	2018-08-30 14:45:54.327694150 -0400
@@ -11,8 +11,8 @@ module Annex.Quvi where
 
 import Common.Annex
 import qualified Annex
+import Annex.Url
 import Utility.Quvi
-import Utility.Url
 
 withQuviOptions :: forall a. Query a -> [QuviParam] -> URLString -> Annex a
 withQuviOptions a ps url = do
@@ -21,7 +21,14 @@ withQuviOptions a ps url = do
 	liftIO $ a v (map (\mkp -> mkp v) ps++opts) url
 
 quviSupported :: URLString -> Annex Bool
-quviSupported u = liftIO . flip supported u =<< quviVersion
+quviSupported u = ifM httpAddressesUnlimited
+	( liftIO . flip supported u =<< quviVersion
+	-- Don't allow any url schemes to be used when
+	-- there's a limit on the allowed addresses, because
+	-- there is no way to prevent quvi from
+	-- redirecting to any address.
+	, return False
+	)
 
 quviVersion :: Annex QuviVersion
 quviVersion = go =<< Annex.getState Annex.quviversion
Index: b/Annex/Url.hs
===================================================================
--- a/Annex/Url.hs	2018-08-30 14:45:54.339694327 -0400
+++ b/Annex/Url.hs	2018-08-30 14:45:54.335694268 -0400
@@ -11,6 +11,7 @@ module Annex.Url (
 	withUrlOptions,
 	getUrlOptions,
 	getUserAgent,
+	httpAddressesUnlimited,
 ) where
 
 import Common.Annex
@@ -18,6 +19,8 @@ import qualified Annex
 import Utility.Url as U
 import qualified Build.SysConfig as SysConfig
 
+import qualified Data.Set as S
+
 defaultUserAgent :: U.UserAgent
 defaultUserAgent = "git-annex/" ++ SysConfig.packageversion
 
@@ -30,7 +33,7 @@ getUrlOptions = mkUrlOptions
 	<$> getUserAgent
 	<*> headers
 	<*> options
-	<*> (annexAllowedUrlSchemes <$> Annex.getGitConfig)
+	<*> urlschemes
   where
 	headers = do
 		v <- annexHttpHeadersCommand <$> Annex.getGitConfig
@@ -38,6 +41,18 @@ getUrlOptions = mkUrlOptions
 			Just cmd -> lines <$> liftIO (readProcess "sh" ["-c", cmd])
 			Nothing -> annexHttpHeaders <$> Annex.getGitConfig
 	options = map Param . annexWebOptions <$> Annex.getGitConfig
+	urlschemes = ifM httpAddressesUnlimited
+		( annexAllowedUrlSchemes <$> Annex.getGitConfig
+		-- Don't allow any url schemes to be used when
+		-- there's a limit on the allowed addresses, because
+		-- there is no way to prevent curl or wget from
+		-- redirecting to any address.
+		, pure S.empty
+		)
+
+httpAddressesUnlimited :: Annex Bool
+httpAddressesUnlimited =
+	("all" == ) . annexAllowedHttpAddresses <$> Annex.getGitConfig
 
 withUrlOptions :: (U.UrlOptions -> IO a) -> Annex a
 withUrlOptions a = liftIO . a =<< getUrlOptions
Index: b/doc/git-annex.mdwn
===================================================================
--- a/doc/git-annex.mdwn	2018-08-30 14:45:54.339694327 -0400
+++ b/doc/git-annex.mdwn	2018-08-30 14:45:54.335694268 -0400
@@ -1709,6 +1709,21 @@ Here are all the supported configuration
   Some special remotes support their own domain-specific URL
   schemes; those are not affected by this configuration setting.
 
+* `annex.security.allowed-http-addresses`
+
+  By default, this version of git-annex refuses to download the content of
+  annexed files from the web. Newer versions of git-annex allow downloading
+  from the web, but only when the web server is not on a private IP address.
+
+  To relax this security check and allow getting annexed files from
+  anywhere on the web, set this to "all".
+  
+  Think very carefully before changing this; there are security
+  implications. Anyone who can get a commit into your git-annex repository
+  could `git annex addurl` an url on a private http server, possibly
+  causing it to be downloaded into your repository and transferred to
+  other remotes, exposing its content.
+
 * `annex.secure-erase-command`
 
   This can be set to a command that should be run whenever git-annex
From: Joey Hess <joeyh@joeyh.name>
Date: Thu, 21 Jun 2018 11:35:27 -0400
Subject: add retrievalSecurityPolicy

This will be used to protect against CVE-2018-10859, where an encrypted
special remote is fed the wrong encrypted data, and so tricked into
decrypting something that the user encrypted with their gpg key and did
not store in git-annex.

It also protects against CVE-2018-10857, where a remote follows a http
redirect to a file:// url or to a local private web server. While that's
already been prevented in git-annex's own use of http, external special
remotes, hooks, etc use other http implementations and could still be
vulnerable.

The policy is not yet enforced, this commit only adds the appropriate
metadata to remotes.

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.

(patch backported from Debian stretch 6.20170101-1+deb9u2)

---

Index: b/Remote/Bup.hs
===================================================================
--- a/Remote/Bup.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Remote/Bup.hs	2018-08-28 17:34:28.000000000 -0400
@@ -57,6 +57,9 @@ gen r u c gc = do
 		, storeKey = storeKeyDummy
 		, retrieveKeyFile = retreiveKeyFileDummy
 		, retrieveKeyFileCheap = retrieveCheap buprepo
+		-- Bup uses git, which cryptographically verifies content
+		-- (with SHA1, but sufficiently for this).
+		, retrievalSecurityPolicy = RetrievalAllKeysSecure
 		, removeKey = removeKeyDummy
 		, checkPresent = checkPresentDummy
 		, checkPresentCheap = bupLocal buprepo
Index: b/Remote/Ddar.hs
===================================================================
--- a/Remote/Ddar.hs	2018-08-28 17:34:28.000000000 -0400
+++ b/Remote/Ddar.hs	2018-08-28 17:39:54.000000000 -0400
@@ -55,6 +55,8 @@ gen r u c gc = do
 		, storeKey = storeKeyDummy
 		, retrieveKeyFile = retreiveKeyFileDummy
 		, retrieveKeyFileCheap = retrieveCheap
+		-- Unsure about this, safe default until Robie answers.
+		, retrievalSecurityPolicy = RetrievalVerifiableKeysSecure
 		, removeKey = removeKeyDummy
 		, checkPresent = checkPresentDummy
 		, checkPresentCheap = ddarLocal ddarrepo
Index: b/Remote/Directory.hs
===================================================================
--- a/Remote/Directory.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Remote/Directory.hs	2018-08-28 17:34:28.000000000 -0400
@@ -53,6 +53,7 @@ gen r u c gc = do
 			storeKey = storeKeyDummy,
 			retrieveKeyFile = retreiveKeyFileDummy,
 			retrieveKeyFileCheap = retrieveCheap dir chunkconfig,
+			retrievalSecurityPolicy = RetrievalAllKeysSecure,
 			removeKey = removeKeyDummy,
 			checkPresent = checkPresentDummy,
 			checkPresentCheap = True,
Index: b/Remote/External.hs
===================================================================
--- a/Remote/External.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Remote/External.hs	2018-08-28 17:34:28.000000000 -0400
@@ -53,6 +53,11 @@ gen r u c gc = do
 			storeKey = storeKeyDummy,
 			retrieveKeyFile = retreiveKeyFileDummy,
 			retrieveKeyFileCheap = \_ _ -> return False,
+			-- External special remotes use many http libraries
+			-- and have no protection against redirects to
+			-- local private web servers, or in some cases
+			-- to file:// urls.
+			retrievalSecurityPolicy = RetrievalVerifiableKeysSecure,
 			removeKey = removeKeyDummy,
 			checkPresent = checkPresentDummy,
 			checkPresentCheap = False,
Index: b/Remote/GCrypt.hs
===================================================================
--- a/Remote/GCrypt.hs	2018-08-28 17:34:28.000000000 -0400
+++ b/Remote/GCrypt.hs	2018-08-28 17:34:28.000000000 -0400
@@ -108,6 +108,7 @@ gen' r u c gc = do
 		, storeKey = storeKeyDummy
 		, retrieveKeyFile = retreiveKeyFileDummy
 		, retrieveKeyFileCheap = \_ _ -> return False
+		, retrievalSecurityPolicy = RetrievalAllKeysSecure
 		, removeKey = removeKeyDummy
 		, checkPresent = checkPresentDummy
 		, checkPresentCheap = repoCheap r
Index: b/Remote/Git.hs
===================================================================
--- a/Remote/Git.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Remote/Git.hs	2018-08-28 17:42:25.000000000 -0400
@@ -140,6 +140,7 @@ gen r u c gc
 			, storeKey = copyToRemote new
 			, retrieveKeyFile = copyFromRemote new
 			, retrieveKeyFileCheap = copyFromRemoteCheap new
+			, retrievalSecurityPolicy = RetrievalAllKeysSecure
 			, removeKey = dropKey new
 			, checkPresent = inAnnex new
 			, checkPresentCheap = repoCheap r
Index: b/Remote/Glacier.hs
===================================================================
--- a/Remote/Glacier.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Remote/Glacier.hs	2018-08-28 17:39:55.000000000 -0400
@@ -53,6 +53,9 @@ gen r u c gc = new <$> remoteCost gc ver
 			storeKey = storeKeyDummy,
 			retrieveKeyFile = retreiveKeyFileDummy,
 			retrieveKeyFileCheap = retrieveCheap this,
+			-- glacier-cli does not follow redirects and does
+			-- not support file://, so this is secure.
+			retrievalSecurityPolicy = RetrievalAllKeysSecure,
 			removeKey = removeKeyDummy,
 			checkPresent = checkPresentDummy,
 			checkPresentCheap = False,
Index: b/Remote/Helper/Special.hs
===================================================================
--- a/Remote/Helper/Special.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Remote/Helper/Special.hs	2018-08-28 17:34:28.000000000 -0400
@@ -162,6 +162,14 @@ specialRemote' cfg c preparestorer prepa
 			(retrieveKeyFileCheap baser k d)
 			-- retrieval of encrypted keys is never cheap
 			(\_ -> return False)
+		-- When encryption is used, the remote could provide
+		-- some other content encrypted by the user, and trick
+		-- git-annex into decrypting it, leaking the decryption
+		-- into the git-annex repository. Verifiable keys
+		-- are the main protection against this attack.
+		, retrievalSecurityPolicy = if isencrypted
+			then RetrievalVerifiableKeysSecure
+			else retrievalSecurityPolicy baser
 		, removeKey = \k -> cip >>= removeKeyGen k
 		, checkPresent = \k -> cip >>= checkPresentGen k
 		, cost = maybe
@@ -176,6 +184,7 @@ specialRemote' cfg c preparestorer prepa
 				]
 		}
 	cip = cipherKey c
+	isencrypted = isJust (extractCipher c)
 	gpgopts = getGpgEncParams encr
 
 	safely a = catchNonAsync a (\e -> warning (show e) >> return False)
Index: b/Remote/Hook.hs
===================================================================
--- a/Remote/Hook.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Remote/Hook.hs	2018-08-28 17:34:28.000000000 -0400
@@ -46,6 +46,9 @@ gen r u c gc = do
 			storeKey = storeKeyDummy,
 			retrieveKeyFile = retreiveKeyFileDummy,
 			retrieveKeyFileCheap = retrieveCheap hooktype,
+			-- A hook could use http and be vulnerable to
+			-- redirect to file:// attacks, etc.
+			retrievalSecurityPolicy = RetrievalVerifiableKeysSecure,
 			removeKey = removeKeyDummy,
 			checkPresent = checkPresentDummy,
 			checkPresentCheap = False,
Index: b/Remote/Rsync.hs
===================================================================
--- a/Remote/Rsync.hs	2018-08-28 17:34:28.000000000 -0400
+++ b/Remote/Rsync.hs	2018-08-28 17:34:28.000000000 -0400
@@ -68,6 +68,7 @@ gen r u c gc = do
 			, storeKey = storeKeyDummy
 			, retrieveKeyFile = retreiveKeyFileDummy
 			, retrieveKeyFileCheap = retrieveCheap o
+			, retrievalSecurityPolicy = RetrievalAllKeysSecure
 			, removeKey = removeKeyDummy
 			, checkPresent = checkPresentDummy
 			, checkPresentCheap = False
Index: b/Remote/S3.hs
===================================================================
--- a/Remote/S3.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Remote/S3.hs	2018-08-28 17:34:28.000000000 -0400
@@ -58,6 +58,9 @@ gen r u c gc = new <$> remoteCost gc exp
 			storeKey = storeKeyDummy,
 			retrieveKeyFile = retreiveKeyFileDummy,
 			retrieveKeyFileCheap = retrieveCheap,
+			-- HttpManagerRestricted is used here, so this is
+			-- secure.
+			retrievalSecurityPolicy = RetrievalAllKeysSecure,
 			removeKey = removeKeyDummy,
 			checkPresent = checkPresentDummy,
 			checkPresentCheap = False,
Index: b/Remote/Tahoe.hs
===================================================================
--- a/Remote/Tahoe.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Remote/Tahoe.hs	2018-08-28 17:34:28.000000000 -0400
@@ -71,6 +71,8 @@ gen r u c gc = do
 		storeKey = store u hdl,
 		retrieveKeyFile = retrieve u hdl,
 		retrieveKeyFileCheap = \_ _ -> return False,
+		-- Tahoe cryptographically verifies content.
+		retrievalSecurityPolicy = RetrievalAllKeysSecure,
 		removeKey = remove,
 		checkPresent = checkKey u hdl,
 		checkPresentCheap = False,
Index: b/Remote/Web.hs
===================================================================
--- a/Remote/Web.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Remote/Web.hs	2018-08-28 17:34:28.000000000 -0400
@@ -49,6 +49,9 @@ gen r _ c gc =
 		storeKey = uploadKey,
 		retrieveKeyFile = downloadKey,
 		retrieveKeyFileCheap = downloadKeyCheap,
+		-- HttpManagerRestricted is used here, so this is
+		-- secure.
+		retrievalSecurityPolicy = RetrievalAllKeysSecure,
 		removeKey = dropKey,
 		checkPresent = checkKey,
 		checkPresentCheap = False,
Index: b/Remote/WebDAV.hs
===================================================================
--- a/Remote/WebDAV.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Remote/WebDAV.hs	2018-08-28 17:34:28.000000000 -0400
@@ -58,6 +58,9 @@ gen r u c gc = new <$> remoteCost gc exp
 			storeKey = storeKeyDummy,
 			retrieveKeyFile = retreiveKeyFileDummy,
 			retrieveKeyFileCheap = retrieveCheap,
+			-- HttpManagerRestricted is used here, so this is
+			-- secure.
+			retrievalSecurityPolicy = RetrievalAllKeysSecure,
 			removeKey = removeKeyDummy,
 			checkPresent = checkPresentDummy,
 			checkPresentCheap = False,
Index: b/Types/Key.hs
===================================================================
--- a/Types/Key.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Types/Key.hs	2018-08-28 17:42:25.000000000 -0400
@@ -138,3 +138,13 @@ prop_idempotent_key_decode f
 	normalfieldorder = fields `isPrefixOf` "smSC"
 	fields = map (f !!) $ filter (< length f) $ map succ $
 		elemIndices fieldSep f
+
+{- Is the Key variety backed by a hash, which allows verifying content?
+ - It does not have to be cryptographically secure against eg birthday
+ - attacks.
+ -}
+isVerifiable :: Key -> Bool
+isVerifiable k = case keyBackendName k of
+	"WORM" -> False
+	"URL" -> False
+	_ -> True
Index: b/Types/Remote.hs
===================================================================
--- a/Types/Remote.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Types/Remote.hs	2018-08-28 17:34:28.000000000 -0400
@@ -13,6 +13,7 @@ module Types.Remote
 	, RemoteTypeA(..)
 	, RemoteA(..)
 	, Availability(..)
+	, RetrievalSecurityPolicy(..)
 	)
 	where
 
@@ -66,6 +67,8 @@ data RemoteA a = Remote {
 	retrieveKeyFile :: Key -> AssociatedFile -> FilePath -> MeterUpdate -> a Bool,
 	-- retrieves a key's contents to a tmp file, if it can be done cheaply
 	retrieveKeyFileCheap :: Key -> FilePath -> a Bool,
+	-- Security policy for reteiving keys from this remote.
+	retrievalSecurityPolicy :: RetrievalSecurityPolicy,
 	-- removes a key's contents (succeeds if the contents are not present)
 	removeKey :: Key -> a Bool,
 	-- Checks if a key is present in the remote.
@@ -112,3 +115,29 @@ instance Eq (RemoteA a) where
 
 instance Ord (RemoteA a) where
 	compare = comparing uuid
+
+-- Security policy indicating what keys can be safely retrieved from a
+-- remote.
+data RetrievalSecurityPolicy
+	= RetrievalVerifiableKeysSecure
+	-- ^ Transfer of keys whose content can be verified
+	-- with a hash check is secure; transfer of unverifiable keys is
+	-- not secure and should not be allowed.
+	--
+	-- This is used eg, when HTTP to a remote could be redirected to a
+	-- local private web server or even a file:// url, causing private
+	-- data from it that is not the intended content of a key to make
+	-- its way into the git-annex repository.
+	--
+	-- It's also used when content is stored encrypted on a remote,
+	-- which could replace it with a different encrypted file, and
+	-- trick git-annex into decrypting it and leaking the decryption
+	-- into the git-annex repository.
+	--
+	-- It's not (currently) used when the remote could alter the
+	-- content stored on it, because git-annex does not provide
+	-- strong guarantees about the content of keys that cannot be 
+	-- verified with a hash check.
+	-- (But annex.securehashesonly does provide such guarantees.)
+	| RetrievalAllKeysSecure
+	-- ^ Any key can be securely retrieved.
From: Joey Hess <joeyh@joeyh.name>
Date: Thu, 21 Jun 2018 13:34:11 -0400
Subject: enforce retrievalSecurityPolicy

Leveraged the existing verification code by making it also check the
retrievalSecurityPolicy.

Also, prevented getViaTmp from running the download action at all when the
retrievalSecurityPolicy is going to prevent verifying and so storing it.

Added annex.security.allow-unverified-downloads. A per-remote version
would be nice to have too, but would need more plumbing, so KISS.
(Bill the Cat reference not too over the top I hope. The point is to
make this something the user reads the documentation for before using.)

A few calls to verifyKeyContent and getViaTmp, that don't
involve downloads from remotes, have RetrievalAllKeysSecure hard-coded.
It was also hard-coded for P2P.Annex and Command.RecvKey,
to match the values of the corresponding remotes.

A few things use retrieveKeyFile/retrieveKeyFileCheap without going
through getViaTmp.
* Command.Fsck when downloading content from a remote to verify it.
  That content does not get into the annex, so this is ok.
* Command.AddUrl when using a remote to download an url; this is new
  content being added, so this is ok.

This commit was sponsored by Fernando Jimenez on Patreon.

(patch backported from Debian stretch 6.20170101-1+deb9u2)

---

Index: b/Annex/Content.hs
===================================================================
--- a/Annex/Content.hs	2018-08-30 16:21:25.000000000 -0400
+++ b/Annex/Content.hs	2018-08-30 16:21:25.000000000 -0400
@@ -12,6 +12,7 @@ module Annex.Content (
 	inAnnexSafe,
 	inAnnexCheck,
 	lockContent,
+	RetrievalSecurityPolicy(..),
 	getViaTmp,
 	getViaTmpChecked,
 	getViaTmpUnchecked,
@@ -49,6 +50,7 @@ import Utility.DiskFree
 import Utility.FileMode
 import qualified Annex.Url as Url
 import Types.Key
+import Types.Remote (RetrievalSecurityPolicy(..))
 import Utility.DataUnits
 import Utility.CopyFile
 import Config
@@ -211,19 +213,19 @@ lockContent key a = do
 {- Runs an action, passing it a temporary filename to get,
  - and if the action succeeds, moves the temp file into 
  - the annex as a key's content. -}
-getViaTmp :: Verify -> Key -> (FilePath -> Annex Bool) -> Annex Bool
-getViaTmp = getViaTmpChecked (return True)
+getViaTmp :: RetrievalSecurityPolicy -> Verify -> Key -> (FilePath -> Annex Bool) -> Annex Bool
+getViaTmp rsp v key action = getViaTmpChecked rsp v (return True) key action
 
 {- Like getViaTmp, but does not check that there is enough disk space
  - for the incoming key. For use when the key content is already on disk
  - and not being copied into place. -}
-getViaTmpUnchecked :: Verify -> Key -> (FilePath -> Annex Bool) -> Annex Bool
-getViaTmpUnchecked = finishGetViaTmp (return True)
+getViaTmpUnchecked :: RetrievalSecurityPolicy -> Verify -> Key -> (FilePath -> Annex Bool) -> Annex Bool
+getViaTmpUnchecked rsp v key action = finishGetViaTmp rsp v (return True) key action
 
-getViaTmpChecked :: Verify -> Annex Bool -> Key -> (FilePath -> Annex Bool) -> Annex Bool
-getViaTmpChecked v check key action = 
+getViaTmpChecked :: RetrievalSecurityPolicy -> Verify -> Annex Bool -> Key -> (FilePath -> Annex Bool) -> Annex Bool
+getViaTmpChecked rsp v check key action = 
 	prepGetViaTmpChecked key False $
-		finishGetViaTmp v check key action
+		finishGetViaTmp rsp v check key action
 
 {- Prepares to download a key via a tmp file, and checks that there is
  - enough free disk space.
@@ -249,11 +251,11 @@ prepGetViaTmpChecked key unabletoget get
 		, return unabletoget
 		)
 
-finishGetViaTmp :: Verify -> Annex Bool -> Key -> (FilePath -> Annex Bool) -> Annex Bool
-finishGetViaTmp v check key action = do
+finishGetViaTmp :: RetrievalSecurityPolicy -> Verify -> Annex Bool -> Key -> (FilePath -> Annex Bool) -> Annex Bool
+finishGetViaTmp rsp v check key action = checkallowed $ do
 	tmpfile <- prepTmp key
 	ifM (action tmpfile <&&> check)
-		( ifM (verifyKeyContent v key tmpfile)
+		( ifM (verifyKeyContent rsp v key tmpfile)
 			( do
 				moveAnnex key tmpfile
 				logStatus key InfoPresent
@@ -267,6 +269,17 @@ finishGetViaTmp v check key action = do
 		-- caller wants to resume its transfer
 		, return False
 		)
+  where
+	-- Avoid running the action to get the content when the
+	-- RetrievalSecurityPolicy would cause verification to always fail.
+	checkallowed a = case rsp of
+		RetrievalAllKeysSecure -> a
+		RetrievalVerifiableKeysSecure
+			| isVerifiable key -> a
+			| otherwise -> ifM (annexAllowUnverifiedDownloads <$> Annex.getGitConfig)
+				( a
+				, warnUnverifiableInsecure key >> return False
+				)
 
 prepTmp :: Key -> Annex FilePath
 prepTmp key = do
@@ -294,8 +307,18 @@ withTmp key action = do
  - it is checked. This is an expensive check, so configuration can prevent
  - it, for either a particular remote or always.
  -}
-verifyKeyContent :: Verify -> Key -> FilePath -> Annex Bool
-verifyKeyContent v k f = verifysize <&&> verifycontent
+verifyKeyContent :: RetrievalSecurityPolicy -> Verify -> Key -> FilePath -> Annex Bool
+verifyKeyContent rsp v k f = case rsp of
+	RetrievalVerifiableKeysSecure
+		| isVerifiable k -> verifysize <&&> verifycontent
+		| otherwise -> ifM (annexAllowUnverifiedDownloads <$> Annex.getGitConfig)
+			( verifysize <&&> verifycontent
+			, warnUnverifiableInsecure k >> return False
+			)
+	_ -> ifM (shouldVerify v)
+		( verifysize <&&> verifycontent
+		, return True
+		)
   where
 	verifysize = case Types.Key.keySize k of
 		Nothing -> return True
@@ -318,6 +341,17 @@ shouldVerify DefaultVerify = annexVerify
 shouldVerify (RemoteVerify r) = shouldVerify DefaultVerify
 	<&&> pure (remoteAnnexVerify (Types.Remote.gitconfig r))
 
+warnUnverifiableInsecure :: Key -> Annex ()
+warnUnverifiableInsecure k = warning $ unwords
+	[ "Getting " ++ kv ++ " keys with this remote is not secure;"
+	, "the content cannot be verified to be correct."
+	, "(Use annex.security.allow-unverified-downloads to bypass"
+	, "this safety check.)"
+	]
+  where
+	kv = keyBackendName k
+
+
 {- Checks that there is disk space available to store a given key,
  - in a destination (or the annex) printing a warning if not. -}
 checkDiskSpace :: Maybe FilePath -> Key -> Integer -> Annex Bool
Index: b/Command/RecvKey.hs
===================================================================
--- a/Command/RecvKey.hs	2018-08-30 16:20:32.947273787 -0400
+++ b/Command/RecvKey.hs	2018-08-30 16:21:25.000000000 -0400
@@ -15,6 +15,7 @@ import Annex
 import Utility.Rsync
 import Logs.Transfer
 import Command.SendKey (fieldTransfer)
+import Types.Remote (RetrievalSecurityPolicy(..))
 import qualified CmdLine.GitAnnexShell.Fields as Fields
 
 cmd :: [Command]
@@ -30,7 +31,9 @@ start key = fieldTransfer Download key $
 	-- as the file could change while being transferred.
 	fromdirect <- isJust <$> Fields.getField Fields.direct
 	let verify = if fromdirect then AlwaysVerify else DefaultVerify
-	ifM (getViaTmp verify key go)
+	-- This matches the retrievalSecurityPolicy of Remote.Git
+	let rsp = RetrievalAllKeysSecure
+	ifM (getViaTmp rsp verify key go)
 		( do
 			-- forcibly quit after receiving one key,
 			-- and shutdown cleanly
Index: b/Command/Reinject.hs
===================================================================
--- a/Command/Reinject.hs	2018-08-30 16:20:32.947273787 -0400
+++ b/Command/Reinject.hs	2018-08-30 16:21:25.000000000 -0400
@@ -42,7 +42,7 @@ perform src _dest key = ifM move
 	-- so mv is used rather than simply calling
 	-- moveToObjectDir; disk space is also checked this way,
 	-- and the file's content is verified to match the key.
-	move = getViaTmp DefaultVerify key $ \tmp ->
+	move = getViaTmp RetrievalAllKeysSecure DefaultVerify key $ \tmp ->
 		liftIO $  catchBoolIO $ do
  			moveFile src tmp
  			return True
Index: b/Command/TestRemote.hs
===================================================================
--- a/Command/TestRemote.hs	2018-08-30 16:20:32.947273787 -0400
+++ b/Command/TestRemote.hs	2018-08-30 16:21:25.000000000 -0400
@@ -153,7 +153,7 @@ test st r k =
 		Just b -> case fsckKey b of
 			Nothing -> return True
 			Just fscker -> fscker k (key2file k)
-	get = getViaTmp (RemoteVerify r) k $ \dest ->
+	get = getViaTmp (Remote.retrievalSecurityPolicy r) (RemoteVerify r) k $ \dest ->
 		Remote.retrieveKeyFile r k Nothing dest nullMeterUpdate
 	store = Remote.storeKey r k Nothing nullMeterUpdate
 	remove = Remote.removeKey r k
@@ -167,10 +167,10 @@ testUnavailable st r k =
 	, check (`notElem` [Right True, Right False]) "checkPresent" $
 		Remote.checkPresent r k
 	, check (== Right False) "retrieveKeyFile" $
-		getViaTmp (RemoteVerify r) k $ \dest ->
+		getViaTmp (Remote.retrievalSecurityPolicy r) (RemoteVerify r) k $ \dest ->
 			Remote.retrieveKeyFile r k Nothing dest nullMeterUpdate
 	, check (== Right False) "retrieveKeyFileCheap" $
-		getViaTmp (RemoteVerify r) k $ \dest ->
+		getViaTmp (Remote.retrievalSecurityPolicy r) (RemoteVerify r) k $ \dest ->
 			Remote.retrieveKeyFileCheap r k dest
 	]
   where
Index: b/Command/Get.hs
===================================================================
--- a/Command/Get.hs	2018-08-30 16:20:32.943273729 -0400
+++ b/Command/Get.hs	2018-08-30 16:21:25.000000000 -0400
@@ -86,7 +86,7 @@ getKey' key afile = dispatch
 		| Remote.hasKeyCheap r =
 			either (const False) id <$> Remote.hasKey r key
 		| otherwise = return True
-	docopy r witness = getViaTmp (RemoteVerify r) key $ \dest ->
+	docopy r witness = getViaTmp (Remote.retrievalSecurityPolicy r) (RemoteVerify r) key $ \dest ->
 		download (Remote.uuid r) key afile noRetry
 			(\p -> do
 				showAction $ "from " ++ Remote.name r
Index: b/Command/Move.hs
===================================================================
--- a/Command/Move.hs	2018-08-30 16:20:32.947273787 -0400
+++ b/Command/Move.hs	2018-08-30 16:21:25.000000000 -0400
@@ -158,7 +158,7 @@ fromPerform src move key afile = ifM (in
 	go = notifyTransfer Download afile $ 
 		download (Remote.uuid src) key afile noRetry $ \p -> do
 			showAction $ "from " ++ Remote.name src
-			getViaTmp (RemoteVerify src) key $ \t ->
+			getViaTmp (Remote.retrievalSecurityPolicy src) (RemoteVerify src) key $ \t ->
 				Remote.retrieveKeyFile src key afile t p
 	dispatch _ False = stop -- failed
 	dispatch False True = next $ return True -- copy complete
Index: b/Command/ReKey.hs
===================================================================
--- a/Command/ReKey.hs	2018-08-30 16:20:32.947273787 -0400
+++ b/Command/ReKey.hs	2018-08-30 16:21:25.000000000 -0400
@@ -49,7 +49,7 @@ perform file oldkey newkey = do
 {- Make a hard link to the old key content (when supported),
  - to avoid wasting disk space. -}
 linkKey :: Key -> Key -> Annex Bool
-linkKey oldkey newkey = getViaTmpUnchecked DefaultVerify newkey $ \tmp -> do
+linkKey oldkey newkey = getViaTmpUnchecked RetrievalAllKeysSecure DefaultVerify newkey $ \tmp -> do
 	src <- calcRepo $ gitAnnexLocation oldkey
 	liftIO $ ifM (doesFileExist tmp)
 		( return True
Index: b/Command/TransferKey.hs
===================================================================
--- a/Command/TransferKey.hs	2018-08-30 16:20:32.947273787 -0400
+++ b/Command/TransferKey.hs	2018-08-30 16:21:25.000000000 -0400
@@ -51,7 +51,7 @@ toPerform remote key file = go Upload fi
 fromPerform :: Remote -> Key -> AssociatedFile -> CommandPerform
 fromPerform remote key file = go Upload file $
 	download (uuid remote) key file forwardRetry $ \p ->
-		getViaTmp (RemoteVerify remote) key $
+		getViaTmp (retrievalSecurityPolicy remote) (RemoteVerify remote) key $
 			\t -> Remote.retrieveKeyFile remote key file t p
 
 go :: Direction -> AssociatedFile -> (NotifyWitness -> Annex Bool) -> CommandPerform
Index: b/Command/TransferKeys.hs
===================================================================
--- a/Command/TransferKeys.hs	2018-08-30 16:20:32.947273787 -0400
+++ b/Command/TransferKeys.hs	2018-08-30 16:32:39.145962211 -0400
@@ -43,7 +43,7 @@ start = do
 				return ok
 		| otherwise = notifyTransfer direction file $
 			download (Remote.uuid remote) key file forwardRetry $ \p ->
-				getViaTmp (RemoteVerify remote) key $ \t -> 
+				getViaTmp (Remote.retrievalSecurityPolicy remote) (RemoteVerify remote) key $ \t -> 
 					Remote.retrieveKeyFile remote key file t p
 
 runRequests
Index: b/Remote.hs
===================================================================
--- a/Remote.hs	2018-08-30 16:20:30.000000000 -0400
+++ b/Remote.hs	2018-08-30 16:21:25.000000000 -0400
@@ -12,6 +12,7 @@ module Remote (
 	storeKey,
 	retrieveKeyFile,
 	retrieveKeyFileCheap,
+	retrievalSecurityPolicy,
 	removeKey,
 	hasKey,
 	hasKeyCheap,
Index: b/Remote/Git.hs
===================================================================
--- a/Remote/Git.hs	2018-08-30 16:21:25.000000000 -0400
+++ b/Remote/Git.hs	2018-08-30 16:21:25.000000000 -0400
@@ -480,9 +480,10 @@ copyToRemote r key file p
 			( return True
 			, do
 				ensureInitialized
+				let rsp = RetrievalAllKeysSecure
 				runTransfer (Transfer Download u key) file noRetry $ const $
 					Annex.Content.saveState True `after`
-						Annex.Content.getViaTmpChecked (Annex.Content.RemoteVerify r) (liftIO checksuccessio) key
+						Annex.Content.getViaTmpChecked rsp (Annex.Content.RemoteVerify r) (liftIO checksuccessio) key
 							(\d -> rsyncOrCopyFile params object d p)
 			)
 
Index: b/Types/GitConfig.hs
===================================================================
--- a/Types/GitConfig.hs	2018-08-30 16:21:25.000000000 -0400
+++ b/Types/GitConfig.hs	2018-08-30 16:21:25.000000000 -0400
@@ -58,6 +58,7 @@ data GitConfig = GitConfig
 	, annexAllowedUrlSchemes :: S.Set Scheme
 	, annexAllowedHttpAddresses :: String
 	, annexVerify :: Bool
+	, annexAllowUnverifiedDownloads :: Bool
 	, coreSymlinks :: Bool
 	, gcryptId :: Maybe String
 	}
@@ -98,6 +99,8 @@ extractGitConfig r = GitConfig
 			getmaybe (annex "security.allowed-url-schemes")
         , annexAllowedHttpAddresses = fromMaybe "" $
 		getmaybe (annex "security.allowed-http-addresses")
+	, annexAllowUnverifiedDownloads = (== Just "ACKTHPPT") $
+		getmaybe (annex "security.allow-unverified-downloads")
 	, annexVerify = getbool (annex "verify") True
 	, coreSymlinks = getbool "core.symlinks" True
 	, gcryptId = getmaybe "core.gcrypt-id"
Index: b/Types/Key.hs
===================================================================
--- a/Types/Key.hs	2018-08-30 16:21:25.000000000 -0400
+++ b/Types/Key.hs	2018-08-30 16:21:25.000000000 -0400
@@ -16,6 +16,7 @@ module Types.Key (
 	nonChunkKey,
 	chunkKeyOffset,
 	isChunkKey,
+	isVerifiable,
 
 	prop_idempotent_key_encode,
 	prop_idempotent_key_decode
Index: b/doc/git-annex.mdwn
===================================================================
--- a/doc/git-annex.mdwn	2018-08-30 16:21:25.000000000 -0400
+++ b/doc/git-annex.mdwn	2018-08-30 16:21:25.000000000 -0400
@@ -1593,6 +1593,10 @@ Here are all the supported configuration
   from remotes. If you trust a remote and don't want the overhead
   of these checksums, you can set this to `false`.
 
+  Note that even when this is set to `false`, git-annex does verification
+  in some edge cases, where it's likely the case than an
+  object was downloaded incorrectly, or when needed for security.
+
 * `remote.<name>.annexUrl`
 
   Can be used to specify a different url than the regular `remote.<name>.url`
@@ -1724,6 +1728,43 @@ Here are all the supported configuration
   causing it to be downloaded into your repository and transferred to
   other remotes, exposing its content.
 
+* `annex.security.allow-unverified-downloads`, 
+
+  For security reasons, git-annex refuses to download content from
+  most special remotes when it cannot check a hash to verify 
+  that the correct content was downloaded. This particularly impacts
+  downloading the content of URL or WORM keys, which lack hashes.
+
+  The best way to avoid problems due to this is to migrate files
+  away from such keys, before their content reaches a special remote.
+  See [[git-annex-migrate]](1).
+
+  When the content is only available from a special remote, you can
+  use this configuration to force git-annex to download it.
+  But you do so at your own risk, and it's very important you read and
+  understand the information below first!
+
+  Downloading unverified content from encrypted special remotes is
+  prevented, because the special remote could send some other encrypted
+  content than what you expect, causing git-annex to decrypt data that you
+  never checked into git-annex, and risking exposing the decrypted
+  data to any non-encrypted remotes you send content to.
+
+  Downloading unverified content from (non-encrypted)
+  external special remotes is prevented, because they could follow
+  http redirects to web servers on localhost or on a private network,
+  or in some cases to a file:/// url.
+
+  If you decide to bypass this security check, the best thing to do is
+  to only set it temporarily while running the command that gets the file.
+  The value to set the config to is "ACKTHPPT".
+  For example:
+
+	git -c annex.security.allow-unverified-downloads=ACKTHPPT annex get myfile
+
+  It would be a good idea to check that it downloaded the file you expected,
+  too.
+
 * `annex.secure-erase-command`
 
   This can be set to a command that should be run whenever git-annex
From: Joey Hess <joeyh@joeyh.name>
Date: Thu, 21 Jun 2018 14:14:56 -0400
Subject: don't assume boto will remain secure

On second thought, best to default to being secure even if boto changes
http libraries to one that happens to follow redirects.

(patch backported from Debian stretch 6.20170101-1+deb9u2)

---

Index: b/Remote/Glacier.hs
===================================================================
--- a/Remote/Glacier.hs	2018-08-28 15:19:08.903444542 -0400
+++ b/Remote/Glacier.hs	2018-08-28 15:19:50.660057855 -0400
@@ -54,8 +54,10 @@ gen r u c gc = new <$> remoteCost gc ver
 			retrieveKeyFile = retreiveKeyFileDummy,
 			retrieveKeyFileCheap = retrieveCheap this,
 			-- glacier-cli does not follow redirects and does
-			-- not support file://, so this is secure.
-			retrievalSecurityPolicy = RetrievalAllKeysSecure,
+			-- not support file://, as far as we know, but
+			-- there's no guarantee that will continue to be
+			-- the case, so require verifiable keys.
+			retrievalSecurityPolicy = RetrievalVerifiableKeysSecure,
 			removeKey = removeKeyDummy,
 			checkPresent = checkPresentDummy,
 			checkPresentCheap = False,
From: Joey Hess <joeyh@joeyh.name>
Date: Thu, 21 Jun 2018 16:38:47 -0400
Subject: set ddar to RetrievalAllKeysSecure

Based on information from Robie Basak.

(patch backported from Debian stretch 6.20170101-1+deb9u2)

---

Index: b/Remote/Ddar.hs
===================================================================
--- a/Remote/Ddar.hs	2018-08-28 18:13:50.000000000 -0400
+++ b/Remote/Ddar.hs	2018-08-28 18:15:58.000000000 -0400
@@ -55,8 +55,9 @@ gen r u c gc = do
 		, storeKey = storeKeyDummy
 		, retrieveKeyFile = retreiveKeyFileDummy
 		, retrieveKeyFileCheap = retrieveCheap
-		-- Unsure about this, safe default until Robie answers.
-		, retrievalSecurityPolicy = RetrievalVerifiableKeysSecure
+		-- ddar communicates over ssh, not subject to http redirect
+		-- type attacks
+		, retrievalSecurityPolicy = RetrievalAllKeysSecure
 		, removeKey = removeKeyDummy
 		, checkPresent = checkPresentDummy
 		, checkPresentCheap = ddarLocal ddarrepo

Reply to: