[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

git-annex security update ready for testing and review



Hi!

TL;DR: test packages ready for git-annex. fix probably incomplete,
patches attached for review.

I've been working for the past day or two on backporting the pending
security fixes for git-annex to Debian jessie as part of the LTS
project. The two security issues are of course CVE-2018-10857 and
CVE-2018-10859:

https://git-annex.branchable.com/security/CVE-2018-10857_and_CVE-2018-10859/

The version in Debian stretch (6.20170101-1+deb9u1) has a backport of
the fix released upstream (as 6.20180626) which Joey Hess (in CC) has
been gracious enough to share with Debian. No such backport exists for
older versions, as far as we know, so I embarked upon such work
yesterday, based on the patches from Debian stretch.

It was challenging work, especially for the two main patches.

The first big patch is "limit url downloads to whitelisted schemes":

http://source.git-annex.branchable.com/?p=source.git;a=commitdiff;h=28720c795ff57a55b48e56d15f9b6bcb977f48d9

The main challenge with this patch is that `getUrlInfo` is quite different
in 5.x: it's called `exists` and has a somewhat different structure. I
think I've been able to port it correctly, but reviewers will want to
pay special attention to that function in Utility/Url.hs. Similarly, I
found that all of the P2P module and `setkey` command are absent from
5.x which removes a good chunk of patches.

More concerning are chunks affecting code which does a
`verifyKeyContent`. There is no equivalent in 5.x and I am worried this
was used to also check for bad URLs as a side-effect. More critically,
I'm concerned that the lack of such infrastructure makes it impossible
to fix CVE-2018-10859 in the 5.x branch without backporting a
significant part of that work. It seems to me the patchset I have now
does not actually have provisions to restrict restriction to certain
content.

The second big patch is "enforce retrievalSecurityPolicy":

http://source.git-annex.branchable.com/?p=source.git;a=commitdiff;h=b657242f5d946efae4cc77e8aef95dd2a306cd6b

Again, some implementations differ here especially around getViaTmp*
commands which have been renamed around. I tried to reinject the
RetrievalSecurityPolicy controls in there, but I am not certain I have
all getViaTmp incantations covered. Considering Haskell's type checking,
however, I am confident the compiler would have caught those
mistakes. I am still concerned that other similar requests could be made
*without* going through the getVia* handlers which would make
CVE-2018-10857 fix incomplete.

As the first patch, some "verify" code is missing in 5.x here as
well. RemoteVerify, verifyKeyContent, and VerifyConfig types are all
missing from 5.x and I am worried they would also make CVE-2018-10859
unfixed.

I am worried about Get.hs, where I had to hardcode a
RetrievalAllKeysSecure policy in case no remote is specified, since I
couldn't figure out how to find the remote at that level. I similarly
shot from the hip, in the dark, with Sync.hs, unfortunately.

Finally, Test.hs does not have a addurl test case so this chunk was
dropped. I have performed some smoke tests on a test repository
(https://github.com/RichiH/conference_proceedings) through which I have
confirmed that `addurl` and `get` still work correctly.

I have uploaded the resulting package to the usual location for public
testing and review:

https://people.debian.org/~anarcat/debian/jessie-lts/

The patches are also attached for your reviewing pleasure. I will upload
the result next week unless advised otherwise.

Thanks!

A.

PS: Notice the odd version number in the release. It turns out I did a
critical mistake in the previous upload as I kept the "native" packaging
format for the first security upload to Debian jessie. The new version
number is an attempt at restoring sanity to the maintenance process. It
reverts to a non-native packages and brings back the patch from the
previous upload into a file in debian/patches.

-- 
Never believe that a few caring people can't change the world. For,
indeed, that's all who ever have.
                        - Margaret Mead
From: Joey Hess <id@joeyh.name>
Date: Mon, 18 Jun 2018 15:38:25 -0400
Subject: limit url downloads to whitelisted schemes

backported from 28720c795ff57a55b48e56d15f9b6bcb977f48d9

Security fix! Allowing any schemes, particularly file: and
possibly others like scp: allowed file exfiltration by anyone who had
write access to the git repository, since they could add an annexed file
using such an url, or using an url that redirected to such an url,
and wait for the victim to get it into their repository and send them a copy.

* Added annex.security.allowed-url-schemes setting, which defaults
  to only allowing http and https URLs. Note especially that file:/
  is no longer enabled by default.

* Removed annex.web-download-command, since its interface does not allow
  supporting annex.security.allowed-url-schemes across redirects.
  If you used this setting, you may want to instead use annex.web-options
  to pass options to curl.

With annex.web-download-command removed, nearly all url accesses in
git-annex are made via Utility.Url via http-client or curl. http-client
only supports http and https, so no problem there.
(Disabling one and not the other is not implemented.)

Used curl --proto to limit the allowed url schemes.
wget only supports http https ftp, so does not need any limiting.

Note that this will cause git annex fsck --from web to mark files using
a disallowed url scheme as not being present in the web. That seems
acceptable; fsck --from web also does that when a web server is not available.

quvi was not changed; it only supports a hardcoded set of urls, which
are http, not file urls.

This does not address any external special remotes that might download
an url themselves. Current thinking is all external special remotes will
need to be audited for this problem, although many of them will use
http libraries that only support http and not curl's menagarie.

The related problem of accessing private localhost and LAN urls is not
addressed by this commit.

This commit was sponsored by Brett Eisenberg on Patreon.

(patch backported from Debian stretch 6.20170101-1+deb9u2)

---

Index: b/Annex/Content.hs
===================================================================
--- a/Annex/Content.hs	2018-08-28 15:35:49.306369141 -0400
+++ b/Annex/Content.hs	2018-08-28 15:35:49.302369082 -0400
@@ -553,18 +553,10 @@ saveState nocommit = doSideAction $ do
 
 {- Downloads content from any of a list of urls. -}
 downloadUrl :: [Url.URLString] -> FilePath -> Annex Bool
-downloadUrl urls file = go =<< annexWebDownloadCommand <$> Annex.getGitConfig
+downloadUrl urls file = go
   where
-	go Nothing = Url.withUrlOptions $ \uo ->
+	go = Url.withUrlOptions $ \uo ->
 		anyM (\u -> Url.download u file uo) urls
-	go (Just basecmd) = liftIO $ anyM (downloadcmd basecmd) urls
-	downloadcmd basecmd url =
-		boolSystem "sh" [Param "-c", Param $ gencmd url basecmd]
-			<&&> doesFileExist file
-	gencmd url = massReplace
-		[ ("%file", shellEscape file)
-		, ("%url", shellEscape url)
-		]
 
 {- Copies a key's content, when present, to a temp file.
  - This is used to speed up some rsyncs. -}
Index: b/Annex/Url.hs
===================================================================
--- a/Annex/Url.hs	2018-08-28 15:35:49.306369141 -0400
+++ b/Annex/Url.hs	2018-08-28 15:35:49.302369082 -0400
@@ -30,6 +30,7 @@ getUrlOptions = mkUrlOptions
 	<$> getUserAgent
 	<*> headers
 	<*> options
+	<*> (annexAllowedUrlSchemes <$> Annex.getGitConfig)
   where
 	headers = do
 		v <- annexHttpHeadersCommand <$> Annex.getGitConfig
Index: b/Types/GitConfig.hs
===================================================================
--- a/Types/GitConfig.hs	2018-08-28 15:35:49.306369141 -0400
+++ b/Types/GitConfig.hs	2018-08-28 15:45:29.514895956 -0400
@@ -21,6 +21,9 @@ import Types.Distribution
 import Types.Availability
 import Types.NumCopies
 import Utility.HumanTime
+import Utility.Url (Scheme, mkScheme)
+
+import qualified Data.Set as S
 
 {- Main git-annex settings. Each setting corresponds to a git-config key
  - such as annex.foo -}
@@ -42,7 +45,6 @@ data GitConfig = GitConfig
 	, annexDebug :: Bool
 	, annexWebOptions :: [String]
 	, annexQuviOptions :: [String]
-	, annexWebDownloadCommand :: Maybe String
 	, annexCrippledFileSystem :: Bool
 	, annexLargeFiles :: Maybe String
 	, annexFsckNudge :: Bool
@@ -53,6 +55,8 @@ data GitConfig = GitConfig
 	, annexListen :: Maybe String
 	, annexStartupScan :: Bool
 	, annexHardLink :: Bool
+	, annexAllowedUrlSchemes :: S.Set Scheme
+	, annexAllowedHttpAddresses :: String
 	, coreSymlinks :: Bool
 	, gcryptId :: Maybe String
 	}
@@ -77,7 +81,6 @@ extractGitConfig r = GitConfig
 	, annexDebug = getbool (annex "debug") False
 	, annexWebOptions = getwords (annex "web-options")
 	, annexQuviOptions = getwords (annex "quvi-options")
-	, annexWebDownloadCommand = getmaybe (annex "web-download-command")
 	, annexCrippledFileSystem = getbool (annex "crippledfilesystem") False
 	, annexLargeFiles = getmaybe (annex "largefiles")
 	, annexFsckNudge = getbool (annex "fscknudge") True
@@ -89,6 +92,11 @@ extractGitConfig r = GitConfig
 	, annexListen = getmaybe (annex "listen")
 	, annexStartupScan = getbool (annex "startupscan") True
 	, annexHardLink = getbool (annex "hardlink") False
+	, annexAllowedUrlSchemes = S.fromList $ map mkScheme $
+		maybe ["http", "https", "ftp"] words $
+			getmaybe (annex "security.allowed-url-schemes")
+        , annexAllowedHttpAddresses = fromMaybe "" $
+		getmaybe (annex "security.allowed-http-addresses")
 	, coreSymlinks = getbool "core.symlinks" True
 	, gcryptId = getmaybe "core.gcrypt-id"
 	}
Index: b/Utility/Url.hs
===================================================================
--- a/Utility/Url.hs	2018-08-28 15:35:49.306369141 -0400
+++ b/Utility/Url.hs	2018-08-28 15:42:30.576266197 -0400
@@ -12,6 +12,9 @@
 module Utility.Url (
 	URLString,
 	UserAgent,
+	Scheme,
+	mkScheme,
+	allowedScheme,
 	UrlOptions,
 	mkUrlOptions,
 	check,
@@ -30,6 +33,7 @@ import Data.Default
 import qualified Data.CaseInsensitive as CI
 import qualified Data.ByteString as B
 import qualified Data.ByteString.UTF8 as B8
+import qualified Data.Set as S
 
 import qualified Build.SysConfig
 
@@ -39,6 +43,15 @@ type Headers = [String]
 
 type UserAgent = String
 
+newtype Scheme = Scheme (CI.CI String)
+	deriving (Eq, Ord)
+
+mkScheme :: String -> Scheme
+mkScheme = Scheme . CI.mk
+
+fromScheme :: Scheme -> String
+fromScheme (Scheme s) = CI.original s
+
 data UrlOptions = UrlOptions
 	{ userAgent :: Maybe UserAgent
 	, reqHeaders :: Headers
@@ -48,19 +61,21 @@ data UrlOptions = UrlOptions
 #else
 	, applyRequest :: forall m. Request m -> Request m
 #endif
+	, allowedSchemes :: S.Set Scheme
 	}
 
 instance Default UrlOptions
   where
 	def = UrlOptions Nothing [] [] id
+		(S.fromList $ map mkScheme ["http", "https", "ftp"])
 
-mkUrlOptions :: Maybe UserAgent -> Headers -> [CommandParam] -> UrlOptions
-mkUrlOptions useragent reqheaders reqparams =
-	UrlOptions useragent reqheaders reqparams applyrequest
+mkUrlOptions :: Maybe UserAgent -> Headers -> [CommandParam] -> S.Set Scheme -> UrlOptions
+mkUrlOptions defuseragent reqheaders reqparams allowedschemes =
+	UrlOptions defuseragent reqheaders reqparams applyrequest allowedschemes
   where
 	applyrequest = \r -> r { requestHeaders = requestHeaders r ++ addedheaders }
 	addedheaders = uaheader ++ otherheaders
-	uaheader = case useragent of
+	uaheader = case defuseragent of
 		Nothing -> []
 		Just ua -> [(hUserAgent, B8.fromString ua)]
 	otherheaders = map toheader reqheaders
@@ -77,6 +92,28 @@ addUserAgent uo ps = case userAgent uo o
 	-- --user-agent works for both wget and curl commands
 	Just ua -> ps ++ [Param "--user-agent", Param ua] 
 
+checkPolicy :: UrlOptions -> URI -> a -> IO a -> IO a
+checkPolicy uo u onerr a
+	| allowedScheme uo u = a
+	| otherwise = do
+		hPutStrLn stderr $
+			"Configuration does not allow accessing " ++ show u
+		hFlush stderr
+		return onerr
+
+curlSchemeParams :: UrlOptions -> [CommandParam]
+curlSchemeParams uo = 
+	[ Param "--proto"
+	, Param $ intercalate "," ("-all" : schemelist)
+	]
+  where
+	schemelist = map fromScheme $ S.toList $ allowedSchemes uo
+
+allowedScheme :: UrlOptions -> URI -> Bool
+allowedScheme uo u = uscheme `S.member` allowedSchemes uo
+  where
+	uscheme = mkScheme $ takeWhile (/=':') (uriScheme u)
+
 {- Checks that an url exists and could be successfully downloaded,
  - also checking that its size, if available, matches a specified size. -}
 checkBoth :: URLString -> Maybe Integer -> UrlOptions -> IO Bool
@@ -96,7 +133,7 @@ check url expected_size = go <$$> exists
  - also returning its size if available. -}
 exists :: URLString -> UrlOptions -> IO (Bool, Maybe Integer)
 exists url uo = case parseURIRelaxed url of
-	Just u -> case parseUrl (show u) of
+	Just u -> checkPolicy uo u dne' $ case parseUrl (show u) of
 		Just req -> existsconduit req `catchNonAsync` const dne
 		-- http-conduit does not support file:, ftp:, etc urls,
 		-- so fall back to reading files and using curl.
@@ -115,14 +152,15 @@ exists url uo = case parseURIRelaxed url
 			| otherwise -> dne
 	Nothing -> dne
   where
-	dne = return (False, Nothing)
+	dne = return dne'
+	dne' = (False, Nothing)
 
 	curlparams = addUserAgent uo $
 		[ Param "-s"
 		, Param "--head"
 		, Param "-L", Param url
 		, Param "-w", Param "%{http_code}"
-		] ++ concatMap (\h -> [Param "-H", Param h]) (reqHeaders uo) ++ (reqParams uo)
+		] ++ concatMap (\h -> [Param "-H", Param h]) (reqHeaders uo) ++ (reqParams uo) ++ curlSchemeParams uo
 
 	extractlencurl s = case lastMaybe $ filter ("Content-Length:" `isPrefixOf`) (lines s) of
 		Just l -> case lastMaybe $ words l of
@@ -178,20 +216,25 @@ downloadQuiet = download' True
 download' :: Bool -> URLString -> FilePath -> UrlOptions -> IO Bool
 download' quiet url file uo = 
 	case parseURIRelaxed url of
-		Just u
-			| uriScheme u == "file:" -> do
+		Just u -> checkPolicy uo u False $
+			if uriScheme u == "file:" then do
 				-- curl does not create destination file
 				-- for an empty file:// url, so pre-create
 				writeFile file ""
 				curl
-			| otherwise -> ifM (inPath "wget") (wget , curl)
+			else ifM (inPath "wget") (wget , curl)
 		_ -> return False
   where
 	headerparams = map (\h -> Param $ "--header=" ++ h) (reqHeaders uo)
 	wget = go "wget" $ headerparams ++ quietopt "-q" ++ wgetparams
 	{- Regular wget needs --clobber to continue downloading an existing
 	 - file. On Android, busybox wget is used, which does not
-	 - support, or need that option. -}
+	 - support, or need that option.
+	 -
+	 - wget only supports https, http, and ftp, not file, which are
+	 - all always allowed, so its url schemes do not need to be
+	 - limited.
+         -}
 #ifndef __ANDROID__
 	wgetparams = [Params "--clobber -c -O"]
 #else
@@ -202,7 +245,7 @@ download' quiet url file uo =
 	 - the remainder to download as the whole file,
 	 - and not indicating how much percent was
 	 - downloaded before the resume. -}
-	curl = go "curl" $ headerparams ++ quietopt "-s" ++
+	curl = go "curl" $ headerparams ++ quietopt "-s" ++ curlSchemeParams uo ++
 		[Params "-f -L -C - -# -o"]
 	go cmd opts = boolSystem cmd $
 		addUserAgent uo $ reqParams uo++opts++[File file, File url]
Index: b/doc/git-annex.mdwn
===================================================================
--- a/doc/git-annex.mdwn	2018-08-28 15:35:49.306369141 -0400
+++ b/doc/git-annex.mdwn	2018-08-28 15:35:49.302369082 -0400
@@ -1688,13 +1688,20 @@ Here are all the supported configuration
   If set, the command is run and each line of its output is used as a HTTP
   header. This overrides annex.http-headers.
 
-* `annex.web-download-command`
+* `annex.security.allowed-url-schemes`
 
-  Use to specify a command to run to download a file from the web.
-  (The default is to use wget or curl.)
+  List of URL schemes that git-annex is allowed to download content from.
+  The default is "http https ftp".
 
-  In the command line, %url is replaced with the url to download,
-  and %file is replaced with the file that it should be saved to.
+  Think very carefully before changing this; there are security
+  implications. For example, if it's changed to allow "file" URLs, then
+  anyone who can get a commit into your git-annex repository could
+  `git-annex addurl` a pointer to a private file located outside that
+  repository, possibly causing it to be copied into your repository
+  and transferred on to other remotes, exposing its content.
+
+  Some special remotes support their own domain-specific URL
+  schemes; those are not affected by this configuration setting.
 
 * `annex.secure-erase-command`
 
From: Joey Hess <id@joeyh.name>
Date: Mon, 18 Jun 2018 17:40:50 -0400
Subject: block url downloads by default

git-annex will refuse to download content from the web, to prevent
accidental exposure of data on private webservers on localhost and the
LAN. This can be overridden with the
annex.security.allowed-http-addresses setting.

This is the simplest possible fix for the security hole. A better fix
has been developed for newer versions of git-annex but would be a lot of
work to backport, and perhaps too big a diff.

There are several sets of git-annex users who will be impacted
in different ways by this:

* Users who have a git-annex repository but don't use the web special
  remote. Unaffected.

* Users who have a git-annex repository that is for private use only.
  They will have to read enough docs to find the setting to allow
  git annex addurl to work again.

* Users who have a git-annex repositry that is shared with people they
  don't fully trust.
  They will not be able to use the web special remote with this version
  of git-annex. They'll have to upgrade.

The S3, glacier, and webdav special remotes are still allowed to
download from the web. There are other potential attacks involving the
web server they connect to redirecting to a local private web server,
and tricking them from downloading content from it which then leaks back
to the attacker. Those attacks are not addressed here, but they also
seem fairly unlikely. Further analysis is needed; preliminary analysis
of glacier-cli, for example, suggests it does not follow redirects and
so is not vulnerable to such attacks.

(patch backported from Debian stretch 6.20170101-1+deb9u2)

---

Index: b/Annex/Quvi.hs
===================================================================
--- a/Annex/Quvi.hs	2018-08-27 21:36:58.049200494 -0400
+++ b/Annex/Quvi.hs	2018-08-27 21:36:58.045200767 -0400
@@ -11,8 +11,8 @@ module Annex.Quvi where
 
 import Common.Annex
 import qualified Annex
+import Annex.Url
 import Utility.Quvi
-import Utility.Url
 
 withQuviOptions :: forall a. Query a -> [QuviParam] -> URLString -> Annex a
 withQuviOptions a ps url = do
@@ -21,7 +21,14 @@ withQuviOptions a ps url = do
 	liftIO $ a v (map (\mkp -> mkp v) ps++opts) url
 
 quviSupported :: URLString -> Annex Bool
-quviSupported u = liftIO . flip supported u =<< quviVersion
+quviSupported u = ifM httpAddressesUnlimited
+	( liftIO . flip supported u =<< quviVersion
+	-- Don't allow any url schemes to be used when
+	-- there's a limit on the allowed addresses, because
+	-- there is no way to prevent quvi from
+	-- redirecting to any address.
+	, return False
+	)
 
 quviVersion :: Annex QuviVersion
 quviVersion = go =<< Annex.getState Annex.quviversion
Index: b/Annex/Url.hs
===================================================================
--- a/Annex/Url.hs	2018-08-27 21:36:58.049200494 -0400
+++ b/Annex/Url.hs	2018-08-27 21:36:58.045200767 -0400
@@ -11,6 +11,7 @@ module Annex.Url (
 	withUrlOptions,
 	getUrlOptions,
 	getUserAgent,
+	httpAddressesUnlimited,
 ) where
 
 import Common.Annex
@@ -18,6 +19,8 @@ import qualified Annex
 import Utility.Url as U
 import qualified Build.SysConfig as SysConfig
 
+import qualified Data.Set as S
+
 defaultUserAgent :: U.UserAgent
 defaultUserAgent = "git-annex/" ++ SysConfig.packageversion
 
@@ -30,7 +33,7 @@ getUrlOptions = mkUrlOptions
 	<$> getUserAgent
 	<*> headers
 	<*> options
-	<*> (annexAllowedUrlSchemes <$> Annex.getGitConfig)
+	<*> urlschemes
   where
 	headers = do
 		v <- annexHttpHeadersCommand <$> Annex.getGitConfig
@@ -38,6 +41,18 @@ getUrlOptions = mkUrlOptions
 			Just cmd -> lines <$> liftIO (readProcess "sh" ["-c", cmd])
 			Nothing -> annexHttpHeaders <$> Annex.getGitConfig
 	options = map Param . annexWebOptions <$> Annex.getGitConfig
+	urlschemes = ifM httpAddressesUnlimited
+		( annexAllowedUrlSchemes <$> Annex.getGitConfig
+		-- Don't allow any url schemes to be used when
+		-- there's a limit on the allowed addresses, because
+		-- there is no way to prevent curl or wget from
+		-- redirecting to any address.
+		, pure S.empty
+		)
+
+httpAddressesUnlimited :: Annex Bool
+httpAddressesUnlimited =
+	("all" == ) . annexAllowedHttpAddresses <$> Annex.getGitConfig
 
 withUrlOptions :: (U.UrlOptions -> IO a) -> Annex a
 withUrlOptions a = liftIO . a =<< getUrlOptions
Index: b/doc/git-annex.mdwn
===================================================================
--- a/doc/git-annex.mdwn	2018-08-27 21:36:58.049200494 -0400
+++ b/doc/git-annex.mdwn	2018-08-27 21:36:58.049200494 -0400
@@ -1703,6 +1703,21 @@ Here are all the supported configuration
   Some special remotes support their own domain-specific URL
   schemes; those are not affected by this configuration setting.
 
+* `annex.security.allowed-http-addresses`
+
+  By default, this version of git-annex refuses to download the content of
+  annexed files from the web. Newer versions of git-annex allow downloading
+  from the web, but only when the web server is not on a private IP address.
+
+  To relax this security check and allow getting annexed files from
+  anywhere on the web, set this to "all".
+  
+  Think very carefully before changing this; there are security
+  implications. Anyone who can get a commit into your git-annex repository
+  could `git annex addurl` an url on a private http server, possibly
+  causing it to be downloaded into your repository and transferred to
+  other remotes, exposing its content.
+
 * `annex.secure-erase-command`
 
   This can be set to a command that should be run whenever git-annex
From: Joey Hess <joeyh@joeyh.name>
Date: Thu, 21 Jun 2018 11:35:27 -0400
Subject: add retrievalSecurityPolicy

This will be used to protect against CVE-2018-10859, where an encrypted
special remote is fed the wrong encrypted data, and so tricked into
decrypting something that the user encrypted with their gpg key and did
not store in git-annex.

It also protects against CVE-2018-10857, where a remote follows a http
redirect to a file:// url or to a local private web server. While that's
already been prevented in git-annex's own use of http, external special
remotes, hooks, etc use other http implementations and could still be
vulnerable.

The policy is not yet enforced, this commit only adds the appropriate
metadata to remotes.

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.

(patch backported from Debian stretch 6.20170101-1+deb9u2)

---

Index: b/Remote/Bup.hs
===================================================================
--- a/Remote/Bup.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Remote/Bup.hs	2018-08-28 17:34:28.000000000 -0400
@@ -57,6 +57,9 @@ gen r u c gc = do
 		, storeKey = storeKeyDummy
 		, retrieveKeyFile = retreiveKeyFileDummy
 		, retrieveKeyFileCheap = retrieveCheap buprepo
+		-- Bup uses git, which cryptographically verifies content
+		-- (with SHA1, but sufficiently for this).
+		, retrievalSecurityPolicy = RetrievalAllKeysSecure
 		, removeKey = removeKeyDummy
 		, checkPresent = checkPresentDummy
 		, checkPresentCheap = bupLocal buprepo
Index: b/Remote/Ddar.hs
===================================================================
--- a/Remote/Ddar.hs	2018-08-28 17:34:28.000000000 -0400
+++ b/Remote/Ddar.hs	2018-08-28 17:39:54.000000000 -0400
@@ -55,6 +55,8 @@ gen r u c gc = do
 		, storeKey = storeKeyDummy
 		, retrieveKeyFile = retreiveKeyFileDummy
 		, retrieveKeyFileCheap = retrieveCheap
+		-- Unsure about this, safe default until Robie answers.
+		, retrievalSecurityPolicy = RetrievalVerifiableKeysSecure
 		, removeKey = removeKeyDummy
 		, checkPresent = checkPresentDummy
 		, checkPresentCheap = ddarLocal ddarrepo
Index: b/Remote/Directory.hs
===================================================================
--- a/Remote/Directory.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Remote/Directory.hs	2018-08-28 17:34:28.000000000 -0400
@@ -53,6 +53,7 @@ gen r u c gc = do
 			storeKey = storeKeyDummy,
 			retrieveKeyFile = retreiveKeyFileDummy,
 			retrieveKeyFileCheap = retrieveCheap dir chunkconfig,
+			retrievalSecurityPolicy = RetrievalAllKeysSecure,
 			removeKey = removeKeyDummy,
 			checkPresent = checkPresentDummy,
 			checkPresentCheap = True,
Index: b/Remote/External.hs
===================================================================
--- a/Remote/External.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Remote/External.hs	2018-08-28 17:34:28.000000000 -0400
@@ -53,6 +53,11 @@ gen r u c gc = do
 			storeKey = storeKeyDummy,
 			retrieveKeyFile = retreiveKeyFileDummy,
 			retrieveKeyFileCheap = \_ _ -> return False,
+			-- External special remotes use many http libraries
+			-- and have no protection against redirects to
+			-- local private web servers, or in some cases
+			-- to file:// urls.
+			retrievalSecurityPolicy = RetrievalVerifiableKeysSecure,
 			removeKey = removeKeyDummy,
 			checkPresent = checkPresentDummy,
 			checkPresentCheap = False,
Index: b/Remote/GCrypt.hs
===================================================================
--- a/Remote/GCrypt.hs	2018-08-28 17:34:28.000000000 -0400
+++ b/Remote/GCrypt.hs	2018-08-28 17:34:28.000000000 -0400
@@ -108,6 +108,7 @@ gen' r u c gc = do
 		, storeKey = storeKeyDummy
 		, retrieveKeyFile = retreiveKeyFileDummy
 		, retrieveKeyFileCheap = \_ _ -> return False
+		, retrievalSecurityPolicy = RetrievalAllKeysSecure
 		, removeKey = removeKeyDummy
 		, checkPresent = checkPresentDummy
 		, checkPresentCheap = repoCheap r
Index: b/Remote/Git.hs
===================================================================
--- a/Remote/Git.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Remote/Git.hs	2018-08-28 17:42:25.000000000 -0400
@@ -140,6 +140,7 @@ gen r u c gc
 			, storeKey = copyToRemote new
 			, retrieveKeyFile = copyFromRemote new
 			, retrieveKeyFileCheap = copyFromRemoteCheap new
+			, retrievalSecurityPolicy = RetrievalAllKeysSecure
 			, removeKey = dropKey new
 			, checkPresent = inAnnex new
 			, checkPresentCheap = repoCheap r
Index: b/Remote/Glacier.hs
===================================================================
--- a/Remote/Glacier.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Remote/Glacier.hs	2018-08-28 17:39:55.000000000 -0400
@@ -53,6 +53,9 @@ gen r u c gc = new <$> remoteCost gc ver
 			storeKey = storeKeyDummy,
 			retrieveKeyFile = retreiveKeyFileDummy,
 			retrieveKeyFileCheap = retrieveCheap this,
+			-- glacier-cli does not follow redirects and does
+			-- not support file://, so this is secure.
+			retrievalSecurityPolicy = RetrievalAllKeysSecure,
 			removeKey = removeKeyDummy,
 			checkPresent = checkPresentDummy,
 			checkPresentCheap = False,
Index: b/Remote/Helper/Special.hs
===================================================================
--- a/Remote/Helper/Special.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Remote/Helper/Special.hs	2018-08-28 17:34:28.000000000 -0400
@@ -162,6 +162,14 @@ specialRemote' cfg c preparestorer prepa
 			(retrieveKeyFileCheap baser k d)
 			-- retrieval of encrypted keys is never cheap
 			(\_ -> return False)
+		-- When encryption is used, the remote could provide
+		-- some other content encrypted by the user, and trick
+		-- git-annex into decrypting it, leaking the decryption
+		-- into the git-annex repository. Verifiable keys
+		-- are the main protection against this attack.
+		, retrievalSecurityPolicy = if isencrypted
+			then RetrievalVerifiableKeysSecure
+			else retrievalSecurityPolicy baser
 		, removeKey = \k -> cip >>= removeKeyGen k
 		, checkPresent = \k -> cip >>= checkPresentGen k
 		, cost = maybe
@@ -176,6 +184,7 @@ specialRemote' cfg c preparestorer prepa
 				]
 		}
 	cip = cipherKey c
+	isencrypted = isJust (extractCipher c)
 	gpgopts = getGpgEncParams encr
 
 	safely a = catchNonAsync a (\e -> warning (show e) >> return False)
Index: b/Remote/Hook.hs
===================================================================
--- a/Remote/Hook.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Remote/Hook.hs	2018-08-28 17:34:28.000000000 -0400
@@ -46,6 +46,9 @@ gen r u c gc = do
 			storeKey = storeKeyDummy,
 			retrieveKeyFile = retreiveKeyFileDummy,
 			retrieveKeyFileCheap = retrieveCheap hooktype,
+			-- A hook could use http and be vulnerable to
+			-- redirect to file:// attacks, etc.
+			retrievalSecurityPolicy = RetrievalVerifiableKeysSecure,
 			removeKey = removeKeyDummy,
 			checkPresent = checkPresentDummy,
 			checkPresentCheap = False,
Index: b/Remote/Rsync.hs
===================================================================
--- a/Remote/Rsync.hs	2018-08-28 17:34:28.000000000 -0400
+++ b/Remote/Rsync.hs	2018-08-28 17:34:28.000000000 -0400
@@ -68,6 +68,7 @@ gen r u c gc = do
 			, storeKey = storeKeyDummy
 			, retrieveKeyFile = retreiveKeyFileDummy
 			, retrieveKeyFileCheap = retrieveCheap o
+			, retrievalSecurityPolicy = RetrievalAllKeysSecure
 			, removeKey = removeKeyDummy
 			, checkPresent = checkPresentDummy
 			, checkPresentCheap = False
Index: b/Remote/S3.hs
===================================================================
--- a/Remote/S3.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Remote/S3.hs	2018-08-28 17:34:28.000000000 -0400
@@ -58,6 +58,9 @@ gen r u c gc = new <$> remoteCost gc exp
 			storeKey = storeKeyDummy,
 			retrieveKeyFile = retreiveKeyFileDummy,
 			retrieveKeyFileCheap = retrieveCheap,
+			-- HttpManagerRestricted is used here, so this is
+			-- secure.
+			retrievalSecurityPolicy = RetrievalAllKeysSecure,
 			removeKey = removeKeyDummy,
 			checkPresent = checkPresentDummy,
 			checkPresentCheap = False,
Index: b/Remote/Tahoe.hs
===================================================================
--- a/Remote/Tahoe.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Remote/Tahoe.hs	2018-08-28 17:34:28.000000000 -0400
@@ -71,6 +71,8 @@ gen r u c gc = do
 		storeKey = store u hdl,
 		retrieveKeyFile = retrieve u hdl,
 		retrieveKeyFileCheap = \_ _ -> return False,
+		-- Tahoe cryptographically verifies content.
+		retrievalSecurityPolicy = RetrievalAllKeysSecure,
 		removeKey = remove,
 		checkPresent = checkKey u hdl,
 		checkPresentCheap = False,
Index: b/Remote/Web.hs
===================================================================
--- a/Remote/Web.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Remote/Web.hs	2018-08-28 17:34:28.000000000 -0400
@@ -49,6 +49,9 @@ gen r _ c gc =
 		storeKey = uploadKey,
 		retrieveKeyFile = downloadKey,
 		retrieveKeyFileCheap = downloadKeyCheap,
+		-- HttpManagerRestricted is used here, so this is
+		-- secure.
+		retrievalSecurityPolicy = RetrievalAllKeysSecure,
 		removeKey = dropKey,
 		checkPresent = checkKey,
 		checkPresentCheap = False,
Index: b/Remote/WebDAV.hs
===================================================================
--- a/Remote/WebDAV.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Remote/WebDAV.hs	2018-08-28 17:34:28.000000000 -0400
@@ -58,6 +58,9 @@ gen r u c gc = new <$> remoteCost gc exp
 			storeKey = storeKeyDummy,
 			retrieveKeyFile = retreiveKeyFileDummy,
 			retrieveKeyFileCheap = retrieveCheap,
+			-- HttpManagerRestricted is used here, so this is
+			-- secure.
+			retrievalSecurityPolicy = RetrievalAllKeysSecure,
 			removeKey = removeKeyDummy,
 			checkPresent = checkPresentDummy,
 			checkPresentCheap = False,
Index: b/Types/Key.hs
===================================================================
--- a/Types/Key.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Types/Key.hs	2018-08-28 17:42:25.000000000 -0400
@@ -138,3 +138,13 @@ prop_idempotent_key_decode f
 	normalfieldorder = fields `isPrefixOf` "smSC"
 	fields = map (f !!) $ filter (< length f) $ map succ $
 		elemIndices fieldSep f
+
+{- Is the Key variety backed by a hash, which allows verifying content?
+ - It does not have to be cryptographically secure against eg birthday
+ - attacks.
+ -}
+isVerifiable :: Key -> Bool
+isVerifiable k = case keyBackendName k of
+	"WORM" -> False
+	"URL" -> False
+	_ -> True
Index: b/Types/Remote.hs
===================================================================
--- a/Types/Remote.hs	2018-08-28 17:34:22.000000000 -0400
+++ b/Types/Remote.hs	2018-08-28 17:34:28.000000000 -0400
@@ -13,6 +13,7 @@ module Types.Remote
 	, RemoteTypeA(..)
 	, RemoteA(..)
 	, Availability(..)
+	, RetrievalSecurityPolicy(..)
 	)
 	where
 
@@ -66,6 +67,8 @@ data RemoteA a = Remote {
 	retrieveKeyFile :: Key -> AssociatedFile -> FilePath -> MeterUpdate -> a Bool,
 	-- retrieves a key's contents to a tmp file, if it can be done cheaply
 	retrieveKeyFileCheap :: Key -> FilePath -> a Bool,
+	-- Security policy for reteiving keys from this remote.
+	retrievalSecurityPolicy :: RetrievalSecurityPolicy,
 	-- removes a key's contents (succeeds if the contents are not present)
 	removeKey :: Key -> a Bool,
 	-- Checks if a key is present in the remote.
@@ -112,3 +115,29 @@ instance Eq (RemoteA a) where
 
 instance Ord (RemoteA a) where
 	compare = comparing uuid
+
+-- Security policy indicating what keys can be safely retrieved from a
+-- remote.
+data RetrievalSecurityPolicy
+	= RetrievalVerifiableKeysSecure
+	-- ^ Transfer of keys whose content can be verified
+	-- with a hash check is secure; transfer of unverifiable keys is
+	-- not secure and should not be allowed.
+	--
+	-- This is used eg, when HTTP to a remote could be redirected to a
+	-- local private web server or even a file:// url, causing private
+	-- data from it that is not the intended content of a key to make
+	-- its way into the git-annex repository.
+	--
+	-- It's also used when content is stored encrypted on a remote,
+	-- which could replace it with a different encrypted file, and
+	-- trick git-annex into decrypting it and leaking the decryption
+	-- into the git-annex repository.
+	--
+	-- It's not (currently) used when the remote could alter the
+	-- content stored on it, because git-annex does not provide
+	-- strong guarantees about the content of keys that cannot be 
+	-- verified with a hash check.
+	-- (But annex.securehashesonly does provide such guarantees.)
+	| RetrievalAllKeysSecure
+	-- ^ Any key can be securely retrieved.
From: Joey Hess <joeyh@joeyh.name>
Date: Thu, 21 Jun 2018 13:34:11 -0400
Subject: enforce retrievalSecurityPolicy

Leveraged the existing verification code by making it also check the
retrievalSecurityPolicy.

Also, prevented getViaTmp from running the download action at all when the
retrievalSecurityPolicy is going to prevent verifying and so storing it.

Added annex.security.allow-unverified-downloads. A per-remote version
would be nice to have too, but would need more plumbing, so KISS.
(Bill the Cat reference not too over the top I hope. The point is to
make this something the user reads the documentation for before using.)

A few calls to verifyKeyContent and getViaTmp, that don't
involve downloads from remotes, have RetrievalAllKeysSecure hard-coded.
It was also hard-coded for P2P.Annex and Command.RecvKey,
to match the values of the corresponding remotes.

A few things use retrieveKeyFile/retrieveKeyFileCheap without going
through getViaTmp.
* Command.Fsck when downloading content from a remote to verify it.
  That content does not get into the annex, so this is ok.
* Command.AddUrl when using a remote to download an url; this is new
  content being added, so this is ok.

This commit was sponsored by Fernando Jimenez on Patreon.

(patch backported from Debian stretch 6.20170101-1+deb9u2)

---

Index: b/Annex/Content.hs
===================================================================
--- a/Annex/Content.hs	2018-08-28 18:50:15.311731831 -0400
+++ b/Annex/Content.hs	2018-08-28 18:50:15.303731708 -0400
@@ -12,6 +12,7 @@ module Annex.Content (
 	inAnnexSafe,
 	inAnnexCheck,
 	lockContent,
+	RetrievalSecurityPolicy(..),
 	getViaTmp,
 	getViaTmpChecked,
 	getViaTmpUnchecked,
@@ -48,6 +49,7 @@ import Utility.DiskFree
 import Utility.FileMode
 import qualified Annex.Url as Url
 import Types.Key
+import Types.Remote (RetrievalSecurityPolicy(..))
 import Utility.DataUnits
 import Utility.CopyFile
 import Config
@@ -207,19 +209,19 @@ lockContent key a = do
 {- Runs an action, passing it a temporary filename to get,
  - and if the action succeeds, moves the temp file into 
  - the annex as a key's content. -}
-getViaTmp :: Key -> (FilePath -> Annex Bool) -> Annex Bool
-getViaTmp = getViaTmpChecked (return True)
+getViaTmp :: RetrievalSecurityPolicy -> Key -> (FilePath -> Annex Bool) -> Annex Bool
+getViaTmp rsp key action = getViaTmpChecked rsp (return True) key action
 
 {- Like getViaTmp, but does not check that there is enough disk space
  - for the incoming key. For use when the key content is already on disk
  - and not being copied into place. -}
-getViaTmpUnchecked :: Key -> (FilePath -> Annex Bool) -> Annex Bool
-getViaTmpUnchecked = finishGetViaTmp (return True)
+getViaTmpUnchecked :: RetrievalSecurityPolicy -> Key -> (FilePath -> Annex Bool) -> Annex Bool
+getViaTmpUnchecked rsp key action = finishGetViaTmp rsp (return True) key action 
 
-getViaTmpChecked :: Annex Bool -> Key -> (FilePath -> Annex Bool) -> Annex Bool
-getViaTmpChecked check key action = 
+getViaTmpChecked :: RetrievalSecurityPolicy -> Annex Bool -> Key -> (FilePath -> Annex Bool) -> Annex Bool
+getViaTmpChecked rsp check key action = 
 	prepGetViaTmpChecked key False $
-		finishGetViaTmp check key action
+		finishGetViaTmp rsp check key action
 
 {- Prepares to download a key via a tmp file, and checks that there is
  - enough free disk space.
@@ -245,8 +247,8 @@ prepGetViaTmpChecked key unabletoget get
 		, return unabletoget
 		)
 
-finishGetViaTmp :: Annex Bool -> Key -> (FilePath -> Annex Bool) -> Annex Bool
-finishGetViaTmp check key action = do
+finishGetViaTmp :: RetrievalSecurityPolicy -> Annex Bool -> Key -> (FilePath -> Annex Bool) -> Annex Bool
+finishGetViaTmp rsp check key action = checkallowed $ do
 	tmpfile <- prepTmp key
 	ifM (action tmpfile <&&> check)
 		( do
@@ -257,6 +259,17 @@ finishGetViaTmp check key action = do
 		-- to resume its transfer
 		, return False
 		)
+  where
+	-- Avoid running the action to get the content when the
+	-- RetrievalSecurityPolicy would cause verification to always fail.
+	checkallowed a = case rsp of
+		RetrievalAllKeysSecure -> a
+		RetrievalVerifiableKeysSecure
+			| isVerifiable key -> a
+			| otherwise -> ifM (annexAllowUnverifiedDownloads <$> Annex.getGitConfig)
+				( a
+				, warnUnverifiableInsecure key >> return False
+				)
 
 prepTmp :: Key -> Annex FilePath
 prepTmp key = do
@@ -275,6 +288,17 @@ withTmp key action = do
 	liftIO $ nukeFile tmp
 	return res
 
+warnUnverifiableInsecure :: Key -> Annex ()
+warnUnverifiableInsecure k = warning $ unwords
+	[ "Getting " ++ kv ++ " keys with this remote is not secure;"
+	, "the content cannot be verified to be correct."
+	, "(Use annex.security.allow-unverified-downloads to bypass"
+	, "this safety check.)"
+	]
+  where
+	kv = keyBackendName k
+
+
 {- Checks that there is disk space available to store a given key,
  - in a destination (or the annex) printing a warning if not. -}
 checkDiskSpace :: Maybe FilePath -> Key -> Integer -> Annex Bool
Index: b/Command/RecvKey.hs
===================================================================
--- a/Command/RecvKey.hs	2018-08-28 18:50:15.311731831 -0400
+++ b/Command/RecvKey.hs	2018-08-28 18:54:59.443756576 -0400
@@ -15,6 +15,7 @@ import Annex
 import Utility.Rsync
 import Logs.Transfer
 import Command.SendKey (fieldTransfer)
+import Types.Remote (RetrievalSecurityPolicy(..))
 import qualified CmdLine.GitAnnexShell.Fields as Fields
 import qualified Types.Key
 import qualified Types.Backend
@@ -29,7 +30,8 @@ seek = withKeys start
 
 start :: Key -> CommandStart
 start key = fieldTransfer Download key $ \_p ->
-	ifM (getViaTmp key go)
+	-- This matches the retrievalSecurityPolicy of Remote.Git
+	ifM (getViaTmp RetrievalAllKeysSecure key go)
 		( do
 			-- forcibly quit after receiving one key,
 			-- and shutdown cleanly
Index: b/Command/Reinject.hs
===================================================================
--- a/Command/Reinject.hs	2018-08-28 18:50:15.311731831 -0400
+++ b/Command/Reinject.hs	2018-08-28 18:50:15.307731770 -0400
@@ -53,7 +53,7 @@ perform src dest key = do
 	-- so mv is used rather than simply calling
 	-- moveToObjectDir; disk space is also
 	-- checked this way.
-	move = getViaTmp key $ \tmp ->
+	move = getViaTmp RetrievalAllKeysSecure key $ \tmp ->
 		liftIO $ boolSystem "mv" [File src, File tmp]
 	reject = const $ return "wrong file?"
 
Index: b/Command/TestRemote.hs
===================================================================
--- a/Command/TestRemote.hs	2018-08-28 18:50:15.311731831 -0400
+++ b/Command/TestRemote.hs	2018-08-28 18:50:15.307731770 -0400
@@ -153,7 +153,7 @@ test st r k =
 		Just b -> case fsckKey b of
 			Nothing -> return True
 			Just fscker -> fscker k (key2file k)
-	get = getViaTmp k $ \dest ->
+	get = getViaTmp (Remote.retrievalSecurityPolicy r) k $ \dest ->
 		Remote.retrieveKeyFile r k Nothing dest nullMeterUpdate
 	store = Remote.storeKey r k Nothing nullMeterUpdate
 	remove = Remote.removeKey r k
@@ -167,10 +167,10 @@ testUnavailable st r k =
 	, check (`notElem` [Right True, Right False]) "checkPresent" $
 		Remote.checkPresent r k
 	, check (== Right False) "retrieveKeyFile" $
-		getViaTmp k $ \dest ->
+		getViaTmp (Remote.retrievalSecurityPolicy r) k $ \dest ->
 			Remote.retrieveKeyFile r k Nothing dest nullMeterUpdate
 	, check (== Right False) "retrieveKeyFileCheap" $
-		getViaTmp k $ \dest ->
+		getViaTmp (Remote.retrievalSecurityPolicy r) k $ \dest ->
 			Remote.retrieveKeyFileCheap r k dest
 	]
   where
Index: b/Command/Get.hs
===================================================================
--- a/Command/Get.hs	2018-08-28 18:50:15.311731831 -0400
+++ b/Command/Get.hs	2018-08-28 18:50:15.307731770 -0400
@@ -53,7 +53,7 @@ start' expensivecheck from key afile = s
 		next a
 
 perform :: Key -> AssociatedFile -> CommandPerform
-perform key afile = stopUnless (getViaTmp key $ getKeyFile key afile) $
+perform key afile = stopUnless (getViaTmp RetrievalAllKeysSecure key $ getKeyFile key afile) $
 	next $ return True -- no cleanup needed
 
 {- Try to find a copy of the file in one of the remotes,
Index: b/Command/Move.hs
===================================================================
--- a/Command/Move.hs	2018-08-28 18:50:15.311731831 -0400
+++ b/Command/Move.hs	2018-08-28 18:50:15.307731770 -0400
@@ -158,7 +158,7 @@ fromPerform src move key afile = ifM (in
 	go = notifyTransfer Download afile $ 
 		download (Remote.uuid src) key afile noRetry $ \p -> do
 			showAction $ "from " ++ Remote.name src
-			getViaTmp key $ \t -> Remote.retrieveKeyFile src key afile t p
+			getViaTmp (Remote.retrievalSecurityPolicy src) key $ \t -> Remote.retrieveKeyFile src key afile t p
 	dispatch _ False = stop -- failed
 	dispatch False True = next $ return True -- copy complete
 	dispatch True True = do -- finish moving
Index: b/Command/ReKey.hs
===================================================================
--- a/Command/ReKey.hs	2018-08-28 18:50:15.311731831 -0400
+++ b/Command/ReKey.hs	2018-08-28 18:50:15.307731770 -0400
@@ -49,7 +49,7 @@ perform file oldkey newkey = do
 {- Make a hard link to the old key content (when supported),
  - to avoid wasting disk space. -}
 linkKey :: Key -> Key -> Annex Bool
-linkKey oldkey newkey = getViaTmpUnchecked newkey $ \tmp -> do
+linkKey oldkey newkey = getViaTmpUnchecked RetrievalAllKeysSecure newkey $ \tmp -> do
 	src <- calcRepo $ gitAnnexLocation oldkey
 	liftIO $ ifM (doesFileExist tmp)
 		( return True
Index: b/Command/TransferKey.hs
===================================================================
--- a/Command/TransferKey.hs	2018-08-28 18:50:15.311731831 -0400
+++ b/Command/TransferKey.hs	2018-08-28 18:50:15.307731770 -0400
@@ -51,7 +51,7 @@ toPerform remote key file = go Upload fi
 fromPerform :: Remote -> Key -> AssociatedFile -> CommandPerform
 fromPerform remote key file = go Upload file $
 	download (uuid remote) key file forwardRetry $ \p ->
-		getViaTmp key $ \t -> Remote.retrieveKeyFile remote key file t p
+		getViaTmp (Remote.retrievalSecurityPolicy remote) key $ \t -> Remote.retrieveKeyFile remote key file t p
 
 go :: Direction -> AssociatedFile -> (NotifyWitness -> Annex Bool) -> CommandPerform
 go direction file a = notifyTransfer direction file a >>= liftIO . exitBool
Index: b/Command/TransferKeys.hs
===================================================================
--- a/Command/TransferKeys.hs	2018-08-28 18:50:15.311731831 -0400
+++ b/Command/TransferKeys.hs	2018-08-28 18:50:15.307731770 -0400
@@ -43,7 +43,7 @@ start = do
 				return ok
 		| otherwise = notifyTransfer direction file $
 			download (Remote.uuid remote) key file forwardRetry $ \p ->
-				getViaTmp key $ \t -> Remote.retrieveKeyFile remote key file t p
+				getViaTmp (Remote.retrievalSecurityPolicy remote) key $ \t -> Remote.retrieveKeyFile remote key file t p
 
 runRequests
 	:: Handle
Index: b/Command/Sync.hs
===================================================================
--- a/Command/Sync.hs	2018-08-28 18:50:15.311731831 -0400
+++ b/Command/Sync.hs	2018-08-28 18:50:15.307731770 -0400
@@ -367,7 +367,7 @@ syncFile rs f k = do
 		)
 	get have = commandAction $ do
 		showStart "get" f
-		next $ next $ getViaTmp k $ \dest -> getKeyFile' k (Just f) dest have
+		next $ next $ getViaTmp RetrievalAllKeysSecure k $ \dest -> getKeyFile' k (Just f) dest have
 
 	wantput r
 		| Remote.readonly r || remoteAnnexReadOnly (Remote.gitconfig r) = return False
Index: b/Remote.hs
===================================================================
--- a/Remote.hs	2018-08-28 18:50:15.311731831 -0400
+++ b/Remote.hs	2018-08-28 18:50:15.307731770 -0400
@@ -12,6 +12,7 @@ module Remote (
 	storeKey,
 	retrieveKeyFile,
 	retrieveKeyFileCheap,
+	retrievalSecurityPolicy,
 	removeKey,
 	hasKey,
 	hasKeyCheap,
Index: b/Remote/Git.hs
===================================================================
--- a/Remote/Git.hs	2018-08-28 18:50:15.311731831 -0400
+++ b/Remote/Git.hs	2018-08-28 18:50:15.307731770 -0400
@@ -480,9 +480,10 @@ copyToRemote r key file p
 			( return True
 			, do
 				ensureInitialized
+				let rsp = RetrievalAllKeysSecure
 				runTransfer (Transfer Download u key) file noRetry $ const $
 					Annex.Content.saveState True `after`
-						Annex.Content.getViaTmpChecked (liftIO checksuccessio) key
+						Annex.Content.getViaTmpChecked rsp (liftIO checksuccessio) key
 							(\d -> rsyncOrCopyFile params object d p)
 			)
 
Index: b/Types/GitConfig.hs
===================================================================
--- a/Types/GitConfig.hs	2018-08-28 18:50:15.311731831 -0400
+++ b/Types/GitConfig.hs	2018-08-28 18:50:15.307731770 -0400
@@ -57,6 +57,7 @@ data GitConfig = GitConfig
 	, annexHardLink :: Bool
 	, annexAllowedUrlSchemes :: S.Set Scheme
 	, annexAllowedHttpAddresses :: String
+	, annexAllowUnverifiedDownloads :: Bool
 	, coreSymlinks :: Bool
 	, gcryptId :: Maybe String
 	}
@@ -97,6 +98,8 @@ extractGitConfig r = GitConfig
 			getmaybe (annex "security.allowed-url-schemes")
         , annexAllowedHttpAddresses = fromMaybe "" $
 		getmaybe (annex "security.allowed-http-addresses")
+	, annexAllowUnverifiedDownloads = (== Just "ACKTHPPT") $
+		getmaybe (annex "security.allow-unverified-downloads")
 	, coreSymlinks = getbool "core.symlinks" True
 	, gcryptId = getmaybe "core.gcrypt-id"
 	}
Index: b/Types/Key.hs
===================================================================
--- a/Types/Key.hs	2018-08-28 18:50:15.311731831 -0400
+++ b/Types/Key.hs	2018-08-28 18:50:15.307731770 -0400
@@ -16,6 +16,7 @@ module Types.Key (
 	nonChunkKey,
 	chunkKeyOffset,
 	isChunkKey,
+	isVerifiable,
 
 	prop_idempotent_key_encode,
 	prop_idempotent_key_decode
Index: b/doc/git-annex.mdwn
===================================================================
--- a/doc/git-annex.mdwn	2018-08-28 18:50:15.311731831 -0400
+++ b/doc/git-annex.mdwn	2018-08-28 18:50:15.307731770 -0400
@@ -1587,6 +1587,10 @@ Here are all the supported configuration
   This both prevents git-annex sync from pushing changes, and prevents
   storing or removing files from read-only remote.
 
+  Note that even when this is set to `false`, git-annex does verification
+  in some edge cases, where it's likely the case than an
+  object was downloaded incorrectly, or when needed for security.
+
 * `remote.<name>.annexUrl`
 
   Can be used to specify a different url than the regular `remote.<name>.url`
@@ -1718,6 +1722,43 @@ Here are all the supported configuration
   causing it to be downloaded into your repository and transferred to
   other remotes, exposing its content.
 
+* `annex.security.allow-unverified-downloads`, 
+
+  For security reasons, git-annex refuses to download content from
+  most special remotes when it cannot check a hash to verify 
+  that the correct content was downloaded. This particularly impacts
+  downloading the content of URL or WORM keys, which lack hashes.
+
+  The best way to avoid problems due to this is to migrate files
+  away from such keys, before their content reaches a special remote.
+  See [[git-annex-migrate]](1).
+
+  When the content is only available from a special remote, you can
+  use this configuration to force git-annex to download it.
+  But you do so at your own risk, and it's very important you read and
+  understand the information below first!
+
+  Downloading unverified content from encrypted special remotes is
+  prevented, because the special remote could send some other encrypted
+  content than what you expect, causing git-annex to decrypt data that you
+  never checked into git-annex, and risking exposing the decrypted
+  data to any non-encrypted remotes you send content to.
+
+  Downloading unverified content from (non-encrypted)
+  external special remotes is prevented, because they could follow
+  http redirects to web servers on localhost or on a private network,
+  or in some cases to a file:/// url.
+
+  If you decide to bypass this security check, the best thing to do is
+  to only set it temporarily while running the command that gets the file.
+  The value to set the config to is "ACKTHPPT".
+  For example:
+
+	git -c annex.security.allow-unverified-downloads=ACKTHPPT annex get myfile
+
+  It would be a good idea to check that it downloaded the file you expected,
+  too.
+
 * `annex.secure-erase-command`
 
   This can be set to a command that should be run whenever git-annex
From: Joey Hess <joeyh@joeyh.name>
Date: Thu, 21 Jun 2018 14:14:56 -0400
Subject: don't assume boto will remain secure

On second thought, best to default to being secure even if boto changes
http libraries to one that happens to follow redirects.

(patch backported from Debian stretch 6.20170101-1+deb9u2)

---

Index: b/Remote/Glacier.hs
===================================================================
--- a/Remote/Glacier.hs	2018-08-28 15:19:08.903444542 -0400
+++ b/Remote/Glacier.hs	2018-08-28 15:19:50.660057855 -0400
@@ -54,8 +54,10 @@ gen r u c gc = new <$> remoteCost gc ver
 			retrieveKeyFile = retreiveKeyFileDummy,
 			retrieveKeyFileCheap = retrieveCheap this,
 			-- glacier-cli does not follow redirects and does
-			-- not support file://, so this is secure.
-			retrievalSecurityPolicy = RetrievalAllKeysSecure,
+			-- not support file://, as far as we know, but
+			-- there's no guarantee that will continue to be
+			-- the case, so require verifiable keys.
+			retrievalSecurityPolicy = RetrievalVerifiableKeysSecure,
 			removeKey = removeKeyDummy,
 			checkPresent = checkPresentDummy,
 			checkPresentCheap = False,
From: Joey Hess <joeyh@joeyh.name>
Date: Thu, 21 Jun 2018 16:38:47 -0400
Subject: set ddar to RetrievalAllKeysSecure

Based on information from Robie Basak.

(patch backported from Debian stretch 6.20170101-1+deb9u2)

---

Index: b/Remote/Ddar.hs
===================================================================
--- a/Remote/Ddar.hs	2018-08-28 18:13:50.000000000 -0400
+++ b/Remote/Ddar.hs	2018-08-28 18:15:58.000000000 -0400
@@ -55,8 +55,9 @@ gen r u c gc = do
 		, storeKey = storeKeyDummy
 		, retrieveKeyFile = retreiveKeyFileDummy
 		, retrieveKeyFileCheap = retrieveCheap
-		-- Unsure about this, safe default until Robie answers.
-		, retrievalSecurityPolicy = RetrievalVerifiableKeysSecure
+		-- ddar communicates over ssh, not subject to http redirect
+		-- type attacks
+		, retrievalSecurityPolicy = RetrievalAllKeysSecure
 		, removeKey = removeKeyDummy
 		, checkPresent = checkPresentDummy
 		, checkPresentCheap = ddarLocal ddarrepo

Reply to: