[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#669427: apt segfaults on s390x



On Wed, May 02, 2012 at 05:28:16PM +0200, Philipp Kern wrote:
> On Wed, May 02, 2012 at 05:23:30PM +0200, David Kalnischkies wrote:
> > > 2129.4.17  kalnisc |    /* Record the Description (it is not translated) */
> > >                   |    MD5SumValue CurMd5 = List.Description_md5();
> > >                   |    if (CurMd5.Value().empty() == true)
> > >                   |       return true;
> > >                   |    std::string CurLang = List.DescriptionLanguage();
> > >                   |
> > >                   |    /* Before we add a new description we first search in the group for
> > >                   |       a version with a description of the same MD5 - if so we reuse this
> > >                   |       description group instead of creating our own for this version */
> > >                   |    for (pkgCache::PkgIterator P = Grp.PackageList();
> > >                   |    P.end() == false; P = Grp.NextPkg(P))
> > >                   |    {
> > >                   |       for (pkgCache::VerIterator V = P.VersionList();
> > >                   |       V.end() == false; ++V)
> > >                   |       {
> > >                   |     if (IsDuplicateDescription(V.DescriptionList(), CurMd5, "") == false)
> > >                   |        continue;
> > >                   |     Ver->DescriptionList = V->DescriptionList;
> > >                   |     return true;
> > >                   |       }
> > >                   |    }
> > >
> > > When IsDuplicateDescription is called, calling md5() on the V.DescriptionList()
> > > points to unallocated memory.  Any idea why that could be?
> > 
> > So you mean the Description struct is invalid (V.DescriptionList()) or
> > the V.DescriptionList().md5() char* ?
> 
> The latter does not point to any initialized memory.
> 
> > We are missing a bit of error checking here (callers of NewDescription() do
> > not check if return is != 0 and IsDuplicateDescription doesn't check if the
> > given Description is valid), but both shouldn't be a problem as NewDescription
> > can only really fail if new memory can't be allocated and as each version has
> > at least one description you shouldn't hit a problem in the dup check either.
> > Both wouldn't be limited to s390x either way:
> > We seem to have a similar bugreport from ppc64 (#669243),
> > if i understand right it's also bigendian 64bit, but no other report.
> 
> Yes.  So would be sparc64.  s390x is the only in-archive arch that's 64bit be.
> 
> > The code in pkgcachegen.cc was reworked for multi-arch and especially the dup
> > check is new and the code as such works wih pointer left and right, but non of
> > it should be architecture dependent… Somehow i fear that it's more related to
> > our checksum changes. We had way to many problems with sha1 and sha2 to assume
> > md5 would be okay (the code for md5 itself is not new, but the code warping
> > around it).
> 
> At least the problem was not in constructing the checksum object, but in
> retrieving the content for it (i.e. md5() in IsDuplicateDescription).
> 
> > You mentioned bisecting? Any insigns which revisions are (not) effected?
> > (bzr has no bisect included by default, and last time i check the plugin was…
> >  lets say suboptimal for us as we tend to have "big" merges)
> 
> Either the plugin was broken or the tree it acted upon was useless to bisect.
> I don't know.
> 

I have been able to bisect the issue, by first converting the bzr
repository to git, and then with the usual method. The commit that has
broken apt on s390x is the following:

commit 22f07fc5e77bcedbc774a4b60d305da847fab287
Author: David Kalnischkies <kalnischkies@gmail.com>
Date:   Wed Oct 12 15:47:56 2011 +0200

    a version can have only a single md5 for descriptions, so we can optimize
    the merging with this knowledge a bit and by correctly sharing the lists
    we only need to have a single description list for possibly many different
    versions. This also means that description translations are shared between
    different sources

diff --git a/apt-pkg/pkgcachegen.cc b/apt-pkg/pkgcachegen.cc
index 3545517..3b2c08e 100644
--- a/apt-pkg/pkgcachegen.cc
+++ b/apt-pkg/pkgcachegen.cc
@@ -279,33 +279,36 @@ bool pkgCacheGenerator::MergeListPackage(ListParser &List, pkgCache::PkgIterator
    for (Ver = Pkg.VersionList(); Ver.end() == false; ++Ver)
    {
       pkgCache::DescIterator Desc = Ver.DescriptionList();
-      Dynamic<pkgCache::DescIterator> DynDesc(Desc);
-      map_ptrloc *LastDesc = &Ver->DescriptionList;
+
+      // a version can only have one md5 describing it
+      if (MD5SumValue(Desc.md5()) != CurMd5)
+	 continue;
 
       // don't add a new description if we have one for the given
       // md5 && language
       if (IsDuplicateDescription(Desc, CurMd5, CurLang) == true)
 	 continue;
 
-      for (Desc = Ver.DescriptionList();
-	   Desc.end() == false;
-	    LastDesc = &Desc->NextDesc, ++Desc)
-      {
-	 if (MD5SumValue(Desc.md5()) != CurMd5)
-	    continue;
-
-	 // Add new description
-	 void const * const oldMap = Map.Data();
-	 map_ptrloc const descindex = NewDescription(Desc, CurLang, CurMd5, *LastDesc);
-	 if (oldMap != Map.Data())
-	    LastDesc += (map_ptrloc*) Map.Data() - (map_ptrloc*) oldMap;
-	 *LastDesc = descindex;
-	 Desc->ParentPkg = Pkg.Index();
-
-	 if ((*LastDesc == 0 && _error->PendingError()) || NewFileDesc(Desc,List) == false)
-	    return _error->Error(_("Error occurred while processing %s (NewFileDesc1)"), Pkg.Name());
-	 break;
-       }
+      Dynamic<pkgCache::DescIterator> DynDesc(Desc);
+      // we add at the end, so that the start is constant as we need
+      // that to be able to efficiently share these lists
+      map_ptrloc *LastDesc = &Ver->DescriptionList;
+      for (;Desc.end() == false && Desc->NextDesc != 0; ++Desc);
+      if (Desc.end() == false)
+	 LastDesc = &Desc->NextDesc;
+
+      void const * const oldMap = Map.Data();
+      map_ptrloc const descindex = NewDescription(Desc, CurLang, CurMd5, *LastDesc);
+      if (oldMap != Map.Data())
+	 LastDesc += (map_ptrloc*) Map.Data() - (map_ptrloc*) oldMap;
+      *LastDesc = descindex;
+      Desc->ParentPkg = Pkg.Index();
+
+      if ((*LastDesc == 0 && _error->PendingError()) || NewFileDesc(Desc,List) == false)
+	 return _error->Error(_("Error occurred while processing %s (NewFileDesc1)"), Pkg.Name());
+
+      // we can stop here as all "same" versions will share the description
+      break;
    }
 
    return true;
@@ -421,7 +424,7 @@ bool pkgCacheGenerator::MergeListVersion(ListParser &List, pkgCache::PkgIterator
    map_ptrloc *LastDesc = &Ver->DescriptionList;
 
    oldMap = Map.Data();
-   map_ptrloc const descindex = NewDescription(Desc, List.DescriptionLanguage(), List.Description_md5(), *LastDesc);
+   map_ptrloc const descindex = NewDescription(Desc, CurLang, CurMd5, *LastDesc);
    if (oldMap != Map.Data())
        LastDesc += (map_ptrloc*) Map.Data() - (map_ptrloc*) oldMap;
    *LastDesc = descindex;
@@ -1426,8 +1429,11 @@ bool pkgCacheGenerator::MakeOnlyStatusCache(OpProgress *Progress,DynamicMMap **O
 bool IsDuplicateDescription(pkgCache::DescIterator Desc,
 			    MD5SumValue const &CurMd5, std::string const &CurLang)
 {
-   for ( ; Desc.end() == false; ++Desc)
-      if (MD5SumValue(Desc.md5()) == CurMd5 && Desc.LanguageCode() == CurLang)
+   // Descriptions in the same link-list have all the same md5
+   if (MD5SumValue(Desc.md5()) != CurMd5)
+      return false;
+   for (; Desc.end() == false; ++Desc)
+      if (Desc.LanguageCode() == CurLang)
 	 return true;
    return false;
 }
diff --git a/debian/changelog b/debian/changelog
index b7f1996..62f32d8 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -25,6 +25,7 @@ apt (0.8.16~exp7) UNRELEASEDexperimental; urgency=low
   * apt-pkg/pkgcachegen.cc:
     - refactor MergeList by creating -Group, -Package and -Version specialist
     - share description list between "same" versions (LP: #868977)
+      This also means that descriptions are shared across archives now.
 
   [ Michael Vogt ]
   * apt-pkg/contrib/configuration.cc:
@@ -39,7 +40,7 @@ apt (0.8.16~exp7) UNRELEASEDexperimental; urgency=low
   * ftparchive/cachedb.cc:
     - fix buffersize in bytes2hex
 
- -- David Kalnischkies <kalnischkies@gmail.com>  Tue, 11 Oct 2011 21:07:38 +0200
+ -- David Kalnischkies <kalnischkies@gmail.com>  Wed, 12 Oct 2011 15:47:43 +0200
 
 apt (0.8.16~exp6) experimental; urgency=low
 


-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net



Reply to: