[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: [Insight-developers] FEM debian test failures



That is great! It's a much more sane patch than my alignment hack. I have verified that is solved the problem on my system too.

There are still a lot more failing tests on the 32-bit version dashboard build. It certainly appears to be an optimization bug with this version of GCC. I am not sure how worth wild it will be to pursue the remaining test failures that are occurring here:

http://open.cdash.org/viewTest.php?onlyfailed&buildid=2355571

If we have to just hack the structure of the loops, or move around how the data is laid out in memory.

The version of this compiler is:

$ gcc-4.7 --version
gcc-4.7 (Debian 4.7.0-12) 4.7.0
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

So it's still very new. Is it likely that Debian is going to release with this version of the compiler? or are the bug is gcc going to get patched?

Thanks,
Brad

On Jun 12, 2012, at 2:44 PM, Bill Lorensen wrote:

Paul,

Can you send me the updated  file (not a debian patch) so I can submit
it as a gerrit patch?

Bill

On Tue, Jun 12, 2012 at 2:28 PM, Paul Novotny <paul@paulnovo.us> wrote:
I spent some time on this too, and again was about to submit a patch
(man, I am just too slow). The code in ComputeMatrix looked a bit odd to
me, ie using where the for counter stopped as a condition in the next
if. I am thinking gcc does some for-loop branching optimization that
does't like this. Anyway, if you compile with the attached patch, the
Spatial Object segfaults go away.

-Paul

On Tue, 2012-06-12 at 13:58 -0400, Bill Lorensen wrote:
First, I can't imagine how you figured this out. You are starting to
become a legend in your time...

Why not #ifdef your solution. Something like:

#if (__GNUC__ == 4.7)
  InputVectorType m_MatrixScale __attribute__ ((aligned) );
#else
  InputVectorType m_MatrixScale;
#endif


On Tue, Jun 12, 2012 at 12:16 PM, Bradley Lowekamp
<blowekamp@mail.nih.gov> wrote:
Hello Steve,

I spent some time looking into this.

I downloaded the latest Debian 64-bit and then updated to the sid unstable
with gcc 4.7.  I think this is the same version you are using?

These are the failing tests I had:

The following tests FAILED:
508 - itkDiffusionTensor3DReconstructionImageFilterTest (SEGFAULT)
885 - itkReadWriteSpatialObjectTest (SEGFAULT)
886 - itkReadWriteSpatialObjectTest1 (SEGFAULT)
887 - itkReadWriteSpatialObjectTest2 (SEGFAULT)
1504 - itkMeshSpatialObjectIOTest (SEGFAULT)
1505 - itkMeshSpatialObjectIOTest1 (SEGFAULT)
1506 - itkMeshSpatialObjectIOTest2 (SEGFAULT)
1924 - itkImageSpatialObjectTest (SEGFAULT)
1925 - itkSpatialObjectToImageStatisticsCalculatorTest (SEGFAULT)
1926 - itkImageMaskSpatialObjectTest (SEGFAULT)
1927 - itkImageMaskSpatialObjectTest2 (SEGFAULT)
1928 - itkImageMaskSpatialObjectTest3 (SEGFAULT)
1931 - itkSceneSpatialObjectTest (SEGFAULT)
1933 - itkMetaArrowConverterTest (SEGFAULT)
1946 - itkArrowSpatialObjectTest (SEGFAULT)
2124 - itkTransformsSetParametersTest (SEGFAULT)
Errors while running CTest


I was able to repeat many of your test failures. I then hacked and debugged
the code for a little bit. I don't see anything wrong with it. Based on you
experiment of mixing optimization levels and what I observed, I would tend
to agree that there is likely some kind of gcc bug, perhaps in intra-unit
optimization or how the data get laid out in certain circumstances.

The most interesting things I was able to do it the following change:

diff --git a/Modules/Core/Transform/include/itkScalableAffineTransform.h
b/Modules/Core/Transform/i
index 6032127..8dbcdc4 100644
--- a/Modules/Core/Transform/include/itkScalableAffineTransform.h
+++ b/Modules/Core/Transform/include/itkScalableAffineTransform.h
@@ -166,7 +166,7 @@ private:
   const Self & operator=(const Self &);

   double          m_Scale[NDimensions];
-  InputVectorType m_MatrixScale;
+  InputVectorType m_MatrixScale __attribute__ ((aligned) );
 }; //class ScalableAffineTransform
 }  // namespace itk


With this change all tests pass. Which is _really_ good to know! This mean
that all these failing tests are related to this same bug. So I think I have
taken this as far as it needs to... I guess I hope that a patch in gcc 4.7
will address this issue.


On another note. Have you looking at the SimpleITK project? Have you
considered also packaging that for Debian distributions?

Thanks,
Brad



On Jun 10, 2012, at 1:41 AM, Steve M. Robbins wrote:

On Sun, Jun 10, 2012 at 12:03:39AM -0500, Steve M. Robbins wrote:

Well, I spent some time looking at itkReadWriteSpatialObjectTest and

I've determined a few things.

 [...]

2. It works when built using gcc 4.7 in RelWithDebInfo mode (flags -O2 -g).

 [...]

4. It fails when built using gcc 4.7 with flags -O3 -g -DNDEBUG.


One further item.  Of the three files that make up
ITKIOSpatialObjectsTestDriver, as long as this one
is built with -O2:

itkPolygonGroupSpatialObjectXMLFileTest.cxx.o

I can build the other two with -O3:

ITKIOSpatialObjectsTestDriver.cxx.o
itkReadWriteSpatialObjectTest.cxx.o

and the test completes successfully.  This makes no sense to me
because itkPolygonGroupSpatialObjectXMLFileTest code is not even
executed when running the test in question ("ctest -R
itkReadWriteSpatialObjectTest").


-Steve
<signature.asc>_______________________________________________
Powered by www.kitware.com

Visit other Kitware open-source projects at
http://www.kitware.com/opensource/opensource.html

Kitware offers ITK Training Courses, for more information visit:
http://kitware.com/products/protraining.php

Please keep messages on-topic and check the ITK FAQ at:
http://www.itk.org/Wiki/ITK_FAQ

Follow this link to subscribe/unsubscribe:
http://www.itk.org/mailman/listinfo/insight-developers


========================================================

Bradley Lowekamp

Medical Science and Computing for

Office of High Performance Computing and Communications

National Library of Medicine

blowekamp@mail.nih.gov











--
Unpaid intern in BillsBasement at noware dot com

========================================================

Bradley Lowekamp  

Medical Science and Computing for

Office of High Performance Computing and Communications

National Library of Medicine 

blowekamp@mail.nih.gov





Reply to: