[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#591061: Leak still present.



Proposed patch to remove leak attached.

Dave.

On Tue, Sep 28, 2010 at 6:42 PM, Eduardo I. <perseguidor@gmail.com> wrote:
> Hi Cyril, thanks for the heads up.
>
> Unfortunately I still seem to be able to leak memory in the exact same
> way. I'm reporting this back at RH's bugzilla.
>
> I'm attaching ttm/drm related entries in vmallocinfo just in case.
>
> --
> Eduardo.
>
> _______________________________________________
> xorg-driver-ati mailing list
> xorg-driver-ati@lists.x.org
> http://lists.x.org/mailman/listinfo/xorg-driver-ati
>
>
From 086970d5b8371bc38d25bfc9b3a59328bce7a6f9 Mon Sep 17 00:00:00 2001
From: Dave Airlie <airlied@redhat.com>
Date: Wed, 29 Sep 2010 13:31:07 +1000
Subject: [PATCH] drm/ttm: remove race and optimise evicting destroyed buffers.

A race condition was identifed that led to a leak of the BOs, when
 a bo was on the delayed delete list and was selected for eviction,
the code would enter

thread 1 (evict)                 -- thread2 (dd)
bo added to delayed destroy list
take lru lock
ttm_mem_evict_first called
   - reserve locked
   - remove from lru
   - drop lru lock
				take lru lock
				try del from lru - get put_count == 0
				try to reserve locked
					- drop lru lock to wait unreserved
call bo_evict
unreserve buffer
   - take lru lock
   - add back to lru
   - unreserver
   - drop lru lock
				take lru lock due to unreserved
				unbind ttm
				drop put_count references (which is 0)
				despite being on the lru/swap lru lists.
				leak due to never losing reference

The obvious quick fix is to try the bo from the ddestroy list if we
are going to evict it, however if we are going to evict a buffer that
is going to be destroyed we should just destroy it instead, its more
likely to be a lighter-weight operation than evict + delayed destroy.

This patch check is a bo that we are about to evict is actually on the
destroy list and if it is, it unreserves it (without adding to lru),
and cleans up its references. It should then get destroyed via
normal ref counting means.

proposed fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=615505
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=591061

Signed-off-by: Dave Airlie <airlied@redhat.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c |   18 +++++++++++++++---
 1 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index cb4cf7e..60689b1 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -689,7 +689,7 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
 	struct ttm_mem_type_manager *man = &bdev->man[mem_type];
 	struct ttm_buffer_object *bo;
 	int ret, put_count = 0;
-
+	bool destroy = false;
 retry:
 	spin_lock(&glob->lru_lock);
 	if (list_empty(&man->lru)) {
@@ -719,6 +719,13 @@ retry:
 	}
 
 	put_count = ttm_bo_del_from_lru(bo);
+
+	/* is the buffer currently on the delayed destroy list? */
+	if (!list_empty(&bo->ddestroy)) {
+		list_del_init(&bo->ddestroy);
+		destroy = true;
+		put_count++;
+	}
 	spin_unlock(&glob->lru_lock);
 
 	BUG_ON(ret != 0);
@@ -726,8 +733,13 @@ retry:
 	while (put_count--)
 		kref_put(&bo->list_kref, ttm_bo_ref_bug);
 
-	ret = ttm_bo_evict(bo, interruptible, no_wait_reserve, no_wait_gpu);
-	ttm_bo_unreserve(bo);
+	if (destroy) {
+		atomic_set(&bo->reserved, 0);
+		ret = ttm_bo_cleanup_refs(bo, !no_wait_gpu);
+	} else {
+		ret = ttm_bo_evict(bo, interruptible, no_wait_reserve, no_wait_gpu);
+		ttm_bo_unreserve(bo);
+	}
 
 	kref_put(&bo->list_kref, ttm_bo_release_list);
 	return ret;
-- 
1.7.2.3


Reply to: