
Bug#610530: ocfs2-tools: BUG at fs/ocfs2/dlm/dlmmaster.c:2226! invalid opcode



reassign 610530 linux-2.6 linux-2.6/2.6.26-26lenny1
quit

Hi Szabolcs,

Szabolcs JANOSI wrote:

> Justification: causes high load, reboot on both nodes (2-node cluster)
>
> Log details on node1:
>
> (6894,14):dlm_drop_lockres_ref:2224 ERROR: while dropping ref on A35DE40B6A044A4A873B96E2F2DE42B2:M000000000000000112401200000000 (master=0) got -22.
> lockres: M00000000000000011240120000000, owner=0, state=64
>   last used: 5332038594, refcnt: 3, on purge list: yes
>   on dirty list: no, on reco list: no, migrating pending: no
>   inflight locks: 0, asts reserved: 0
>   refmap nodes: [ ], inflight=0
>   granted queue:
>   converting queue:
>   blocked queue:
> ------------[ cut here ]------------
> kernel BUG at fs/ocfs2/dlm/dlmmaster.c:2226!
[...]
> Code: 8b 14 25 24 00 00 00 48 c7 c1 e0 89 39 a0 89 d2 4c 89 74 24 08 89 44 24 10 31 c0 89 2c 24 e8 2c 90 ea df 4c 89 e7 e8 32 43 ff ff <0f> 0b eb fe 48 83 c4 70 89 d8 5b 5d 41 5c 41 5d 41 5e c3 41 54
> RIP  [<ffffffffa038c381>] :ocfs2_dlm:dlm_drop_lockres_ref+0x1dd/0x1f0
[...]
> Technical investigations resulted that it was not caused by network problem.

I guess this was reproducible.  Was it a regression?  (I.e., do you
know of any previous kernel that worked ok?)

| $ git show debian/lenny:fs/ocfs2/dlm/dlmmaster.c | sed -n '2220,2226p'
|	else if (r < 0) {
|		/* BAD.  other node says I did not have a ref. */
|		mlog(ML_ERROR,"while dropping ref on %s:%.*s "
|		    "(master=%u) got %d.\n", dlm->name, namelen,
|		    lockname, res->owner, r);
|		dlm_print_one_lock_resource(res);
|		BUG();

What kernel do you use these days?  Can you still reproduce this?

If you can reproduce this with a current squeeze or sid kernel, the next
step will be to get in touch with upstream.  Sorry we missed this before.

Sincerely,
Jonathan


