
Bug#822084: 4.4.6-1~bpo8+1 deadlock with systemd+cgroup



Package: linux
Version: 4.4.6-1~bpo8+1

We plan to upgrade our servers to 4.4.6 for the networking improvements
in the 4.4 series. We run Debian wheezy and jessie.

During testing, we hit serious deadlocks on some servers. These
servers run jessie (and thus systemd). The symptom is that the boot
process hangs before the login prompt appears, so systemd itself is hung.

The servers work well with the 3.16.7-ckt20 and 4.2.6 bpo kernels.

We enabled lockdep and lock_stat and captured the trace below:

[ 1680.821488] INFO: task systemd:1 blocked for more than 120 seconds.
[ 1680.821533]       Tainted: G           O    4.4.6 #2
[ 1680.821574] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 1680.821634] systemd         D ffff8818116f3d30     0     1      0 0x00000000
[ 1680.821690]  ffff8818116f3d30 0000000000000007 0000000000000006
ffff8830065d6a58
[ 1680.821771]  ffff881811202140 ffff8818116d2040 ffff8818116f4000
0000000000000246
[ 1680.821853]  ffffffff81c69f08 ffff8818116d2040 00000000ffffffff
ffff8818116f3d48
[ 1680.821941] Call Trace:
[ 1680.821982]  [<ffffffff8165f42c>] schedule+0x3c/0x90
[ 1680.822021]  [<ffffffff8165f775>] schedule_preempt_disabled+0x15/0x20
[ 1680.822064]  [<ffffffff8166117c>] mutex_lock_nested+0x18c/0x3e0
[ 1680.822105]  [<ffffffff8113d8be>] ? proc_cgroup_show+0x4e/0x300
[ 1680.822146]  [<ffffffff8113d8be>] ? proc_cgroup_show+0x4e/0x300
[ 1680.822184]  [<ffffffff8113d8be>] proc_cgroup_show+0x4e/0x300
[ 1680.822228]  [<ffffffff812a7e40>] proc_single_show+0x50/0x90
[ 1680.822266]  [<ffffffff81258e99>] seq_read+0xe9/0x3c0
[ 1680.822306]  [<ffffffff8122f658>] __vfs_read+0x18/0x40
[ 1680.822342]  [<ffffffff8122fc69>] vfs_read+0x89/0x130
[ 1680.822382]  [<ffffffff81230a69>] SyS_read+0x49/0xb0
[ 1680.822418]  [<ffffffff81665db6>] system_call_fast_compare_end+0xc/0x70
[ 1680.822461] 2 locks held by systemd/1:
[ 1680.822494]  #0:  (&p->lock){+.+.+.}, at: [<ffffffff81258ded>]
seq_read+0x3d/0x3c0
[ 1680.822593]  #1:  (cgroup_mutex){+.+.+.}, at: [<ffffffff8113d8be>]
proc_cgroup_show+0x4e/0x300
[ 1680.822689] INFO: task kthreadd:2 blocked for more than 120 seconds.
[ 1680.822726]       Tainted: G           O    4.4.6 #2
[ 1680.822764] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 1680.822824] kthreadd        D ffff8818116fbc78     0     2      0 0x00000000
[ 1680.822878]  ffff8818116fbc78 ffffffff8166533c ffff8818116f4080
ffff8830061d6a58
[ 1680.822959]  ffff8818111f60c0 ffff8818116f4080 ffff8818116fc000
ffffffff82e0ed08
[ 1680.823040]  ffffffff82e0ed20 0000000000000000 0000000000000000
ffff8818116fbc90
[ 1680.823123] Call Trace:
[ 1680.823156]  [<ffffffff8166533c>] ? _raw_spin_unlock_irq+0x2c/0x40
[ 1680.823197]  [<ffffffff8165f42c>] schedule+0x3c/0x90
[ 1680.823236]  [<ffffffff8166394e>] rwsem_down_read_failed+0xee/0x140
[ 1680.823275]  [<ffffffff8137aa34>] call_rwsem_down_read_failed+0x14/0x30
[ 1680.823321]  [<ffffffff810dc0c9>] ? percpu_down_read+0x79/0xa0
[ 1680.823359]  [<ffffffff81081b57>] ? copy_process+0x5b7/0x1e40
[ 1680.823400]  [<ffffffff81081b57>] copy_process+0x5b7/0x1e40
[ 1680.823437]  [<ffffffff810e0bdd>] ? __lock_acquire+0x5cd/0x1e90
[ 1680.823480]  [<ffffffff810aa670>] ? kthread_create_on_node+0x250/0x250
[ 1680.823519]  [<ffffffff8108354e>] _do_fork+0x7e/0x760
[ 1680.823560]  [<ffffffff810ab560>] ? kthreadd+0x1b0/0x280
[ 1680.823596]  [<ffffffff810ab560>] ? kthreadd+0x1b0/0x280
[ 1680.823637]  [<ffffffff810ab5b3>] ? kthreadd+0x203/0x280
[ 1680.823677]  [<ffffffff81083c59>] kernel_thread+0x29/0x30
[ 1680.823718]  [<ffffffff810ab5d4>] kthreadd+0x224/0x280
[ 1680.823754]  [<ffffffff8166617f>] ? ret_from_fork+0x3f/0x70
[ 1680.823794]  [<ffffffff810ab3b0>] ? kthread_create_on_cpu+0x70/0x70
[ 1680.823832]  [<ffffffff8166617f>] ret_from_fork+0x3f/0x70
[ 1680.823875]  [<ffffffff810ab3b0>] ? kthread_create_on_cpu+0x70/0x70
[ 1680.823913] 1 lock held by kthreadd/2:
[ 1680.823949]  #0:  (&cgroup_threadgroup_rwsem){++++++}, at:
[<ffffffff81081b57>] copy_process+0x5b7/0x1e40
[ 1680.824071] INFO: task kworker/0:3:294 blocked for more than 120 seconds.
[ 1680.824109]       Tainted: G           O    4.4.6 #2
[ 1680.824147] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 1680.824206] kworker/0:3     D ffff881809db7c50     0   294      2 0x00000000
[ 1680.824260] Workqueue: events cgroup_release_agent
[ 1680.824300]  ffff881809db7c50 0000000000000007 0000000000000006
ffff88181e3d6a58
[ 1680.824381]  ffffffff81c125c0 ffff881809db2200 ffff881809db8000
0000000000000246
[ 1680.824463]  ffffffff81c69f08 ffff881809db2200 00000000ffffffff
ffff881809db7c68
[ 1680.824544] Call Trace:
[ 1680.824578]  [<ffffffff8165f42c>] schedule+0x3c/0x90
[ 1680.824614]  [<ffffffff8165f775>] schedule_preempt_disabled+0x15/0x20
[ 1680.824657]  [<ffffffff8166117c>] mutex_lock_nested+0x18c/0x3e0
[ 1680.824694]  [<ffffffff81135053>] ? cgroup_release_agent+0x23/0xf0
[ 1680.824735]  [<ffffffff81135053>] ? cgroup_release_agent+0x23/0xf0
[ 1680.824773]  [<ffffffff81135053>] cgroup_release_agent+0x23/0xf0
[ 1680.824815]  [<ffffffff810a35e5>] process_one_work+0x1f5/0x790
[ 1680.824853]  [<ffffffff810a3550>] ? process_one_work+0x160/0x790
[ 1680.824894]  [<ffffffff810a3be9>] worker_thread+0x69/0x480
[ 1680.824931]  [<ffffffff810a3b80>] ? process_one_work+0x790/0x790
[ 1680.824972]  [<ffffffff810a3b80>] ? process_one_work+0x790/0x790
[ 1680.825010]  [<ffffffff810aa78c>] kthread+0x11c/0x140
[ 1680.825050]  [<ffffffff810aa670>] ? kthread_create_on_node+0x250/0x250
[ 1680.825090]  [<ffffffff8166617f>] ret_from_fork+0x3f/0x70
[ 1680.825130]  [<ffffffff810aa670>] ? kthread_create_on_node+0x250/0x250
[ 1680.825169] 3 locks held by kworker/0:3/294:
[ 1680.825205]  #0:  ("events"){.+.+.+}, at: [<ffffffff810a3550>]
process_one_work+0x160/0x790
[ 1680.825297]  #1:  ((&cgrp->release_agent_work)){+.+.+.}, at:
[<ffffffff810a3550>] process_one_work+0x160/0x790
[ 1680.825393]  #2:  (cgroup_mutex){+.+.+.}, at: [<ffffffff81135053>]
cgroup_release_agent+0x23/0xf0
[ 1680.825507] INFO: task systemd-journal:530 blocked for more than 120 seconds.
[ 1680.825547]       Tainted: G           O    4.4.6 #2
[ 1680.825585] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 1680.825644] systemd-journal D ffff88180aaa7d30     0   530      1 0x00000000
[ 1680.825698]  ffff88180aaa7d30 0000000000000007 0000000000000006
ffff88181efd6a58
[ 1680.825778]  ffff8818111d0440 ffff88180a82a0c0 ffff88180aaa8000
0000000000000246
[ 1680.825839]  ffffffff81c69f08 ffff88180a82a0c0 00000000ffffffff
ffff88180aaa7d48
[ 1680.825841] Call Trace:
[ 1680.825845]  [<ffffffff8165f42c>] schedule+0x3c/0x90
[ 1680.825847]  [<ffffffff8165f775>] schedule_preempt_disabled+0x15/0x20
[ 1680.825848]  [<ffffffff8166117c>] mutex_lock_nested+0x18c/0x3e0
[ 1680.825850]  [<ffffffff8113d8be>] ? proc_cgroup_show+0x4e/0x300
[ 1680.825851]  [<ffffffff8113d8be>] ? proc_cgroup_show+0x4e/0x300
[ 1680.825853]  [<ffffffff8113d8be>] proc_cgroup_show+0x4e/0x300
[ 1680.825855]  [<ffffffff812a7e40>] proc_single_show+0x50/0x90
[ 1680.825856]  [<ffffffff81258e99>] seq_read+0xe9/0x3c0
[ 1680.825858]  [<ffffffff8122f658>] __vfs_read+0x18/0x40
[ 1680.825859]  [<ffffffff8122fc69>] vfs_read+0x89/0x130
[ 1680.825860]  [<ffffffff81230a69>] SyS_read+0x49/0xb0
[ 1680.825861]  [<ffffffff81665db6>] system_call_fast_compare_end+0xc/0x70
[ 1680.825862] 2 locks held by systemd-journal/530:
[ 1680.825865]  #0:  (&p->lock){+.+.+.}, at: [<ffffffff81258ded>]
seq_read+0x3d/0x3c0
[ 1680.825867]  #1:  (cgroup_mutex){+.+.+.}, at: [<ffffffff8113d8be>]
proc_cgroup_show+0x4e/0x300
[ 1680.825868] INFO: task kworker/12:2:544 blocked for more than 120 seconds.
[ 1680.825869]       Tainted: G           O    4.4.6 #2
[ 1680.825870] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 1680.825872] kworker/12:2    D ffff882fef583c50     0   544      2 0x00000000
[ 1680.825873] Workqueue: events cgroup_release_agent
[ 1680.825875]  ffff882fef583c50 0000000000000007 0000000000000006
ffff883005fd6a58
[ 1680.825876]  ffff8818111f4080 ffff882ff0336040 ffff882fef584000
0000000000000246
[ 1680.825878]  ffffffff81c69f08 ffff882ff0336040 00000000ffffffff
ffff882fef583c68
[ 1680.825878] Call Trace:
[ 1680.825880]  [<ffffffff8165f42c>] schedule+0x3c/0x90
[ 1680.825881]  [<ffffffff8165f775>] schedule_preempt_disabled+0x15/0x20
[ 1680.825883]  [<ffffffff8166117c>] mutex_lock_nested+0x18c/0x3e0
[ 1680.825884]  [<ffffffff81135053>] ? cgroup_release_agent+0x23/0xf0
[ 1680.825885]  [<ffffffff81135053>] ? cgroup_release_agent+0x23/0xf0
[ 1680.825886]  [<ffffffff81135053>] cgroup_release_agent+0x23/0xf0
[ 1680.825887]  [<ffffffff810a35e5>] process_one_work+0x1f5/0x790
[ 1680.825889]  [<ffffffff810a3550>] ? process_one_work+0x160/0x790
[ 1680.825890]  [<ffffffff810a3be9>] worker_thread+0x69/0x480
[ 1680.825891]  [<ffffffff810a3b80>] ? process_one_work+0x790/0x790
[ 1680.825892]  [<ffffffff810a3b80>] ? process_one_work+0x790/0x790
[ 1680.825894]  [<ffffffff810aa78c>] kthread+0x11c/0x140
[ 1680.825896]  [<ffffffff810aa670>] ? kthread_create_on_node+0x250/0x250
[ 1680.825897]  [<ffffffff8166617f>] ret_from_fork+0x3f/0x70
[ 1680.825898]  [<ffffffff810aa670>] ? kthread_create_on_node+0x250/0x250
[ 1680.825899] 3 locks held by kworker/12:2/544:
[ 1680.825902]  #0:  ("events"){.+.+.+}, at: [<ffffffff810a3550>]
process_one_work+0x160/0x790
[ 1680.825904]  #1:  ((&cgrp->release_agent_work)){+.+.+.}, at:
[<ffffffff810a3550>] process_one_work+0x160/0x790
[ 1680.825906]  #2:  (cgroup_mutex){+.+.+.}, at: [<ffffffff81135053>]
cgroup_release_agent+0x23/0xf0
[ 1680.825912] INFO: task sshd:1341 blocked for more than 120 seconds.
[ 1680.825913]       Tainted: G           O    4.4.6 #2
[ 1680.825913] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 1680.825915] sshd            D ffff88180c9efcc8     0  1341      1 0x00000000
[ 1680.825916]  ffff88180c9efcc8 ffffffff8166533c ffff88180cd740c0
ffff883005dd6a58
[ 1680.825920]  ffff8818111ea040 ffff88180cd740c0 ffff88180c9f0000
ffffffff82e0ed08
[ 1680.825921]  ffffffff82e0ed20 0000000000000000 00007fee2b563ad0
ffff88180c9efce0
[ 1680.825921] Call Trace:
[ 1680.825922]  [<ffffffff8166533c>] ? _raw_spin_unlock_irq+0x2c/0x40
[ 1680.825924]  [<ffffffff8165f42c>] schedule+0x3c/0x90
[ 1680.825926]  [<ffffffff8166394e>] rwsem_down_read_failed+0xee/0x140
[ 1680.825927]  [<ffffffff8137aa34>] call_rwsem_down_read_failed+0x14/0x30
[ 1680.825929]  [<ffffffff810dc0c9>] ? percpu_down_read+0x79/0xa0
[ 1680.825930]  [<ffffffff81081b57>] ? copy_process+0x5b7/0x1e40
[ 1680.825931]  [<ffffffff81081b57>] copy_process+0x5b7/0x1e40
[ 1680.825933]  [<ffffffff8108354e>] _do_fork+0x7e/0x760
[ 1680.825935]  [<ffffffff81251e45>] ? __fd_install+0x5/0x2e0
[ 1680.825939]  [<ffffffff811dc09e>] ? __might_fault+0x4e/0xb0
[ 1680.825942]  [<ffffffff81003044>] ? lockdep_sys_exit_thunk+0x12/0x14
[ 1680.825943]  [<ffffffff81083cd9>] SyS_clone+0x19/0x20
[ 1680.825944]  [<ffffffff81665db6>] system_call_fast_compare_end+0xc/0x70
[ 1680.825945] 1 lock held by sshd/1341:
[ 1680.825947]  #0:  (&cgroup_threadgroup_rwsem){++++++}, at:
[<ffffffff81081b57>] copy_process+0x5b7/0x1e40
[ 1680.825950] INFO: task gmond:1569 blocked for more than 120 seconds.
[ 1680.825951]       Tainted: G           O    4.4.6 #2
[ 1680.825951] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 1680.825953] gmond           D ffff8817ad8ebcc8     0  1569      1 0x00000000
[ 1680.825954]  ffff8817ad8ebcc8 ffffffff8166533c ffff8817ad8e2480
ffff88181edd6a58
[ 1680.825956]  ffff8818111c6400 ffff8817ad8e2480 ffff8817ad8ec000
ffffffff82e0ed08
[ 1680.825957]  ffffffff82e0ed20 0000000000000000 00007f4dd8af4a50
ffff8817ad8ebce0
[ 1680.825958] Call Trace:
[ 1680.825959]  [<ffffffff8166533c>] ? _raw_spin_unlock_irq+0x2c/0x40
[ 1680.825961]  [<ffffffff8165f42c>] schedule+0x3c/0x90
[ 1680.825962]  [<ffffffff8166394e>] rwsem_down_read_failed+0xee/0x140
[ 1680.825964]  [<ffffffff8137aa34>] call_rwsem_down_read_failed+0x14/0x30
[ 1680.825965]  [<ffffffff810dc0c9>] ? percpu_down_read+0x79/0xa0
[ 1680.825966]  [<ffffffff81081b57>] ? copy_process+0x5b7/0x1e40
[ 1680.825967]  [<ffffffff81081b57>] copy_process+0x5b7/0x1e40
[ 1680.825968]  [<ffffffff8108354e>] _do_fork+0x7e/0x760
[ 1680.825970]  [<ffffffff81666904>] ? retint_user+0x18/0x23
[ 1680.825971]  [<ffffffff81003017>] ? trace_hardirqs_on_thunk+0x17/0x19
[ 1680.825972]  [<ffffffff81083cd9>] SyS_clone+0x19/0x20
[ 1680.825973]  [<ffffffff81665db6>] system_call_fast_compare_end+0xc/0x70
[ 1680.825974] 1 lock held by gmond/1569:
[ 1680.825976]  #0:  (&cgroup_threadgroup_rwsem){++++++}, at:
[<ffffffff81081b57>] copy_process+0x5b7/0x1e40
[ 1680.825978] INFO: task python:1702 blocked for more than 120 seconds.
[ 1680.825978]       Tainted: G           O    4.4.6 #2
[ 1680.825978] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 1680.825980] python          D ffff8817b05d3cc8     0  1702   1543 0x00000000
[ 1680.825982]  ffff8817b05d3cc8 ffffffff8166533c ffff88180c780240
ffff88181e7d6a58
[ 1680.825983]  ffff8818111b8340 ffff88180c780240 ffff8817b05d4000
ffffffff82e0ed08
[ 1680.825984]  ffffffff82e0ed20 0000000000000000 00007f5256eda9d0
ffff8817b05d3ce0
[ 1680.825984] Call Trace:
[ 1680.825985]  [<ffffffff8166533c>] ? _raw_spin_unlock_irq+0x2c/0x40
[ 1680.825987]  [<ffffffff8165f42c>] schedule+0x3c/0x90
[ 1680.825988]  [<ffffffff8166394e>] rwsem_down_read_failed+0xee/0x140
[ 1680.825990]  [<ffffffff8137aa34>] call_rwsem_down_read_failed+0x14/0x30
[ 1680.825991]  [<ffffffff810dc0c9>] ? percpu_down_read+0x79/0xa0
[ 1680.825992]  [<ffffffff81081b57>] ? copy_process+0x5b7/0x1e40
[ 1680.825993]  [<ffffffff81081b57>] copy_process+0x5b7/0x1e40
[ 1680.825995]  [<ffffffff810e0bdd>] ? __lock_acquire+0x5cd/0x1e90
[ 1680.825996]  [<ffffffff8108354e>] _do_fork+0x7e/0x760
[ 1680.825998]  [<ffffffff81251f3a>] ? __fd_install+0xfa/0x2e0
[ 1680.825999]  [<ffffffff81251e45>] ? __fd_install+0x5/0x2e0
[ 1680.826000]  [<ffffffff811dc09e>] ? __might_fault+0x4e/0xb0
[ 1680.826002]  [<ffffffff81003044>] ? lockdep_sys_exit_thunk+0x12/0x14
[ 1680.826003]  [<ffffffff81083cd9>] SyS_clone+0x19/0x20
[ 1680.826003]  [<ffffffff81665db6>] system_call_fast_compare_end+0xc/0x70
[ 1680.826004] 1 lock held by python/1702:
[ 1680.826007]  #0:  (&cgroup_threadgroup_rwsem){++++++}, at:
[<ffffffff81081b57>] copy_process+0x5b7/0x1e40
[ 1680.826008] INFO: task local_resource_:1775 blocked for more than
120 seconds.
[ 1680.826009]       Tainted: G           O    4.4.6 #2
[ 1680.826009] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 1680.826011] local_resource_ D ffff8817ace0bcc8     0  1775   1531 0x00000000
[ 1680.826012]  ffff8817ace0bcc8 ffffffff8166533c ffff8817acc861c0
ffff883005fd6a58
[ 1680.826013]  ffff8818111f4080 ffff8817acc861c0 ffff8817ace0c000
ffffffff82e0ed08
[ 1680.826015]  ffffffff82e0ed20 0000000000000000 0000000000000008
ffff8817ace0bce0
[ 1680.826015] Call Trace:
[ 1680.826016]  [<ffffffff8166533c>] ? _raw_spin_unlock_irq+0x2c/0x40
[ 1680.826017]  [<ffffffff8165f42c>] schedule+0x3c/0x90
[ 1680.826019]  [<ffffffff8166394e>] rwsem_down_read_failed+0xee/0x140
[ 1680.826020]  [<ffffffff8137aa34>] call_rwsem_down_read_failed+0x14/0x30
[ 1680.826022]  [<ffffffff810dc0c9>] ? percpu_down_read+0x79/0xa0
[ 1680.826023]  [<ffffffff81081b57>] ? copy_process+0x5b7/0x1e40
[ 1680.826023]  [<ffffffff81081b57>] copy_process+0x5b7/0x1e40
[ 1680.826025]  [<ffffffff8108354e>] _do_fork+0x7e/0x760
[ 1680.826026]  [<ffffffff811dc09e>] ? __might_fault+0x4e/0xb0
[ 1680.826028]  [<ffffffff81003044>] ? lockdep_sys_exit_thunk+0x12/0x14
[ 1680.826029]  [<ffffffff81083cd9>] SyS_clone+0x19/0x20
[ 1680.826030]  [<ffffffff81665db6>] system_call_fast_compare_end+0xc/0x70
[ 1680.826031] 1 lock held by local_resource_/1775:
[ 1680.826033]  #0:  (&cgroup_threadgroup_rwsem){++++++}, at:
[<ffffffff81081b57>] copy_process+0x5b7/0x1e40
[ 1680.826072] INFO: task run:5120 blocked for more than 120 seconds.
[ 1680.826073]       Tainted: G           O    4.4.6 #2
[ 1680.826073] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 1680.826075] run             D ffff882ff4367ce0     0  5120   5117 0x00000000
[ 1680.826077]  ffff882ff4367ce0 0000000000000007 0000000000000006
ffff88181edd6a58
[ 1680.826078]  ffff8818111c6400 ffff882ff452a180 ffff882ff4368000
0000000000000246
[ 1680.826080]  ffffffff81c69f08 ffff882ff452a180 00000000ffffffff
ffff882ff4367cf8
[ 1680.826080] Call Trace:
[ 1680.826082]  [<ffffffff8165f42c>] schedule+0x3c/0x90
[ 1680.826083]  [<ffffffff8165f775>] schedule_preempt_disabled+0x15/0x20
[ 1680.826085]  [<ffffffff8166117c>] mutex_lock_nested+0x18c/0x3e0
[ 1680.826086]  [<ffffffff811371f0>] ? cgroup_kn_lock_live+0x50/0x1d0
[ 1680.826087]  [<ffffffff811371f0>] ? cgroup_kn_lock_live+0x50/0x1d0
[ 1680.826089]  [<ffffffff811371f0>] cgroup_kn_lock_live+0x50/0x1d0
[ 1680.826090]  [<ffffffff81137208>] ? cgroup_kn_lock_live+0x68/0x1d0
[ 1680.826091]  [<ffffffff8113ac12>] __cgroup_procs_write+0x52/0x460
[ 1680.826093]  [<ffffffff8113b031>] cgroup_tasks_write+0x11/0x20
[ 1680.826094]  [<ffffffff81136a5e>] cgroup_file_write+0x3e/0x1c0
[ 1680.826096]  [<ffffffff812ba611>] kernfs_fop_write+0x141/0x190
[ 1680.826097]  [<ffffffff8122f748>] __vfs_write+0x18/0x40
[ 1680.826098]  [<ffffffff8122fdbc>] vfs_write+0xac/0x1a0
[ 1680.826100]  [<ffffffff81251606>] ? __fget_light+0x66/0x90
[ 1680.826101]  [<ffffffff81230b19>] SyS_write+0x49/0xb0
[ 1680.826102]  [<ffffffff81665db6>] system_call_fast_compare_end+0xc/0x70
[ 1680.826103] 3 locks held by run/5120:
[ 1680.826107]  #0:  (sb_writers#8){.+.+.+}, at: [<ffffffff81232b01>]
__sb_start_write+0xd1/0xf0
[ 1680.826109]  #1:  (&of->mutex){+.+.+.}, at: [<ffffffff812ba536>]
kernfs_fop_write+0x66/0x190
[ 1680.826112]  #2:  (cgroup_mutex){+.+.+.}, at: [<ffffffff811371f0>]
cgroup_kn_lock_live+0x50/0x1d0

Tejun <tj@kernel.org> helped us look into this issue. There is a
patch for 4.6 that also works on 4.4:
http://lkml.kernel.org/g/20160415191719.GK12583@htj.duckdns.org

The deadlock is easy to trigger when services brought up by systemd
use cgroups directly. It is a major showstopper.
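
For anyone reading the trace above, it may help to group the blocked
tasks by the locks lockdep lists for them. The following is only a
rough sketch, not part of the original investigation: the function name
summarize_locks and the two regular expressions are my own assumptions,
written against the "N locks held by TASK/PID:" and "#N:  (LOCK){...}"
lines as they appear in this report.

```python
import re
from collections import defaultdict

# Matches e.g. "[ 1680.823913] 1 lock held by kthreadd/2:"
# (pattern is an assumption based on the log format in this report)
HELD_RE = re.compile(r"\d+ locks? held by (\S+):\s*$")
# Matches e.g. "#0:  (&cgroup_threadgroup_rwsem){++++++}, at:"
LOCK_RE = re.compile(r"#\d+:\s+\((.+)\)\{")


def summarize_locks(log_text):
    """Return {lock name: [task/pid, ...]} from a hung-task report."""
    tasks_by_lock = defaultdict(list)
    current_task = None
    for line in log_text.splitlines():
        if "INFO: task" in line:
            # A new hung-task report starts; forget the previous task.
            current_task = None
            continue
        held = HELD_RE.search(line)
        if held:
            current_task = held.group(1)
            continue
        lock = LOCK_RE.search(line)
        if lock and current_task:
            tasks_by_lock[lock.group(1)].append(current_task)
    return dict(tasks_by_lock)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python summarize.py < dmesg.txt
    for name, tasks in summarize_locks(sys.stdin.read()).items():
        print(f"{name}: {', '.join(tasks)}")
```

Fed the trace above, this should group the tasks stuck in
proc_cgroup_show, cgroup_release_agent and cgroup_kn_lock_live under
cgroup_mutex, and the forking tasks under cgroup_threadgroup_rwsem.
Note that lockdep's list appears to include locks a task is still
waiting to acquire, so the grouping shows contention rather than
ownership.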

