genirq: Remove racy waitqueue_active check
commit2b6e97e6748795e47ec139f4da96b2f2d16bb8ee
authorChuansheng Liu <chuansheng.liu@intel.com>
Mon, 24 Feb 2014 03:29:50 +0000 (24 11:29 +0800)
committerBen Hutchings <ben@decadent.org.uk>
Tue, 1 Apr 2014 23:58:56 +0000 (2 00:58 +0100)
tree2fdf8f36b8615ff0bfba1611d65c242f22fd97f3
parent7d1cce6a938cad067edbb4522b587df60afc4730
genirq: Remove racy waitqueue_active check

commit c685689fd24d310343ac33942e9a54a974ae9c43 upstream.

We hit one rare case below:

T1 calling disable_irq(), but hanging at synchronize_irq()
always;
The corresponding irq thread is in sleeping state;
And all CPUs are in idle state;

After analysis, we found there is one possible scenerio which
causes T1 is waiting there forever:
CPU0                                       CPU1
 synchronize_irq()
  wait_event()
    spin_lock()
                                           atomic_dec_and_test(&threads_active)
      insert the __wait into queue
    spin_unlock()
                                           if(waitqueue_active)
    atomic_read(&threads_active)
                                             wake_up()

Here after inserted the __wait into queue on CPU0, and before
test if queue is empty on CPU1, there is no barrier, it maybe
cause it is not visible for CPU1 immediately, although CPU0 has
updated the queue list.
It is similar for CPU0 atomic_read() threads_active also.

So we'd need one smp_mb() before waitqueue_active.that, but removing
the waitqueue_active() check solves it as wel l and it makes
things simple and clear.

Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com>
Cc: Xiaoming Wang <xiaoming.wang@intel.com>
Link: http://lkml.kernel.org/r/1393212590-32543-1-git-send-email-chuansheng.liu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[bwh: Backported to 3.2: The corresponding check is in irq_thread()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
kernel/irq/manage.c