Skip to content
  • Roland Dreier's avatar
    [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal · 35dc2483
    Roland Dreier authored
    
    
    There is a nasty bug in the SCSI SG_IO ioctl that in some circumstances
    leads to one process writing data into the address space of some other
    random unrelated process if the ioctl is interrupted by a signal.
    What happens is the following:
    
     - A process issues an SG_IO ioctl with direction DXFER_FROM_DEV (ie the
       underlying SCSI command will transfer data from the SCSI device to
       the buffer provided in the ioctl)
    
     - Before the command finishes, a signal is sent to the process waiting
       in the ioctl.  This will end up waking up the sg_ioctl() code:
    
    		result = wait_event_interruptible(sfp->read_wait,
    			(srp_done(sfp, srp) || sdp->detached));
    
       but neither srp_done() nor sdp->detached is true, so we end up just
       setting srp->orphan and returning to userspace:
    
    		srp->orphan = 1;
    		write_unlock_irq(&sfp->rq_list_lock);
    		return result;	/* -ERESTARTSYS because signal hit process */
    
       At this point the original process is done with the ioctl and
       blithely goes ahead handling the signal, reissuing the ioctl, etc.
    
     - Eventually, the SCSI command issued by the first ioctl finishes and
       ends up in sg_rq_end_io().  At the end of that function, we run through:
    
    	write_lock_irqsave(&sfp->rq_list_lock, iflags);
    	if (unlikely(srp->orphan)) {
    		if (sfp->keep_orphan)
    			srp->sg_io_owned = 0;
    		else
    			done = 0;
    	}
    	srp->done = done;
    	write_unlock_irqrestore(&sfp->rq_list_lock, iflags);
    
    	if (likely(done)) {
    		/* Now wake up any sg_read() that is waiting for this
    		 * packet.
    		 */
    		wake_up_interruptible(&sfp->read_wait);
    		kill_fasync(&sfp->async_qp, SIGPOLL, POLL_IN);
    		kref_put(&sfp->f_ref, sg_remove_sfp);
    	} else {
    		INIT_WORK(&srp->ew.work, sg_rq_end_io_usercontext);
    		schedule_work(&srp->ew.work);
    	}
    
       Since srp->orphan *is* set, we set done to 0 (assuming the
       userspace app has not set keep_orphan via an SG_SET_KEEP_ORPHAN
       ioctl), and therefore we end up scheduling sg_rq_end_io_usercontext()
       to run in a workqueue.
    
     - In workqueue context we go through sg_rq_end_io_usercontext() ->
       sg_finish_rem_req() -> blk_rq_unmap_user() -> ... ->
       bio_uncopy_user() -> __bio_copy_iov() -> copy_to_user().
    
       The key point here is that we are doing copy_to_user() on a
       workqueue -- that is, we're on a kernel thread with current->mm
       equal to whatever random previous user process was scheduled before
       this kernel thread.  So we end up copying whatever data the SCSI
       command returned to the virtual address of the buffer passed into
       the original ioctl, but it's quite likely we do this copying into a
       different address space!
    
    As suggested by James Bottomley <James.Bottomley@hansenpartnership.com>,
    add a check for current->mm (which is NULL if we're on a kernel thread
    without a real userspace address space) in bio_uncopy_user(), and skip
    the copy if we're on a kernel thread.
    
    There's no reason that I can think of for any caller of bio_uncopy_user()
    to want to do copying on a kernel thread with a random active userspace
    address space.
    
    Huge thanks to Costa Sapuntzakis <costa@purestorage.com> for the
    original pointer to this bug in the sg code.
    
    Signed-off-by: default avatarRoland Dreier <roland@purestorage.com>
    Tested-by: default avatarDavid Milburn <dmilburn@redhat.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: default avatarJames Bottomley <JBottomley@Parallels.com>
    35dc2483