Index | Thread | Search

From:
Vitaliy Makkoveev <mvs@openbsd.org>
Subject:
Re: Don't release NET_LOCK in vio(4) ioctl
To:
Stefan Fritsch <sf@sfritsch.de>
Cc:
Alexander Bluhm <alexander.bluhm@gmx.net>, tech@openbsd.org
Date:
Wed, 17 Apr 2024 20:57:02 +0300

Download raw body.

Thread
> On 17 Apr 2024, at 00:49, Stefan Fritsch <sf@sfritsch.de> wrote:
> 
> On Tue, 16 Apr 2024, Alexander Bluhm wrote:
> 
>> On Fri, Apr 12, 2024 at 03:38:44PM +0300, Vitaliy Makkoveev wrote:
>>> On Fri, Apr 12, 2024 at 02:05:05PM +0200, Stefan Fritsch wrote:
>>>> Hi,
>>>> 
>>>> commits f0b002d01d5 "Release the netlock when sleeping for control
>>>> messages in in vioioctl()" and 126b881f71 "Insert a workaround for per-ifp
>>>> ioctl being called w/o NET_LOCK()." in vio(4) fixed a deadlock but may
>>>> cause a crash with a protection fault trap at rt_ifa_del if addresses are
>>>> added/removed concurrently.
>>>> 
>>>> Reproducible by running
>>>> 
>>>> while true; do ifconfig vio0 -inet; done &
>>>> while true; do ifconfig vio0 10.1.6.11/24; done &
>>>> while true; do ifconfig vio0 down; done &
>>>> while true; do ifconfig vio0 up; done &
>>>> while true; do ifconfig vio0 -inet6 ; done &
>>>> while true; do ifconfig vio0 inet6 2001:a61:3446:ddff:94be:52d1:d57c:1/64 ; done &
>>>> 
>>>> in parallel.
>>>> 
>>>> The diff below reverts the two above commits and changes the vio_ctrl_*
>>>> functions to prevent any concurrent control queue requests while the
>>>> device is being reset. I have seen no deadlocks or crashes with that
>>>> patch.
>>>> 
>>>> Comments? OKs?
>>>> 
>>> 
>>> Sleep with netlock held stops packet processing and operations in inet
>>> sockets. This is much worse when use another lock with netlock release.
>>> pflowioctl() could be good example.
>> 
>> Holding net lock for a long time is not good.  But kernel crashes
>> are worse.  I prefer short stop of packet processing during ifconfig
>> over a crash in netinet code.  Here is the original bug report
>> 
>> https://marc.info/?l=openbsd-bugs&m=166124228103568&w=2
>> 
>> I would try get this in and work on fine grained vio(4) locking
>> afterwards.
> 
> I agree. I have tried with a separate lock that serializes the whole 
> vio_ioctl function but I still get crashes with that:
> 
> db_command_loop() at db_command_loop+0xea
> db_trap(4,0) at db_trap+0x129
> db_ktrap(4,0,ffff8000239d0f60) at db_ktrap+0x111
> kerntrap(ffff8000239d0f60) at kerntrap+0xa7
> alltraps_kern_meltdown() at alltraps_kern_meltdown+0x7b
> rt_ifa_addlocal(ffff8000008d7b00) at rt_ifa_addlocal+0x2a
> in6_update_ifa(ffff8000000af2a8,ffff8000239d1230,0) at 
> in6_update_ifa+0x581
> in6_ifattach_linklocal(ffff8000000af2a8,0) at in6_ifattach_linklocal+0x204
> in6_ifattach(ffff8000000af2a8) at in6_ifattach+0x103
> ifioctl(fffffd8051ac7dd8,801169ab,ffff8000239d13f0,ffff8000239834a8) at 
> ifioctl+0xc99
> sys_ioctl(ffff8000239834a8,ffff8000239d15a0,ffff8000239d1510) at 
> sys_ioctl+0x2af
> syscall(ffff8000239d15a0) at syscall+0x588
> Xsyscall() at Xsyscall+0x128
> end of kernel
> end trace frame: 0x773f17f14c30, count: -13
> 

Just for curiosity, can I see your diff?

> I suspect that the global ifioctl() function needs some changes in order 
> to allow NIC drivers' ioctl functions to drop the NET_LOCK.

Periodically, I think about that. At least for some pseudo drivers
netlock could be avoided.