From: Andrew Hewus Fresh Subject: Remove package database lock for fw_update(8) To: tech@openbsd.org Date: Mon, 14 Oct 2024 19:30:46 -0700 After I sent the previous patch to watch for the lock co-process to exit, Theo asked whether firmware needed the lock at all. I thought about failure modes for firmware add, upgrade, and delete and I think with a little extra work we can do this relatively safely without a lock. Since each firmware has no dependencies and there is only ever a single correct version of a firmware for any system we have many fewer problems to solve for. We don't have to worry about using a separate database from the normal package tools because the package tools ignore any packages marked as firmware and fw_update only uses those. We could move to a separate database if we desire and tools like sysclean(8) will have to adapt. The failure mode in almost every case is that we have firmware installed without being registered and that means either extra disk space used (which could be fixed with sysclean(8)) but in most cases another run of fw_update(8) will reinstall it properly. The failures I imagined amounted to being interrupted during add or delete. However if you can think of other problem interactions, I would like to hear them. I'm not sure about the interaction when doing an upgrade and a simultaneous delete. I will continue to consider that, especially with the "rename to uninstall" patch below. The easiest to imagine is a SIGINT stopping fw_update in the middle of adding a firmware. Since we don't register the firmware as being installed until the very last step, the failure mode here is that the firmware (or part of it) may be in /etc/firmware, but we don't know. That means things might work, but another run of fw_update will overwrite any of the firmware that may have been installed with the exact same file and should properly register on the next run. However, without a lock, two fw_update could be installing the same firmware at the same time. This is useless, and mostly harmless. The harm would be because `mv(1)` doesn't allow for failing when second argument is a directory that exists, but didn't expect it to. For that there is a new "renamedir" function that will attempt to detect that failure mode and fix it, removing the destination before trying, moving the failed directory back and trying again. This might be overkill and open to more bugs, so I will be thinking about it. Index: fw_update.sh =================================================================== RCS file: /cvs/src/usr.sbin/fw_update/fw_update.sh,v diff -u -p -r1.57 fw_update.sh --- fw_update.sh 12 Oct 2024 23:56:23 -0000 1.57 +++ fw_update.sh 15 Oct 2024 01:20:33 -0000 @@ -92,6 +92,19 @@ tmpdir() { echo "$_dir" } +renamedir() { + local _src=$1 _dst=$2 + local _bad_dst="$_dst/$( basename "$_src" )" + + [ -e "$_dst" ] && rm -rf "$_dst" + + while [ -e "$_src" ] || [ -e "$_bad_dst" ]; do + [ -e "$_bad_dst" ] && mv "$_bad_dst" "$_src" + [ -e "$_dst" ] && rm -rf "$_dst" + mv "$_src" "$_dst" + done +} + spin() { if ! "$ENABLE_SPINNER"; then sleep 1 @@ -414,7 +427,7 @@ w EOL chmod 755 "$FWPKGTMP" - mv "$FWPKGTMP" "$_pkgdir/$_pkg" + renamedir "$FWPKGTMP" "$_pkgdir/$_pkg" unset FWPKGTMP } The next failure point is while deleting a firmware. To account for an interruption there, we rename the package directory to have a "-uninstall" suffix. This should continue to get detected as an installed firmware but with a version that doesn't exist. That means a follow-up delete will continue removing any remains and an upgrade will not think it is the version we need and will therefore upgrade it. We could rely on something like sysclean(8) to do any cleanup, but since we do clean up the registration last, we would need to reverse that because otherwise we may have removed the firmware, but fw_update would think it was installed. This leverages the renamedir helper from the previous patch. Index: fw_update.sh =================================================================== RCS file: /cvs/src/usr.sbin/fw_update/fw_update.sh,v diff -u -p -r1.57 fw_update.sh --- fw_update.sh 12 Oct 2024 23:56:23 -0000 1.57 +++ fw_update.sh 15 Oct 2024 00:16:53 -0000 @@ -435,11 +435,11 @@ remove_files() { } delete_firmware() { - local _cwd _pkg="$1" _pkgdir="${DESTDIR}/var/db/pkg" + local _remove _pkg="$1" _pkgdir="${DESTDIR}/var/db/pkg" + local _cwd="$_pkgdir/$_pkg" # TODO: Check hash for files before deleting ((VERBOSE > 2)) && echo -n "Uninstall $_pkg ..." - _cwd="${_pkgdir}/$_pkg" if [ ! -e "$_cwd/+CONTENTS" ] || ! grep -Fxq '@option firmware' "$_cwd/+CONTENTS"; then @@ -447,6 +447,12 @@ delete_firmware() { return 2 fi + # If interrupted, deleting again will still work + # and if we were doing an upgrade, restarting + # still remove first. + renamedir "$_cwd" "$_cwd-uninstall" + _cwd="$_cwd-uninstall" + set -A _remove -- "${_cwd}/+CONTENTS" "${_cwd}" while read -r _c _g; do @@ -458,7 +464,7 @@ delete_firmware() { *) set -A _remove -- "$_cwd/$_c" "${_remove[@]}" ;; esac - done < "${_pkgdir}/${_pkg}/+CONTENTS" + done < "$_cwd/+CONTENTS" remove_files "${_remove[@]}" With that, we should be able to safely remove the database lock, unless there is something I haven't thought of.. Index: fw_update.sh =================================================================== RCS file: /cvs/src/usr.sbin/fw_update/fw_update.sh,v diff -u -p -r1.57 fw_update.sh --- fw_update.sh 12 Oct 2024 23:56:23 -0000 1.57 +++ fw_update.sh 14 Oct 2024 23:54:30 -0000 @@ -48,7 +48,6 @@ integer WARN_FD=2 FD_DIR= unset FTPPID -unset LOCKPID unset FWPKGTMP REMOVE_LOCALSRC=false @@ -70,7 +69,6 @@ cleanup() { fi [ "${FTPPID:-}" ] && kill -TERM -"$FTPPID" 2>/dev/null - [ "${LOCKPID:-}" ] && kill -TERM -"$LOCKPID" 2>/dev/null [ "${FWPKGTMP:-}" ] && rm -rf "$FWPKGTMP" "$REMOVE_LOCALSRC" && rm -rf "$LOCALSRC" [ -e "$CFILE" ] && [ ! -s "$CFILE" ] && rm -f "$CFILE" @@ -281,56 +279,6 @@ firmware_devicename() { echo "$_d" } -lock_db() { - local _waited - [ "${LOCKPID:-}" ] && return 0 - - # The installer doesn't have perl, so we can't lock there - [ -e /usr/bin/perl ] || return 0 - - set -o monitor - perl <<-'EOL' |& - no lib ('/usr/local/libdata/perl5/site_perl'); - use v5.36; - use OpenBSD::PackageInfo qw< lock_db >; - - $|=1; - - $0 = "fw_update: lock_db"; - my $waited = 0; - package OpenBSD::FwUpdateState { - use parent 'OpenBSD::BaseState'; - sub errprint ($self, @p) { - if ($p[0] && $p[0] =~ /already locked/) { - $waited++; - $p[0] = " " . $p[0] - if !$ENV{VERBOSE}; - } - $self->SUPER::errprint(@p); - } - - } - lock_db(0, 'OpenBSD::FwUpdateState'); - - say "$$ $waited"; - - # Wait for STDOUT to be readable, which won't happen - # but if our parent exits unexpectedly it will close. - my $rin = ''; - vec($rin, fileno(STDOUT), 1) = 1; - select $rin, '', '', undef; -EOL - set +o monitor - - read -rp LOCKPID _waited - - if ((_waited)); then - ! ((VERBOSE)) && status "${0##*/}:" - fi - - return 0 -} - available_firmware() { check_cfile || return $? sed -n 's/.*(\(.*\)-firmware.*/\1/p' "$CFILE" @@ -571,7 +519,6 @@ status "${0##*/}:" if "$DELETE"; then [ "$OPT_F" ] && warn "Cannot use -F and -d" && usage - lock_db # Show the "Uninstall" message when just deleting not upgrading ((VERBOSE)) && VERBOSE=3 @@ -647,7 +594,6 @@ kept='' unregister='' if [ "${devices[*]:-}" ]; then - lock_db for f in "${devices[@]}"; do d="$( firmware_devicename "$f" )"