From: Marc Espie
Subject: the little bug that wasn't
To: tech@openbsd.org
Date: Wed, 4 Dec 2024 11:10:59 +0100

This is a somewhat longer write-up about the fun I had two days ago.

The ports tree had a problem. For quite a few months now, something
hadn't seemed right: you started pkg_add -u, you let it proceed through
a few hundred packages, you tried to start an application, and you got
an error message, like this one from two days ago:

$ gnuplot
nausicaa$ gnuplot
ld.so: gnuplot: can't load library 'libharfbuzz.so.18.10'
Killed
$

Invariably it had to do with the minor version number not being exactly
right, but no-one took the time to look any further.

Until two days ago.

I thought I had it figured out: this had to be a bug in ld.so, and it
probably had to do with the handling of minor version numbers.

So I looked. ld.so being a bit "low level", it's not necessarily easy to
debug directly, but it's got a DL_DEB() macro that allows you to do
printf debugging (I know, don't laugh, but it's often easier than trying
to figure out something smarter).

So I looked, and yeah, gnuplot did set up the right hint, and yeah, the
test for the right library seemed okay, so why didn't it work?

Then it hit me: the cache. ld.so was totally fine, but there's this
cache updated by ldconfig(8), and that was the part that was getting out
of whack.

So why didn't we notice this before? It used to be that we ran outside
executables all the time, and the pkg tools have got a "just in time"
mechanism to handle that (the oldest versions of the tools did
@exec ldconfig -R manually): some part that says we changed libraries,
and some part that runs ldconfig if libraries have changed prior to
running an external program.

But we got @tags: a mechanism to run commands like
update-desktop-database just once, right before cleaning up shared data
at the end of the pkg_add run. So...
turns out that pkg_add's behavior was perfect for itself, but it defied
users' expectations (including mine) that commands would be usable right
after being updated.

The "fix" (quality of life improvement really) was one line:
	$state->ldconfig->ensure;
right after the end of each individual update.

(And so it was a long-standing issue: @tags happened somewhere around
2019, and some astute users probably started noticing the issue around
2020.)

It's now the 2nd time ldconfig has gotten me to look in the wrong
location in 15 years. Sneaky bastard.

-- Marc