20:00:30 <carnil> #startmeeting
20:00:30 <MeetBot> Meeting started Wed Jan  8 20:00:30 2025 UTC.  The chair is carnil. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:30 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
20:00:39 <carnil> #chair bwh waldi ukleinek
20:00:39 <MeetBot> Current chairs: bwh carnil ukleinek waldi
20:00:40 <waldi> hi
20:00:53 <carnil> Hello to the first kernel-team meeting of this year
20:00:53 * ukleinek waves
20:01:30 <carnil> building the agenda I did not go back to the date from last meeting, the list would have been too long, I think we can have a look for a start at the most current ones
20:01:46 * ukleinek nods
20:01:48 <carnil> but first one item, where we might exchange ideas on how to move forward
20:02:01 <carnil> #topic Rebase of debian/latest to 6.13-rc6 and open issues
20:02:26 <carnil> A while back I have imported 6.13-rc6 changes in a merge request, and it looks almost good modulo cleanup of the commits
20:02:55 <carnil> but there is an issue which we were so far not able to resolve, external module builds do not work anymore with our packaging layout in the headers package
20:03:07 <ukleinek> I didn't look (should I?), but +1 for getting this into d/latest
20:03:09 <carnil> I tried with an idea of waldi to do https://salsa.debian.org/carnil/linux/-/commits/6.13-autopkgtest-fix
20:03:27 <carnil> but there is one issue remaining:
20:03:46 <carnil> if you look at the autopkgtest log:
20:04:08 <carnil> the error where we stop now is
20:04:09 <carnil> x86_64-linux-gnu-ld: read in flex scanner failed
20:04:15 <waldi> yes
20:04:18 <carnil> someone has time to further look into it?
20:04:20 <waldi> there is another bug
20:04:23 <waldi> ignore it
20:05:05 <carnil> waldi: I was unsure about your comment: are you saying we should merge the changes as they are now up to including my 6.13-autopkgtest-fix branch and then handle the rest before the experimental upload?
20:05:07 <waldi> (this is a wrong location of the linker script)
20:05:08 <ukleinek> carnil: nitpick: s/it's/its/
20:05:23 <waldi> carnil: no, we upload as is
20:05:49 <carnil> waldi: don't we break this way all e.g. dkms modules builds?
20:05:55 <waldi> and?
20:06:11 <waldi> we don't break non-external stuff, so 99% is fine
20:06:31 * ukleinek would want to be more ambitious here
20:07:07 <ukleinek> unbreaking external module builds doesn't sound so complicated to justify not doing it?
20:07:31 <waldi> but this does not mean we need to let it hang in limbo until then
20:08:02 <waldi> and that's the state of this change, limbo. because, well, yes, it breaks some things
20:08:04 <ukleinek> are we talking about a 15min effort? Then I'd want to squeeze it in before the upload.
20:08:32 <waldi> no, we talk about hours. because even with the fixed linker script, the linker dies
20:08:50 <waldi> so we are at least two hard problems deep
20:09:18 * ukleinek didn't understand the issue (yet), so no very strong opinion here.
20:09:46 <carnil> okay I have a proposal: I clean up the commits a bit and merge this into debian/latest
20:10:00 <carnil> then this at least would unblock several MRs, they can rebase on it
20:10:10 <ukleinek> is this an upstream problem? If not, what is the relevant deviation from upstream?
20:10:27 <ukleinek> +1 for getting the update into d/latest
20:10:27 <waldi> it is an upstream problem
20:10:41 <waldi> or at least how we use some features
20:10:43 <ukleinek> is upstream aware?
20:11:06 <waldi> unlikely, because we have this problem stuck in limbo
20:11:46 <ukleinek> okay, so additionally to carnil pushing this to d/latest, I intend to understand the problem and depending on that might take this to upstream.
20:12:05 <carnil> that I hopefully understand correctly: with all the changes from 6.13-autopkgtest-fix the problem is *not* from the split we have in the common and arch-specific headers package
20:12:16 <bwh> Hi. Sorry I'm late - had to drive through a surprise snowstorm
20:12:24 <carnil> welcome bwh
20:12:40 <carnil> bwh:  we are discussing right now the external module build problem from the 6.13 merge request
20:13:34 <bwh> I'm afraid I haven't looked at it yet
20:13:38 <carnil> bwh: waldi is proposing to ignore it for now and merge and upload to experimental with that problem. The remaining problem is in the linker script and more
20:14:33 <bwh> I'm OK with that. It should be noted in the changelog so we hopefully reduce the number of bug reports to tell us what we know
20:14:54 * ukleinek nods
20:15:16 <carnil> ok let's formulate an agreement then
20:16:07 <carnil> #agreeed Merge the changes for the 6.13-rcX rebase into debian/latest (carnil), then upload to experimental noting in the changelog the open problem with external module builds to reduce potential bugreports
20:16:18 <carnil> #agreed Merge the changes for the 6.13-rcX rebase into debian/latest (carnil), then upload to experimental noting in the changelog the open problem with external module builds to reduce potential bugreports
20:16:32 <carnil> thanks, waldi hope this move is okay with you
20:16:37 <ukleinek> ..ooOO(Doppelt hält besser [German saying: "doubled holds better"])
20:16:55 <carnil> #topic Autoremovals of firmware-nonfree and firmware-free due to #1091260
20:17:28 <carnil> this is just something to monitor, an RC bug in as31 will cause autoremoval of firmware-nonfree, firmware-free, but bdale did already upload a fix to unstable, so should resolve
20:17:38 <bwh> Oh good
20:18:07 <carnil> at least unless an issue appears for as31 preventing migration
20:18:12 <ukleinek> that's not a big problem then if the firmware packages are missing from testing a few days I'd say.
20:18:43 <carnil> suggest to go now through some bugs
20:18:48 <ukleinek> +1
20:18:59 <carnil> #topic #1087807: (C, ) linux-image-6.1.0-27-amd64: Unable to boot: i40e swiotlb buffer is ful
20:19:14 <carnil> TTOMK still for bwh
20:19:16 <ukleinek> still nobody added that missing l
20:19:25 <carnil> right we should fix that
20:19:45 <bwh> Yes, I'm afraid I didn't spend as much time on Debian stuff as I planned
20:19:55 <carnil> ukleinek: retitled
20:20:05 <bwh> I still have this on my radar
20:20:07 * ukleinek doesn't press Enter then
20:20:36 <carnil> bwh: no problem we are volunteers :) but then we leave it assigned to you.
20:20:51 <carnil> #action bwh takes care of looking into #1087807
20:21:27 <carnil> #topic #1076372, #1090717: Corruption of Lexar NM790 NVMe drive with Linux 6.5+
20:22:16 <carnil> there is finally progress on upstream bugzilla, the breaking commit is identified and there is ongoing discussion, so far nothing to do for us. The other half though is open and not clear how to handle
20:22:30 <ukleinek> Does the upstream note apply to both bugs?
20:23:04 <ukleinek> ah, that's the "other half" part I guess
20:23:19 <carnil> ukleinek: unfortunately Stefan did not file two bugs upstream separately but both are in https://bugzilla.kernel.org/show_bug.cgi?id=219609
20:24:00 * ukleinek is relaxed here, maybe this is even sensible.
20:24:29 <ukleinek> If upstream only manages to fix the first bug and the second remains, the discussion will go on.
20:24:52 <carnil> I guess so, so we move on and keep both bugs open for now as they are in our BTS as well
20:25:03 <ukleinek> In contrast discussing both bugs in parallel might miss synergies.
20:25:34 <carnil> ukleinek: well I did explicitly ask Stefan to separate the bugs IIRC, but no idea why he reported them in one
20:25:46 <carnil> I guess it's on upstream to handle them now separately :-/
20:25:52 * ukleinek shrugs
20:26:01 <carnil> do we want to move to the next bugs?
20:26:07 * ukleinek nods
20:26:21 * carnil btw apologies for typing issues, have issues with one finger
20:26:30 <carnil> #topic #1091858: (i, +u) zstd: -9 SIGILLs on mips64el (under QEMU -M malta => invalid baselining?)
20:26:55 <ukleinek> carnil: found more typing errors in my talk than in yours, so no worries :-)
20:27:17 <carnil> issue pinpointed (and surprising it was not reported for so long), but patch is now on upstream and marked for stable, so we will have the fixes in ~4w down to 6.1.y series as the patch submitter explicitly asked to wait for 3 weeks before going to stable
20:27:35 <ukleinek> !next
20:27:38 <carnil> so it's handled and will trickle automatically
20:27:47 <carnil> #topic
20:27:47 <carnil> #1092189: (i, ) nouveau: Texts on the Blender screen are not visible. Error messages for Nouveau appear in the dmesg output.
20:28:13 <carnil> this is not yet tackled at all on our side
20:29:11 <ukleinek> no instant idea on my side
20:29:20 <carnil> anyone has the right questions to ask to the reporter?
20:30:12 <ukleinek> Ask for testing a newer kernel and then take it upstream?
20:30:22 <bwh> I don't know but the log messages mentioning INVALID_VALUE and INVALID_OPCODE sound like user-space doing something wrong
20:30:53 <bdale> carnil: fwiw, I don't personally use as31 any more.  maintaining the package has been very low overhead for me, but if it's considered important enough that someone else wants to take it over sometime, I won't object
20:31:52 <bwh> Having said that, the opcode is printed as all-1s which seems more likely to be a kernel bug
20:32:27 <ukleinek> The value that triggers the DATA_ERROR message is read from hardware if I'm not mistaken
20:34:33 <bwh> ukleinek: I think I'm in agreement with you: test a recent kernel, then send upstream or try to find the fix
20:34:54 <carnil> ukleinek: would you feel ok to take care of this?
20:35:13 <ukleinek> #action ukleinek cares for #1092189
20:35:36 <carnil> #topic #1092187: (i, +u) [amdgpu] Adaptative backlight is broken post suspend-resume on Radeon Vega
20:35:59 <carnil> fixing commit pending for 6.12.9 and will be included in next unstable upload, I can take care of that import
20:36:12 <carnil> #topic
20:36:12 <carnil> #1091893: (i, M) linux-image-6.1.0-28-amd64: Watchdog detected hard LOCKUP on CPU 8, then CPU 0
20:36:45 <ukleinek> #topic #1091893: (i, M) linux-image-6.1.0-28-amd64: Watchdog detected hard LOCKUP on CPU 8, then CPU 0
20:37:04 <ukleinek> The message unfortunately is only a symptom, not the reason for a problem
20:37:30 <bwh> "flaky hardware usually shouldn't cause a system
20:37:34 <bwh> lockup" is bullshit
20:37:40 * ukleinek nods
20:38:02 <bwh> Ideally the kernel copes with it, but there are limits
20:38:31 <carnil> I'm not sure that we should do something here more
20:39:10 <bwh> syslog contains an Oops
20:39:16 <waldi> we have a broken spinlock in xhci_setup_device+0x16b/0x600 [xhci_hcd]
20:39:32 <ukleinek> There are messages about reiserfs, btrfs and xfs. Is the reporter really using all of these?
20:40:19 <bwh> waldi: Huh?
20:41:03 <bwh> What I see is I/O timeouts, then an xhci timeout, and the handler for that crashes (null pointer deref)
20:42:23 <carnil> as an additional datapoint the reporter suspects that the problem started after the update from bullseye to bookworm (but is not certain)
20:42:53 <waldi> this hardware is flaky. but yes, the timeout handler should not die
20:43:29 <waldi> bwh: i missed the first trace, the one with the timeout handler
20:43:52 <carnil> https://lore.kernel.org/all/bug-219532-208809@https.bugzilla.kernel.org%2F/ could that be related?
20:45:01 <bwh> could be
20:45:23 <waldi> carnil: the backtrace is pretty darn similar
20:45:38 <ukleinek> That link is a null pointer dereference, so I have doubts this is the same issue
20:45:58 <waldi> both are null pointer
20:46:18 <ukleinek> that this time I missed something in our bug report
20:46:23 <waldi> "BUG: kernel NULL pointer dereference, address: 0000000000000030"
20:46:24 <ukleinek> s/that/than/
20:46:32 <waldi> even the address is identical
20:47:26 * ukleinek nods
20:47:46 <bwh> OK, so we have an upstream bug to link to... but not a fix
20:48:36 <waldi> yes
20:49:17 <carnil> maybe we can have the reporter chime in on that discussion?
20:49:53 <carnil> though the comment in https://bugzilla.kernel.org/show_bug.cgi?id=219532#c7 goes in same direction and there seems to be buggy hardware involved as well
20:50:11 <carnil> there is a patch to apply and try to get more information
20:50:21 <carnil> so the reporter might try that and report back in the upstream bug
20:50:37 <ukleinek> Yeah, let's point our reporter to that bugzilla entry
20:50:59 * ukleinek volunteers
20:51:05 <carnil> thanks ukleinek
20:51:24 <ukleinek> #action ukleinek asks reporter of #1091893 to look at https://lore.kernel.org/all/bug-219532-208809@https.bugzilla.kernel.org%2F/
20:51:46 <carnil> okay we are at 21:50
20:52:02 <carnil> let's shortly discuss the question from ukleinek on his merge requests
20:52:19 <carnil> #topic #1015871: (w, +) Please enable CONFIG_PCI_P2PDMA + #1090072: (w, +) linux-image-amd64: Enable P2P and HMM feature for AMDGPU
20:52:57 <carnil> ukleinek prepared a MR but asks if someone from the team has more insight into the topic to understand if the changes are sensible to have enabled
20:53:20 <waldi> is that feature arch restricted?
20:53:43 <ukleinek> I enabled it for arm64 and amd64, does that answer your question?
20:54:18 <bwh> Which is the MR?
20:54:25 <ukleinek> there is only a branch, no MR
20:54:48 <bwh> Oh OK
20:54:49 <ukleinek> https://salsa.debian.org/kernel-team/linux/-/tree/ukleinek/p2pdma?ref_type=heads
20:56:03 <ukleinek> IIRC in the earlier bugreport powerpcspe was also considered.
20:56:55 <ukleinek> s/spe//
20:58:05 <bwh> Or we could enable at the top level and rely on the arch dependencies
20:58:20 <bwh> Is there a reason we would want to be more restrictive than upstream?
20:59:03 <ukleinek> I think it's still "If unsure select 'N'" which I'd interpret as a hint to not enable it.
20:59:26 <waldi> ukleinek: no, this is not a guide for arch dependency
21:00:02 <ukleinek> config PCI_P2PDMA depends on ZONE_DEVICE
21:00:14 <waldi> and ZONE_DEVICE depends on memory hotplug and so stuff
21:00:16 <bwh> "If unsure select 'N'" can mean anything between "you probably don't have hardware that needs this" and "this is dangerously experimental code"
21:00:46 <waldi> so s390x will pretty sure have all the required bits as well
21:00:46 <bwh> If it's the former then we probably do want to enable it for generic kernel flavours
21:01:14 <waldi> and ZONE_DEVICE is a requirement for DAX
21:01:17 <ukleinek> back then powerpc didn't have CONFIG_ZONE_DEVICE enabled
21:02:13 * ukleinek tends to enable it
21:02:19 <waldi> i'll look into it
21:02:24 <ukleinek> supi
21:02:37 <carnil> so I suggest ukleinek can make a draft mr and assign it to waldi
21:02:38 <carnil> ok?
21:02:54 <waldi> #action ukleinek to create a MR and assign it to waldi
21:02:59 <carnil> perfect
21:03:02 <carnil> last topic for today
21:03:11 <carnil> #topic AOB?
21:03:18 <ukleinek> Today we have CONFIG_ZONE_DEVICE=y in debian/config/config
21:03:26 <ukleinek> Does someone attend FOSDEM?
21:03:33 <bwh> Very likely
21:03:34 <carnil> unfortunately no
21:03:44 * ukleinek does, too
21:03:46 <waldi> i intend to, but really need to plan accommodation
21:04:07 <carnil> ah nice maybe I should make up again my mind about it if the whole kernel team is there =)
21:04:44 <carnil> who takes care of preparing and leading through the next meeting?
21:04:49 <carnil> (on jitsi)
21:05:06 <bwh> Is it my turn?
21:05:37 <waldi> jup
21:05:45 <bwh> OK
21:05:53 <carnil> ok perfect
21:06:10 <carnil> #action bwh does chair next kernel-team meeting
21:06:20 <carnil> I guess we are done, thanks all for attending!
21:06:29 <ukleinek> waldi: if you fail to find accommodation, ping me. I might have a contact who booked too many rooms
21:06:30 <bwh> Thank you for chairing
21:06:43 <carnil> #endmeeting