20:00:30 #startmeeting
20:00:30 Meeting started Wed Jan 8 20:00:30 2025 UTC. The chair is carnil. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:30 Useful Commands: #action #agreed #help #info #idea #link #topic.
20:00:39 #chair bwh waldi ukleinek
20:00:39 Current chairs: bwh carnil ukleinek waldi
20:00:40 hi
20:00:53 Hello to the first kernel-team meeting of this year
20:00:53 * ukleinek waves
20:01:30 building the agenda I did not go back to the date from last meeting, the list would have been too long, I think we can have a look for a start at the most current ones
20:01:46 * ukleinek nods
20:01:48 but first one item, where we might exchange ideas on how to move forward
20:02:01 #topic Rebase of debian/latest to 6.13-rc6 and open issues
20:02:26 A while back I have imported 6.13-rc6 changes in a merge request, and it looks almost good modulo cleanup of the commits
20:02:55 but there is an issue which we were so far not able to resolve, external module builds do not work anymore with our packaging layout in the headers package
20:03:07 I didn't look (should I?), but +1 for getting this into d/latest
20:03:09 I tried with an idea of waldi to do https://salsa.debian.org/carnil/linux/-/commits/6.13-autopkgtest-fix
20:03:27 but there is one issue remaining:
20:03:46 if you look at the autopkgtest log:
20:04:08 the error where we stop now is
20:04:09 x86_64-linux-gnu-ld: read in flex scanner failed
20:04:15 yes
20:04:18 someone has time to further look into it?
20:04:20 there is another bug
20:04:23 ignore it
20:05:05 waldi: I was unsure about your comment: are you saying we should merge the changes as they are now up to including my 6.13-autopkgtest-fix branch and then handle the rest before the experimental upload?
20:05:07 (this is a wrong location of the linker script)
20:05:08 carnil: nitpick: s/it's/its/
20:05:23 carnil: no, we upload as is
20:05:49 waldi: don't we break this way all e.g. dkms modules builds?
20:05:55 and?
20:06:11 we don't break non-external stuff, so 99% is fine
20:06:31 * ukleinek would want to be more ambitious here
20:07:07 unbreaking external module builds doesn't sound so complicated to justify not doing it?
20:07:31 but this does not mean we need to let it hang in limbo until then
20:08:02 and that's the state of this change, limbo. because, well, yes, it breaks some things
20:08:04 are we talking about a 15min effort? Then I'd want to squeeze it in before the upload.
20:08:32 no, we talk about hours. because even with the fixed linker script, the linker dies
20:08:50 so we are at least two hard problems deep
20:09:18 * ukleinek didn't understand the issue (yet), so no very strong opinion here.
20:09:46 okay I have a proposal: I clean up the commits a bit and merge this into debian/latest
20:10:00 then this at least would unblock several MRs, they can then be rebased on it
20:10:10 is this an upstream problem? If not, what is the relevant deviation from upstream?
20:10:27 +1 for getting the update into d/latest
20:10:27 it is an upstream problem
20:10:41 or at least how we use some features
20:10:43 is upstream aware?
20:11:06 unlikely, because we have this problem stuck in limbo
20:11:46 okay, so additionally to carnil pushing this to d/latest, I intend to understand the problem and depending on that might take this to upstream.
20:12:05 so that I hopefully understand correctly: with all the changes from 6.13-autopkgtest-fix the problem is *not* from the split we have in the common and arch-specific headers package
20:12:16 Hi. Sorry I'm late - had to drive through a surprise snowstorm
20:12:24 welcome bwh
20:12:40 bwh: we are discussing right now the external module build problem from the 6.13 merge request
20:13:34 I'm afraid I haven't looked at it yet
20:13:38 bwh: waldi is proposing to ignore it for now and merge and upload to experimental with that problem. The remaining problem is in the linker script and more
20:14:33 I'm OK with that.
It should be noted in the changelog so we hopefully reduce the number of bug reports to tell us what we know
20:14:54 * ukleinek nods
20:15:16 ok let's formulate an agreement then
20:16:07 #agreeed Merge the changes for the 6.13-rcX rebase into debian/latest (carnil), then upload to experimental noting in the changelog the open problem with external module builds to reduce potential bugreports
20:16:18 #agreed Merge the changes for the 6.13-rcX rebase into debian/latest (carnil), then upload to experimental noting in the changelog the open problem with external module builds to reduce potential bugreports
20:16:32 thanks, waldi hope this move is okay with you
20:16:37 ..ooOO(Twice is better than once)
20:16:55 #topic Autoremovals of firmware-nonfree and firmware-free due to #1091260
20:17:28 this is just something to monitor, an RC bug in as31 will cause autoremoval of firmware-nonfree, firmware-free, but bdale did already upload a fix to unstable, so should resolve
20:17:38 Oh good
20:18:07 at least unless an issue appears for as31 preventing migration
20:18:12 that's not a big problem then if the firmware packages are missing from testing a few days I'd say.
20:18:43 suggest to go now through some bugs
20:18:48 +1
20:18:59 #topic #1087807: (C, ) linux-image-6.1.0-27-amd64: Unable to boot: i40e swiotlb buffer is ful
20:19:14 TTOMK still for bwh
20:19:16 still nobody added that missing l
20:19:25 right we should fix that
20:19:45 Yes, I'm afraid I didn't spend as much time on Debian stuff as I planned
20:19:55 ukleinek: retitled
20:20:05 I still have this on my radar
20:20:07 * ukleinek doesn't press Enter then
20:20:36 bwh: no problem we are volunteers :) but then we leave it assigned to you.
20:20:51 #action bwh takes care of looking into #1087807
20:21:27 #topic #1076372, #1090717: Corruption of Lexar NM790 NVMe drive with Linux 6.5+
20:22:16 there is finally progress on upstream bugzilla, the breaking commit is identified and there is ongoing discussion, so far nothing to do for us. The other half though is open and not clear how to handle
20:22:30 Does the upstream note apply to both bugs?
20:23:04 ah, that's the "other half" part I guess
20:23:19 ukleinek: unfortunately Stefan did not file two bugs upstream separately but both are in https://bugzilla.kernel.org/show_bug.cgi?id=219609
20:24:00 * ukleinek is relaxed here, maybe this is even sensible.
20:24:29 If upstream only manages to fix the first bug and the second remains, the discussion will go on.
20:24:52 I guess so, so we move on and keep both bugs open for now as they are in our BTS as well
20:25:03 In contrast discussing both bugs in parallel might miss synergies.
20:25:34 ukleinek: well I did explicitly ask Stefan to separate the bugs IIRC, but no idea why he reported them in one
20:25:46 I guess it's on upstream to handle them now separately :-/
20:25:52 * ukleinek shrugs
20:26:01 do we want to move to the next bugs?
20:26:07 * ukleinek nods
20:26:21 * carnil btw apologies for typing issue, have issues with one finger
20:26:30 #topic #1091858: (i, +u) zstd: -9 SIGILLs on mips64el (under QEMU -M malta => invalid baselining?)
20:26:55 carnil: found more typing errors in my talk than in yours, so no worries :-)
20:27:17 issue pinpointed (and surprising it was not reported for so long), but patch is now upstream and marked for stable, so we will have the fixes in ~4w down to 6.1.y series as the patch submitter explicitly asked to wait for 3 weeks before going to stable
20:27:35 !next
20:27:38 so it's handled and will trickle automatically
20:27:47 #topic
20:27:47 #1092189: (i, ) nouveau: Texts on the Blender screen are not visible. Error messages for Nouveau appear in the dmesg output.
20:28:13 this is not yet tackled at all on our side
20:29:11 no instant idea on my side
20:29:20 anyone has the right questions to ask to the reporter?
20:30:12 Ask for testing a newer kernel and then take it upstream?
20:30:22 I don't know but the log messages mentioning INVALID_VALUE and INVALID_OPCODE sound like user-space doing something wrong
20:30:53 carnil: fwiw, I don't personally use as31 any more. maintaining the package has been very low overhead for me, but if it's considered important enough that someone else wants to take it over sometime, I won't object
20:31:52 Having said that, the opcode is printed as all-1s which seems more likely to be a kernel bug
20:32:27 The value that triggers the DATA_ERROR message is read from hardware if I'm not mistaken
20:34:33 ukleinek: I think I'm in agreement with you: test a recent kernel, then send upstream or try to find the fix
20:34:54 ukleinek: would you feel ok to take care of this?
20:35:13 #action ukleinek cares for #1092189
20:35:36 #topic #1092187: (i, +u) [amdgpu] Adaptative backlight is broken post suspend-resume on Radeon Vega
20:35:59 fixing commit pending for 6.12.9 and will be included in next unstable upload, I can take care of that import
20:36:12 #topic
20:36:12 #1091893: (i, M) linux-image-6.1.0-28-amd64: Watchdog detected hard LOCKUP on CPU 8, then CPU 0
20:36:45 #topic #1091893: (i, M) linux-image-6.1.0-28-amd64: Watchdog detected hard LOCKUP on CPU 8, then CPU 0
20:37:04 The message unfortunately is only a symptom, not the reason for a problem
20:37:30 "flaky hardware usually shouldn't cause a system
20:37:34 lockup" is bullshit
20:37:40 * ukleinek nods
20:38:02 Ideally the kernel copes with it, but there are limits
20:38:31 I'm not sure that we should do something more here
20:39:10 syslog contains an Oops
20:39:16 we have a broken spinlock in xhci_setup_device+0x16b/0x600 [xhci_hcd]
20:39:32 There are messages about reiserfs, btrfs and xfs. Is the reporter really using all of these?
20:40:19 waldi: Huh?
20:41:03 What I see is I/O timeouts, then an xhci timeout, and the handler for that crashes (null pointer deref)
20:42:23 as an additional datapoint the reporter suspects that the problem started after the update from bullseye to bookworm (but is not certain)
20:42:53 this hardware is flaky. but yes, the timeout handler should not die
20:43:29 bwh: i missed the first trace, the one with the timeout handler
20:43:52 https://lore.kernel.org/all/bug-219532-208809@https.bugzilla.kernel.org%2F/ could that be related?
20:45:01 could be
20:45:23 carnil: the backtrace is pretty darn similar
20:45:38 That link is a null pointer dereference, so I have doubts this is the same issue
20:45:58 both are null pointer
20:46:18 that this time I missed something in our bug report
20:46:23 "BUG: kernel NULL pointer dereference, address: 0000000000000030"
20:46:24 s/that/than/
20:46:32 even the address is identical
20:47:26 * ukleinek nods
20:47:46 OK, so we have an upstream bug to link to... but not a fix
20:48:36 yes
20:49:17 maybe we can have the reporter chime in on that discussion?
20:49:53 though the comment in https://bugzilla.kernel.org/show_bug.cgi?id=219532#c7 goes in the same direction and there seems to be buggy hardware involved as well
20:50:11 there is a patch to apply and try to get more information
20:50:21 so the reporter might try that and report back in the upstream bug
20:50:37 Yeah, let's point our reporter to that bugzilla entry
20:50:59 * ukleinek volunteers
20:51:05 thanks ukleinek
20:51:24 #action ukleinek asks reporter of #1091893 to look at https://lore.kernel.org/all/bug-219532-208809@https.bugzilla.kernel.org%2F/
20:51:46 okay we are at 21:50
20:52:02 let's shortly discuss the question from ukleinek on his merge requests
20:52:19 #topic #1015871: (w, +) Please enable CONFIG_PCI_P2PDMA + #1090072: (w, +) linux-image-amd64: Enable P2P and HMM feature for AMDGPU
20:52:57 ukleinek: prepared a MR but asks if someone from the team has more insights in the topic to understand if the changes are sensible to have enabled
20:53:20 is that feature arch restricted?
20:53:43 I enabled it for arm64 and amd64, does that answer your question?
20:54:18 Which is the MR?
20:54:25 there is only a branch, no MR
20:54:48 Oh OK
20:54:49 https://salsa.debian.org/kernel-team/linux/-/tree/ukleinek/p2pdma?ref_type=heads
20:56:03 IIRC in the earlier bugreport powerpcspe was also considered.
20:56:55 s/spe//
20:58:05 Or we could enable at the top level and rely on the arch dependencies
20:58:20 Is there a reason we would want to be more restrictive than upstream?
20:59:03 I think it's still "If unsure select 'N'" which I'd interpret as a hint to not enable it.
20:59:26 ukleinek: no, this is not a guide for arch dependency
21:00:02 config PCI_P2PDMA depends on ZONE_DEVICE
21:00:14 and ZONE_DEVICE depends on memory hotplug and such stuff
21:00:16 "If unsure select 'N'" can mean anything between "you probably don't have hardware that needs this" and "this is dangerously experimental code"
21:00:46 so s390x will pretty sure have all the required bits as well
21:00:46 If it's the former then we probably do want to enable it for generic kernel flavours
21:01:14 and ZONE_DEVICE is a requirement for DAX
21:01:17 back then powerpc didn't have CONFIG_ZONE_DEVICE enabled
21:02:13 * ukleinek tends to enable it
21:02:19 i'll look into it
21:02:24 super
21:02:37 so I suggest ukleinek can make a draft mr and assign it to waldi
21:02:38 ok?
21:02:54 #action ukleinek to create a MR and assign it to waldi
21:02:59 perfect
21:03:02 last topic for today
21:03:11 #topic AOB?
21:03:18 Today we have CONFIG_ZONE_DEVICE=y in debian/config/config
21:03:26 Does someone attend FOSDEM?
21:03:33 Very likely
21:03:34 unfortunately no
21:03:44 * ukleinek does, too
21:03:46 i intend to, but really need to plan accommodation
21:04:07 ah nice maybe I should make up my mind again about it if the whole kernel team is there =)
21:04:44 who takes care of preparing and leading through the next meeting?
21:04:49 (on jitsi)
21:05:06 Is it my turn?
21:05:37 yup
21:05:45 OK
21:05:53 ok perfect
21:06:10 #action bwh does chair next kernel-team meeting
21:06:20 I guess we are done, thanks all for attending!
21:06:29 waldi: if you fail to find accommodation, ping me. I might have a contact who booked too many rooms
21:06:30 Thank you for chairing
21:06:43 #endmeeting