19:00:16 <carnil> #startmeeting
19:00:16 <MeetBot> Meeting started Wed May 29 19:00:16 2024 UTC. The chair is carnil. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:16 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
19:00:31 <ema> hi!
19:00:34 <bwh> hi
19:00:54 <carnil> Hello everybody. As discussed on the mailing list we would like to try to bootstrap some regular meetings for the kernel-team to see where we can get more traction on issues, merge requests and topics to discuss
19:01:04 <waldi> hi
19:01:20 <carnil> first of all: I'm quite inexperienced in running online meetings, so if someone feels like wanting to (co)chair it feel free to speak up
19:01:36 <diederik> hi
19:01:43 <carnil> for today I have looked through some of the items and put together an agenda roughly as follows:
19:02:47 <carnil> agree on the team meetings, looking through the most important open bugs filed recently or updated recently, looking at merge requests, an item on handling checks on kernel-team projects and, if time remains, at least mention the trixie kernel maintenance issue
19:03:16 <carnil> #topic can we agree on trying to schedule regular team meetings every week
19:04:11 <bwh> I will usually be available on Mon-Thu evenings (CET)
19:04:13 <carnil> #info My proposal here would be that we hold those meetings every Wednesday, 21:00 CEST/19:00 UTC, trying to keep them focused on the most important bits and not make them overlong
19:04:35 <carnil> would a fixed weekday work for the interested people?
19:04:46 <bwh> For me, yes
19:05:04 <waldi> for now, yes
19:05:12 <ema> +1
19:05:43 <diederik> football season is over, so I can now too
19:05:51 <carnil> it should be noted that we can cancel the event if the experiment fails, it's fair to say
19:06:24 <carnil> so let's try that and make it every Wednesday (until there is need to change something)
19:06:51 <bwh> OK, updated my calendar
19:06:51 <carnil> #agreed hold kernel-team meetings every week on Wednesday 21:00 CEST/19:00 UTC
19:07:25 <ema> excellent, the first topic was easy :)
19:07:27 <carnil> next topic would be to go over at least the grave, serious and important bugs where we can say something
19:07:39 <carnil> #topic recent open bugs / recently updated bugs
19:07:55 <diederik> Sure about important? That's a LOT of bugs
19:08:00 <bwh> For #1063754, the reporter is trying to investigate in his own way and not doing the test I asked for
19:08:26 <bwh> I intend to reply and mark it moreinfo unreproducible
19:09:09 <carnil> that sounds good to me, if we do not have an actionable hint we cannot do much, if users cannot perform the tests we ask for then we might be out of luck
19:09:56 <carnil> #action bwh will reply to #1071378 on the reporter performing the needed tests and eventually mark it moreinfo and unreproducible for now
19:10:30 <diederik> ... that's a different bug
19:10:33 <carnil> #1039883 goes in a very similar direction, the reporter seems to be affected but nobody was really able to track this down.
19:11:00 <carnil> Theodore Ts'o was looped in a while back (July 2023) but there was no followup from upstream
19:11:21 <carnil> should we do the same here as well and mark it unreproducible?
19:11:55 <bwh> The reporter gave a script to reproduce it; did anyone try that?
19:12:23 <carnil> #info reproducer from the reporter is in message #40, did anyone try to reproduce it with it?
19:13:16 <carnil> I guess the answer is no
19:13:18 <waldi> carnil: #info and #action are standalone, you have to provide all the context
19:14:30 <diederik> I haven't tried it
19:14:33 <carnil> waldi: thanks, will try to improve on it.
19:14:56 <bwh> Could someone try it, then?
19:15:15 <ema> I can
19:15:46 <carnil> #info With respect to #1039883 the reporter provided a reproducer script in message https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1039883#40; it is unclear if someone tried to reproduce it on their own as well
19:16:08 <ema> #action ema will try to reproduce #1039883
19:17:05 <carnil> thanks ema
19:17:08 <bwh> thanks
19:17:17 <ema> np!
19:17:52 <bwh> For #1071378, carnil asked the reporter to bisect and report upstream. Should it again be tagged moreinfo then?
19:18:53 <carnil> #action carnil will tag #1071378 moreinfo as we asked the reporter to bisect the issue and report upstream
19:19:25 <diederik> Waiting for the reporter to respond seems more useful. If new questions then arise, it could be added (imo)
19:19:36 <carnil> #info #1057282 is affecting ci.debian.net infrastructure after updating the kernel on arm64 hosts. Ben asked Paul if more recent kernels fix the issue
19:20:10 <carnil> #info Paul Gevers has a question back to us though, before trying that on ci.debian.net, in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1057282#42
19:20:37 <bwh> So what do we say?
19:20:42 <carnil> #info summarizing, the question is whether it is worth knowing and putting workload on the ci.debian.net maintainers
19:20:59 <carnil> his reply:
19:21:01 <carnil> If you think it worth enough knowing if either is the case, I can
19:21:02 <carnil> install the backports kernel again on the arm64 hosts, but obviously
19:21:02 <carnil> that will be annoying for us. Please let me know if I should pursue this
19:22:04 <bwh> I think they're going to have to upgrade at some point so we might as well find out whether the issue is fixed rather than waiting for them to upgrade to trixie and potentially hit it then
19:22:06 <carnil> the problem here, if I understand correctly, is that the stable kernel has other issues they were facing, so switching to the bpo kernel for arm64 hosts was a possibility
19:23:12 <diederik> that bug doesn't mention that they're currently having problems with the Stable kernel
19:23:44 <carnil> diederik: it is in the very first message: Thursday 30 November I upgraded the ci.debian.net workers. We're running
19:23:47 <carnil> the backports kernel there due to issues we discussed earlier, but after
19:23:49 <carnil> upgrading, we lost access to our arm64 hosts one after the other.
19:24:40 <diederik> Yes, and that's now half a year old. The most recent message said they switched back to the Stable kernel
19:24:42 <carnil> but it does not explicitly mention anymore which were the issues, and I at least have lost the overview of which ones. there is at least the apparmor issue to be looked at right after this which has an impact on them (but with upstream approaching a solution)
19:24:56 <diederik> (but are willing to upgrade again if it helps fixing the issue)
19:25:18 <carnil> I suggest we ask Paul to please test the updated version
19:25:52 <bwh> #action bwh to follow up to #1057282
19:26:07 <carnil> thanks, bwh you were faster to write that
19:26:21 <carnil> Last bug to quickly look at before we run out of time
19:27:34 <carnil> #info #1072004 is actually affecting tasks for QA for the release, breaking autopkgtest for qemu jobs. We lowered the severity for now, but bluca is asking to make it actually RC so as not to let the version of linux migrate to testing
19:28:05 <carnil> it would be ideal to know if this is fixed in 6.9.y, has someone from the kernel-team tried that?
19:28:33 <diederik> technically it isn't RC. But I do think it's very important as it affects important Debian infra
19:28:35 <waldi> they can run qemu jobs with the kernel from stable, so no problem at this time?
19:30:12 <waldi> anyway, is it fixed somewhere?
19:30:14 <bwh> This is for isolation-machine, where the package gets run in a QEMU VM
19:30:43 <bwh> so I think it makes sense to install the kernel from testing in the VM
19:32:16 <bluca> yes it's a guest kernel issue, not a host kernel issue
19:32:44 <bwh> waldi: I don't see anything in next referring to the breaking commit, so probably no
19:33:00 <waldi> bwh: i don't even see a log. just a SIGTERM
19:33:11 <carnil> Paul in any case seems to indicate he defers the decision to us (in his last message)
19:33:56 <waldi> maybe we should just refer people to migrate to virtiofs for any new usecase, which this would be
19:34:00 <carnil> so if I understand you all correctly: we would not increase the severity to RC, but seek someone who can verify if the issue is addressed in 6.9.y upstream (maybe pursuing again the stalled thread upstream)
19:34:09 <diederik> I (then) want to highlight another part: "I would be expecting a bit quicker turn around on this bug if you say yes now ;)"
19:35:52 <diederik> increasing the severity (to RC) or not is a decision that the maintainer(s) need to make
19:36:03 <ema> apparently canonical did verify that reverting one commit is sufficient to get a working 6.8: https://bugs.launchpad.net/ubuntu/+source/autopkgtest/+bug/2056461/comments/13
19:36:05 <diederik> That's also what Paul explicitly said
19:36:41 <bwh> ema: Saw that. I worry a bit whether that will still work when we upgrade to 6.9
19:37:45 <waldi> anyway, i have to run, not enough time today
19:37:56 <carnil> and any patch where we diverge from upstream which is not the solution applied by upstream will hit us later in some way.
19:38:18 <carnil> ok waldi, thanks for participating, and this is highlighting a good point that time is running fast
19:38:32 <diederik> The problem seems consistent and (thus) reproducible. So if someone has time, try to reproduce it in a VM (or sth like that)?
19:38:50 <diederik> If reproduced, try it with a 6.9 kernel to see if the issue is still there
19:39:18 <bwh> Yes I think the difficulty is to make a simple reproducer rather than using the whole of autopkgtest
19:41:09 <diederik> That would be better as it's likely quicker. But this doesn't seem like an issue which *only* the reporter can trigger/hit (as it's not dependent on certain HW)
19:41:10 <carnil> so how about approaching again the people in the upstream stalled thread, to get more information and an understanding of whether 6.9.y is still affected? The point here is likely to find someone with enough free time, or to otherwise do the experimenting on our own.
19:42:08 <carnil> diederik: would you be in the position to schedule enough time for trying to reproduce the issue and verifying it against 6.9.y as well?
19:42:15 <diederik> no
19:42:39 <ema> AFAIU reproducing is a matter of installing a 6.9 kernel in a VM and using such a VM in an autopkgtest with the qemu backend? If so I can hopefully give it a go in the next couple of days
19:43:01 <ema> s/reproducing/checking if 6.9 is affected/
19:43:17 <diederik> Yes, first make sure you can reproduce it with a 6.8 kernel
19:43:43 <carnil> #info if our understanding is correct, reproducing #1072004 is a matter of installing the kernel in a VM and using such a VM in autopkgtest with the qemu backend, first verifying it is reproducible with a 6.8.y kernel
19:44:10 <carnil> #action ema might have time to try to reproduce the issue in the next few days
19:44:16 <bwh> ema: If you haven't used autopkgtest before, don't underestimate the difficulty of setting it up
19:44:24 <ema> bwh: I have :-)
19:44:31 <bwh> Ah, good
19:45:08 <carnil> perfect, so I actually had on the agenda at least to talk about how we move forward with the firmware-nonfree rebases and the rebases for 6.9.y and 6.10-rcX in experimental
19:45:31 <ema> #action ema to try to reproduce #1072004 with 6.8 and 6.9 as guest kernel in autopkgtest-virt-qemu
19:45:56 <carnil> do we still have capacity to at least look at the firmware-nonfree situation?
19:46:23 <bwh> I hope to work on it "soon", but can't promise anything
19:46:39 <carnil> #topic firmware-nonfree lagging behind upstream versions
19:47:12 <carnil> #info the situation is rather unfortunate for firmware-nonfree. We lag behind several upstream versions containing both security fixes and updates which have real impact for users with recent HW
19:47:19 <diederik> I want to make a remark about the previous topic if that's ok
19:47:32 <carnil> #info diederik did a lot of work rebasing the versions but the main problem is reviewing those MRs and the lack of automation
19:47:46 <diederik> "approaching again the people in the upstream stalled thread" the last message was YESTERDAY, not 6+ months ago
19:47:48 <carnil> #info bwh hopes to work on it "soon" but there cannot be a promise for it
19:48:24 <diederik> I don't think the problem is automation, but someone making the time to review them
19:49:04 <bwh> I do mean to review the MRs
19:49:37 <diederik> If you look at the procedure I described in https://lists.debian.org/debian-kernel/2024/05/msg00049.html and try that out on f.e. 202309XY, that should (hopefully) reveal it's not that hard ...
19:49:51 <diederik> ... after the huge one (and the next) are cleared
19:49:55 <daissi> I'm willing to help with firmware-nonfree
19:50:10 <diederik> bwh: ah ok, thanks :)
19:50:37 <diederik> I also saw a different approach which does look like automation and I thought that was referred to
19:51:05 <bwh> So I want to automate things more for future updates
19:51:51 <bwh> Something I worked on in Berlin was support for wildcard file lists, so there's no need to add individual files to a package any more
19:52:00 <bwh> (usually)
19:52:58 <bwh> Anyway, I think the rest of this can be discussed in the relevant MRs
19:52:59 <diederik> The main problem I encountered was coming up with a 'random string of characters' to use as Description
19:53:14 <diederik> IOW: I think that field is mostly useless
19:53:41 <bwh> Right, it is pretty pointless
19:53:50 <carnil> thanks bwh and diederik for summarizing the current situation.
19:54:27 <bwh> since we now have automation for finding the firmware packages relevant to your hardware
19:54:32 <carnil> (if you have seen above, daissi offered help for firmware-nonfree as well)
19:55:08 <bwh> daissi: Thank you, but at the moment the limiting factor is review by maintainers (like me)
19:55:14 <bluca> just a note: reproducing the qemu hang with autopkgtest is really easy, two steps: build an image with 'autopkgtest-build-qemu unstable /path/to/some/img' and then 'autopkgtest -B dpdk -- autopkgtest-virt-qemu /path/to/some/img'
19:55:38 <ema> thanks bluca
19:55:40 <diederik> I think there's another important issue wrt firmware and that's on the kernel side
19:55:45 <daissi> Okay so don't hesitate to ping me if I can do something
19:56:05 <carnil> #info the limiting factor for firmware-nonfree right now is rather the review of the MRs by kernel-team maintainers
19:56:34 <diederik> We 'upgrade' every firmware message to error, which in turn causes reporters to focus on that (and report more as it's an error, which itself is understandable)
19:56:39 <carnil> #info some information we put in the package is quite useless (descriptions for firmware) as we do have automation for finding firmware packages relevant to the present hardware
19:57:22 <diederik> also, people look for file names/paths, not some random string I came up with
19:58:14 <bwh> diederik: Yes the firmware logging patches do need attention (possibly deletion, depending on whether d-i still wants that reporting)
19:58:57 <diederik> yeah, they still need the messages, but I hope they don't require them to be errors
19:59:45 <diederik> imo, those (2 IIRC) patches should be deleted OR split up (as described in the patch description)
20:00:35 <diederik> but right now I think it's causing more harm than providing value
20:00:50 <bwh> #action bwh to overhaul firmware logging patches
20:01:09 <carnil> I think we have bugs for those, but I do not have them at hand right now
20:01:32 <bwh> There's one relating to iwlwifi-yoyo.bin which is a debug thing not present in linux-firmware.git
20:02:07 <bwh> and I've been meaning to deal with the patches since I saw that, quite some time ago now :-/
20:03:01 <carnil> okay we are running out of time and I propose to close it here. We can discuss the relevant items further in the MRs for firmware-nonfree, and how to rework the firmware logging patches again in the next meeting or off-meeting
20:03:09 <bwh> I agree
20:03:56 <carnil> so thanks to all for participating in this experiment. Again, if someone feels better suited to chair the meeting please do approach.
20:04:15 <ema> sounds good, I also wanted to briefly discuss other items but nothing that needs sync communication, I'll send emails :-)
20:04:43 <carnil> #info next meeting will be on Wed. 5th June 21:00 CEST
20:04:54 <bwh> Thanks carnil
20:05:06 <carnil> #endmeeting
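
For reference, a minimal sketch of the reproduction steps bluca describes above for #1072004; the image path here is a placeholder and dpdk is simply the test package from his example (any package whose tests exercise the qemu backend should do):

    # build a sid qemu test image (the image path is a placeholder)
    autopkgtest-build-qemu unstable /tmp/sid.img
    # run the dpdk autopkgtests against that image via the qemu backend
    # (-B skips building the package binaries from source)
    autopkgtest -B dpdk -- autopkgtest-virt-qemu /tmp/sid.img

To check whether 6.9.y is still affected, the same run would be repeated after installing a 6.9 guest kernel into the image (how to install it into the image is left open here), as discussed in the meeting.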