Fri, Jun 20
Out of curiosity, which applications depended on this? I didn't see any apparent problems in the regression test suite.
In D50947#1162925, @zlei wrote:
I intended to MFC this to stable/14, but stable/14 still has gnu99 as the default.
Thu, Jun 19
- Rebase.
- Fix handling of the queue limit, i.e., coalesce overflow events when possible.
- Make the user watch limit a function of the maximum number of vnodes. Each watch represents a vnode ref, so we should ensure that excessive watches do not prevent vnode reclamation. At the same time, some reading suggests that the watch limit is often too low on Linux, so this approach is better than a hard-coded limit.
- Impose a global limit on the number of watches. This is higher than the per-user limit.
- Add a sysctl to export the total number of watches in use (see the sketch below).
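For illustration only, here is a rough sketch of what a vnode-scaled limit plus a read-only counter sysctl can look like in FreeBSD kernel code. The names (inotify_watch_count, inotify_max_user_watches) and the divisor are assumptions, not the actual patch:

    #include <sys/param.h>
    #include <sys/kernel.h>
    #include <sys/sysctl.h>
    #include <sys/vnode.h>

    /* Hypothetical names, not the real patch. */
    static u_long inotify_watch_count;      /* total watches currently in use */
    static u_long inotify_max_user_watches; /* per-user cap, derived below */

    static void
    inotify_sysinit(void *arg __unused)
    {
            /*
             * Assumption: cap each user at a small fraction of the vnode
             * limit, so that watches (each holding a vnode reference)
             * cannot pin an unbounded share of the vnode cache.
             */
            inotify_max_user_watches = desiredvnodes / 32;
    }
    SYSINIT(inotify_watch_limits, SI_SUB_VFS, SI_ORDER_ANY, inotify_sysinit, NULL);

    /* Export the global in-use count, read-only. */
    SYSCTL_ULONG(_vfs, OID_AUTO, inotify_watches, CTLFLAG_RD,
        &inotify_watch_count, 0, "Total number of watches in use");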
Wed, Jun 18
Should we note this behaviour in ptrace.2?
In D47935#1162094, @kib wrote:
In D47935#1161832, @markj wrote:
I still don't really like it. I'd prefer to understand exactly why pmap_growkernel() failed--is it because it tried to allocate more than v_interrupt_free_min (2) pages? Is it because there was some concurrent VM_ALLOC_INTERRUPT page allocation? Why don't we see such panics from stress2?
Perhaps stress2 is not stressful enough. Jokes aside, I do know that the hardware Peter uses is relatively old and underpowered by modern standards. I do not think he has anything larger than 64 threads.
Another issue might be that stress2 mostly tests the top level of the kernel, since that is what we can exercise from the syscall layer. There are no tests for things that would utilize VM_ALLOC_INTERRUPT heavily. For instance, a high-speed (200G/400G) Ethernet card with a lot of receive queues, on a large multiprocessor under a parallel swap load, is something that stress2 does not test.
And then, I read your argument as an attempt to make vm_page_alloc(VM_ALLOC_INTERRUPT) non-failing. I do not think this is feasible or intended. All callers of vm_page_alloc() must be prepared for failure, and the reserve would not work there.
Regarding the pmap, wouldn't the interrupt reserve be depleted if all CPUs called pmap_growkernel() back-to-back, without giving the pagedaemon threads a chance to become runnable? I think this is possible with different zones needing to grow KVA, or something similar. And no, I do not think that any bump of the reserve for interrupt allocations would make this theoretically correct.
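To make the caller-side expectation concrete, here is a minimal sketch of the usual pattern where a vm_page_alloc() failure is propagated rather than assumed away. The function and flag choices are illustrative, not taken from the patch:

    #include <sys/param.h>
    #include <sys/errno.h>
    #include <vm/vm.h>
    #include <vm/vm_object.h>
    #include <vm/vm_page.h>

    /* Hypothetical caller: allocate one page backing "obj" at "pindex". */
    static int
    alloc_backing_page(vm_object_t obj, vm_pindex_t pindex, vm_page_t *mp)
    {
            vm_page_t m;

            VM_OBJECT_WLOCK(obj);
            m = vm_page_alloc(obj, pindex, VM_ALLOC_NORMAL | VM_ALLOC_ZERO);
            VM_OBJECT_WUNLOCK(obj);
            if (m == NULL)
                    return (ENOMEM);        /* allocation can fail; propagate it */
            *mp = m;
            return (0);
    }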
Tue, Jun 17
This seems fine in principle, but it'll probably break some targets which rely on ignoring errors.
In D47935#1120644, @glebius wrote:
syzkaller hit a similar issue with fork(2):
https://syzkaller.appspot.com/bug?extid=6cd13c008e8640eceb4c
IMHO, we could propagate the pmap_growkernel() error all the way up and fail the syscall.
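Purely as a hypothetical sketch of that suggestion: pmap_growkernel() historically cannot report failure, so the helper below is invented for illustration, showing how a KVA-growth failure could be turned into ENOMEM and surfaced to the failing syscall:

    #include <sys/param.h>
    #include <sys/errno.h>

    /* Invented for this sketch; no such function exists in this form. */
    bool pmap_growkernel_checked(vm_offset_t new_end);

    static int
    kva_grow_checked(vm_offset_t new_end)
    {
            if (!pmap_growkernel_checked(new_end))
                    return (ENOMEM);        /* ultimately fails the syscall */
            return (0);
    }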
Mon, Jun 16
Make the queue size update atomic.
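As a generic illustration of that kind of change (the struct and field names here are made up, not from the patch), FreeBSD's atomic(9) primitives let a shared counter be updated without holding a lock:

    #include <sys/types.h>
    #include <machine/atomic.h>

    struct evqueue {
            volatile u_int  eq_size;        /* hypothetical queue size field */
    };

    /* Atomically account for one queued event. */
    static void
    evqueue_inc(struct evqueue *q)
    {
            atomic_add_int(&q->eq_size, 1);
    }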
@alc did you have any thoughts on this patch?
Thanks, the vnet function is indeed too simplistic. I believe, though I'm not totally certain, that I'll need to modify link_elf_obj.c a bit to preserve the original base address for the VNET variable section. Otherwise the debugger doesn't have a good way to figure out which VNET section a given variable belongs to. I have some WIP to address this but I need a bit of time.
Sat, Jun 14
In D50825#1160523, @kp wrote:
While testing I found that this has issues with vnet variables in kernel modules.
For a test I added a panic() call in pf_test(), and tried to print V_pf_status.
The kernel and module were loaded at:
0xffffffff80200000 217bc18 kernel
0xffffffff836dd000 52b48 pf.ko
The kernel shows these addresses:
curvnet 0xfffff804a6a76bc0
curvnet->vnet_data_base 0xfffffe02c7d4aa20
&VNET_NAME(pf_status) 0xffffffff81b8fac8
&V_pf_status = 0xfffffe02498da4e8
Debugging the vnet.py script I found that it had the correct vnet, 0xfffff804a6a76bc0, and the correct vnet_data_base, 0xfffffe02c7d4aa20, but it thought that the address of vnet_entry_pf_status (== &VNET_NAME(pf_status)) was 0xffffffff8372c390.
It looks like the vnet.py code thinks the pf.ko vnet variables live in the module's address range, while the kernel actually puts those in the kernel's vnet address range. I'm not sure how we can teach vnet.py to get the correct address for kernel module vnet variables.
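The addresses quoted above are consistent with a vnet variable resolving to vnet_data_base plus the address of the master copy, wrapping modulo 2^64, which is why a wrong &VNET_NAME() address yields a wrong V_pf_status. A quick standalone check of that arithmetic (illustration only, not the vnet.h macros):

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    int
    main(void)
    {
            /* Values taken from the comment above. */
            uint64_t vnet_data_base = 0xfffffe02c7d4aa20ULL;
            uint64_t master_copy    = 0xffffffff81b8fac8ULL;  /* &VNET_NAME(pf_status) */

            /* base + master copy, wrapping at 64 bits, gives &V_pf_status. */
            printf("0x%" PRIx64 "\n", vnet_data_base + master_copy);  /* 0xfffffe02498da4e8 */
            return (0);
    }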