mirror of
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
synced 2025-09-04 20:19:47 +08:00
a430d95c5e
1021 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
![]() |
a721f7b8c3 |
lsm: constify 'bprm' parameter in security_bprm_committed_creds()
Three LSMs register the implementations for the 'bprm_committed_creds()' hook: AppArmor, SELinux and tomoyo. Looking at the function implementations we may observe that the 'bprm' parameter is not changing. Mark the 'bprm' parameter of LSM hook security_bprm_committed_creds() as 'const' since it will not be changing in the LSM hook. Signed-off-by: Khadija Kamran <kamrankhadijadj@gmail.com> [PM: minor merge fuzzing due to other constification patches] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
64fc952614 |
lsm: constify 'bprm' parameter in security_bprm_committing_creds()
The 'bprm_committing_creds' hook has implementations registered in SELinux and Apparmor. Looking at the function implementations we observe that the 'bprm' parameter is not changing. Mark the 'bprm' parameter of LSM hook security_bprm_committing_creds() as 'const' since it will not be changing in the LSM hook. Signed-off-by: Khadija Kamran <kamrankhadijadj@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
25cc71d152 |
lsm: constify 'sb' parameter in security_quotactl()
SELinux registers the implementation for the "quotactl" hook. Looking at the function implementation we observe that the parameter "sb" is not changing. Mark the "sb" parameter of LSM hook security_quotactl() as "const" since it will not be changing in the LSM hook. Signed-off-by: Khadija Kamran <kamrankhadijadj@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
ccf1dab96b |
selinux: fix handling of empty opts in selinux_fs_context_submount()
selinux_set_mnt_opts() relies on the fact that the mount options pointer
is always NULL when all options are unset (specifically in its
!selinux_initialized() branch. However, the new
selinux_fs_context_submount() hook breaks this rule by allocating a new
structure even if no options are set. That causes any submount created
before a SELinux policy is loaded to be rejected in
selinux_set_mnt_opts().
Fix this by making selinux_fs_context_submount() leave fc->security
set to NULL when there are no options to be copied from the reference
superblock.
Cc: <stable@vger.kernel.org>
Reported-by: Adam Williamson <awilliam@redhat.com>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2236345
Fixes:
|
||
![]() |
1086eeac9c |
lsm/stable-6.6 PR 20230829
-----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmTuKLcUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXM/Eg//cwaOu/ASS08Cz/tfXeKpzg9UpzbW uHqGtgdE9ZEvS71z+3dorOJVPEwPr+/yviq3FXYjYHFqvVhLZCvYM9rw+eNo/k4T I95UTchGUsMWwkw61YBDLythfXm2UL5nabjckO81i9UPtxUYOwF6xQMQXYyMcLL8 6fm1vnCvK5FBEXi2HSUWy3Eb3wdviGdHrL6h19Aeew+q8u33asWSxn9vmBSSFEzZ 492//Pgy0t3FA6paWXQRvoR+GvLgBXNOvHB68cAx9vS8Lq6mAwJJSCRrQtKGh2Gd YInr49f+TXOosD5Tm6ueWO4sr8RzQZ7nPyM+BLue4Yn2ZzdYgjwfHdkHWS1KeH5X qVqa9s6/QONvkSCzqHs/ne2qio1Q0/0uGgwOkx6N7oVWQWjE7iTYlADwM0CDJnd2 UD7AHTOgpc88x1T1eW599MZttSCznBTSFXv4waaS5/5NT9n8Db1TpTtCTedOc1x2 n+c+F5BHLy69vhSGCanvum/8i2gNoKVyYaHyaMsQxr5LRcLnvN6oOjWIv7jMKxe7 GavUAxU7M5rxPUH44vrrrI+XztKJOdpCz4S0xp+7pSSSGAK5KkmVVLXjzrlGO1WS 55ixxQWYTGK0KlWHp4Ofi6brE9a4ATKcd1XscPN+AtBYX2ufNHLskCZulu/lyrMx lAy9RRDe1hHWTvg= =dnm4 -----END PGP SIGNATURE----- Merge tag 'lsm-pr-20230829' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull LSM updates from Paul Moore: - Add proper multi-LSM support for xattrs in the security_inode_init_security() hook Historically the LSM layer has only allowed a single LSM to add an xattr to an inode, with IMA/EVM measuring that and adding its own as well. As we work towards promoting IMA/EVM to a "proper LSM" instead of the special case that it is now, we need to better support the case of multiple LSMs each adding xattrs to an inode and after several attempts we now appear to have something that is working well. It is worth noting that in the process of making this change we uncovered a problem with Smack's SMACK64TRANSMUTE xattr which is also fixed in this pull request. - Additional LSM hook constification Two patches to constify parameters to security_capget() and security_binder_transfer_file(). While I generally don't make a special note of who submitted these patches, these were the work of an Outreachy intern, Khadija Kamran, and that makes me happy; hopefully it does the same for all of you reading this. - LSM hook comment header fixes One patch to add a missing hook comment header, one to fix a minor typo. - Remove an old, unused credential function declaration It wasn't clear to me who should pick this up, but it was trivial, obviously correct, and arguably the LSM layer has a vested interest in credentials so I merged it. Sadly I'm now noticing that despite my subject line cleanup I didn't cleanup the "unsued" misspelling, sigh * tag 'lsm-pr-20230829' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: lsm: constify the 'file' parameter in security_binder_transfer_file() lsm: constify the 'target' parameter in security_capget() lsm: add comment block for security_sk_classify_flow LSM hook security: Fix ret values doc for security_inode_init_security() cred: remove unsued extern declaration change_create_files_as() evm: Support multiple LSMs providing an xattr evm: Align evm_inode_init_security() definition with LSM infrastructure smack: Set the SMACK64TRANSMUTE xattr in smack_inode_init_security() security: Allow all LSMs to provide xattrs for inode_init_security hook lsm: fix typo in security_file_lock() comment header |
||
![]() |
1dbae18987 |
selinux/stable-6.6 PR 20230829
-----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmTuKKAUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNTVBAAjFt/+t74FkH7OeIuwa4QFodNHWLS 4AgtudUrH0oE/JnDDZsMNm3gjtFhM0cYcCCvItjNmea9sCMfR/huAaQXuYldLI/w IF2/n31Q9SlaHF8xgdpPg55tudx7khGoecC+aHA/qsRoikjCvUhOiWJZT4xn92R7 3DtPMyoY7/1Bw8TWhq9Kj4ks2evx+o9Q798Nxu6XKUWZ6X+CasQfH0bAWWcmh8sL Z/mh5pyICmg9/sHvTdi3/VhhhSiWY0gyaX3tBo7Y3z6J4xjBzLzpJTC2TMaqpqLx nUO2R02Bk6PgMjQcVYl77WlUOfwPz3L2NkWcCmTBNzyBzirl9pUjVZ7ay9fek4Zf GSnoZ3IPNLxsK/oU4Zh+LKzkBpq7astovebFd4SYG+KHIhXkp8eblzxM5/M+LfG/ SXnGrLaCwxWp4jWlYubufPtKhF8jFlpzRvTELKrhU9pZCgzpvwvIJZhUGVP6cDWh 7j9RRJy1wBgqoDASM5+tCrplkwFa3r4AV5CVOFYSXgyvLCHk28IxcMEVqCvoO3yQ FmvAi0MV85XOa/NpD3hYpnvfAcLbv6ftZC6JekqXMjwVE9vfdrywS4ZW4Z3BcEo+ M+eDsCh8ek9LmO+6IW0jffNuHijLKDr3hywnndNWascrmkmgDmu60kW/HIQORkPa tZkswaKUa6RszR0= =XB/b -----END PGP SIGNATURE----- Merge tag 'selinux-pr-20230829' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: "Thirty three SELinux patches, which is a pretty big number for us, but there isn't really anything scary in here; in fact we actually manage to remove 10 lines of code with this :) - Promote the SELinux DEBUG_HASHES macro to CONFIG_SECURITY_SELINUX_DEBUG The DEBUG_HASHES macro was a buried SELinux specific preprocessor debug macro that was a problem waiting to happen. Promoting the debug macro to a proper Kconfig setting should help both improve the visibility of the feature as well enable improved test coverage. We've moved some additional debug functions under the CONFIG_SECURITY_SELINUX_DEBUG flag and we may see more work in the future. - Emit a pr_notice() message if virtual memory is executable by default As this impacts the SELinux access control policy enforcement, if the system's configuration is such that virtual memory is executable by default we print a single line notice to the console. - Drop avtab_search() in favor of avtab_search_node() Both functions are nearly identical so we removed avtab_search() and converted the callers to avtab_search_node(). - Add some SELinux network auditing helpers The helpers not only reduce a small amount of code duplication, but they provide an opportunity to improve UDP flood performance slightly by delaying initialization of the audit data in some cases. - Convert GFP_ATOMIC allocators to GFP_KERNEL when reading SELinux policy There were two SELinux policy load helper functions that were allocating memory using GFP_ATOMIC, they have been converted to GFP_KERNEL. - Quiet a KMSAN warning in selinux_inet_conn_request() A one-line error path (re)set patch that resolves a KMSAN warning. It is important to note that this doesn't represent a real bug in the current code, but it quiets KMSAN and arguably hardens the code against future changes. - Cleanup the policy capability accessor functions This is a follow-up to the patch which reverted SELinux to using a global selinux_state pointer. This patch cleans up some artifacts of that change and turns each accessor into a one-line READ_ONCE() call into the policy capabilities array. - A number of patches from Christian Göttsche Christian submitted almost two-thirds of the patches in this pull request as he worked to harden the SELinux code against type differences, variable overflows, etc. - Support for separating early userspace from the kernel in policy, with a later revert We did have a patch that added a new userspace initial SID which would allow SELinux to distinguish between early user processes created before the initial policy load and the kernel itself. Unfortunately additional post-merge testing revealed a problematic interaction with an old SELinux userspace on an old version of Ubuntu so we've reverted the patch until we can resolve the compatibility issue. - Remove some outdated comments dealing with LSM hook registration When we removed the runtime disable functionality we forgot to remove some old comments discussing the importance of LSM hook registration ordering. - Minor administrative changes Stephen Smalley updated his email address and "debranded" SELinux from "NSA SELinux" to simply "SELinux". We've come a long way from the original NSA submission and I would consider SELinux a true community project at this point so removing the NSA branding just makes sense" * tag 'selinux-pr-20230829' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: (33 commits) selinux: prevent KMSAN warning in selinux_inet_conn_request() selinux: use unsigned iterator in nlmsgtab code selinux: avoid implicit conversions in policydb code selinux: avoid implicit conversions in selinuxfs code selinux: make left shifts well defined selinux: update type for number of class permissions in services code selinux: avoid implicit conversions in avtab code selinux: revert SECINITSID_INIT support selinux: use GFP_KERNEL while reading binary policy selinux: update comment on selinux_hooks[] selinux: avoid implicit conversions in services code selinux: avoid implicit conversions in mls code selinux: use identical iterator type in hashtab_duplicate() selinux: move debug functions into debug configuration selinux: log about VM being executable by default selinux: fix a 0/NULL mistmatch in ad_net_init_from_iif() selinux: introduce SECURITY_SELINUX_DEBUG configuration selinux: introduce and use lsm_ad_net_init*() helpers selinux: update my email address selinux: add missing newlines in pr_err() statements ... |
||
![]() |
b96a3e9142 |
- Some swap cleanups from Ma Wupeng ("fix WARN_ON in add_to_avail_list")
- Peter Xu has a series (mm/gup: Unify hugetlb, speed up thp") which reduces the special-case code for handling hugetlb pages in GUP. It also speeds up GUP handling of transparent hugepages. - Peng Zhang provides some maple tree speedups ("Optimize the fast path of mas_store()"). - Sergey Senozhatsky has improved te performance of zsmalloc during compaction (zsmalloc: small compaction improvements"). - Domenico Cerasuolo has developed additional selftest code for zswap ("selftests: cgroup: add zswap test program"). - xu xin has doe some work on KSM's handling of zero pages. These changes are mainly to enable the user to better understand the effectiveness of KSM's treatment of zero pages ("ksm: support tracking KSM-placed zero-pages"). - Jeff Xu has fixes the behaviour of memfd's MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED sysctl ("mm/memfd: fix sysctl MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED"). - David Howells has fixed an fscache optimization ("mm, netfs, fscache: Stop read optimisation when folio removed from pagecache"). - Axel Rasmussen has given userfaultfd the ability to simulate memory poisoning ("add UFFDIO_POISON to simulate memory poisoning with UFFD"). - Miaohe Lin has contributed some routine maintenance work on the memory-failure code ("mm: memory-failure: remove unneeded PageHuge() check"). - Peng Zhang has contributed some maintenance work on the maple tree code ("Improve the validation for maple tree and some cleanup"). - Hugh Dickins has optimized the collapsing of shmem or file pages into THPs ("mm: free retracted page table by RCU"). - Jiaqi Yan has a patch series which permits us to use the healthy subpages within a hardware poisoned huge page for general purposes ("Improve hugetlbfs read on HWPOISON hugepages"). - Kemeng Shi has done some maintenance work on the pagetable-check code ("Remove unused parameters in page_table_check"). - More folioification work from Matthew Wilcox ("More filesystem folio conversions for 6.6"), ("Followup folio conversions for zswap"). And from ZhangPeng ("Convert several functions in page_io.c to use a folio"). - page_ext cleanups from Kemeng Shi ("minor cleanups for page_ext"). - Baoquan He has converted some architectures to use the GENERIC_IOREMAP ioremap()/iounmap() code ("mm: ioremap: Convert architectures to take GENERIC_IOREMAP way"). - Anshuman Khandual has optimized arm64 tlb shootdown ("arm64: support batched/deferred tlb shootdown during page reclamation/migration"). - Better maple tree lockdep checking from Liam Howlett ("More strict maple tree lockdep"). Liam also developed some efficiency improvements ("Reduce preallocations for maple tree"). - Cleanup and optimization to the secondary IOMMU TLB invalidation, from Alistair Popple ("Invalidate secondary IOMMU TLB on permission upgrade"). - Ryan Roberts fixes some arm64 MM selftest issues ("selftests/mm fixes for arm64"). - Kemeng Shi provides some maintenance work on the compaction code ("Two minor cleanups for compaction"). - Some reduction in mmap_lock pressure from Matthew Wilcox ("Handle most file-backed faults under the VMA lock"). - Aneesh Kumar contributes code to use the vmemmap optimization for DAX on ppc64, under some circumstances ("Add support for DAX vmemmap optimization for ppc64"). - page-ext cleanups from Kemeng Shi ("add page_ext_data to get client data in page_ext"), ("minor cleanups to page_ext header"). - Some zswap cleanups from Johannes Weiner ("mm: zswap: three cleanups"). - kmsan cleanups from ZhangPeng ("minor cleanups for kmsan"). - VMA handling cleanups from Kefeng Wang ("mm: convert to vma_is_initial_heap/stack()"). - DAMON feature work from SeongJae Park ("mm/damon/sysfs-schemes: implement DAMOS tried total bytes file"), ("Extend DAMOS filters for address ranges and DAMON monitoring targets"). - Compaction work from Kemeng Shi ("Fixes and cleanups to compaction"). - Liam Howlett has improved the maple tree node replacement code ("maple_tree: Change replacement strategy"). - ZhangPeng has a general code cleanup - use the K() macro more widely ("cleanup with helper macro K()"). - Aneesh Kumar brings memmap-on-memory to ppc64 ("Add support for memmap on memory feature on ppc64"). - pagealloc cleanups from Kemeng Shi ("Two minor cleanups for pcp list in page_alloc"), ("Two minor cleanups for get pageblock migratetype"). - Vishal Moola introduces a memory descriptor for page table tracking, "struct ptdesc" ("Split ptdesc from struct page"). - memfd selftest maintenance work from Aleksa Sarai ("memfd: cleanups for vm.memfd_noexec"). - MM include file rationalization from Hugh Dickins ("arch: include asm/cacheflush.h in asm/hugetlb.h"). - THP debug output fixes from Hugh Dickins ("mm,thp: fix sloppy text output"). - kmemleak improvements from Xiaolei Wang ("mm/kmemleak: use object_cache instead of kmemleak_initialized"). - More folio-related cleanups from Matthew Wilcox ("Remove _folio_dtor and _folio_order"). - A VMA locking scalability improvement from Suren Baghdasaryan ("Per-VMA lock support for swap and userfaults"). - pagetable handling cleanups from Matthew Wilcox ("New page table range API"). - A batch of swap/thp cleanups from David Hildenbrand ("mm/swap: stop using page->private on tail pages for THP_SWAP + cleanups"). - Cleanups and speedups to the hugetlb fault handling from Matthew Wilcox ("Change calling convention for ->huge_fault"). - Matthew Wilcox has also done some maintenance work on the MM subsystem documentation ("Improve mm documentation"). -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZO1JUQAKCRDdBJ7gKXxA jrMwAP47r/fS8vAVT3zp/7fXmxaJYTK27CTAM881Gw1SDhFM/wEAv8o84mDenCg6 Nfio7afS1ncD+hPYT8947UnLxTgn+ww= =Afws -----END PGP SIGNATURE----- Merge tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Some swap cleanups from Ma Wupeng ("fix WARN_ON in add_to_avail_list") - Peter Xu has a series (mm/gup: Unify hugetlb, speed up thp") which reduces the special-case code for handling hugetlb pages in GUP. It also speeds up GUP handling of transparent hugepages. - Peng Zhang provides some maple tree speedups ("Optimize the fast path of mas_store()"). - Sergey Senozhatsky has improved te performance of zsmalloc during compaction (zsmalloc: small compaction improvements"). - Domenico Cerasuolo has developed additional selftest code for zswap ("selftests: cgroup: add zswap test program"). - xu xin has doe some work on KSM's handling of zero pages. These changes are mainly to enable the user to better understand the effectiveness of KSM's treatment of zero pages ("ksm: support tracking KSM-placed zero-pages"). - Jeff Xu has fixes the behaviour of memfd's MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED sysctl ("mm/memfd: fix sysctl MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED"). - David Howells has fixed an fscache optimization ("mm, netfs, fscache: Stop read optimisation when folio removed from pagecache"). - Axel Rasmussen has given userfaultfd the ability to simulate memory poisoning ("add UFFDIO_POISON to simulate memory poisoning with UFFD"). - Miaohe Lin has contributed some routine maintenance work on the memory-failure code ("mm: memory-failure: remove unneeded PageHuge() check"). - Peng Zhang has contributed some maintenance work on the maple tree code ("Improve the validation for maple tree and some cleanup"). - Hugh Dickins has optimized the collapsing of shmem or file pages into THPs ("mm: free retracted page table by RCU"). - Jiaqi Yan has a patch series which permits us to use the healthy subpages within a hardware poisoned huge page for general purposes ("Improve hugetlbfs read on HWPOISON hugepages"). - Kemeng Shi has done some maintenance work on the pagetable-check code ("Remove unused parameters in page_table_check"). - More folioification work from Matthew Wilcox ("More filesystem folio conversions for 6.6"), ("Followup folio conversions for zswap"). And from ZhangPeng ("Convert several functions in page_io.c to use a folio"). - page_ext cleanups from Kemeng Shi ("minor cleanups for page_ext"). - Baoquan He has converted some architectures to use the GENERIC_IOREMAP ioremap()/iounmap() code ("mm: ioremap: Convert architectures to take GENERIC_IOREMAP way"). - Anshuman Khandual has optimized arm64 tlb shootdown ("arm64: support batched/deferred tlb shootdown during page reclamation/migration"). - Better maple tree lockdep checking from Liam Howlett ("More strict maple tree lockdep"). Liam also developed some efficiency improvements ("Reduce preallocations for maple tree"). - Cleanup and optimization to the secondary IOMMU TLB invalidation, from Alistair Popple ("Invalidate secondary IOMMU TLB on permission upgrade"). - Ryan Roberts fixes some arm64 MM selftest issues ("selftests/mm fixes for arm64"). - Kemeng Shi provides some maintenance work on the compaction code ("Two minor cleanups for compaction"). - Some reduction in mmap_lock pressure from Matthew Wilcox ("Handle most file-backed faults under the VMA lock"). - Aneesh Kumar contributes code to use the vmemmap optimization for DAX on ppc64, under some circumstances ("Add support for DAX vmemmap optimization for ppc64"). - page-ext cleanups from Kemeng Shi ("add page_ext_data to get client data in page_ext"), ("minor cleanups to page_ext header"). - Some zswap cleanups from Johannes Weiner ("mm: zswap: three cleanups"). - kmsan cleanups from ZhangPeng ("minor cleanups for kmsan"). - VMA handling cleanups from Kefeng Wang ("mm: convert to vma_is_initial_heap/stack()"). - DAMON feature work from SeongJae Park ("mm/damon/sysfs-schemes: implement DAMOS tried total bytes file"), ("Extend DAMOS filters for address ranges and DAMON monitoring targets"). - Compaction work from Kemeng Shi ("Fixes and cleanups to compaction"). - Liam Howlett has improved the maple tree node replacement code ("maple_tree: Change replacement strategy"). - ZhangPeng has a general code cleanup - use the K() macro more widely ("cleanup with helper macro K()"). - Aneesh Kumar brings memmap-on-memory to ppc64 ("Add support for memmap on memory feature on ppc64"). - pagealloc cleanups from Kemeng Shi ("Two minor cleanups for pcp list in page_alloc"), ("Two minor cleanups for get pageblock migratetype"). - Vishal Moola introduces a memory descriptor for page table tracking, "struct ptdesc" ("Split ptdesc from struct page"). - memfd selftest maintenance work from Aleksa Sarai ("memfd: cleanups for vm.memfd_noexec"). - MM include file rationalization from Hugh Dickins ("arch: include asm/cacheflush.h in asm/hugetlb.h"). - THP debug output fixes from Hugh Dickins ("mm,thp: fix sloppy text output"). - kmemleak improvements from Xiaolei Wang ("mm/kmemleak: use object_cache instead of kmemleak_initialized"). - More folio-related cleanups from Matthew Wilcox ("Remove _folio_dtor and _folio_order"). - A VMA locking scalability improvement from Suren Baghdasaryan ("Per-VMA lock support for swap and userfaults"). - pagetable handling cleanups from Matthew Wilcox ("New page table range API"). - A batch of swap/thp cleanups from David Hildenbrand ("mm/swap: stop using page->private on tail pages for THP_SWAP + cleanups"). - Cleanups and speedups to the hugetlb fault handling from Matthew Wilcox ("Change calling convention for ->huge_fault"). - Matthew Wilcox has also done some maintenance work on the MM subsystem documentation ("Improve mm documentation"). * tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (489 commits) maple_tree: shrink struct maple_tree maple_tree: clean up mas_wr_append() secretmem: convert page_is_secretmem() to folio_is_secretmem() nios2: fix flush_dcache_page() for usage from irq context hugetlb: add documentation for vma_kernel_pagesize() mm: add orphaned kernel-doc to the rst files. mm: fix clean_record_shared_mapping_range kernel-doc mm: fix get_mctgt_type() kernel-doc mm: fix kernel-doc warning from tlb_flush_rmaps() mm: remove enum page_entry_size mm: allow ->huge_fault() to be called without the mmap_lock held mm: move PMD_ORDER to pgtable.h mm: remove checks for pte_index memcg: remove duplication detection for mem_cgroup_uncharge_swap mm/huge_memory: work on folio->swap instead of page->private when splitting folio mm/swap: inline folio_set_swap_entry() and folio_swap_entry() mm/swap: use dedicated entry for swap in folio mm/swap: stop using page->private on tail pages for THP_SWAP selftests/mm: fix WARNING comparing pointer to 0 selftests: cgroup: fix test_kmem_memcg_deletion kernel mem check ... |
||
![]() |
bd6c11bc43 |
Networking changes for 6.6.
Core ---- - Increase size limits for to-be-sent skb frag allocations. This allows tun, tap devices and packet sockets to better cope with large writes operations. - Store netdevs in an xarray, to simplify iterating over netdevs. - Refactor nexthop selection for multipath routes. - Improve sched class lifetime handling. - Add backup nexthop ID support for bridge. - Implement drop reasons support in openvswitch. - Several data races annotations and fixes. - Constify the sk parameter of routing functions. - Prepend kernel version to netconsole message. Protocols --------- - Implement support for TCP probing the peer being under memory pressure. - Remove hard coded limitation on IPv6 specific info placement inside the socket struct. - Get rid of sysctl_tcp_adv_win_scale and use an auto-estimated per socket scaling factor. - Scaling-up the IPv6 expired route GC via a separated list of expiring routes. - In-kernel support for the TLS alert protocol. - Better support for UDP reuseport with connected sockets. - Add NEXT-C-SID support for SRv6 End.X behavior, reducing the SR header size. - Get rid of additional ancillary per MPTCP connection struct socket. - Implement support for BPF-based MPTCP packet schedulers. - Format MPTCP subtests selftests results in TAP. - Several new SMC 2.1 features including unique experimental options, max connections per lgr negotiation, max links per lgr negotiation. BPF --- - Multi-buffer support in AF_XDP. - Add multi uprobe BPF links for attaching multiple uprobes and usdt probes, which is significantly faster and saves extra fds. - Implement an fd-based tc BPF attach API (TCX) and BPF link support on top of it. - Add SO_REUSEPORT support for TC bpf_sk_assign. - Support new instructions from cpu v4 to simplify the generated code and feature completeness, for x86, arm64, riscv64. - Support defragmenting IPv(4|6) packets in BPF. - Teach verifier actual bounds of bpf_get_smp_processor_id() and fix perf+libbpf issue related to custom section handling. - Introduce bpf map element count and enable it for all program types. - Add a BPF hook in sys_socket() to change the protocol ID from IPPROTO_TCP to IPPROTO_MPTCP to cover migration for legacy. - Introduce bpf_me_mcache_free_rcu() and fix OOM under stress. - Add uprobe support for the bpf_get_func_ip helper. - Check skb ownership against full socket. - Support for up to 12 arguments in BPF trampoline. - Extend link_info for kprobe_multi and perf_event links. Netfilter --------- - Speed-up process exit by aborting ruleset validation if a fatal signal is pending. - Allow NLA_POLICY_MASK to be used with BE16/BE32 types. Driver API ---------- - Page pool optimizations, to improve data locality and cache usage. - Introduce ndo_hwtstamp_get() and ndo_hwtstamp_set() to avoid the need for raw ioctl() handling in drivers. - Simplify genetlink dump operations (doit/dumpit) providing them the common information already populated in struct genl_info. - Extend and use the yaml devlink specs to [re]generate the split ops. - Introduce devlink selective dumps, to allow SF filtering SF based on handle and other attributes. - Add yaml netlink spec for netlink-raw families, allow route, link and address related queries via the ynl tool. - Remove phylink legacy mode support. - Support offload LED blinking to phy. - Add devlink port function attributes for IPsec. New hardware / drivers ---------------------- - Ethernet: - Broadcom ASP 2.0 (72165) ethernet controller - MediaTek MT7988 SoC - Texas Instruments AM654 SoC - Texas Instruments IEP driver - Atheros qca8081 phy - Marvell 88Q2110 phy - NXP TJA1120 phy - WiFi: - MediaTek mt7981 support - Can: - Kvaser SmartFusion2 PCI Express devices - Allwinner T113 controllers - Texas Instruments tcan4552/4553 chips - Bluetooth: - Intel Gale Peak - Qualcomm WCN3988 and WCN7850 - NXP AW693 and IW624 - Mediatek MT2925 Drivers ------- - Ethernet NICs: - nVidia/Mellanox: - mlx5: - support UDP encapsulation in packet offload mode - IPsec packet offload support in eswitch mode - improve aRFS observability by adding new set of counters - extends MACsec offload support to cover RoCE traffic - dynamic completion EQs - mlx4: - convert to use auxiliary bus instead of custom interface logic - Intel - ice: - implement switchdev bridge offload, even for LAG interfaces - implement SRIOV support for LAG interfaces - igc: - add support for multiple in-flight TX timestamps - Broadcom: - bnxt: - use the unified RX page pool buffers for XDP and non-XDP - use the NAPI skb allocation cache - OcteonTX2: - support Round Robin scheduling HTB offload - TC flower offload support for SPI field - Freescale: - add XDP_TX feature support - AMD: - ionic: add support for PCI FLR event - sfc: - basic conntrack offload - introduce eth, ipv4 and ipv6 pedit offloads - ST Microelectronics: - stmmac: maximze PTP timestamping resolution - Virtual NICs: - Microsoft vNIC: - batch ringing RX queue doorbell on receiving packets - add page pool for RX buffers - Virtio vNIC: - add per queue interrupt coalescing support - Google vNIC: - add queue-page-list mode support - Ethernet high-speed switches: - nVidia/Mellanox (mlxsw): - add port range matching tc-flower offload - permit enslavement to netdevices with uppers - Ethernet embedded switches: - Marvell (mv88e6xxx): - convert to phylink_pcs - Renesas: - r8A779fx: add speed change support - rzn1: enables vlan support - Ethernet PHYs: - convert mv88e6xxx to phylink_pcs - WiFi: - Qualcomm Wi-Fi 7 (ath12k): - extremely High Throughput (EHT) PHY support - RealTek (rtl8xxxu): - enable AP mode for: RTL8192FU, RTL8710BU (RTL8188GU), RTL8192EU and RTL8723BU - RealTek (rtw89): - Introduce Time Averaged SAR (TAS) support - Connector: - support for event filtering Signed-off-by: Paolo Abeni <pabeni@redhat.com> -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmTt1ZoSHHBhYmVuaUBy ZWRoYXQuY29tAAoJECkkeY3MjxOkgFUP/REFaYWdWUvAzmWeezyx9dqgZMfSOjWq 9QvySiA94OAOcjIYkb7wfzQ5BBAZqaBQ/f8XqWwS1EDDDEBs8sP1cxmABKwW7Hsr qFRu2sOqLzKBk223d0jIgEocfQaFpGbF71gXoTlDivBjBi5UxWm9bF0XnbYWcKgO /QEvzNosi9uNdi85Fzmv62J6YzAdidEpwGsM7X2CfejwNRmStxAEg/NwvRR0Hyiq OJCo97omEgTRaUle8nc64PDx33u4h5kQ1BkaeHEv0rbE3hftFC2YPKn/InmqSFGz 6ew2xnrGPR37LCuAiCcIIv6yR7K0eu0iYJ7jXwZxBDqxGavEPuwWGBoCP6qFiitH ZLWhIrAUrdmSbySkTOCONhJ475qFAuQoYHYpZnX/bJZUHlSsb/9lwDJYJQGpVfd1 /daqJVSb7lhaifmNO1iNd/ibCIXq9zapwtkRwA897M8GkZBTsnVvazFld1Em+Se3 Bx6DSDUVBqVQ9fpZG2IAGD6odDwOzC1lF2IoceFvK9Ff6oE0psI+A0qNLMkHxZbW Qlo7LsNe53hpoCC+yHTfXX7e/X8eNt0EnCGOQJDusZ0Nr3K7H4LKFA0i8UBUK05n 4lKnnaSQW7GQgdofLWt103OMDR9GoDxpFsm7b1X9+AEk6Fz6tq50wWYeMZETUKYP DCW8VGFOZjZM =9CsR -----END PGP SIGNATURE----- Merge tag 'net-next-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Paolo Abeni: "Core: - Increase size limits for to-be-sent skb frag allocations. This allows tun, tap devices and packet sockets to better cope with large writes operations - Store netdevs in an xarray, to simplify iterating over netdevs - Refactor nexthop selection for multipath routes - Improve sched class lifetime handling - Add backup nexthop ID support for bridge - Implement drop reasons support in openvswitch - Several data races annotations and fixes - Constify the sk parameter of routing functions - Prepend kernel version to netconsole message Protocols: - Implement support for TCP probing the peer being under memory pressure - Remove hard coded limitation on IPv6 specific info placement inside the socket struct - Get rid of sysctl_tcp_adv_win_scale and use an auto-estimated per socket scaling factor - Scaling-up the IPv6 expired route GC via a separated list of expiring routes - In-kernel support for the TLS alert protocol - Better support for UDP reuseport with connected sockets - Add NEXT-C-SID support for SRv6 End.X behavior, reducing the SR header size - Get rid of additional ancillary per MPTCP connection struct socket - Implement support for BPF-based MPTCP packet schedulers - Format MPTCP subtests selftests results in TAP - Several new SMC 2.1 features including unique experimental options, max connections per lgr negotiation, max links per lgr negotiation BPF: - Multi-buffer support in AF_XDP - Add multi uprobe BPF links for attaching multiple uprobes and usdt probes, which is significantly faster and saves extra fds - Implement an fd-based tc BPF attach API (TCX) and BPF link support on top of it - Add SO_REUSEPORT support for TC bpf_sk_assign - Support new instructions from cpu v4 to simplify the generated code and feature completeness, for x86, arm64, riscv64 - Support defragmenting IPv(4|6) packets in BPF - Teach verifier actual bounds of bpf_get_smp_processor_id() and fix perf+libbpf issue related to custom section handling - Introduce bpf map element count and enable it for all program types - Add a BPF hook in sys_socket() to change the protocol ID from IPPROTO_TCP to IPPROTO_MPTCP to cover migration for legacy - Introduce bpf_me_mcache_free_rcu() and fix OOM under stress - Add uprobe support for the bpf_get_func_ip helper - Check skb ownership against full socket - Support for up to 12 arguments in BPF trampoline - Extend link_info for kprobe_multi and perf_event links Netfilter: - Speed-up process exit by aborting ruleset validation if a fatal signal is pending - Allow NLA_POLICY_MASK to be used with BE16/BE32 types Driver API: - Page pool optimizations, to improve data locality and cache usage - Introduce ndo_hwtstamp_get() and ndo_hwtstamp_set() to avoid the need for raw ioctl() handling in drivers - Simplify genetlink dump operations (doit/dumpit) providing them the common information already populated in struct genl_info - Extend and use the yaml devlink specs to [re]generate the split ops - Introduce devlink selective dumps, to allow SF filtering SF based on handle and other attributes - Add yaml netlink spec for netlink-raw families, allow route, link and address related queries via the ynl tool - Remove phylink legacy mode support - Support offload LED blinking to phy - Add devlink port function attributes for IPsec New hardware / drivers: - Ethernet: - Broadcom ASP 2.0 (72165) ethernet controller - MediaTek MT7988 SoC - Texas Instruments AM654 SoC - Texas Instruments IEP driver - Atheros qca8081 phy - Marvell 88Q2110 phy - NXP TJA1120 phy - WiFi: - MediaTek mt7981 support - Can: - Kvaser SmartFusion2 PCI Express devices - Allwinner T113 controllers - Texas Instruments tcan4552/4553 chips - Bluetooth: - Intel Gale Peak - Qualcomm WCN3988 and WCN7850 - NXP AW693 and IW624 - Mediatek MT2925 Drivers: - Ethernet NICs: - nVidia/Mellanox: - mlx5: - support UDP encapsulation in packet offload mode - IPsec packet offload support in eswitch mode - improve aRFS observability by adding new set of counters - extends MACsec offload support to cover RoCE traffic - dynamic completion EQs - mlx4: - convert to use auxiliary bus instead of custom interface logic - Intel - ice: - implement switchdev bridge offload, even for LAG interfaces - implement SRIOV support for LAG interfaces - igc: - add support for multiple in-flight TX timestamps - Broadcom: - bnxt: - use the unified RX page pool buffers for XDP and non-XDP - use the NAPI skb allocation cache - OcteonTX2: - support Round Robin scheduling HTB offload - TC flower offload support for SPI field - Freescale: - add XDP_TX feature support - AMD: - ionic: add support for PCI FLR event - sfc: - basic conntrack offload - introduce eth, ipv4 and ipv6 pedit offloads - ST Microelectronics: - stmmac: maximze PTP timestamping resolution - Virtual NICs: - Microsoft vNIC: - batch ringing RX queue doorbell on receiving packets - add page pool for RX buffers - Virtio vNIC: - add per queue interrupt coalescing support - Google vNIC: - add queue-page-list mode support - Ethernet high-speed switches: - nVidia/Mellanox (mlxsw): - add port range matching tc-flower offload - permit enslavement to netdevices with uppers - Ethernet embedded switches: - Marvell (mv88e6xxx): - convert to phylink_pcs - Renesas: - r8A779fx: add speed change support - rzn1: enables vlan support - Ethernet PHYs: - convert mv88e6xxx to phylink_pcs - WiFi: - Qualcomm Wi-Fi 7 (ath12k): - extremely High Throughput (EHT) PHY support - RealTek (rtl8xxxu): - enable AP mode for: RTL8192FU, RTL8710BU (RTL8188GU), RTL8192EU and RTL8723BU - RealTek (rtw89): - Introduce Time Averaged SAR (TAS) support - Connector: - support for event filtering" * tag 'net-next-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1806 commits) net: ethernet: mtk_wed: minor change in wed_{tx,rx}info_show net: ethernet: mtk_wed: add some more info in wed_txinfo_show handler net: stmmac: clarify difference between "interface" and "phy_interface" r8152: add vendor/device ID pair for D-Link DUB-E250 devlink: move devlink_notify_register/unregister() to dev.c devlink: move small_ops definition into netlink.c devlink: move tracepoint definitions into core.c devlink: push linecard related code into separate file devlink: push rate related code into separate file devlink: push trap related code into separate file devlink: use tracepoint_enabled() helper devlink: push region related code into separate file devlink: push param related code into separate file devlink: push resource related code into separate file devlink: push dpipe related code into separate file devlink: move and rename devlink_dpipe_send_and_alloc_skb() helper devlink: push shared buffer related code into separate file devlink: push port related code into separate file devlink: push object register/unregister notifications into separate helpers inet: fix IP_TRANSPARENT error handling ... |
||
![]() |
68df1baf15 |
selinux: use vma_is_initial_stack() and vma_is_initial_heap()
Use the helpers to simplify code. Link: https://lkml.kernel.org/r/20230728050043.59880-4-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Paul Moore <paul@paul-moore.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Cc: Eric Paris <eparis@parisplace.org> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Christian Göttsche <cgzones@googlemail.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: David Airlie <airlied@gmail.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
![]() |
8e4672d6f9 |
lsm: constify the 'file' parameter in security_binder_transfer_file()
SELinux registers the implementation for the "binder_transfer_file" hook. Looking at the function implementation we observe that the parameter "file" is not changing. Mark the "file" parameter of LSM hook security_binder_transfer_file() as "const" since it will not be changing in the LSM hook. Signed-off-by: Khadija Kamran <kamrankhadijadj@gmail.com> [PM: subject line whitespace fix] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
d80a8f1b58 |
vfs, security: Fix automount superblock LSM init problem, preventing NFS sb sharing
When NFS superblocks are created by automounting, their LSM parameters aren't set in the fs_context struct prior to sget_fc() being called, leading to failure to match existing superblocks. This bug leads to messages like the following appearing in dmesg when fscache is enabled: NFS: Cache volume key already in use (nfs,4.2,2,108,106a8c0,1,,,,100000,100000,2ee,3a98,1d4c,3a98,1) Fix this by adding a new LSM hook to load fc->security for submount creation. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/165962680944.3334508.6610023900349142034.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/165962729225.3357250.14350728846471527137.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/165970659095.2812394.6868894171102318796.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/166133579016.3678898.6283195019480567275.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/217595.1662033775@warthog.procyon.org.uk/ # v5 Fixes: |
||
![]() |
817199e006 |
selinux: revert SECINITSID_INIT support
This commit reverts
|
||
![]() |
6672efbb68 |
lsm: constify the 'target' parameter in security_capget()
Three LSMs register the implementations for the "capget" hook: AppArmor, SELinux, and the normal capability code. Looking at the function implementations we may observe that the first parameter "target" is not changing. Mark the first argument "target" of LSM hook security_capget() as "const" since it will not be changing in the LSM hook. cap_capget() LSM hook declaration exceeds the 80 characters per line limit. Split the function declaration to multiple lines to decrease the line length. Signed-off-by: Khadija Kamran <kamrankhadijadj@gmail.com> Acked-by: John Johansen <john.johansen@canonical.com> [PM: align the cap_capget() declaration, spelling fixes] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
64f18f8a8c |
selinux: update comment on selinux_hooks[]
After commit
|
||
![]() |
19c5b015d1 |
selinux: log about VM being executable by default
In case virtual memory is being marked as executable by default, SELinux checks regarding explicit potential dangerous use are disabled. Inform the user about it. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
3876043ad9 |
selinux: fix a 0/NULL mistmatch in ad_net_init_from_iif()
Use a NULL instead of a zero to resolve a int/pointer mismatch.
Cc: Paolo Abeni <pabeni@redhat.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202307210332.4AqFZfzI-lkp@intel.com/
Fixes:
|
||
![]() |
dd51fcd42f |
selinux: introduce and use lsm_ad_net_init*() helpers
Perf traces of network-related workload shows a measurable overhead inside the network-related selinux hooks while zeroing the lsm_network_audit struct. In most cases we can delay the initialization of such structure to the usage point, avoiding such overhead in a few cases. Additionally, the audit code accesses the IP address information only for AF_INET* families, and selinux_parse_skb() will fill-out the relevant fields in such cases. When the family field is zeroed or the initialization is followed by the mentioned parsing, the zeroing can be limited to the sk, family and netif fields. By factoring out the audit-data initialization to new helpers, this patch removes some duplicate code and gives small but measurable performance gain under UDP flood. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
0fe53224bf |
selinux: update my email address
Update my email address; MAINTAINERS was updated some time ago. Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
e5faa839c3 |
selinux: add missing newlines in pr_err() statements
The kernel print statements do not append an implicit newline to format strings. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> [PM: subject line tweak] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
90aa4f5e92 |
selinux: de-brand SELinux
Change "NSA SELinux" to just "SELinux" in Kconfig help text and comments. While NSA was the original primary developer and continues to help maintain SELinux, SELinux has long since transitioned to a wide community of developers and maintainers. SELinux has been part of the mainline Linux kernel for nearly 20 years now [1] and has received contributions from many individuals and organizations. [1] https://lore.kernel.org/lkml/Pine.LNX.4.44.0308082228470.1852-100000@home.osdl.org/ Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
a13479bb3c |
selinux: avoid implicit conversions in the LSM hooks
Use the identical types in assignments of local variables for the destination. Merge tail calls into return statements. Avoid using leading underscores for function local variable. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> [PM: subject line tweaks] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
5b52ad34f9 |
security: Constify sk in the sk_getsecid hook.
The sk_getsecid hook shouldn't need to modify its socket argument. Make it const so that callers of security_sk_classify_flow() can use a const struct sock *. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
5b0eea835d |
selinux: introduce an initial SID for early boot processes
Currently, SELinux doesn't allow distinguishing between kernel threads and userspace processes that are started before the policy is first loaded - both get the label corresponding to the kernel SID. The only way a process that persists from early boot can get a meaningful label is by doing a voluntary dyntransition or re-executing itself. Reusing the kernel label for userspace processes is problematic for several reasons: 1. The kernel is considered to be a privileged domain and generally needs to have a wide range of permissions allowed to work correctly, which prevents the policy writer from effectively hardening against early boot processes that might remain running unintentionally after the policy is loaded (they represent a potential extra attack surface that should be mitigated). 2. Despite the kernel being treated as a privileged domain, the policy writer may want to impose certain special limitations on kernel threads that may conflict with the requirements of intentional early boot processes. For example, it is a good hardening practice to limit what executables the kernel can execute as usermode helpers and to confine the resulting usermode helper processes. However, a (legitimate) process surviving from early boot may need to execute a different set of executables. 3. As currently implemented, overlayfs remembers the security context of the process that created an overlayfs mount and uses it to bound subsequent operations on files using this context. If an overlayfs mount is created before the SELinux policy is loaded, these "mounter" checks are made against the kernel context, which may clash with restrictions on the kernel domain (see 2.). To resolve this, introduce a new initial SID (reusing the slot of the former "init" initial SID) that will be assigned to any userspace process started before the policy is first loaded. This is easy to do, as we can simply label any process that goes through the bprm_creds_for_exec LSM hook with the new init-SID instead of propagating the kernel SID from the parent. To provide backwards compatibility for existing policies that are unaware of this new semantic of the "init" initial SID, introduce a new policy capability "userspace_initial_context" and set the "init" SID to the same context as the "kernel" SID unless this capability is set by the policy. Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
6bcdfd2cac |
security: Allow all LSMs to provide xattrs for inode_init_security hook
Currently, the LSM infrastructure supports only one LSM providing an xattr and EVM calculating the HMAC on that xattr, plus other inode metadata. Allow all LSMs to provide one or multiple xattrs, by extending the security blob reservation mechanism. Introduce the new lbs_xattr_count field of the lsm_blob_sizes structure, so that each LSM can specify how many xattrs it needs, and the LSM infrastructure knows how many xattr slots it should allocate. Modify the inode_init_security hook definition, by passing the full xattr array allocated in security_inode_init_security(), and the current number of xattr slots in that array filled by LSMs. The first parameter would allow EVM to access and calculate the HMAC on xattrs supplied by other LSMs, the second to not leave gaps in the xattr array, when an LSM requested but did not provide xattrs (e.g. if it is not initialized). Introduce lsm_get_xattr_slot(), which LSMs can call as many times as the number specified in the lbs_xattr_count field of the lsm_blob_sizes structure. During each call, lsm_get_xattr_slot() increments the number of filled xattrs, so that at the next invocation it returns the next xattr slot to fill. Cleanup security_inode_init_security(). Unify the !initxattrs and initxattrs case by simply not allocating the new_xattrs array in the former. Update the documentation to reflect the changes, and fix the description of the xattr name, as it is not allocated anymore. Adapt both SELinux and Smack to use the new definition of the inode_init_security hook, and to call lsm_get_xattr_slot() to obtain and fill the reserved slots in the xattr array. Move the xattr->name assignment after the xattr->value one, so that it is done only in case of successful memory allocation. Finally, change the default return value of the inode_init_security hook from zero to -EOPNOTSUPP, so that BPF LSM correctly follows the hook conventions. Reported-by: Nicolas Bouchinet <nicolas.bouchinet@clip-os.org> Link: https://lore.kernel.org/linux-integrity/Y1FTSIo+1x+4X0LS@archlinux/ Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> [PM: minor comment and variable tweaks, approved by RS] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
cec5fe7007 |
selinux: make labeled NFS work when mounted before policy load
Currently, when an NFS filesystem that supports passing LSM/SELinux
labels is mounted during early boot (before the SELinux policy is
loaded), it ends up mounted without the labeling support (i.e. with
Fedora policy all files get the generic NFS label
system_u:object_r:nfs_t:s0).
This is because the information that the NFS mount supports passing
labels (communicated to the LSM layer via the kern_flags argument of
security_set_mnt_opts()) gets lost and when the policy is loaded the
mount is initialized as if the passing is not supported.
Fix this by noting the "native labeling" in newsbsec->flags (using a new
SE_SBNATIVE flag) on the pre-policy-loaded call of
selinux_set_mnt_opts() and then making sure it is respected on the
second call from delayed_superblock_init().
Additionally, make inode_doinit_with_dentry() initialize the inode's
label from its extended attributes whenever it doesn't find it already
intitialized by the filesystem. This is needed to properly initialize
pre-existing inodes when delayed_superblock_init() is called. It should
not trigger in any other cases (and if it does, it's still better to
initialize the correct label instead of leaving the inode unlabeled).
Fixes:
|
||
![]() |
85c3222ddd |
selinux: Implement mptcp_add_subflow hook
Newly added subflows should inherit the LSM label from the associated MPTCP socket regardless of the current context. This patch implements the above copying sid and class from the MPTCP socket context, deleting the existing subflow label, if any, and then re-creating the correct one. The new helper reuses the selinux_netlbl_sk_security_free() function, and the latter can end-up being called multiple times with the same argument; we additionally need to make it idempotent. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
4158cb6000 |
selinux: declare read-only data arrays const
The array of mount tokens in only used in match_opt_prefix() and never modified. The array of symtab names is never modified and only used in the DEBUG_HASHES configuration as output. The array of files for the SElinux filesystem sub-directory `ss` is similar to the other `struct tree_descr` usages only read from to construct the containing entries. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
3d9047a064 |
selinux: adjust typos in comments
Found by codespell(1) Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
f22f9aaf6c |
selinux: remove the runtime disable functionality
After working with the larger SELinux-based distros for several years, we're finally at a place where we can disable the SELinux runtime disable functionality. The existing kernel deprecation notice explains the functionality and why we want to remove it: The selinuxfs "disable" node allows SELinux to be disabled at runtime prior to a policy being loaded into the kernel. If disabled via this mechanism, SELinux will remain disabled until the system is rebooted. The preferred method of disabling SELinux is via the "selinux=0" boot parameter, but the selinuxfs "disable" node was created to make it easier for systems with primitive bootloaders that did not allow for easy modification of the kernel command line. Unfortunately, allowing for SELinux to be disabled at runtime makes it difficult to secure the kernel's LSM hooks using the "__ro_after_init" feature. It is that last sentence, mentioning the '__ro_after_init' hardening, which is the real motivation for this change, and if you look at the diffstat you'll see that the impact of this patch reaches across all the different LSMs, helping prevent tampering at the LSM hook level. From a SELinux perspective, it is important to note that if you continue to disable SELinux via "/etc/selinux/config" it may appear that SELinux is disabled, but it is simply in an uninitialized state. If you load a policy with `load_policy -i`, you will see SELinux come alive just as if you had loaded the policy during early-boot. It is also worth noting that the "/sys/fs/selinux/disable" file is always writable now, regardless of the Kconfig settings, but writing to the file has no effect on the system, other than to display an error on the console if a non-zero/true value is written. Finally, in the several years where we have been working on deprecating this functionality, there has only been one instance of someone mentioning any user visible breakage. In this particular case it was an individual's kernel test system, and the workaround documented in the deprecation notice ("selinux=0" on the kernel command line) resolved the issue without problem. Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
a7e4676e8e |
selinux: remove the 'checkreqprot' functionality
We originally promised that the SELinux 'checkreqprot' functionality would be removed no sooner than June 2021, and now that it is March 2023 it seems like it is a good time to do the final removal. The deprecation notice in the kernel provides plenty of detail on why 'checkreqprot' is not desirable, with the key point repeated below: This was a compatibility mechanism for legacy userspace and for the READ_IMPLIES_EXEC personality flag. However, if set to 1, it weakens security by allowing mappings to be made executable without authorization by policy. The default value of checkreqprot at boot was changed starting in Linux v4.4 to 0 (i.e. check the actual protection), and Android and Linux distributions have been explicitly writing a "0" to /sys/fs/selinux/checkreqprot during initialization for some time. Along with the official deprecation notice, we have been discussing this on-list and directly with several of the larger SELinux-based distros and everyone is happy to see this feature finally removed. In an attempt to catch all of the smaller, and DIY, Linux systems we have been writing a deprecation notice URL into the kernel log, along with a growing ssleep() penalty, when admins enabled checkreqprot at runtime or via the kernel command line. We have yet to have anyone come to us and raise an objection to the deprecation or planned removal. It is worth noting that while this patch removes the checkreqprot functionality, it leaves the user visible interfaces (kernel command line and selinuxfs file) intact, just inert. This should help prevent breakages with existing userspace tools that correctly, but unnecessarily, disable checkreqprot at boot or runtime. Admins that attempt to enable checkreqprot will be met with a removal message in the kernel log. Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
e67b79850f |
selinux: stop passing selinux_state pointers and their offspring
Linus observed that the pervasive passing of selinux_state pointers
introduced by me in commit
|
||
![]() |
01beba7957
|
fs: port inode_owner_or_capable() to mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in
|
||
![]() |
700b794052
|
fs: port acl to mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in
|
||
![]() |
39f60c1cce
|
fs: port xattr to mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in
|
||
![]() |
4609e1f18e
|
fs: port ->permission() to pass mnt_idmap
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in
|
||
![]() |
c76ff350bd |
lsm/stable-6.2 PR 20221212
-----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmOXmxkUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXMPXg//cxfYC8lRtVpuGNCZWDietSiHzpzu +qFntaTplvybJMQX0HfgNee5cTBZM+W5mp1BHRcZInvV5LRhyrVtgsxDBifutE4x LyUJAw5SkiPdRC+XLDIRLKiZCobFBLVs2zO+qibIqsyR60pFjU6WXBLbJfidXBFR yWudDbLU0YhQJCHdNHNqnHCgqrEculxn6q3QPvm/DX0xzBwkFHSSYBkGNvHW2ZTA lKNreEOwEk5DTLIKjP4bJ72ixp0xbshw5CXuxtwB/12/4h8QbWbJVQLlIeZrTLmp zQXQLJ3pCqKJ2OUCgMDK+wmkvLezd80BV3Due7KX0pT0YRDygoh5QEpZ5/8k8eG7 prxToh2gJWk2htfJF6kgMpAh9Jqewcke4BysbYVM/427OPZYwQqLDZDGOzbtT6pl FYF+adN9wwkAErnHnPlzYipUEpBWurbjtsV8KFWNERoZ4YmzfSPEisRqGIHDGRws bTyq/7qs5FXkb1zULELj8V+S2ULsmxPqsxJ63p9di54Uo9lHK0I+0IUtajGDdfze psAasa9DD/oH2PAbSmpQ5Xo9XyfHRXsVuz1twEmEA14ML0m4wHbNWVHaK0aaXVdG kJKSDSjMsiV+GiwNo7ISJ4pVdUpnMI/iZSghFfV28cJslNhJDeaREHaE/Wtn1/xF /bCVmEfS16UoJsQ= =klFk -----END PGP SIGNATURE----- Merge tag 'lsm-pr-20221212' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm updates from Paul Moore: - Improve the error handling in the device cgroup such that memory allocation failures when updating the access policy do not potentially alter the policy. - Some minor fixes to reiserfs to ensure that it properly releases LSM-related xattr values. - Update the security_socket_getpeersec_stream() LSM hook to take sockptr_t values. Previously the net/BPF folks updated the getsockopt code in the network stack to leverage the sockptr_t type to make it easier to pass both kernel and __user pointers, but unfortunately when they did so they didn't convert the LSM hook. While there was/is no immediate risk by not converting the LSM hook, it seems like this is a mistake waiting to happen so this patch proactively does the LSM hook conversion. - Convert vfs_getxattr_alloc() to return an int instead of a ssize_t and cleanup the callers. Internally the function was never going to return anything larger than an int and the callers were doing some very odd things casting the return value; this patch fixes all that and helps bring a bit of sanity to vfs_getxattr_alloc() and its callers. - More verbose, and helpful, LSM debug output when the system is booted with "lsm.debug" on the command line. There are examples in the commit description, but the quick summary is that this patch provides better information about which LSMs are enabled and the ordering in which they are processed. - General comment and kernel-doc fixes and cleanups. * tag 'lsm-pr-20221212' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: lsm: Fix description of fs_context_parse_param lsm: Add/fix return values in lsm_hooks.h and fix formatting lsm: Clarify documentation of vm_enough_memory hook reiserfs: Add missing calls to reiserfs_security_free() lsm,fs: fix vfs_getxattr_alloc() return type and caller error paths device_cgroup: Roll back to original exceptions after copy failure LSM: Better reporting of actual LSMs at boot lsm: make security_socket_getpeersec_stream() sockptr_t safe audit: Fix some kernel-doc warnings lsm: remove obsoleted comments for security hooks fs: edit a comment made in bad taste |
||
![]() |
b10b9c342f |
lsm: make security_socket_getpeersec_stream() sockptr_t safe
Commit
|
||
![]() |
1bdeb21862
|
selinux: implement get, set and remove acl hook
The current way of setting and getting posix acls through the generic xattr interface is error prone and type unsafe. The vfs needs to interpret and fixup posix acls before storing or reporting it to userspace. Various hacks exist to make this work. The code is hard to understand and difficult to maintain in it's current form. Instead of making this work by hacking posix acls through xattr handlers we are building a dedicated posix acl api around the get and set inode operations. This removes a lot of hackiness and makes the codepaths easier to maintain. A lot of background can be found in [1]. So far posix acls were passed as a void blob to the security and integrity modules. Some of them like evm then proceed to interpret the void pointer and convert it into the kernel internal struct posix acl representation to perform their integrity checking magic. This is obviously pretty problematic as that requires knowledge that only the vfs is guaranteed to have and has lead to various bugs. Add a proper security hook for setting posix acls and pass down the posix acls in their appropriate vfs format instead of hacking it through a void pointer stored in the uapi format. I spent considerate time in the security module infrastructure and audited all codepaths. SELinux has no restrictions based on the posix acl values passed through it. The capability hook doesn't need to be called either because it only has restrictions on security.* xattrs. So these are all fairly simply hooks for SELinux. Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1] Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> |
||
![]() |
4c0ed7d8d6 |
whack-a-mole: constifying struct path *
-----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCYzxmRQAKCRBZ7Krx/gZQ 6+/kAQD2xyf+i4zOYVBr1NB3qBbhVS1zrni1NbC/kT3dJPgTvwEA7z7eqwnrN4zg scKFP8a3yPoaQBfs4do5PolhuSr2ngA= =NBI+ -----END PGP SIGNATURE----- Merge tag 'pull-path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs constification updates from Al Viro: "whack-a-mole: constifying struct path *" * tag 'pull-path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: ecryptfs: constify path spufs: constify path nd_jump_link(): constify path audit_init_parent(): constify path __io_setxattr(): constify path do_proc_readlink(): constify path overlayfs: constify path fs/notify: constify path may_linkat(): constify path do_sys_name_to_handle(): constify path ->getprocattr(): attribute name is const char *, TYVM... |
||
![]() |
26b84401da |
lsm/stable-6.1 PR 20221003
-----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmM68YIUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXOTbA//TR8i+Wy8iswUCmtfmYg91h1uebpl /kjNsSmfgivAUTGamr3eN2WRlGhZfkFDPIHa25uybSA6Q+75p4lst83Rt3HDbjkv Ga7grCXnHwSDwJoHOSeFh0pojV2u7Zvfmiib2U5hPZEmd3kBw3NCgAJVcSGN80B2 dct36fzZNXjvpWDbygmFtRRkmEseslSkft8bUVvNZBP+B0zvv3vcNY1QFuKuK+W2 8wWpvO/cCSmke5i2c2ktHSk2f8/Y6n26Ik/OTHcTVfoKZLRaFbXEzLyxzLrNWd6m hujXgcxszTtHdmoXx+J6uBauju7TR8pi1x8mO2LSGrlpRc1cX0A5ED8WcH71+HVE 8L1fIOmZShccPZn8xRok7oYycAUm/gIfpmSLzmZA76JsZYAe+mp9Ze9FA6fZtSwp 7Q/rfw/Rlz25WcFBe4xypP078HkOmqutkCk2zy5liR+cWGrgy/WKX15vyC0TaPrX tbsRKuCLkipgfXrTk0dX3kmhz+3bJYjqeZEt7sfPSZYpaOGkNXVmAW0wnCOTuLMU +8pIVktvQxMmACEj2gBMz11iooR4DpWLxOcQQR/impgCpNdZ60nA0a6KPJoIXC+5 NfTa422FZkc99QRVblUZyWSgJBW78Z3ZAQcQlo1AGLlFydbfrSFTRLbmNJZo/Nkl KwpGvWs5nB0rVw0= =VZl5 -----END PGP SIGNATURE----- Merge tag 'lsm-pr-20221003' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull LSM updates from Paul Moore: "Seven patches for the LSM layer and we've got a mix of trivial and significant patches. Highlights below, starting with the smaller bits first so they don't get lost in the discussion of the larger items: - Remove some redundant NULL pointer checks in the common LSM audit code. - Ratelimit the lockdown LSM's access denial messages. With this change there is a chance that the last visible lockdown message on the console is outdated/old, but it does help preserve the initial series of lockdown denials that started the denial message flood and my gut feeling is that these might be the more valuable messages. - Open userfaultfds as readonly instead of read/write. While this code obviously lives outside the LSM, it does have a noticeable impact on the LSMs with Ondrej explaining the situation in the commit description. It is worth noting that this patch languished on the VFS list for over a year without any comments (objections or otherwise) so I took the liberty of pulling it into the LSM tree after giving fair notice. It has been in linux-next since the end of August without any noticeable problems. - Add a LSM hook for user namespace creation, with implementations for both the BPF LSM and SELinux. Even though the changes are fairly small, this is the bulk of the diffstat as we are also including BPF LSM selftests for the new hook. It's also the most contentious of the changes in this pull request with Eric Biederman NACK'ing the LSM hook multiple times during its development and discussion upstream. While I've never taken NACK's lightly, I'm sending these patches to you because it is my belief that they are of good quality, satisfy a long-standing need of users and distros, and are in keeping with the existing nature of the LSM layer and the Linux Kernel as a whole. The patches in implement a LSM hook for user namespace creation that allows for a granular approach, configurable at runtime, which enables both monitoring and control of user namespaces. The general consensus has been that this is far preferable to the other solutions that have been adopted downstream including outright removal from the kernel, disabling via system wide sysctls, or various other out-of-tree mechanisms that users have been forced to adopt since we haven't been able to provide them an upstream solution for their requests. Eric has been steadfast in his objections to this LSM hook, explaining that any restrictions on the user namespace could have significant impact on userspace. While there is the possibility of impacting userspace, it is important to note that this solution only impacts userspace when it is requested based on the runtime configuration supplied by the distro/admin/user. Frederick (the pathset author), the LSM/security community, and myself have tried to work with Eric during development of this patchset to find a mutually acceptable solution, but Eric's approach and unwillingness to engage in a meaningful way have made this impossible. I have CC'd Eric directly on this pull request so he has a chance to provide his side of the story; there have been no objections outside of Eric's" * tag 'lsm-pr-20221003' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: lockdown: ratelimit denial messages userfaultfd: open userfaultfds with O_RDONLY selinux: Implement userns_create hook selftests/bpf: Add tests verifying bpf lsm userns_create hook bpf-lsm: Make bpf_lsm_userns_create() sleepable security, lsm: Introduce security_create_user_ns() lsm: clean up redundant NULL pointer check |
||
![]() |
e816da29bc |
selinux/stable-6.1 PR 20221003
-----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmM68ZsUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXOAtRAAw/lcyPoyN8ia6+PPihRtAKGUFIf5 +IdEPYfCqkGghqB7BRDl5bXOLFgpY/m/41g+xFvzJ0fhVPLa7UWB//N7yTu3OnW/ vXz1wn0EJAeDlLbPzWd6V/SpcxJ1WPzjHj2B3YXNWnukfMjCnPIA8XlZc18zAWS1 /OOEBoOo/a/8Giw2l1bEXxfmDI20NrXNL3vWKQ+Bbhg2PJaH/FTk4DNxopt84o28 vA+cbfQcOOjeRjBuncnTp9/b244ojeM+lRSJZozGTogFIeDUp3KW1D7NHqNwyX12 seDooqLEP25vP+kQh8zH7gvacpoeDLz40bSpd+MKKj02IxKGikykWuvtlFWY3xNB o1mT4SJhh3JcewS7gh6P5aESSSgLg9zb3zMGtjHhtz+HHi/Sq7PK7xJgrnKOBNgu CLIu3L+5vJpAgrsze2tIcwRUySIzDKnfgw8Oz7zaS2lOTJ58emz00QwEioHMQufK 8gZXTvZykJAtLF19PJw+mHKu38hbdD/4vt8AFuIgJzFkjWKzaZAxUBT+3p/uaLHG 2PegjKzpCqH9vZ/HCdYI42OB8TKiPU3eBtYZ2eP3h7cdDu++tp1rf0hwHQrwE2AD PRuoCaBYOTUedbR8CV07fSSGFnZvlPnuk9yB7/eztV2thBQG28ALGxVhWadn4ap/ UIFgCs5QDRj11u8= =BQ+i -----END PGP SIGNATURE----- Merge tag 'selinux-pr-20221003' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull SELinux updates from Paul Moore: "Six SELinux patches, all are simple and easily understood, but a list of the highlights is below: - Use 'grep -E' instead of 'egrep' in the SELinux policy install script. Fun fact, this seems to be GregKH's *second* dedicated SELinux patch since we transitioned to git (ignoring merges, the SPDX stuff, and a trivial fs reference removal when lustre was yanked); the first was back in 2011 when selinuxfs was placed in /sys/fs/selinux. Oh, the memories ... - Convert the SELinux policy boolean values to use signed integer types throughout the SELinux kernel code. Prior to this we were using a mix of signed and unsigned integers which was probably okay in this particular case, but it is definitely not a good idea in general. - Remove a reference to the SELinux runtime disable functionality in /etc/selinux/config as we are in the process of deprecating that. See [1] for more background on this if you missed the previous notes on the deprecation. - Minor cleanups: remove unneeded variables and function parameter constification" Link: https://github.com/SELinuxProject/selinux-kernel/wiki/DEPRECATE-runtime-disable [1] * tag 'selinux-pr-20221003' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: remove runtime disable message in the install_policy.sh script selinux: use "grep -E" instead of "egrep" selinux: remove the unneeded result variable selinux: declare read-only parameters const selinux: use int arrays for boolean values selinux: remove an unneeded variable in sel_make_class_dir_entries() |
||
![]() |
09b71adab0 |
selinux: remove the unneeded result variable
Return the value avc_has_perm() directly instead of storing it in another redundant variable. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Xu Panda <xu.panda@zte.com.cn> [PM: subject line tweak] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
c8e477c649 |
->getprocattr(): attribute name is const char *, TYVM...
cast of ->d_name.name to char * is completely wrong - nothing is allowed to modify its contents. Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Acked-by: Paul Moore <paul@paul-moore.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> |
||
![]() |
f4d653dcaa |
selinux: implement the security_uring_cmd() LSM hook
Add a SELinux access control for the iouring IORING_OP_URING_CMD
command. This includes the addition of a new permission in the
existing "io_uring" object class: "cmd". The subject of the new
permission check is the domain of the process requesting access, the
object is the open file which points to the device/file that is the
target of the IORING_OP_URING_CMD operation. A sample policy rule
is shown below:
allow <domain> <file>:io_uring { cmd };
Cc: stable@vger.kernel.org
Fixes:
|
||
![]() |
ed5d44d42c |
selinux: Implement userns_create hook
Unprivileged user namespace creation is an intended feature to enable sandboxing, however this feature is often used to as an initial step to perform a privilege escalation attack. This patch implements a new user_namespace { create } access control permission to restrict which domains allow or deny user namespace creation. This is necessary for system administrators to quickly protect their systems while waiting for vulnerability patches to be applied. This permission can be used in the following way: allow domA_t domA_t : user_namespace { create }; Signed-off-by: Frederick Lawler <fred@cloudflare.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
79802ada87 |
selinux/stable-6.0 PR 20220801
-----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmLoEeIUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNSOhAAwWwRcmcHnk+k2agT9QjKrLo26NCO MQLE89o4y2ChEFHxC7F7SKoQRxtfYa323p1vmlGzKrlB+IZ6oqERVp4QNQQbXsfn n9VvVpxjRNHAetcRhCM9ZOchWjUdw6AMaJ8e3fdRNRESadAUUFDxifw1wpjgG9+i LmtDbfZ7vLs2grTf9OZy3JIl1VF3lVRUTI7ZBQggfJncMa+LXNWdVNmEe3yfyboA 1MwpSao7K2si0hBGAQo/UGQz4b19Tm4xMg8bSy7oTsP5Lae5ciPkeI3qazvs9usp WScZYhQ8NugqLbDbjs7dm6QCpj4x3dUs6ei48LKe3GF2mcGesFfOPo9sNHao4kKv C9t0f9qw+EhGvnNL7uQIDDf8OuTjuLWDvZSrMLID/IJKFF5NJ3y+XzaS9aPM3VEY qyOsX+cEzheXGhD6xE1sCo+AyPUDYqNDMIKBj2wlIGCKlzDGa8RT6VsQuvgf3c3K 43CnRCQeWDWOHCq3MnRe/fmYtW+JB7tsXiKAq4OJADacwPP36bsP3bqU8AlWYwDt tnuMa+LKusHnMEQpMPI8FW8qGdxwGSen+mymfLFIMgtwNGkV7WGRJ6Lbyn0SaR6v HyXgZASIOQRnamK3yZCDpxo0K81IVxPWJIjHyg53znqT5TCpXccPyV4HwbJKI/KG 8PtHrXOdPOGCZ2g= =WWq1 -----END PGP SIGNATURE----- Merge tag 'selinux-pr-20220801' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: "A relatively small set of patches for SELinux this time, eight patches in total with really only one significant change. The highlights are: - Add support for proper labeling of memfd_secret anonymous inodes. This will allow LSMs that implement the anonymous inode hooks to apply security policy to memfd_secret() fds. - Various small improvements to memory management: fixed leaks, freed memory when needed, boundary checks. - Hardened the selinux_audit_data struct with __randomize_layout. - A minor documentation tweak to fix a formatting/style issue" * tag 'selinux-pr-20220801' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: selinux_add_opt() callers free memory selinux: Add boundary check in put_entry() selinux: fix memleak in security_read_state_kernel() docs: selinux: add '=' signs to kernel boot options mm: create security context for memfd_secret inodes selinux: fix typos in comments selinux: drop unnecessary NULL check selinux: add __randomize_layout to selinux_audit_data |
||
![]() |
ef54ccb616 |
selinux: selinux_add_opt() callers free memory
The selinux_add_opt() function may need to allocate memory for the mount options if none has already been allocated, but there is no need to free that memory on error as the callers handle that. Drop the existing kfree() on error to help increase consistency in the selinux_add_opt() error handling. This patch also changes selinux_add_opt() to return -EINVAL when the mount option value, @s, is NULL. It currently return -ENOMEM. Link: https://lore.kernel.org/lkml/20220611090550.135674-1-xiujianfeng@huawei.com/T/ Suggested-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> [PM: fix subject, rework commit description language] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
cad140d008 |
selinux: free contexts previously transferred in selinux_add_opt()
`selinux_add_opt()` stopped taking ownership of the passed context since commit |
||
![]() |
9691e4f9ba |
selinux: fix typos in comments
Signed-off-by: Jonas Lindner <jolindner@gmx.de> [PM: fixed duplicated subject line] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
4d3d0ed60e |
selinux: drop unnecessary NULL check
Commit
|
||
![]() |
c29722fad4 |
selinux: log anon inode class name
Log the anonymous inode class name in the security hook inode_init_security_anon. This name is the key for name based type transitions on the anon_inode security class on creation. Example: type=AVC msg=audit(02/16/22 22:02:50.585:216) : avc: granted \ { create } for pid=2136 comm=mariadbd anonclass=[io_uring] \ scontext=system_u:system_r:mysqld_t:s0 \ tcontext=system_u:system_r:mysqld_iouring_t:s0 tclass=anon_inode Add a new LSM audit data type holding the inode and the class name. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> [PM: adjusted 'anonclass' to be a trusted string, cgzones approved] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
81200b0265 |
selinux: checkreqprot is deprecated, add some ssleep() discomfort
The checkreqprot functionality was disabled by default back in
Linux v4.4 (2015) with commit
|
||
![]() |
0a9876f36b |
selinux: Remove redundant assignments
Get rid of redundant assignments which end up in values not being read either because they are overwritten or the function ends. Reported by clang-tidy [deadcode.DeadStores] Signed-off-by: Michal Orzel <michalorzel.eng@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
1930a6e739 |
ptrace: Cleanups for v5.18
This set of changes removes tracehook.h, moves modification of all of the ptrace fields inside of siglock to remove races, adds a missing permission check to ptrace.c The removal of tracehook.h is quite significant as it has been a major source of confusion in recent years. Much of that confusion was around task_work and TIF_NOTIFY_SIGNAL (which I have now decoupled making the semantics clearer). For people who don't know tracehook.h is a vestiage of an attempt to implement uprobes like functionality that was never fully merged, and was later superseeded by uprobes when uprobes was merged. For many years now we have been removing what tracehook functionaly a little bit at a time. To the point where now anything left in tracehook.h is some weird strange thing that is difficult to understand. Eric W. Biederman (15): ptrace: Move ptrace_report_syscall into ptrace.h ptrace/arm: Rename tracehook_report_syscall report_syscall ptrace: Create ptrace_report_syscall_{entry,exit} in ptrace.h ptrace: Remove arch_syscall_{enter,exit}_tracehook ptrace: Remove tracehook_signal_handler task_work: Remove unnecessary include from posix_timers.h task_work: Introduce task_work_pending task_work: Call tracehook_notify_signal from get_signal on all architectures task_work: Decouple TIF_NOTIFY_SIGNAL and task_work signal: Move set_notify_signal and clear_notify_signal into sched/signal.h resume_user_mode: Remove #ifdef TIF_NOTIFY_RESUME in set_notify_resume resume_user_mode: Move to resume_user_mode.h tracehook: Remove tracehook.h ptrace: Move setting/clearing ptrace_message into ptrace_stop ptrace: Return the signal to continue with from ptrace_stop Jann Horn (1): ptrace: Check PTRACE_O_SUSPEND_SECCOMP permission on PTRACE_SEIZE Yang Li (1): ptrace: Remove duplicated include in ptrace.c MAINTAINERS | 1 - arch/Kconfig | 5 +- arch/alpha/kernel/ptrace.c | 5 +- arch/alpha/kernel/signal.c | 4 +- arch/arc/kernel/ptrace.c | 5 +- arch/arc/kernel/signal.c | 4 +- arch/arm/kernel/ptrace.c | 12 +- arch/arm/kernel/signal.c | 4 +- arch/arm64/kernel/ptrace.c | 14 +-- arch/arm64/kernel/signal.c | 4 +- arch/csky/kernel/ptrace.c | 5 +- arch/csky/kernel/signal.c | 4 +- arch/h8300/kernel/ptrace.c | 5 +- arch/h8300/kernel/signal.c | 4 +- arch/hexagon/kernel/process.c | 4 +- arch/hexagon/kernel/signal.c | 1 - arch/hexagon/kernel/traps.c | 6 +- arch/ia64/kernel/process.c | 4 +- arch/ia64/kernel/ptrace.c | 6 +- arch/ia64/kernel/signal.c | 1 - arch/m68k/kernel/ptrace.c | 5 +- arch/m68k/kernel/signal.c | 4 +- arch/microblaze/kernel/ptrace.c | 5 +- arch/microblaze/kernel/signal.c | 4 +- arch/mips/kernel/ptrace.c | 5 +- arch/mips/kernel/signal.c | 4 +- arch/nds32/include/asm/syscall.h | 2 +- arch/nds32/kernel/ptrace.c | 5 +- arch/nds32/kernel/signal.c | 4 +- arch/nios2/kernel/ptrace.c | 5 +- arch/nios2/kernel/signal.c | 4 +- arch/openrisc/kernel/ptrace.c | 5 +- arch/openrisc/kernel/signal.c | 4 +- arch/parisc/kernel/ptrace.c | 7 +- arch/parisc/kernel/signal.c | 4 +- arch/powerpc/kernel/ptrace/ptrace.c | 8 +- arch/powerpc/kernel/signal.c | 4 +- arch/riscv/kernel/ptrace.c | 5 +- arch/riscv/kernel/signal.c | 4 +- arch/s390/include/asm/entry-common.h | 1 - arch/s390/kernel/ptrace.c | 1 - arch/s390/kernel/signal.c | 5 +- arch/sh/kernel/ptrace_32.c | 5 +- arch/sh/kernel/signal_32.c | 4 +- arch/sparc/kernel/ptrace_32.c | 5 +- arch/sparc/kernel/ptrace_64.c | 5 +- arch/sparc/kernel/signal32.c | 1 - arch/sparc/kernel/signal_32.c | 4 +- arch/sparc/kernel/signal_64.c | 4 +- arch/um/kernel/process.c | 4 +- arch/um/kernel/ptrace.c | 5 +- arch/x86/kernel/ptrace.c | 1 - arch/x86/kernel/signal.c | 5 +- arch/x86/mm/tlb.c | 1 + arch/xtensa/kernel/ptrace.c | 5 +- arch/xtensa/kernel/signal.c | 4 +- block/blk-cgroup.c | 2 +- fs/coredump.c | 1 - fs/exec.c | 1 - fs/io-wq.c | 6 +- fs/io_uring.c | 11 +- fs/proc/array.c | 1 - fs/proc/base.c | 1 - include/asm-generic/syscall.h | 2 +- include/linux/entry-common.h | 47 +------- include/linux/entry-kvm.h | 2 +- include/linux/posix-timers.h | 1 - include/linux/ptrace.h | 81 ++++++++++++- include/linux/resume_user_mode.h | 64 ++++++++++ include/linux/sched/signal.h | 17 +++ include/linux/task_work.h | 5 + include/linux/tracehook.h | 226 ----------------------------------- include/uapi/linux/ptrace.h | 2 +- kernel/entry/common.c | 19 +-- kernel/entry/kvm.c | 9 +- kernel/exit.c | 3 +- kernel/livepatch/transition.c | 1 - kernel/ptrace.c | 47 +++++--- kernel/seccomp.c | 1 - kernel/signal.c | 62 +++++----- kernel/task_work.c | 4 +- kernel/time/posix-cpu-timers.c | 1 + mm/memcontrol.c | 2 +- security/apparmor/domain.c | 1 - security/selinux/hooks.c | 1 - 85 files changed, 372 insertions(+), 495 deletions(-) Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEgjlraLDcwBA2B+6cC/v6Eiajj0AFAmJCQkoACgkQC/v6Eiaj j0DCWQ/5AZVFU+hX32obUNCLackHTwgcCtSOs3JNBmNA/zL/htPiYYG0ghkvtlDR Dw5J5DnxC6P7PVAdAqrpvx2uX2FebHYU0bRlyLx8LYUEP5dhyNicxX9jA882Z+vw Ud0Ue9EojwGWS76dC9YoKUj3slThMATbhA2r4GVEoof8fSNJaBxQIqath44t0FwU DinWa+tIOvZANGBZr6CUUINNIgqBIZCH/R4h6ArBhMlJpuQ5Ufk2kAaiWFwZCkX4 0LuuAwbKsCKkF8eap5I2KrIg/7zZVgxAg9O3cHOzzm8OPbKzRnNnQClcDe8perqp S6e/f3MgpE+eavd1EiLxevZ660cJChnmikXVVh8ZYYoefaMKGqBaBSsB38bNcLjY 3+f2dB+TNBFRnZs1aCujK3tWBT9QyjZDKtCBfzxDNWBpXGLhHH6j6lA5Lj+Cef5K /HNHFb+FuqedlFZh5m1Y+piFQ70hTgCa2u8b+FSOubI2hW9Zd+WzINV0ANaZ2LvZ 4YGtcyDNk1q1+c87lxP9xMRl/xi6rNg+B9T2MCo4IUnHgpSVP6VEB3osgUmrrrN0 eQlUI154G/AaDlqXLgmn1xhRmlPGfmenkxpok1AuzxvNJsfLKnpEwQSc13g3oiZr disZQxNY0kBO2Nv3G323Z6PLinhbiIIFez6cJzK5v0YJ2WtO3pY= =uEro -----END PGP SIGNATURE----- Merge tag 'ptrace-cleanups-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull ptrace cleanups from Eric Biederman: "This set of changes removes tracehook.h, moves modification of all of the ptrace fields inside of siglock to remove races, adds a missing permission check to ptrace.c The removal of tracehook.h is quite significant as it has been a major source of confusion in recent years. Much of that confusion was around task_work and TIF_NOTIFY_SIGNAL (which I have now decoupled making the semantics clearer). For people who don't know tracehook.h is a vestiage of an attempt to implement uprobes like functionality that was never fully merged, and was later superseeded by uprobes when uprobes was merged. For many years now we have been removing what tracehook functionaly a little bit at a time. To the point where anything left in tracehook.h was some weird strange thing that was difficult to understand" * tag 'ptrace-cleanups-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: ptrace: Remove duplicated include in ptrace.c ptrace: Check PTRACE_O_SUSPEND_SECCOMP permission on PTRACE_SEIZE ptrace: Return the signal to continue with from ptrace_stop ptrace: Move setting/clearing ptrace_message into ptrace_stop tracehook: Remove tracehook.h resume_user_mode: Move to resume_user_mode.h resume_user_mode: Remove #ifdef TIF_NOTIFY_RESUME in set_notify_resume signal: Move set_notify_signal and clear_notify_signal into sched/signal.h task_work: Decouple TIF_NOTIFY_SIGNAL and task_work task_work: Call tracehook_notify_signal from get_signal on all architectures task_work: Introduce task_work_pending task_work: Remove unnecessary include from posix_timers.h ptrace: Remove tracehook_signal_handler ptrace: Remove arch_syscall_{enter,exit}_tracehook ptrace: Create ptrace_report_syscall_{entry,exit} in ptrace.h ptrace/arm: Rename tracehook_report_syscall report_syscall ptrace: Move ptrace_report_syscall into ptrace.h |
||
![]() |
355f841a3f |
tracehook: Remove tracehook.h
Now that all of the definitions have moved out of tracehook.h into ptrace.h, sched/signal.h, resume_user_mode.h there is nothing left in tracehook.h so remove it. Update the few files that were depending upon tracehook.h to bring in definitions to use the headers they need directly. Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/20220309162454.123006-13-ebiederm@xmission.com Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> |
||
![]() |
65881e1db4 |
selinux: allow FIOCLEX and FIONCLEX with policy capability
These ioctls are equivalent to fcntl(fd, F_SETFD, flags), which SELinux always allows too. Furthermore, a failed FIOCLEX could result in a file descriptor being leaked to a process that should not have access to it. As this patch removes access controls, a policy capability needs to be enabled in policy to always allow these ioctls. Based-on-patch-by: Demi Marie Obenour <demiobenour@gmail.com> Signed-off-by: Richard Haines <richard_c_haines@btinternet.com> [PM: subject line tweak] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
5ea33af9d4 |
selinux: drop return statement at end of void functions
Those return statements at the end of a void function are redundant. Reported by clang-tidy [readability-redundant-control-flow] Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
3eb8eaf2ca |
security: implement sctp_assoc_established hook in selinux
Do this by extracting the peer labeling per-association logic from
selinux_sctp_assoc_request() into a new helper
selinux_sctp_process_new_assoc() and use this helper in both
selinux_sctp_assoc_request() and selinux_sctp_assoc_established(). This
ensures that the peer labeling behavior as documented in
Documentation/security/SCTP.rst is applied both on the client and server
side:
"""
An SCTP socket will only have one peer label assigned to it. This will be
assigned during the establishment of the first association. Any further
associations on this socket will have their packet peer label compared to
the sockets peer label, and only if they are different will the
``association`` permission be validated. This is validated by checking the
socket peer sid against the received packets peer sid to determine whether
the association should be allowed or denied.
"""
At the same time, it also ensures that the peer label of the association
is set to the correct value, such that if it is peeled off into a new
socket, the socket's peer label will then be set to the association's
peer label, same as it already works on the server side.
While selinux_inet_conn_established() (which we are replacing by
selinux_sctp_assoc_established() for SCTP) only deals with assigning a
peer label to the connection (socket), in case of SCTP we need to also
copy the (local) socket label to the association, so that
selinux_sctp_sk_clone() can then pick it up for the new socket in case
of SCTP peeloff.
Careful readers will notice that the selinux_sctp_process_new_assoc()
helper also includes the "IPv4 packet received over an IPv6 socket"
check, even though it hadn't been in selinux_sctp_assoc_request()
before. While such check is not necessary in
selinux_inet_conn_request() (because struct request_sock's family field
is already set according to the skb's family), here it is needed, as we
don't have request_sock and we take the initial family from the socket.
In selinux_sctp_assoc_established() it is similarly needed as well (and
also selinux_inet_conn_established() already has it).
Fixes:
|
||
![]() |
70f4169ab4 |
selinux: parse contexts for mount options early
Commit
|
||
![]() |
0e326df069 |
selinux: various sparse fixes
When running the SELinux code through sparse, there are a handful of warnings. This patch resolves some of these warnings caused by "__rcu" mismatches. % make W=1 C=1 security/selinux/ Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
6bc1968c14 |
selinux: try to use preparsed sid before calling parse_sid()
Avoid unnecessary parsing of sids that have already been parsed via selinux_sb_eat_lsm_opts(). Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
b8b87fd954 |
selinux: Fix selinux_sb_mnt_opts_compat()
selinux_sb_mnt_opts_compat() is called under the sb_lock spinlock and shouldn't be performing any memory allocations. Fix this by parsing the sids at the same time we're chopping up the security mount options string and then using the pre-parsed sids when doing the comparison. Fixes: |
||
![]() |
ecff30575b |
LSM: general protection fault in legacy_parse_param
The usual LSM hook "bail on fail" scheme doesn't work for cases where a security module may return an error code indicating that it does not recognize an input. In this particular case Smack sees a mount option that it recognizes, and returns 0. A call to a BPF hook follows, which returns -ENOPARAM, which confuses the caller because Smack has processed its data. The SELinux hook incorrectly returns 1 on success. There was a time when this was correct, however the current expectation is that it return 0 on success. This is repaired. Reported-by: syzbot+d1e3b1d92d25abf97943@syzkaller.appspotmail.com Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
cdeea45422 |
selinux: fix a type cast problem in cred_init_security()
In the process of removing an explicit type cast to preserve a cred
const qualifier in cred_init_security() we ran into a problem where
the task_struct::real_cred field is defined with the "__rcu"
attribute but the selinux_cred() function parameter is not, leading
to a sparse warning:
security/selinux/hooks.c:216:36: sparse: sparse:
incorrect type in argument 1 (different address spaces)
@@ expected struct cred const *cred
@@ got struct cred const [noderef] __rcu *real_cred
As we don't want to add the "__rcu" attribute to the selinux_cred()
parameter, we're going to add an explicit cast back to
cred_init_security().
Fixes:
|
||
![]() |
b084e189b0 |
selinux: simplify cred_init_security
The parameter of selinux_cred() is declared const, so an explicit cast dropping the const qualifier is not necessary. Without the cast the local variable cred serves no purpose. Reported by clang [-Wcast-qual] Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
0266c25e7c |
selinux: access superblock_security_struct in LSM blob way
LSM blob has been involved for superblock's security struct. So fix the remaining direct access to sb->s_security by using the LSM blob mechanism. Fixes: |
||
![]() |
a135ce4400 |
selinux/stable-5.17 PR 20220110
-----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmHcekAUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNMAQ//TlHhQPEkPYZwD0EPXQIkl5IxGZfR DSg24OMzrh+x/jKVYhMXuW6BlnF10wd2ocG0DNAt9weF22PLrbDsxbzAAnv+MLXM wCzWLmgKpPpbaG57oa17LMcswDjVr8YbxXPOvyZJ/G3YsWDarH8Ezot8iNlVUkOh oqjjXTy0dyLUB/FsW7sIGWa3O18SkbI4pDQxL2vcpdDbvPtAY94kNG3bKTONK/kn OxLrjKscTtu6EuSdFhMqwcUxpfqvqUDrEfiSdNruMlT0/DixHgFlJA8WHXjn7Wpq XTbqdFrClfpmIiTrPSSszLrZAceegTdefDAf5wgfTxmcfYKiPDF9kPu5UPX1rAgO hSoXvGvPjez+RzyXmv9S292lkhV4tz/Wg0YlxZ0LUWif/CSZEyeGGoFZs8080Ukl 1C/oNrByriaGGRLVwKNGOg5x/UkP1ipnorNnAiiIhw+xkzfXm12RdisUyUgLh9+w WsfsC9BJkXMpuvfkgya5PJ677wVNou4ZtJX+2MV8CfGEKTAJp/X/HKh3tWkQFYKx 35QLVO1LD6gJZlSZZTsjZDxUyPwSd8e55GSFn1qKIHuC2jKjSkwCE7JfErIrI/W0 js6HKO2ak7oMTWjxRizANyKI/yva/DeMl+mKCC4QV0xmwlm5JsOe7mYSCNGws7AV A2qYX7S1xKbLTGI= =8H+p -----END PGP SIGNATURE----- Merge tag 'selinux-pr-20220110' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: "Nothing too significant, but five SELinux patches for v5.17 that do the following: - Harden the code through additional use of the struct_size() macro - Plug some memory leaks - Clean up the code via removal of the security_add_mnt_opt() LSM hook and minor tweaks to selinux_add_opt() - Rename security_task_getsecid_subj() to better reflect its actual behavior/use - now called security_current_getsecid_subj()" * tag 'selinux-pr-20220110' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: minor tweaks to selinux_add_opt() selinux: fix potential memleak in selinux_add_opt() security,selinux: remove security_add_mnt_opt() selinux: Use struct_size() helper in kmalloc() lsm: security_task_getsecid_subj() -> security_current_getsecid_subj() |
||
![]() |
732bc2ff08 |
selinux: initialize proto variable in selinux_ip_postroute_compat()
Clang static analysis reports this warning
hooks.c:5765:6: warning: 4th function call argument is an uninitialized
value
if (selinux_xfrm_postroute_last(sksec->sid, skb, &ad, proto))
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
selinux_parse_skb() can return ok without setting proto. The later call
to selinux_xfrm_postroute_last() does an early check of proto and can
return ok if the garbage proto value matches. So initialize proto.
Cc: stable@vger.kernel.org
Fixes:
|
||
![]() |
6cd9d4b978 |
selinux: minor tweaks to selinux_add_opt()
Two minor edits to selinux_add_opt(): use "sizeof(*ptr)" instead of "sizeof(type)" in the kzalloc() call, and rename the "Einval" jump target to "err" for the sake of consistency. Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
2e08df3c7c |
selinux: fix potential memleak in selinux_add_opt()
This patch try to fix potential memleak in error branch.
Fixes:
|
||
![]() |
cc274ae776 |
selinux: fix sleeping function called from invalid context
selinux_sb_mnt_opts_compat() is called via sget_fc() under the sb_lock
spinlock, so it can't use GFP_KERNEL allocations:
[ 868.565200] BUG: sleeping function called from invalid context at
include/linux/sched/mm.h:230
[ 868.568246] in_atomic(): 1, irqs_disabled(): 0,
non_block: 0, pid: 4914, name: mount.nfs
[ 868.569626] preempt_count: 1, expected: 0
[ 868.570215] RCU nest depth: 0, expected: 0
[ 868.570809] Preemption disabled at:
[ 868.570810] [<0000000000000000>] 0x0
[ 868.571848] CPU: 1 PID: 4914 Comm: mount.nfs Kdump: loaded
Tainted: G W 5.16.0-rc5.2585cf9dfa #1
[ 868.573273] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
BIOS 1.14.0-4.fc34 04/01/2014
[ 868.574478] Call Trace:
[ 868.574844] <TASK>
[ 868.575156] dump_stack_lvl+0x34/0x44
[ 868.575692] __might_resched.cold+0xd6/0x10f
[ 868.576308] slab_pre_alloc_hook.constprop.0+0x89/0xf0
[ 868.577046] __kmalloc_track_caller+0x72/0x420
[ 868.577684] ? security_context_to_sid_core+0x48/0x2b0
[ 868.578569] kmemdup_nul+0x22/0x50
[ 868.579108] security_context_to_sid_core+0x48/0x2b0
[ 868.579854] ? _nfs4_proc_pathconf+0xff/0x110 [nfsv4]
[ 868.580742] ? nfs_reconfigure+0x80/0x80 [nfs]
[ 868.581355] security_context_str_to_sid+0x36/0x40
[ 868.581960] selinux_sb_mnt_opts_compat+0xb5/0x1e0
[ 868.582550] ? nfs_reconfigure+0x80/0x80 [nfs]
[ 868.583098] security_sb_mnt_opts_compat+0x2a/0x40
[ 868.583676] nfs_compare_super+0x113/0x220 [nfs]
[ 868.584249] ? nfs_try_mount_request+0x210/0x210 [nfs]
[ 868.584879] sget_fc+0xb5/0x2f0
[ 868.585267] nfs_get_tree_common+0x91/0x4a0 [nfs]
[ 868.585834] vfs_get_tree+0x25/0xb0
[ 868.586241] fc_mount+0xe/0x30
[ 868.586605] do_nfs4_mount+0x130/0x380 [nfsv4]
[ 868.587160] nfs4_try_get_tree+0x47/0xb0 [nfsv4]
[ 868.587724] vfs_get_tree+0x25/0xb0
[ 868.588193] do_new_mount+0x176/0x310
[ 868.588782] __x64_sys_mount+0x103/0x140
[ 868.589388] do_syscall_64+0x3b/0x90
[ 868.589935] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 868.590699] RIP: 0033:0x7f2b371c6c4e
[ 868.591239] Code: 48 8b 0d dd 71 0e 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e
0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00
00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d aa 71
0e 00 f7 d8 64 89 01 48
[ 868.593810] RSP: 002b:00007ffc83775d88 EFLAGS: 00000246
ORIG_RAX: 00000000000000a5
[ 868.594691] RAX: ffffffffffffffda RBX: 00007ffc83775f10 RCX: 00007f2b371c6c4e
[ 868.595504] RDX: 0000555d517247a0 RSI: 0000555d51724700 RDI: 0000555d51724540
[ 868.596317] RBP: 00007ffc83775f10 R08: 0000555d51726890 R09: 0000555d51726890
[ 868.597162] R10: 0000000000000000 R11: 0000000000000246 R12: 0000555d51726890
[ 868.598005] R13: 0000000000000003 R14: 0000555d517246e0 R15: 0000555d511ac925
[ 868.598826] </TASK>
Cc: stable@vger.kernel.org
Fixes:
|
||
![]() |
52f982f00b |
security,selinux: remove security_add_mnt_opt()
Its last user has been removed in commit
|
||
![]() |
6326948f94 |
lsm: security_task_getsecid_subj() -> security_current_getsecid_subj()
The security_task_getsecid_subj() LSM hook invites misuse by allowing callers to specify a task even though the hook is only safe when the current task is referenced. Fix this by removing the task_struct argument to the hook, requiring LSM implementations to use the current task. While we are changing the hook declaration we also rename the function to security_current_getsecid_subj() in an effort to reinforce that the hook captures the subjective credentials of the current task and not an arbitrary task on the system. Reviewed-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
32a370abf1 |
net,lsm,selinux: revert the security_sctp_assoc_established() hook
This patch reverts two prior patches, |
||
![]() |
e7310c9402 |
security: implement sctp_assoc_established hook in selinux
Different from selinux_inet_conn_established(), it also gives the
secid to asoc->peer_secid in selinux_sctp_assoc_established(),
as one UDP-type socket may have more than one asocs.
Note that peer_secid in asoc will save the peer secid for this
asoc connection, and peer_sid in sksec will just keep the peer
secid for the latest connection. So the right use should be do
peeloff for UDP-type socket if there will be multiple asocs in
one socket, so that the peeloff socket has the right label for
its asoc.
v1->v2:
- call selinux_inet_conn_established() to reduce some code
duplication in selinux_sctp_assoc_established(), as Ondrej
suggested.
- when doing peeloff, it calls sock_create() where it actually
gets secid for socket from socket_sockcreate_sid(). So reuse
SECSID_WILD to ensure the peeloff socket keeps using that
secid after calling selinux_sctp_sk_clone() for client side.
Fixes:
|
||
![]() |
c081d53f97 |
security: pass asoc to sctp_assoc_request and sctp_sk_clone
This patch is to move secid and peer_secid from endpoint to association,
and pass asoc to sctp_assoc_request and sctp_sk_clone instead of ep. As
ep is the local endpoint and asoc represents a connection, and in SCTP
one sk/ep could have multiple asoc/connection, saving secid/peer_secid
for new asoc will overwrite the old asoc's.
Note that since asoc can be passed as NULL, security_sctp_assoc_request()
is moved to the place right after the new_asoc is created in
sctp_sf_do_5_1B_init() and sctp_sf_do_unexpected_init().
v1->v2:
- fix the description of selinux_netlbl_skbuff_setsid(), as Jakub noticed.
- fix the annotation in selinux_sctp_assoc_request(), as Richard Noticed.
Fixes:
|
||
![]() |
cdab10bf32 |
selinux/stable-5.16 PR 20211101
-----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmGANbAUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNaMBAAg+9gZr0F7xiafu8JFZqZfx/AQdJ2 G2cn3le+/tXGZmF8m/+82lOaR6LeQLatgSDJNSkXWkKr0nRwseQJDbtRfvYJdn0t Ax05/Fmz6OGxQ2wgRYgaFiSrKpE5p3NhDtiLFVdkCJaQNe/8DZOc7NhBl6EjZf3x ubhl2hUiJ4AmiXGwcYhr4uKgP4nhW8OM1/OkskVi+bBMmLA8KTY9kslmIDP5E3BW 29W4qhqeLNQupY5dGMEMVcyxY9ZUWpO39q4uOaQVZrUGE7xABkj/jhnxT5gFTSlI pu8VhsYXm9KuRVveIsv0L5SZfadwoM9YAl7ki1wD3W5rHqOAte3rBTm6VmNlQwfU MqxP65Jiyxudxet5Be3/dCRH/+MDQuwBxivgmZXbeVxor2SeznVb0GDaEUC5FSHu CJIgWtQzsPJMxgAEGXN4F3QGP0htTTJni56GUPOsrf4TIBW02TT+oLTLFRIokQQL INNOfwVSRXElnCsvxsHR4oB+JZ9pJyBaAmeupcQ6jmcKiWlbLj4s+W0U0pM5h91v hmMpz7KMxrX6gVL4gB2Jj4aN3r5YRbq26NBu6D+wdwwBTeTTocaHSpAqkv4buClf uNk3cG8Hkp8TTg9cM8jYgpxMyzKH/AI/Uw3VhEa1xCiq2Ck3DgfnZvnvcRRaZevU FPgmwgqePJXGi60= =sb8J -----END PGP SIGNATURE----- Merge tag 'selinux-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: - Add LSM/SELinux/Smack controls and auditing for io-uring. As usual, the individual commit descriptions have more detail, but we were basically missing two things which we're adding here: + establishment of a proper audit context so that auditing of io-uring ops works similarly to how it does for syscalls (with some io-uring additions because io-uring ops are *not* syscalls) + additional LSM hooks to enable access control points for some of the more unusual io-uring features, e.g. credential overrides. The additional audit callouts and LSM hooks were done in conjunction with the io-uring folks, based on conversations and RFC patches earlier in the year. - Fixup the binder credential handling so that the proper credentials are used in the LSM hooks; the commit description and the code comment which is removed in these patches are helpful to understand the background and why this is the proper fix. - Enable SELinux genfscon policy support for securityfs, allowing improved SELinux filesystem labeling for other subsystems which make use of securityfs, e.g. IMA. * tag 'selinux-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: security: Return xattr name from security_dentry_init_security() selinux: fix a sock regression in selinux_ip_postroute_compat() binder: use cred instead of task for getsecid binder: use cred instead of task for selinux checks binder: use euid from cred instead of using task LSM: Avoid warnings about potentially unused hook variables selinux: fix all of the W=1 build warnings selinux: make better use of the nf_hook_state passed to the NF hooks selinux: fix race condition when computing ocontext SIDs selinux: remove unneeded ipv6 hook wrappers selinux: remove the SELinux lockdown implementation selinux: enable genfscon labeling for securityfs Smack: Brutalist io_uring support selinux: add support for the io_uring access controls lsm,io_uring: add LSM hooks to io_uring io_uring: convert io_uring to the secure anon inode interface fs: add anon_inode_getfile_secure() similar to anon_inode_getfd_secure() audit: add filtering for io_uring records audit,io_uring,io-wq: add some basic audit support to io_uring audit: prepare audit_context for use in calling contexts beyond syscalls |
||
![]() |
15bf32398a |
security: Return xattr name from security_dentry_init_security()
Right now security_dentry_init_security() only supports single security label and is used by SELinux only. There are two users of this hook, namely ceph and nfs. NFS does not care about xattr name. Ceph hardcodes the xattr name to security.selinux (XATTR_NAME_SELINUX). I am making changes to fuse/virtiofs to send security label to virtiofsd and I need to send xattr name as well. I also hardcoded the name of xattr to security.selinux. Stephen Smalley suggested that it probably is a good idea to modify security_dentry_init_security() to also return name of xattr so that we can avoid this hardcoding in the callers. This patch adds a new parameter "const char **xattr_name" to security_dentry_init_security() and LSM puts the name of xattr too if caller asked for it (xattr_name != NULL). Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: James Morris <jamorris@linux.microsoft.com> [PM: fixed typos in the commit description] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
1c73213ba9 |
selinux: fix a sock regression in selinux_ip_postroute_compat()
Unfortunately we can't rely on nf_hook_state->sk being the proper
originating socket so revert to using skb_to_full_sk(skb).
Fixes:
|
||
![]() |
52f8869337 |
binder: use cred instead of task for selinux checks
Since binder was integrated with selinux, it has passed
'struct task_struct' associated with the binder_proc
to represent the source and target of transactions.
The conversion of task to SID was then done in the hook
implementations. It turns out that there are race conditions
which can result in an incorrect security context being used.
Fix by using the 'struct cred' saved during binder_open and pass
it to the selinux subsystem.
Cc: stable@vger.kernel.org # 5.14 (need backport for earlier stables)
Fixes:
|
||
![]() |
1d1e1ded13 |
selinux: make better use of the nf_hook_state passed to the NF hooks
This patch builds on a previous SELinux/netfilter patch by Florian Westphal and makes better use of the nf_hook_state variable passed into the SELinux/netfilter hooks as well as a number of other small cleanups in the related code. Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
4342f70538 |
selinux: remove unneeded ipv6 hook wrappers
Netfilter places the protocol number the hook function is getting called from in state->pf, so we can use that instead of an extra wrapper. While at it, remove one-line wrappers too and make selinux_ip_{out,forward,postroute} useable as hook function. Signed-off-by: Florian Westphal <fw@strlen.de> Message-Id: <20211011202229.28289-1-fw@strlen.de> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
f5d0e5e9d7 |
selinux: remove the SELinux lockdown implementation
NOTE: This patch intentionally omits any "Fixes:" metadata or stable
tagging since it removes a SELinux access control check; while
removing the control point is the right thing to do moving forward,
removing it in stable kernels could be seen as a regression.
The original SELinux lockdown implementation in
|
||
![]() |
8a764ef1bd |
selinux: enable genfscon labeling for securityfs
Add support for genfscon per-file labeling of securityfs files. This allows for separate labels and thereby access control for different files. For example a genfscon statement genfscon securityfs /integrity/ima/policy \ system_u:object_r:ima_policy_t:s0 will set a private label to the IMA policy file and thus allow to control the ability to set the IMA policy. Setting labels directly with setxattr(2), e.g. by chcon(1) or setfiles(8), is still not supported. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> [PM: line width fixes in the commit description] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
a3727a8bac |
selinux,smack: fix subjective/objective credential use mixups
Jann Horn reported a problem with commit |
||
![]() |
740b03414b |
selinux: add support for the io_uring access controls
This patch implements two new io_uring access controls, specifically support for controlling the io_uring "personalities" and IORING_SETUP_SQPOLL. Controlling the sharing of io_urings themselves is handled via the normal file/inode labeling and sharing mechanisms. The io_uring { override_creds } permission restricts which domains the subject domain can use to override it's own credentials. Granting a domain the io_uring { override_creds } permission allows it to impersonate another domain in io_uring operations. The io_uring { sqpoll } permission restricts which domains can create asynchronous io_uring polling threads. This is important from a security perspective as operations queued by this asynchronous thread inherit the credentials of the thread creator by default; if an io_uring is shared across process/domain boundaries this could result in one domain impersonating another. Controlling the creation of sqpoll threads, and the sharing of io_urings across processes, allow policy authors to restrict the ability of one domain to impersonate another via io_uring. As a quick summary, this patch adds a new object class with two permissions: io_uring { override_creds sqpoll } These permissions can be seen in the two simple policy statements below: allow domA_t domB_t : io_uring { override_creds }; allow domA_t self : io_uring { sqpoll }; Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
9e9fb7655e |
Core:
- Enable memcg accounting for various networking objects. BPF: - Introduce bpf timers. - Add perf link and opaque bpf_cookie which the program can read out again, to be used in libbpf-based USDT library. - Add bpf_task_pt_regs() helper to access user space pt_regs in kprobes, to help user space stack unwinding. - Add support for UNIX sockets for BPF sockmap. - Extend BPF iterator support for UNIX domain sockets. - Allow BPF TCP congestion control progs and bpf iterators to call bpf_setsockopt(), e.g. to switch to another congestion control algorithm. Protocols: - Support IOAM Pre-allocated Trace with IPv6. - Support Management Component Transport Protocol. - bridge: multicast: add vlan support. - netfilter: add hooks for the SRv6 lightweight tunnel driver. - tcp: - enable mid-stream window clamping (by user space or BPF) - allow data-less, empty-cookie SYN with TFO_SERVER_COOKIE_NOT_REQD - more accurate DSACK processing for RACK-TLP - mptcp: - add full mesh path manager option - add partial support for MP_FAIL - improve use of backup subflows - optimize option processing - af_unix: add OOB notification support. - ipv6: add IFLA_INET6_RA_MTU to expose MTU value advertised by the router. - mac80211: Target Wake Time support in AP mode. - can: j1939: extend UAPI to notify about RX status. Driver APIs: - Add page frag support in page pool API. - Many improvements to the DSA (distributed switch) APIs. - ethtool: extend IRQ coalesce uAPI with timer reset modes. - devlink: control which auxiliary devices are created. - Support CAN PHYs via the generic PHY subsystem. - Proper cross-chip support for tag_8021q. - Allow TX forwarding for the software bridge data path to be offloaded to capable devices. Drivers: - veth: more flexible channels number configuration. - openvswitch: introduce per-cpu upcall dispatch. - Add internet mix (IMIX) mode to pktgen. - Transparently handle XDP operations in the bonding driver. - Add LiteETH network driver. - Renesas (ravb): - support Gigabit Ethernet IP - NXP Ethernet switch (sja1105) - fast aging support - support for "H" switch topologies - traffic termination for ports under VLAN-aware bridge - Intel 1G Ethernet - support getcrosststamp() with PCIe PTM (Precision Time Measurement) for better time sync - support Credit-Based Shaper (CBS) offload, enabling HW traffic prioritization and bandwidth reservation - Broadcom Ethernet (bnxt) - support pulse-per-second output - support larger Rx rings - Mellanox Ethernet (mlx5) - support ethtool RSS contexts and MQPRIO channel mode - support LAG offload with bridging - support devlink rate limit API - support packet sampling on tunnels - Huawei Ethernet (hns3): - basic devlink support - add extended IRQ coalescing support - report extended link state - Netronome Ethernet (nfp): - add conntrack offload support - Broadcom WiFi (brcmfmac): - add WPA3 Personal with FT to supported cipher suites - support 43752 SDIO device - Intel WiFi (iwlwifi): - support scanning hidden 6GHz networks - support for a new hardware family (Bz) - Xen pv driver: - harden netfront against malicious backends - Qualcomm mobile - ipa: refactor power management and enable automatic suspend - mhi: move MBIM to WWAN subsystem interfaces Refactor: - Ambient BPF run context and cgroup storage cleanup. - Compat rework for ndo_ioctl. Old code removal: - prism54 remove the obsoleted driver, deprecated by the p54 driver. - wan: remove sbni/granch driver. Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmEukBYACgkQMUZtbf5S IrsyHA//TO8dw18NYts4n9LmlJT2naJ7yBUUSSXK/M+DtW0MQ9nnHhqzPm5uJdRl IgQTNJrW3dYzRwgqaWZqEwO1t5/FI+f87ND1Nsekg7x9tF66a6ov5WxU26TwwSba U+si/inQ/4chuQ+LxMQobqCDxaLE46I2dIoRl+YfndJ24DRzYSwAEYIPPbSdfyU+ +/l+3s4GaxO4k/hLciPAiOniyxLoUNiGUTNh+2yqRBXelSRJRKVnl+V22ANFrxRW nTEiplfVKhlPU1e4iLuRtaxDDiePHhw9I3j/lMHhfeFU2P/gKJIvz4QpGV0CAZg2 1VvDU32WEx1GQLXJbKm0KwoNRUq1QSjOyyFti+BO7ugGaYAR4gKhShOqlSYLzUtB tbtzQhSNLWOGqgmSJOztZb5kFDm2EdRSll5/lP2uyFlPkIsIp0QbscJVzNTnS74b Xz15ZOw41Z4TfWPEMWgfrx6Zkm7pPWkly+7WfUkPcHa1gftNz6tzXXxSXcXIBPdi yQ5JCzzxrM5573YHuk5YedwZpn6PiAt4A/muFGk9C6aXP60TQAOS/ppaUzZdnk4D NfOk9mj06WEULjYjPcKEuT3GGWE6kmjb8Pu0QZWKOchv7vr6oZly1EkVZqYlXELP AfhcrFeuufie8mqm0jdb4LnYaAnqyLzlb1J4Zxh9F+/IX7G3yoc= =JDGD -----END PGP SIGNATURE----- Merge tag 'net-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core: - Enable memcg accounting for various networking objects. BPF: - Introduce bpf timers. - Add perf link and opaque bpf_cookie which the program can read out again, to be used in libbpf-based USDT library. - Add bpf_task_pt_regs() helper to access user space pt_regs in kprobes, to help user space stack unwinding. - Add support for UNIX sockets for BPF sockmap. - Extend BPF iterator support for UNIX domain sockets. - Allow BPF TCP congestion control progs and bpf iterators to call bpf_setsockopt(), e.g. to switch to another congestion control algorithm. Protocols: - Support IOAM Pre-allocated Trace with IPv6. - Support Management Component Transport Protocol. - bridge: multicast: add vlan support. - netfilter: add hooks for the SRv6 lightweight tunnel driver. - tcp: - enable mid-stream window clamping (by user space or BPF) - allow data-less, empty-cookie SYN with TFO_SERVER_COOKIE_NOT_REQD - more accurate DSACK processing for RACK-TLP - mptcp: - add full mesh path manager option - add partial support for MP_FAIL - improve use of backup subflows - optimize option processing - af_unix: add OOB notification support. - ipv6: add IFLA_INET6_RA_MTU to expose MTU value advertised by the router. - mac80211: Target Wake Time support in AP mode. - can: j1939: extend UAPI to notify about RX status. Driver APIs: - Add page frag support in page pool API. - Many improvements to the DSA (distributed switch) APIs. - ethtool: extend IRQ coalesce uAPI with timer reset modes. - devlink: control which auxiliary devices are created. - Support CAN PHYs via the generic PHY subsystem. - Proper cross-chip support for tag_8021q. - Allow TX forwarding for the software bridge data path to be offloaded to capable devices. Drivers: - veth: more flexible channels number configuration. - openvswitch: introduce per-cpu upcall dispatch. - Add internet mix (IMIX) mode to pktgen. - Transparently handle XDP operations in the bonding driver. - Add LiteETH network driver. - Renesas (ravb): - support Gigabit Ethernet IP - NXP Ethernet switch (sja1105): - fast aging support - support for "H" switch topologies - traffic termination for ports under VLAN-aware bridge - Intel 1G Ethernet - support getcrosststamp() with PCIe PTM (Precision Time Measurement) for better time sync - support Credit-Based Shaper (CBS) offload, enabling HW traffic prioritization and bandwidth reservation - Broadcom Ethernet (bnxt) - support pulse-per-second output - support larger Rx rings - Mellanox Ethernet (mlx5) - support ethtool RSS contexts and MQPRIO channel mode - support LAG offload with bridging - support devlink rate limit API - support packet sampling on tunnels - Huawei Ethernet (hns3): - basic devlink support - add extended IRQ coalescing support - report extended link state - Netronome Ethernet (nfp): - add conntrack offload support - Broadcom WiFi (brcmfmac): - add WPA3 Personal with FT to supported cipher suites - support 43752 SDIO device - Intel WiFi (iwlwifi): - support scanning hidden 6GHz networks - support for a new hardware family (Bz) - Xen pv driver: - harden netfront against malicious backends - Qualcomm mobile - ipa: refactor power management and enable automatic suspend - mhi: move MBIM to WWAN subsystem interfaces Refactor: - Ambient BPF run context and cgroup storage cleanup. - Compat rework for ndo_ioctl. Old code removal: - prism54 remove the obsoleted driver, deprecated by the p54 driver. - wan: remove sbni/granch driver" * tag 'net-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1715 commits) net: Add depends on OF_NET for LiteX's LiteETH ipv6: seg6: remove duplicated include net: hns3: remove unnecessary spaces net: hns3: add some required spaces net: hns3: clean up a type mismatch warning net: hns3: refine function hns3_set_default_feature() ipv6: remove duplicated 'net/lwtunnel.h' include net: w5100: check return value after calling platform_get_resource() net/mlxbf_gige: Make use of devm_platform_ioremap_resourcexxx() net: mdio: mscc-miim: Make use of the helper function devm_platform_ioremap_resource() net: mdio-ipq4019: Make use of devm_platform_ioremap_resource() fou: remove sparse errors ipv4: fix endianness issue in inet_rtm_getroute_build_skb() octeontx2-af: Set proper errorcode for IPv4 checksum errors octeontx2-af: Fix static code analyzer reported issues octeontx2-af: Fix mailbox errors in nix_rss_flowkey_cfg octeontx2-af: Fix loop in free and unmap counter af_unix: fix potential NULL deref in unix_dgram_connect() dpaa2-eth: Replace strlcpy with strscpy octeontx2-af: Use NDC TX for transmit packet data ... |
||
![]() |
bc49d8169a |
mctp: Add MCTP base
Add basic Kconfig, an initial (empty) af_mctp source object, and {AF,PF}_MCTP definitions, and the required definitions for a new protocol type. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
![]() |
893c47d196 |
selinux: return early for possible NULL audit buffers
audit_log_start() may return NULL in below cases: - when audit is not initialized. - when audit backlog limit exceeds. After the call to audit_log_start() is made and then possible NULL audit buffer argument is passed to audit_log_*() functions, audit_log_*() functions return immediately in case of a NULL audit buffer argument. But it is optimal to return early when audit_log_start() returns NULL, because it is not necessary for audit_log_*() functions to be called with NULL audit buffer argument. So add exception handling for possible NULL audit buffers where return value can be handled from callers. Signed-off-by: Austin Kim <austin.kim@lge.com> [PM: tweak subject line] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
d99cf13f14 |
selinux: kill 'flags' argument in avc_has_perm_flags() and avc_audit()
... along with avc_has_perm_flags() itself, since now it's identical to avc_has_perm() (as pointed out by Paul Moore) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> [PM: add "selinux:" prefix to subj and tweak for length] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
b17ec22fb3 |
selinux: slow_avc_audit has become non-blocking
dump_common_audit_data() is safe to use under rcu_read_lock() now; no need for AVC_NONBLOCKING and games around it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
869cbeef18 |
lsm_audit,selinux: pass IB device name by reference
While trying to address a Coverity warning that the dev_name string might end up unterminated when strcpy'ing it in selinux_ib_endport_manage_subnet(), I realized that it is possible (and simpler) to just pass the dev_name pointer directly, rather than copying the string to a buffer. The ibendport variable goes out of scope at the end of the function anyway, so the lifetime of the dev_name pointer will never be shorter than that of ibendport, thus we can safely just pass the dev_name pointer and be done with it. Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Acked-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
17ae69aba8 |
Add Landlock, a new LSM from Mickaël Salaün <mic@linux.microsoft.com>
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEgycj0O+d1G2aycA8rZhLv9lQBTwFAmCInP4ACgkQrZhLv9lQ BTza0g//dTeb9woC9H7qlEhK4l9yk62lTss60Q8X7m7ZSNfdL4tiEbi64SgK+iOW OOegbrOEb8Kzh4KJJYmVlVZ5YUWyH4szgmee1wnylBdsWiWaPLPF3Cflz77apy6T TiiBsJd7rRE29FKheaMt34B41BMh8QHESN+DzjzJWsFoi/uNxjgSs2W16XuSupKu bpRmB1pYNXMlrkzz7taL05jndZYE5arVriqlxgAsuLOFOp/ER7zecrjImdCM/4kL W6ej0R1fz2Geh6CsLBJVE+bKWSQ82q5a4xZEkSYuQHXgZV5eywE5UKu8ssQcRgQA VmGUY5k73rfY9Ofupf2gCaf/JSJNXKO/8Xjg0zAdklKtmgFjtna5Tyg9I90j7zn+ 5swSpKuRpilN8MQH+6GWAnfqQlNoviTOpFeq3LwBtNVVOh08cOg6lko/bmebBC+R TeQPACKS0Q0gCDPm9RYoU1pMUuYgfOwVfVRZK1prgi2Co7ZBUMOvYbNoKYoPIydr ENBYljlU1OYwbzgR2nE+24fvhU8xdNOVG1xXYPAEHShu+p7dLIWRLhl8UCtRQpSR 1ofeVaJjgjrp29O+1OIQjB2kwCaRdfv/Gq1mztE/VlMU/r++E62OEzcH0aS+mnrg yzfyUdI8IFv1q6FGT9yNSifWUWxQPmOKuC8kXsKYfqfJsFwKmHM= =uCN4 -----END PGP SIGNATURE----- Merge tag 'landlock_v34' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull Landlock LSM from James Morris: "Add Landlock, a new LSM from Mickaël Salaün. Briefly, Landlock provides for unprivileged application sandboxing. From Mickaël's cover letter: "The goal of Landlock is to enable to restrict ambient rights (e.g. global filesystem access) for a set of processes. Because Landlock is a stackable LSM [1], it makes possible to create safe security sandboxes as new security layers in addition to the existing system-wide access-controls. This kind of sandbox is expected to help mitigate the security impact of bugs or unexpected/malicious behaviors in user-space applications. Landlock empowers any process, including unprivileged ones, to securely restrict themselves. Landlock is inspired by seccomp-bpf but instead of filtering syscalls and their raw arguments, a Landlock rule can restrict the use of kernel objects like file hierarchies, according to the kernel semantic. Landlock also takes inspiration from other OS sandbox mechanisms: XNU Sandbox, FreeBSD Capsicum or OpenBSD Pledge/Unveil. In this current form, Landlock misses some access-control features. This enables to minimize this patch series and ease review. This series still addresses multiple use cases, especially with the combined use of seccomp-bpf: applications with built-in sandboxing, init systems, security sandbox tools and security-oriented APIs [2]" The cover letter and v34 posting is here: https://lore.kernel.org/linux-security-module/20210422154123.13086-1-mic@digikod.net/ See also: https://landlock.io/ This code has had extensive design discussion and review over several years" Link: https://lore.kernel.org/lkml/50db058a-7dde-441b-a7f9-f6837fe8b69f@schaufler-ca.com/ [1] Link: https://lore.kernel.org/lkml/f646e1c7-33cf-333f-070c-0a40ad0468cd@digikod.net/ [2] * tag 'landlock_v34' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: landlock: Enable user space to infer supported features landlock: Add user and kernel documentation samples/landlock: Add a sandbox manager example selftests/landlock: Add user space tests landlock: Add syscall implementations arch: Wire up Landlock syscalls fs,security: Add sb_delete hook landlock: Support filesystem access-control LSM: Infrastructure management of the superblock landlock: Add ptrace restrictions landlock: Set up the security framework and manage credentials landlock: Add ruleset and domain management landlock: Add object management |
||
![]() |
1aea780837 |
LSM: Infrastructure management of the superblock
Move management of the superblock->sb_security blob out of the individual security modules and into the security infrastructure. Instead of allocating the blobs from within the modules, the modules tell the infrastructure how much space is required, and the space is allocated there. Cc: John Johansen <john.johansen@canonical.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com> Acked-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210422154123.13086-6-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com> |
||
![]() |
eb1231f73c |
selinux: clarify task subjective and objective credentials
SELinux has a function, task_sid(), which returns the task's objective credentials, but unfortunately is used in a few places where the subjective task credentials should be used. Most notably in the new security_task_getsecid_subj() LSM hook. This patch fixes this and attempts to make things more obvious by introducing a new function, task_sid_subj(), and renaming the existing task_sid() function to task_sid_obj(). This patch also adds an interesting function in task_sid_binder(). The task_sid_binder() function has a comment which hopefully describes it's reason for being, but it basically boils down to the simple fact that we can't safely access another task's subjective credentials so in the case of binder we need to stick with the objective credentials regardless. Reviewed-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
4ebd7651bf |
lsm: separate security_task_getsecid() into subjective and objective variants
Of the three LSMs that implement the security_task_getsecid() LSM hook, all three LSMs provide the task's objective security credentials. This turns out to be unfortunate as most of the hook's callers seem to expect the task's subjective credentials, although a small handful of callers do correctly expect the objective credentials. This patch is the first step towards fixing the problem: it splits the existing security_task_getsecid() hook into two variants, one for the subjective creds, one for the objective creds. void security_task_getsecid_subj(struct task_struct *p, u32 *secid); void security_task_getsecid_obj(struct task_struct *p, u32 *secid); While this patch does fix all of the callers to use the correct variant, in order to keep this patch focused on the callers and to ease review, the LSMs continue to use the same implementation for both hooks. The net effect is that this patch should not change the behavior of the kernel in any way, it will be up to the latter LSM specific patches in this series to change the hook implementations and return the correct credentials. Acked-by: Mimi Zohar <zohar@linux.ibm.com> (IMA) Acked-by: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
69c4a42d72 |
lsm,selinux: add new hook to compare new mount to an existing mount
Add a new hook that takes an existing super block and a new mount with new options and determines if new options confict with an existing mount or not. A filesystem can use this new hook to determine if it can share the an existing superblock with a new superblock for the new mount. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com> [PM: tweak the subject line, fix tab/space problems] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
7fa2e79a6b |
selinux: Allow context mounts for unpriviliged overlayfs
Now overlayfs allow unpriviliged mounts. That is root inside a non-init user namespace can mount overlayfs. This is being added in 5.11 kernel. Giuseppe tried to mount overlayfs with option "context" and it failed with error -EACCESS. $ su test $ unshare -rm $ mkdir -p lower upper work merged $ mount -t overlay -o lowerdir=lower,workdir=work,upperdir=upper,userxattr,context='system_u:object_r:container_file_t:s0' none merged This fails with -EACCESS. It works if option "-o context" is not specified. Little debugging showed that selinux_set_mnt_opts() returns -EACCESS. So this patch adds "overlay" to the list, where it is fine to specific context from non init_user_ns. Reported-by: Giuseppe Scrivano <gscrivan@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> [PM: trimmed the changelog from the description] Signed-off-by: Paul Moore <paul@paul-moore.com> |
||
![]() |
7d6beb71da |
idmapped-mounts-v5.12
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYCegywAKCRCRxhvAZXjc
ouJ6AQDlf+7jCQlQdeKKoN9QDFfMzG1ooemat36EpRRTONaGuAD8D9A4sUsG4+5f
4IU5Lj9oY4DEmF8HenbWK2ZHsesL2Qg=
=yPaw
-----END PGP SIGNATURE-----
Merge tag 'idmapped-mounts-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull idmapped mounts from Christian Brauner:
"This introduces idmapped mounts which has been in the making for some
time. Simply put, different mounts can expose the same file or
directory with different ownership. This initial implementation comes
with ports for fat, ext4 and with Christoph's port for xfs with more
filesystems being actively worked on by independent people and
maintainers.
Idmapping mounts handle a wide range of long standing use-cases. Here
are just a few:
- Idmapped mounts make it possible to easily share files between
multiple users or multiple machines especially in complex
scenarios. For example, idmapped mounts will be used in the
implementation of portable home directories in
systemd-homed.service(8) where they allow users to move their home
directory to an external storage device and use it on multiple
computers where they are assigned different uids and gids. This
effectively makes it possible to assign random uids and gids at
login time.
- It is possible to share files from the host with unprivileged
containers without having to change ownership permanently through
chown(2).
- It is possible to idmap a container's rootfs and without having to
mangle every file. For example, Chromebooks use it to share the
user's Download folder with their unprivileged containers in their
Linux subsystem.
- It is possible to share files between containers with
non-overlapping idmappings.
- Filesystem that lack a proper concept of ownership such as fat can
use idmapped mounts to implement discretionary access (DAC)
permission checking.
- They allow users to efficiently changing ownership on a per-mount
basis without having to (recursively) chown(2) all files. In
contrast to chown (2) changing ownership of large sets of files is
instantenous with idmapped mounts. This is especially useful when
ownership of a whole root filesystem of a virtual machine or
container is changed. With idmapped mounts a single syscall
mount_setattr syscall will be sufficient to change the ownership of
all files.
- Idmapped mounts always take the current ownership into account as
idmappings specify what a given uid or gid is supposed to be mapped
to. This contrasts with the chown(2) syscall which cannot by itself
take the current ownership of the files it changes into account. It
simply changes the ownership to the specified uid and gid. This is
especially problematic when recursively chown(2)ing a large set of
files which is commong with the aforementioned portable home
directory and container and vm scenario.
- Idmapped mounts allow to change ownership locally, restricting it
to specific mounts, and temporarily as the ownership changes only
apply as long as the mount exists.
Several userspace projects have either already put up patches and
pull-requests for this feature or will do so should you decide to pull
this:
- systemd: In a wide variety of scenarios but especially right away
in their implementation of portable home directories.
https://systemd.io/HOME_DIRECTORY/
- container runtimes: containerd, runC, LXD:To share data between
host and unprivileged containers, unprivileged and privileged
containers, etc. The pull request for idmapped mounts support in
containerd, the default Kubernetes runtime is already up for quite
a while now: https://github.com/containerd/containerd/pull/4734
- The virtio-fs developers and several users have expressed interest
in using this feature with virtual machines once virtio-fs is
ported.
- ChromeOS: Sharing host-directories with unprivileged containers.
I've tightly synced with all those projects and all of those listed
here have also expressed their need/desire for this feature on the
mailing list. For more info on how people use this there's a bunch of
talks about this too. Here's just two recent ones:
https://www.cncf.io/wp-content/uploads/2020/12/Rootless-Containers-in-Gitpod.pdf
https://fosdem.org/2021/schedule/event/containers_idmap/
This comes with an extensive xfstests suite covering both ext4 and
xfs:
https://git.kernel.org/brauner/xfstests-dev/h/idmapped_mounts
It covers truncation, creation, opening, xattrs, vfscaps, setid
execution, setgid inheritance and more both with idmapped and
non-idmapped mounts. It already helped to discover an unrelated xfs
setgid inheritance bug which has since been fixed in mainline. It will
be sent for inclusion with the xfstests project should you decide to
merge this.
In order to support per-mount idmappings vfsmounts are marked with
user namespaces. The idmapping of the user namespace will be used to
map the ids of vfs objects when they are accessed through that mount.
By default all vfsmounts are marked with the initial user namespace.
The initial user namespace is used to indicate that a mount is not
idmapped. All operations behave as before and this is verified in the
testsuite.
Based on prior discussions we want to attach the whole user namespace
and not just a dedicated idmapping struct. This allows us to reuse all
the helpers that already exist for dealing with idmappings instead of
introducing a whole new range of helpers. In addition, if we decide in
the future that we are confident enough to enable unprivileged users
to setup idmapped mounts the permission checking can take into account
whether the caller is privileged in the user namespace the mount is
currently marked with.
The user namespace the mount will be marked with can be specified by
passing a file descriptor refering to the user namespace as an
argument to the new mount_setattr() syscall together with the new
MOUNT_ATTR_IDMAP flag. The system call follows the openat2() pattern
of extensibility.
The following conditions must be met in order to create an idmapped
mount:
- The caller must currently have the CAP_SYS_ADMIN capability in the
user namespace the underlying filesystem has been mounted in.
- The underlying filesystem must support idmapped mounts.
- The mount must not already be idmapped. This also implies that the
idmapping of a mount cannot be altered once it has been idmapped.
- The mount must be a detached/anonymous mount, i.e. it must have
been created by calling open_tree() with the OPEN_TREE_CLONE flag
and it must not already have been visible in the filesystem.
The last two points guarantee easier semantics for userspace and the
kernel and make the implementation significantly simpler.
By default vfsmounts are marked with the initial user namespace and no
behavioral or performance changes are observed.
The manpage with a detailed description can be found here:
|
||
![]() |
71bc356f93
|
commoncap: handle idmapped mounts
When interacting with user namespace and non-user namespace aware filesystem capabilities the vfs will perform various security checks to determine whether or not the filesystem capabilities can be used by the caller, whether they need to be removed and so on. The main infrastructure for this resides in the capability codepaths but they are called through the LSM security infrastructure even though they are not technically an LSM or optional. This extends the existing security hooks security_inode_removexattr(), security_inode_killpriv(), security_inode_getsecurity() to pass down the mount's user namespace and makes them aware of idmapped mounts. In order to actually get filesystem capabilities from disk the capability infrastructure exposes the get_vfs_caps_from_disk() helper. For user namespace aware filesystem capabilities a root uid is stored alongside the capabilities. In order to determine whether the caller can make use of the filesystem capability or whether it needs to be ignored it is translated according to the superblock's user namespace. If it can be translated to uid 0 according to that id mapping the caller can use the filesystem capabilities stored on disk. If we are accessing the inode that holds the filesystem capabilities through an idmapped mount we map the root uid according to the mount's user namespace. Afterwards the checks are identical to non-idmapped mounts: reading filesystem caps from disk enforces that the root uid associated with the filesystem capability must have a mapping in the superblock's user namespace and that the caller is either in the same user namespace or is a descendant of the superblock's user namespace. For filesystems that are mountable inside user namespace the caller can just mount the filesystem and won't usually need to idmap it. If they do want to idmap it they can create an idmapped mount and mark it with a user namespace they created and which is thus a descendant of s_user_ns. For filesystems that are not mountable inside user namespaces the descendant rule is trivially true because the s_user_ns will be the initial user namespace. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-11-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |