After two months of development, Linus Torvalds has released the Linux 5.15 kernel. Notable changes include: a new NTFS driver with write support, the ksmbd module with an SMB server implementation, the DAMON subsystem for monitoring memory access, locking primitives for real-time mode, fs-verity support in Btrfs, the process_mrelease system call for user-space low-memory response systems, and the dm-ima remote attestation module.
The new version received 13,499 fixes from 1,888 developers; the patch size is 42 MB (the changes affected 10,895 files, with 632,522 lines of code added and 299,966 lines removed). About 45% of all changes introduced in 5.15 are related to device drivers, approximately 14% to updating code specific to hardware architectures, 14% to the network stack, 6% to filesystems, and 3% to internal kernel subsystems.
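The figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in this article; the variable names are illustrative.

```python
# Patch statistics quoted above for Linux 5.15.
lines_added = 632_522
lines_removed = 299_966

# Net growth of the source tree, in lines.
net_growth = lines_added - lines_removed

# Shares of changes by area, in percent, as listed above.
shares = {
    "device drivers": 45,
    "architecture-specific code": 14,
    "network stack": 14,
    "filesystems": 6,
    "internal kernel subsystems": 3,
}

# Whatever the listed categories do not cover falls into other areas.
other_share = 100 - sum(shares.values())

print(f"net growth: {net_growth:,} lines")   # net growth: 332,556 lines
print(f"other areas: about {other_share}%")  # other areas: about 18%
```

In other words, the listed categories account for roughly 82% of the changes, and the tree grew by about a third of a million lines.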
Major changes in the Linux 5.15 kernel:
- Disk subsystem, I/O and file systems
- The kernel has adopted a new implementation of the NTFS file system, open-sourced by Paragon Software. The new driver supports writing and all the features of the current NTFS 3.1 version, including extended file attributes, access control lists (ACLs), data compression, efficient handling of holes in files (sparse files), and replay of changes from the journal to restore integrity after failures …
- The Btrfs file system now supports the fs-verity mechanism, which is used to transparently check the integrity and authenticity of individual files using cryptographic hashes or keys stored in the metadata area associated with the files. Previously, fs-verity was only available for ext4 and F2FS. Btrfs also adds support for mapping user IDs for mounted filesystems (previously supported for FAT, ext4 and XFS). This feature allows you to associate files of a certain user on a mounted foreign partition with another user on the current system. Other changes in Btrfs include: faster addition of keys to the directory index to improve file creation performance; the ability to run raid0 on one device and raid10 on two (for example, while reconfiguring an array); the “rescue=ibadroots” option to ignore an invalid extent tree; a faster “send” operation; fewer lock conflicts during rename operations; and the ability to use 4K sectors on systems with a 64K memory page size.
- XFS has stabilized the ability to use dates after 2038 in the file system. Deferred inode deactivation and deferred setting and removal of file attributes have been implemented. To avoid problems, the ability to disable disk quotas on already mounted partitions has been removed (quota enforcement can still be switched off, but the associated accounting will continue, so a remount is required to disable quotas completely).
- In ext4, work has been done to improve the performance of writing delalloc buffers and of processing orphan files, which continue to exist because they remain open but are no longer linked to a directory. Processing of discard operations has been moved out of the jbd2 kthread so that it does not block metadata operations.
- F2FS adds the “discard_unit=block|segment|section” option to bind discard operations (marking freed blocks that no longer need to be physically stored) to block, segment, or section alignment. Added support for tracking I/O latency changes.
- The EROFS (Enhanced Read-Only File System) file system adds support for direct I/O for files stored without compression, as well as support for fiemap.
- OverlayFS implements correct handling of “immutable”, “append-only”, “sync” and “noatime” mount flags.
- NFS has improved handling of situations where the NFS server becomes unresponsive. Added the ability to mount from a server that is already in use but is reachable through a different network address.
- Preparations have begun for rewriting the FSCACHE subsystem.
- Added support for EFI partitions with non-standard GPT table layout.
- The fanotify mechanism has a new FAN_REPORT_PIDFD flag that causes a pidfd to be included in the returned metadata. A pidfd helps to handle PID reuse and to identify processes accessing monitored files more accurately (a pidfd is tied to a specific process and does not change, while a PID can be assigned to another process after the process currently associated with it terminates).
- The move_mount() system call adds the ability to add mount points to existing shared groups, which solves problems with saving and restoring process state in the CRIU toolkit when several mount namespaces are shared among isolated containers.
- Added protection against hidden race conditions that could potentially lead to file corruption when reading from the cache while holes in a file are being handled.
- Dropped support for mandatory file locks implemented by blocking system calls that lead to file changes. Due to possible race conditions, these locks were considered unreliable and were deprecated many years ago.
- Removed the LightNVM subsystem, which allowed direct access to an SSD, bypassing the emulation layer. LightNVM lost its purpose after the emergence of NVMe standards that provide for zoning (ZNS, Zoned Namespace).
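The 2038 limit mentioned in the XFS item above comes from storing timestamps as signed 32-bit counts of seconds since 1970. Below is a minimal Python sketch of where such a counter runs out; it illustrates the general limit only and does not model the XFS on-disk format.

```python
from datetime import datetime, timezone

# Largest value a signed 32-bit seconds counter can hold.
MAX_INT32 = 2**31 - 1

def last_32bit_timestamp() -> datetime:
    """Return the last moment representable as a signed 32-bit Unix time."""
    return datetime.fromtimestamp(MAX_INT32, tz=timezone.utc)

print(last_32bit_timestamp().isoformat())  # 2038-01-19T03:14:07+00:00
```

Any file time later than this moment needs a wider timestamp representation, which is what support for dates after 2038 provides.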
- Memory and system services
- The DAMON (Data Access MONitor) subsystem has been implemented, which allows you to track accesses to data in RAM by a selected process running in user space. The subsystem lets you analyze which memory areas the process accessed during its entire operation and which memory areas remained untouched. DAMON features low CPU usage, low memory usage, high accuracy, and a predictable constant overhead regardless of memory size. The subsystem can be used both by the kernel to optimize memory management and by user-space utilities to understand what a process is doing and to optimize memory use, for example by returning surplus memory to the system.
- The process_mrelease system call has been implemented to speed up the release of memory belonging to a process that is terminating. Under normal conditions, releasing resources and terminating a process is not instantaneous and can be delayed for various reasons, which interferes with user-space early-response systems for low-memory conditions, such as oomd (provided in systemd) and lmkd (used in Android). By calling process_mrelease, such systems can more predictably initiate memory reclamation from forcibly terminated processes.
- Variants of the mutex, ww_mutex, rw_semaphore, spinlock, and rwlock locking primitives, based on the RT-Mutex subsystem, have been carried over from the PREEMPT_RT kernel tree, which is developed to support real-time operation. Changes have been added to the SLUB slab allocator to improve operation in PREEMPT_RT mode and reduce the impact on interrupts.
- Added cgroup support for the SCHED_IDLE task scheduler attribute, which makes it possible to assign this attribute at once to all processes that belong to a certain cgroup. That is, these processes will run only when there are no other pending tasks on the system. Unlike setting the SCHED_IDLE attribute for each process separately, binding SCHED_IDLE to a cgroup takes the relative weight of tasks within the group into account when selecting a task to execute.
- The mechanism for accounting for memory consumption in cgroups has been extended with the ability to track additional kernel data structures, including those created for polling, signal processing, and namespaces.
- Added support for asymmetric scheduling of assigning tasks to processor cores on architectures in which some CPUs allow 32-bit tasks, and some only work in 64-bit mode (for example, ARM). The new mode allows only CPUs that support 32-bit tasks to be considered when scheduling 32-bit tasks.
- The io_uring asynchronous I/O interface now supports opening files directly into the fixed-file index table, without using file descriptors, which can significantly speed up some types of operations but departs from the traditional Unix practice of using file descriptors to open files. A new “BIO recycling” mechanism has been implemented in io_uring for the block I/O layer (BIO), which reduces the overhead of internal memory management and increases the number of I/O operations processed per second by about 10%. io_uring also adds support for the mkdirat(), symlinkat(), and linkat() system calls.
- For BPF programs, the ability to request and process timer events has been implemented. Added an iterator for UNIX sockets, and implemented the ability to get and set socket options for setsockopt. Added support for typed data to the BTF dumper.
- On NUMA systems with memory types that differ in performance, when free space is exhausted, evicted memory pages are now moved from DRAM to slower persistent memory instead of being discarded. Testing has shown that this tactic generally improves performance on such systems. For NUMA, the ability to allocate memory pages for a process from a selected set of NUMA nodes has also been implemented.
- For the ARC architecture, support for three- and four-level memory page tables has been implemented, which will further enable support for 64-bit ARC processors.
- For the s390 architecture, the ability to use the KFENCE mechanism to detect memory errors has been implemented, and support for the KCSAN race condition detector has been added.
- Added support for indexing the list of messages printed through printk(), which allows you to retrieve all such messages at once and track changes in user space.
- mmap() no longer supports the VM_DENYWRITE option, and the kernel code has been freed from using the MAP_DENYWRITE mode, which reduces the number of situations in which writing to a file is blocked with the ETXTBSY error.
- A new type of probe, “event probes”, has been added to the tracing subsystem; these can be attached to existing trace events and define their own output format.
- When building the kernel with the Clang compiler, the LLVM project’s integrated assembler is now enabled by default.
- As part of the effort to rid the kernel of code that causes compiler warnings, an experiment was carried out with the “-Werror” mode enabled by default, in which compiler warnings are treated as errors. While preparing release 5.15, Linus started accepting only changes that did not produce warnings when building the kernel and enabled building with “-Werror”, but then agreed that this decision was premature and postponed enabling “-Werror” by default. Whether “-Werror” is used during the build is controlled by the WERROR parameter, which defaults to the value of COMPILE_TEST, i.e. for now it is enabled only for test builds.
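The difference between per-process and per-cgroup SCHED_IDLE described in the scheduler item above can be sketched in Python. The sketch assumes a cgroup v2 hierarchy and that the group-wide switch is exposed as a `cpu.idle` file inside the cgroup directory; the helper names and any paths passed to them are illustrative, not taken from the release notes.

```python
import os
from pathlib import Path

def set_idle_per_process(pids):
    """Older approach: mark each process as SCHED_IDLE individually (Linux only)."""
    for pid in pids:
        os.sched_setscheduler(pid, os.SCHED_IDLE, os.sched_param(0))

def set_idle_for_cgroup(cgroup_dir, enabled=True):
    """Group-wide approach: flip one switch for every task in the cgroup.

    Assumes the kernel exposes the setting as a `cpu.idle` file in the
    cgroup v2 directory; writing "1" enables it and "0" disables it.
    """
    control = Path(cgroup_dir) / "cpu.idle"
    control.write_text("1" if enabled else "0")
    return control
```

On a real system the second helper would be pointed at a cgroup directory (for example, a hypothetical `/sys/fs/cgroup/background`) and would normally require sufficient privileges to write there.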
- Virtualization and security
- A new dm-ima handler has been added to Device Mapper (DM), implementing a remote attestation mechanism based on the Integrity Measurement Architecture (IMA) subsystem, which allows an external service to check the state of kernel subsystems in order to verify their authenticity. In practice, dm-ima lets you use Device Mapper to create storage tied to external cloud systems, in which the validity of the DM target configuration being launched is checked using IMA.
- prctl() implements a new PR_SPEC_L1D_FLUSH option; when it is enabled, the kernel flushes the contents of the first-level data cache (L1D) on every context switch. This mode makes it possible to selectively give the most important processes additional protection against side-channel attacks that try to determine data left in the cache as a result of vulnerabilities caused by speculative execution of instructions in the CPU. The cost of enabling PR_SPEC_L1D_FLUSH (not enabled by default) is a significant performance penalty.
- Implemented the ability to build the kernel with the “-fzero-call-used-regs=used-gpr” GCC flag, which zeroes the call-used general-purpose registers that a function has used before control returns from it. This option protects against information leakage from functions and reduces the number of blocks suitable for building ROP (Return-Oriented Programming) gadgets in exploits by 20%.
- The ability to build kernels for the ARM64 architecture in the form of clients for the Hyper-V hypervisor has been implemented.
- A new framework for the development of drivers “VDUSE” has been proposed, which allows implementing virtual block devices in user space and using Virtio as a transport for access from guest systems.
- Added Virtio driver for the I2C bus, which makes it possible to emulate I2C controllers in paravirtualization mode using separate backends.
- Added the gpio-virtio Virtio driver to allow guest systems to access the GPIO lines provided by the host system.
- Added the ability to restrict access to memory pages for DMA-capable devices on systems without an IOMMU (I/O memory-management unit).
- The KVM hypervisor implements the ability to display statistics in the form of linear and logarithmic histograms.
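To illustrate the difference between the two histogram types mentioned in the KVM item above, here is a small Python sketch of linear and logarithmic bucketing. It is a generic illustration rather than the kernel's code; the function names, bucket width, and bucket count are assumptions chosen for the example.

```python
def linear_bucket(value, bucket_size, num_buckets):
    """Linear histogram: every bucket covers the same range of values."""
    return min(value // bucket_size, num_buckets - 1)

def log_bucket(value, num_buckets):
    """Logarithmic histogram: each bucket covers twice the range of the previous one.

    Bucket 0 holds only the value 0; bucket n holds values in [2**(n-1), 2**n).
    The last bucket also collects everything larger.
    """
    return min(value.bit_length(), num_buckets - 1)

def build_histograms(samples, bucket_size=100, num_buckets=16):
    """Count non-negative integer samples into both kinds of histogram."""
    linear = [0] * num_buckets
    logarithmic = [0] * num_buckets
    for value in samples:
        linear[linear_bucket(value, bucket_size, num_buckets)] += 1
        logarithmic[log_bucket(value, num_buckets)] += 1
    return linear, logarithmic
```

A linear histogram gives uniform resolution over a known range, while a logarithmic one covers values spanning many orders of magnitude with the same number of buckets, which suits quantities such as latencies.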
- Network subsystem
- The ksmbd module has been added to the kernel with an implementation of a file server using the SMB3 protocol. The module complements the SMB client implementation previously available in the kernel and, unlike a user-space SMB server, is more efficient in terms of performance, memory consumption, and integration with advanced kernel capabilities. Ksmbd is touted as a high-performance, embedded-device-ready extension to Samba that integrates with Samba tools and libraries as needed. Notable among the features of ksmbd is improved support for distributed file caching (SMB leases) on local systems, which can significantly reduce traffic. In the future, the developers plan to add support for RDMA (“smbdirect”) and protocol extensions related to strengthening encryption and digital signature verification.
- The CIFS client has dropped support for NTLM and weaker authentication algorithms.
- The network bridge implementation adds multicast support for VLANs.
- The bonding driver, used for aggregating network interfaces, adds support for the XDP (eXpress Data Path) subsystem, which allows you to manipulate network packets at a stage before they are processed by the Linux kernel network stack.
- The mac80211 wireless stack supports 6 GHz STA (station) operation in LPI, SP and VLP modes, as well as the ability to set individual TWT (Target Wake Time) in access point mode.
- Added support for the Management Component Transport Protocol (MCTP), which is used for communication between management controllers and related devices (host processors, peripherals, etc.).
- Integration of MPTCP (MultiPath TCP) into the kernel has continued. MPTCP is an extension of the TCP protocol that lets a TCP connection deliver packets simultaneously along several routes through different network interfaces bound to different IP addresses. The new release adds support for fullmesh addresses.
- Netfilter adds handlers for network streams encapsulated in SRv6 (Segment Routing IPv6).
- Added sockmap support for Unix streaming sockets.
- Hardware
- The amdgpu driver supports Cyan Skillfish APUs (equipped with Navi 1x GPUs). Video codec support is implemented for Yellow Carp APUs. Improved support for Aldebaran GPUs. Added new card IDs for GPUs based on Navi 24 “Beige Goby” and RDNA2. An improved implementation of virtual screens (VKMS) is proposed. Added support for monitoring the temperature of AMD Zen 3 chips.
- The amdkfd driver (for discrete GPUs such as Polaris) implements a shared virtual memory (SVM) manager based on the Heterogeneous Memory Management (HMM) subsystem, which allows the use of devices that have their own memory management units (MMUs) and can access main memory. In particular, HMM makes it possible to set up a shared address space between the GPU and the CPU, in which the GPU can access the main memory of the process.
- The i915 driver for Intel graphics expands the use of the TTM video memory manager and includes the ability to manage power consumption based on GuC (Graphics micro Controller). Preparations have begun to implement support for Intel ARC Alchemist graphics and Intel Xe-HP GPUs.
- The nouveau driver implements eDP panel backlight control using DPCD (DisplayPort Configuration Data).
- Added support for Adreno 7c Gen 3 and Adreno 680 GPUs in the msm driver.
- An IOMMU driver has been implemented for the Apple M1 chip.
- Added sound driver for systems based on AMD Van Gogh APUs.
- The Realtek R8188EU driver has been added to the staging branch, replacing the old driver (rtl8188eu) for Realtek RTL8188EU 802.11b/g/n wireless chips.
- The kernel now includes the ocp_pt driver for a PCIe board developed by Meta (Facebook) that implements a miniature atomic clock and a GNSS receiver and can be used to run standalone time synchronization servers.
- Added support for smartphones Sony Xperia 10II (Snapdragon 665), Xiaomi Redmi 2 (Snapdragon MSM8916), Samsung Galaxy S3 (Snapdragon MSM8226), Samsung Gavini / Codina / Kyle.
- Added support for ARM SoC and NVIDIA Jetson TX2 NX Developer Kit, Sancloud BBE Lite, PicoITX, DRC02, SolidRun SolidSense, SKOV i.MX6, Nitrogen8, Traverse Ten64, GW7902, Microchip SAMA7, Qualcomm Snapdragon SDM636 / SM8150, Renesas R-Car -2G / M3e-2G, Marvell CN913x, ASpeed AST2600 (Facebook Cloudripper, Elbert and Fuji server boards), 4KOpen STiH418-b2264.
- Added support for LCD-panels Gopher 2b, EDT ETM0350G0DH6 / ETMV570G2DHU, LOGIC Technologies LTTD800480070-L6WH-RT, Multi-Innotechnology MI1010AIT-1CP1, Innolux EJ030NA 3.0, ilitek ili9341, E Ink WSA30A.
- Added LiteETH driver with support for Ethernet controllers used in LiteX software SoCs (for FPGAs).
- The lowlatency option has been added to the usb-audio driver to control whether it operates in minimum latency mode. The quirk_flags option has also been added to pass device-specific settings.
At the same time, the Free Software Foundation of Latin America has released a version of the completely free kernel 5.15: Linux-libre 5.15-gnu, cleaned of firmware and driver elements containing non-free components or code sections whose scope of use is limited by the manufacturer. The new release writes a message to the log when cleaning is complete. Fixed issues with building packages using mkspec and improved support for snap packages. Removed some warnings displayed when processing the header file firmware.h. Allowed some types of warnings (“format-extra-args”, comments, unused functions and variables) to be output when building in “-Werror” mode. Added cleaning for the gehc-achc driver. Updated the blob cleaning code in the adreno, btusb, btintel, brcmfmac, and aarch64 qcom drivers and subsystems. Stopped cleaning the prism54 (removed) and rtl8188eu (replaced with r8188eu) drivers.