
Kernel Log - What 3.19 brings (3): Infrastructure

The Linux kernel now contains the "amdkfd" driver, which allows "Heterogeneous Queuing" (HQ) to be used with processors and graphics chips from AMD. The technology is one of several components of the "Heterogeneous System Architecture" (HSA), with which AMD and other companies want to make cooperation between the different chips in a system more efficient. Heterogeneous queuing is meant to achieve this by having each subtask of a larger job executed by the chip best suited to it.

However, programs do not interact directly with the new driver; instead, they queue work for the individual chips through an "HSA runtime". These tasks can launch one another, so that, for example, a task in the GPU queue can process some data and then start a task in the CPU queue that processes the data further. This avoids context switches and thus makes dividing work between CPU and GPU more efficient.
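The chaining described above can be illustrated with a toy model. The following Python sketch is not the HSA runtime API; the queue names, `run_all`, and both task functions are hypothetical and only show how a task on one queue can enqueue a follow-up task on another.

```python
from collections import deque

# Toy model only: two work queues standing in for the GPU and CPU queues.
# None of these names come from the HSA runtime.
queues = {"gpu": deque(), "cpu": deque()}
log = []


def gpu_square(data):
    """Pretend GPU task: process the data, then queue a CPU follow-up."""
    squared = [x * x for x in data]
    log.append(("gpu", squared))
    # The task itself enqueues the next step on the other queue.
    queues["cpu"].append((cpu_sum, squared))


def cpu_sum(data):
    """Pretend CPU task: finish the computation started on the GPU."""
    log.append(("cpu", sum(data)))


def run_all():
    """Drain both queues until no task is left."""
    while any(queues.values()):
        for name in ("gpu", "cpu"):
            while queues[name]:
                task, arg = queues[name].popleft()
                task(arg)


queues["gpu"].append((gpu_square, [1, 2, 3]))
run_all()
print(log)
```

The program only submits the first task; the hand-over to the second queue happens inside the task, which is the point of the model.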

The HSA runtime is a userspace library that AMD developed itself and published under an open source license in November. It interacts not only with the new kernel driver, but also with LLVM's R600 backend, which generates the code that is executed on the CPU or GPU. AMD developers have contributed the functions responsible for this to LLVM; they will become part of LLVM version 3.6, which is due in spring. AMD chips that support heterogeneous queuing include processors with a Kaveri core, which AMD has been selling since the beginning of 2014.

The kernel changes needed to use the best-known HSA technology are still under review: the feature known as Heterogeneous Memory Management (HMM) or Heterogeneous Uniform Memory Access (hUMA), in which HSA-compatible chips use a shared virtual address space. This means that data processed by the main processor does not have to be copied from main memory to the graphics processor's memory before code running there can work on it; instead, only a pointer to the area of the virtual address space that holds the data needs to be passed.

The same works in the opposite direction and, together with the amdkfd driver already included in the 3.19 kernel, simplifies the use of graphics processors for general computing tasks (GPGPU, General Purpose Computation on Graphics Processing Units). HSA is also of interest for integrating other chips more efficiently, for example crypto accelerators or functional units of systems-on-chip (SoCs).
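The difference between copying data and passing a pointer into a shared address space can be shown with a small analogy. This Python sketch is only a model: a `bytearray` stands in for main memory, a `memoryview` for the pointer, and `gpu_uppercase` is a hypothetical stand-in for code running on the GPU.

```python
# Toy model only: a bytearray stands in for memory in a shared address space.
main_memory = bytearray(b"sensor data")

# Without a shared address space: the data is copied before processing.
gpu_copy = bytes(main_memory)

# With a shared address space: only a reference to the same memory is passed.
shared_view = memoryview(main_memory)


def gpu_uppercase(buffer):
    """Pretend GPU code that modifies ASCII lowercase letters in place."""
    for i in range(len(buffer)):
        if 97 <= buffer[i] <= 122:
            buffer[i] = buffer[i] - 32


gpu_uppercase(shared_view)

print(main_memory)  # changed in place through the shared view
print(gpu_copy)     # the earlier copy is unaffected
```

Because the view refers to the same bytes, the result is visible to the "CPU side" without copying anything back, which mirrors the two-way benefit described above.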

What Linux 3.19 brings

Linux 3.19 will probably be released at the beginning of the second week of February, as Linus Torvalds indicated when he released the seventh pre-release version. No further major changes are to be expected, because the kernel hackers merged all essential new features back in December. The Kernel Log has therefore been able to give an overview of the most important changes in this version before its completion. It does so in the four-part article series "What 3.19 brings", which covers the various areas of the kernel in turn.

IPC via kernel

Binder, which Android uses for interprocess communication, is now a full part of the kernel. The IPC service previously lived in the staging area for code with quality deficiencies; those deficiencies have not been fixed, and there are no plans to fix them.

Staging maintainer Greg Kroah-Hartman nevertheless moved Binder into the regular kernel code, since the Binder API has to be supported anyway; in a merge comment, however, he called the code "hideous" and said the API left a lot to be desired. Three long-time kernel developers strongly criticized the move in a discussion; they also warned against the API and its use in programs, but could not prevent the move.

The kernel thus now has a high-level IPC service. As Kroah-Hartman indicates in the merge comment, however, work is in progress to replace Binder with something better. He is evidently alluding to Kdbus, the kernel IPC service that originated in the systemd environment and is intended to succeed D-Bus. Kroah-Hartman himself works on Kdbus, whose kernel parts were recently sent to the kernel developers' mailing list for public review for the third time. It is not yet foreseeable when the Kdbus developers will try to get their code merged into the Linux kernel.

64-bit ARM

The ARM64 code in Linux can now emulate some CPU instructions that older ARM architectures supported but that ARM has dropped with ARMv8 or wants to drop at some point. Unlike in the x86 world, new ARM architectures are not fully backward compatible, so without such emulation, programs that were compiled for older ARM architectures and use these instructions cannot run; details are explained in an article on LWN.net.
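The general idea of such emulation, trapping an instruction the processor no longer executes and carrying it out in software, can be sketched as follows. This is a toy model rather than the kernel's code: the instruction names `add` and `swp`, their simplified register-only semantics, and all function names are illustrative assumptions.

```python
# Toy model only: a "CPU" that executes some instructions natively and
# raises an exception for the rest, plus a handler that emulates them.


class UndefinedInstruction(Exception):
    """Raised when the toy CPU meets an instruction it does not implement."""


def native_add(regs, a, b):
    regs[a] = regs[a] + regs[b]


def emulate_swp(regs, a, b):
    """Software emulation of a hypothetical removed 'swp' instruction."""
    regs[a], regs[b] = regs[b], regs[a]


NATIVE = {"add": native_add}
EMULATED = {"swp": emulate_swp}


def execute(regs, op, a, b):
    """Run one instruction on the toy CPU, or trap if it is unknown."""
    if op not in NATIVE:
        raise UndefinedInstruction(op)
    NATIVE[op](regs, a, b)


def run(program, regs):
    for op, a, b in program:
        try:
            execute(regs, op, a, b)
        except UndefinedInstruction:
            # The "kernel" steps in, emulates the instruction, and continues.
            if op not in EMULATED:
                raise
            EMULATED[op](regs, a, b)
    return regs


regs = run([("add", "r0", "r1"), ("swp", "r0", "r1")], {"r0": 1, "r1": 2})
print(regs)
```

The program itself is unchanged and never notices that one of its instructions was handled in software; the cost is the detour through the trap handler.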

The Linux kernel now supports AMD's Seattle SoC (Opteron A1100), a 64-bit chip with ARMv8 architecture intended for servers, which so far has only been available on motherboards designed for developers. Another developer board is Juno, the ARMv8 board from ARM itself, which the kernel now also supports. ARM subsystem maintainer Arnd Bergmann lists details on these and other changes to the ARM and ARM64 code in a blog post.


The memory management code of the kernel now supports more functions of the Page Attribute Table (PAT). As a result, graphics drivers, for example, could now use write-through caching when accessing the video memory, which can increase graphics performance.

The Linux kernel now supports the x86 extension MPX, a hardware protection against buffer overflows and underflows that is explained in more detail in a report on the heise news ticker and in the documentation on the kernel support. Intel plans to support these Memory Protection Extensions for the first time in the "Skylake" processor generation, which is expected in the second half of the year according to the "Prozessorgeflüster" ("processor whispers") column in c't 4/15.
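What a bounds check against overflows and underflows amounts to can be shown with a software analogy. The sketch below is not how MPX is implemented or programmed, since MPX performs such checks in hardware; the class and method names are hypothetical.

```python
class BoundsViolation(Exception):
    """Raised when an access falls outside the permitted range."""


class BoundedPointer:
    """Toy pointer that carries lower and upper bounds with it."""

    def __init__(self, memory, lower, upper):
        self.memory = memory
        self.lower = lower  # first valid index
        self.upper = upper  # last valid index

    def _check(self, index):
        if index < self.lower:
            raise BoundsViolation(f"underflow at index {index}")
        if index > self.upper:
            raise BoundsViolation(f"overflow at index {index}")

    def store(self, index, value):
        # Every write is checked against the bounds before it happens.
        self._check(index)
        self.memory[index] = value


memory = [0] * 8
ptr = BoundedPointer(memory, lower=2, upper=5)
ptr.store(3, 42)  # inside the bounds, succeeds

try:
    ptr.store(6, 99)  # one past the buffer, rejected
except BoundsViolation as exc:
    error = str(exc)

print(memory, error)
```

The out-of-range write is stopped before it can touch neighbouring memory, which is the class of bug such a protection is aimed at.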


The kernel now supports "Nios II", a 32-bit processor architecture created by Altera that the company uses in some FPGAs.

The kernel developers added support for device tree overlays. They make it easier to give the kernel information on the equipment and configuration of hardware components that are already being used during the boot process. This is important for making it easier to address the plug-in boards called "shields", with which some boards with ARM SoCs can be expanded; details are explained in an article on LWN.net.
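Conceptually, an overlay is a fragment that is merged into an existing tree of hardware descriptions. The following Python sketch models this with nested dictionaries; it does not use the kernel's overlay format or API, and the node names, property values, and the `apply_overlay` helper are all hypothetical.

```python
import copy


def apply_overlay(base, overlay):
    """Return a new tree with the overlay's nodes and properties merged in."""
    result = copy.deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides are nodes: merge them recursively.
            result[key] = apply_overlay(result[key], value)
        else:
            # New node or property, or an overridden property value.
            result[key] = copy.deepcopy(value)
    return result


# Hypothetical base tree: an I2C bus that is present but disabled.
base_tree = {"i2c1": {"status": "disabled", "clock-frequency": 100000}}

# Hypothetical overlay for a plug-in board with a sensor on that bus.
shield_overlay = {
    "i2c1": {
        "status": "okay",
        "sensor@48": {"compatible": "example,temp-sensor"},
    }
}

merged = apply_overlay(base_tree, shield_overlay)
print(merged)
```

The base description stays untouched, so the same base tree can be combined with different overlays depending on which expansion board is attached.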

Some changes to memory management can improve the scalability of the kernel. The results of one particular microbenchmark improved by a good 30 percent as a result, although in a test setup that the developer had created specifically to measure the changes he had made.

Merge Commits

There were hundreds of other changes to the code in the kernel areas described. You can find information about them via the following links, which point to the Git merge commits through which the most important changes in these areas were integrated for the next kernel version.

Architecture and board support


Further background information and information about developments in the Linux kernel and its environment can be found in the previous kernel logs on heise open. New issues of the kernel log will be announced via the Twitter account "@kernellog". (thl)