Direct Rendering Manager

Article snapshot taken from Wikipedia, licensed under the Creative Commons Attribution-ShareAlike license.

The Direct Rendering Manager (DRM) is a subsystem of the Linux kernel responsible for interfacing with GPUs of modern video cards. DRM exposes an API that user-space programs can use to send commands and data to the GPU and perform operations such as configuring the mode setting of the display. DRM was first developed as the kernel-space component of the X Server Direct Rendering Infrastructure, but since then it has been used by other graphics stack alternatives such as Wayland, as well as by standalone applications and libraries such as SDL2 and Kodi.

User-space programs can use the DRM API to command the GPU to do hardware-accelerated 3D rendering and video decoding, as well as GPGPU computing. The Linux kernel already had an API called fbdev, used to manage the framebuffer of a graphics adapter, but it couldn't be used to handle the needs of modern 3D-accelerated GPU-based video hardware. These devices usually require setting and managing

A GPU, fixed-function implemented on field-programmable gate arrays (FPGAs), and fixed-function implemented on application-specific integrated circuits (ASICs). Hardware acceleration is advantageous for performance, and practical when the functions are fixed, so updates are needed less often than in software solutions. With the advent of reprogrammable logic devices such as FPGAs, the restriction of hardware acceleration to fully fixed algorithms has eased since 2010, allowing hardware acceleration to be applied to problem domains requiring modification to algorithms and processing control flow. The disadvantage, however,

A header file. Request numbers usually combine a code identifying the device or class of devices for which the request is intended and a number indicating the particular request; the code identifying the device or class of devices is usually a single ASCII character. Some Unix systems, including 4.2BSD and later BSD releases, operating systems derived from those releases, and Linux, have conventions that also encode within
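As a hedged illustration of such request-number conventions, here is a minimal sketch using Linux's _IO, _IOR and _IOW macros from <sys/ioctl.h>; the device magic character and request names are hypothetical:

    /* Hypothetical driver header: request codes built with the Linux
     * convention macros, which pack a magic character, a request number,
     * the transfer direction and the size of the data type. */
    #include <sys/ioctl.h>

    #define MYDEV_MAGIC 'k'                             /* class-identifying ASCII character */
    #define MYDEV_RESET      _IO(MYDEV_MAGIC, 0)        /* no data transfer */
    #define MYDEV_GET_COUNT  _IOR(MYDEV_MAGIC, 1, int)  /* driver to user, sizeof(int) */
    #define MYDEV_SET_COUNT  _IOW(MYDEV_MAGIC, 2, int)  /* user to driver, sizeof(int) */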

A register file). Hardware accelerators improve the execution of a specific algorithm by allowing greater concurrency, having specific datapaths for their temporary variables, and reducing the overhead of instruction control in the fetch-decode-execute cycle. Modern processors are multi-core and often feature parallel "single instruction, multiple data" (SIMD) units. Even so, hardware acceleration still yields benefits. Hardware acceleration

A serial port. The normal read and write calls on a serial port receive and send data bytes. An ioctl(fd,TCSETS,data) call, separate from such normal I/O, controls various driver options like the handling of special characters, or the output signals on the port (such as the DTR signal). A Win32 DeviceIoControl call takes as parameters a device handle, a control code, optional input and output buffers with their sizes, a variable that receives the number of bytes returned, and an optional OVERLAPPED structure for asynchronous operation. The Win32 device control code takes into consideration
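A hedged sketch of the serial-port case just described, using the Linux TCGETS/TCSETS requests and the modem-control requests that drive the DTR signal (error handling omitted; /dev/ttyS0 is just an example device):

    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <termios.h>

    int main(void)
    {
        int fd = open("/dev/ttyS0", O_RDWR | O_NOCTTY);

        /* Read, modify and write back the terminal settings. */
        struct termios tio;
        ioctl(fd, TCGETS, &tio);
        tio.c_cflag &= ~PARENB;          /* e.g. disable parity */
        ioctl(fd, TCSETS, &tio);

        /* Raise the DTR output signal via the modem-control lines. */
        int status;
        ioctl(fd, TIOCMGET, &status);
        status |= TIOCM_DTR;
        ioctl(fd, TIOCMSET, &status);
        return 0;
    }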

A GEM name from a GEM handle. The process can then pass this GEM name (a 32-bit integer) to another process using any available IPC mechanism. The GEM name can be used by the recipient process to obtain a local GEM handle pointing to the original GEM object. Unfortunately, the use of GEM names to share buffers is not secure. A malicious third-party process accessing the same DRM device could try to guess

A GEM object with another process can convert its local GEM handle to a DMA-BUF file descriptor and pass it to the recipient, which in turn can get its own GEM handle from the received file descriptor. This method is used by DRI3 to share buffers between the client and the X Server, and also by Wayland. In order to work properly, a video card or graphics adapter must set a mode (a combination of screen resolution, color depth and refresh rate) that

A GEM object. GEM handles are local 32-bit integers, unique to a process but repeatable in other processes, and therefore not suitable for sharing. What is needed is a global namespace, and GEM provides one through the use of global handles called GEM names. A GEM name refers to one, and only one, GEM object created within the same DRM device by the same DRM driver, by using a unique 32-bit integer. GEM provides an operation, flink, to obtain
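A minimal sketch of the flink and open operations as exposed through the DRM ioctl interface, assuming an already-open DRM device file descriptor and a valid GEM handle (field names follow the drm.h UAPI header; the header path may vary by distribution):

    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <drm/drm.h>   /* Linux UAPI header; path may differ per system */

    /* Exporting side: obtain a global GEM name for a local GEM handle. */
    uint32_t gem_name_from_handle(int drm_fd, uint32_t handle)
    {
        struct drm_gem_flink flink = { .handle = handle };
        ioctl(drm_fd, DRM_IOCTL_GEM_FLINK, &flink);
        return flink.name;   /* 32-bit global name, shareable via IPC */
    }

    /* Importing side: turn a received GEM name into a local handle. */
    uint32_t gem_handle_from_name(int drm_fd, uint32_t name)
    {
        struct drm_gem_open op = { .name = name };
        ioctl(drm_fd, DRM_IOCTL_GEM_OPEN, &op);
        return op.handle;
    }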

A command queue in their own memory to dispatch commands to the GPU, and also require management of buffers and free space within that memory. Initially, user-space programs (such as the X Server) directly managed these resources, but they usually acted as if they were the only ones with access to them. When two or more programs tried to control the same hardware at the same time and set its resources each in its own way, most of the time this ended catastrophically. The Direct Rendering Manager

A dedicated VRAM, in a suitable way, and to provide every conceivable feature in a memory manager for use with any type of hardware, led to an overly complex solution with an API far larger than needed. Some DRM developers thought that it wouldn't fit well with any specific driver, especially the API. When GEM emerged as a simpler memory manager, its API was preferred over the TTM one. But some driver developers considered that

A failed mode-setting process has to be undone ("rollback"). Atomic mode-setting allows one to know beforehand whether a certain mode configuration is appropriate, by providing mode-testing capabilities. When an atomic mode is tested and its validity confirmed, it can be applied with a single indivisible (atomic) commit operation. Both the test and commit operations are provided by the same new ioctl with different flags. Atomic page flip on

A general-purpose central processing unit (CPU). Any transformation of data that can be calculated in software running on a generic CPU can also be calculated in custom-made hardware, or in some mix of both. To perform computing tasks more efficiently, generally one can invest time and money in improving the software, improving the hardware, or both. There are various approaches with advantages and disadvantages in terms of decreased latency, increased throughput, and reduced energy consumption. Typical advantages of focusing on software may include greater versatility, more rapid development, lower non-recurring engineering costs, heightened portability, and ease of updating features or patching bugs, at

A graphical compositor, ...) have full access to the DRM API, including the privileged parts like the modeset API. Other user-space applications that want to render or make GPGPU computations must be granted access by the owner of the DRM device ("DRM Master") through a special authentication interface. The authenticated applications can then render or make computations using a restricted version of

A large collection of facilities. Some of these facilities may not be foreseen by the kernel designer, and as a consequence it is difficult for a kernel to provide system calls for using the devices. To solve this problem, the kernel is designed to be extensible, and may accept an extra module called a device driver which runs in kernel space and can directly address the device. An ioctl interface

A local GEM handle to a DMA-BUF file descriptor, and another for the exact opposite operation. These two new ioctls were later reused as a way to fix the inherent insecurity of GEM buffer sharing. Unlike GEM names, file descriptors cannot be guessed (they are not a global namespace), and Unix operating systems provide a safe way to pass them through a Unix domain socket using the SCM_RIGHTS semantics. A process that wants to share
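The SCM_RIGHTS mechanism mentioned above can be sketched as follows; this is the standard Unix pattern for passing a file descriptor (such as a DMA-BUF obtained from a GEM handle) over a Unix domain socket:

    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    /* Send one file descriptor (e.g. a DMA-BUF) over a Unix domain socket. */
    int send_fd(int sock, int fd)
    {
        char dummy = '*';   /* at least one byte of real data must accompany the fd */
        struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
        char ctrl[CMSG_SPACE(sizeof(int))];
        struct msghdr msg = {
            .msg_iov = &iov, .msg_iovlen = 1,
            .msg_control = ctrl, .msg_controllen = sizeof(ctrl),
        };
        struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;   /* the ancillary payload is a descriptor */
        cmsg->cmsg_len = CMSG_LEN(sizeof(int));
        memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));
        return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
    }

The kernel duplicates the descriptor into the receiving process, so the recipient ends up with its own valid file descriptor referring to the same underlying buffer.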

A new Linux version is going to be released. Torvalds, as top maintainer of the whole kernel, holds the last word on whether a patch is suitable for inclusion in the kernel. For historical reasons, the source code of the libdrm library is maintained under the umbrella of the Mesa project. In 1999, while developing DRI for XFree86, Precision Insight created the first version of DRM for

A parameter specifying a request code; the effect of a call depends completely on the request code. Request codes are often device-specific. For instance, a CD-ROM device driver which can instruct a physical device to eject a disc would provide an ioctl request code to do so. Device-independent request codes are sometimes used to give userspace access to kernel functions which are only used by core system software or are still under development. The ioctl system call first appeared in Version 7 of Unix under that name. It
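The CD-ROM eject case reads as the following sketch on Linux (CDROMEJECT comes from the kernel's UAPI headers; /dev/cdrom is an example path):

    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <linux/cdrom.h>

    int main(void)
    {
        /* O_NONBLOCK lets the device open even with no disc loaded. */
        int fd = open("/dev/cdrom", O_RDONLY | O_NONBLOCK);
        return ioctl(fd, CDROMEJECT, 0);   /* the request code tells the driver what to do */
    }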

A separate VRAM. For this reason, other DRM drivers have decided to expose the GEM API to user-space programs, while internally implementing a different memory manager better suited to their particular hardware and memory architecture. The GEM API also provides ioctls for control of the execution flow (command buffers), but these are Intel-specific, to be used with Intel i915 and later GPUs. No other DRM driver has attempted to implement any part of

A single FPGA or ASIC. Similarly, specialized functional units can be composed in parallel, as in digital signal processing, without being embedded in a processor IP core. Therefore, hardware acceleration is often employed for repetitive, fixed tasks involving little conditional branching, especially on large amounts of data. This is how Nvidia's CUDA line of GPUs is implemented. As device mobility has increased, new metrics have been developed that measure

A tight integration with the graphics memory manager is highly recommended. That is the main reason why the kernel mode-setting code was incorporated into DRM rather than into a separate subsystem. To avoid breaking backwards compatibility of the DRM API, Kernel Mode-Setting is provided as an additional driver feature of certain DRM drivers. Any DRM driver can choose to provide the DRIVER_MODESET flag when it registers with

A way to do the mode-setting natively instead of relying on the BIOS, showing that it was possible to do it using normal kernel code and laying the groundwork for what would become Kernel Mode Setting. In May 2007 Jesse Barnes (Intel) published the first proposal for a drm-modesetting API and a working native implementation of mode-setting for Intel GPUs within the i915 DRM driver. In December 2007 Jerome Glisse started to add

Is a Linux kernel internal API designed to provide a generic mechanism to share DMA buffers across multiple devices, possibly managed by different types of device drivers. For example, a Video4Linux device and a graphics adapter device could share buffers through DMA-BUF to achieve zero-copy of the data of a video stream produced by the first and consumed by the latter. Any Linux device driver can implement this API as an exporter, as a user (consumer), or both. This feature

Is a single system call by which userspace may communicate with device drivers. Requests on a device driver are vectored with respect to this ioctl system call, typically by a handle to the device and a request number. The basic kernel can thus allow the userspace to access a device driver without knowing anything about the facilities supported by the device, and without needing an unmanageably large collection of system calls. A common use of ioctl

Is also used by the sysmon framework. One use of ioctl in code exposed to end-user applications is terminal I/O. Unix operating systems have traditionally made heavy use of command-line interfaces, originally with hardware text terminals such as VT100s attached to serial ports, and later with terminal emulators and remote login servers using pseudoterminals. Serial port devices and pseudoterminals are both controlled and configured using ioctl calls. For instance,

Is built upon the old KMS API. It uses the same model and objects (CRTCs, encoders, connectors, planes, ...), but with an increasing number of object properties that can be modified. The atomic procedure is based on changing the relevant properties to build the state that we want to test or commit. The properties we want to modify depend on whether we want to do a mode-setting (mostly CRTC, encoder and connector properties) or page flipping (usually plane properties). The ioctl
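A hedged sketch of this property-based procedure using the libdrm wrappers; the object and property IDs shown are placeholders that a real client would first discover through the KMS property-query calls:

    #include <stdint.h>
    #include <xf86drm.h>
    #include <xf86drmMode.h>

    int atomic_test_then_commit(int fd, uint32_t crtc_id, uint32_t active_prop)
    {
        /* Atomic support must be requested explicitly by the client. */
        drmSetClientCap(fd, DRM_CLIENT_CAP_ATOMIC, 1);

        drmModeAtomicReq *req = drmModeAtomicAlloc();
        /* Build the desired state by setting object properties. */
        drmModeAtomicAddProperty(req, crtc_id, active_prop, 1);

        /* Same ioctl, different flags: first test the configuration... */
        int ret = drmModeAtomicCommit(fd, req, DRM_MODE_ATOMIC_TEST_ONLY, NULL);
        if (ret == 0)
            /* ...then apply it in one indivisible commit. */
            ret = drmModeAtomicCommit(fd, req, DRM_MODE_ATOMIC_ALLOW_MODESET, NULL);

        drmModeAtomicFree(req);
        return ret;
    }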

Is developed within the Linux kernel, and its source code resides in the /drivers/gpu/drm directory of the Linux source code. The subsystem maintainer is Dave Airlie, with other maintainers taking care of specific drivers. As usual in Linux kernel development, DRM submaintainers and contributors send their patches with new features and bug fixes to the main DRM maintainer, who integrates them into his own Linux repository. The DRM maintainer in turn submits all of the patches that are ready to be mainlined to Linus Torvalds whenever

Is done by processing Boolean functions on the binary input, and then outputting the results for storage or further processing by other devices. Because all Turing machines can run any computable function, it is always possible to design custom hardware that performs the same function as a given piece of software. Conversely, software can always be used to emulate the function of a given piece of hardware. Custom hardware may offer higher performance per watt for

Is in practice little difference between an ioctl call and a system call; an ioctl call is simply a system call with a different dispatching mechanism. Many of the arguments against expanding the kernel system call interface could therefore be applied to ioctl interfaces. To application developers, system calls appear no different from application subroutines; they are simply function calls that take arguments and return values. The core libraries (e.g. libc) mask

Is limited in parallel processing capability only by the area and logic blocks available on the integrated circuit die. Therefore, hardware is much more free to offer massive parallelism than software on general-purpose processors, offering the possibility of implementing the parallel random-access machine (PRAM) model. It is common to build multicore and manycore processing units out of microprocessor IP core schematics on

Is not the case. ioctl interfaces are larger, more diverse, and less well defined than system calls, and thus harder to audit. Furthermore, because ioctl calls can be provided by third-party developers, often after the core operating system has been released, ioctl call implementations may receive less scrutiny and thus harbor more vulnerabilities. Finally, some ioctl calls, particularly for third-party device drivers, can be entirely undocumented. Varying fixes for this have been created, with

Is overhead to decoding instruction opcodes and multiplexing available execution units on a microprocessor or microcontroller, leading to low circuit utilization. Modern processors that provide simultaneous multithreading exploit the under-utilization of available processor functional units and instruction-level parallelism between different hardware threads. Hardware execution units do not in general rely on

Is suitable for any computation-intensive algorithm which is executed frequently in a task or program. Depending upon the granularity, hardware acceleration can vary from a small functional unit to a large functional block (like motion estimation in MPEG-2).

Ioctl

In computing, ioctl (an abbreviation of input/output control) is a system call for device-specific input/output operations and other operations which cannot be expressed by regular file semantics. It takes

Is supported by most Unix and Unix-like systems, including Linux and macOS, though the available request codes differ from system to system. Microsoft Windows provides a similar function, named "DeviceIoControl", in its Win32 API. Conventional operating systems can be divided into two layers, userspace and the kernel. Application code such as a text editor resides in userspace, while

Is that in many open-source projects it requires proprietary libraries that not all vendors are keen to distribute or expose, making it difficult to integrate into such projects. Integrated circuits are designed to handle various operations on both analog and digital signals. In computing, digital signals are the most common and are typically represented as binary numbers. Computer hardware and software use this binary representation to perform computations. This

Is the same for both cases, the difference being the list of properties passed with each one. In the original DRM API, the DRM device /dev/dri/cardX is used for both privileged (modesetting, other display control) and non-privileged (rendering, GPGPU compute) operations. For security reasons, opening the associated DRM device file requires special privileges "equivalent to root privileges". This leads to an architecture where only some trusted user-space programs (the X server,

Is to control hardware devices. For example, on Win32 systems, ioctl calls can communicate with USB devices, or they can discover drive-geometry information of the attached storage devices. On OpenBSD and NetBSD, ioctl is used by the bio(4) pseudo-device driver and the bioctl utility to implement RAID volume management in a unified vendor-agnostic interface, similar to ifconfig. On NetBSD, ioctl

Is within the range of values supported by itself and the attached display screen. This operation is called mode-setting, and it usually requires raw access to the graphics hardware, i.e. the ability to write to certain registers of the video card display controller. A mode-setting operation must be performed before starting to use the framebuffer, and also when the mode is required to change by an application or

The fcntl ("file control") system call configures open files, and is used in situations such as enabling non-blocking I/O; and the setsockopt ("set socket option") system call configures open network sockets, a facility used to configure the ipfw packet firewall on BSD Unix systems. Netlink is a socket-like mechanism for inter-process communication (IPC), designed to be a more flexible successor to ioctl. ioctl calls minimize
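The two vectored calls just mentioned look like this in practice; a minimal sketch enabling non-blocking I/O with fcntl and setting a common socket option with setsockopt:

    #include <fcntl.h>
    #include <sys/socket.h>

    void configure(int fd, int sock)
    {
        /* fcntl: switch an open file descriptor to non-blocking I/O. */
        int flags = fcntl(fd, F_GETFL, 0);
        fcntl(fd, F_SETFL, flags | O_NONBLOCK);

        /* setsockopt: allow quick reuse of a local address on a socket. */
        int one = 1;
        setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
    }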

The 3dfx video cards, as a Linux kernel patch included within the Mesa source code. Later that year, the DRM code was mainlined in Linux kernel 2.3.18 under the /drivers/char/drm/ directory for character devices. During the following years the number of supported video cards grew. When Linux 2.4.0 was released in January 2001 there was already support for Creative Labs GMX 2000, Intel i810, Matrox G200/G400 and ATI Rage 128, in addition to 3dfx Voodoo3 cards, and that list expanded during

The Graphics Address Remapping Table (GART). TTM should also handle the portions of the video RAM that are not directly addressable by the CPU, and do so with the best possible performance, considering that user-space graphics applications typically work with large amounts of video data. Another important matter was maintaining the consistency between the different memories and caches involved. The main concept of TTM is

The Unix principle of "everything is a file" to expose the GPUs through the filesystem name space, using device files under the /dev hierarchy. Each GPU detected by DRM is referred to as a DRM device, and a device file /dev/dri/cardX (where X is a sequential number) is created to interface with it. User-space programs that want to talk to the GPU must open this file and use ioctl calls to communicate with DRM. Different ioctls correspond to different functions of
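A minimal sketch of this device-file model using the libdrm wrapper around the version ioctl (assuming the first DRM device exists and is accessible to the caller):

    #include <fcntl.h>
    #include <stdio.h>
    #include <xf86drm.h>

    int main(void)
    {
        /* Each detected GPU appears as a device file under /dev/dri. */
        int fd = open("/dev/dri/card0", O_RDWR | O_CLOEXEC);
        if (fd < 0)
            return 1;

        /* libdrm wraps the DRM_IOCTL_VERSION ioctl in drmGetVersion(). */
        drmVersionPtr v = drmGetVersion(fd);
        printf("DRM driver: %s (%s)\n", v->name, v->desc);
        drmFreeVersion(v);
        return 0;
    }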

The XFree86 Server (and later the X.Org Server) handled the case when the user switched from the graphical environment to a text virtual console by saving its mode-setting state and restoring it when the user switched back to X. This process caused an annoying flicker in the transition, and could also fail, leading to a corrupted or unusable output display. The user-space mode-setting approach caused other issues as well. To address these problems,

The instruction cycle), to execute the instructions constituting the software program. Relying on a common cache for code and data leads to the "von Neumann bottleneck", a fundamental limitation on the throughput of software on processors implementing the von Neumann architecture. Even in the modified Harvard architecture, where instructions and data have separate caches in the memory hierarchy, there

The mode setting and page flipping operations. This enhanced KMS API is what is called Atomic Display (formerly known as atomic mode-setting and atomic or nuclear pageflip). The purpose of atomic mode-setting is to ensure a correct change of mode in complex configurations with multiple restrictions, by avoiding intermediate steps which could lead to an inconsistent or invalid video state; it also avoids risky video states when

The server industry, intended to prevent regular expression denial of service (ReDoS) attacks. The hardware that performs the acceleration may be part of a general-purpose CPU or a separate unit called a hardware accelerator, though they are usually referred to by a more specific term, such as 3D accelerator or cryptographic accelerator. Traditionally, processors were sequential (instructions are executed one by one), and were designed to run general-purpose algorithms controlled by instruction fetch (for example, moving temporary results to and from

The "buffer objects", regions of video memory that at some point must be addressable by the GPU. When a user-space graphics application wants access to a certain buffer object (usually to fill it with content), TTM may require relocating it to a type of memory addressable by the CPU. Further relocations, or GART mapping operations, could happen when the GPU needs access to a buffer object but it isn't in

The 2.4.x series, with drivers for ATI Radeon cards, some SiS video cards and Intel 830M and subsequent integrated GPUs. The split of DRM into two components, DRM core and DRM driver, called the DRM core/personality split, was done during the second half of 2004, and merged into kernel version 2.6.11. This split allowed multiple DRM drivers for multiple devices to work simultaneously, opening the way to multi-GPU support. The idea of putting all

The three main manufacturers of GPUs for desktop computers (AMD, NVIDIA and Intel), as well as from a growing number of mobile GPU and System on a chip (SoC) integrators. The quality of each driver varies highly, depending on the degree of cooperation by the manufacturer and other matters. There are also a number of drivers for old, obsolete hardware, detailed in the next table for historical purposes. The Direct Rendering Manager

The DRM API. A library called libdrm was created to facilitate the interface of user-space programs with the DRM subsystem. This library is merely a wrapper that provides a function written in C for every ioctl of the DRM API, as well as constants, structures and other helper elements. The use of libdrm not only avoids exposing the kernel interface directly to applications, but presents

The DRM API that, either for security purposes or for concurrency issues, must be restricted to use by a single user-space process per device. To implement this restriction, DRM limits such ioctls to be invoked only by the process considered the "master" of a DRM device, usually called DRM-Master. Only one of all the processes that have the device node /dev/dri/cardX opened will have its file handle marked as master, specifically

The DRM API without privileged operations. This design imposes a severe constraint: there must always be a running graphics server (the X Server, a Wayland compositor, ...) acting as DRM-Master of a DRM device so that other user-space programs can be granted the use of the device, even in cases not involving any graphics display, like GPGPU computations. The "render nodes" concept tries to solve these scenarios by splitting

The DRM core to indicate that it supports the KMS API. Drivers that implement Kernel Mode-Setting are often called KMS drivers, as a way to differentiate them from the legacy (without KMS) DRM drivers. KMS has been adopted to such an extent that certain drivers which lack 3D acceleration (or for which the hardware vendor doesn't want to expose or implement it) nevertheless implement

The DRM user-space API into two interfaces, one privileged and one non-privileged, and using separate device files (or "nodes") for each one. For every GPU found, its corresponding DRM driver, if it supports the render nodes feature, creates a device file /dev/dri/renderDX, called the render node, in addition to the primary node /dev/dri/cardX. Clients that use a direct rendering model and applications that want to take advantage of
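A sketch of the render-node path: an unprivileged compute client simply opens the render node instead of the primary node (render-node minor numbers conventionally start at 128, hence renderD128 for the first GPU):

    #include <fcntl.h>

    int open_render_node(void)
    {
        /* No DRM-Master or authentication needed: file permissions on the
         * node (typically group "render" or "video") are the only gate. */
        return open("/dev/dri/renderD128", O_RDWR | O_CLOEXEC);
    }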

The DRM-Master's approval to get such privileges. The procedure consists of the client obtaining a unique token (a 32-bit integer) from the DRM device and passing it to the DRM-Master process, which in turn sends the token back to the DRM device; the device then grants the extra privileges to the process whose file handle presented that token. Due to the increasing size of video memory and the growing complexity of graphics APIs such as OpenGL, the strategy of reinitializing the graphics card state at each context switch was too expensive, performance-wise. Also, modern Linux desktops needed an optimal way to share off-screen buffers with the compositing manager. These requirements led to

The GEM API beyond the memory-management-specific ioctls. Translation Table Maps (TTM) is the name of the generic memory manager for GPUs that was developed before GEM. It was specifically designed to manage the different types of memory that a GPU might access, including dedicated Video RAM (commonly installed in the video card) and system memory accessible through an I/O memory management unit called

The GEM name of a buffer shared by two other processes, simply by probing 32-bit integers. Once a GEM name is found, its contents can be accessed and modified, violating the confidentiality and integrity of the information in the buffer. This drawback was later overcome by the introduction of DMA-BUF support into DRM, as DMA-BUF represents buffers in userspace as file descriptors, which may be shared securely. Another important task for any video-memory management system, besides managing

The GPU and memory architecture, and thus driver-specific. GEM was initially developed by Intel engineers to provide a video-memory manager for its i915 driver. The Intel GMA 9xx family are integrated GPUs with a Uniform Memory Architecture (UMA), where the GPU and CPU share the physical memory and there is no dedicated VRAM. GEM defines "memory domains" for memory synchronization, and while these memory domains are GPU-independent, they are specifically designed with a UMA memory architecture in mind, making them less suitable for other memory architectures like those with

The GPU's address space yet. Each of these relocation operations must handle any related data and cache-coherency issues. Another important TTM concept is fences. Fences are essentially a mechanism to manage concurrency between the CPU and the GPU. A fence tracks when a buffer object is no longer used by the GPU, generally to notify any user-space process with access to it. The fact that TTM tried to manage all kinds of memory architectures, including those with and without

The KMS API without the rest of the DRM API, allowing display servers (like Wayland) to run with ease. KMS models and manages the output devices as a series of abstract hardware blocks commonly found on the display output pipeline of a display controller. These blocks include the CRTCs, encoders, connectors and planes. In recent years there has been an ongoing effort to bring atomicity to some regular operations pertaining to the KMS API, specifically to

The X Server. The VidMode extension was later superseded by the more generic XRandR extension. However, this was not the only code doing mode-setting in a Linux system. During the system boot process, the Linux kernel must set a minimal text mode for the virtual console (based on the standard modes defined by the VESA BIOS extensions). The Linux kernel framebuffer driver also contained mode-setting code to configure framebuffer devices. To avoid mode-setting conflicts,

The additional ioctls. The DRM core exports several interfaces to user-space applications, generally intended to be used through corresponding libdrm wrapper functions. In addition, drivers export device-specific interfaces for use by user-space drivers and device-aware applications through ioctls and sysfs files. External interfaces include: memory mapping, context management, DMA operations, AGP management, vblank control, fence management, memory management, and output management. There are several operations (ioctls) in

The allocated memory in coming operations. The GEM API also provides operations to populate the buffer and to release it when it is no longer needed. Memory from unreleased GEM handles is recovered when the user-space process closes the DRM device file descriptor, whether intentionally or because it terminates. GEM also allows two or more user-space processes using the same DRM device (and hence the same DRM driver) to share

The approach taken by TTM was more suitable for discrete video cards with dedicated video memory and IOMMUs, so they decided to use TTM internally, while exposing their buffer objects as GEM objects and thus supporting the GEM API. Examples of current drivers using TTM as an internal memory manager but providing a GEM API are the radeon driver for AMD video cards and the nouveau driver for NVIDIA video cards. The DMA Buffer Sharing API (often abbreviated as DMA-BUF)

The complexity involved in invoking system calls. The same is true for ioctls, where driver interfaces usually come with a user-space library (e.g. Mesa for the Direct Rendering Infrastructure of graphics drivers). Libpcap and libdnet are two examples of third-party wrapper Unix libraries designed to mask the complexity of ioctl interfaces, for packet capture and packet I/O, respectively. In traditional design, kernels resided in ring 0, separated from device drivers in ring 1, and in microkernels, also from each other. This has largely been given up due to adding

The complexity of the kernel's system call interface. However, by providing a place for developers to "stash" bits and pieces of kernel programming interfaces, ioctl calls complicate the overall user-to-kernel API. A kernel that provides several hundred system calls may provide several thousand ioctl calls. Though the interface to ioctl calls appears somewhat different from conventional system calls, there

The computing facilities of a GPU can do so without requiring additional privileges, by simply opening any existing render node and dispatching GPU operations using the limited subset of the DRM API supported by those nodes, provided they have file system permissions to open the device file. Display servers, compositors and any other program that requires the modeset API or any other privileged operation must open

The corresponding device node during its startup, keeping these privileges for the entire graphical session until it finishes or dies. For the remaining user-space processes there is another way to gain the privilege to invoke some restricted operations on the DRM device, called DRM-Auth. It is basically a method of authentication against the DRM device, to prove to it that the process has

The cost of overhead to compute general operations. Advantages of focusing on hardware may include speedup, reduced power consumption, lower latency, increased parallelism and bandwidth, and better utilization of area and functional components available on an integrated circuit; at the cost of lower ability to update designs once etched onto silicon and higher costs of functional verification, times to market, and

The cost of general-purpose utility. Greater RTL customization of hardware designs allows emerging architectures such as in-memory computing, transport-triggered architectures (TTA) and networks-on-chip (NoC) to further benefit from increased locality of data to execution context, thereby reducing computing and communication latency between modules and functional units. Custom hardware

The development of new methods to manage graphics buffers inside the kernel. The Graphics Execution Manager (GEM) emerged as one of these methods. GEM provides an API with explicit memory management primitives. Through GEM, a user-space program can create, handle and destroy memory objects living in the GPU video memory. These objects, called "GEM objects", are persistent from the user-space program's perspective and don't need to be reloaded every time

The display size is set using the TIOCSWINSZ call. The TIOCSTI (terminal I/O control, simulate terminal input) ioctl function can push a character into a device stream. When applications need to extend the kernel, for instance to accelerate network processing, ioctl calls provide a convenient way to bridge userspace code to kernel extensions. Kernel extensions can provide a location in
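For example, querying the terminal window size can be sketched as follows (TIOCGWINSZ is the read counterpart of the TIOCSWINSZ call mentioned above):

    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    int main(void)
    {
        struct winsize ws;
        /* Ask the terminal driver for the current display size. */
        if (ioctl(STDOUT_FILENO, TIOCGWINSZ, &ws) == 0)
            printf("%d rows x %d cols\n", ws.ws_row, ws.ws_col);
        return 0;
    }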

The filesystem that can be opened by name, through which an arbitrary number of ioctl calls can be dispatched, allowing the extension to be programmed without adding system calls to the operating system. According to an OpenBSD developer, ioctl and sysctl are the two system calls for extending the kernel, with sysctl possibly being the simpler of the two. In NetBSD, the sysmon_envsys framework for hardware monitoring uses ioctl through proplib; whereas OpenBSD and DragonFly BSD instead use sysctl for their corresponding hw.sensors framework. The original revision of envsys in NetBSD

The first calling the SET_MASTER ioctl. Any attempt to use one of these restricted ioctls without being the DRM-Master will return an error. A process can also give up its master role, and let another process acquire it, by calling the DROP_MASTER ioctl. The X Server, or any other display server, is commonly the process that acquires the DRM-Master status in every DRM device it manages, usually when it opens
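libdrm wraps the two ioctls just mentioned; a hedged sketch of a display server acquiring and releasing the DRM-Master role:

    #include <xf86drm.h>

    void master_cycle(int fd)
    {
        /* Wraps DRM_IOCTL_SET_MASTER: fails if another master already
         * exists or if the caller lacks the required privileges. */
        if (drmSetMaster(fd) == 0) {
            /* ... perform privileged operations such as mode-setting ... */
            drmDropMaster(fd);   /* wraps DRM_IOCTL_DROP_MASTER */
        }
    }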

The flickering issues when changing between console and X, and also between different instances of X (fast user switching). Since it is available in the kernel, it can also be used at the beginning of the boot process, avoiding flicker due to mode changes in these early stages. The fact that KMS is part of the kernel allows it to use resources only available at kernel space, such as interrupts. For example,

The goal of achieving an equivalent to the former security, while keeping the gained speed. Win32 and Unix operating systems can protect a userspace device name from access by applications with specific access controls applied to the device. Security problems can arise when device driver developers do not apply appropriate access controls to the userspace-accessible object. Some modern operating systems protect

The hardware-dependent part of the API, specific to the type of GPU it supports; it should provide the implementation of the remaining ioctls not covered by DRM core, but it may also extend the API, offering additional ioctls with extra functionality only available on such hardware. When a specific DRM driver provides an enhanced API, user-space libdrm is also extended by an extra library libdrm-driver that can be used by user space to interface with

The isolation that operating systems should provide between programs and hardware, raising both stability and security concerns, but it could also leave the graphics hardware in an inconsistent state if two or more user-space programs tried to do mode-setting at the same time. To avoid these conflicts, the X Server became in practice the only user-space program that performed mode-setting operations;

The kernel from hostile userspace code (such as applications that have been infected by buffer overflow exploits) using system call wrappers. System call wrappers implement role-based access control by specifying which system calls can be invoked by which applications; wrappers can, for instance, be used to "revoke" the right of a mail program to spawn other programs. ioctl interfaces complicate system call wrappers because there are large numbers of them, each taking different arguments, some of which may be required by normal programs. Furthermore, such solutions negate

The kernel layer. A system call usually takes the form of a "system call vector", in which the desired system call is indicated with an index number. For instance, exit() might be system call number 1, and write() number 4. The system call vector is then used to find the desired kernel function for the request. In this way, conventional operating systems typically provide several hundred system calls to

The mode of the operation being performed. There are four defined modes of operation (METHOD_BUFFERED, METHOD_IN_DIRECT, METHOD_OUT_DIRECT and METHOD_NEITHER), each impacting the security of the device driver differently. Devices and kernel extensions may be linked to userspace using additional new system calls, although this approach is rarely taken, because operating system developers try to keep the system call interface focused and efficient. On Unix operating systems, two other vectored call interfaces are popular:

The mode recovery after a suspend/resume process is greatly simplified by being managed by the kernel itself, and incidentally improves security (no more user-space tools requiring root permissions). The kernel also allows new display devices to be hotplugged easily, solving a longstanding problem. Mode-setting is also closely related to memory management, since framebuffers are basically memory buffers, so

The mode-setting code was moved to a single place inside the kernel, specifically to the existing DRM module. Then, every process, including the X Server, should be able to command the kernel to perform mode-setting operations, and the kernel would ensure that concurrent operations don't result in an inconsistent state. The new kernel API and code added to the DRM module to perform these mode-setting operations

The native mode-setting code for ATI cards to the radeon DRM driver. Work on both the API and drivers continued during 2008, but was delayed by the need for a memory manager in kernel space to handle the framebuffers as well.

Hardware acceleration

Hardware acceleration is the use of computer hardware designed to perform specific functions more efficiently when compared to software running on

The need for more parts. In the hierarchy of digital computing systems ranging from general-purpose processors to fully customized hardware, there is a tradeoff between flexibility and efficiency, with efficiency increasing by orders of magnitude when any given application is implemented higher up that hierarchy. This hierarchy includes general-purpose processors such as CPUs, more specialized processors such as programmable shaders in

The other hand allows updating multiple planes on the same output (for instance the primary plane, the cursor plane and maybe some overlays or secondary planes), all synchronized within the same VBLANK interval, ensuring a proper display without tearing. This requirement is especially relevant to mobile and embedded display controllers, which tend to use multiple planes/overlays to save power. The new atomic API

The program regains control of the GPU. When a user-space program needs a chunk of video memory (to store a framebuffer, texture or any other data required by the GPU), it requests the allocation from the DRM driver using the GEM API. The DRM driver keeps track of the used video memory and is able to comply with the request if there is free memory available, returning a "handle" to user space to further refer
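One concrete, driver-independent instance of such a GEM-backed allocation is the KMS "dumb buffer" ioctl, sketched here; the resolution is an arbitrary example, and accelerated drivers expose their own driver-specific creation ioctls instead:

    #include <stdint.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <drm/drm.h>        /* UAPI headers; paths may vary by system */
    #include <drm/drm_mode.h>

    /* Ask the DRM driver for a chunk of video memory suitable for scanout;
     * on success the kernel fills in a GEM handle, pitch and size. */
    uint32_t create_dumb_buffer(int drm_fd)
    {
        struct drm_mode_create_dumb creq;
        memset(&creq, 0, sizeof(creq));
        creq.width  = 1024;
        creq.height = 768;
        creq.bpp    = 32;
        ioctl(drm_fd, DRM_IOCTL_MODE_CREATE_DUMB, &creq);
        return creq.handle;   /* opaque handle for later GEM/KMS operations */
    }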

The relative performance of specific acceleration protocols, considering characteristics such as physical hardware dimensions, power consumption, and operations throughput. These can be summarized into three categories: task efficiency, implementation efficiency, and flexibility. Appropriate metrics consider the area of the hardware along with both the corresponding operations throughput and energy consumed. Examples of hardware acceleration include bit blit acceleration functionality in graphics processing units (GPUs), use of memristors for accelerating neural networks, and regular expression hardware acceleration for spam control in

The remaining user-space programs relied on the X Server to set the appropriate mode and to handle any other operation involving mode-setting. Initially the mode-setting was performed exclusively during the X Server startup process, but later the X Server gained the ability to do it while running. The XFree86-VidModeExtension extension was introduced in XFree86 3.1.2 to let any X client request modeline (resolution) changes to

The request number the size of the data to be transferred to/from the device driver and the direction of the data transfer. Regardless of whether any such conventions are followed, the kernel and the driver collaborate to deliver a uniform error code (denoted by the symbolic constant ENOTTY) to an application which makes a request of a driver which does not recognise it. The mnemonic ENOTTY (traditionally associated with

The same functions that can be specified in software. Hardware description languages (HDLs) such as Verilog and VHDL can model the same semantics as software and synthesize the design into a netlist that can be programmed to an FPGA or composed into the logic gates of an ASIC. The vast majority of software-based computing occurs on machines implementing the von Neumann architecture, collectively known as stored-program computers. Computer programs are stored as data and executed by processors. Such processors must fetch and decode instructions, as well as load data operands from memory (as part of

The same overhead of transitioning between rings to driver/kernel interfaces that syscalls impose on kernel/user-space interfaces. This has led to the difficult-in-practice requirement that all drivers, which now reside in ring 0 as well, must uphold the same level of security as the kernel core. While the user-to-kernel interfaces of mainstream operating systems are often audited heavily for code flaws and security vulnerabilities prior to release, these audits typically focus on

The standard primary node that grants access to the full DRM API and use it as usual. Render nodes explicitly disallow the GEM flink operation to prevent buffer sharing using insecure GEM global names; only PRIME (DMA-BUF) file descriptors can be used to share buffers with another client, including the graphics server. The Linux DRM subsystem includes free and open-source drivers to support hardware from

The system call remained as ioctl, and the message was removed. The ioctl system call first appeared in Version 7 Unix, as a replacement for the stty and gtty system calls, with an additional request code argument. An ioctl call takes as parameters an open file descriptor, a request code number, and an untyped pointer to data, either going to the driver or coming back from it. The kernel generally dispatches an ioctl call straight to the device driver, which can interpret the request number and data in whatever way required. The writers of each driver document request numbers for that particular driver and provide them as constants in

The textual message "Not a typewriter") derives from the earliest systems that incorporated an ioctl call, where only the teletype (tty) device raised this error. Though the symbolic mnemonic is fixed by compatibility requirements, some modern systems more helpfully render a more general message such as "Inappropriate device control operation" (or a localization thereof). TCSETS exemplifies an ioctl call on

The underlying facilities of the operating system, such as the network stack, reside in the kernel. Kernel code handles sensitive resources and implements the security and reliability barriers between applications; for this reason, user-mode applications are prevented by the operating system from directly accessing kernel resources. Userspace applications typically make requests to the kernel by means of system calls, whose code lies in

The user. In the early days, user-space programs that wanted to use the graphical framebuffer were also responsible for providing the mode-setting operations, and therefore they needed to run with privileged access to the video hardware. In Unix-type operating systems, the X Server was the most prominent example, and its mode-setting implementation lived in the DDX driver for each specific type of video card. This approach, later referred to as User space Mode-Setting (UMS), poses several issues. It not only breaks

The userspace. Though an expedient design for accessing standard kernel facilities, system calls are sometimes inappropriate for accessing non-standard hardware peripherals. By necessity, most hardware peripherals (aka devices) are directly addressable only within the kernel. But user code may need to communicate directly with devices; for instance, an administrator might configure the media type on an Ethernet interface. Modern operating systems support diverse devices, many of which offer

The usual advantages of reusing and sharing code between programs. DRM consists of two parts: a generic "DRM core" and a specific one ("DRM driver") for each type of supported hardware. The DRM core provides the basic framework where different DRM drivers can register, and also provides user space with a minimal set of ioctls with common, hardware-independent functionality. A DRM driver, on the other hand, implements

The video mode-setting code in one place inside the kernel had been acknowledged for years, but the graphics card manufacturers had argued that the only way to do the mode-setting was to use the routines provided by themselves and contained in the Video BIOS of each graphics card. Such code had to be executed using x86 real mode, which prevented it from being invoked by a kernel running in protected mode. The situation changed when Luc Verhaegen and other developers found

The video-memory space is handling the memory synchronization between the GPU and the CPU. Current memory architectures are very complex and usually involve various levels of caches for the system memory, and sometimes for the video memory too. Therefore, video-memory managers should also handle the cache coherence to ensure the data shared between CPU and GPU is consistent. This means that video-memory management internals are often highly dependent on hardware details of

The von Neumann or modified Harvard architectures and do not need to perform the instruction fetch and decode steps of an instruction cycle and incur those stages' overhead. If needed calculations are specified in a register transfer level (RTL) hardware design, the time and circuit area costs that would be incurred by instruction fetch and decoding stages can be reclaimed and put to other uses. This reclamation saves time, power, and circuit area in computation. The reclaimed resources can be used for increased parallel computation, other functions, communication, or memory, as well as increased input/output capabilities. This comes at

The well-documented system call interfaces. For instance, auditors might ensure that sensitive security calls such as changing user IDs are only available to administrative users. Because the handler for an ioctl call also resides directly in ring 0, the input from userspace should be validated just as carefully, as vulnerabilities in device drivers can be exploited by local users, e.g. by passing invalid buffers to ioctl calls. In practice, this

The whole kernel DRM subsystem. The trend to include two GPUs in a computer, a discrete GPU and an integrated one, led to new problems such as GPU switching that also needed to be solved at the DRM layer. In order to match the Nvidia Optimus technology, DRM was provided with GPU offloading abilities, called PRIME. The Direct Rendering Manager resides in kernel space, so user-space programs must use kernel system calls to request its services. However, DRM doesn't define its own customized system calls. Instead, it follows

The years to cover more functionality previously handled by user-space programs, such as framebuffer managing and mode setting, memory-sharing objects and memory synchronization. Some of these expansions were given specific names, such as Graphics Execution Manager (GEM) or kernel mode-setting (KMS), and the terminology prevails when the functionality they provide is specifically alluded to. But they are really parts of

Was called Kernel Mode-Setting (KMS). Kernel Mode-Setting provides several benefits. The most immediate is of course the removal of duplicate mode-setting code from both the kernel (Linux console, fbdev) and user space (X Server DDX drivers). KMS also makes it easier to write alternative graphics systems, which now don't need to implement their own mode-setting code. By providing centralized mode management, KMS solves

Was created to allow multiple programs to use video hardware resources cooperatively. DRM gets exclusive access to the GPU and is responsible for initializing and maintaining the command queue, memory, and any other hardware resource. Programs wishing to use the GPU send requests to DRM, which acts as an arbitrator and takes care to avoid possible conflicts. The scope of DRM has been expanded over

Was exploited for the first time in DRM to implement PRIME, a solution for GPU offloading that uses DMA-BUF to share the resulting framebuffers between the DRM drivers of the discrete and the integrated GPU. An important feature of DMA-BUF is that a shared buffer is presented to user space as a file descriptor. For the development of PRIME, two new ioctls were added to the DRM API: one to convert

Was implemented with ioctl before proplib was available, and had a message suggesting that the framework was experimental and should be replaced by a sysctl(8) interface, should one be developed, which potentially explains the choice of sysctl in OpenBSD with its subsequent introduction of hw.sensors in 2003. However, when the envsys framework was redesigned in 2007 around proplib,
