Misplaced Pages

OpenCL

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

SYCL (pronounced "sickle") is a higher-level programming model to improve programming productivity on various hardware accelerators. It is a single-source embedded domain-specific language (eDSL) based on pure C++17. It is a standard developed by the Khronos Group, announced in March 2014.

79-410: OpenCL C 3.0 revision V3.0.11; C++ for OpenCL 1.0 and 2021. OpenCL (Open Computing Language) is a framework for writing programs that execute across heterogeneous platforms consisting of central processing units (CPUs), graphics processing units (GPUs), digital signal processors (DSPs), field-programmable gate arrays (FPGAs) and other processors or hardware accelerators. OpenCL specifies

158-476: A C-like language for writing programs. Functions executed on an OpenCL device are called "kernels". A single compute device typically consists of several compute units, which in turn comprise multiple processing elements (PEs). A single kernel execution can run on all or many of the PEs in parallel. How a compute device is subdivided into compute units and PEs is up to the vendor; a compute unit can be thought of as

237-495: A programming language (based on C99) for programming these devices and application programming interfaces (APIs) to control the platform and execute programs on the compute devices. OpenCL provides a standard interface for parallel computing using task- and data-based parallelism. OpenCL is an open standard maintained by the Khronos Group, a non-profit, open standards organisation. Conformant implementations (passed

316-407: A software framework is an abstraction in which software, providing generic functionality, can be selectively changed by additional user-written code, thus providing application-specific software. It provides a standard way to build and deploy applications and is a universal, reusable software environment that provides particular functionality as part of a larger software platform to facilitate

395-449: A "core", but the notion of core is hard to define across all the types of devices supported by OpenCL (or even within the category of "CPUs"), and the number of compute units may not correspond to the number of cores claimed in vendors' marketing literature (which may actually be counting SIMD lanes). In addition to its C-like programming language, OpenCL defines an application programming interface (API) that allows programs running on

474-666: A Linux platform, the Nvidia ICD would need to be installed such that the OpenCL runtime (the ICD loader) would be able to locate the ICD for the vendor and redirect the calls appropriately. The standard OpenCL header is used by the consumer application; calls to each function are then proxied by the OpenCL runtime to the appropriate driver using the ICD. Each vendor must implement each OpenCL call in their driver. The Apple, Nvidia, ROCm, RapidMind and Gallium3D implementations of OpenCL are all based on

553-400: A basic CPU implementation that relies on pure runtime without any specific compiler. Both DPC++ and AdaptiveCpp compilers provide a backend to NVIDIA GPUs, similar to how CUDA does. This allows SYCL code to be compiled and run on NVIDIA hardware, allowing developers to leverage SYCL's high-level abstractions on CUDA-capable GPUs. ROCm HIP targets Nvidia GPU, AMD GPU, and x86 CPU. HIP

632-554: A broader range of accelerators and vendors. SYCL supports multiple types of accelerators simultaneously within a single application through the concept of backends. Additionally, SYCL is written in pure C++, whereas HIP, like CUDA, uses some language extensions. These extensions prevent HIP from being compiled with a standard C++ compiler. Both DPC++ and AdaptiveCpp compilers provide backends for NVIDIA and AMD GPUs, similar to how HIP does. This enables SYCL code to be compiled and executed on hardware from these vendors, offering developers

711-662: A concrete software system with a software framework, developers utilize the hot spots according to the specific needs and requirements of the system. Software frameworks rely on the Hollywood Principle: "Don't call us, we'll call you." This means that the user-defined classes (for example, new subclasses) receive messages from the predefined framework classes. Developers usually handle this by implementing superclass abstract methods. SYCL SYCL (pronounced ‘sickle’) originally stood for SYstem-wide Compute Language, but since 2020 SYCL developers have stated that SYCL

790-404: A device program having a main function, OpenCL C functions are marked __kernel to signal that they are entry points into the program to be called from the host program. Function pointers, bit fields and variable-length arrays are omitted, and recursion is forbidden. The C standard library is replaced by a custom set of standard functions, geared toward math programming. OpenCL C

869-496: A framework consists of composing and subclassing the existing classes. The necessary functionality can be implemented by using the Template Method Pattern in which the frozen spots are known as invariant methods and the hot spots are known as variant or hook methods. The invariant methods in the superclass provide default behaviour while the hook methods in each subclass provide custom behaviour. When developing
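The Template Method Pattern described above can be sketched in a few lines of C++. This is only an illustration under assumed names: `ReportFramework`, `SalesReport`, `render`, `header`, `body` and `footer` are hypothetical and do not come from any real framework. The non-virtual `render` plays the role of the invariant method (a frozen spot), while the virtual functions are the hook methods (hot spots).

```cpp
#include <string>

// Hypothetical framework class; all names here are illustrative only.
class ReportFramework {
public:
    virtual ~ReportFramework() = default;

    // Invariant method (frozen spot): the framework fixes the order of the
    // steps and calls the hooks itself ("Don't call us, we'll call you").
    std::string render() const {
        return header() + body() + footer();
    }

protected:
    // Hook with a default behaviour supplied by the framework.
    virtual std::string header() const { return "[report]\n"; }

    // Hook (hot spot) that each application must implement.
    virtual std::string body() const = 0;

    // Hook with a default behaviour supplied by the framework.
    virtual std::string footer() const { return "[end]\n"; }
};

// Application-specific subclass: it only fills in the hot spot.
class SalesReport : public ReportFramework {
protected:
    std::string body() const override { return "sales: 42\n"; }
};
```

An application would instantiate `SalesReport` and call `render()`; the subclass never drives the control flow itself, which is the point of the pattern.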

948-422: A full matrix–vector multiplication, the OpenCL runtime maps the kernel over the rows of the matrix. On the host side, the clEnqueueNDRangeKernel function does this; it takes as arguments the kernel to execute, its arguments, and a number of work-items, corresponding to the number of rows in the matrix A. This example will load a fast Fourier transform (FFT) implementation and execute it. The implementation

1027-414: A more flexible SYCL specification to address the increasing diversity of current hardware accelerators, including artificial intelligence engines, which led to SYCL 2020. The latest version is SYCL 2020 revision 6, which was published on November 13, 2022, an evolution from the first release, revision 2, which was published on February 9, 2021, taking into account the feedback from users and implementors on

1106-464: A product. Further, due to the complexity of their APIs, the intended reduction in overall development time may not be achieved due to the need to spend additional time learning to use the framework; this criticism is clearly valid when a special or new framework is first encountered by development staff. If such a framework is not used in subsequent job taskings, the time invested in learning the framework can cost more than purpose-written code familiar to

1185-475: A smooth transition path to C++ functionality for the OpenCL kernel code developers as they can continue using familiar programming flow and even tools as well as leverage existing extensions and libraries available for OpenCL C. The language semantics is described in the documentation published in the releases of OpenCL-Docs repository hosted by the Khronos Group; but it is currently not ratified by

1264-435: A standard way to take advantage of external accelerators by allowing developers to specify an execution policy for parallel operations, such as std::for_each, std::transform, and std::reduce. This enables efficient use of multi-core processors and other parallel hardware without requiring significant changes to the code. SYCL can be used as a backend for std::par, enabling the execution of standard algorithms on

1343-517: A subset of C++14, while maintaining support for the preexisting OpenCL C kernel language. Vulkan and OpenCL 2.1 share SPIR-V as an intermediate representation allowing high-level language front-ends to share a common compilation target. Updates to the OpenCL API include: AMD, ARM, Intel, HPC, and YetiWare have declared support for OpenCL 2.1. OpenCL 2.2 brings the OpenCL C++ kernel language into

1422-399: A wide range of external accelerators, including GPUs from Intel, AMD, and NVIDIA, as well as other types of accelerators. By leveraging SYCL's capabilities, developers can write standard C++ code that seamlessly executes on heterogeneous computing environments. This integration allows for greater flexibility and performance optimization across different hardware platforms. The use of SYCL as

1501-447: Is a matrix–vector multiplication algorithm in OpenCL C. The kernel function matvec computes, in each invocation, the dot product of a single row of a matrix A and a vector x: $y_i = a_{i,:} \cdot x = \sum_j a_{i,j} x_j$. To extend this into
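The per-row dot product can be illustrated with a host-only C++ sketch. This is not OpenCL C and not the article's kernel: the names `matvec_row` and `matvec`, the row-major storage of A, and the serial loop over rows are all assumptions made for illustration. Each call to `matvec_row` stands in for one kernel invocation computing a single y_i.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for one kernel invocation: the dot product of row i of A with x.
// Assumption: A is stored row-major with ncols entries per row.
float matvec_row(const std::vector<float>& A, const std::vector<float>& x,
                 std::size_t ncols, std::size_t i) {
    float dot = 0.0f;
    for (std::size_t j = 0; j < ncols; ++j) {
        dot += A[i * ncols + j] * x[j];  // accumulates a_{i,j} * x_j
    }
    return dot;
}

// Serial stand-in for running one invocation per row of the matrix.
// Assumption: A holds nrows * x.size() elements.
std::vector<float> matvec(const std::vector<float>& A,
                          const std::vector<float>& x, std::size_t nrows) {
    const std::size_t ncols = x.size();
    std::vector<float> y(nrows);
    for (std::size_t i = 0; i < nrows; ++i) {
        y[i] = matvec_row(A, x, ncols, i);
    }
    return y;
}
```

On a real device the outer loop would not exist in the kernel; each row index would instead be supplied to a separate work-item.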

1580-457: Is a lower-level API that closely resembles CUDA's APIs. For example, AMD released a tool called HIPIFY that can automatically translate CUDA code to HIP. Therefore, many of the points mentioned in the comparison between CUDA and SYCL also apply to the comparison between HIP and SYCL. ROCm HIP has some similarities to SYCL in the sense that it can target various vendors (AMD and Nvidia) and accelerator types (GPU and CPU). However, SYCL can target

1659-631: Is a name and have made clear that it is no longer an acronym and contains no reference to OpenCL. SYCL is a royalty-free, cross-platform abstraction layer that builds on the underlying concepts, portability and efficiency inspired by OpenCL that enables code for heterogeneous processors to be written in a “single-source” style using completely standard C++. SYCL enables single-source development where C++ template functions can contain both host and device code to construct complex algorithms that use hardware accelerators, and then re-use them throughout their source code on different types of data. While

1738-419: Is also considering Vulkan-like loader and layers and a "flexible profile" for deployment flexibility on multiple accelerator types. OpenCL consists of a set of headers and a shared object that is loaded at runtime. An installable client driver (ICD) must be installed on the platform for every class of vendor that the runtime needs to support. That is, for example, in order to support Nvidia devices on

1817-704: Is extended to facilitate use of parallelism with vector types and operations, synchronization, and functions to work with work-items and work-groups. In particular, besides scalar types such as float and double, which behave similarly to the corresponding types in C, OpenCL provides fixed-length vector types such as float4 (4-vector of single-precision floats); such vector types are available in lengths two, three, four, eight and sixteen for various base types. Vectorized operations on these types are intended to map onto SIMD instruction sets, e.g., SSE or VMX, when running OpenCL programs on CPUs. Other specialized types include 2-d and 3-d image types. The following

1896-483: Is fully compatible with the OpenCL 3.0 standard. A work in progress draft of the latest C++ for OpenCL documentation can be found on the Khronos website. C++ for OpenCL supports most of the features (syntactically and semantically) from OpenCL C except for nested parallelism and blocks. However, there are minor differences in some supported features mainly related to differences in semantics between C++ and C. For example, C++

1975-426: Is higher-level than C++ AMP and CUDA since you do not need to build an explicit dependency graph between all the kernels, and it provides you with automatic asynchronous scheduling of the kernels with communication and computation overlap. This is all done by using the concept of accessors without requiring any compiler support. Unlike C++ AMP and CUDA, SYCL is a pure C++ eDSL without any C++ extension. This allows for

2054-595: Is more strict with the implicit type conversions and it does not support the restrict type qualifier. The following C++ features are not supported by C++ for OpenCL: virtual functions, dynamic_cast operator, non-placement new/delete operators, exceptions, pointer to member functions, references to functions, C++ standard libraries. C++ for OpenCL extends the concept of separate memory regions (address spaces) from OpenCL C to C++ features – functional casts, templates, class members, references, lambda functions, and operators. Most C++ features are not available for

2133-515: Is named "CUDA Runtime API," is somewhat similar to SYCL. In fact, Intel released a tool called SYCLomatic that automatically translates code from CUDA to SYCL. However, there is a lesser-known non-single-source version of CUDA, which is called "CUDA Driver API," similar to OpenCL, and used, for example, by the CUDA Runtime API implementation itself. SYCL extends the C++ AMP features, relieving

2212-546: Is now possible with the concept of a generic backend to target any acceleration API while enabling full interoperability with the target API, like using existing native libraries to reach the maximum performance along with simplifying the programming effort. For example, the Open SYCL implementation targets ROCm and CUDA via AMD's cross-vendor HIP. SYCL was introduced at GDC in March 2014 with provisional version 1.2, then

2291-462: Is shown below. The code asks the OpenCL library for the first available graphics card, creates memory buffers for reading and writing (from the perspective of the graphics card), JIT-compiles the FFT-kernel and then finally asynchronously runs the kernel. The result from the transform is not read in this example. The actual calculation inside file "fft1D_1024_kernel_src.cl" (based on "Fitting FFT onto

2370-403: Is used to write compute kernels is called kernel language. OpenCL adopts C/C++-based languages to specify the kernel computations performed on the device with some restrictions and additions to facilitate efficient mapping to the heterogeneous hardware resources of accelerators. Traditionally, OpenCL C was used to program the accelerators in the OpenCL standard; later, the C++ for OpenCL kernel language

2449-430: Is user-defined). For those frameworks that generate code, for example, "elegance" would imply the creation of code that is clean and comprehensible to a reasonably knowledgeable programmer (and which is therefore readily modifiable), versus one that merely generates correct code. The elegance issue is why relatively few software frameworks have stood the test of time: the best frameworks have been able to evolve gracefully as

2528-588: Is very attractive to the library developers. C++ for OpenCL sources can be compiled by OpenCL drivers that support the cl_ext_cxx_for_opencl extension. Arm announced support for this extension in December 2020. However, due to the increasing complexity of the algorithms accelerated on OpenCL devices, it is expected that more applications will compile C++ for OpenCL kernels offline using standalone compilers such as Clang into an executable binary format or a portable binary format, e.g. SPIR-V. Such an executable can be loaded during

2607-872: The LLVM Compiler technology and use the Clang compiler as their frontend. As of 2016, OpenCL runs on graphics processing units (GPUs), CPUs with SIMD instructions, FPGAs, Movidius Myriad 2, Adapteva Epiphany and DSPs. To be officially conformant, an implementation must pass the Khronos Conformance Test Suite (CTS), with results being submitted to the Khronos Adopters Program. The Khronos CTS code for all OpenCL versions has been available in open source since 2017. The Khronos Group maintains an extended list of OpenCL-conformant products. Software framework In computer programming,

2686-501: The Conformance Test Suite) are available from a range of companies including AMD, ARM, Cadence, Google, Imagination, Intel, Nvidia, Qualcomm, Samsung, SPI and Verisilicon. OpenCL views a computing system as consisting of a number of compute devices, which might be central processing units (CPUs) or "accelerators" such as graphics processing units (GPUs), attached to a host processor (a CPU). It defines

2765-518: The G80 Architecture"): A full, open source implementation of an OpenCL FFT can be found on Apple's website. In 2020, Khronos announced the transition to the community-driven C++ for OpenCL programming language that provides features from C++17 in combination with the traditional OpenCL C features. This language makes it possible to leverage a rich variety of language features from standard C++ while preserving backward compatibility with OpenCL C. This opens up

2844-590: The Khronos Group announced the creation of the SYCL SC Working Group, with the objective of creating a high-level heterogeneous computing framework for safety-critical systems. These systems span various fields, including avionics, automotive, industrial, and medical sectors. The SYCL Safety Critical framework will comply with several industry standards to ensure its reliability and safety. These standards include MISRA C++ 202X, which provides guidelines for

2923-526: The Khronos Group announced the ratification and public release of the finalized OpenCL 2.0 specification. Updates and additions to OpenCL 2.0 include: The ratification and release of the OpenCL 2.1 provisional specification was announced on March 3, 2015, at the Game Developers Conference in San Francisco. It was released on November 16, 2015. It introduced the OpenCL C++ kernel language, based on

3002-424: The Khronos Group on June 14, 2010, and adds significant functionality for enhanced parallel programming flexibility, functionality, and performance including: On November 15, 2011, the Khronos Group announced the OpenCL 1.2 specification, which added significant functionality over the previous versions in terms of performance and features for parallel programming. Most notable features include: On November 18, 2013,

3081-468: The Khronos Group. The C++ for OpenCL language is not documented in a stand-alone document and it is based on the specification of C++ and OpenCL C. C++ for OpenCL was originally developed as a Clang compiler extension, and the open source Clang compiler has supported it since release 9. As it was tightly coupled with OpenCL C and did not contain any Clang-specific functionality, its documentation was re-hosted to

3160-487: The Khronos OpenCL Working Group, improved Vulkan Interop with semaphores and memory sharing. The last minor update was 3.0.14, with bug fixes and a new extension for multiple devices. When releasing OpenCL 2.2, the Khronos Group announced that OpenCL would converge where possible with Vulkan to enable OpenCL software deployment flexibility over both APIs. This has now been demonstrated by Adobe's Premiere Rush using

3239-541: The Kokkos community. SYCL focuses more on heterogeneous systems; thanks to its integration with OpenCL, it can be adopted on a wide range of devices. Kokkos, on the other hand, targets most of the HPC platforms, thus it is more HPC-oriented for performance. As of 2024, the Kokkos team is developing a SYCL backend, which enables Kokkos to target Intel hardware in addition to the platforms it already supports. This development broadens

3318-456: The OpenCL 1.0 specification to its GPU Computing Toolkit. On October 30, 2009, IBM released its first OpenCL implementation as a part of the XL compilers. Speedups of up to a factor of 1000 over a normal CPU are possible with OpenCL on graphics cards. Some important features of later OpenCL versions, such as double- or half-precision operations, are optional in 1.0. OpenCL 1.1 was ratified by

3397-515: The OpenCL C language and deprecates the OpenCL C++ Kernel Language, replacing it with the C++ for OpenCL language based on a Clang/LLVM compiler which implements a subset of C++17 and SPIR-V intermediate code. Version 3.0.7 of C++ for OpenCL with some Khronos OpenCL extensions was presented at IWOCL 21. The current version is 3.0.11, with some new extensions and corrections. NVIDIA, working closely with

3476-475: The OpenCL applications execution using a dedicated OpenCL API. Binaries compiled from sources in C++ for OpenCL 1.0 can be executed on OpenCL 2.0 conformant devices. Depending on the language features used in such kernel sources, they can also be executed on devices supporting earlier OpenCL versions or OpenCL 3.0. Aside from OpenCL drivers, kernels written in C++ for OpenCL can be compiled for execution on Vulkan devices using the clspv compiler and the clvk runtime layer just

3555-425: The OpenCL back-end. More recently Khronos Group has ratified SYCL, a higher-level programming model for OpenCL as a single-source eDSL based on pure C++17 to improve programming productivity. People interested in C++ kernels but not in the SYCL single-source programming style can use C++ features with compute kernel sources written in the "C++ for OpenCL" language. OpenCL defines a four-level memory hierarchy for

3634-540: The OpenCL standard consists of a library that implements the API for C and C++, and an OpenCL C compiler for the compute devices targeted. In order to open the OpenCL programming model to other languages or to protect the kernel source from inspection, the Standard Portable Intermediate Representation (SPIR) can be used as a target-independent way to ship kernels between a front-end compiler and

3713-475: The OpenCL-Docs repository from the Khronos Group along with the sources of other specifications and reference cards. The first official release of this document describing C++ for OpenCL version 1.0 was published in December 2020. C++ for OpenCL 1.0 contains features from C++17 and it is backward compatible with OpenCL C 2.0. In December 2021, a new provisional C++ for OpenCL version 2021 was released which

3792-466: The Raja team is developing a SYCL backend, which will enable Raja to also target Intel hardware. This development will enhance Raja's portability and flexibility, allowing it to leverage SYCL's capabilities and expand its applicability across a wider array of hardware platforms. OpenMP targets computational offloading to external accelerators, primarily focusing on multi-core architectures and GPUs. SYCL, on

3871-524: The SYCL 1.2 final version was introduced at IWOCL 2015 in May 2015. The latest version for the previous SYCL 1.2.1 series is SYCL 1.2.1 revision 7, which was published on April 27, 2020 (the first version was published on December 6, 2017). SYCL 2.2 provisional was introduced at IWOCL 2016 in May 2016 targeting C++14 and OpenCL 2.2. But the SYCL committee preferred not to finalize this version and to move towards

3950-443: The SYCL 2020 Provisional Specification revision 1 published on June 30, 2020. C++17 and OpenCL 3.0 support are the main targets of this release. Unified shared memory (USM) is one main feature for GPUs with OpenCL and CUDA support. At IWOCL 2021 a roadmap was presented. DPC++, ComputeCpp, Open SYCL, triSYCL and neoSYCL are the main implementations of SYCL. The next target in development is support for C++20 in a future SYCL 202x. In March 2023

4029-451: The SYCL standard started as the higher-level programming model sub-group of the OpenCL working group and was originally developed for use with OpenCL and SPIR, SYCL has been a Khronos Group working group independent from the OpenCL working group since September 20, 2019, and, starting with SYCL 2020, SYCL has been generalized as a more general heterogeneous framework able to target other systems. This

4108-555: The applicability of Kokkos and allows for greater flexibility in leveraging different hardware architectures within HPC applications. Raja is a library of C++ software abstractions to enable the architecture and programming portability of HPC applications. Like SYCL, it provides portable code across heterogeneous platforms. However, unlike SYCL, Raja introduces an abstraction layer over other programming models like CUDA, HIP, OpenMP, and others. This allows developers to write their code once and run it on various backends without modifying

4187-448: The clspv open source compiler to compile significant amounts of OpenCL C kernel code to run on a Vulkan runtime for deployment on Android. OpenCL has a forward looking roadmap independent of Vulkan, with 'OpenCL Next' under development and targeting release in 2020. OpenCL Next may integrate extensions such as Vulkan / OpenCL Interop, Scratch-Pad Memory Management, Extended Subgroups, SPIR-V 1.4 ingestion and SPIR-V Extended debug info. OpenCL

4266-414: The common code of the enterprise, instead of using a generic "one-size-fits-all" framework developed by third parties for general purposes. An example of that would be how the user interface in such an application package as an office suite grows to have common look, feel, and data-sharing attributes and methods, as the once disparate bundled applications, grow unified into a suite that is tighter and smaller;

4345-467: The compute device: Not every device needs to implement each level of this hierarchy in hardware. Consistency between the various levels in the hierarchy is relaxed, and only enforced by explicit synchronization constructs, notably barriers. Devices may or may not share memory with the host CPU. The host API provides handles on device memory buffers and functions to transfer data back and forth between host and devices. The programming language that

4424-457: The core logic. Raja is maintained and developed at Lawrence Livermore National Laboratory (LLNL), whereas SYCL is an open standard maintained by the community. Similar to Kokkos, Raja is more tailored for HPC use cases, focusing on performance and scalability in high-performance computing environments. In contrast, SYCL supports a broader range of devices, making it more versatile for different types of applications beyond just HPC. As of 2024,

4503-429: The core specification for significantly enhanced parallel programming productivity. It was released on May 16, 2017. A maintenance update with bug fixes was released in May 2018. The OpenCL 3.0 specification was released on September 30, 2020, after being in preview since April 2020. OpenCL 1.2 functionality has become a mandatory baseline, while all OpenCL 2.x and OpenCL 3.0 features were made optional. The specification retains

4582-592: The development of software applications, products and solutions. Software frameworks may include support programs, compilers, code libraries, toolsets, and application programming interfaces (APIs) that bring together all the different components to enable development of a project or system. Frameworks have key distinguishing features that separate them from normal libraries: The designers of software frameworks aim to facilitate software development by allowing designers and programmers to devote their time to meeting software requirements rather than dealing with

4661-474: The flexibility to leverage SYCL's high-level abstractions across a diverse range of devices and platforms. SYCL has many similarities to the Kokkos programming model, including the use of opaque multi-dimensional array objects (SYCL buffers and Kokkos arrays), multi-dimensional ranges for parallel execution, and reductions (added in SYCL 2020). Numerous features in SYCL 2020 were added in response to feedback from

4740-480: The following (with examples): Khronos maintains a list of SYCL resources. Codeplay Software also provides tutorials on the website sycl.tech along with other information and news on the SYCL ecosystem. The source files for building the specification, such as Makefiles and some scripts, the SYCL headers and the SYCL code samples are under the Apache 2.0 license. The open standards SYCL and OpenCL are similar to

4819-506: The host to launch kernels on the compute devices and manage device memory, which is (at least conceptually) separate from host memory. Programs in the OpenCL language are intended to be compiled at run-time, so that OpenCL-using applications are portable between implementations for various host devices. The OpenCL standard defines host APIs for C and C++; third-party APIs exist for other programming languages and platforms such as Python, Java, Perl, D and .NET. An implementation of

4898-600: The intricate details of memory transfers and synchronization. Both OpenMP and SYCL support C++ and are standardized. OpenMP is standardized by the OpenMP Architecture Review Board (ARB), while SYCL is standardized by the Khronos Group. OpenMP has wide support from various compilers, like GCC and Clang. std::par is part of the C++17 standard and is designed to facilitate the parallel execution of standard algorithms on C++ standard containers. It provides

4977-632: The kernel functions e.g. overloading or templating, arbitrary class layout in parameter type. The following code snippet illustrates how kernels with complex-number arithmetic can be implemented in the C++ for OpenCL language with convenient use of C++ features. The C++ for OpenCL language can be used for the same applications or libraries and in the same way as the OpenCL C language is used. Due to the rich variety of C++ language features, applications written in C++ for OpenCL can express complex functionality more conveniently than applications written in OpenCL C, and in particular the generic programming paradigm from C++

5056-499: The more standard low-level details of providing a working system, thereby reducing overall development time. For example, a team using a web framework to develop a banking website can focus on writing code particular to banking rather than the mechanics of request handling and state management. Frameworks often add to the size of programs, a phenomenon termed "code bloat". Due to customer-demand-driven application needs, both competing and complementary frameworks sometimes end up in

5135-417: The newer/evolved suite can be a product that shares integral utility libraries and user interfaces. This trend in the controversy brings up an important issue about frameworks. Creating a framework that is elegant, versus one that merely solves a problem, is still rather a craft than a science. "Software elegance" implies clarity, conciseness, and little waste (extra or extraneous functionality, much of which

5214-462: The other hand, is oriented towards a broader range of devices due to its integration with OpenCL, which enables support for various types of hardware accelerators. OpenMP uses a pragma-based approach, where the programmer annotates the code with directives, and the compiler handles the complexity of parallel execution and memory management. This high-level abstraction makes it easier for developers to parallelize their applications without dealing with
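The pragma-based style can be illustrated with a minimal C++ sketch. It shows a host-side parallel loop only, not accelerator offloading, and the function name `saxpy` and its signature are assumptions chosen for illustration. The one directive is the only OpenMP-specific line; a compiler without OpenMP enabled ignores it and runs the loop serially.

```cpp
#include <cstddef>
#include <vector>

// Computes y = a * x + y element-wise.
// Assumption: y has at least as many elements as x.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    const long n = static_cast<long>(x.size());

    // The annotation asks an OpenMP-enabled compiler to divide the loop
    // iterations among threads; the loop body itself is unchanged.
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        const std::size_t k = static_cast<std::size_t>(i);
        y[k] = a * x[k] + y[k];
    }
}
```

Because every iteration writes a distinct element of `y`, the iterations are independent and the result is the same with or without the directive.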

5293-439: The other hand, is the high-level single-source C++ embedded domain-specific language (eDSL). It enables developers to write code for heterogeneous computing systems, including CPUs, GPUs, and other accelerators, using a single-source approach. This means that both host and device code can be written in the same C++ source file. By comparison, the single-source C++ embedded domain-specific language version of CUDA, which

5372-434: The output product, nor its relative efficiency and conciseness. Using any library solution necessarily pulls in extras and unused extraneous assets unless the software is a compiler-object linker making a tight (small, wholly controlled, and specified) executable module. The issue continues, but a decade-plus of industry experience has shown that the most effective frameworks turn out to be those that evolve from re-factoring

5451-476: The overall architecture of a software system, that is to say its basic components and the relationships between them. These remain unchanged (frozen) in any instantiation of the application framework. Hot spots represent those parts where the programmers using the framework add their own code to add the functionality specific to their own project. In an object-oriented environment, a framework consists of abstract and concrete classes. Instantiation of such

5530-522: The programmer from explicitly transferring data between the host and devices by using buffers and accessors. This is in contrast to CUDA (prior to the introduction of Unified Memory in CUDA 6), where explicit data transfers were required. Starting with SYCL 2020, it is also possible to use Unified Shared Memory (USM) to augment, rather than replace, the buffer-based interfaces, providing a lower-level programming model similar to Unified Memory in CUDA. SYCL

5609-476: The programming models of the proprietary stack CUDA from Nvidia and HIP from the open-source stack ROCm , supported by AMD . In the Khronos Group realm, OpenCL and Vulkan are the low-level non-single source APIs , providing fine-grained control over hardware resources and operations. OpenCL is widely used for parallel programming across various hardware types, while Vulkan primarily focuses on high-performance graphics and computing tasks. SYCL, on

5688-404: The project's staff; many programmers keep copies of useful boilerplate code for common needs. However, once a framework is learned, future projects can be faster and easier to complete; the concept of a framework is to make a one-size-fits-all solution set, and with familiarity, code production should logically rise. There are no such claims made about the size of the code eventually bundled with

5767-410: The same way as OpenCL C kernels. C++ for OpenCL is an open language developed by the community of contributors listed in its documentation. New contributions to the language semantic definition or open source tooling support are accepted from anyone interested as soon as they are aligned with the main design philosophy and they are reviewed and approved by the experienced contributors. OpenCL

5846-439: The technical details of the specification for OpenCL 1.0 by November 18, 2008. This technical specification was reviewed by the Khronos members and approved for public release on December 8, 2008. OpenCL 1.0 released with Mac OS X Snow Leopard on August 28, 2009. According to an Apple press release: Snow Leopard further extends support for modern hardware with Open Computing Language (OpenCL), which lets any application tap into

5925-546: The underlying technology on which they were built advanced. Even there, having evolved, many such packages will retain legacy capabilities bloating the final software as otherwise replaced methods have been retained in parallel with the newer methods. Software frameworks typically contain considerable housekeeping and utility code in order to help bootstrap user applications, but generally focus on specific problem domains, such as: According to Pree, software frameworks consist of frozen spots and hot spots . Frozen spots define

6004-512: The use of C++ in critical systems, RTCA DO-178C / EASA ED-12C, which are standards for software considerations in airborne systems and equipment certification, ISO 26262/21448, which pertains to the functional safety of road vehicles, IEC 61508 , which covers the functional safety of electrical/electronic/programmable electronic safety-related systems, and IEC 62304 , which relates to the lifecycle requirements for medical device software. Some notable software fields that make use of SYCL include

6083-507: The vast gigaflops of GPU computing power previously available only to graphics applications. OpenCL is based on the C programming language and has been proposed as an open standard. AMD decided to support OpenCL instead of the now deprecated Close to Metal in its Stream framework . RapidMind announced their adoption of OpenCL underneath their development platform to support GPUs from multiple vendors with one interface. On December 9, 2008, Nvidia announced its intention to add full support for

6162-417: Was developed that inherited all functionality from OpenCL C but allowed the use of C++ features in the kernel sources. OpenCL C is a C99-based language dialect adapted to fit the device model in OpenCL. Memory buffers reside in specific levels of the memory hierarchy, and pointers are annotated with the region qualifiers __global, __local, __constant, and __private, reflecting this. Instead of

6241-520: Was initially developed by Apple Inc., which holds trademark rights, and refined into an initial proposal in collaboration with technical teams at AMD, IBM, Qualcomm, Intel, and Nvidia. Apple submitted this initial proposal to the Khronos Group. On June 16, 2008, the Khronos Compute Working Group was formed with representatives from CPU, GPU, embedded-processor, and software companies. This group worked for five months to finish
