...The goals of this solution were not only to reimplement GPU support in SIA but also to ensure that the implementation would be extensible in the future and easier for the SIAL programmer to use. Previously, in ACES III, the GPU implementation required the SIAL programmer to take specific actions to move and allocate data. The new solution handles this automatically by integrating the GPU operations more naturally into the block operations. At a high level, this implementation handles data lazily, but it could easily be changed to transfer and allocate data proactively through look-ahead optimizations in the SIAL compiler. The data transfers and allocations are considered lazy because they are deferred until an operation needs the data, at which point the data is allocated or transferred.

To implement this data management, we built on top of the Block class in SIA. Each block represents a tensor and therefore holds the array of data making up the tensor along with some administrative information. Previously, in ACES III, the Block class had two data pointers: one to the host data and another to the GPU data, if available. In the new solution, we use a map and OpenCL to track on which devices a block’s data can be found. Since OpenCL allows code to be executed on multiple types of devices, we chose to use it to determine which data is on which device, as well as the device on which we are currently working. By using OpenCL in this way, we...
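
The lazy, map-based bookkeeping described above can be illustrated with a minimal C++ sketch. It is not the SIA Block class: the names `LazyBlock`, `Device`, `read_on`, `write_on`, and `transfers` are hypothetical, and plain host vectors stand in for OpenCL buffers so that the example stays self-contained. It assumes that a write on one device invalidates the copies held on all other devices, which the text does not state explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical device identifiers; a real build would use OpenCL device handles.
enum class Device { Host, Gpu0 };

// Sketch of a block that records which devices hold a valid copy of its data.
class LazyBlock {
public:
    explicit LazyBlock(std::size_t size) {
        // The data starts out on the host only.
        copies_.emplace(Device::Host, std::vector<double>(size, 0.0));
    }

    // Read access: allocate and copy onto `dev` only if no valid copy is there.
    const std::vector<double>& read_on(Device dev) {
        ensure_on(dev);
        return copies_.at(dev);
    }

    // Write access: make `dev` current, then drop every other (now stale) copy.
    // This invalidation policy is an assumption of the sketch.
    std::vector<double>& write_on(Device dev) {
        ensure_on(dev);
        for (auto it = copies_.begin(); it != copies_.end();) {
            if (it->first == dev) {
                ++it;
            } else {
                it = copies_.erase(it);
            }
        }
        return copies_.at(dev);
    }

    bool is_on(Device dev) const { return copies_.count(dev) != 0; }
    int transfers() const { return transfers_; }

private:
    // Lazy step: nothing is allocated or moved until an operation asks for it.
    void ensure_on(Device dev) {
        if (is_on(dev)) return;    // a valid copy is already here
        assert(!copies_.empty());  // some device always holds the data
        const std::vector<double>& source = copies_.begin()->second;
        copies_.emplace(dev, source);  // stands in for an OpenCL buffer copy
        ++transfers_;
    }

    std::map<Device, std::vector<double>> copies_;  // device -> valid copy
    int transfers_ = 0;                             // number of copies performed
};
```

In this sketch, calling `read_on(Device::Gpu0)` twice in a row performs a single copy, because the second call finds the entry already in the map. A look-ahead variant of the kind mentioned above could call the same `ensure_on` step earlier than the operation that needs the data.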