Try #868:
bors[bot] authored Sep 30, 2022
2 parents d0908cd + b6c3f4c commit a431abd
Showing 31 changed files with 1,579 additions and 2,215 deletions.
183 changes: 183 additions & 0 deletions docs/gt4py/arrays.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,183 @@
===============================
Allocation and Array Interfaces
===============================

GT4Py does not provide its own data container class, but supports established Python standards for exposing
N-dimensional buffers. A minimal interface lets the user specify how buffer dimensions correspond to the semantic
dimensions assumed in the stencil. Specifying this correspondence is optional, since the stencils define a default
ordering.

GT4Py provides utilities to allocate buffers that have optimal layout and alignment for a given backend.

In this document, we describe the interfaces for

* supported buffer interfaces
* exposing dimension labels and the behavior for default values
* performance-optimal allocation

----------
Interfaces
----------

Stencil Calls
-------------

Supported Buffer Interfaces
^^^^^^^^^^^^^^^^^^^^^^^^^^^

The user is free to choose one or more buffer interfaces, as long as they are supported by :code:`numpy.asarray` when
a CPU backend is chosen, or by :code:`cupy.asarray` for a GPU backend. If multiple buffer interfaces are
implemented, the information they provide must agree; otherwise the behavior is undefined. Similarly, the backend is
free to choose which buffer interface to use to retrieve the required information (e.g. pointer, strides, etc.).
In particular, we support the following interfaces to expose a buffer:

* `__array_interface__ <https://omz-software.com/pythonista/numpy/reference/arrays.interface.html>`_ (CPU backends)
* `__cuda_array_interface__ <https://numba.pydata.org/numba-doc/dev/cuda/cuda_array_interface.html>`_ (GPU backends)
* `python buffer protocol <https://docs.python.org/3/c-api/buffer.html>`_ (CPU backends)

Internally, GT4Py uses the utilities :code:`gt4py.utils.as_numpy` and :code:`gt4py.utils.as_cupy` to retrieve the
buffers. GT4Py developers are advised to always use these utilities, so that support stays consistent across GT4Py as the
supported interfaces are extended.
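
As an illustration of what exposing a buffer involves, the following minimal sketch implements
:code:`__array_interface__` on a container backed by the standard library :code:`array` module. The class name
:code:`FlatBuffer` and its constructor are hypothetical and not part of GT4Py; only the keys of the returned
dictionary follow the NumPy array interface (version 3).

```python
import array
import sys


class FlatBuffer:
    """Hypothetical container exposing float64 data via __array_interface__."""

    def __init__(self, shape):
        self.shape = tuple(shape)
        size = 1
        for extent in self.shape:
            size *= extent
        # array.array owns one contiguous block of C doubles.
        self._data = array.array("d", [0.0] * size)

    @property
    def __array_interface__(self):
        address, _length = self._data.buffer_info()
        byteorder = "<" if sys.byteorder == "little" else ">"
        return {
            "version": 3,
            "shape": self.shape,
            "typestr": f"{byteorder}f{self._data.itemsize}",
            "data": (address, False),  # (pointer, read_only flag)
            "strides": None,  # None denotes a C-contiguous layout
        }


buf = FlatBuffer((2, 3, 4))
info = buf.__array_interface__
print(info["shape"], info["typestr"])
```

With NumPy installed, :code:`numpy.asarray(buf)` should then produce an array of shape :code:`(2, 3, 4)` that views
the same memory rather than copying it.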

Dimension Mapping
^^^^^^^^^^^^^^^^^

The user can optionally implement a :code:`__gt_dims__` attribute in the object implementing any of the supported buffer
interfaces. Its value should be a tuple of strings labeling the dimensions in index order.
As a fallback, if the attribute is not implemented, the buffer is assumed to contain the dimensions given in the annotations
(by means of :code:`gtscript.Field`) in exactly the same order.

Valid dimension strings are :code:`"I"`, :code:`"J"`, :code:`"K"` as well as decimal string representations of integer
numbers to denote data dimensions.

Developers are advised to use the utility :code:`gt4py.utils.get_dims(storage, annotation)`,
which implements this lookup.
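
The lookup order described above can be sketched as follows. This is not the implementation of
:code:`gt4py.utils.get_dims`; the function :code:`lookup_dims` and its :code:`annotation_dims` parameter (a plain
tuple standing in for the dimensions of the :code:`gtscript.Field` annotation) are assumptions made for illustration.

```python
from typing import Sequence, Tuple

DOMAIN_DIMS = ("I", "J", "K")


def is_valid_dim(label: str) -> bool:
    """Accept "I", "J", "K" or a decimal string such as "0" (a data dimension)."""
    return label in DOMAIN_DIMS or (label.isascii() and label.isdigit())


def lookup_dims(storage: object, annotation_dims: Sequence[str]) -> Tuple[str, ...]:
    """Hypothetical lookup: use __gt_dims__ if present, else the annotation."""
    dims = getattr(storage, "__gt_dims__", None)
    if dims is None:
        dims = annotation_dims
    dims = tuple(dims)
    invalid = [d for d in dims if not is_valid_dim(d)]
    if invalid:
        raise ValueError(f"invalid dimension labels: {invalid}")
    return dims


class Labeled:
    # Dimensions listed in index order: axis 0 is K, axis 1 is I, axis 2 is data.
    __gt_dims__ = ("K", "I", "0")


class Unlabeled:
    pass


print(lookup_dims(Labeled(), ("I", "J", "K")))
print(lookup_dims(Unlabeled(), ("I", "J", "K")))
```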

Note: Support for xarray can be added manually by the user by means of the mechanism described
`here <https://xarray.pydata.org/en/stable/internals/extending-xarray.html>`_.

Default Origin
^^^^^^^^^^^^^^

A buffer object can optionally implement the :code:`__gt_origin__` attribute, which is used as the origin value unless
overridden by the :code:`origin` keyword argument to the stencil call.
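
The precedence between the keyword argument and the attribute can be sketched as below. The helper
:code:`resolve_origin` is hypothetical and treats :code:`origin` as a single tuple for simplicity. It returns
:code:`None` when neither source is available, since the behavior in that case is not specified here.

```python
from typing import Optional, Sequence, Tuple


def resolve_origin(
    field: object, origin: Optional[Sequence[int]] = None
) -> Optional[Tuple[int, ...]]:
    """Hypothetical helper: the explicit keyword wins, then __gt_origin__."""
    if origin is not None:
        return tuple(origin)
    attr = getattr(field, "__gt_origin__", None)
    return None if attr is None else tuple(attr)


class WithOrigin:
    __gt_origin__ = (1, 1, 0)


class WithoutOrigin:
    pass


print(resolve_origin(WithOrigin()))
print(resolve_origin(WithOrigin(), origin=(2, 2, 0)))
print(resolve_origin(WithoutOrigin()))
```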



Allocation
----------

For the performance-optimal allocation and initialization of arrays to be used in GT4Py, we provide the following set of
functions which closely resemble their NumPy counterparts (meaning of the common parameters is explained below).

The return type is either a :code:`numpy.ndarray` or a :code:`cupy.ndarray`, for CPU and GPU backends, respectively.

:code:`empty(shape: Sequence[int], dtype: dtype_like = np.float64, **kwargs) -> ndarray`
Allocate an array with uninitialized (undefined) values.

Parameters:
+ :code:`shape: Sequence[int]`
Sequence of length :code:`ndim` (:code:`ndim` = number of dimensions) with the
shape of the storage.

+ :code:`dtype: dtype_like`
The dtype of the storage (NumPy dtype or accepted by :code:`np.dtype()`). It defaults to
:code:`np.float64`.
Keyword Arguments:
+ :code:`aligned_index: Sequence[int]`
The index of the grid point to which the memory is aligned. Note that this only partly takes the
role of the deprecated :code:`default_origin` parameter, since it does not imply anything about the
origin or domain when passed to a stencil. (See :code:`__gt_origin__` interface instead.)

For common keyword-only arguments, please see below.

:code:`zeros(shape: Sequence[int], dtype: dtype_like = np.float64, **kwargs) -> ndarray`
Allocate an array with values initialized to 0.

Parameters:
+ :code:`shape: Sequence[int]`
Sequence of length :code:`ndim` (:code:`ndim` = number of dimensions) with the
shape of the storage.

+ :code:`dtype: dtype_like`
The dtype of the storage (NumPy dtype or accepted by :code:`np.dtype()`). It defaults to
:code:`np.float64`.
Keyword Arguments:
+ :code:`aligned_index: Sequence[int]`
The index of the grid point to which the memory is aligned. Note that this only partly takes the
role of the deprecated :code:`default_origin` parameter, since it does not imply anything about the
origin or domain when passed to a stencil. (See :code:`__gt_origin__` interface instead.)


For common keyword-only arguments, please see below.

:code:`ones(shape: Sequence[int], dtype: dtype_like = np.float64, **kwargs) -> ndarray`
Allocate an array with values initialized to 1.

Parameters:
+ :code:`shape: Sequence[int]`
Sequence of length :code:`ndim` (:code:`ndim` = number of dimensions) with the
shape of the storage.

+ :code:`dtype: dtype_like`
The dtype of the storage (NumPy dtype or accepted by :code:`np.dtype()`). It defaults to
:code:`np.float64`.
Keyword Arguments:
+ :code:`aligned_index: Sequence[int]`
The index of the grid point to which the memory is aligned. Note that this only partly takes the
role of the deprecated :code:`default_origin` parameter, since it does not imply anything about the
origin or domain when passed to a stencil. (See :code:`__gt_origin__` interface instead.)


For common keyword-only arguments, please see below.


:code:`full(shape: Sequence[int], fill_value: Number, dtype: dtype_like = np.float64, **kwargs) -> ndarray`
Allocate an array with values initialized to the scalar given in :code:`fill_value`.

Parameters:
+ :code:`shape: Sequence[int]`
Sequence of length :code:`ndim` (:code:`ndim` = number of dimensions) with the
shape of the storage.

+ :code:`fill_value: Number`. The number to which the storage is initialized.

+ :code:`dtype: dtype_like`
The dtype of the storage (NumPy dtype or accepted by :code:`np.dtype()`). It defaults to
:code:`np.float64`.
Keyword Arguments:
+ :code:`aligned_index: Sequence[int]`
The index of the grid point to which the memory is aligned. Note that this only partly takes the
role of the deprecated :code:`default_origin` parameter, since it does not imply anything about the
origin or domain when passed to a stencil. (See :code:`__gt_origin__` interface instead.)


For common keyword-only arguments, please see below.

:code:`from_array(data: array_like, *, dtype: dtype_like = np.float64, **kwargs) -> ndarray`
Used to allocate an array with values initialized from the content of a given array.

Parameters:
+ :code:`data: array_like`. The original array from which the storage is initialized.

+ :code:`dtype: dtype_like`
The dtype of the storage (NumPy dtype or accepted by :code:`np.dtype()`). It defaults to the dtype of
:code:`data`.
Keyword Arguments:
+ :code:`aligned_index: Sequence[int]`
The index of the grid point to which the memory is aligned. Note that this only partly takes the
role of the deprecated :code:`default_origin` parameter, since it does not imply anything about the
origin or domain when passed to a stencil. (See :code:`__gt_origin__` interface instead.)
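
To make the role of :code:`aligned_index` concrete, the sketch below computes how many bytes of padding would be
needed so that the element at :code:`aligned_index` falls on an aligned address. It assumes a C-contiguous layout
and takes the alignment (in bytes) as a parameter, because the actual value is backend-dependent. The function
names and the example base address are hypothetical and do not describe GT4Py's allocator.

```python
from typing import Sequence, Tuple


def c_strides(shape: Sequence[int], itemsize: int) -> Tuple[int, ...]:
    """Byte strides of a C-contiguous array with the given shape."""
    strides = []
    step = itemsize
    for extent in reversed(shape):
        strides.append(step)
        step *= extent
    return tuple(reversed(strides))


def padding_for_alignment(
    base_address: int,
    shape: Sequence[int],
    itemsize: int,
    aligned_index: Sequence[int],
    alignment: int,
) -> int:
    """Bytes to skip after base_address so that aligned_index is aligned."""
    strides = c_strides(shape, itemsize)
    offset = sum(i * s for i, s in zip(aligned_index, strides))
    return (-(base_address + offset)) % alignment


# Hypothetical raw allocation starting at address 1004, float64 elements,
# 32-byte alignment requested for the grid point (1, 1, 1).
pad = padding_for_alignment(1004, (10, 10, 10), 8, (1, 1, 1), 32)
print(pad)
```

Note that only the address of one grid point is constrained; the result says nothing about the origin or domain
used when the array is later passed to a stencil.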


Optional Keyword-Only Parameters
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Additionally, these **optional** keyword-only parameters are accepted:

:code:`dimensions: Optional[Sequence[str]]`
Sequence indicating the semantic meaning of the dimensions of this storage. This is used to
determine the default layout for the storage. Currently supported are :code:`"I"`,
:code:`"J"`, :code:`"K"` and additional dimensions as string representations of integers,
starting at :code:`"0"`. (This information is not retained in the resulting array; it needs to be specified
with the :code:`__gt_dims__` interface instead.)
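
How dimension labels could influence the memory layout is sketched below. The preference order passed as
:code:`fastest_to_slowest` is an assumption standing in for whatever a given backend prefers; the function
:code:`strides_for` is hypothetical and is not GT4Py's layout logic.

```python
from typing import Sequence, Tuple


def strides_for(
    dimensions: Sequence[str],
    shape: Sequence[int],
    itemsize: int,
    fastest_to_slowest: Sequence[str],
) -> Tuple[int, ...]:
    """Byte strides where dims listed earlier in fastest_to_slowest vary fastest."""
    extent = dict(zip(dimensions, shape))
    strides = {}
    step = itemsize
    for dim in fastest_to_slowest:
        if dim in extent:
            strides[dim] = step
            step *= extent[dim]
    # Report strides in the index order of the array itself.
    return tuple(strides[d] for d in dimensions)


dims = ("I", "J", "K")
shape = (4, 5, 6)

# Hypothetical backend preferring unit stride in I.
print(strides_for(dims, shape, 8, ("I", "J", "K")))
# Hypothetical backend preferring unit stride in K (C order for these dims).
print(strides_for(dims, shape, 8, ("K", "J", "I")))
```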
1 change: 1 addition & 0 deletions docs/gt4py/index.rst
@@ -72,6 +72,7 @@ Features:
:caption: Contents:

quickstart
arrays
commandline
apiref
indices
52 changes: 34 additions & 18 deletions docs/gt4py/quickstart.rst
@@ -165,13 +165,15 @@ regular function call receiving the definition function:
The generated code is written to and compiled in a local '.gt_cache' folder. Subsequent
invocations will check whether a recent version of the stencil already exists in the cache.

--------
Storages
--------
----------
Allocation
----------

Since some backends require data to be in a certain layout in memory, GT4Py provides a special `NumPy`-like
multidimensional array implementation called ``storage``. Storage containers can be allocated through the same familiar
set of routines used in `NumPy` for allocation: ``from_array``, ``ones``, ``zeros`` and ``empty``.
Since some backends require data to be in a certain layout in memory, GT4Py provides special `NumPy`-like
allocators. They work like the familiar set of routines used in `NumPy` for allocation: ``ones``, ``zeros``,
``full`` and ``empty``. There is also ``from_array`` that initializes the array to a provided array value.
The result of these routines is either a ``numpy.ndarray`` (for CPU backends) or a ``cupy.ndarray``
(for GPU backends).

.. code:: python
@@ -180,32 +182,46 @@ set of routines used in `NumPy` for allocation: ``from_array``, ``ones``, ``zero
backend= "numpy"
field_a = gt_storage.from_array(
data=np.random.randn(10, 10, 10),
np.random.randn(10, 10, 10),
np.float64,
backend=backend,
dtype=np.float64,
default_origin=(0, 0, 0),
aligned_index=(0, 0, 0),
)
field_b = gt_storage.ones(
backend=backend, shape=(10, 10, 10), dtype=np.float64, default_origin=(0, 0, 0)
(10, 10, 10), np.float64, backend=backend, aligned_index=(0, 0, 0)
)
field_c = gt_storage.zeros(
backend=backend, shape=(10, 10, 10), dtype=np.float64, default_origin=(0, 0, 0)
(10, 10, 10), np.float64, backend=backend, aligned_index=(0, 0, 0)
)
result = gt_storage.empty(
backend=backend, shape=(10, 10, 10), dtype=np.float64, default_origin=(0, 0, 0)
(10, 10, 10), np.float64, backend=backend, aligned_index=(0, 0, 0)
)
stencil_example(field_a, field_b, field_c, result, alpha=0.5)
The ``default_origin`` parameter plays two roles:
The ``aligned_index`` specifies that the array is to be allocated such that the memory address of the point specified in
``aligned_index`` is `aligned` to a backend-dependent value. For optimal performance, set the ``aligned_index`` to
the point at the lower-left corner of the iteration domain most frequently used for this field.

#. The data is allocated such that memory address of the point specified in ``default_origin`` is `aligned` to a
backend-dependent value. For optimal performance, you set the ``default_origin`` to a point which is the
lower-left corner of the iteration domain most frequently used for this storage.
----------------
Array Interfaces
----------------

#. If when calling the stencil, no other `origin` is specified, this value is where the `iteration domain` begins, i.e.
the grid point with the lowest index where a value is written.
When passing buffers to stencils, they can be in any form that is compatible with ``np.asarray`` or ``cp.asarray``,
respectively. Some meta information can be provided to describe the correspondence between array dimensions and
their semantic meaning (e.g. IJK). Also, an index can be designated as the
`origin` of the array to denote the start of the index range considered to be the `iteration domain`. Specifically, the
behavior is as follows:

#. Dimensions can be denoted by adding a ``__gt_dims__`` attribute to the buffer object. It should be a tuple of strings
where currently valid dimensions are ``"I", "J", "K"`` as well as string representations of integers to represent
data dimensions, i.e. the dimensions of vectors, matrices or higher tensors per grid point. If ``__gt_dims__`` is not
present, the dimensions specified in the ``Field`` annotation of functions serve as a default.
#. The origin can be specified with the ``__gt_origin__`` attribute, which is a tuple of ``int``\ s. If no other
`origin` is specified when calling the stencil, this value is where the `iteration domain` begins, i.e. the grid point with
the lowest index where a value is written. The explicit ``origin`` keyword when calling a stencil takes priority over
this.

--------------------------
Computations and Intervals
2 changes: 1 addition & 1 deletion setup.cfg
@@ -92,7 +92,7 @@ cuda116 =
cuda117 =
cupy-cuda117
dace =
dace>=0.13.2
dace~=0.14
sympy
format =
clang-format>=9.0
1 change: 0 additions & 1 deletion src/gt4py/backend/base.py
@@ -72,7 +72,6 @@ class Backend(abc.ABC):
#: - "device": "cpu" | "gpu"
#: - "layout_map": callback converting a mask to a layout
#: - "is_compatible_layout": callback checking if a storage has compatible layout
#: - "is_compatible_type": callback checking if storage has compatible type
storage_info: ClassVar[Dict[str, Any]]

#: Language support:
8 changes: 3 additions & 5 deletions src/gt4py/backend/cuda_backend.py
Original file line number Diff line number Diff line change
Expand Up @@ -30,9 +30,8 @@

from .gtc_common import (
BaseGTBackend,
GTCUDAPyModuleGenerator,
CUDAPyExtModuleGenerator,
cuda_is_compatible_layout,
cuda_is_compatible_type,
make_cuda_layout_map,
)

Expand Down Expand Up @@ -90,7 +89,7 @@ def visit_FieldDecl(self, node: cuir.FieldDecl, **kwargs):
data_ndim = len(node.data_dims)
sid_ndim = domain_ndim + data_ndim
if kwargs["external_arg"]:
return "py::buffer {name}, std::array<gt::int_t,{sid_ndim}> {name}_origin".format(
return "py::object {name}, std::array<gt::int_t,{sid_ndim}> {name}_origin".format(
name=node.name,
sid_ndim=sid_ndim,
)
@@ -145,10 +144,9 @@ class CudaBackend(BaseGTBackend, CLIBackendMixin):
"device": "gpu",
"layout_map": make_cuda_layout_map,
"is_compatible_layout": cuda_is_compatible_layout,
"is_compatible_type": cuda_is_compatible_type,
}
PYEXT_GENERATOR_CLASS = CudaExtGenerator # type: ignore
MODULE_GENERATOR_CLASS = GTCUDAPyModuleGenerator
MODULE_GENERATOR_CLASS = CUDAPyExtModuleGenerator
GT_BACKEND_T = "gpu"

def generate_extension(self, **kwargs: Any) -> Tuple[str, str]: