Execution Policies

Top Level Execution Policies

ExecutionPolicyConcept is the fundamental abstraction to represent “how” the execution of a Kokkos parallel pattern takes place.

| Policy | Description |
| --- | --- |
| RangePolicy | Each iterate is an integer in a contiguous range |
| MDRangePolicy | Each iterate for each rank is an integer in a contiguous range |
| TeamPolicy | Assigns to each iterate in a contiguous range a team of threads |

Nested Execution Policies

Nested execution policies dispatch parallel work inside an already executing parallel region that was itself dispatched with either a TeamPolicy or a task policy. The nested policies are summarized below.

| Policy | Description |
| --- | --- |
| TeamThreadMDRange | Used inside of a TeamPolicy kernel to perform nested parallel loops over a multidimensional range split over threads of a team. |
| TeamThreadRange | Used inside of a TeamPolicy kernel to perform nested parallel loops split over threads of a team. |
| TeamVectorMDRange | Used inside of a TeamPolicy kernel to perform nested parallel loops over a multidimensional range split over threads of a team and their vector lanes. |
| TeamVectorRange | Used inside of a TeamPolicy kernel to perform nested parallel loops split over threads of a team and their vector lanes. |
| ThreadVectorMDRange | Used inside of a TeamPolicy kernel to perform nested parallel loops over a multidimensional range with vector lanes of a thread. |
| ThreadVectorRange | Used inside of a TeamPolicy kernel to perform nested parallel loops with vector lanes of a thread. |

Common Arguments for all Execution Policies

Execution policies generally accept compile-time arguments via template parameters and runtime arguments via constructor arguments or setter functions.

Tip: Template arguments can be given in arbitrary order.

| Argument | Options | Purpose |
| --- | --- | --- |
| ExecutionSpace | Serial, OpenMP, Threads, Cuda, HIP, SYCL, HPX | Specifies the execution space in which the kernel runs. Defaults to Kokkos::DefaultExecutionSpace. |
| Schedule | Schedule<Dynamic>, Schedule<Static> | Specifies the scheduling policy for work items. Dynamic scheduling is implemented through a work-stealing queue. The default is machine and backend specific. |
| IndexType | IndexType<int> | Specifies the integer type used for traversing the iteration space. Defaults to int64_t. |
| LaunchBounds | LaunchBounds<MaxThreads, MinBlocks> | Specifies hints to the compiler about CUDA/HIP launch bounds. |
| WorkTag | SomeClass | Specifies the work tag type used to call the functor operator. Can be any type; defaults to void. |