Commit becd43e

Add more documentation
1 parent de46ce4 commit becd43e

File tree

8 files changed

+99
-27
lines changed

docs/_static/logo.png

44.7 KB

docs/_static/my-styles.css

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@ h3 {
     margin-bottom: 40px;
 }

-li {
+.document li {
     margin: 0 0 10px;
 }

docs/env_vars.rst

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+Environment Variables
+=====================
+
+`Kernel Launcher` recognizes the following environment variables:
+
+* **KERNEL_LAUNCHER_TUNE** (default: ``0``):
+  Kernels for which a tuning specification will be exported on the first call to the kernel.
+  The value should be a comma-separated list of kernel names.
+  Additionally, ``*`` can be used as a wildcard.
+
+  Examples:
+
+  * ``foo,bar``: matches kernels ``foo`` and ``bar``.
+  * ``vector_*``: matches kernels whose names start with ``vector_``.
+  * ``*_matrix_*``: matches kernels whose names contain ``_matrix_``.
+  * ``*``: matches all kernels.
+
+
+* **KERNEL_LAUNCHER_WISDOM** (default: ``.``):
+  The directory where wisdom files are located. Defaults to the current working directory.
+
+* **KERNEL_LAUNCHER_LOG** (default: ``info``):
+  Controls how much logging information is printed to stderr. There are three possible options:
+
+  * ``debug``: Everything is logged.
+  * ``info``: Only warnings and high-level information are logged.
+  * ``warn``: Only warnings are logged.
+
+* **KERNEL_LAUNCHER_INCLUDE** (default: ``.``):
+  Comma-separated list of directories that are searched for header files when compiling kernels.
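
Note on the ``KERNEL_LAUNCHER_TUNE`` matching rules documented above: the following is a minimal sketch of how a comma-separated list with ``*`` wildcards could be matched against a kernel name. It is not taken from the Kernel Launcher sources; the helper names ``matches_pattern``, ``matches_tune_list`` and ``tune_list_from_env`` are hypothetical, and only the variable name and its default ``0`` come from the documentation in this commit.

```cpp
#include <cstddef>
#include <cstdlib>
#include <sstream>
#include <string>

// Hypothetical helper (not part of kernel_launcher): returns true if `name`
// matches `pattern`, where '*' matches any run of characters, including none.
bool matches_pattern(const std::string& pattern, const std::string& name) {
    std::size_t p = 0, n = 0;
    std::size_t star = std::string::npos;  // position of the last '*' seen
    std::size_t mark = 0;                  // position in `name` when it was seen

    while (n < name.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            star = p++;
            mark = n;
        } else if (p < pattern.size() && pattern[p] == name[n]) {
            ++p;
            ++n;
        } else if (star != std::string::npos) {
            // Backtrack: let the last '*' absorb one more character.
            p = star + 1;
            n = ++mark;
        } else {
            return false;
        }
    }

    // Any trailing '*' can match the empty string.
    while (p < pattern.size() && pattern[p] == '*') ++p;
    return p == pattern.size();
}

// Hypothetical helper: splits `list` on commas and tests each pattern in turn.
bool matches_tune_list(const std::string& list, const std::string& name) {
    std::stringstream stream(list);
    std::string pattern;
    while (std::getline(stream, pattern, ',')) {
        if (!pattern.empty() && matches_pattern(pattern, name)) return true;
    }
    return false;
}

// Reads KERNEL_LAUNCHER_TUNE, falling back to the documented default "0".
std::string tune_list_from_env() {
    const char* value = std::getenv("KERNEL_LAUNCHER_TUNE");
    return value ? value : "0";
}
```

Under these assumptions, ``matches_tune_list(tune_list_from_env(), "vector_add")`` would report whether a tuning specification should be exported for ``vector_add``. With the default ``0``, no ordinary kernel name matches, which is consistent with tuning being off unless the variable is set.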

docs/examples/registry.rst

Lines changed: 2 additions & 0 deletions
@@ -1,4 +1,6 @@
 Kernel Registry
 ===============

+The kernel registry essentially acts like a global cache of compiled kernels.
+
 TODO

docs/examples/wisdom.cpp

Lines changed: 18 additions & 23 deletions
@@ -1,44 +1,39 @@
 #include "kernel_launcher.h"

+// Namespace alias.
+namespace kl = kernel_launcher;

-int main() {
-    // Namespace alias.
-    namespace kl = kernel_launcher;
-
-    // Create a kernel builder
+kl::KernelBuilder build_kernel() {
     kl::KernelBuilder builder("vector_add", "vector_add_kernel.cu");
-
-    // Define tunable parameters
+
     auto threads_per_block = builder.tune("block_size", {32, 64, 128, 256, 512, 1024});
     auto elements_per_thread = builder.tune("elements_per_thread", {1, 2, 4, 8});
-
-    // Define expressions
     auto elements_per_block = threads_per_block * elements_per_thread;
-
-    // Define kernel properties
+
     builder
         .block_size(threads_per_block)
         .grid_divisors(threads_per_block * elements_per_thread)
         .template_args(kl::type_of<float>())
         .define("ELEMENTS_PER_THREAD", elements_per_thread);

-    // Define configuration
-    kl::Config config;
-    config.insert(threads_per_block, 32);
-    config.insert(elements_per_thread, 2);
+    return builder;
+}
+
+int main() {
+    kl::set_global_wisdom_directory("wisdom/");
+    kl::set_global_tuning_directory("tuning/");
+
+    // Define the kernel. "vector_add" is the tuning key.
+    std::string tuning_key = "vector_add";
+    kl::KernelBuilder builder = build_kernel();
+    kl::WisdomKernel vector_add_kernel(tuning_key, builder);

-    // Compile kernel
-    kl::Kernel<int, int*, const int*, const int*> vector_add_kernel;
-    vector_add_kernel.compile(builder, config);
-
     // Initialize CUDA memory. This is outside the scope of kernel_launcher.
     unsigned int n = 1000000;
     float *dev_A, *dev_B, *dev_C;
     /* cudaMalloc, cudaMemcpy, ... */
-
+
     // Launch the kernel!
     unsigned int problem_size = n;
-    vector_add_kernel
-        .instantiate(problem_size)
-        .launch(n, dev_C, dev_A, dev_B);
+    vector_add_kernel(problem_size)(n, dev_C, dev_A, dev_B);
 }

docs/examples/wisdom.rst

Lines changed: 44 additions & 1 deletion
@@ -1,4 +1,47 @@
 Wisdom Files
 ============

-TODO: Write about wisdom files
+In the previous example, we saw how to compile a kernel by providing both a ``KernelBuilder`` instance (describing the `blueprint` for the kernel) and a ``Config`` instance (describing the configuration of the tunable parameters).
+
+However, determining the optimal configuration is often difficult since it depends heavily on both the `problem size` and the type of `GPU` being used.
+`Kernel Launcher` offers a solution to this problem in the form of `wisdom` files (terminology borrowed from `FFTW <http://www.fftw.org/>`_).
+
+Let's see this in action.
+
+
+C++ source code
+---------------
+
+The following snippet shows an example:
+
+.. literalinclude:: wisdom.cpp
+
+
+Notice how this example is similar to the previous example, except ``kl::Kernel`` has been replaced by ``kl::WisdomKernel``.
+On the first call to this kernel, the kernel searches for the wisdom file for the key ``vector_add`` and compiles the kernel for the given ``problem_size`` and the current GPU.
+If no wisdom file is found, the default configuration is chosen (in this case, ``block_size=32,elements_per_thread=1``).
+
+
+
+Export the kernel
+-----------------
+To tune the kernel, we first need to export the tuning specifications. To do this, we run the program with the environment variable ``KERNEL_LAUNCHER_TUNE=vector_add``::
+
+    KERNEL_LAUNCHER_TUNE=vector_add ./main
+
+This generates a file ``vector_add_1000000.json`` in the directory set by ``set_global_tuning_directory``.
+
+
+Tune the kernel
+---------------
+TODO: Using kernel tuner
+
+
+Import the wisdom
+-----------------
+After tuning the kernel and obtaining the wisdom file, we place this wisdom file in the directory specified by ``set_global_wisdom_directory``.
+Now, when running the program, on the first call to ``vector_add_kernel``, the kernel finds the wisdom file and compiles the kernel with the optimal configuration.
+
+To confirm that the wisdom file has indeed been found, check the debug output by setting the environment variable ``KERNEL_LAUNCHER_LOG=debug``.
+
+
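
Note on the exported file name: the documentation above gives a single example, ``vector_add_1000000.json``, for tuning key ``vector_add`` and problem size ``1000000``. The sketch below assumes the name is simply the key, an underscore, the problem size, and ``.json``, placed in the tuning directory. This naming scheme is inferred from that one example only, and ``tuning_file_path`` is a hypothetical helper, not a function from Kernel Launcher; multi-dimensional problem sizes in particular may be encoded differently.

```cpp
#include <string>

// Hypothetical helper (not part of kernel_launcher): builds the path of the
// exported tuning specification, assuming the "<key>_<problem_size>.json"
// pattern suggested by the single example in the documentation.
std::string tuning_file_path(const std::string& tuning_directory,
                             const std::string& tuning_key,
                             unsigned long problem_size) {
    std::string path = tuning_directory;

    // Add a separator only if the directory does not already end with one.
    if (!path.empty() && path.back() != '/') {
        path += '/';
    }

    return path + tuning_key + "_" + std::to_string(problem_size) + ".json";
}
```

For the values used in ``wisdom.cpp``, ``tuning_file_path("tuning/", "vector_add", 1000000)`` yields ``tuning/vector_add_1000000.json``, which is where one would look after running the program with ``KERNEL_LAUNCHER_TUNE=vector_add``.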

docs/index.rst

Lines changed: 2 additions & 0 deletions
@@ -10,7 +10,9 @@
    install
    example
    api
+   env_vars
    license
+   GitHub repository <https://github.com/KernelTuner/kernel_launcher>

 Kernel Launcher
 ===============

docs/install.rst

Lines changed: 2 additions & 2 deletions
@@ -17,7 +17,7 @@ First, check out the repository.

 .. code-block:: bash

-    git clone https://github.com/stijnh/kernel_launcher/
+    git clone https://github.com/KernelTuner/kernel_launcher/


 Second, add the following lines to your ``CMakeLists.txt``::
@@ -34,7 +34,7 @@ An alternative is to build a static library that can be linked to your project.

 .. code-block:: bash

-    git clone https://github.com/stijnh/kernel_launcher/
+    git clone https://github.com/KernelTuner/kernel_launcher/
     cd kernel_launcher
     cmake -DCMAKE_BUILD_TYPE=Release -S . -B build
     cmake --build build
