Posts
The band structure of graphene
|0> to |1>: Why quantum technology is bound to rise.
Finite differences and second quantization
Dynamically adjustable HoppingAmplitudes using callbacks in TBTK
Retrieving the Hamiltonian from the Model in TBTK
Creating Models with complex geometries using an IndexFilter in TBTK
Direct access to wave function amplitudes and eigenvalues in TBTK
Using TBTK to calculate the density of states (DOS) of a 1D, 2D, and 3D square lattice
TBTK v1.0.0 released
Using TBTK to time C++ code
Introduction to TBTK – A software development kit for quantum mechanical calculations
The second quantum revolution

Using TBTK to time C++ code

Kristofer Björnson
(16th of May 2018)

Most recent TBTK release at the time of writing: v0.9.6

When developing computationally intensive algorithms, it is important to be able to time the execution to find the main bottlenecks. TBTK therefore provides a flexible Timer class that allows for simple profiling of C++ code.

1 Include file

The Timer is made available through

1#include ”TBTK/Timer.h”

2 Stacked tick-tock calls

In its basic use case the Timer acts as a time stamp stack that allows a single time stamp to be pushed onto the stack through the call

1Timer::tick(”Optional tag.”);

The latest time stamp can similarly be popped from the stack using

1Timer::tock();

When a time stamp is popped, the time difference between the current time and the time that the time stamp was pushed onto the stack is printed to the standard output. In addition, the current stack depth and the optional tag is also printed. This allows for easy identification of which part of the code a specific time difference corresponds to.

In the following example nested tick-tock calls are used to simultaneously measure the total calculation time of a nested loop as well as the time taken by a single iteration of the outer loop.

1Timer::tick(”Full calculation.”)
2double sum = 0;
3for(unsigned int i = 0; i < 10; i++){
4    Timer::tick(”Single iteration of the outer loop.”);
5    for(unsigned int j = 0; j < 10000000; j++)
6        sum += sqrt(i*j);
7    Timer::tock();
8}
9Timer::tock();

A typical output is

1(1) 23ms 356us 578ns    Single iteration of the outer loop.
2(1) 22ms 121us 595ns    Single iteration of the outer loop.
3(1) 20ms 894us 633ns    Single iteration of the outer loop.
4(1) 20ms 693us 151ns    Single iteration of the outer loop.
5(1) 20ms 753us 656ns    Single iteration of the outer loop.
6(1) 20ms 632us 406ns    Single iteration of the outer loop.
7(1) 20ms 693us 43ns     Single iteration of the outer loop.
8(1) 20ms 751us 261ns    Single iteration of the outer loop.
9(1) 20ms 420us 362ns    Single iteration of the outer loop.
10(1) 20ms 753us 927ns    Single iteration of the outer loop.
11(0) 211ms 272us 43ns    Full calculation.

We can see that each iteration takes 20-23ms to execute while the full calculation takes 211ms. On the left we also see the stack depth, while the tag is displayed to the right.

As a second example we consider the following recursive function call

1double recursive(int depth){
2        Timer::tick(”Recursive step  + to_string(depth));
3
4        double answer;
5        if(depth == 10){
6                answer = 0;
7        }
8        else{
9                double sum = 0;
10                for(unsigned int i = 0; i < 10000000; i++)
11                        sum += sqrt(depth*i);
12                answer = sum + recursive(depth+1);
13        }
14
15        Timer::tock();
16
17        return answer;
18}

Each time the recursive function is entered, a new time stamp is pushed onto the stack. After ten consecutive function calls the function begins to return and the time stamp stack is unwinded. The resulting output is

1(10) 501ns              Recursive step 10
2(9) 62ms 570us 861ns    Recursive step 9
3(8) 125ms 299us 748ns   Recursive step 8
4(7) 187ms 827us 863ns   Recursive step 7
5(6) 250ms 704us 696ns   Recursive step 6
6(5) 313ms 284us 290ns   Recursive step 5
7(4) 376ms 373us 532ns   Recursive step 4
8(3) 439ms 518us 286ns   Recursive step 3
9(2) 502ms 852us 691ns   Recursive step 2
10(1) 566ms 298us 567ns   Recursive step 1
11(0) 617ms 420us 578ns   Recursive step 0

3 Accumulators

While the stacked mode allows for flexible measurements of multiple pieces of code at the same time, it is not suitable for a case like the following

1for(unsigned int i = 0; i < 1000000; i++){
2    doA();
3    doB();
4}

Here both doA() and doB() are executed multiple times and we may want to figure out which one that is the bottleneck. Moreover, measuring doA() and doB() in single shot measurements may not be useful for figuring out the overall performance. For example, doA() and doB() may take widely different times to execute each time and the execution time for doB() may even depend on what the previous call of doA() did, and vice versa. In such cases it is only meaningful to measure the accumulated time taken by each function over the full loop.

For this reason the Timer also has an accumulator mode. To use this mode we first create two new accumulators.

1unsigned int idA = Timer::createAccumulator(”Accumulator A”);
2unsigned int idB = Timer::createAccumulator(”Accumulator B”);

The accumulators can then be used independently by supplying the corresponding ID to the tick-tock calls.

1for(unsigned int i = 0; i < 1000000; i++){
2    Timer::tick(idA);
3    doA();
4    Timer::tock(idA);
5
6    Timer::tick(idB);
7    doB();
8    Timer::tock(idB);
9}

Since tick-tock pairs can be executed millions of times for any given accumulator, no information is printed by the tock calls in this mode. Instead the time accumulated so far can be printed at any time through the call

1Timer::printAccumulators();

A typical output looks like

1============================== Accumulator table ==============================
2ID        Time                                  Tag
3[0]                       650ms 618us 220ns     Accumulator A
4[1]                    2s 870ms 110us  19ns     Accumulator B
5===============================================================================

Note that time is added to the accumulators when the tock calls are made. Therefore, when printing the accumulator table, only completed time intervals contribute to the listed time. We also note that it is possible to use stacked tick-tock calls and accumulator calls simultaneously.

4 Learn more

For more information about the Timer class, see the corresponding manual page and API entry.