
Tiled matrix

Tiled matrix implements a multi-level tiled matrix.

A matrix is divided into M levels of tiles in order to reduce cache misses. A few callback-based routines have been implemented for this matrix type, along with allocation and freeing.
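
As a rough illustration of why tiling helps, the sketch below multiplies two square matrices with two levels of loop tiling. It is not the library's code: the flat row-major layout, the Matrix typedef, and the function names multiply_naive and multiply_tiled are all assumptions made for this example, and the library's own storage layout and callback interface are not shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major n x n matrices in flat vectors: an illustrative layout only.
typedef std::vector<double> Matrix;

// Untiled reference: for large n, each pass down a column of b misses cache.
void multiply_naive(const Matrix& a, const Matrix& b, Matrix& c, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
}

// Two levels of tiling: outer x outer tiles (sized for the larger cache)
// are walked in inner x inner subtiles (sized for the smaller cache).
// Passing outer == inner degenerates to a single level of tiling.
void multiply_tiled(const Matrix& a, const Matrix& b, Matrix& c,
                    std::size_t n, std::size_t outer, std::size_t inner) {
    assert(inner > 0 && outer >= inner);
    assert(a.size() == n * n && b.size() == n * n && c.size() == n * n);
    std::fill(c.begin(), c.end(), 0.0);
    for (std::size_t io = 0; io < n; io += outer)
    for (std::size_t ko = 0; ko < n; ko += outer)
    for (std::size_t jo = 0; jo < n; jo += outer) {
        // Clip the outer tile at the matrix edge.
        const std::size_t io_end = std::min(io + outer, n);
        const std::size_t ko_end = std::min(ko + outer, n);
        const std::size_t jo_end = std::min(jo + outer, n);
        for (std::size_t ii = io; ii < io_end; ii += inner)
        for (std::size_t kk = ko; kk < ko_end; kk += inner)
        for (std::size_t jj = jo; jj < jo_end; jj += inner) {
            // Clip the subtile at the edge of its outer tile.
            const std::size_t i_end = std::min(ii + inner, io_end);
            const std::size_t k_end = std::min(kk + inner, ko_end);
            const std::size_t j_end = std::min(jj + inner, jo_end);
            for (std::size_t i = ii; i < i_end; ++i)
                for (std::size_t k = kk; k < k_end; ++k) {
                    const double aik = a[i * n + k];
                    for (std::size_t j = jj; j < j_end; ++j)
                        c[i * n + j] += aik * b[k * n + j];
                }
        }
    }
}
```

Calling multiply_tiled(a, b, c, n, 288, 32) would walk 288 x 288 tiles in 32 x 32 subtiles; those sizes are only an example, and the best values depend on the cache sizes of the processor in use.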

For example, the source includes a test program with four options:

  0. Brain-dead, cache-thrashing matrix multiplication (the _core function).
  1. A single level of tiling (_L1), with tiles roughly the size of the CPU's L2 cache.
  2. A single level of tiling (_L1), with tiles the size of the CPU's L1 cache.
  3. Two levels of tiling (_L2), with tiles the size of the L2 cache containing tiles the size of the L1 cache.

Running an old version of it on a Xeon E5405 gave:

./xtiling.bin
Processor is Xeon E5405; L1=32768B/32.00kB/0.03MB (4096 doubles), L2=3145728B/3072.00kB/3.00MB (393216 doubles)
Size of:
        float:4
        double:8
        long double:16
        char:1
        short int:2
        int:4
        long int:8
        long long int:8
Option 0 (core):Allocating, multiplying, checking matrices of 1, with size of 30, tile 9, subtile 32 (total 8640 per side; total size 74649600)

        Option 0: 10854 seconds, 636430microseconds
        Option 0: Got fives!
Option 1 (L1):Allocating, multiplying, checking matrices of 1, with size of 30, tile 9, subtile 32 (total 8640 per side; total size 74649600)
        Option 1: 2322 seconds, 681719microseconds
        Option 1: Got fives!
Option 2 (L1):Allocating, multiplying, checking matrices of 1, with size of 30, tile 9, subtile 32 (total 8640 per side; total size 74649600)
        Option 2: 1500 seconds, 258941microseconds
        Option 2: Got fives!
Option 3 (L2):Allocating, multiplying, checking matrices of 1, with size of 30, tile 9, subtile 32 (total 8640 per side; total size 74649600)
        Option 3: 1464 seconds, 907788microseconds
        Option 3: Got fives!
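
The figures in this run can be cross-checked with a little arithmetic. The sketch below assumes that "size of 30, tile 9, subtile 32" means 30 tiles per side, each 9 subtiles across, each subtile 32 elements across; that reading is an inference from the reported total of 8640 per side, and the helper names are made up for this example.

```cpp
#include <cassert>
#include <cstddef>

// Elements per side: tiles per side x subtiles per tile x elements per subtile.
std::size_t side_length(std::size_t tiles, std::size_t subtiles,
                        std::size_t subtile_side) {
    return tiles * subtiles * subtile_side;
}

// Bytes occupied by one square tile of doubles with the given side.
std::size_t tile_bytes(std::size_t side) {
    return side * side * sizeof(double);
}

// Convert a "N seconds, M microseconds" pair from the output into seconds.
double to_seconds(long seconds, long microseconds) {
    return static_cast<double>(seconds) + static_cast<double>(microseconds) / 1e6;
}

// How many times faster a run is than the baseline run.
double speedup(double baseline_seconds, double run_seconds) {
    assert(run_seconds > 0.0);
    return baseline_seconds / run_seconds;
}
```

Under that reading, side_length(30, 9, 32) is 8640 and 8640 x 8640 is the reported total size of 74649600. With 8-byte doubles, a 32 x 32 subtile occupies 8192 bytes and a 288 x 288 tile occupies 663552 bytes. Relative to option 0, the timings work out to speedups of roughly 4.7 for option 1, 7.2 for option 2, and 7.4 for option 3.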

Recent changes

The code is being put up on the web for the first time.

Getting tiled_matrix

To get the source code, you will need bzr, Canonical's free-software distributed version control system (DVCS). Once you have bzr, you can get the latest version of tiled_matrix with the following command:

bzr branch http://digitasaru.net/bzr/tiled_matrix/

To download any revisions subsequent to your branch, cd into the tiled_matrix directory and then use the command

bzr update

Building tiled_matrix

Building tiled_matrix requires the GNU autotools (specifically, autoconf, automake, and libtool) and the GNU Compiler Collection's C++ compiler (g++). If you wish to use OpenMP, you will need to have GCC version 4.1 or later. Versions earlier than GCC 4.3 may work, but are not tested.

Once you've downloaded and installed the prerequisites and gotten the source code to tiled_matrix, the following commands will build tiled_matrix:

libtoolize
autoreconf --install
./configure [options]
make
#the next line is optional
make check

As this is proof-of-concept code, it is entirely unsupported. In addition, no installation system has been tested.

Configuration options

The following configuration options are available:

--enable-debug
Turns on debugging and debugging symbols. It also turns off any optimizations.
--enable-profiling
Turns on GCC's profiling support. This is not tested, but will probably work. It was inherited from a previous project and is generally a good thing to have around; it has just not been enabled and tested here.
--enable-openmp
Turns on OpenMP multi-processor/multi-core support.
--enable-log_indices
Makes all matrix indices long unsigned ints instead of unsigned ints.
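
To show why the index width can matter, here is a minimal sketch of a compile-time index type switch. The macro name TILED_MATRIX_LONG_INDICES, the index_type typedef, and the fits_flat_index helper are hypothetical; how tiled_matrix actually defines and uses its index type is not described on this page, and it may not use a single flat index at all.

```cpp
#include <cassert>
#include <limits>

// Hypothetical macro: a configure option could define it to widen indices.
#ifdef TILED_MATRIX_LONG_INDICES
typedef unsigned long index_type;
#else
typedef unsigned int index_type;
#endif

// True if every element of a side x side matrix can be addressed by one
// flat index of type index_type, i.e. side * side does not exceed its maximum.
bool fits_flat_index(unsigned long long side) {
    assert(side > 0);
    return side <= std::numeric_limits<index_type>::max() / side;
}
```

With a 32-bit unsigned int, the 8640-per-side test matrix (74649600 elements) fits comfortably, while a matrix of 70000 elements per side would not; on platforms where long unsigned int is 64 bits, the wider type removes that limit.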

Future work

  1. The (barebones) check needs to be fixed, and additional checks introduced. This may or may not bring a cunit dependency.
  2. The code should become (Open)MPI-aware.
  3. Processor data is always welcomed.

License

tiled matrix is distributed under the GNU Affero General Public License version 3 or later.

Requirements

Building the library and tests requires the following:

Page last modified Monday, 17-Aug-2009 14:08:02 EDT