Sparsix Logo

What is Sparse Data?

Sparse linear systems analysis involves two types of data: matrices and vectors.

A matrix is said to be “sparse” when most of its elements have a null or zero value and the remaining non-zero (NZ) elements are populated throughout the matrix in a non-linear fashion.

Storing and analyzing large sparse matrices in their natural format is a hugely inefficient use of computing resources. To avoid these inefficiencies, mathematicians have developed numerical methods to compress sparse matrices into dense arrays by removing the null elements, leaving only the NZ elements. Each of these numerical methods moves data elements to a new location within the array and uses indices to indicate where the NZ elements were originally stored in the sparse matrix.

So far, so good. But problems with computing performance arise when these sparse linear analysis methods are forced to deal with how general-purpose computers retrieve data from memory.

Every general-purpose computer is optimized to retrieve data from main memory in blocks and to store those blocks in local cache memory where it can be accessed at high speeds by the central processing unit (CPU). For most operations this process works well and can boost overall performance. However, in the case of sparse data, this approach creates huge inefficiencies.

Mathematical operations performed on these data types fetch data elements from memory in one of two ways: sequentially or non-sequentially. Depending on the type of operation being performed, the data for one operand may be accessed sequentially while the data for another operand is accessed non-sequentially.

For example, in the particular mathematical operation known as sparse matrix-vector product, the NZ elements in a row of a dense array are retrieved sequentially while the vector elements are retrieved non-sequentially based upon the indices of the array elements.

Depending on how sparse the original matrix was, each block of vector data loaded into cache may contain only one valid data element.

The next stage of an operation will require the CPU to retrieve vector elements that are located within a different block of vector data. This block will be retrieved into cache, replacing a previous block.

Not only does this slow down the CPU as it continuously waits for data to be retrieved from main memory, but the efficiency of its cache memory is reduced by constantly filling it with irrelevant data. The cumulative impact of these cache misses creates a massive memory bottleneck, causing a significant decrease in overall system throughput.