Defining SPARQL with Boolean Tensor Operations
The Resource Description Framework (RDF) represents information in the form of subject–predicate–object triples. We can represent such RDF data using 3-way binary tensors: an element \((i,j,k)\) of \(\tens{T}\) is \(1\) if and only if the respective subject–predicate–object triple \((s_i,p_j,o_k)\) is present in the RDF graph \(G\).
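The encoding above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the toy graph, the names `triples`, `index_of`, `subjects`, `predicates`, `objects`, and the choice of sorted order for the indices are all assumptions made for the example.

```python
# Hypothetical toy graph; all names are illustrative and not from the text.
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "likes", "carol"),
]

def index_of(values):
    """Assign each distinct value a position (sorted for reproducibility)."""
    return {v: n for n, v in enumerate(sorted(set(values)))}

# One index map per mode: subjects (i), predicates (j), objects (k).
subjects = index_of(s for s, _, _ in triples)
predicates = index_of(p for _, p, _ in triples)
objects = index_of(o for _, _, o in triples)

# T[i][j][k] == 1 if and only if the triple (s_i, p_j, o_k) is in the graph.
T = [[[0] * len(objects) for _ in predicates] for _ in subjects]
for s, p, o in triples:
    T[subjects[s]][predicates[p]][objects[o]] = 1
```

Nested lists keep the sketch dependency-free; a sparse representation would be the more realistic choice for actual RDF data, where almost all entries are \(0\).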
An example of a simple SPARQL query that has a single triple pattern as basic graph pattern is SELECT * WHERE {?a \(G\):p\(_j\) \(G\):o\(_k\)}. The keyword SELECT acts as a projection operator. It identifies the variables to appear in the query result.
The triple pattern matches all RDF triples of \(G\) that have predicate p\(_j\) and object o\(_k\). If the RDF data is represented as a binary 3-way tensor \(\tens{T}\), the triple pattern selects the fiber \(t_{:jk}\), that is, the vector over all subjects for predicate \(j\) and object \(k\). This vector has a \(1\) at exactly those positions \(i\) for which the RDF triple \((s_i,p_j,o_k)\) is present in the RDF graph \(G\). If the query fixes only one mode, as in SELECT * WHERE {?a \(G\):p\(_j\) ?b}, a slice \(T_{:j:}\) of \(\tens{T}\) is selected instead.
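The fiber and slice selections can be illustrated as follows. The sketch assumes a small made-up tensor stored as nested lists `T[i][j][k]`; the helper names `fiber` and `mode2_slice` are hypothetical and chosen only for this example.

```python
# Toy 3-way binary tensor T[i][j][k] (subjects x predicates x objects).
# The entries are made up for illustration.
T = [
    [[1, 0], [0, 1]],
    [[0, 1], [0, 0]],
    [[0, 1], [1, 0]],
]

def fiber(T, j, k):
    """Fiber t_{:jk}: one bit per subject i, with predicate j and object k fixed."""
    return [T[i][j][k] for i in range(len(T))]

def mode2_slice(T, j):
    """Slice T_{:j:}: a subjects-by-objects matrix, with only predicate j fixed."""
    return [T[i][j] for i in range(len(T))]

# Pattern {?a p_0 o_1}: the subjects whose fiber entry is 1 are the matches.
f = fiber(T, j=0, k=1)
matching_subjects = [i for i, bit in enumerate(f) if bit]

# Pattern {?a p_0 ?b}: every 1 in the slice is a matching (subject, object) pair.
S = mode2_slice(T, j=0)
matching_pairs = [(i, k) for i, row in enumerate(S) for k, bit in enumerate(row) if bit]
```

Here `f` is `[0, 1, 1]`, so subjects 1 and 2 match the first pattern, and the second pattern yields the pairs `(0, 0)`, `(1, 1)`, and `(2, 1)`.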
It turns out that most SPARQL operations, especially the joins, can be expressed using the Khatri–Rao product [1]. Thus, insights into the computation of the Khatri–Rao product can lead to direct real-world benefits. This reinterpretation also allows us to use techniques more familiar from relational databases natively with RDF data.
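For reference, the Khatri–Rao product itself is the column-wise Kronecker product of two matrices with the same number of columns. The sketch below computes it over Boolean values, where multiplication becomes logical AND. The function name `boolean_khatri_rao` and the example matrices are assumptions for illustration; how specific SPARQL joins map onto this product is described in [1] and is not reproduced here.

```python
def boolean_khatri_rao(A, B):
    """Column-wise Kronecker product of binary matrices A (I x R) and B (J x R).

    The result has I*J rows and R columns; row i*J + j, column r holds
    A[i][r] AND B[j][r].
    """
    if len(A[0]) != len(B[0]):
        raise ValueError("A and B must have the same number of columns")
    R = len(A[0])
    return [
        [A[i][r] & B[j][r] for r in range(R)]
        for i in range(len(A))
        for j in range(len(B))
    ]

# Made-up binary inputs: A is 2 x 2, B is 3 x 2, so C is 6 x 2.
A = [[1, 0],
     [1, 1]]
B = [[0, 1],
     [1, 1],
     [1, 0]]
C = boolean_khatri_rao(A, B)
```

Each column of `C` is the flattened outer product of the corresponding columns of `A` and `B`, which is what makes the product a natural fit for pairing up index sets.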
