“(…) just as relational databases or XML need specific query languages (SQL and XQuery, respectively), the Web of Data, typically represented using RDF as a data format, needs its own, RDF-specific query language and facilities. This is provided by the SPARQL query language and the accompanying protocols. SPARQL makes it possible to send queries and receive results, e.g., through HTTP or SOAP” – W3C Semantic Web Activity
SPARQL
SPARQL is officially a recursive acronym for SPARQL Protocol and RDF Query Language, although it is sometimes expanded as Simple Protocol and RDF Query Language. It bears some similarity to SQL, but it is a W3C standard for querying RDF data on the semantic web (graph databases) rather than data in relational databases.
There are a number of commercial SPARQL software packages, including Virtuoso, which Chemical Semantics’ portal uses. The following is an example of a SPARQL query that searches for all molecules that have 5 atoms and include a chlorine atom.
prefix gc: <http://purl.org/gc/>
# rdf: is declared because the query uses rdf:type
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select
  ?molecule ?moleculeLabel ?inchikey
where {
  graph ?graph {
    ?molecule rdf:type gc:Molecule ;
      rdfs:label ?moleculeLabel ;
      gc:hasNumberOfAtoms "5" ;
      gc:hasAtom ?atom ;
      gc:hasInChIKey ?inchikey .
    ?atom gc:isElement "Cl" .
  }
}
order by ?molecule
The query begins by declaring the namespace prefixes it will use, including gc: (Gainesville Core, defined by Chemical Semantics, Inc.) and rdfs: (RDF Schema). The quantities ?molecule, ?moleculeLabel, etc. are simply variables with arbitrary names.
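To make the role of the prefixes concrete, here is a minimal Python sketch of how a prefixed name such as gc:Molecule expands into a full IRI. The function name expand and the PREFIXES table are illustrative choices, not part of any SPARQL library. The rdf: entry is the standard RDF namespace that rdf:type in the query relies on.

```python
# Prefix table mirroring the namespaces used by the query.
PREFIXES = {
    "gc": "http://purl.org/gc/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    # Standard RDF namespace, needed to resolve rdf:type.
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def expand(prefixed_name: str) -> str:
    """Expand a prefixed name such as 'gc:Molecule' into a full IRI."""
    prefix, _, local = prefixed_name.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"undeclared prefix: {prefix!r}")
    return PREFIXES[prefix] + local


print(expand("gc:Molecule"))  # http://purl.org/gc/Molecule
print(expand("rdfs:label"))   # http://www.w3.org/2000/01/rdf-schema#label
```

A prefix is therefore only shorthand: the query engine compares the expanded IRIs, not the abbreviated names.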
The query searches for Molecules that have certain properties such as a label, number of atoms, etc. The “;” is essentially an “and”: it adds a further property of the same subject, and each group of patterns ends with a “.”. In addition to the required Molecule properties, an ?atom found in the molecule must also be a chlorine, according to the last requirement:
?atom gc:isElement "Cl" .
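The way these patterns combine can be sketched in plain Python. The example below is a toy matcher, not how Virtuoso or any real SPARQL engine is implemented, and the triples are made up for illustration (the ex: names are hypothetical). For brevity it leaves out the label and InChIKey patterns. It shows that every pattern must hold at once, and that a variable such as ?atom must take the same value wherever it appears.

```python
# Made-up triples for illustration only; not data from the portal.
TRIPLES = [
    ("ex:mol1", "rdf:type", "gc:Molecule"),
    ("ex:mol1", "gc:hasNumberOfAtoms", "5"),
    ("ex:mol1", "gc:hasAtom", "ex:mol1_a1"),
    ("ex:mol1_a1", "gc:isElement", "Cl"),
    ("ex:mol2", "rdf:type", "gc:Molecule"),
    ("ex:mol2", "gc:hasNumberOfAtoms", "5"),
    ("ex:mol2", "gc:hasAtom", "ex:mol2_a1"),
    ("ex:mol2_a1", "gc:isElement", "C"),
    ("ex:mol3", "rdf:type", "gc:Molecule"),
    ("ex:mol3", "gc:hasNumberOfAtoms", "3"),
    ("ex:mol3", "gc:hasAtom", "ex:mol3_a1"),
    ("ex:mol3_a1", "gc:isElement", "Cl"),
]

# The query's patterns, with the ";" shorthand written out in full.
PATTERNS = [
    ("?molecule", "rdf:type", "gc:Molecule"),
    ("?molecule", "gc:hasNumberOfAtoms", "5"),
    ("?molecule", "gc:hasAtom", "?atom"),
    ("?atom", "gc:isElement", "Cl"),
]


def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # A variable binds on first use and must agree afterwards.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, triples, candidate)


molecules = sorted({b["?molecule"] for b in match(PATTERNS, TRIPLES)})
print(molecules)  # ['ex:mol1']
```

Only ex:mol1 survives: ex:mol2 has five atoms but no chlorine, and ex:mol3 has a chlorine but only three atoms.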
Running the query at the portal of Chemical Semantics, Inc. against a small demonstration RDF graph results in a table as shown below:
The columns are labelled by the variables used in the query's select clause. Here it returns only the InChIKey of the two molecules, but an extension of the query could return any computed properties of the two molecules that are part of our graph database on the semantic web.
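For readers who want to run such a query from a program rather than through the portal, the following Python sketch shows the two ends of the exchange: encoding a query for an HTTP GET request, and turning a response in the standard SPARQL JSON results format into a table. The endpoint URL is hypothetical, and the sample response is a made-up placeholder, not output from the Chemical Semantics portal. The sketch builds the URL but does not send it. An endpoint is typically asked for JSON with the header Accept: application/sparql-results+json.

```python
import json
from urllib.parse import urlencode

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint URL


def build_request_url(endpoint: str, query: str) -> str:
    """Encode a query as the 'query' parameter of an HTTP GET request."""
    return endpoint + "?" + urlencode({"query": query})


def to_table(results_json: str):
    """Convert a SPARQL JSON results document into (columns, rows)."""
    doc = json.loads(results_json)
    columns = doc["head"]["vars"]  # the variables named in the select
    rows = [
        # A variable left unbound in a row is simply absent, hence .get().
        [binding.get(col, {}).get("value") for col in columns]
        for binding in doc["results"]["bindings"]
    ]
    return columns, rows


# Placeholder response in the SPARQL JSON results layout (values invented).
SAMPLE = json.dumps({
    "head": {"vars": ["molecule", "moleculeLabel", "inchikey"]},
    "results": {"bindings": [{
        "molecule": {"type": "uri", "value": "http://example.org/mol1"},
        "moleculeLabel": {"type": "literal", "value": "example molecule"},
        "inchikey": {"type": "literal", "value": "PLACEHOLDER-INCHIKEY"},
    }]},
})

url = build_request_url(ENDPOINT, "select ?s where { ?s ?p ?o } limit 1")
columns, rows = to_table(SAMPLE)
print(url)
print(columns)
for row in rows:
    print(row)
```

The column names come straight from head.vars, which is why the table columns carry the names of the variables in the select.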