The Sample Dataset
We’ll use a tiny dataset of 5 movies: small enough to load and test quickly in AGS, yet enough to illustrate the core SPARQL features.
Movie ID | Title | Year | Rating | Genre | Director | Actor |
---|---|---|---|---|---|---|
M1 | The Shawshank Redemption | 1994 | 9.3 | Drama | Frank Darabont | Tim Robbins |
M2 | The Godfather | 1972 | 9.2 | Crime | Francis Ford Coppola | Marlon Brando |
M3 | Inception | 2010 | 8.8 | Sci-Fi | Christopher Nolan | Leonardo DiCaprio |
M4 | The Dark Knight | 2008 | 9.0 | Action | Christopher Nolan | Christian Bale |
M5 | Pulp Fiction | 1994 | 8.9 | Crime | Quentin Tarantino | John Travolta |
About This Dataset
This Movies dataset is a lightweight, illustrative sample designed for all SPARQL KB articles in this series.
It contains just 5 movies, enough to demonstrate graph joins, relationships, and filters while keeping queries fast in AGS.
Purpose:
- To give developers a common base dataset to test and validate queries.
- To ensure examples remain consistent across KBs (no need to reload different data each time).
- To help visualize real-world relationships — e.g., movies sharing the same director, or actors appearing in multiple genres.
Structure Overview:
- Each movie (ex:M1 to ex:M5) is a node of type ex:Movie.
- Each movie has literal properties such as ex:title, ex:released, ex:rating, and ex:genre.
- The ex:director and ex:hasActor relationships connect movies to people.
- Each director and actor is a foaf:Person with a foaf:name.
Example triple pattern:
ex:M1 ex:director ex:Frank_Darabont
ex:Frank_Darabont foaf:name "Frank Darabont"
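To make the structure above concrete, here is a minimal sketch of how the table could be loaded as triples with a single SPARQL update. The ex: namespace, the ex:Frank_Darabont IRI, and the xsd:gYear / xsd:decimal typing follow examples elsewhere in this article. The remaining person IRIs (such as ex:Tim_Robbins) are assumptions that follow the same naming pattern.

```sparql
PREFIX ex:   <http://example.org/movie/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>

INSERT DATA {
  # One node per table row; year and rating are typed literals
  ex:M1 a ex:Movie ;
        ex:title "The Shawshank Redemption" ;
        ex:released "1994"^^xsd:gYear ;
        ex:rating "9.3"^^xsd:decimal ;
        ex:genre "Drama" ;
        ex:director ex:Frank_Darabont ;
        ex:hasActor ex:Tim_Robbins .

  ex:M2 a ex:Movie ;
        ex:title "The Godfather" ;
        ex:released "1972"^^xsd:gYear ;
        ex:rating "9.2"^^xsd:decimal ;
        ex:genre "Crime" ;
        ex:director ex:Francis_Ford_Coppola ;
        ex:hasActor ex:Marlon_Brando .

  ex:M3 a ex:Movie ;
        ex:title "Inception" ;
        ex:released "2010"^^xsd:gYear ;
        ex:rating "8.8"^^xsd:decimal ;
        ex:genre "Sci-Fi" ;
        ex:director ex:Christopher_Nolan ;
        ex:hasActor ex:Leonardo_DiCaprio .

  ex:M4 a ex:Movie ;
        ex:title "The Dark Knight" ;
        ex:released "2008"^^xsd:gYear ;
        ex:rating "9.0"^^xsd:decimal ;
        ex:genre "Action" ;
        ex:director ex:Christopher_Nolan ;
        ex:hasActor ex:Christian_Bale .

  ex:M5 a ex:Movie ;
        ex:title "Pulp Fiction" ;
        ex:released "1994"^^xsd:gYear ;
        ex:rating "8.9"^^xsd:decimal ;
        ex:genre "Crime" ;
        ex:director ex:Quentin_Tarantino ;
        ex:hasActor ex:John_Travolta .

  # People: person IRIs other than ex:Frank_Darabont are assumed names
  ex:Frank_Darabont       a foaf:Person ; foaf:name "Frank Darabont" .
  ex:Francis_Ford_Coppola a foaf:Person ; foaf:name "Francis Ford Coppola" .
  ex:Christopher_Nolan    a foaf:Person ; foaf:name "Christopher Nolan" .
  ex:Quentin_Tarantino    a foaf:Person ; foaf:name "Quentin Tarantino" .
  ex:Tim_Robbins          a foaf:Person ; foaf:name "Tim Robbins" .
  ex:Marlon_Brando        a foaf:Person ; foaf:name "Marlon Brando" .
  ex:Leonardo_DiCaprio    a foaf:Person ; foaf:name "Leonardo DiCaprio" .
  ex:Christian_Bale       a foaf:Person ; foaf:name "Christian Bale" .
  ex:John_Travolta        a foaf:Person ; foaf:name "John Travolta" .
}
```

Christopher Nolan is a single node shared by ex:M3 and ex:M4, which is what makes "movies by the same director" a simple pattern match.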
The AGS Perspective: From Tables to Triples
Unlike a relational database that uses tables, rows, and columns, AGS (Anzo Graph Server) organizes all data as triples:
Subject → Predicate → Object
Each triple is one fact — for example:
ex:M1 ex:title "The Shawshank Redemption"
ex:M1 ex:genre "Drama"
ex:M1 ex:director ex:Frank_Darabont
Where SQL relies on explicit joins between tables, AGS follows relationships directly by matching graph patterns.
This allows you to discover linked entities like movies by the same director or actors appearing in multiple genres.
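As an illustration, the query below is a sketch of finding movies that share a director. It assumes the dataset is loaded with the ex: and foaf: properties described earlier. The FILTER on the movie IRIs is one possible way to avoid pairing a movie with itself or listing each pair twice.

```sparql
PREFIX ex:   <http://example.org/movie/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?directorName ?title1 ?title2
WHERE {
  # Two movie nodes that point at the same director node ?d
  ?m1 ex:director ?d ;
      ex:title ?title1 .
  ?m2 ex:director ?d ;
      ex:title ?title2 .
  ?d foaf:name ?directorName .
  # Keep each unordered pair once and drop self-pairs
  FILTER(STR(?m1) < STR(?m2))
}
```

On the five sample movies, the only match should be the Christopher Nolan pair, Inception and The Dark Knight.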
📘 Learn more: SPARQL Basics
How AGS Uses SPARQL
- AGS is an in-memory, massively parallel graph database.
- SPARQL queries execute across distributed nodes for speed.
- You can connect directly through the Query Console or GDI service.
- AGS extends standard SPARQL 1.1 with:
  - Parallel execution for scalability
  - GDI + SERVICE clause for data virtualization
  - Named Queries, incremental loading, and Query Console features
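For orientation, here is what a SERVICE clause looks like in standard SPARQL 1.1 federated query syntax. This is only a generic sketch. The endpoint IRI and the ex:award property are hypothetical, and the GDI-specific service IRI and options that AGS uses are not shown here, so consult the GDI documentation for those.

```sparql
PREFIX ex: <http://example.org/movie/>

SELECT ?title ?award
WHERE {
  # Matched against the local graph
  ?m a ex:Movie ;
     ex:title ?title .
  # Matched against a remote source (hypothetical endpoint and property)
  SERVICE <http://example.org/sparql> {
    ?m ex:award ?award .
  }
}
```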
Core SPARQL Query Types (with Movie Examples)
SELECT - Retrieve data
PREFIX ex: <http://example.org/movie/>
SELECT ?title ?rating
WHERE {
  ?m a ex:Movie ;
     ex:title ?title ;
     ex:rating ?rating .
}
LIMIT 5
Output (example):
"The Shawshank Redemption" | 9.3
"The Godfather" | 9.2
"The Dark Knight" | 9.0
"Inception" | 8.8
"Pulp Fiction" | 8.9
Executed in-memory for sub-second retrieval. Use LIMIT while testing.
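Note that LIMIT without ORDER BY returns an arbitrary subset, so the row order can vary between runs. A minimal sketch of a deterministic "top 3 by rating" query, assuming ratings are stored as xsd:decimal literals:

```sparql
PREFIX ex: <http://example.org/movie/>

SELECT ?title ?rating
WHERE {
  ?m a ex:Movie ;
     ex:title ?title ;
     ex:rating ?rating .
}
ORDER BY DESC(?rating)   # highest rating first
LIMIT 3
```

With the sample data this should return The Shawshank Redemption (9.3), The Godfather (9.2), and The Dark Knight (9.0), in that order.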
📖 Learn More: SPARQL Queries
CONSTRUCT - Create a new graph
Create triples linking each movie to its director’s name:
PREFIX ex: <http://example.org/movie/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
CONSTRUCT {
  ?movie ex:hasDirectorName ?name .
}
WHERE {
  ?movie a ex:Movie ;
         ex:director ?d .
  ?d foaf:name ?name .
}
LIMIT 3
Output:
ex:M1 ex:hasDirectorName "Frank Darabont"
ex:M2 ex:hasDirectorName "Francis Ford Coppola"
ex:M3 ex:hasDirectorName "Christopher Nolan"
Useful for reshaping or exporting graph structures.
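CONSTRUCT can also derive relationships that are not stored directly. The sketch below links each director to the actors in their movies. The ex:workedWith predicate is a hypothetical name introduced only for this illustration.

```sparql
PREFIX ex: <http://example.org/movie/>

CONSTRUCT {
  # ex:workedWith is an assumed predicate, not part of the sample dataset
  ?director ex:workedWith ?actor .
}
WHERE {
  ?movie a ex:Movie ;
         ex:director ?director ;
         ex:hasActor ?actor .
}
```

The result is a new graph of person-to-person edges. The stored movie triples are left unchanged.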
INSERT DATA - Add triples
PREFIX ex: <http://example.org/movie/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
INSERT DATA {
  # New movie node; expands to <http://example.org/movie/fight_club>
  ex:fight_club a ex:Movie ;
    ex:title "Fight Club" ;
    ex:released "1999"^^xsd:gYear ;
    ex:rating "8.8"^^xsd:decimal .
}
Suitable for small loads; use GDI pipelines for large datasets.
DELETE DATA - Remove triples
PREFIX ex: <http://example.org/movie/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
DELETE DATA {
  ex:fight_club ex:rating "8.8"^^xsd:decimal .
}
Removes specific triples from the graph.
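DELETE DATA needs the exact triple, including the literal value. When the current value is not known in advance, standard SPARQL 1.1 Update offers a pattern-based DELETE/INSERT form. The sketch below replaces whatever rating a movie titled "Fight Club" currently has. The new value 8.9 is purely illustrative.

```sparql
PREFIX ex:  <http://example.org/movie/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

DELETE { ?m ex:rating ?old . }
INSERT { ?m ex:rating "8.9"^^xsd:decimal . }   # illustrative new value
WHERE {
  # Locate the movie by title and capture its current rating
  ?m a ex:Movie ;
     ex:title "Fight Club" ;
     ex:rating ?old .
}
```

If no movie matches the WHERE pattern (for example, because it has no ex:rating triple), the update changes nothing.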
ASK - Boolean existence check
PREFIX ex: <http://example.org/movie/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
ASK {
  ?m a ex:Movie ;
     ex:rating ?r .
  FILTER(xsd:decimal(?r) > 9.0)
}
Output: true
SPARQL in AGS vs SQL
Concept | SQL (Tables) | SPARQL in AGS (Graphs) |
---|---|---|
Data Model | Rows & columns | Triples: subject–predicate–object |
Query Target | Fixed schema | Dynamic graph patterns |
Relationships | Foreign keys / joins | Direct graph edges |
Execution | Disk-based, sequential | In-memory, parallel |
Federation | ETL into warehouse | GDI + SERVICE = live queries |
Developer Quick Wins in AGS
- Always test with LIMIT.
- Use typed literals for filtering and sorting.
- Prefer VALUES over complex FILTER logic (for performance).
- For big joins, use subqueries or WITH clauses (for modular performance).
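As a sketch of the VALUES tip, the query below restricts movies to two genres by listing the allowed values inline, assuming genres are stored as plain string literals as in the sample table. The equivalent FILTER form is shown in a comment for comparison.

```sparql
PREFIX ex: <http://example.org/movie/>

SELECT ?title ?genre
WHERE {
  # Inline list of allowed genres
  VALUES ?genre { "Crime" "Sci-Fi" }
  ?m a ex:Movie ;
     ex:title ?title ;
     ex:genre ?genre .
  # Equivalent FILTER form:
  # FILTER(?genre = "Crime" || ?genre = "Sci-Fi")
}
```

On the sample data this should match The Godfather and Pulp Fiction (Crime) and Inception (Sci-Fi). Whether VALUES is actually faster depends on the engine and data, so it is worth checking both forms in the Query Console.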
📖 Tooling: Query Console
📖 Data Loading: Manage Data
Where to Learn More