Introduction: In our previous units, we talked about Web Analytics and how data is collected. But how should that data be stored when the relationships between records matter? Enter graph databases. Instead of tables, graph databases like Neo4j store data as nodes and relationships.
If you are preparing for your exams or a career in Data Science, understanding Cypher (the query language for Neo4j) is a must. Below is a practical dataset and 9 powerful queries to help you master the logic!
The Scenario: A Movie Recommendation System
Imagine a system where we have Actors, Movies, Directors, and Users. We want to find who acted with whom, what friends are watching, and how to update ratings.
Part 1: The Dataset (Setup)
To practice, you first need to create the data. Copy and paste the code below into your Neo4j Browser to set up the nodes for Actors, Users, Movies, Directors, and Genres, along with the relationships between them.
// Each statement ends with a semicolon so that it runs on its own;
// a MATCH cannot directly follow a CREATE inside the same statement.

// Actor nodes with a name property
CREATE (:Actor {name: "Actor1"});
CREATE (:Actor {name: "Actor2"});
CREATE (:Actor {name: "Actor3"});

// User nodes with a name property
CREATE (:User {name: "User1"});
CREATE (:User {name: "User2"});
CREATE (:User {name: "John"});

// Movie nodes with title, release_year, and rating properties
CREATE (:Movie {title: "Movie1", release_year: 1995, rating: 7.5});
CREATE (:Movie {title: "Movie2", release_year: 1998, rating: 8.2});
CREATE (:Movie {title: "Movie3", release_year: 2002, rating: 9.1});

// Director nodes with a name property
CREATE (:Director {name: "Director1"});
CREATE (:Director {name: "Director2"});

// Genre nodes with a name property
CREATE (:Genre {name: "Genre1"});
CREATE (:Genre {name: "Genre2"});

// Relationships

// ACTED_IN relationship between Actors and Movies with a 'role' property
MATCH (a:Actor {name: "Actor1"}), (m:Movie {title: "Movie1"})
CREATE (a)-[:ACTED_IN {role: "Role1"}]->(m);
MATCH (a:Actor {name: "Actor2"}), (m:Movie {title: "Movie1"})
CREATE (a)-[:ACTED_IN {role: "Role2"}]->(m);
MATCH (a:Actor {name: "Actor1"}), (m:Movie {title: "Movie2"})
CREATE (a)-[:ACTED_IN {role: "Role3"}]->(m);
MATCH (a:Actor {name: "Actor3"}), (m:Movie {title: "Movie2"})
CREATE (a)-[:ACTED_IN {role: "Role4"}]->(m);

// FRIEND relationship between Users
MATCH (u1:User {name: "User1"}), (u2:User {name: "User2"})
CREATE (u1)-[:FRIEND]->(u2);
MATCH (u2:User {name: "User2"}), (u3:User {name: "John"})
CREATE (u2)-[:FRIEND]->(u3);

// DIRECTED relationship between Directors and Movies
MATCH (d:Director {name: "Director1"}), (m:Movie {title: "Movie1"})
CREATE (d)-[:DIRECTED]->(m);
MATCH (d:Director {name: "Director2"}), (m:Movie {title: "Movie2"})
CREATE (d)-[:DIRECTED]->(m);

// BELONGS_TO relationship between Movies and Genres
MATCH (m:Movie {title: "Movie1"}), (g:Genre {name: "Genre1"})
CREATE (m)-[:BELONGS_TO]->(g);
MATCH (m:Movie {title: "Movie2"}), (g:Genre {name: "Genre2"})
CREATE (m)-[:BELONGS_TO]->(g);
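Queries 6, 7, and 9 in Part 2 traverse RATED relationships, which the setup above never creates, so they return nothing on this dataset as it stands. The sketch below is one possible way to add a few. The 1 to 5 rating scale, the specific values, and the extra Genre1 link for Movie3 are all assumptions made purely for illustration; they are not part of the original dataset.

```cypher
// Hypothetical user ratings on an assumed 1-5 scale (illustrative values only)
MATCH (u:User {name: "User1"}), (m:Movie {title: "Movie1"})
CREATE (u)-[:RATED {rating: 5}]->(m);
MATCH (u:User {name: "John"}), (m:Movie {title: "Movie1"})
CREATE (u)-[:RATED {rating: 4}]->(m);
MATCH (u:User {name: "User2"}), (m:Movie {title: "Movie3"})
CREATE (u)-[:RATED {rating: 3}]->(m);

// Assumption: Movie3 has no genre in the setup, so link it to Genre1
// to give two different movies a genre in common
MATCH (m:Movie {title: "Movie3"}), (g:Genre {name: "Genre1"})
CREATE (m)-[:BELONGS_TO]->(g);
```

Note that the r.rating property on a RATED relationship (one user's score) is separate from the rating property on the Movie node itself.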
Part 2: Essential Cypher Queries (Exam Focus)
Here are 9 queries that solve real-world data problems:
1. Identifying "Power Actors" (Aggregation)
Need to find actors who have been in more than 5 movies? This uses COUNT and WITH.
MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)
WITH a, COUNT(m) AS movieCount
WHERE movieCount > 5
RETURN a.name AS Actor, movieCount AS MoviesActedIn
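On the small dataset from Part 1 no actor comes close to 5 movies, so the query above returns no rows. To see the aggregation at work, a variant with a lower threshold can help; the threshold of 1 and the added ORDER BY are choices made here for illustration.

```cypher
// Same aggregation, with a threshold low enough for the sample data
MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)
WITH a, count(m) AS movieCount
WHERE movieCount > 1
RETURN a.name AS Actor, movieCount AS MoviesActedIn
ORDER BY MoviesActedIn DESC
```

If the Part 1 setup was run exactly once, this should return only Actor1 with a count of 2, since Actor1 is the only actor linked to both Movie1 and Movie2.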
2. Social Networking: Finding Connections
In social media, we often need the "Shortest Path" between friends.
MATCH path = shortestPath((u1:User)-[:FRIEND*]-(u2:User))
WHERE u1.name = "User1" AND u2.name = "User2"
RETURN [n IN nodes(path) | n.name] AS Path
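User1 and User2 are direct friends, so the path above has only one hop. A multi-hop case may be more instructive. The sketch below looks for the path from User1 to John and caps the search at 5 hops; the cap is an arbitrary value chosen here, not something the dataset requires.

```cypher
// Shortest FRIEND path from User1 to John, following relationships in
// either direction and giving up after 5 hops
MATCH path = shortestPath(
  (u1:User {name: "User1"})-[:FRIEND*..5]-(u2:User {name: "John"})
)
RETURN [n IN nodes(path) | n.name] AS Path, length(path) AS Hops
```

With the Part 1 data this should pass through User2, giving a path of User1, User2, John with 2 hops.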
3. Bulk Updates: Increasing Ratings
Marketing trends change! If you need to boost the ratings of older "classic" movies (released before 2000):
MATCH (m:Movie)
WHERE m.release_year < 2000
SET m.rating = m.rating + 1
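Because a bulk SET changes data immediately, it can be worth previewing the affected nodes first and guarding against ratings that drift past the top of the scale when the update is run more than once. The sketch below assumes a maximum rating of 10; the dataset does not state the scale, so treat that ceiling as hypothetical.

```cypher
// Preview which movies the update would touch
MATCH (m:Movie)
WHERE m.release_year < 2000
RETURN m.title AS Title, m.rating AS CurrentRating;

// Same update, capped at an assumed maximum of 10
MATCH (m:Movie)
WHERE m.release_year < 2000
SET m.rating = CASE WHEN m.rating + 1 > 10 THEN 10.0 ELSE m.rating + 1 END
RETURN m.title AS Title, m.rating AS NewRating;
```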
4. Finding "Co-Stars" (Self-Joins)
To find pairs of actors who frequently work together:
MATCH (a1:Actor)-[:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(a2:Actor)
WHERE id(a1) < id(a2)
WITH a1, a2, COUNT(m) AS moviesTogether
WHERE moviesTogether > 1
RETURN a1.name AS Actor1, a2.name AS Actor2, moviesTogether AS MoviesTogether
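In the Part 1 data no pair of actors shares more than one movie, so the query above comes back empty. The variant below is a sketch that drops the threshold and lists the shared titles instead. It also compares names rather than internal ids to avoid listing each pair twice, which assumes actor names are unique.

```cypher
// List every co-star pair once, together with the movies they share
MATCH (a1:Actor)-[:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(a2:Actor)
WHERE a1.name < a2.name   // assumes unique names; keeps one ordering per pair
RETURN a1.name AS Actor1, a2.name AS Actor2, collect(m.title) AS SharedMovies
```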
5. Leaderboards: Top Directors
MATCH (d:Director)-[:DIRECTED]->(m:Movie)
RETURN d.name AS Director, COUNT(m) AS MoviesDirected
ORDER BY MoviesDirected DESC
LIMIT 3
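6. Filtering by User Experience (Average Ratings)
This query needs RATED relationships carrying a rating property, which are not part of the Part 1 setup.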
MATCH (u:User)-[r:RATED]->(m:Movie)
WITH u, AVG(r.rating) AS avgRating
WHERE avgRating > 4
RETURN u.name AS User, avgRating AS AverageRating
7. "People Also Liked..." Logic
Find users who have similar tastes to "John":
MATCH (u1:User)-[r1:RATED]->(m:Movie)<-[r2:RATED]-(u2:User {name: "John"})
WHERE u1 <> u2
RETURN DISTINCT u1.name AS User
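The query above treats any single shared movie as "similar taste". One way to rank the candidates is to count how many movies each user has rated in common with John, as in the sketch below. It assumes RATED relationships exist between users and movies; the variable names john and other are introduced here for readability.

```cypher
// Rank other users by the number of movies they have rated in common with John
MATCH (john:User {name: "John"})-[:RATED]->(m:Movie)<-[:RATED]-(other:User)
RETURN other.name AS User, count(m) AS SharedMovies
ORDER BY SharedMovies DESC
```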
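8. Dynamic Tagging (Conditional Logic)
Automatically label high-performing content:
MATCH (m:Movie)
WHERE m.rating > 8
// MERGE reuses a single "High Rated" genre node; CREATE would add a new one for every matching movie
MERGE (g:Genre {name: "High Rated"})
MERGE (m)-[:BELONGS_TO]->(g)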
9. Complex Recommendation Paths
Find connections based on unique shared interests (Genres):
MATCH path = (u1:User)-[:RATED]->(m:Movie)-[:BELONGS_TO]->(g:Genre)<-[:BELONGS_TO]-(otherM:Movie)<-[:RATED]-(u2:User)
WHERE u1.name = "User1" AND u2.name = "User2"
// Each matched path contains exactly one Genre node, so uniqueGenres always has size 1 here
WITH path, COLLECT(DISTINCT g.name) AS uniqueGenres
WHERE SIZE(uniqueGenres) = 1
RETURN path, uniqueGenres