A graph is a set of vertices connected by edges. In the realm of databases, a graph is a set of items in which any item can have any type of relationship to any other item in the dataset.
Vertices - Vertices are the datapoints in a graph. For those familiar with any form of SQL database, a vertex can be viewed as a row/record. For those not familiar with SQL, a vertex can be viewed as a piece of data.
Edges - An edge is the relationship between two different vertices. Edges are hard to translate into SQL terms because of how flexible they are in graph databases, but an edge can be viewed as the way two pieces of data are connected.
A social network is one of the best examples of a graph that most people can relate to. In a social network, you have people and you have relationships between each person. The people are represented as vertices, and the relationships are represented as edges. There are many different types of relationships such as: married, friends with, in a relationship with, works with, etc. This is the same for graphs. There are endless possibilities for different types of edges and there are endless possibilities for different types of vertices.
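To make the idea concrete, here is a minimal sketch of a tiny social network stored as vertices and edges in plain Python. The names and relationships are invented for illustration, and this is not how Neo4J stores data internally.

```python
# Vertices: the people in a small, made-up social network.
people = {"alice", "bob", "carol"}

# Edges: (source, relationship type, target) triples between people.
relationships = [
    ("alice", "married", "bob"),
    ("alice", "works with", "carol"),
    ("bob", "friends with", "carol"),
]


def relationships_of(person):
    """Return (relationship type, other person) for every edge touching `person`."""
    found = []
    for source, rel_type, target in relationships:
        if source == person:
            found.append((rel_type, target))
        elif target == person:
            found.append((rel_type, source))
    return found


print(relationships_of("carol"))
```

Adding a new kind of relationship only means appending another triple; no schema has to change, which is the flexibility described above.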
<img src="https://assets.digitalocean.com/articles/Neo4J_Ubuntu/1.png">
In this picture, the graph vertices are just integers and the edges are not labeled. Despite the simplicity, this is still a graph.
In the example of an airline company, when dealing with getting a plane from point A to point B, you want to choose the best possible path for the plane to take. Let the airports be visualized as vertices and the flight paths between them be edges.
<img src="https://assets.digitalocean.com/articles/Neo4J_Ubuntu/2.png">
Each edge is assigned a weight, or a cost, for utilizing it. Here, the weight represents the distance between two airports. So for example, in the graph above, the cost to get from LAX to ORD is 1749. Weighted graphs are especially useful in geographical data representations where distance is a factor.
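As a rough illustration of how edge weights can be used to pick the best path, the sketch below runs Dijkstra's shortest-path algorithm over a small airport graph. Only the LAX to ORD distance (1749) comes from the text; the other airports and distances are made up for the example, and the function name `cheapest_cost` is hypothetical.

```python
import heapq

# Undirected weighted graph: airport -> {neighbor: distance}.
# Only LAX-ORD (1749) is taken from the article; the rest is invented.
flights = {
    "LAX": {"ORD": 1749, "DFW": 1200},
    "ORD": {"LAX": 1749, "DFW": 800, "JFK": 750},
    "DFW": {"LAX": 1200, "ORD": 800, "JFK": 1400},
    "JFK": {"ORD": 750, "DFW": 1400},
}


def cheapest_cost(graph, start, goal):
    """Return the lowest total weight from start to goal, or None if unreachable."""
    best = {start: 0}
    queue = [(0, start)]  # min-heap of (cost so far, airport)
    while queue:
        cost, airport = heapq.heappop(queue)
        if airport == goal:
            return cost
        if cost > best.get(airport, float("inf")):
            continue  # stale entry; a cheaper route was already found
        for neighbor, weight in graph[airport].items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return None


print(cheapest_cost(flights, "LAX", "JFK"))
```

With these sample weights, the route through ORD beats the route through DFW even though the first leg is longer, which is the kind of trade-off a weighted graph lets you compute.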
A graph database is a NoSQL database that stores information as vertices and edges (nodes and relationships). Rather than having foreign keys and select statements, you use edges and graph traversals to query the data. This method of querying data is extremely powerful in many cases such as social networks, biology, chemistry, business analytics, and any situation where data is best represented as items that have relationships with other items in the dataset.
In this tutorial we will install Neo4J, a popular graph database with language bindings for most popular programming languages.
Add the Neo4J key into the apt package manager:
wget -O - http://debian.neo4j.org/neotechnology.gpg.key | apt-key add -
Add Neo4J to the Apt sources list:
echo 'deb http://debian.neo4j.org/repo stable/' > /etc/apt/sources.list.d/neo4j.list
Update the package manager:
apt-get update
Install Neo4J:
apt-get install neo4j
Neo4J should now be running. You can check this with the following command:
service neo4j-service status
One of the things that makes Neo4J awesome is its easy-to-use RESTful API, which means it can be used from virtually any programming language that can make web requests. Many of the operations performed on a Neo4J database are executed using a Cypher query. Cypher is the query language Neo4J uses to read and manipulate data. Cypher is to Neo4J as SQL is to MySQL.
The structure of a web request to the Neo4J RESTful API is as follows:
curl -H "Accept: application/json; charset=UTF-8" -H "Content-Type: application/json" -X POST http://SERVERNAME:7474/db/data/cypher -d '{
"query" : "CYPHER QUERY GOES HERE",
"params" : {
QUERY PARAMETERS GO HERE
}
}'
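For readers calling the API from a program rather than from curl, the same request can be assembled in code. The sketch below only builds the URL, headers, and JSON body shown above and does not send anything; the helper name `build_cypher_request` and its `server` parameter are hypothetical, not part of Neo4J.

```python
import json


def build_cypher_request(query, params=None, server="localhost"):
    """Build the pieces of a POST request to the Cypher endpoint shown above."""
    url = "http://{}:7474/db/data/cypher".format(server)
    headers = {
        "Accept": "application/json; charset=UTF-8",
        "Content-Type": "application/json",
    }
    # Parameters travel separately from the query text, as in the curl example.
    body = json.dumps({"query": query, "params": params or {}})
    return url, headers, body


url, headers, body = build_cypher_request(
    "CREATE (n:Person { name : {name} }) RETURN n",
    {"name": "Foo"},
)
print(url)
print(body)
```

Any HTTP client can then POST `body` to `url` with these headers. Letting `json.dumps` produce the body avoids the quoting and brace-matching mistakes that are easy to make when writing the JSON by hand.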
Neo4J is a database, and databases need data, so let’s add some data!
Create a new node:
curl -H "Accept: application/json; charset=UTF-8" -H "Content-Type: application/json" -X POST http://localhost:7474/db/data/cypher -d '{
"query" : "CREATE (n:Person { name : {name} }) RETURN n",
"params" : {
"name" : "Foo"
}
}'
I mentioned earlier that graph databases store data as nodes and relationships. A relationship requires two nodes, so let’s create another node:
curl -H "Accept: application/json; charset=UTF-8" -H "Content-Type: application/json" -X POST http://localhost:7474/db/data/cypher -d '{
"query" : "CREATE (n:Person { name : {name} }) RETURN n",
"params" : {
"name" : "Bar"
}
}'
Now we can create a relationship between these two nodes:
curl -H "Accept: application/json; charset=UTF-8" -H "Content-Type: application/json" -X POST http://localhost:7474/db/data/node/0/relationships -d '{
"to" : "http://localhost:7474/db/data/node/1",
"type" : "Comes Before"
}'
Below are some example Cypher queries we can use to view the data we previously inserted.
We can start at the first node we created, and get all connected nodes and the corresponding relationships:
curl -H "Accept: application/json; charset=UTF-8" -H "Content-Type: application/json" -X POST http://localhost:7474/db/data/cypher -d '{
"query" : "MATCH (x {name: {startName}})-[r]->(n) RETURN type(r), n.name",
"params" : {
"startName" : "Foo"
}
}'
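The query above returns JSON. Assuming the response has a `columns` list and a `data` list of rows (the shape this endpoint is generally documented to use, but worth checking against your server's actual output), a small helper can turn each row into a dictionary. The `sample` string is hand-written for illustration, not captured from a real server, and `rows_as_dicts` is a hypothetical name.

```python
import json


def rows_as_dicts(response_text):
    """Pair each row in "data" with the names in "columns"."""
    result = json.loads(response_text)
    return [dict(zip(result["columns"], row)) for row in result["data"]]


# Hand-written sample in the assumed response shape, not real server output.
sample = '{"columns": ["type(r)", "n.name"], "data": [["Comes Before", "Bar"]]}'

for row in rows_as_dicts(sample):
    print(row["type(r)"], "->", row["n.name"])
```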
Return the name property of all the nodes in the graph (Note: This should not be performed on large graphs):
curl -H "Accept: application/json; charset=UTF-8" -H "Content-Type: application/json" -X POST http://localhost:7474/db/data/cypher -d '{
"query" : "START n = node(*) return n.name",
"params" : {
}
}'
Return all the relationship types in the graph (Note: This should not be performed on large graphs):
curl -H "Accept: application/json; charset=UTF-8" -H "Content-Type: application/json" -X POST http://localhost:7474/db/data/cypher -d '{
"query" : "START r=rel(*) return type(r) ",
"params" : {
}
}'
A more complete description and list of methods provided by the Neo4J RESTful API can be found <a href="http://docs.neo4j.org/chunked/milestone/rest-api.html">here</a>, and information on the Cypher query language can be found <a href="http://docs.neo4j.org/chunked/stable/cypher-query-lang.html">here</a>.
<div class="author">Submitted by: <a href="http://blog.opendev.io">Cooper Thompson</a></div>
Hello,
I’m not able to access http://IP:7474. Could you please explain how I can turn the admin page on?
Thanks,
The command for starting Neo4j in the tutorial (service neo4j start) looks exactly right and should work on 2.0. Note that you need to invoke it as a service rather than through the neo4j start script. I am on the Neo4j development team and we’d love to hear about any problems you have. The best place to get advice for Neo4j is StackOverflow.
-Ben
Got this:
wget: unable to resolve host address ‘debian.neo4j.org’
bit out of date…
service neo4j-service status
should be
service neo4j status
also note most folks will want to mess with ./etc/neo4j/neo4j.conf, at least to open up for external access (see my reply to tuncaucer below)
also most folks will want to visit the browser to mess with the DB using UI… http://Your-server-ip:7474/browser/
The first link “here” is 404.
Nice guide. You should add https on your wget’s to make man-in-the-middle attacks harder.
Any advice on how to administer a database like this? Starting, stopping, restarting, backing up?
having a problem accessing port 7474 from a browser
http://ipaddress:7474 and http://ipaddress:7474/browser simply do not work. I get a “ERR_CONNECTION_REFUSED” in Chrome
Not quite sure what to do here…
Hey how about handling security? Like only accepting connections from certain IP addresses/ or authentication?
Thank you very very much… It’s amazing