Finding Data in Graphs with Neo4j


Dubbed the database of the future, Neo4j is an open source graph database implemented in Java. This article is a teaser to whet your appetite for Neo4j and other graph databases.

Fast processors, high network speeds and agile computational algorithms are now readily available, yet database performance remains a serious concern for the industry. Businesses need to be agile, but the schema of a relational database is not designed for frequent change. Moreover, data is growing not only in volume and velocity, but also in complexity and interconnectedness. RDBMSs, which were designed for tabular data with a consistent structure and a fixed schema, have therefore become a bottleneck. These problems have led to some lateral thinking in search of a design that is intuitive not just for developers but for everyone familiar with the domain, and that is how graph databases such as Neo4j emerged.
Neo4j is an open source NoSQL graph database implemented in Java and Scala. It is sponsored by Neo Technology and is widely used in production by companies such as eBay, Walmart and Telenor. Some of the special features of Neo4j are:

  • Efficient implementation of the property graph model, down to the storage level
  • Relationships are materialised at creation time, so complex queries pay no extra penalty at runtime
  • Constant time traversal of relationships in the graph, both in depth and in breadth, due to the efficient representation of nodes and relationships
  • Compact storage and in-memory caching of graphs on top of the JVM, allowing a single database to scale up to billions of nodes, even on moderate hardware
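To make the first three points concrete, here is a minimal in-memory sketch of a property graph in Java. It is only an illustration and not Neo4j's actual storage format: the class names, the KNOWS relationship and the sample people are all assumptions made for this example. The idea it shows is that each relationship is stored as a direct reference on both of its end nodes when it is created, so finding a node's neighbours means walking that node's own list rather than joining tables.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a toy property graph, not how Neo4j stores data on disk.
class PropertyGraphSketch {
    // Names of the nodes reached over outgoing relationships of the given type.
    // Only the relationships attached to this node are visited.
    static List<String> neighbourNames(GraphNode node, String type) {
        List<String> names = new ArrayList<>();
        for (GraphRelationship r : node.outgoing) {
            if (r.type.equals(type)) {
                names.add((String) r.end.properties.get("name"));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical sample data
        GraphNode alice = new GraphNode("Person", "Alice");
        GraphNode bob = new GraphNode("Person", "Bob");
        GraphNode carol = new GraphNode("Person", "Carol");
        new GraphRelationship("KNOWS", alice, bob);
        new GraphRelationship("KNOWS", alice, carol);
        new GraphRelationship("KNOWS", bob, carol);
        System.out.println(neighbourNames(alice, "KNOWS"));
    }
}

// A node carries a label, a property map and references to its relationships.
class GraphNode {
    final String label;
    final Map<String, Object> properties = new HashMap<>();
    final List<GraphRelationship> outgoing = new ArrayList<>();
    final List<GraphRelationship> incoming = new ArrayList<>();

    GraphNode(String label, String name) {
        this.label = label;
        this.properties.put("name", name);
    }
}

// A relationship has a type and registers itself on both end nodes on creation.
class GraphRelationship {
    final String type;
    final GraphNode start;
    final GraphNode end;

    GraphRelationship(String type, GraphNode start, GraphNode end) {
        this.type = type;
        this.start = start;
        this.end = end;
        start.outgoing.add(this); // materialised at creation time
        end.incoming.add(this);
    }
}
```

The cost of expanding a node here depends on how many relationships that node has, not on the total size of the graph, which is the intuition behind the traversal claim above.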
Figure 1: Movies
Figure 2: Example

Matchmaking, network management, software analytics, scientific research, routing, organisational and project management, recommendations and social networks are some of the popular use cases of Neo4j. The database can be downloaded from http://neo4j.com/download, and installation follows the usual sequence of Run -> Next -> Next -> Finish, just as with the VLC media player. Neo4j comes with a Web browser based UI bound to http://localhost:7474. The simplest way of getting started is to use this database browser to execute your graph queries (written in Cypher) in a workbench-like fashion. Results are presented either as an intuitive graph visualisation or as easy-to-read, exportable tables. The browser also has tutorials for some basic hands-on experience, like the movie example shown in Figure 1.

Neo4j is queried with Cypher, its declarative query language. Cypher’s syntax provides a familiar way to match patterns of nodes and relationships in the graph. It has the concepts of nodes, relationships and labels which, in layman’s terms, can be loosely related to rows, joins and tables, respectively, in an RDBMS. A node is written in parentheses with an identifier, like (P) for a person. Nodes can be assigned roles or types called labels, as in (P:Person). A relationship between nodes is written in square brackets and named after a colon, like [:child].
A simple example of a Cypher query is given below:

CREATE (BajrangiBhaijaan:Movie {title:'Bajrangi Bhaijaan', released:2015, tagline:'Selfie Le Le Re'})
CREATE (KabirKhan:Person {name:'Kabir Khan', born:1967})
CREATE (SalmanKhan:Person {name:'Salman Khan', born:1965})
CREATE
  (SalmanKhan)-[:ACTED_IN {roles:['Bajrangi Prasad']}]->(BajrangiBhaijaan),
  (KabirKhan)-[:DIRECTED]->(BajrangiBhaijaan)
WITH SalmanKhan AS a
MATCH (a)-[:ACTED_IN]->(m)<-[:DIRECTED]-(d) RETURN a,m,d

This query creates three nodes carrying two labels, Movie and Person, and then connects them with the ACTED_IN and DIRECTED relationships. The final MATCH ... RETURN statement fetches the actor, the movie and its director, which the browser displays as the graph shown in Figure 2.
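If the arrows in the MATCH pattern look cryptic, the following plain-Java sketch may help. It spells out what the pattern (a)-[:ACTED_IN]->(m)<-[:DIRECTED]-(d) asks for: every actor, movie and director where the actor acted in the movie and the director directed that same movie. The GraphEdge class and the method name are made up for this illustration, and the nested loop only mimics the meaning of the pattern; it says nothing about how Neo4j actually executes the query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: the semantics of the MATCH pattern, written as loops.
class MoviePatternSketch {
    // Rows of "actor | movie | director" for one actor, mirroring
    // MATCH (a)-[:ACTED_IN]->(m)<-[:DIRECTED]-(d) RETURN a,m,d
    static List<String> actorMovieDirector(List<GraphEdge> edges, String actor) {
        List<String> rows = new ArrayList<>();
        for (GraphEdge acted : edges) {
            if (!acted.type.equals("ACTED_IN") || !acted.from.equals(actor)) {
                continue;
            }
            for (GraphEdge directed : edges) {
                // Both relationships must point at the same movie node (m)
                if (directed.type.equals("DIRECTED") && directed.to.equals(acted.to)) {
                    rows.add(acted.from + " | " + acted.to + " | " + directed.from);
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<GraphEdge> edges = Arrays.asList(
            new GraphEdge("Salman Khan", "ACTED_IN", "Bajrangi Bhaijaan"),
            new GraphEdge("Kabir Khan", "DIRECTED", "Bajrangi Bhaijaan"));
        for (String row : actorMovieDirector(edges, "Salman Khan")) {
            System.out.println(row);
        }
    }
}

// A directed, typed relationship between two named nodes (hypothetical helper).
class GraphEdge {
    final String from;
    final String type;
    final String to;

    GraphEdge(String from, String type, String to) {
        this.from = from;
        this.type = type;
        this.to = to;
    }
}
```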

Figure 3: DBConnectionManager
Figure 4: Add update

Developing applications with Neo4j is fun, as it has drivers for Java, .NET, JavaScript, Python, Ruby, etc. In this article, let’s write some Java code for the CRUD (create, read, update and delete) operations. The prerequisites are Java 7 and the Neo4j JDBC driver (neo4j-jdbc). After downloading the JAR, create a new project in Eclipse and add the JAR to the project’s build path. Now create two files, DBConnectionManager.java and NodeManipulation.java, as shown in Figure 3.
DBConnectionManager contains the code for connecting to the Neo4j database, for which the URL, username and password are provided. In my case, neo4j is the username and NMS the password for the Neo4j database server. The server listens on port 7474, so the URL in the code is jdbc:neo4j://localhost:7474/. The code for adding, deleting, viewing and updating nodes is in NodeManipulation, as shown in Figure 4.
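Since the actual class is only shown as a screenshot, here is a rough sketch of what such a connection manager could look like. It uses only the standard java.sql API; the class and method names are my own and are not taken from the figure. The URL is assumed to be jdbc:neo4j://localhost:7474/, and the format a driver accepts differs between versions, so check the documentation of the JAR you downloaded. Without the driver JAR on the classpath, the program simply reports that it could not connect.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Sketch of a connection helper; names are hypothetical, not the article's code.
class DBConnectionSketch {
    static final String HOST = "localhost";
    static final int PORT = 7474; // port used in the article

    // Builds the JDBC URL; the scheme is an assumption that depends on the driver version.
    static String jdbcUrl(String host, int port) {
        return "jdbc:neo4j://" + host + ":" + port + "/";
    }

    // Opens a connection through the standard JDBC DriverManager.
    static Connection open(String user, String password) throws SQLException {
        return DriverManager.getConnection(jdbcUrl(HOST, PORT), user, password);
    }

    public static void main(String[] args) {
        System.out.println("Connecting to " + jdbcUrl(HOST, PORT));
        // Credentials as used in the article; replace them with your own.
        try (Connection con = open("neo4j", "NMS")) {
            System.out.println("Connected: " + !con.isClosed());
        } catch (SQLException e) {
            // Raised when no driver is on the classpath or the server is unreachable
            System.out.println("Could not connect: " + e.getMessage());
        }
    }
}
```

Keeping the URL in one small method makes it easy to switch host, port or scheme later without touching the rest of the code.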

Figure 5: Delete display

In the ‘Add node’ method, uniqueID is an auto-generated value that acts as the primary key for the label. The HashMap named policyMap holds the values stored as properties of the node, such as PolicyName, Cron Text and Location. The update statement uses a WHERE clause to select the node with the specific policy ID and the SET clause to change it. A node is deleted with the DELETE clause, whereas the RETURN clause is used to display nodes (Figure 5). You can learn how to perform more complex operations on the Neo4j site.
I must end by saying that the future belongs to graph databases, as they let us answer complex questions like, “Has Kabir ever worked with Salman, or does he know anyone who has worked with him?” within seconds.
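For readers who cannot make out the screenshots, the four operations boil down to four Cypher statements. The sketch below only builds the query text; it does not talk to the database. The Policy label, the property spellings, the use of a random UUID for uniqueID and all method names are assumptions made for illustration. The queries use the $name parameter syntax of current Cypher (older Neo4j releases wrote parameters as {name}), and how the values are bound depends on the driver you use.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

// Sketch of the Cypher behind add, update, delete and display; names are hypothetical.
class NodeManipulationSketch {
    static final String LABEL = "Policy"; // assumed label

    // One possible way to auto-generate a unique ID (an assumption, not the article's code).
    static String newUniqueId() {
        return UUID.randomUUID().toString();
    }

    // CREATE with one parameter per property. Keys go into the query text,
    // so they must be trusted, valid property names.
    static String addNodeQuery(Map<String, Object> policyMap) {
        StringBuilder props = new StringBuilder("uniqueID: $uniqueID");
        for (String key : policyMap.keySet()) {
            props.append(", ").append(key).append(": $").append(key);
        }
        return "CREATE (p:" + LABEL + " {" + props + "}) RETURN p";
    }

    // WHERE picks the node by its ID, SET changes one property.
    static String updateNodeQuery(String property) {
        return "MATCH (p:" + LABEL + ") WHERE p.uniqueID = $uniqueID SET p."
                + property + " = $value RETURN p";
    }

    // DELETE fails if the node still has relationships; remove those first.
    static String deleteNodeQuery() {
        return "MATCH (p:" + LABEL + ") WHERE p.uniqueID = $uniqueID DELETE p";
    }

    // RETURN displays every node with the label.
    static String displayNodesQuery() {
        return "MATCH (p:" + LABEL + ") RETURN p";
    }

    public static void main(String[] args) {
        Map<String, Object> policyMap = new LinkedHashMap<>();
        policyMap.put("PolicyName", "Nightly backup"); // sample values
        policyMap.put("CronText", "0 0 2 * * ?");
        policyMap.put("Location", "Pune");
        System.out.println("uniqueID = " + newUniqueId());
        System.out.println(addNodeQuery(policyMap));
        System.out.println(updateNodeQuery("Location"));
        System.out.println(deleteNodeQuery());
        System.out.println(displayNodesQuery());
    }
}
```

Passing the values as parameters, rather than pasting them into the query string, avoids quoting problems and lets the server reuse the query plan.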
