Data Modeling with Graphs using Neo4j

Today we have slides for Peter Bell's talk from the GraphConnect conference on Rich Web Data Modeling with Graphs, where we'll look at a number of proven patterns for effectively modeling and retrieving your data using Neo4j.

About Peter Bell, Graph Database Evangelist

Peter was previously SVP Engineering and Senior Fellow at General Assembly, a campus for technology, design, and entrepreneurship. He has presented at a range of conferences including the DLD conference, OOPSLA, QCon, RubyNation, SpringOne2GX, Code Generation, Practical Product Lines, the British Computer Society Software Practices Advancement conference, DevNexus, cf.Objective(), CF United, Scotch on the Rocks, WebDU, WebManiacs, UberConf, the Rich Web Experience, and the No Fluff Just Stuff Enterprise Java tour.

More on the NoSQL movement

NoSQL does not mean absolutely no SQL. Most of the time, it just means "not only SQL," so the term is not as strong as it sounds. There are four general types of NoSQL databases: graph databases, column databases, document databases, and key-value databases. One of the key differentiators between NoSQL databases and their relational counterparts is that many NoSQL databases adhere more loosely to data consistency and integrity guarantees. Not having to enforce strict consistency is one of the factors that can give NoSQL databases a performance boost over relational databases.

More on Neo4j

Neo4j is a leading graph database that is designed to offer high performance at very large scale. It is open source, and its development is led by Neo Technology.

Neo4j is a young but quickly maturing database. It is highly scalable, as you would expect from a graph database, and it is also very reliable. It gains performance by using a custom disk-based native storage engine. What you get is speed, an intuitive model, and ease and flexibility of use.

More on graph databases

Graph databases are not the best option for every software application out there, but they are great for a number of reasons. The first is the scalability that comes from modeling data as a graph. Looking up and inserting connected data can be orders of magnitude faster than in traditional relational databases, thanks to algorithms designed for data that is modeled in a graph structure. Additionally, because the data is structured as a graph, it represents the relationships between nodes explicitly. That allows analytics software to run on top of the data and extract further business intelligence.

According to Wikipedia, "by definition, a graph database is any storage system that provides index-free adjacency." More specialized types of graph databases include triplestores, which are popular with the semantic web community, and network databases.
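To make "index-free adjacency" a little more concrete, here is a minimal sketch in Python. It is an illustration only, not how Neo4j stores data internally: the `Node` class, the `connect` method, and the `neighbours_via_edge_table` helper are all hypothetical names made up for this example. The idea it shows is that each node keeps direct references to its neighbours, so finding them does not require searching a global structure, whereas a table of edges has to be scanned or indexed.

```python
class Node:
    """A node that stores direct references to its neighbours (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.neighbours = []

    def connect(self, other):
        # Record the adjacency on both nodes themselves, not in a shared table.
        self.neighbours.append(other)
        other.neighbours.append(self)


def neighbours_via_edge_table(name, edge_table):
    """Contrast case: find neighbours by scanning a global list of edges."""
    found = []
    for source, target in edge_table:
        if source == name:
            found.append(target)
        elif target == name:
            found.append(source)
    return found


alice, bob, carol = Node("alice"), Node("bob"), Node("carol")
alice.connect(bob)
alice.connect(carol)

# The same two edges, stored as rows in a single table.
edge_table = [("alice", "bob"), ("alice", "carol")]

# Index-free adjacency: follow the references held by the node.
adjacent_names = [n.name for n in alice.neighbours]

# Table scan: cost grows with the total number of edges in the graph.
scanned_names = neighbours_via_edge_table("alice", edge_table)

print(adjacent_names)
print(scanned_names)
```

Both approaches return the same neighbours here. The difference is in the work done: the first follows references stored on the node, while the second touches every edge in the table.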

More on graph database data modeling

With graph databases, the way to model data is to emphasize the relationships between pieces of data. The pieces of data are commonly called nodes, edges connect those distinct pieces of information, and properties give metadata about the different parts of those relationships. Here is how to think of nodes, properties, and edges when modeling a graph database.

Nodes represent the things in the world that we want to model. Those can be people, businesses, accounts, places, events, or anything else that you more or less think of as a thing or something concrete.

Properties represent information about nodes. For example, if you have a person node and that person is Pete Soderling (founder of this website), his properties might record that he is a "technologist," a "programmer," or a "Twitter user." The list of possible properties is nearly infinite. The trick is to focus on the properties that are important for your business.

In graph databases, nodes need to be connected to one another, and the mechanism that connects them is the edge. Here is an example of how that works. Pete Soderling (node) is the founder of g33ktalk (node). You can also model that g33ktalk is an interesting startup. Then, by traversing the graph, you can extract the extra bit of intelligence that Pete Soderling must be a founder of an interesting startup.
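The Pete Soderling example can be sketched in a few lines of Python. This is a toy in-memory model rather than Neo4j's API: the `Node` class, the `relate` method, the `FOUNDER_OF` relationship name, the `interesting` property, and the `founders_of_interesting_startups` function are all assumptions chosen for illustration. It shows nodes carrying properties as key-value pairs, an edge connecting two nodes, and a traversal that derives the extra fact.

```python
class Node:
    """A node with a label and arbitrary key-value properties (hypothetical)."""

    def __init__(self, label, **properties):
        self.label = label
        self.properties = properties
        self.outgoing = []  # list of (relationship_type, target_node) pairs

    def relate(self, rel_type, target):
        # An edge is stored as a typed, directed link to another node.
        self.outgoing.append((rel_type, target))


def founders_of_interesting_startups(people):
    """Follow FOUNDER_OF edges and keep targets marked as interesting."""
    results = []
    for person in people:
        for rel_type, target in person.outgoing:
            if rel_type == "FOUNDER_OF" and target.properties.get("interesting"):
                results.append(
                    (person.properties["name"], target.properties["name"])
                )
    return results


# Nodes and their properties.
pete = Node("Person", name="Pete Soderling", role="technologist")
g33ktalk = Node("Startup", name="g33ktalk", interesting=True)

# The edge: Pete Soderling is founder of g33ktalk.
pete.relate("FOUNDER_OF", g33ktalk)

founders = founders_of_interesting_startups([pete])
print(founders)  # [('Pete Soderling', 'g33ktalk')]
```

Here "interesting" is modeled as a property on the startup node. It could just as well be modeled as an edge to a separate category node; which choice is better depends on the queries you expect to run.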

For a talk about working with Neo4j, please take a look at our article about graphing with Neo4j and Node.js.