Graph Databases: A New Way of Thinking About Data
Graph databases are being used by many industries for their unique ability to analyze relationships between pieces of data.
The importance of big data has been on the rise. However, to make the most of the data, companies need to be able to find actionable insights in it. Finding powerful insights requires both deep queries and good analytics on the data returned. Traditional SQL queries face limitations when it comes to complex, multi-layered queries, and that limits a company's ability to retrieve meaningful data.
Graph databases let companies run complex, multi-layered queries and get answers quickly, whereas traditional SQL databases find such queries extremely difficult to answer. These queries are returning valuable insights that were previously hard to obtain. Graph databases are used in many industries, such as social media, healthcare and online dating. The graph database, it seems, is providing a new way of looking at data.
What Is a Graph Database?
A graph database is used to store information about entities, map the relationships across them and query those relationships. In this context, entities can be many things, such as human beings, companies, animals and cars. An entity can have a specific relationship with another entity. For example, Martin, an entity, is a friend of Jim, another entity. Martin may also own a BMW. In both examples, Martin, Jim and the BMW are the entities, with specific relationships between them. "Martin is a friend of Jim" means friendship is the relationship between the two entities. Similarly, "Martin owns a BMW" means ownership is the relationship between Martin and his BMW. In graph database parlance, entities are known as nodes and relationships are known as edges. The entities and relationships together form a graph, hence the name graph database. (To learn more about graph databases, see How Graph Databases Bring Networking to Data.)
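The Martin, Jim and BMW example can be sketched in a few lines of Python. This is only an illustration of the idea of entities and labeled relationships, not how any particular graph database stores data; the `Graph` class and the relationship names `FRIEND_OF` and `OWNS` are assumptions made for this sketch.

```python
from collections import defaultdict


class Graph:
    """Tiny in-memory graph: nodes are entities, edges are labeled relationships."""

    def __init__(self):
        # Maps a source entity to a list of (relationship, target entity) pairs.
        self._edges = defaultdict(list)

    def add_edge(self, source, relationship, target):
        """Record that `source` has `relationship` with `target`."""
        self._edges[source].append((relationship, target))

    def related(self, source, relationship):
        """Return the entities that `source` reaches through `relationship`."""
        return [t for r, t in self._edges.get(source, []) if r == relationship]


graph = Graph()
graph.add_edge("Martin", "FRIEND_OF", "Jim")  # "Martin is a friend of Jim"
graph.add_edge("Martin", "OWNS", "BMW")       # "Martin owns a BMW"

print(graph.related("Martin", "FRIEND_OF"))  # ['Jim']
print(graph.related("Martin", "OWNS"))       # ['BMW']
```

Here the edges are directed, so the friendship is stored only from Martin to Jim. A real system would either store it in both directions or let queries ignore direction.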
The concept of the graph database is being implemented across industries such as healthcare, social media and e-commerce. The examples given earlier in this article are simple and straightforward, but the use cases implemented in these industries are highly complex. Take the example of an e-commerce website that provides recommendations to customers. How does the website provide product recommendations that are suitable for a customer? How does the website know the needs and preferences of the customer? The key lies in the product the customer is viewing. If the customer is viewing a book on human resource management, the recommendation logic of the website looks for other customers who have viewed or purchased the same book. The logic then determines which similar or related books those customers have also viewed or purchased, and recommends those books to the user.
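The recommendation logic described above can be sketched roughly as follows. The customer names, book titles and the `recommend` function are all hypothetical, and ranking by a simple co-view count is an assumption made for illustration; production recommendation engines are considerably more involved.

```python
from collections import Counter

# Hypothetical data: customer -> set of books viewed or purchased.
viewed = {
    "alice": {"HR Management", "Talent Acquisition", "Labor Law Basics"},
    "bob": {"HR Management", "Talent Acquisition"},
    "carol": {"HR Management", "Organizational Behavior"},
    "dave": {"Cooking for Beginners"},
    "erin": {"HR Management"},
}


def recommend(current_book, customer, viewed, limit=3):
    """Rank books by how many other viewers of `current_book` also viewed them."""
    already_seen = viewed.get(customer, set())
    counts = Counter()
    for other, books in viewed.items():
        # Only follow customers who share the book currently being viewed.
        if other == customer or current_book not in books:
            continue
        for book in books - {current_book} - already_seen:
            counts[book] += 1
    # Most co-viewed first; ties broken alphabetically for a stable order.
    return sorted(counts, key=lambda b: (-counts[b], b))[:limit]


print(recommend("HR Management", "erin", viewed))
# ['Talent Acquisition', 'Labor Law Basics', 'Organizational Behavior']
```

In graph terms, the function walks from the book to the customers connected to it, and then from those customers to the other books they are connected to.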
How a Graph Database Works
Let us take a closer look at graph databases with the help of an example. Let's assume that a smartphone maker wants to launch a smartphone with several advanced features. The product management team will decide on the features after determining the needs and preferences of its target audience, which is corporate executives. The smartphone maker has one or more databases that collect and store data on executive profiles from multiple data sources. Now, the product managers create a graph data structure based on the data, which looks like the one below:
From the image above, the product managers derive the following conclusions or business decisions:
- Steve is an HR manager who uses the messenger extensively. His connections in the HR department probably also use the messenger because of their work profile. So, a good messenger in the smartphone may be important.
- The main reason Debra and her husband's friend Trevor frequent the antivirus forums may be concern about the security of their smartphones or computers. So, the new smartphone could have built-in security features.
- Abraham uses a Fitbit, which indicates that he monitors his fitness. So, it would be a good feature if the new smartphone is able to sync data from Fitbit devices and display it in a user-friendly manner.
The above example shows how graph data can be used to solve business problems.
The case studies below show how graph databases have helped solve complex problems in the online dating and online career search industries.
Case Study – Online Dating
Problem: Online dating portals want to find suitable matches for their subscribers. To do that, the portals need information on other members of the website who might have similar tastes, preferences, backgrounds and other information.
Solution: Many online portals have used graph databases to traverse the details of millions of members and gather information. From that, the website prepares matches according to tastes, education, hobbies and other details. The website determines which profiles are most likely to be a good match for a particular profile and provides recommendations accordingly.
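One way to picture this matching is to treat members and their attributes as nodes, and to score two members by how many attribute nodes they share. The sketch below uses Jaccard similarity for the score. That choice, the member names and the attributes are all assumptions made for illustration; the article does not say which measure dating sites actually use.

```python
# Hypothetical profiles: member -> set of attributes (tastes, education, hobbies).
profiles = {
    "ana": {"hiking", "jazz", "engineering degree"},
    "ben": {"hiking", "jazz", "cooking"},
    "cy": {"gaming", "cooking"},
    "dee": {"hiking", "engineering degree", "jazz"},
}


def jaccard(a, b):
    """Shared attributes divided by all attributes across both profiles."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_matches(member, profiles, limit=2):
    """Return the `limit` most similar members as (name, score) pairs."""
    scores = [
        (jaccard(profiles[member], attrs), other)
        for other, attrs in profiles.items()
        if other != member
    ]
    # Highest score first; ties broken by name for a stable order.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(other, round(score, 2)) for score, other in scores[:limit]]


print(best_matches("ana", profiles))  # [('dee', 1.0), ('ben', 0.5)]
```

This version compares every pair of profiles, which would not scale to millions of members. A graph database would instead start from one member's attribute nodes and follow edges only to the members connected to them.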
Case Study – Professional Networking Websites
Problem: Professional networking websites such as LinkedIn want to recommend the most suitable connections and jobs based on a number of parameters such as profile, connection views, profile views and group membership, which reflect interests and preferences.
Solution: To do this, such networking websites traverse multiple layers of connections, such as connections of connections of connections and so on. Then, the graph logic finds common professional interests, careers, job profiles, group memberships and other information and, based on the findings, provides recommendations on both networks and jobs.
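Traversing "connections of connections of connections" is essentially a breadth-first search that stops after a fixed number of hops. The sketch below shows that idea on a small, made-up network; the member names and the `within_degrees` and `suggest` functions are hypothetical and are not taken from any real networking site.

```python
from collections import deque

# Hypothetical network: member -> list of direct connections.
connections = {
    "you": ["ann", "raj"],
    "ann": ["you", "lee"],
    "raj": ["you", "lee", "mia"],
    "lee": ["ann", "raj", "zoe"],
    "mia": ["raj"],
    "zoe": ["lee"],
}


def within_degrees(graph, start, max_depth):
    """Breadth-first search returning {member: degrees of separation from start}."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        member = queue.popleft()
        if depth[member] == max_depth:
            continue  # Do not expand beyond the requested number of hops.
        for neighbor in graph.get(member, []):
            if neighbor not in depth:
                depth[neighbor] = depth[member] + 1
                queue.append(neighbor)
    del depth[start]  # The starting member is not a result.
    return depth


def suggest(graph, start, max_depth=3):
    """Suggest members who are reachable but not yet direct connections."""
    reachable = within_degrees(graph, start, max_depth)
    candidates = [m for m, d in reachable.items() if d > 1]
    # Closest members first; ties broken by name.
    return sorted(candidates, key=lambda m: (reachable[m], m))


print(within_degrees(connections, "you", 2))
# {'ann': 1, 'raj': 1, 'lee': 2, 'mia': 2}
print(suggest(connections, "you"))  # ['lee', 'mia', 'zoe']
```

A real recommender would then compare the interests, job profiles and group memberships of these candidates, as described above, rather than ranking by distance alone.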
Facts and Figures from the Industry
The facts and figures given below show how much the graph database has been adopted industry-wide:
- More than 30 Global 2000 companies, including Wal-Mart, eBay, Lufthansa and Deutsche Telekom, have adopted Neo4j, the most popular graph database, created by Neo Technology.
- Industry observer DB-Engines has this to say about the popularity and adoption of graph databases: "Graph DBMSs are gaining in popularity faster than any other database category." According to DB-Engines, the category has grown by almost 300 percent since January 2013.
- Since May 2013, many major online dating sites have started to adopt graph databases.
- LinkedIn has a big team working on its proprietary graph database system.
- Twitter depends extensively on a graph database and has also released FlockDB, an open-source graph database. (For more on open-source databases, see Why Open-Source Databases Are Gaining Popularity.)
- With the goal of making graph databases easy to use for enterprise users, Teradata has released a new type of SQL known as SQL-GR.
The graph database represents a new way of looking at big data. There are two clear benefits of graph data:
- Relational database management systems (RDBMS) struggle to process relationship-heavy queries over huge volumes of data in a short period of time. They are also not built to organize such data around its connections. A graph database can traverse any number of relationships between entities and organize information logically.
- Graph databases are extremely efficient in retrieving relevant information after scouring several entities and relationships. As stated earlier, they can query and return extremely valuable insights which BI systems can present in a user-friendly manner.
It seems that it is only a matter of time before other industries that deal with huge amounts of data, such as banking and finance, pharmaceuticals, and defense and intelligence, also adopt graph databases. In fact, detecting crimes and identifying insurance fraud by analyzing networks, relationships and entities in graph data is sure to be an interesting application.