Just before we entered the global crisis caused by COVID-19, I was asked what problems we are solving by building expertise in Knowledge Graphs. My answer was simply, “We are sharpening the spear.” However, there is more to it than a one-line answer.
Knowledge Graph technology has been around for more than seven years, since Google first started using it to enhance the value of the information its search engine returns for a query. The technology enables us to connect and analyze disjointed pieces of information in ways that relational database technologies cannot. Knowledge Graphs, and the Graph Database technology underlying them, are used by popular social media, movie streaming, and eCommerce companies to recommend new relationships, movies, and other products to their audiences.
Because this technology has the potential to be a game-changer in companies’ digital transformation journeys, I had been reading casually about Knowledge Graphs for some time, but I never had an opportunity to do a deep dive until earlier this year. My team and I decided to start looking into it as part of our organization’s Data and Analytics practice to build machine learning solutions.
Crises and Opportunity
With the current health and financial crises caused by COVID-19, businesses, specifically those that have front-line workers, are struggling to understand how COVID has impacted them and how they can return to growth. To find solutions to these difficult problems, a lot of disjointed but pertinent data needs to be procured, connected, and then analyzed. As the saying goes, “Luck is what happens when preparation meets opportunity.” In response to the current crises, our team brainstormed and concluded that Knowledge Graph technology would be one of the pivotal components of our solution platform.
For a business to understand the impact of COVID-19 and its level of readiness to resume operations, a holistic intelligence platform is required. As our team of data scientists continues to analyze multiple dimensions, a few have come to light: taskforce, supply chain, and facility readiness. Each of these dimensions is based on connecting data elements from curated data sources such as COVID data lakes, general population and economic data, and, of course, company-specific employee data. All the curated data comes from trusted sources and has been cleansed, enriched, and wrangled, with personally identifiable information (PII) removed.
Knowledge Graphs vs. Relational Databases
Leveraging Knowledge Graphs to unlock insights from curated datasets requires building data models and implementing algorithms. Data models have traditionally been designed with relational database technologies in mind. The insights that need to be gleaned drive the design of those data models, which puts functional constraints on their usability: any change to the desired insights, or the addition of new ones, requires changes to the underlying data models. Even simple algorithms implemented in relational databases can become overly complex as the number of relations (known as JOINs in tech speak) among different data elements increases. When run against large datasets, these algorithms can take days to finish, or at times simply fail because of the CPU and memory limitations of the underlying hardware infrastructure.
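To make the JOIN growth concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `contact` table, the names in it, and the helper functions are all made up for illustration and are not part of our platform. The point is only that each extra level of connection adds another self-join to the query.

```python
import sqlite3


def build_hop_query(hops: int) -> str:
    """Build SQL that follows `hops` links in a hypothetical contact table.

    Every hop beyond the first adds one more self-join.
    """
    joins = [
        f"JOIN contact c{i} ON c{i}.src = c{i - 1}.dst"
        for i in range(1, hops)
    ]
    return (
        f"SELECT DISTINCT c{hops - 1}.dst FROM contact c0 "
        + " ".join(joins)
        + " WHERE c0.src = ?"
    )


# Toy data, invented for this example: ann -> bob -> cara -> dev
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO contact VALUES (?, ?)",
    [("ann", "bob"), ("bob", "cara"), ("cara", "dev")],
)


def reachable_in(hops: int, start: str) -> list:
    """Return the entries exactly `hops` links away from `start`."""
    rows = conn.execute(build_hop_query(hops), (start,)).fetchall()
    return sorted(row[0] for row in rows)


for n in (1, 2, 3):
    print(n, "hops:", build_hop_query(n).count("JOIN"), "joins ->", reachable_in(n, "ann"))
```

A three-hop question already needs two self-joins on the same table, and the query text has to be rebuilt whenever the depth changes.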
This is where Knowledge Graphs shine. Data models in Knowledge Graphs are created based on natural connections between data elements and without requiring upfront knowledge of the insights to be revealed. This fosters the discovery of new patterns and insights without changing any underlying data models.
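As a rough illustration of this idea, the sketch below stores connections as (source, relation, target) entries in plain Python. The `TinyGraph` class, the relation names, and the node labels are hypothetical and chosen only for this example; a real graph database offers far more than this. What it shows is that a new kind of relationship can be added later without redefining anything that already exists.

```python
from collections import defaultdict


class TinyGraph:
    """A toy undirected graph: node -> list of (relation, neighbor)."""

    def __init__(self):
        self.edges = defaultdict(list)

    def add_edge(self, source, relation, target):
        # Store the connection in both directions so it can be walked either way.
        self.edges[source].append((relation, target))
        self.edges[target].append((relation, source))

    def neighbors(self, node, relation=None):
        """Return neighbors of `node`, optionally filtered by relation type."""
        return sorted(
            target
            for rel, target in self.edges[node]
            if relation is None or rel == relation
        )


graph = TinyGraph()
# Initial connections (made-up labels).
graph.add_edge("employee:1", "WORKS_AT", "facility:A")
# A relation type nobody planned for is added later; no schema change is needed.
graph.add_edge("facility:A", "LOCATED_IN", "county:X")

print(graph.neighbors("facility:A"))
print(graph.neighbors("facility:A", relation="LOCATED_IN"))
```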
There are multiple vendors of Graph Database technology. For scalability and speed, we chose native Graph Database technology from TigerGraph. TigerGraph can execute algorithms that traverse five or more levels of connectedness (also known as hops) among the data elements in under a second. Given good-quality data, the accuracy of the insights increases as the number of hops increases.
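For readers unfamiliar with the term, a hop is one step along a connection. The sketch below is a plain Python breadth-first search over a small made-up graph; it is not TigerGraph's query language or API, and the node labels and the `within_hops` function are assumptions for illustration. It shows how raising the hop limit brings more distant data elements into view.

```python
from collections import deque


def within_hops(adjacency, start, max_hops):
    """Return {node: hop distance} for every node within `max_hops` of `start`."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


# Toy connections, invented for this example.
adjacency = {
    "employee:1": ["facility:A"],
    "facility:A": ["employee:1", "county:X", "supplier:S"],
    "county:X": ["facility:A", "case_report:7"],
    "supplier:S": ["facility:A"],
    "case_report:7": ["county:X"],
}

print(within_hops(adjacency, "employee:1", 2))
print(within_hops(adjacency, "employee:1", 3))
```

With a limit of two hops the employee is linked to the facility, its county, and its supplier; the third hop is what connects the employee to the county's case report.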
COVID-Related Insights
At Creospan, we continue to enrich our datasets and graph models and to augment the relationships among existing and new data elements. Our goal is to unveil as many new insights as possible to help healthcare, manufacturing, retail, and financial companies determine their pandemic response and move-forward strategies. We see Knowledge Graph technology as an integral component in the ecosystem of any organization that is looking to connect disjointed information spread across the enterprise and glean insights that drive better business outcomes.