We are connected data engineers 👩🏽‍🔬👨🏽‍🔬
- We transform massive, unmanageable open datasets (such as Companies House),
- into meaningful networks (in a graph database),
- and develop queries to surface unique, high-value insights (available as API endpoints).
Our solutions combine connected data, graph databases and open data.
It's All About Relationships
With data connected in networks, you can access information inherent in those relationships. This is how Google's PageRank algorithm works: it infers the relevance of a web page from the number and quality of the pages that link to it.
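To make the idea concrete, here is a minimal sketch of a PageRank-style calculation using simple power iteration. The four-page link graph, the `pagerank` function name, and the damping factor of 0.85 are illustrative assumptions, not a description of Google's production system.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page relevance from link structure alone.

    `links` maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal relevance

    for _ in range(iterations):
        # Every page keeps a small baseline share of relevance.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its relevance on, split across its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its relevance evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: "c" is linked to by three pages, "d" by none.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph, page "c" ends up ranked highest because it has the most inbound links, and page "a" ranks close behind because its single inbound link comes from the highly ranked "c". The scores come entirely from the relationships, not from anything stored on the pages themselves.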
Connected data is a data model that treats the relationships between data as equally important as the data itself. It is (usually) maintained in a graph database, which is optimised to analyse massive networks quickly and at scale.
An example of connected data in action is this simple visualisation app:
- Search for any UK company and uncover all present and past company officers and their connections
- Use the mouse wheel to zoom in and out, hover over any entity for more properties, and left-click to reposition
- Copy and share links to any search
A graph database is a database architecture designed for connected data. The main technical differences between graph databases and other types of database are:
- Index-free adjacency: relationships connecting data entities are created and stored at write time. This makes complex network queries practical that are slow or infeasible in other databases, which typically rely on expensive JOIN operations executed at query time.
- Flexible schema: the schema can be adapted over time, so new entities and relationships can be added without rebuilding the database.
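The index-free adjacency idea can be illustrated with a small in-memory sketch. It is only an analogy, not how any particular graph database is implemented: the appointment records, the company and officer names, and the `connected_within` helper are all hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical (officer, company) appointment records, the kind of rows
# a relational table would hold and JOIN at query time.
appointments = [
    ("Alice", "Acme Ltd"),
    ("Alice", "Bolt Ltd"),
    ("Bob", "Bolt Ltd"),
    ("Bob", "Crate Ltd"),
    ("Carol", "Crate Ltd"),
]

# Write time: store each relationship directly on both entities it connects.
adjacency = defaultdict(set)
for officer, company in appointments:
    adjacency[officer].add(company)
    adjacency[company].add(officer)


def connected_within(start, max_hops):
    """Return every entity reachable from `start` and its distance in hops.

    Each step follows stored neighbour references, so the cost depends on
    the part of the network visited rather than on the size of the dataset.
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    del distance[start]
    return distance


result = connected_within("Acme Ltd", 4)
print(result)
```

With a table of appointment rows, the same four-hop question would need the table joined to itself once per hop. Here each hop is a lookup of neighbours that were already stored when the data was written. Adding a new kind of entity or relationship only means adding more entries to `adjacency`, which loosely mirrors the flexible schema point.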
Open data is data collected by governments and distributed free of copyright or any other restrictions. It is a relatively new policy initiative, but it is expanding quickly across the globe, with hundreds of thousands of datasets available:
- business information
- birth, death, marriage registers
- patent and trademark information
- public tender databases
- geographic information
- legal information
- meteorological information
- social data
- transport information
We believe open data sources are hugely underutilised because they are available in unhelpful formats and do not have production-standard APIs. We solve these problems by creating and maintaining connected data versions, surfacing unique insights with our query tools, and making those insights available via robust APIs ⚙️.