TheyBuyForYou Platform

In TheyBuyForYou we have been working on a layered architecture of data services, ontologies, core APIs and tools that allows different levels of access and use of our procurement knowledge graph.

A layered architecture separates the services so that most interaction occurs between adjacent layers and a change in one technology does not affect the rest of the services. As shown in the figure "Tech Stack", five main layers have been defined: data, schemas, tools, core APIs and added-value services. Each is explained below.

Tech Stack

High-level Architecture


Data

This bottom layer contains the data that feeds both the knowledge graph and the document database. The knowledge graph data is obtained from the OpenOpps and OpenCorporates datasets and transformed into RDF by the data ingestion tool.

TBFY Knowledge Graph (KG)

The knowledge graph is a database that contains information about tenders, contracts, awards, organisations and contracting processes. It is used by the API Gateway.
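
As a rough illustration of how a client might ask the knowledge graph for one of these entity types, the following Python sketch builds a SPARQL query and encodes it as a SPARQL Protocol GET request. The endpoint URL, the `ocds:` namespace IRI and the class name `Tender` are assumptions made for the example, not values confirmed by this page.

```python
from urllib.parse import urlencode

# Assumptions for illustration only; replace with the project's real values.
ENDPOINT = "http://example.org/sparql"           # hypothetical endpoint URL
OCDS_NS = "http://data.tbfy.eu/ontology/ocds#"   # assumed namespace IRI


def build_list_query(class_name: str, limit: int = 10) -> str:
    """Return a SPARQL query listing up to `limit` resources of one class."""
    if not class_name.isidentifier():
        raise ValueError(f"unexpected class name: {class_name!r}")
    return (
        f"PREFIX ocds: <{OCDS_NS}>\n"
        f"SELECT ?resource WHERE {{ ?resource a ocds:{class_name} }} "
        f"LIMIT {int(limit)}"
    )


def build_request_url(query: str) -> str:
    """Encode the query in the `query` parameter of a GET request URL."""
    return ENDPOINT + "?" + urlencode({"query": query})


query = build_list_query("Tender", limit=5)
url = build_request_url(query)
print(url)
```

The same two functions could be reused for contracts, awards or organisations by changing the class name, provided the corresponding classes exist in the vocabulary.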


Document repository

Database that contains the set of legal documents indexed by Harvester.




Schemas

This layer contains the vocabularies of our domain. These vocabularies act as intermediaries that make the knowledge graph understandable to tools such as the SPARQL GUI or R4R.

TBFY ontology

The TBFY ontology imports the OCDS ontology (for procurement data) and the euBusinessGraph ontology (for company data). In addition, it contains a few extensions to represent additional metadata needed for the TBFY KG.

euBusinessGraph ontology

The euBusinessGraph ontology describes company data. It was originally developed in the euBusinessGraph project.



Tools

This layer contains the tools built or used to create the knowledge graph and provide access to it. We distinguish between tools created specifically for the project (internal) and tools that were not developed for this project but have been used in it (external). The tools range from those that feed the databases to those that query the TheyBuyForYou SPARQL endpoint.

Internal

Harvester

Harvester downloads articles and legal documents from public procurement sources (OpenOpps, JRC-Acquis or TED) and indexes them into Solr so that complex queries can be run and the results visualised through Banana.

R4R

R4R builds and deploys RESTful services from SPARQL queries. The core API uses it to browse the TBFY knowledge graph.
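
The general idea of exposing SPARQL queries as REST resources can be sketched in a few lines of Python. This is not R4R's configuration format or implementation. It is a minimal illustration in which the route patterns, the query templates and the `BASE` IRI prefix are all hypothetical.

```python
import re
from string import Template

BASE = "http://example.org/tender/"  # hypothetical resource IRI prefix

# Each route pairs a URL pattern with a SPARQL template (both hypothetical).
# PREFIX declarations are omitted from the templates for brevity.
ROUTES = [
    (re.compile(r"^/tenders$"),
     Template("SELECT ?tender WHERE { ?tender a ocds:Tender } LIMIT ${limit}")),
    (re.compile(r"^/tenders/(?P<id>[\w-]+)$"),
     Template("SELECT ?p ?o WHERE { <${base}${id}> ?p ?o }")),
]


def resolve(path: str, limit: int = 10) -> str:
    """Return the SPARQL query that would serve a GET request for `path`."""
    for pattern, template in ROUTES:
        match = pattern.match(path)
        if match:
            # Path parameters captured by the pattern fill the template.
            return template.substitute(base=BASE, limit=limit, **match.groupdict())
    raise KeyError(f"no route for {path!r}")


print(resolve("/tenders", limit=5))
print(resolve("/tenders/abc-1"))
```

A real service would then send the resolved query to the SPARQL endpoint and convert the result set to JSON before returning it to the client.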



KG data ingestion pipeline

The data ingestion pipeline downloads OCDS releases and reconciled supplier-company records, both in JSON format, enriches the data, transforms it to RDF (using RML) and publishes it to the TBFY KG database.
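
To make the JSON-to-RDF step concrete, here is a hand-written Python sketch that turns one OCDS-like release into N-Triples lines. The actual pipeline uses RML mappings rather than code like this. The sample record, the `BASE` IRI prefix and the class IRI in the `ocds#` namespace are assumptions chosen for the example.

```python
import json

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCT_TITLE = "http://purl.org/dc/terms/title"
OCDS_TENDER = "http://data.tbfy.eu/ontology/ocds#Tender"  # assumed class IRI
BASE = "http://example.org/tender/"                       # hypothetical prefix


def literal(value: str) -> str:
    """Quote a string as an N-Triples literal, escaping special characters."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def release_to_ntriples(release: dict) -> list[str]:
    """Map the tender part of one release to two N-Triples statements."""
    subject = f"<{BASE}{release['ocid']}>"
    title = release["tender"]["title"]
    return [
        f"{subject} <{RDF_TYPE}> <{OCDS_TENDER}> .",
        f"{subject} <{DCT_TITLE}> {literal(title)} .",
    ]


# Made-up input record with the OCDS fields `ocid` and `tender.title`.
release = json.loads('{"ocid": "ocds-0001", "tender": {"title": "Road repair works"}}')
lines = release_to_ntriples(release)
print("\n".join(lines))
```

A declarative mapping language such as RML expresses the same field-to-predicate correspondences as data, which keeps the mapping separate from the code that executes it.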

External

SPARQL GUI for TBFY KG

The SPARQL GUI uses YASGUI (Yet Another SPARQL GUI), a web application for querying any SPARQL endpoint.




OptiqueVQS

OptiqueVQS lets end users without a technical background turn their information needs into SPARQL queries visually.



Core APIs

This layer contains the set of core APIs built or used in the project. We distinguish between APIs created specifically for the project (internal) and APIs that were not developed for this project but have been used in it (external). These core APIs provide the basic resources needed to extract information from the knowledge graph, from the document repository or from external data sources.

Internal

Knowledge graph API

The knowledge graph API provides information about tenders, organisations, awards, contracts and contracting processes from the RDF triple store.


Public procurement OCDS API

The public procurement OCDS API provides information about public procurement based on the OCDS standard. It is currently applied to the data from the Zaragoza city council.


External

OpenCorporates companies API

OpenCorporates companies API provides access to data about 135 million companies from primary public sources.



OpenCorporates reconciliation API

The OpenCorporates reconciliation API allows OpenRefine users to match company names to legal corporate entities and so obtain more information about the companies.


OpenOpps API

The OpenOpps API provides access to tender and contract data from a range of European government bodies, formatted according to OCDS.



librAIry API

librAIry API creates topic-based representations of documents (e.g., tenders) to relate them semantically.
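
One common way to relate documents through topic-based representations is to compare their topic distributions with the Jensen-Shannon divergence. The page does not say which measure librAIry uses, so the Python sketch below is only an illustration of the idea, and the three topic distributions are made up.

```python
import math


def kl_divergence(p: list[float], q: list[float]) -> float:
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p: list[float], q: list[float]) -> float:
    """Jensen-Shannon divergence in bits, a symmetric value between 0 and 1."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of topics")
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)


def similarity(p: list[float], q: list[float]) -> float:
    """Turn the divergence into a similarity score (1 means identical)."""
    return 1.0 - js_divergence(p, q)


# Made-up distributions over three topics for three documents.
doc_a = [0.7, 0.2, 0.1]
doc_b = [0.6, 0.3, 0.1]
doc_c = [0.1, 0.1, 0.8]

print(similarity(doc_a, doc_b), similarity(doc_a, doc_c))
```

With these inputs, documents `doc_a` and `doc_b` score as more similar to each other than `doc_a` and `doc_c`, because their probability mass sits on the same topics.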



Wikifier Web Service

Wikifier takes a text document as input and annotates it with links to relevant Wikipedia concepts.



Spend Network’s Classification tool

The classification tool is an advanced classifier that adds multiple labels to procurement notices based on the Common Procurement Vocabulary (CPV). It assigns each notice five scored Level 3 CPV codes based on its text and description.
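
The final step of such a multi-label classifier, keeping the five best-scoring codes, can be sketched as follows. The classifier itself is not shown, and the codes and scores in the example are placeholders invented for illustration, not real CPV codes or real model output.

```python
import heapq


def top_codes(scores: dict[str, float], k: int = 5) -> list[tuple[str, float]]:
    """Return the k highest-scoring (code, score) pairs, best first."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


# Placeholder scores that a classifier might assign to one notice.
notice_scores = {
    "code-1": 0.91,
    "code-2": 0.12,
    "code-3": 0.55,
    "code-4": 0.78,
    "code-5": 0.33,
    "code-6": 0.05,
    "code-7": 0.64,
}

for code, score in top_codes(notice_scores):
    print(code, score)
```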

Added-value services

This top layer contains services and tools that go beyond the core functions, adding extended features on top of them.

API Gateway

The TBFY API provides a flexible abstraction layer and a single entry point to manage the communication between TBFY clients and online tools.
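
The single-entry-point idea can be illustrated with a small routing table that maps path prefixes to backend services. This Python sketch is not the gateway's implementation, and the prefixes and backend URLs are hypothetical.

```python
# Hypothetical mapping from public path prefixes to internal backends.
BACKENDS = {
    "/kg": "http://kg-api.example.org",
    "/search": "http://search-api.example.org",
}


def route(path: str) -> str:
    """Return the backend URL that a request for `path` is forwarded to."""
    # Try longer prefixes first so the most specific one wins.
    for prefix in sorted(BACKENDS, key=len, reverse=True):
        if path == prefix or path.startswith(prefix + "/"):
            return BACKENDS[prefix] + path[len(prefix):]
    raise LookupError(f"no backend registered for {path!r}")


print(route("/kg/tender/1"))
print(route("/search"))
```

Because clients only see the public prefixes, a backend can be moved or replaced by editing the table, without changing any client.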


Search API

The search API explores collections of multilingual public procurement data through a RESTful API.




Storytelling

The Storytelling tool is a client-side JavaScript framework designed to support authors of data stories.



Suppliers notebook

The Suppliers notebook is an example of how the knowledge graph can be exploited through a notebook.



Organisation comparison notebook

The Organisation comparison notebook is an example of how to create an added-value service that exploits the core API and the search API (via the API gateway).



StreamStory

StreamStory is a tool intended to help with the analysis and interpretation of time-varying data.




Anomaly detection

"Visualise and analyse public procurement data and spending data" is an online toolkit for exploring public spending and tender data and detecting anomalies in them.


Average payment period to suppliers

The average payment period to suppliers is an indicator that measures, in economic terms, the delay in the payment of commercial debts by entities associated with the Zaragoza city council.
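
The page does not give the formula behind the indicator. One simple reading of "in economic terms" is an amount-weighted average, in which each payment's delay counts in proportion to its amount. The Python sketch below implements only that assumed reading, with made-up figures, and should not be taken as the official calculation.

```python
def weighted_average_days(payments: list[tuple[float, float]]) -> float:
    """Amount-weighted mean delay for (amount, days) pairs (assumed formula)."""
    total_amount = sum(amount for amount, _ in payments)
    if total_amount == 0:
        raise ValueError("total amount must be non-zero")
    return sum(amount * days for amount, days in payments) / total_amount


# Made-up example: a small invoice paid quickly, a large one paid slowly.
payments = [(100.0, 10.0), (300.0, 30.0)]
print(weighted_average_days(payments))  # (100*10 + 300*30) / 400 = 25.0
```

Under this reading the large invoice dominates the result, which is why the weighted figure (25 days) is higher than the plain mean of the two delays (20 days).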


COPIN (COmpra Pública INclusiva)

COPIN (COmpra Pública INclusiva) aims to provide a better understanding of how public administrations specify and evaluate public tenders.



Online KG data comparison tool

This tool provides, through a web interface, analysis of tender and award data extracted from the knowledge graph through the core API and the search API.

