Sources for data complementation

External data often remains an untapped resource

With increasing amounts of data available via the Internet or obtained from specialized data providers, external data is becoming more and more relevant. External data complements internal data and helps to improve advanced analytics, optimize business processes (e.g., with geolocation, weather, or traffic data), reduce internal data maintenance efforts, and create new services. However, despite its increasing relevance, external data remains an untapped resource for most companies.

Definition

What is external data?

Although most companies have an intuitive understanding of external data, there is no common definition. In practice, external data is often associated with specific debates such as "open data", "linked open data", or "data marketplaces". The following definition was developed by the CC CDQ and reflects the understanding of most companies.

External data refers to any type of data that has been captured, processed, and provided from outside the company.
(Krasikov, Eurich and Legner, 2020)

    Classification

    The four types of external data

    Based on a review of current practices, we distinguish four relevant external data types: open data, paid data, shared data, and web data. While all four types stem from external data sources, they differ in provenance, access, costs, structure, and further dimensions.

    • Open data is data that is freely available and that anyone can use and republish without copyright or patent restrictions.
    • Paid data is commercially available data that is acquired directly from specialized data providers (or brokers) and data marketplaces and is offered at a cost.
    • Shared data refers to data that is shared between companies within business ecosystems (for instance, within the CDQ Data Sharing Community).
    • Web data comprises any kind of unstructured and semi-structured data publicly available on the Internet. It includes social media data (e.g., Facebook, LinkedIn, Twitter) and the related metadata (e.g., location, time, language).

    Examples of usage of external data in business

    How to use external data

    External data can be useful in the following situations:

    • Providing data-driven insights: Data analytics can be enhanced with external data in operational areas such as customer relationship management, HR, supply chain, and warehousing. For example, a grocer who wants to improve demand forecasts can draw on weather data, data from suppliers, and economic data.
    • Improving business processes: Many companies already use geolocation, weather, and traffic data to plan and manage their deliveries. Additional information about exceptional events, such as disasters, can help them avoid disruptions in the supply chain.
    • Enhancing data management capabilities: Sourcing external data reduces data maintenance efforts. It may also be used to enrich internal data and improve data quality.
    • Enabling new services: External data is also used to innovate and introduce new products and services that match consumers' needs.
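
The grocer example above, enriching internal sales data with external weather data, could be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the record layout, the field names (`units_sold`, `max_temp_c`), the function `enrich_with_weather`, and all values are hypothetical and do not come from any specific data provider.

```python
from datetime import date

# Hypothetical internal data: daily unit sales recorded by the grocer.
internal_sales = [
    {"date": date(2024, 7, 1), "units_sold": 120},
    {"date": date(2024, 7, 2), "units_sold": 95},
    {"date": date(2024, 7, 3), "units_sold": 143},
]

# Hypothetical external data: daily maximum temperature, keyed by date.
# Note that no observation exists for 3 July.
external_weather = {
    date(2024, 7, 1): {"max_temp_c": 24.5},
    date(2024, 7, 2): {"max_temp_c": 19.0},
}


def enrich_with_weather(sales, weather):
    """Attach weather observations to sales records (left join on date).

    Every internal record is kept; where no external observation exists,
    the weather field is set to None instead of dropping the record.
    """
    enriched = []
    for record in sales:
        observation = weather.get(record["date"], {})
        enriched.append({**record, "max_temp_c": observation.get("max_temp_c")})
    return enriched


enriched_sales = enrich_with_weather(internal_sales, external_weather)
```

Keeping every internal record, even when the external source has a gap, is a deliberate choice in this sketch: the external data complements the internal data rather than filtering it.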

    Reference Process

    The CC CDQ developed a Reference Process for Sourcing and Managing External Data that comprises six core phases:
    (1) initiation, (2) screen, (3) assess, (4) integrate, (5) manage and use, and (6) retire.
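
For teams that track external data sources in a catalog or tool, the six phases listed above could be represented as an ordered enumeration. The sketch below is an assumption made for illustration, not part of the Reference Process itself: the names `Phase` and `next_phase` are hypothetical, and the strictly linear progression is a simplification.

```python
from enum import IntEnum
from typing import Optional


class Phase(IntEnum):
    """The six core phases, numbered as in the text above."""
    INITIATION = 1
    SCREEN = 2
    ASSESS = 3
    INTEGRATE = 4
    MANAGE_AND_USE = 5
    RETIRE = 6


def next_phase(current: Phase) -> Optional[Phase]:
    """Return the phase that follows `current`, or None after RETIRE.

    Assumes a strictly linear progression, which is a simplification
    made for this sketch.
    """
    if current is Phase.RETIRE:
        return None
    return Phase(current + 1)


# Example: a data source that has just passed assessment.
following = next_phase(Phase.ASSESS)
```

Here `following` is `Phase.INTEGRATE`, and calling `next_phase(Phase.RETIRE)` returns `None` because no phase follows retirement.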

    [Figure: External data reference process]