What is Frictionless Data


Description

Frictionless Data is an open-source framework for building data infrastructure – data management, data integration, data flows, etc. It includes various data standards and provides software to work with data.^[1]

Frictionless Data consists of two main parts: software and standards.

Frictionless Software

The software is based on a suite of data standards that have been designed to make it easy to describe data structure and content so that data is more interoperable, easier to understand, and quicker to use. There are several aspects to the Frictionless software, including two high-level data frameworks (for Python and JavaScript), 10 low-level libraries for other languages, like R, and also visual interfaces and applications. You can read more about how to use the software (and find documentation) on the software page.

For example, the Frictionless Repository software produces validation reports. Data validation is one of the main focuses of Frictionless Data, and such a report is a good visual representation of how the project can help reveal common problems in working with data.
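To make the idea of a validation report concrete, the following is a minimal sketch in plain Python. It is not the Frictionless software or its API. The column names, the expected types, the sample data, and the `validate_rows` helper are all hypothetical and exist only for illustration. The sketch checks each cell of a small CSV table against an expected type and collects the problems it finds.

```python
import csv
import io

# Hypothetical expected column types (an assumption for this sketch,
# not something defined by the Frictionless project).
EXPECTED_TYPES = {"id": int, "name": str, "price": float}


def validate_rows(csv_text, expected_types):
    """Return a list of (row_number, field_name, message) problems."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Row 1 is the header, so data rows are numbered from 2.
    for row_number, row in enumerate(reader, start=2):
        for field, caster in expected_types.items():
            value = row.get(field)
            if value is None or value == "":
                problems.append((row_number, field, "missing value"))
                continue
            try:
                caster(value)
            except ValueError:
                message = f"cannot parse {value!r} as {caster.__name__}"
                problems.append((row_number, field, message))
    return problems


# Hypothetical sample data with one type error and one missing cell.
SAMPLE = "id,name,price\n1,apple,0.5\ntwo,pear,0.7\n3,,1.2\n"

report = validate_rows(SAMPLE, EXPECTED_TYPES)
for problem in report:
    print(problem)
```

A real validation tool covers many more kinds of error; this sketch only shows the general shape of a report, which is a list of problems tied to specific rows and fields.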

Frictionless Standards

Lightweight yet comprehensive data specifications.

The Frictionless Data project is built on top of the Frictionless Standards, which are a set of specifications created to standardize different aspects of working with data. For example, you can use the Standards to describe a collection of data files or to share information about data types.

At the core of Frictionless is a set of patterns for describing data, including Data Package (for datasets), Data Resource (for files), and Table Schema (for tables), as well as domain-specific extensions.
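As a rough illustration of how these three patterns nest, the sketch below builds a small descriptor as plain Python dictionaries and serializes it to JSON. The dataset name, file path, and column names are hypothetical. Only a handful of commonly used properties are shown (`name`, `resources`, `path`, `schema`, `fields`, `type`), so treat it as a sketch, not a complete or authoritative descriptor.

```python
import json

# Table Schema: describes the columns of a table.
# Field names here are hypothetical examples.
table_schema = {
    "fields": [
        {"name": "id", "type": "integer"},
        {"name": "name", "type": "string"},
        {"name": "price", "type": "number"},
    ]
}

# Data Resource: describes a single file and refers to its schema.
data_resource = {
    "name": "products",
    "path": "products.csv",  # hypothetical file path
    "schema": table_schema,
}

# Data Package: describes a dataset as a collection of resources.
data_package = {
    "name": "example-package",  # hypothetical dataset name
    "resources": [data_resource],
}

descriptor_json = json.dumps(data_package, indent=2)
print(descriptor_json)
```

A descriptor like this is conventionally stored next to the data in a file named `datapackage.json`, so tools and readers can find out what each file and column contains without opening the data.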

^[2]