The CSV Profiler analyses the input CSV and provides basic information and metrics:

  • File encoding and delimiter of the input CSV.
  • The header values of the table.
  • Completeness metric: the proportion of non-empty data fields, computed for (i) all cells, (ii) each column, and (iii) the headers.
  • The number of distinct values per column.
  • Simple datatype detection: NUMERIC, TEXT, ENTITY (candidates for names, places, things, …), and ID (internal IDs, codes, acronyms).
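
As an illustration of the metrics listed above, the following is a minimal Python sketch and not the CSV Profiler's actual implementation. The function name `profile_csv`, the result keys, and the rule "a column is NUMERIC if every non-empty value parses as a float" are assumptions made for this example. Encoding and delimiter detection, as well as the ENTITY and ID datatypes, are not attempted here.

```python
import csv
import io


def profile_csv(text, delimiter=","):
    """Compute a few basic profile metrics for CSV text (illustrative sketch only)."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        raise ValueError("input contains no rows")
    header, body = rows[0], rows[1:]
    n_cols = len(header)
    # Pad or truncate ragged rows so every row has one cell per header.
    body = [(row + [""] * n_cols)[:n_cols] for row in body]

    def is_filled(cell):
        return cell.strip() != ""

    def is_numeric(cell):
        # Assumption: "numeric" means parseable by float().
        try:
            float(cell)
            return True
        except ValueError:
            return False

    columns = []
    for i, name in enumerate(header):
        cells = [row[i] for row in body]
        filled = [c for c in cells if is_filled(c)]
        all_numeric = bool(filled) and all(is_numeric(c) for c in filled)
        columns.append({
            "header": name,
            "completeness": len(filled) / len(cells) if cells else 0.0,
            "distinct": len(set(filled)),  # distinct non-empty values
            "datatype": "NUMERIC" if all_numeric else "TEXT",
        })

    total_cells = n_cols * len(body)
    filled_cells = sum(is_filled(c) for row in body for c in row)
    filled_headers = sum(is_filled(h) for h in header)
    return {
        "header_completeness": filled_headers / n_cols if n_cols else 0.0,
        "cell_completeness": filled_cells / total_cells if total_cells else 0.0,
        "columns": columns,
    }
```

Calling `profile_csv("id,name,\n1,Vienna,x\n2,,\n3,Graz,\n")` returns a dictionary with the overall cell completeness, the header completeness, and one entry per column holding its completeness, distinct-value count, and guessed datatype.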

The ADEQUATe framework integrates this profiling information to support the (semi-)automatic generation of metadata for CSV files; cf. the CSV MetaData Editor.