

This is a Wikidata project to enrich existing Wikidata items with data on legal entities (German companies, foundations, Vereine, etc.) from offeneregister. For now, only OpenCorporates ID (P1320) is licensed CC0, so we'll start with that.

We have already downloaded the entire offeneregister dataset, split it into chunks to make it more manageable, and done some pre-processing (re-formatted the OpenCorporates ID to include the "de/" prefix, and split the data into sets based on the quality of the address fields).
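The chunking step could be sketched roughly as follows. This is a minimal illustration, not the script that was actually used: the function name `chunk_records` is hypothetical, and the only detail taken from this page is the chunk size of 100,000 records.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def chunk_records(records: Iterable[dict], size: int = 100_000) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most `size` records.

    `records` can be any iterable of parsed records (an assumption; the
    real input format of the dump is not described on this page).
    """
    if size <= 0:
        raise ValueError("size must be positive")
    it = iter(records)
    while True:
        # Take the next `size` records without loading the whole dataset.
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk
```

Each yielded list could then be written out as one "raw" .json chunk.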

To contribute, download a chunk of data and mark it as In progress. Then install OpenRefine and load the data as JSON, using the outermost bracket as the import path. Some of the chunks have been pre-processed for your convenience and uploaded as OpenRefine projects.

So far we've been reconciling the company name ("_ - name") against Organisation (Q43229). The expected hit/miss rate when reconciling the data with Wikidata is about 0.007%, which will probably result in an affected set of 36,000 items.




Use these templates to mark progress and avoid duplication.

  Not done

  In progress



  • Download and chunk data from offeneregister.   Done
  • Create openrefine_projects, re-format OpenCorporates ID to include "de/" prefix and split chunks further based on quality of address data. User:a_ka_es   In progress
    • raw = full unprocessed chunked dataset; 100,000 records - .json
    • openrefine_project = only the records with clean addresses; ready to import as a project in OpenRefine; OpenCorporates IDs are aligned, addresses are cleaned; ready to reconcile/upload - .openrefine.tar.gz
    • without_address = only the records without addresses; OpenCorporates IDs are aligned; ready to import/reconcile/upload - .csv
    • to_clean = only the records with "messy" addresses; OpenCorporates IDs are aligned - .csv
  • Reconcile chunks with Wikidata and upload OpenCorporates ID.   In progress
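The pre-processing described in the list above (aligning the OpenCorporates ID and sorting records into the three sets) might look something like this sketch. The function names are hypothetical, and the rule for what counts as a "clean" address (street part, comma, five-digit postal code, city) is purely an assumption made for illustration; the criteria actually used for the split are not documented here.

```python
import re
from typing import Optional

PREFIX = "de/"

# Assumed notion of a "clean" address: "<street part>, <5-digit PLZ> <city>".
# This pattern is illustrative only, not the rule used for the real split.
CLEAN_ADDRESS = re.compile(r"^[^,]+,\s*\d{5}\s+\S.*$")


def add_de_prefix(company_id: str) -> str:
    """Return the ID with a leading "de/", without doubling an existing prefix."""
    company_id = company_id.strip()
    return company_id if company_id.startswith(PREFIX) else PREFIX + company_id


def classify_address(address: Optional[str]) -> str:
    """Map an address string to one of the set names used in the list above."""
    if address is None or not address.strip():
        return "without_address"
    if CLEAN_ADDRESS.match(address.strip()):
        return "openrefine_project"
    return "to_clean"
```

For example, `add_de_prefix("K1101R_HRB150148")` returns `"de/K1101R_HRB150148"`, and a record whose address is empty would be routed to the without_address set.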

The links below are not ready yet: "raw" is linked; "openrefine_project", "without_address" and "to_clean" are in progress.


The participants listed below can be notified using the following template in discussions:

{{Ping project|WikiProjekt}}