Wikidata talk:WikiProject Companies/Archive 3
This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.
Fiscal year data model
Notified participants of WikiProject Companies
There seems to be no uniformity in fiscal year modeling on Wikidata. I see two main problems:
- indication of the fiscal year on financials that cover a time interval (total revenue (P2139), net profit (P2295), etc.). At the moment three different approaches are used:
- point in time (P585) and (calendar) year + criterion used (P1013) fiscal year (Q191891) (used 500+ times)
- point in time (P585) and (calendar) year + sourcing circumstances (P1480) fiscal year (Q191891) (used 200+ times)
- start time (P580) and end time (P582) (used less than 10 times)
- indication of properties of the fiscal year itself. I see three solutions:
- new dedicated property fiscal year with data type item (to be created) - values calendar year (Q3186692), fiscal year ending 31 March, fiscal year as 52 or 53-week period that ends on the last Saturday of September, etc.
- new dedicated property end of fiscal year with data type item (to be created) - values December 31 (Q2912), March 31 (Q2461) (not sure how to illustrate more complex cases)
- new dedicated property fiscal year somehow using start time (P580) and end time (P582)
For financials indication I slightly prefer the first option, I do not recommend the third one (17k+ values with year to be converted, 20+ templates to be fixed). For fiscal year properties, I prefer the first option. Any thoughts? --Jklamo (talk) 16:13, 9 April 2021 (UTC)
- @Jklamo: Good questions. For both cases option 1 seems the right choice to me also. Note that you might need start time (P580) as a qualifier on a fiscal year for an organization if it changes at some point. ArthurPSmith (talk) 16:45, 9 April 2021 (UTC)
- @Jklamo: I also prefer option 1. - PKM (talk) 17:55, 9 April 2021 (UTC)
- The most economical way is to use start time (P580) and end time (P582). Sure, that would require some processing to capture and compare consecutive years of the same company. But if you want to compare performance of different companies that have different fiscal years, it's easier to work with the "raw" data (eg sort them by end time (P582)). I think we don't yet have enough data and a strong enough community to be introducing an intermediate node fiscal year --Vladimir Alexiev (talk) 06:55, 10 April 2021 (UTC)
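To make the trade-off between the qualifier options concrete, here is a minimal Python sketch that turns either style of statement into an explicit (start, end) interval, so that figures from companies with different fiscal years can be sorted by end date as suggested above. The `Financial` class and `to_interval` function are hypothetical names, not part of any Wikidata tool. The sketch assumes that a point in time (P585) year names the calendar year in which the fiscal year ends, and that every fiscal year is exactly one year long (so the 52/53-week case is not covered).

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional, Tuple


@dataclass
class Financial:
    """One revenue or profit statement with whichever qualifiers it carries."""
    amount: float
    point_in_time: Optional[int] = None  # year taken from P585, if used
    start: Optional[date] = None         # P580, if used
    end: Optional[date] = None           # P582, if used


def to_interval(stmt: Financial, fy_end_month: int = 12,
                fy_end_day: int = 31) -> Tuple[date, date]:
    """Return (start, end) for a statement.

    Assumption: a P585 year is the calendar year in which the fiscal
    year ends, and the fiscal year is exactly one year long.
    """
    if stmt.start is not None and stmt.end is not None:
        return stmt.start, stmt.end
    if stmt.point_in_time is None:
        raise ValueError("statement has no usable time qualifier")
    end = date(stmt.point_in_time, fy_end_month, fy_end_day)
    previous_end = date(stmt.point_in_time - 1, fy_end_month, fy_end_day)
    # The fiscal year starts the day after the previous one ended.
    return previous_end + timedelta(days=1), end


# Hypothetical amounts, mixing both qualifier styles.
statements = [
    Financial(100.0, point_in_time=2020),
    Financial(80.0, start=date(2019, 4, 1), end=date(2020, 3, 31)),
]
by_end = sorted(statements, key=lambda s: to_interval(s)[1])
```

The fiscal year end has to come from somewhere, which is the second problem raised at the top of this thread: without a property describing the fiscal year itself, `fy_end_month` and `fy_end_day` would have to be supplied per company by hand.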
More Registers
We should source data about more registries. I made 4 prop proposals, please vote:
- Wikidata:Property_proposal/GLEI_RAL: GLEIF registration authority code (P9487)
- Wikidata:Property_proposal/GLEI_ELF: under discussion
- Wikidata:Property_proposal/OpenCorporates register id: OpenCorporates register ID (P9532)
- Wikidata:Property_proposal/OpenCorporates register jurisdiction: OpenCorporates register jurisdiction (P9630)
We should also consider:
- Creating more types of registries. Many countries have different kinds of registries, see RAL
- Eg Commercial Register (Q12297114) is company register (Q1394657), and that's clear
- Eg for Bulstat Registry (Q106447925) I used "register data (Q59157850) of nonprofit organization (Q163740)", but it may be better to make a dedicated type
- Eg for EDGAR (Q3050604) I used "company register (Q1394657) of public company (Q891723)", but it may be better to make a dedicated type. Stock exchanges also track public companies, but national authorities regulate them
- There are also registers of banks, insurers, funds, etc
- The difference between an agency and the registries it maintains/publishes (some agencies publish several registries)
- Eg EDGAR (Q3050604) vs U.S. Securities and Exchange Commission (Q953944)
- Eg Commercial Register (Q12297114) and Bulstat Registry (Q106447925) vs Registry Agency (Q101218804)
- But eg for Arizona Corporation Commission (Q4791280) and Corporations Canada (Q5172549), no such distinction is made
- The difference between applies to jurisdiction (P1001) and country (P17).
- Many but not all registries apply to whole countries: eg Germany and US have autonomous registries per Land/State (see counts in RAL)
- So far I've only seen registers that have country (P17). I think we should copy that to applies to jurisdiction (P1001) when the latter is missing.
- Incorporating registry openness scores
- There are 2 from Opencorporates:
- Open Company Data Index Score (Q106448953) (0-100)
- Basel Anti-Money Laundering Score (Q106448723) (10-0, lower is better)
- How to capture the scales?
- I've misappropriated minimum value (P2313), maximum value (P2312) (and even used them in reverse order for the Basel index)
- There are no appropriate values for "best/worst value"
- Is it ok to use review score (P444) to attach to companies, eg see https://www.wikidata.org/wiki/Q12297114#P444 ?
- Incorporating counts from Opencorporates.
- Eg the Arizona register has these numbers (see https://www.wikidata.org/wiki/Q4791280#P4876)
- records (companies) 1,392,539
- records (officers) 3,595,725
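On the question of how to capture the scales: whatever properties end up being used, a consumer needs both ends of the scale and its direction to compare the two scores. A minimal Python sketch, assuming each score is stored together with its "worst" and "best" values (the `normalize` function and the sample scores are hypothetical, only the 0-100 and 10-0 scales come from the list above):

```python
def normalize(score: float, worst: float, best: float) -> float:
    """Map a score onto 0..1, where 1 is best, whatever the scale direction."""
    if best == worst:
        raise ValueError("best and worst must differ")
    return (score - worst) / (best - worst)


# Open Company Data Index Score: 0 is worst, 100 is best.
open_data = normalize(70, worst=0, best=100)

# Basel Anti-Money Laundering Score: 10 is worst, 0 is best.
basel = normalize(4, worst=10, best=0)
```

Storing "worst" and "best" rather than "minimum" and "maximum" avoids having to reverse the order of the bounds for scales where lower is better.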
Integrating PM20 company/organization folders into Wikidata
The companies and organizations archive of 20th Century Press Archives (Q36948990) (PM20) comprises more than 8,300 folders with digitized clippings and annual reports, along with the corresponding metadata (example). As the third part of the ZBW - Leibniz Information Centre for Economics (Q317179) data donation to Wikidata, started in April 2021, these companies are systematically linked to existing Wikidata items, or created as new items from the PM20 metadata (example: Steel Brothers & Company (Q106809286)).
To organize the process, the institutions have been segmented into different current and future Mix-n-match catalogs, according to the Wikipedia language edition which is primarily used for matching.
Current status of the PM20 companies data donation (number of links and new items, per catalog/wiki language)
After Dutch as a pilot, currently the English segment is under way:
- Mix-n-match catalog PM20 companies en
- List for searching missing companies in all Wikipedias (via Startpage/Google)
- Pre-fabricated Wikidata items, to be inserted easily via QuickStatements
Segments matched against the French and German Wikipedia will follow. Details of the process are outlined here as part of the Wikidata:WikiProject 20th Century Press Archives.
Help is very welcome.
--Jneubert (talk) 06:00, 18 May 2021 (UTC)
- Nice dataset. But it would be nice to use it to create more than substandard company items. You have enough data to fill headquarters location (P159) and country (P17) (or even industry (P452)) in your example, but unfortunately, none of these is filled.--Jklamo (talk) 09:01, 18 May 2021 (UTC)
- @Jklamo: Filling more properties is on the todo list. This requires additional work in mapping internal identifiers to Wikidata items. But once done, it can be applied in batch to the newly created as well as to the previously existing companies lacking these properties. Country sounds straightforward, but may turn out to be the worst, because some Polish, Czech etc. companies were reported as German due to Nazi occupation - which I really would not want to import into Wikidata. --Jneubert (talk) 10:17, 18 May 2021 (UTC)
- Agreed (@Jneubert:)! btw, which classification is "SEC-06120 Food and Beverage Industry // EN: Food and Tobacco Industry", I haven't seen such code. --Vladimir Alexiev (talk) 10:01, 18 May 2021 (UTC)
- The codes stem from an internal-use ZBW classification which is loosely related to the systematic part of STW Thesaurus for Economics (Q26903352). Maybe NACE2 will turn out helpful for mapping these to Wikidata. --Jneubert (talk) 10:22, 18 May 2021 (UTC)
- @Vladimir Alexiev: I'm aware of the recommendation to rely on ISIC as the international standard. Do you know an existing LOD/SPARQL endpoint (or RDF download) with ISIC4, NACE2 and the published mapping between both, and if so, have you used it to map other classifications transitively to ISIC? I would love to learn from any experiences. --Jneubert (talk) 13:50, 18 May 2021 (UTC)
- @Jneubert: Very nice work! It's been a while since I used Mix n Match, does the "New item" process there work, or do you recommend just using your "pre-fabricated items" commands to create the ones needed? ArthurPSmith (talk) 13:19, 18 May 2021 (UTC)
- @ArthurPSmith: In fact, I did not use M-n-m "New item", because it has no knowledge of how to map source fields to WD properties. This - and the insertion of proper references - can be better done programmatically. I use the mnm description field for all information which could be useful for the manual checking process. But lately I mostly relied on the search statements in the list linked above, let Google do the matching, and when no full match comes up, create an item of its own. So in our current workflow, M-n-m does the initial matching, and then it's customized tools. --Jneubert (talk) 13:40, 18 May 2021 (UTC)
- @Jneubert: Very nice work! :) Would it help to have a Python script similar to https://github.com/dpriskorn/LexUse? It could check whether a company item already exists and, if not, create it. (I began to rewrite and improve it here, but I have not found spare time to finish it yet)--So9q (talk) 07:37, 24 May 2021 (UTC)
- @So9q: Thank you very much for the hint! I'm not so familiar with Python, so I'll stick with text munging in good ol' Perl + Quickstatements for now, but will keep in mind that repo. Cheers, Jneubert (talk) 05:38, 26 May 2021 (UTC)
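As an illustration of the "pre-fabricated items" approach discussed above, here is a minimal Python sketch that builds a QuickStatements (V1, tab-separated) batch for one new company item, with a "stated in" (P248) reference on each claim. It is not the workflow actually used for the donation, which is described as Perl-based. The function name, the label, the folder ID and the country value are hypothetical, and P4293 is assumed to be the PM20 folder ID property, so please verify it before reuse. Country and headquarters are optional, so that doubtful values (such as the occupation-era country assignments mentioned above) can simply be left out.

```python
from typing import Optional

PM20_QID = "Q36948990"   # 20th Century Press Archives, used as "stated in"
BUSINESS_QID = "Q4830453"  # business


def quickstatements_for(label: str, folder_id: str,
                        country_qid: Optional[str] = None,
                        hq_qid: Optional[str] = None) -> str:
    """Build a QuickStatements V1 batch that creates one company item."""
    ref = f"S248\t{PM20_QID}"  # S248 = reference "stated in" (P248)
    lines = [
        "CREATE",
        f'LAST\tLen\t"{label}"',
        f"LAST\tP31\t{BUSINESS_QID}\t{ref}",
        # Assumption: P4293 is the PM20 folder ID property.
        f'LAST\tP4293\t"{folder_id}"',
    ]
    if country_qid:
        lines.append(f"LAST\tP17\t{country_qid}\t{ref}")
    if hq_qid:
        lines.append(f"LAST\tP159\t{hq_qid}\t{ref}")
    return "\n".join(lines)


# Hypothetical input record; Q145 is the United Kingdom.
batch = quickstatements_for("Example Company Ltd", "co/000000",
                            country_qid="Q145")
print(batch)
```

Because the optional properties are added per record, the same function could later be rerun over existing items once the mapping from internal identifiers to Wikidata items is available.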
Creating entity schemas for organizations, companies, businesses, enterprises, public companies, etc.
The only existing relevant entity schemas (https://www.wikidata.org/wiki/User:HakanIST/EntitySchemaList) for this project are "Instance of organization (or subclass)" (E98), "Public library organisation in The Netherlands" (E90), "software companies" (E73) and "boat manufacturing company" (E296). Can we agree on entity schemas for organizations, companies, businesses, enterprises, public companies, etc. and create them? We could use the sheXer tool (http://shexer.weso.es) to make drafts of entity schemas based on a few entities. Maybe it will then help to agree on a certain P31 value for companies... RShigapov 09:11, 24 June 2021 (UTC)
- This seems like a good idea, but yes there are some issues with our class hierarchy for organizations and I'm not eager to push people to agree on something there right now! ArthurPSmith (talk) 17:08, 24 June 2021 (UTC)
Since companies are organizations, I would begin with a first step: improve E98 and check that companies follow the schema. PAC2 (talk) 19:17, 26 June 2021 (UTC)
- I have improved E98. It's still a draft, we need to discuss it. RShigapov 11:04, 02 July 2021 (UTC)
Deploy {{Item documentation}} for companies
In the last months, I've developed {{TP organization}}, a template which provides generic queries for all kinds of organisations and therefore companies.
For instance, {{TP organization|Q2283}} returns:
This section is generated using {{TP organization}}
- List of organizations which have Microsoft as parent organization (P749) (query)
- List of organizations of which Microsoft is a subsidiary (P355) (query)
- Queries based on employer (P108)
- Queries based on member of (P463)
- See also: WikiProject Organizations
- See also EntitySchema for organizations: E98
This template is now automatically integrated into {{Item documentation}}.
If you find it useful, help me deploy it to the talk pages of company items.
Feedback and contributions are also welcome. PAC2 (talk) 19:14, 26 June 2021 (UTC)
- Nice, thanks! ArthurPSmith (talk) 13:04, 27 June 2021 (UTC)
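For readers who want to see what such a generic query might look like, here is a minimal Python sketch that builds a SPARQL query for the first list above (organizations that have a given item as parent organization (P749)). It is an illustration only, not the query the template actually generates; the function name and the `limit` parameter are hypothetical.

```python
def parent_organization_query(parent_qid: str, limit: int = 100) -> str:
    """Build a SPARQL query listing items whose P749 is the given item."""
    # Basic guard so that only something shaped like "Q123" is interpolated.
    if not (parent_qid.startswith("Q") and parent_qid[1:].isdigit()):
        raise ValueError(f"not a QID: {parent_qid!r}")
    return f"""
SELECT ?org ?orgLabel WHERE {{
  ?org wdt:P749 wd:{parent_qid} .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {int(limit)}
""".strip()


# Q2283 is Microsoft, as in the template example above.
query = parent_organization_query("Q2283")
print(query)
```

The resulting string can be pasted into the Wikidata Query Service; swapping P749 for P355, P108 or P463 would give the other lists in the same way.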
List of companies (based on industrial-classification and country)
I'm interested in this WikiProject. Can we see the list of all companies (based on industry classification and country) that members of this WikiProject have contributed? Rtnf (talk) 08:23, 19 July 2021 (UTC)
US Boards for businesses with the largest revenues
I am starting on a project to add the latest revenue (2020 and later) and board members to the largest US businesses. I would like to then use that data for a project called TheyRule.net. So far I have cleaned up or added boards and revenue for Alphabet, AmerisourceBergen, and McKesson. This has involved adding some end dates on previous board members, and creating some new people entities. I am working through the top ten (from 10 - 1) on this page: List of largest companies in the United States by revenue. After I complete that (or maybe the top 25), I plan to take a break and work on how I am going to retrieve and process the data. Then I will come back to it and do the top 500 or so - probably recruiting some volunteer help! I am adding it here to announce my intentions, and seek any guidance or oversight, and coordination. One question that I have so far is about the minimum useful information that I will need to add when creating a new person in Wikidata. I have typically been adding instance of human, gender, given and family name, occupation: businessperson, sometimes citizenship, Little Sis person Id, sometimes Twitter or Crunchbase or Bloomberg Ids. I haven't worked out how to find the Twitter numeric id, for example. Thanks, --Korimako (talk) 14:19, 3 August 2021 (UTC)
- I think it could be interesting to have actual executive officers .. some boards aren't that productive. --- Jura 20:21, 30 October 2021 (UTC)
- @Jura1 Yes - that would be good data to capture during the process!
- I would like to invite friends, family and others to help. I am going to make some instructions etc., but I would also like to have a copy of a list of all the companies I want them to help update, so they can claim the ones they have done and the ones they plan to do. I think it makes sense to put that on Wikidata as a project page. My question is: What is the best way to go about that - is there anything that I should know? Should I make a page under this companies project? Would anyone object? Korimako (talk) 18:21, 31 October 2021 (UTC)