Previous discussion was archived at User talk:Joshbaumgartner/Archive 1 on 2016-02-26.

Removal of manufacturer from aircraft titles

Huntster (talkcontribs)

Josh, why are you regularly going around removing the name of the manufacturer from aircraft item titles? What policy or guideline is being enforced here, because it feels like personal preference, and it certainly goes against common naming practices across all projects.

Joshbaumgartner (talkcontribs)

Labels in WD have a very different purpose than article or category names, and in fact are not names at all. Take F-16 Fighting Falcon (Q100026) for example. The name of this item is "Q100026". While exact details on how labels are implemented are still in development for Wikidata (see Help:Label), there are some things we have been implementing since the first days of the project.

  1. There is no need to correlate labels with names used on other projects, and in fact it is counter-productive to try. Our name is "Q#". While it is perfectly logical for Wikipedia or Commons to include manufacturer names in aircraft article names in order to ensure consistent avoidance of duplication, that is not an issue for WD labels.
  2. Labels do not need to be unique. There can be 100 items with the same label on WD without any issue. For this reason, long labels with disambiguation elements are not desirable. Disambiguation is instead the role of the description. Thus it is not valuable to have the manufacturer in the label of an aircraft item, but it is valuable to add it to the description.
  3. Aliases are a unique system to WD and work very differently from redirects on other projects. Thus it is valuable to have an alias that matches various other project names for their aircraft articles, and doing that further removes the value of including the manufacturer in the label.

So this is why the manufacturer name is not required to be included in the label, but is there harm in retaining it in the label anyway even if it is not required? To answer this, one needs to go deeper into what aircraft items on WD really represent. For the most part aircraft items are for classes of aircraft. If every instance of that aircraft class was manufactured by the same manufacturer, then it would not necessarily be a problem to include the manufacturer in the label, as that would be accurate. While not required, it would not be inaccurate either, so not a particular problem. However, take the aforementioned example (F-16 Fighting Falcon (Q100026)). This class includes instances manufactured by General Dynamics, SABCA, Lockheed Martin, and so on. Including any one of these in the label for the entire class would be not only not required, but would in fact be inaccurate as well for a great number of the instances of that class.

Instead, there are several methods of making sure the manufacturer information is accurately included in an item:

  1. Include it in the description. This is actually quite valuable, and for aircraft, descriptions that have the basic info of year (first flown), function (primary design role) and manufacturer (initial designing/building entity) make it very easy to quickly identify the correct entity when searching.
  2. Include it in the aliases. This ensures searches which include the manufacturer name will discover the entity.
  3. Add a manufacturer (P176) claim. This makes it both machine and human readable and allows qualifiers and references.
  4. Add an official name (P1448) claim with correct qualifiers and references.
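
The points above can be pictured on a single item. The following is only a minimal sketch, assuming a heavily simplified, hypothetical dict layout (it is not the actual Wikidata JSON format). The description and alias strings are illustrative wording, the manufacturer values are plain names standing in for item references, and the helper name is made up for this example.

```python
# Simplified, hypothetical model of one item; the real Wikidata data model is richer.
f16 = {
    "id": "Q100026",
    "label": "F-16 Fighting Falcon",  # short label, no manufacturer
    "description": "fighter aircraft family by General Dynamics",  # illustrative wording
    "aliases": ["General Dynamics F-16 Fighting Falcon"],  # illustrative alias
    "claims": {
        # manufacturer (P176): several values can sit side by side
        "P176": ["General Dynamics", "SABCA", "Lockheed Martin"],
    },
}


def label_names_one_of_many_manufacturers(item):
    """True if the label mentions a manufacturer although several are recorded."""
    makers = item["claims"].get("P176", [])
    label = item["label"].casefold()
    named = [m for m in makers if m.casefold() in label]
    return len(makers) > 1 and len(named) > 0


# The short label is accurate for every instance of the class.
print(label_names_one_of_many_manufacturers(f16))

# A label carrying one manufacturer would be inaccurate for the other builders.
long_label = dict(f16, label="General Dynamics F-16 Fighting Falcon")
print(label_names_one_of_many_manufacturers(long_label))
```

The manufacturer still appears in three places (description, alias, P176), so nothing is lost by keeping it out of the label.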
Huntster (talkcontribs)

Yes, I do know how labels function, in detail, thanks. I was hoping for documented specific best practices. Help:Label is woefully inadequate to its purpose, like most help topics here, so I suppose such documentation simply doesn't exist. Less talking down to would have been appreciated.

The problem, as often is the case, is that you must trade off technical desires for functional usability. WD's mindset against this is what makes me fear the signal-to-noise ratio will eventually (well, fairly rapidly) overwhelm things. Already, the vast import of journal article titles has made finding some common terms so difficult it's just not worth pursuing when building new items.

Joshbaumgartner (talkcontribs)

Unfortunately, you are not wrong about the lack of good documentation for WD practices. The project certainly could do with better documentation on a wide number of issues.

That said, I'm not sure why you first decided to denigrate my efforts as merely enforcing 'personal preference', and then upon receiving a detailed response, denigrated that as 'talking down to' you. Statements along those lines make a rational discussion difficult. I would certainly never presume to 'talk down' to any other editor; it would be far too presumptuous. I simply provided a detailed rationale because #1) you asked, and #2) I am well aware of the incomplete nature of current guidelines and policies on WD.

As for your general statements about the mindset of WD, I don't know exactly what you are getting at there. What 'functional usability' is it that we are losing out on due to 'technical desires'? Also, what do you mean by common terms being difficult to find due to journal articles being included? And why does this deter you from creating new items? Do you have some specific examples of these issues?

Huntster (talkcontribs)

I'm sorry, Josh, my frustrations with the site and practices got the better of me. I think my mind was moving along the lines of "templating the regulars".

As for the last part, my concern is third party usefulness. Yes, internally it may be desirable for labels to be as you describe, but that decreases usability by others who may not be as knowledgeable or specifically inclined. WD already has a monumentally more steep learning curve than any other Wikimedia project, it seems unfortunate to make it even more opaque than it already can be. Regarding common terms, when adding values to properties I'm inundated with journal article titles (and to a much lesser degree, book chapters and magazine articles) that obfuscate more useful returns in the dropdown. Not knowing the system's back end, I'm at a loss of how it might be improved, but it is beyond frustrating when trying to find a useful item.

Joshbaumgartner (talkcontribs)

No problem, I get frustrated myself with things, so consider it forgotten.

I do see third party use as an area that has never been adequately worked out. In this vein, labels are one of the worst offenders, if you will. Since labels are both one of the easiest things to access from other projects, and they appear at first glance the same way as article names do, there are a lot of templates and scripts on other projects that simply grab the label to use directly in their own project and assume they are treated the same way as article names (I know some of my early templates on Commons did just this). One problem with this though is that if they want to change how it appears on their project, the easiest (or at least most expedient) way is to come to WD and change the label. Also, if the WD label is changed later, it likewise changes the third party project page and people get a bit miffed at that sometimes. People also get frustrated if WD uses a different scheme for labels than the third party site does for article names. On the WD side, you could potentially have different third party users desiring different labels for the same thing. Fundamentally, this is a WD problem because labels are not handled like claims, and so cannot be referenced, ranked or qualified which is how we would normally handle data. Thus it is far better if a third party instead only pulls claims (such as an official name (P1448) claim) which can be properly referenced and qualified, and we can support multiple claims in parallel so no need to argue which stays and which goes. However, that is an entire new level of work and learning curve to be overcome and as you correctly mention, the learning curve on WD is already steep. I have no good answer for this!
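
To illustrate what pulling a claim instead of the label could look like on the third party side, here is a rough sketch. It assumes a simplified, hypothetical item layout (not the real Wikidata JSON), and the item ID, names and source string are placeholders invented for the example.

```python
def display_name(item, lang="en"):
    """Prefer a referenced official name (P1448) claim; fall back to the label."""
    for claim in item.get("claims", {}).get("P1448", []):
        # Only trust a claim in the wanted language that carries a reference.
        if claim.get("language") == lang and claim.get("references"):
            return claim["text"]
    # No usable claim: use the label, or the item ID as a last resort.
    return item.get("labels", {}).get(lang, item["id"])


# Hypothetical items; every value below is a placeholder.
with_claim = {
    "id": "Q0",
    "labels": {"en": "Example label"},
    "claims": {
        "P1448": [
            {"text": "Example official name", "language": "en",
             "references": ["hypothetical source"]},
        ],
    },
}
label_only = {"id": "Q0", "labels": {"en": "Example label"}}

print(display_name(with_claim))
print(display_name(label_only))
```

With this approach a later label change on WD would not alter the third party page as long as the referenced claim stays in place.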

The journal articles do seem to be a bit spammy. For me, it seemed for a while I was wading through endless drop downs of genes and proteins to try and find things. I guess depending on how a search is worded, a different set of spam will appear in your way. The sheer quantity of items in the database is making it harder and harder. I remember when we used to celebrate our 1 millionth, 2 millionth, etc. milestones, but now with 100+ million items, that seems a long way back. Given that we seem to be only scratching the surface on several topics, we could be in the billions before too long. One thing I've been doing is if I know the label is likely to be similar to other items, I add an alias which matches the label with a simple word such as 'aircraft' appended. For example, search for "Boston" and you get a ton of items. Search for "Boston aircraft" and you get right to the DB-7 item. I've seen others doing this too (I didn't invent the tactic), but it is hardly universal or standardized, so it is still not a real answer to the problem. Again, I have no true good answer, just trying to do what I can. I'm trying to think of a use case in which it is more difficult to find an item on account of it not including dab info such as the manufacturer. I would really like to dig into such a case and see what we can do to improve the situation. In any case, let's keep sharing ideas and info. Thanks!
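
The alias tactic can be shown with a toy search. This is only a sketch: the prefix matching is an assumption about how a search box might behave, the item IDs are placeholders, and the labels and aliases given to the DB-7 entry are hypothetical.

```python
# Hypothetical items; IDs, labels and aliases are placeholders for illustration.
items = [
    {"id": "item-city", "label": "Boston", "aliases": []},
    {"id": "item-band", "label": "Boston", "aliases": []},
    {"id": "item-db7", "label": "DB-7", "aliases": ["Boston", "Boston aircraft"]},
]


def search(items, query):
    """Return IDs of items whose label or any alias starts with the query."""
    q = query.casefold()
    hits = []
    for item in items:
        names = [item["label"], *item["aliases"]]
        if any(name.casefold().startswith(q) for name in names):
            hits.append(item["id"])
    return hits


print(search(items, "Boston"))           # every item named or aliased "Boston"
print(search(items, "Boston aircraft"))  # only the item with the longer alias
```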

Reply to "Removal of manufacturer from aircraft titles"

Ambrosini SAI.7 (Q15139251) & S.7 (Q28110)

Leo Pasini (talkcontribs)

Hi Joshbaumgartner. I think that the acronyms used by the manufacturer SAI Ambrosini are really illogical and tend to create misunderstanding. The first aircraft was named SAI.7 (a racer), and the S.7 was a (post-war) training version.

In Q15139251 it seems to be the opposite, so Property:P279 is wrong and should work in the reverse direction (Q28110 is P279 of Q15139251).

I don't know Wikidata and I'm not able to edit as described.

Thanks.

Joshbaumgartner (talkcontribs)

I agree that it is a bit confusing. S.7 (Q28110) is the item for the aircraft family (umbrella for all specific models). SAI.7 (Q15139251) and SAI.7T (Q15139323) are aircraft models (specific versions within an aircraft family) within the S.7 family. Thus both Q15139251 and Q15139323 must be subclasses of Q28110. What is missing is an item for the aircraft model S.7 (the specific post-war training version) as an aircraft model that would also be a subclass of Q28110. Unfortunately the current setup gives the illusion of SAI.7 (the original racer version) being a subclass of a later training version, which is incorrect. But what we are really saying is that there is an aircraft family (called 'S.7') and within it are two specific versions (called SAI.7 and SAI.7T).
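
As a rough sketch of the structure described above, the subclass of (P279) links can be written as a small graph and walked upward. Only the three item IDs come from this discussion; the dict layout and function name are made up for illustration.

```python
# subclass of (P279) links as described above: each model points at the family.
SUBCLASS_OF = {
    "Q15139251": ["Q28110"],  # SAI.7 (racer model) -> S.7 family
    "Q15139323": ["Q28110"],  # SAI.7T model -> S.7 family
}


def is_subclass_of(item, ancestor, graph=SUBCLASS_OF):
    """Follow P279 links upward from item, guarding against cycles."""
    seen = set()
    stack = [item]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        for parent in graph.get(current, []):
            if parent == ancestor:
                return True
            stack.append(parent)
    return False


print(is_subclass_of("Q15139251", "Q28110"))  # model within the family
print(is_subclass_of("Q28110", "Q15139251"))  # the family is not a subclass of a model
```

A separate item for the post-war S.7 training model would simply be one more entry pointing at Q28110.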

Reply to "Ambrosini SAI.7 (Q15139251) & S.7 (Q28110)"
Vladimir Alexiev (talkcontribs)
Joshbaumgartner (talkcontribs)

Thanks for adding gun mount. However, you should use subclass of (P279) instead of instance of (P31), in that an item such as 5"/38 caliber twin gun mount (Q49839624) is not an instance, but instead a class. This is very much like how F-16 Fighting Falcon (Q100026) is not an instance of an aircraft, but instead a class of aircraft (through a tree of subclasses). Thus, I have added P279 to several items. I took a stab at classing gun mount as a subclass of installation, though this is less than perfect. Thanks and I look forward to seeing more work done on equipment entries.

Joshbaumgartner (talkcontribs)
Vladimir Alexiev (talkcontribs)

agreed

Kette~cawiki (talkcontribs)

The summary of this revert is correct: special:diff/1235829874, but the value is not; however much it was commissioned into the Chilean navy, it was still a light cruiser (Q778129). This property is shown as the ship type in the infoboxes that import data from WD. I have not found this value used as P31 on any other ship.

Joshbaumgartner (talkcontribs)

This item is not a ship, it is a specific period of a ship's history, namely its commissioning in the Argentine Navy from 1951-1982. For the item about the actual ship, see {{Q|1713251}}. It is the responsibility of Wikidata to conform its data to end users such as particular infoboxes.

Kette~cawiki (talkcontribs)

Yes, I understand that, but then where do the articles linked from the various wikis, such as cawiki (ca:ARA General Belgrano), eswiki (es:ARA General Belgrano (C-4)), enwiki (en:ARA General Belgrano)..., import the ship type from for their infobox? I work on maintaining and coding the infoboxes so that they import data from WD. According to d:Wikidata:WikiProject_Ships/Properties, the ship type is obtained from P31. I cannot change the template code for such particular cases. At the very least, do not delete the previous content; P31 can hold multiple values.

This is my workbench for this template.

Reply to "Q540580"
GreenComputer (talkcontribs)
Joshbaumgartner (talkcontribs)

You can use that one I suppose. I went looking for it and didn't find it, and certainly didn't realize and hadn't gotten any notification that it had been closed, which is a bit of a surprise. Nonetheless, I've evolved it a bit in the meantime as I've developed the use of it, so maybe it makes sense to make a fresh proposal to reflect how it is currently set up.

Reply to "size designation (P8030)"
Summary by Thierry Caro

Created!

Thierry Caro (talkcontribs)

Hello. Do you think you can get Wikidata:Property proposal/COTREX trail ID initiated? I'm ready to deal with it and I don't even need you to fill the property page itself, if you don't want to. I can do that. I just need it to be created as I'm the one who has started the proposal.

Thierry Caro (talkcontribs)

Just one click and I'll do the rest!

Joshbaumgartner (talkcontribs)

Well it is a bit more than just one...but it looks like this one is good to go, so I'll move it forward.

Thierry Caro (talkcontribs)

OK. Thank you very much. This is nice.

Joshbaumgartner (talkcontribs)

Okay, I've made a start for it, enjoy!

Adam Harangozó (talkcontribs)

Hi, I created a new page for collecting sites that could be added to Mix'n'match and I plan to expand it with the ones that already have scrapers by category. Feel free to use, expand. Best, Adam Harangozó (talk) 19:59, 19 October 2019 (UTC)

Reply to "New page for catalogues"
2001:B07:6442:8903:C4AD:B849:2AF8:ED72 (talkcontribs)
Joshbaumgartner (talkcontribs)

Not sure I'm familiar with this item or why it has both properties, but it is not a constraint violation to have both, so there may be a valid reason. Why is it a problem?

Reply to "Wrong data"
PMG (talkcontribs)
Joshbaumgartner (talkcontribs)

Looks like it should have been the alias, not the label. Seems fixed now, thanks!

PMG (talkcontribs)
Reply to "Fletcher"
Vanished user e175adb86e72bb96a1706f7ab31b9df8 (talkcontribs)
Joshbaumgartner (talkcontribs)

Excellent, will keep it in mind next time around. Thanks.

Reply to "New properties"