Property talk:P3134
Documentation
identifier of a place (region, hotel, restaurant, attraction), in TripAdvisor
List of violations of this constraint: Database reports/Constraint violations/P3134#Format, SPARQL
List of violations of this constraint: Database reports/Constraint violations/P3134#Unique value, SPARQL (every item), SPARQL (by value)
List of violations of this constraint: Database reports/Constraint violations/P3134#Item P31, search, SPARQL
List of violations of this constraint: Database reports/Constraint violations/P3134#Item P17, search, SPARQL
List of violations of this constraint: Database reports/Constraint violations/P3134#Single value, SPARQL
List of violations of this constraint: Database reports/Constraint violations/P3134#Entity types
List of violations of this constraint: Database reports/Constraint violations/P3134#Scope, SPARQL
This property is being used by: Please notify projects that use this property before big changes (renaming, deletion, merge with another property, etc.)
Pattern ^g\d+-d(\d+)$ will be automatically replaced to \1. Testing: TODO list
Pattern ^Tourism-g(\d+)$ will be automatically replaced to \1. Testing: TODO list
Warning
Hi, I don't know how to avoid the warning when I insert this ID. In Q18490819 I tried three different formats; they all work, but they all give me the small alert icon on the right. I tried to understand it (I read the comment) but so far I cannot fix it. I think the problem is that the format is structured for specific places and not for generic locations, which have a shorter code on TripAdvisor. Can someone show me how to fix it, or change the code here accordingly to cover this case too? Thank you.--Alexmar983 (talk) 13:42, 1 August 2018 (UTC)
- @YMS: I ping you as the creator, if you don't mind. It's not urgent, but I am processing a lot of TripAdvisor pages in these weeks; it's the ideal moment to add those IDs for towns and hamlets, but the warning is there.--Alexmar983 (talk) 16:32, 9 August 2018 (UTC)
- The second format you tried was almost correct according to the specified format. You just added an additional \, probably inspired by the regular expression given here. But there, the \ is a special character only present to escape the -, which otherwise would have a special meaning. Just copying the beginning of the URL should work fine. The first format you tried actually worked as well, but if I remember correctly, this is just because TripAdvisor does a lot of fuzzy searching in its URLs (as can be seen in the third format you tried, which also actually worked), and might result in ambiguous links. If I'm wrong here, the regular expression could indeed be simplified. --YMS (talk) 17:07, 9 August 2018 (UTC)
Two items for one ID
Hi, this ID covers Orsanmichele (Q860816) and Museum of Orsanmichele (Q2947745). What is the preferred strategy in this case? Should I apply it to both items?--Alexmar983 (talk) 10:34, 10 May 2019 (UTC)
- @Alexmar983: IMHO you should create an item about the church and museum together, add the identifier to it, and then link the items with has part(s) (P527) and part of (P361). --★ → Airon 90 13:00, 20 February 2020 (UTC)
Identifier could be reduced more
As of now, the identifier consists of two parts, each made of a letter and some digits, joined by a - (e.g. g670770-d4037786). I found out that it is possible to use just the last number (https://www.tripadvisor.com/4037786). Moreover, this number is used by the TripAdvisor API to get locations. --★ → Airon 90 12:56, 20 February 2020 (UTC)
- @Airon90: I would Support the simplification of this property to just the numbers, also makes sense to align with TripAdvisor themselves and what they consider the unique ID. Would be happy to write a bot to convert existing statements if there is community consensus. --SilentSpike (talk) 11:18, 21 February 2020 (UTC)
- Please propose a new property for the new format. --- Jura 13:06, 21 February 2020 (UTC)
- @Jura1: Why is a new proposal needed? Why couldn't a bot just change all the identifiers? --★ → Airon 90 14:57, 23 February 2020 (UTC)
- How would a bot find all of them? --- Jura 14:37, 3 March 2020 (UTC)
- Seems simple enough. --SilentSpike (talk) 15:46, 3 March 2020 (UTC)
SELECT DISTINCT ?item ?id where { ?item wdt:P3134 ?id. FILTER (contains(?id, "-")) } LIMIT 100
- It's also worth noting these IDs come in two forms: some are like Tourism-gXXXXX while others are like gXXXXX-dXXXXX. It seems like the former applies to regions and the latter to specific destinations (the "g" number is the region and the "d" the destination). Tested and it also looks like the URL found by @Airon90: works for the regions too (e.g. https://www.tripadvisor.com/1025218). --SilentSpike (talk) 17:38, 3 March 2020 (UTC)
- That's just the first 100 values here at Wikidata with best rank. --- Jura 17:45, 3 March 2020 (UTC)
- Right, because of the LIMIT 100, but a bot can go through and update these statements and then they will no longer contain the substring "-", so the query will only return the old format. --SilentSpike (talk) 17:56, 3 March 2020 (UTC)
- I hope use of this property isn't limited to best rank here at Wikidata. --- Jura 18:21, 3 March 2020 (UTC)
- I'm not sure I follow? It's finding the items which have a statement using this property where the value contains a "-" substring. Items don't have a rank. Perhaps I am missing a SPARQL behaviour of some sort? --SilentSpike (talk) 18:35, 3 March 2020 (UTC)
- It doesn't find uses in references, as qualifier, in ranks other than best rank, and, most of all, any uses of the property outside Wikidata. --- Jura 18:37, 3 March 2020 (UTC)
- Ah I see, then yeah new property proposal and deprecation is the way to go for sure. --SilentSpike (talk) 18:58, 3 March 2020 (UTC)
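For readers following the point about ranks, qualifiers and references: a rough sketch of how the query could be widened beyond wdt:, which only returns best-rank main values. It assumes the standard prefixes predefined on the Wikidata Query Service (p:, ps:, pq:, pr:, prov:); the function names and dictionary keys are made up for illustration, and nothing here covers uses of the property outside Wikidata. The qualifier and reference queries scan every property, so they may need narrowing to finish in time.

```python
from urllib.parse import urlencode

# WDQS endpoint; the prefixes p:, ps:, pq:, pr: and prov: are predefined there.
ENDPOINT = "https://query.wikidata.org/sparql"
LEGACY_FILTER = 'FILTER(CONTAINS(STR(?id), "-"))'


def legacy_value_queries(prop="P3134"):
    """Build one query per position in which a legacy value can occur.

    The function name and dictionary keys are illustrative, not an existing API.
    """
    head = "SELECT DISTINCT ?item ?id WHERE { "
    tail = f" {LEGACY_FILTER} }}"
    return {
        # Main statements of any rank (wdt: only returns best-rank values).
        "statement": f"{head}?item p:{prop} ?st . ?st ps:{prop} ?id .{tail}",
        # Values used as a qualifier on a statement of any property.
        "qualifier": f"{head}?item ?p ?st . ?st pq:{prop} ?id .{tail}",
        # Values used inside a reference of any statement.
        "reference": (
            f"{head}?item ?p ?st . ?st prov:wasDerivedFrom ?ref . "
            f"?ref pr:{prop} ?id .{tail}"
        ),
    }


def wdqs_url(query):
    """Return a GET URL for the query; nothing is sent over the network."""
    return ENDPOINT + "?" + urlencode({"query": query, "format": "json"})
```

There is deliberately no LIMIT, so a bot would see every match rather than the first 100.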
Am I missing something, or can a bot change all the items in the query, after which changing the format in format as a regular expression (P1793) will eventually trigger WD:Database reports/Constraint violations/P3134? I don't get what you mean by "property outside Wikidata": anybody using data from this property will receive new data which is not only human-usable, as it is now, but also machine-usable. We are not breaking anything, we are just implementing something better. --★ → Airon 90 19:19, 3 March 2020 (UTC)
- How would you update already existing uses of the property (and its values) outside Wikidata? --- Jura 19:24, 3 March 2020 (UTC)
- How could you tell people to change property and use a new one because this one would be deprecated?
- Remember that this change doesn't break anything. Let's suppose it does. A change in the code of programs using this property is required. It would be required even if we create a new property and deprecate this one. --★ → Airon 90 19:40, 3 March 2020 (UTC)
- @Airon90: The point of deprecation is that you replace a property with another without destroying existing data (only adding data). If we start changing the value of this property everywhere then suddenly downstream there's an unexpected value. If we instead remove the statements and add a new statement with expected new value then nothing is broken, downstream users would just see data as removed and have time to convert to using the new property since the old would be marked as deprecated. For what it's worth I don't imagine it would be hard to get the new proposal approved (since it's pretty clearly the actual identifier unlike the partial URL slug currently used) and it would also still be easy to replace via bot. --SilentSpike (talk) 20:04, 3 March 2020 (UTC)
- How can I tell you that this change will not break anything? Nobody will get an "unexpected value".
- I'm tired of explaining my position, so if you still think I'm wrong, you will open a new proposal. I won't do that, sorry. --★ → Airon 90 08:03, 4 March 2020 (UTC)
This is now Done. It was decided not to use a new property, for lack of consensus at Wikidata:Property_proposal/TripAdvisor_ID_2. --SilentSpike (talk) 12:15, 2 July 2020 (UTC)
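For reference, the conversion discussed in this section can be sketched with the two autofix patterns listed in the documentation at the top of this page. This is only an illustration of what those patterns do to a value, not the code any bot actually ran; the function name is hypothetical.

```python
import re

# The two autofix patterns from the property documentation; each captures
# the trailing number that now serves as the identifier.
LEGACY_PATTERNS = [
    re.compile(r"^g\d+-d(\d+)$"),      # specific destination, e.g. g670770-d4037786
    re.compile(r"^Tourism-g(\d+)$"),   # region, e.g. Tourism-g1025218
]


def normalize_tripadvisor_id(value):
    """Return the digits-only form of a legacy value; leave other values alone."""
    for pattern in LEGACY_PATTERNS:
        match = pattern.match(value)
        if match:
            return match.group(1)
    return value


print(normalize_tripadvisor_id("g670770-d4037786"))
print(normalize_tripadvisor_id("Tourism-g1025218"))
```

Values that match neither pattern are returned unchanged, so running it twice on the same statement is harmless.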
Please add simple instructions on how to get the Trip Advisor ID
Hi
I'm trying to add a Trip Advisor ID but there are no instructions here how to get it or what it should look like. Please can these be added?
Thanks
--John Cummings (talk) 10:19, 3 March 2020 (UTC)
- @John Cummings: I added instructions, though they may not be great. Basically, given a URL like https://www.tripadvisor.com/Attraction_Review-g303961-d324092-Reviews-Bazaar_of_Tabriz-Tabriz_East_Azerbaijan_Province.html, the ID is the gXXXXXX-dXXXXXX part (where X represents a digit). So for this URL, the ID is g303961-d324092. Basically the parts between tripadvisor.com/ and .html that aren't words. :) Trivialist (talk) 11:07, 3 March 2020 (UTC)
- @Trivialist: this is great, thanks so much. John Cummings (talk) 14:12, 3 March 2020 (UTC)
Just to note for future readers that this discussion is outdated as the ID format has now been updated to just using digits after -d or -g (if there's no -d in the URL). --SilentSpike (talk) 12:13, 2 July 2020 (UTC)
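A small sketch of the updated rule (digits after -d, otherwise digits after -g), in case it helps anyone extracting IDs from URLs in bulk. The function name is hypothetical, and the regular expressions are an assumption about how the rule maps onto URLs like the one quoted above; a place name that itself contains "-d" followed by digits would confuse it.

```python
import re
from typing import Optional


def tripadvisor_id_from_url(url: str) -> Optional[str]:
    """Return the digits after -d, or after -g when the URL has no -d part."""
    destination = re.search(r"-d(\d+)", url)
    if destination:
        return destination.group(1)
    region = re.search(r"-g(\d+)", url)
    return region.group(1) if region else None


url = ("https://www.tripadvisor.com/Attraction_Review-g303961-d324092-"
       "Reviews-Bazaar_of_Tabriz-Tabriz_East_Azerbaijan_Province.html")
print(tripadvisor_id_from_url(url))
```

For the Bazaar of Tabriz URL this gives 324092 rather than the older g303961-d324092.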
Scraping
There are many commercial and open-source scrapers that work on TripAdvisor, but all seem to focus on getting detailed data for a known URL, rather than finding out the URLs themselves. I came up with a crude approach which could produce a dataset, given manual filtering of incorrect results.
curl 'https://duckduckgo.com/?q=!ducky+fort+amherst+site%3Atripadvisor.com' | sed -Ee 's`.*%2Dd([0-9]*)%2DReviews%2D([^%]*)%2D(.*).html\&.*`\1 \2,\3\n`;s`_` `g'
2225973 Fort Amherst,Chatham Kent England
I will experiment on a small scale for fortifications in Kent. Vicarage (talk) 07:52, 20 March 2022 (UTC)
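The parsing half of the command above could also be written in Python, which may be easier to extend when filtering results. This is a sketch only: it assumes the percent-encoded TripAdvisor URL has already been pulled out of the search response, it makes no network request, and the function name is hypothetical.

```python
import re
from urllib.parse import unquote

# Mirrors the sed expression: the number after -d, then the two slug parts
# that follow -Reviews-, up to the .html suffix.
SLUG = re.compile(r"-d(\d+)-Reviews-([^-]*)-(.*?)\.html")


def parse_review_url(encoded_url):
    """Return (id, name, locality) from a percent-encoded URL, or None."""
    match = SLUG.search(unquote(encoded_url))
    if match is None:
        return None
    place_id, name, locality = match.groups()
    # Underscores stand in for spaces in the URL slug.
    return place_id, name.replace("_", " "), locality.replace("_", " ")


fragment = "%2Dd2225973%2DReviews%2DFort_Amherst%2DChatham_Kent_England.html"
print(parse_review_url(fragment))
```

Returning None for non-matching input makes it simple to drop search hits that are not review pages.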