Property talk:P1793
Documentation
Regex describing an identifier or a Wikidata property. When using it in property constraints, ensure the syntax is PCRE.
List of violations of this constraint: Database reports/Constraint violations/P1793#Single value, SPARQL
List of violations of this constraint: Database reports/Constraint violations/P1793#allowed qualifiers, SPARQL
List of violations of this constraint: Database reports/Constraint violations/P1793#Entity types
List of violations of this constraint: Database reports/Constraint violations/P1793#Type Q6545185, Q19847637, Q108566342, Q18610173, Q21099935, SPARQL
List of violations of this constraint: Database reports/Constraint violations/P1793#Scope, SPARQL
List of violations of this constraint: Database reports/Constraint violations/P1793#Format, SPARQL
Find regexes with common issues (Help)
Violations query:
# Find regexes with the following issues:
# - unnecessary anchor characters (^ and $)
# - separator in character class ([a|b])
SELECT DISTINCT ?item #?cst
  ?itemLabel ?type ?issue #?re
WITH {
  SELECT * WHERE {
    {
      ?item a wikibase:Property .
      ?item p:P1793 ?cst .
      ?cst a wikibase:BestRank .
      ?cst ps:P1793 ?re .
      BIND("statement" AS ?type)
    } UNION {
      ?item a wikibase:Property .
      ?item p:P2302 ?cst .
      ?cst a wikibase:BestRank .
      ?cst ps:P2302 wd:Q21502404 .
      ?cst pq:P1793 ?re .
      BIND("constraint" AS ?type)
    }
    BIND(xsd:integer(SUBSTR(STR(?item), 33)) AS ?pid)
  }
  #LIMIT 1000
} AS %i
WHERE {
  {
    INCLUDE %i
    FILTER(STRSTARTS(?re, "^") || STRENDS(?re, "$"))
    BIND("anchor characters" AS ?issue)
  } UNION {
    INCLUDE %i
    FILTER(REGEX(?re, "\\[[^\\]]*\\|[^\\]]*\\]"))
    BIND("separator in character class" AS ?issue)
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
ORDER BY DESC(?pid)
List of violations of this constraint: Database reports/Complex constraint violations/P1793#Find regexes with common issues
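For checking a single pattern locally before saving it, the two checks from the violations query can be approximated in Python. This is only a sketch: the function name `find_issues` is made up for illustration, and the character-class test reuses the same regular expression as the query's second FILTER, so it inherits the same blind spots (for example, it does not understand escaped brackets).

```python
import re

# Same expression as in the query's second FILTER:
# a "|" somewhere between "[" and the next "]".
PIPE_IN_CLASS = re.compile(r"\[[^\]]*\|[^\]]*\]")


def find_issues(pattern: str) -> list:
    """Return the issue labels the violations query would report for one regex."""
    issues = []
    # First FILTER: pattern starts with "^" or ends with "$".
    if pattern.startswith("^") or pattern.endswith("$"):
        issues.append("anchor characters")
    # Second FILTER: "|" used as if it were a separator inside [...].
    if PIPE_IN_CLASS.search(pattern):
        issues.append("separator in character class")
    return issues
```

Calling `find_issues("[a|b]\\d")` would flag the character class, while a plain pattern such as `\d{4}` comes back with an empty list.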
For uses on properties (not items)
Please see Property_talk:P2302#format_as_a_regular_expression_.28P1793.29. --- Jura 20:43, 19 November 2015 (UTC)
Not for use in Lua
Unfortunately, the regexp format used here does not meet the Lua pattern format specifications. --Jarekt (talk) 15:43, 16 May 2016 (UTC)
The tool does not work with regex containing Unicode blocks syntax
And why do we use some external tool to begin with? --Base (talk) 00:26, 16 June 2016 (UTC)
Duplication on properties
Currently, most properties have this twice: once as a qualifier in a constraint (new usage), once directly as a statement.
It seems fairly clear that we don't need both. Any opinions or votes?
--- Jura 20:26, 19 July 2017 (UTC)
- I don't think this will cause a problem. We have only one intended format but may have additional format constraints to prevent unwanted values (like image (P18)). --GZWDer (talk) 16:59, 23 July 2017 (UTC)
- So what's your plan to keep them in sync? Which one should get the "syntax clarification" qualifier? --- Jura 11:42, 26 July 2017 (UTC)
- @Laddo, Ivan A. Krestinin, Lucas Werkmeister (WMDE): as the conversion is done, is there anything planned? --- Jura 11:46, 26 July 2017 (UTC)
- No plans on my end… I’d suggest removing the main statements, unless someone else uses those. --Lucas Werkmeister (WMDE) (talk) 12:25, 26 July 2017 (UTC)
- OK to remove the main statement;
- however, this property was used to fill property documentation field "Allowed values" and we'll need to revisit Module:Property documentation.
- We can't move qualifier syntax clarification (P2916) to the constraints section, obviously, but its values would fit well as the new content for "Allowed values" (for an example, see syntax clarification (P2916) on IMSLP ID (P839)).
- Unless someone can think of an adequate property, I would suggest appending the "syntax clarification" comment to the field |allowed values= of {{Property documentation}}, on property talk pages.
- Note that Category:Property with allowed values to move to statement already reports over a thousand pages bearing some text in "allowed values".
- -- LaddΩ chat ;) 14:28, 26 July 2017 (UTC)
- I tried the qualifier P2916 on the constraints at Property:P2572#P2302. It seems to work out. In any case, I don't mind if it's added by default as a direct statement.
--- Jura 15:38, 27 July 2017 (UTC)
- @Jura1: So is the conclusion that we sometimes use them differently, so it's best to keep both? --99of9 (talk) 02:22, 19 November 2021 (UTC)
Testing on CLI
You can test with:
echo '<<<|1936/007721/06|>>>' | pcregrep --color '\d{4}/\d{6}/\d{2}|$'
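This should print the input line with the matching part highlighted. The trailing |$ alternative matches at the end of any line, so pcregrep prints the line even when the identifier pattern itself does not match: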
$ echo '<<<|1936/007721/06|>>>' | pcregrep --color '\d{4}/\d{6}/\d{2}|$'
<<<|1936/007721/06|>>>
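Where pcregrep is not installed, a rough equivalent can be sketched in Python. Two assumptions apply: Python's `re` module is not PCRE, so only simple patterns like this one can be expected to behave the same, and the sketch uses `re.fullmatch` on the assumption that the pattern is meant to cover the whole identifier value rather than a substring. The helper name `matches_format` is made up for illustration.

```python
import re

# Pattern from the CLI example, without the "|$" trick that is only
# there to make pcregrep print non-matching lines.
PATTERN = r"\d{4}/\d{6}/\d{2}"


def matches_format(value: str, pattern: str = PATTERN) -> bool:
    """Return True if the whole value matches the pattern."""
    return re.fullmatch(pattern, value) is not None
```

With this, `matches_format("1936/007721/06")` is true, while the decorated string `<<<|1936/007721/06|>>>` from the CLI example is rejected because the extra characters are not covered by the pattern.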
Mandatory qualifier regular expression syntax (P4240)?
The approved property proposal for Wikidata:Property_proposal/regular_expression_syntax seems to have the effect that format as a regular expression (P1793) could have a value that conforms to any of a variety of regular expression syntax standards, whether PCREv1, PCREv2, Java, BRE, etc. However, per Help:Property constraints portal/Format, when format as a regular expression (P1793) is used as a property constraint, the expectation appears to be that the syntax should always be Java regular expressions (Q98057029), so that mw:Extension:WikibaseQualityConstraints can understand the syntax correctly. This seems to match the original property proposal for format as a regular expression (P1793), which anticipated that "PCRE" (assumed to be Java regular expressions (Q98057029) rather than an earlier variant) was the only syntax to be used with format as a regular expression (P1793). User:KrBot also uses format as a regular expression (P1793) within property constraints, but User:KrBot expects the syntax to be Perl Compatible Regular Expressions 2 (Q98056596) instead. There are some differences between Java regular expressions (Q98057029) and Perl Compatible Regular Expressions 2 (Q98056596) that could prevent software from parsing format as a regular expression (P1793) values correctly.
How can this best be resolved? Three initial options to discuss:
- Should regular expression syntax (P4240) be a mandatory qualifier on format as a regular expression (P1793) to ensure there is no ambiguity in how to parse the provided regular expression pattern?
- If yes, when format as a regular expression (P1793) is used as a property constraint, should regular expression syntax (P4240) Java regular expressions (Q98057029), regular expression syntax (P4240) Perl Compatible Regular Expressions 2 (Q98056596) or both of these qualifiers be mandatory?
- Note: the intent would be that regular expression syntax (P4240) could accept more than one value (i.e. regular expression pattern conforms with multiple different syntaxes)
- Should the domain of format as a regular expression (P1793) be changed to Java regular expressions (Q98057029), and regular expression syntax (P4240) adjusted so that it cannot be used as a qualifier for format as a regular expression (P1793) (will primarily impact use of format as a regular expression (P1793) on instance of (P31) identifier (Q853614))?
- Should the domain of format as a regular expression (P1793) be changed to Perl Compatible Regular Expressions 2 (Q98056596), and regular expression syntax (P4240) adjusted so that it cannot be used as a qualifier for format as a regular expression (P1793) (will primarily impact use of format as a regular expression (P1793) on instance of (P31) identifier (Q853614))?
--Dhx1 (talk) 15:24, 5 August 2020 (UTC)
Notified participants of WikiProject property constraints
- Just adding that we hope for WikibaseQualityConstraints to evaluate regular expressions more efficiently in the future, which would change the supported syntax again, see T240884. Personally, I still recommend that constraint editors stick to the common subset of most regular expression flavors, and don’t use any “exotic” features. --Lucas Werkmeister (WMDE) (talk) 15:40, 6 August 2020 (UTC)
- There are two ways to use this property; only one of them is for property constraints (format constraints). Since it is used there as a qualifier, I don't think the mandatory qualifier constraint would have much of an effect. --- Jura 20:33, 10 August 2020 (UTC)
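As a rough illustration of the "stick to the common subset" advice above, a pattern could be scanned for constructs that are not handled the same way by every regex flavor. The list below is hypothetical and deliberately short; it is not an authoritative statement of what WikibaseQualityConstraints or KrBot accept, and the name `exotic_features` is made up for this sketch. It is a plain text scan, so it can be fooled by escaped backslashes.

```python
import re

# Hypothetical, non-exhaustive list of constructs that differ between
# regex flavors. Chosen for illustration only.
EXOTIC_CONSTRUCTS = {
    "Unicode property or block class": r"\\[pP]\{",
    "recursion": r"\(\?R\)",
    "Python-style named group": r"\(\?P<",
    "match reset": r"\\K",
}


def exotic_features(pattern: str) -> list:
    """Return the names of listed constructs that appear in the pattern text."""
    return [
        name
        for name, probe in EXOTIC_CONSTRUCTS.items()
        if re.search(probe, pattern)
    ]
```

A pattern such as `\d{4}/\d{6}/\d{2}` yields an empty list, whereas one containing `\p{...}` would be reported for manual review against whichever engine is actually going to evaluate it.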