Property talk:P1793


Documentation

format as a regular expression
regex describing an identifier or a Wikidata property. When used in property constraints, ensure the syntax is PCRE.
Represents: regular expression (Q185612)
Data type: String
Domain: unique identifier (Q6545185), Wikidata property for an identifier (Q19847637), Wikidata property related to classification schemes (Q108566342), Wikidata property to link to Commons (Q18610173) or Wikidata property with datatype string that is not an external identifier (Q21099935)
Allowed values:
  .+
  (?!.*A-z).+
Usage notes: When used in a property constraint, it should contain a PCRE2 regular expression, treated as a "full match". For example, "(?i)[a-z]+|\d+|" means that the validated string should match /^(?:[a-z]+|\d+)$/i or be set to "no value" or "unknown value".
Examples:
  ISO 639-3 (Q845956): [a-z]{3}
  NTFS (Q183205): [^\00\/]{1,255}
  IMDb ID (P345): ev\d{7}\/\d{4}(-\d)?|(ch|co|ev|nm|tt)\d{7}
  Laws & Regulations Database of the Republic of China ID (P7242): [A-Z0-9]{8}
Formatter URL: https://regex101.com/?regex=$1
See also: regular expression syntax (P4240), format as language specific regular expression (P8770), syntax clarification (P2916), applies if regular expression matches (P8460), Smithsonian trinomial format regex (P9790)
Current uses
Total: 15,713
Main statement: 6,438 (41% of uses)
Qualifier: 9,274 (59% of uses)
Reference: 1 (<0.1% of uses)
Single value: this property generally contains a single value. (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P1793#Single value, SPARQL
Format “.+”: value must be formatted using this pattern (PCRE syntax). (Help)
List of violations of this constraint: Database reports/Constraint violations/P1793#Format, hourly updated report, SPARQL
Allowed entity types are Wikibase item (Q29934200), Wikibase property (Q29934218): the property may only be used on a certain entity type (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P1793#Entity types
Scope is as main value (Q54828448), as qualifier (Q54828449): the property must be used only in the specified way (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P1793#Scope, SPARQL
Format “(?!.*A-z).+”: value must be formatted using this pattern (PCRE syntax). (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P1793#Format, SPARQL
 
Find regexes with common issues
Violations query:

# Find regexes with the following issues:
# - unnecessary anchor characters (^ and $)
# - separator in character class ([a|b])
SELECT DISTINCT ?item #?cst
  ?itemLabel ?type ?issue #?re
WITH {
  SELECT * WHERE {
    {
      ?item a wikibase:Property .
      ?item p:P1793 ?cst .
      ?cst a wikibase:BestRank .
      ?cst ps:P1793 ?re .
      BIND("statement" AS ?type)
    } UNION {
      ?item a wikibase:Property .
      ?item p:P2302 ?cst .
      ?cst a wikibase:BestRank .
      ?cst ps:P2302 wd:Q21502404 .
      ?cst pq:P1793 ?re .
      BIND("constraint" AS ?type)
    }
    BIND(xsd:integer(SUBSTR(STR(?item), 33)) AS ?pid)
  } #LIMIT 1000
} AS %i
WHERE {
  {
    INCLUDE %i
    FILTER(STRSTARTS(?re, "^") || STRENDS(?re, "$"))
    BIND("anchor characters" AS ?issue)
  } UNION {
    INCLUDE %i
    FILTER(REGEX(?re, "\\[[^\\]]*\\|[^\\]]*\\]"))
    BIND("separator in character class" AS ?issue)
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
ORDER BY DESC(?pid)
List of violations of this constraint: Database reports/Complex constraint violations/P1793#Find regexes with common issues
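
The two checks in the violations query can also be run locally against a single pattern before saving it. The following is a minimal Python sketch, not part of the report tooling: the function name find_issues is hypothetical, and the character-class test reuses the REGEX() filter from the query as-is, so it shares that heuristic's blind spots (for example, an escaped \] inside a class).

```python
import re

# Same pattern as the REGEX() filter in the query: a "|" somewhere inside [...].
SEPARATOR_IN_CLASS = re.compile(r"\[[^\]]*\|[^\]]*\]")


def find_issues(pattern: str) -> list[str]:
    """Return the issue labels the query would report for one regex string."""
    issues = []
    # Format constraints are treated as full matches, so leading "^" and
    # trailing "$" anchors are unnecessary.
    if pattern.startswith("^") or pattern.endswith("$"):
        issues.append("anchor characters")
    # "[a|b]" matches a literal "|" as well; "[ab]" or "(a|b)" was likely meant.
    if SEPARATOR_IN_CLASS.search(pattern):
        issues.append("separator in character class")
    return issues


if __name__ == "__main__":
    for candidate in ["^[a-z]{3}$", "[a|b]", "[a-z]{3}"]:
        print(candidate, find_issues(candidate))
```

A pattern that passes both checks is not necessarily correct; the checks only catch the two mistakes named in the query's comments.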
 

Please notify projects that use this property before big changes (renaming, deletion, merge with another property, etc.)

For uses on properties (not items)

Please see Property_talk:P2302#format_as_a_regular_expression_.28P1793.29. --- Jura 20:43, 19 November 2015 (UTC)

Not for use in Lua

Unfortunately, the regexp format used here is not compatible with Lua's pattern syntax. --Jarekt (talk) 15:43, 16 May 2016 (UTC)

The tool does not work with regex containing Unicode blocks syntax

And why do we use some external tool to begin with? --Base (talk) 00:26, 16 June 2016 (UTC)

Duplication on properties

Currently, most properties have this twice: once as a qualifier in a constraint (new usage), once directly as a statement.

It seems fairly clear that we don't need both. Any opinions or votes?
--- Jura 20:26, 19 July 2017 (UTC)

I don't think this will cause a problem. We have only one intended format but may have additional format constraints to prevent unwanted values (like image (P18)).--GZWDer (talk) 16:59, 23 July 2017 (UTC)
So what's your plan to keep them in sync? Which one should get the "syntax clarification" qualifier?
--- Jura 11:42, 26 July 2017 (UTC)
@Laddo, Ivan A. Krestinin, Lucas Werkmeister (WMDE): as the conversion is done, is there anything planned?
--- Jura 11:46, 26 July 2017 (UTC)
No plans on my end… I’d suggest removing the main statements, unless someone else uses those. --Lucas Werkmeister (WMDE) (talk) 12:25, 26 July 2017 (UTC)
  • OK to remove the main statement;
  1. however, this property was used to fill property documentation field "Allowed values" and we'll need to revisit Module:Property documentation.
  2. We can't move qualifier syntax clarification (P2916) to the constraints section, obviously, but its values would fit well as the new content for "Allowed values" (for an example, see syntax clarification (P2916) on IMSLP ID (P839)).
-- LaddΩ chat ;) 14:28, 26 July 2017 (UTC)
I tried the qualifier P2916 on the constraints at Property:P2572#P2302. It seems to work out. In any case, I don't mind if it's added by default as a direct statement.
--- Jura 15:38, 27 July 2017 (UTC)
@Jura1: So is the conclusion that we sometimes use them differently, so it's best to keep both? --99of9 (talk) 02:22, 19 November 2021 (UTC)

Testing on CLI

You can test with:

echo '<<<|1936/007721/06|>>>' | pcregrep --color '\d{4}/\d{6}/\d{2}|$'

This should give the following output:

$ echo '<<<|1936/007721/06|>>>' | pcregrep --color '\d{4}/\d{6}/\d{2}|$'
<<<|1936/007721/06|>>>
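
For a quick check of the "full match" behaviour described in the usage notes, Python can serve as a rough stand-in. This is only a sketch: Python's re module is neither PCRE2 nor the engine used by WikibaseQualityConstraints, so it is reliable only for patterns that stay within the common subset of regex flavours, and is_valid is a hypothetical helper name.

```python
import re


def is_valid(pattern: str, value: str) -> bool:
    """Approximate a format constraint: the whole value must match the pattern."""
    # re.fullmatch behaves as if the pattern were wrapped in ^(?:...)$,
    # which is how the usage notes say constraint patterns are treated.
    return re.fullmatch(pattern, value) is not None


if __name__ == "__main__":
    pattern = r"\d{4}/\d{6}/\d{2}"
    print(is_valid(pattern, "1936/007721/06"))          # whole value matches
    print(is_valid(pattern, "<<<|1936/007721/06|>>>"))  # extra text around it
```

Unlike the pcregrep command, which searches within the line and highlights the matching part, this check rejects any value with text before or after the match.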

Mandatory qualifier regular expression syntax (P4240)?

The approved property proposal at Wikidata:Property_proposal/regular_expression_syntax seems to have the effect that format as a regular expression (P1793) could have a value that conforms to any of a variety of regular expression syntaxes, whether PCREv1, PCREv2, Java, BRE, etc. However, per Help:Property constraints portal/Format, when format as a regular expression (P1793) is used as a property constraint, the expectation appears to be that the syntax is always Java regular expressions (Q98057029), so that mw:Extension:WikibaseQualityConstraints can understand it correctly. This seems to match the original property proposal for format as a regular expression (P1793), which anticipated that "PCRE" (assumed to be Java regular expressions (Q98057029) rather than an earlier variant) would be the only syntax used with this property. User:KrBot also uses format as a regular expression (P1793) within property constraints, but expects the syntax to be Perl Compatible Regular Expressions 2 (Q98056596) instead. There are some differences between Java regular expressions (Q98057029) and Perl Compatible Regular Expressions 2 (Q98056596) that could prevent software from parsing format as a regular expression (P1793) values correctly.

How can this best be resolved? Three initial options to discuss:

--Dhx1 (talk) 15:24, 5 August 2020 (UTC)

  Notified participants of WikiProject property constraints

Just adding that we hope for WikibaseQualityConstraints to evaluate regular expressions more efficiently in the future, which would change the supported syntax again, see T240884. Personally, I still recommend that constraint editors stick to the common subset of most regular expression flavors, and don’t use any “exotic” features. --Lucas Werkmeister (WMDE) (talk) 15:40, 6 August 2020 (UTC)
  • There are two ways to use this property: only one is for property constraints (format constraints). As there it's used as a qualifier, I don't think the mandatory qualifier constraint would have much of an effect. --- Jura 20:33, 10 August 2020 (UTC)