I’m working on a set of validation classes and am currently building plugins for applying various validation rules. I’ve built the following class for validating a UK postcode:
class PostcodeUk extends abstr\Prop implements iface\Prop
{
    const
        /**
         * Defines the regular expression against which to test postal code
         *
         * @see http://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Validation UK postal code validation rules on Wikipedia
         */
        PATTERN = '/^(GIR 0AA)|(((A[BL]|B[ABDHLNRSTX]?|C[ABFHMORTVW]|D[ADEGHLNTY]|E[HNX]?|F[KY]|G[LUY]?|H[ADGPRSUX]|I[GMPV]|JE|K[ATWY]|L[ADELNSU]?|M[EKL]?|N[EGNPRW]?|O[LX]|P[AEHLOR]|R[GHM]|S[AEGKLMNOPRSTY]?|T[ADFNQRSW]|UB|W[ADFNRSV]|YO|ZE)[1-9]?[0-9]|((E|N|NW|SE|SW|W)1|EC[1-4]|WC[12])[A-HJKMNPR-Y]|(SW|W)([2-9]|[1-9][0-9])|EC[1-9][0-9]) [0-9][ABD-HJLNP-UW-Z]{2})$/';

    /**
     *
     * @return bool True if valid
     * @throws \InvalidArgumentException
     */
    public function isValid ()
    {
        $valid = false;
        $data = $this -> getData ();
        switch (gettype ($data))
        {
            case 'NULL' :
                $valid = true;
                break;
            case 'string' :
                $valid = preg_match (static::PATTERN, $data) > 0;
                break;
            default :
                throw new \InvalidArgumentException (__CLASS__ . ': This property cannot be applied to data of type ' . gettype ($data));
                break;
        }
        return ($valid);
    }
}
The regex defined in PostcodeUk::PATTERN was derived from the one given in Wikipedia’s article on UK postcodes. However, the regex as given also detects valid postcode strings contained within bigger blocks of text. I want it to match a valid postcode exactly, with no preceding or following characters. So (SW1A 0AA) should be passed as valid, but (foobarSW1A 0AA) should not.
I added the anchors to the regex (^ at the start and $ at the end) to try and force it to only accept a string that consists of only a postcode as valid. However, the class still passes postcodes with padding and/or non-postcode strings wrapping it.
What am I doing wrong? I thought adding the anchors would be enough to get the behaviour I wanted.
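Add the anchors around a group that wraps the whole alternation:
^foo|bar$ is not the same as ^(?:foo|bar)$. Alternation binds more loosely than the anchors, so the first pattern matches anything that starts with foo or anything that ends with bar, while the second matches only a string that is exactly foo or bar.
You should also use \z instead of $. A $ allows an optional line break at the end of the string, while \z is a strict end-of-string match.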
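To see the difference in practice, here is a minimal sketch. It uses a deliberately shortened stand-in for the postcode pattern (only GIR 0AA and SW1A postcodes), and the constant names UNGROUPED, GROUPED and STRICT and the helper matches() are made up for this illustration; they are not part of the class in the question.

```php
<?php
// Shortened stand-in patterns, for illustration only (not the full postcode regex).
// UNGROUPED mirrors the question: ^ binds to the first branch only, $ to the last.
const UNGROUPED = '/^GIR 0AA|SW1A [0-9][A-Z]{2}$/';
// GROUPED wraps the alternation so both anchors apply to every branch.
const GROUPED   = '/^(?:GIR 0AA|SW1A [0-9][A-Z]{2})$/';
// STRICT additionally swaps $ for \z to reject a trailing line break.
const STRICT    = '/^(?:GIR 0AA|SW1A [0-9][A-Z]{2})\z/';

// Hypothetical helper: true when the pattern matches the subject.
function matches(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

var_dump(matches(UNGROUPED, 'foobarSW1A 0AA')); // bool(true): second branch has no start anchor
var_dump(matches(UNGROUPED, 'GIR 0AAxyz'));     // bool(true): first branch has no end anchor
var_dump(matches(GROUPED, 'foobarSW1A 0AA'));   // bool(false)
var_dump(matches(GROUPED, 'SW1A 0AA'));         // bool(true)
var_dump(matches(GROUPED, "SW1A 0AA\n"));       // bool(true): $ tolerates a final line break
var_dump(matches(STRICT, "SW1A 0AA\n"));        // bool(false)
var_dump(matches(STRICT, 'SW1A 0AA'));          // bool(true)
```

Applying the same change to PostcodeUk::PATTERN means wrapping everything between the existing ^ and $ in (?: ... ) and replacing the final $ with \z.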