Using Teradata SQL: Validating a UK Postcode

Using Teradata SQL: Validating a UK Postcode

This one came about because a set of data I was dealing with had UK postcodes in it. The data came in from a whole variety of sources, and as such, was not very good in quality terms. The system that the data was going to, on the other hand was really, really fussy. Any postcode had to be perfectly formatted, including the space in the middle. So I wrote some SQL that will validate a postcode and ensure it conforms to the current rules as defined by the UK Government in the National Archives. If it doesn't, the code returns {null}. It assumes the input is in a character field called postal_cd, the output is a character field called Postcode.

CASE
  -- The clauses below accept UK Postcodes as defined by the UK Goverment
  --http://webarchive.nationalarchives.gov.uk/+/http:/www.cabinetoffice.gov.uk/govtalk/schemasstandards/e-gif/datastandards/address/postcode.aspx

  WHEN REGEXP_SUBSTR(UPPER(OREPLACE(postal_cd, ' ' , NULL)), '[A-P,R-U,W,Y,Z][A-H,K-Y][0-9][A,B,E,H,M,N,P,R,V,W,X,Y][0-9][A,B,D-H,J,L,N,P-U,W-Z][A,B,D-H,J,L,N,P-U,W-Z]', 1, 1, 'c') IS NOT NULL
    THEN UPPER(SUBSTR(OREPLACE(postal_cd, ' ' , NULL),1,4) || ' ' || SUBSTR(OREPLACE(postal_cd, ' ' , NULL),5,3))

  WHEN REGEXP_SUBSTR(UPPER(OREPLACE(postal_cd, ' ' , NULL)), '[A-P,R-U,W,Y,Z][A-H,K-Y][0-9][0-9][0-9][A,B,D-H,J,L,N,P-U,W-Z][A,B,D-H,J,L,N,P-U,W-Z]', 1, 1, 'c') IS NOT NULL
    THEN UPPER(SUBSTR(OREPLACE(postal_cd, ' ' , NULL),1,4) || ' ' || SUBSTR(OREPLACE(postal_cd, ' ' , NULL),5,3))

  WHEN REGEXP_SUBSTR(UPPER(OREPLACE(postal_cd, ' ' , NULL)), '[A-P,R-U,W,Y,Z][0-9][A-H,J,K,S,T,U,W][0-9][A,B,D-H,J,L,N,P-U,W-Z][A,B,D-H,J,L,N,P-U,W-Z]', 1, 1, 'c') IS NOT NULL
    THEN UPPER(SUBSTR(OREPLACE(postal_cd, ' ' , NULL),1,3) || ' ' || SUBSTR(OREPLACE(postal_cd, ' ' , NULL),4,3))

  WHEN REGEXP_SUBSTR(UPPER(OREPLACE(postal_cd, ' ' , NULL)), '[A-P,R-U,W,Y,Z][A-H,K-Y][0-9][0-9][A,B,D-H,J,L,N,P-U,W-Z][A,B,D-H,J,L,N,P-U,W-Z]', 1, 1, 'c') IS NOT NULL
    THEN UPPER(SUBSTR(OREPLACE(postal_cd, ' ' , NULL),1,3) || ' ' || SUBSTR(OREPLACE(postal_cd, ' ' , NULL),4,3))

  WHEN REGEXP_SUBSTR(UPPER(OREPLACE(postal_cd, ' ' , NULL)), '[A-P,R-U,W,Y,Z][0-9][0-9][0-9][A,B,D-H,J,L,N,P-U,W-Z][A,B,D-H,J,L,N,P-U,W-Z]', 1, 1, 'c') IS NOT NULL
    THEN UPPER(SUBSTR(OREPLACE(postal_cd, ' ' , NULL),1,3) || ' ' || SUBSTR(OREPLACE(postal_cd, ' ' , NULL),4,3))

  WHEN REGEXP_SUBSTR(UPPER(OREPLACE(postal_cd, ' ' , NULL)), '[A-P,R-U,W,Y,Z][0-9][0-9][A,B,D-H,J,L,N,P-U,W-Z][A,B,D-H,J,L,N,P-U,W-Z]', 1, 1, 'c') IS NOT NULL
  THEN UPPER(SUBSTR(OREPLACE(postal_cd, ' ' , NULL),1,2) || ' ' || SUBSTR(OREPLACE(postal_cd, ' ' , NULL),3,3))

  ELSE NULL
END AS Postcode,

"The combination of hard work and smart work is efficient work" - Robert Half