Since beginning work on my DNS Yogi site, I’ve had to write numerous regular expressions to match all sorts of string bits. I quickly ran into problems when I realized that I needed to add support for Unicode characters, since certain TLD registrars support registrations with non-Latin characters.
The main issue is that there are multiple regular expression engines. PHP uses a flavor of the PCRE (Perl Compatible Regular Expressions) engine. Each engine, and each variant of an engine, handles regular expression syntax slightly differently. I needed to find out exactly how the PHP regular expression engine worked, and finding that information was not easy.
I’d have to say that there isn’t a single resource that provides everything needed, and I certainly don’t believe that I can produce and maintain a better one. So, this post will act as a compilation of resources that together provide a robust overview of how PHP handles regular expressions.
- Your first stop should be the PHP Manual’s page on the preg_match function. This page will get you started on how to run regular expressions using PHP. In addition, you should look at the preg_match_all, preg_replace, and preg_split function references so you get a good overview of what each function can do.
- Next, stop by Regular-Expressions.info and read through their page on PHP to get an outsider’s view of how PHP handles regular expressions. If you aren’t familiar with regular expressions, you will learn a great deal about how to build and use them by reading through their Regex Tutorial.
- Now it’s time to really dig in and get dirty on the internals of the PCRE engine that PHP uses. The PCRE Regular Expression Pattern Syntax Reference (PHP preg*) document is an extremely in-depth reference that details the finer points of how the regular expression engine in PHP really works. If you want to know how to build advanced regex patterns in PHP, this is the document for you.
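To give a feel for what these references cover, here is a minimal sketch of Unicode-aware matching with preg_match and preg_split. The function names isPlausibleLabel and isPlausibleDomain are hypothetical helpers of my own, not anything from DNS Yogi or the PHP Manual, and the pattern is only an illustration: real internationalized domain rules are much stricter than this. It assumes the source file and the input strings are UTF-8.

```php
<?php
// Hypothetical helper: rough check of a single domain label.
// Assumption: "plausible" here only means letters, digits, and inner hyphens;
// this is NOT a full IDN validation.
function isPlausibleLabel(string $label): bool
{
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    // \p{L} = any letter, \p{M} = combining mark, \p{N} = any number.
    // \A and \z anchor to the true start and end of the string.
    $pattern = '/\A[\p{L}\p{N}](?:[\p{L}\p{M}\p{N}-]*[\p{L}\p{M}\p{N}])?\z/u';

    // preg_match returns 1 on a match, 0 on no match, false on error
    // (for example, when the subject is not valid UTF-8).
    return preg_match($pattern, $label) === 1;
}

// Hypothetical helper: split a name on dots and check every label.
function isPlausibleDomain(string $domain): bool
{
    // preg_split breaks the string wherever the pattern matches.
    $labels = preg_split('/\./', $domain);
    if ($labels === false) {
        return false;
    }

    foreach ($labels as $label) {
        if (!isPlausibleLabel($label)) {
            return false;
        }
    }

    return true;
}
```

Calling isPlausibleDomain('bücher.example') should accept the name, while a label that starts or ends with a hyphen is rejected. Without the /u modifier, \p{L} would be tested against individual bytes rather than whole characters, which is exactly the kind of engine detail the references above explain.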