Beginners PHP Tutorial – Regular Expressions

November 10, 2008 at 5:52 pm

Beginners PHP Tutorials – Regular Expression Basics

Regular expressions (regexes) provide a way to perform pattern matching inside of text strings as well as a way to extract subsets of text from within a string. Although the syntax looks complicated, it is really pretty easy once you understand these basics:

  • PHP offers both a Perl-compatible set of regex functions and its own POSIX-style set. This tutorial covers the PHP functions, which include ereg, eregi, ereg_replace and eregi_replace. The functions whose names start with “eregi” are case insensitive; otherwise they behave exactly like their “ereg” counterparts. Note that the ereg family was deprecated in PHP 5.3 and removed in PHP 7.0, so these functions are only available on older installations. The signature of ereg is:

int ereg ( string $pattern , string $string [, array &$regs ] )

The first parameter is the pattern to match. The second is the string to search for the pattern. The third, optional parameter is an array that receives the matched parts of the string. The parameters are the same for the eregi function.
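
To see what the optional third parameter is for, here is a minimal sketch. It assumes a current PHP version, where the ereg family is no longer available, so it uses preg_match, the Perl-compatible counterpart, which takes the same three arguments but needs delimiters around the pattern. The function name splitDate and the sample date are made up for illustration.

```php
<?php
// Hypothetical helper: split a YYYY-MM-DD date into its parts.
// preg_match fills $matches the way ereg fills $regs:
// index 0 holds the whole match, 1..n hold the parenthesized groups.
function splitDate($date)
{
    if (preg_match('/^([0-9]{4})-([0-9]{2})-([0-9]{2})$/', $date, $matches)) {
        return array(
            'year'  => $matches[1],
            'month' => $matches[2],
            'day'   => $matches[3],
        );
    }
    return null; // the string did not match the pattern
}

print_r(splitDate('2008-11-10'));
```

Each pair of parentheses in the pattern becomes one entry in the array, numbered from left to right.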

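  • Regexes work with patterns. The pattern describes what you are looking for inside the string being searched. Throughout this tutorial, patterns are written between forward slashes to show where they start and end. The Perl-compatible functions require such delimiters; the ereg functions take the bare pattern, without the slashes. Here is a very simple pattern: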

/hello/

The pattern above would match the string “hello world”, because that string contains “hello”.

<?php
   $string = 'hello world';
   // ereg takes the pattern without the surrounding slashes
   if (ereg("hello", $string)) {
       echo "Found it!";
   } else {
       echo "Didn’t find it";
   }
   ?>
  • A basic pattern like the one shown above isn’t very useful. In fact, PHP has simpler functions, such as strpos, for checking whether the string “hello” is part of another string. The real power of regexes comes from metacharacters. These are special symbols that trigger the regex engine to perform more complex tasks. The available metacharacters are:
\ | . ( ) [ ] { } ^ $ * + ?

When any of these metacharacters appears within a pattern, PHP treats it as having a special purpose and does not try to match it literally in the target string. If you want to match one of these characters itself, instead of using it as a metacharacter, you have to escape it with a leading backslash.

A pattern to match “hello?” would be written like this:

/hello\?/
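
The difference the backslash makes can be sketched as follows. This example assumes a current PHP version and therefore uses the Perl-compatible preg_match instead of ereg; preg_quote is a built-in that adds the backslashes for you.

```php
<?php
// Unescaped, ? makes the preceding "o" optional, so "hell" alone matches.
var_dump(preg_match('/hello?/', 'hell'));    // int(1)

// Escaped, \? stands for a literal question mark.
var_dump(preg_match('/hello\?/', 'hello'));  // int(0)
var_dump(preg_match('/hello\?/', 'hello?')); // int(1)

// preg_quote escapes every metacharacter in a string.
echo preg_quote('hello?'), "\n";             // hello\?
```

Note the single quotes around the patterns: PHP leaves the backslash alone there, so it reaches the regex engine intact.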

How Each Metacharacter Works

  • The dot (.) is a wildcard. It matches any character in a string. /h.llo/ will match hello, hallo, hillo, etc.
  • The caret and dollar sign (^ and $) are anchoring metacharacters. The ^ matches the beginning of a string and the $ matches the end. So to make sure a string contains exactly “hello” and nothing else, you would write /^hello$/. This match fails on the string “ahellow”, even though that string contains “hello”, because extra characters sit before and after it. To require only that a string starts with h and ends with o, you would write /^h.*o$/.
  • Parentheses ( ) are grouping metacharacters that allow complex patterns to be defined by treating several characters as one unit.
  • * + ? { } These symbols are quantifying metacharacters that define how many times a certain character or group of characters can or must appear: * means zero or more times, + one or more, ? zero or one, and {n} or {n,m} an exact count or a range.
  • The pipe ( | ) is an alternation metacharacter roughly equivalent to the PHP logical “or” operator. It tells the regex engine to match either one group of characters OR the other. It is often used with the grouping metacharacters. With this pattern, the presence of either of the two strings is considered a match: /(hello|howdy)/
  • The square brackets [ ] are character class metacharacters. They tell the regex engine to match any one of the characters contained between the brackets, as in /h[eao]llo/. Ranges of characters are also supported, as in /[a-zA-Z0-9_]/.
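
The metacharacters in this list can be tried out with a short script. This is only a sketch: it assumes a current PHP version and so uses preg_match rather than ereg, and the helper name found is made up for the example.

```php
<?php
// Hypothetical helper: true when $pattern is found in $subject.
function found($pattern, $subject)
{
    return preg_match($pattern, $subject) === 1;
}

var_dump(found('/h.llo/', 'hallo'));              // true: the dot matches any character
var_dump(found('/^hello$/', 'ahellow'));          // false: the anchors reject the extra letters
var_dump(found('/(hello|howdy)/', 'howdy yall')); // true: the second alternative matches
var_dump(found('/h[eao]llo/', 'hillo'));          // false: i is not in the class
var_dump(found('/^(ab)+$/', 'ababab'));           // true: + repeats the whole group
var_dump(found('/^a{2,3}$/', 'aaaa'));            // false: {2,3} allows two or three only
```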

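The Perl-compatible functions also accept the following shortcut symbols, inside or outside of a character class. The ereg functions do not understand them; there you would write a bracket expression such as [0-9] or a POSIX class such as [[:digit:]] instead.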

/\d/  matches any digit
/\D/  matches any non-digit character
/\w/  matches any word character: a-zA-Z0-9_
/\W/  matches any non-word character
/\s/  matches any whitespace character, such as space, tab or newline
/\S/  matches any non-whitespace character

So, for example, if you wanted to create a pattern that matched any 5 digits in a row, you could write this: /\d{5}/ (or [0-9]{5} for the ereg functions).
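
Here is a small sketch of the shortcuts in use. It relies on the Perl-compatible preg_match and preg_replace, since the shortcut symbols come from that syntax; the helper name isFiveDigits is made up for the example.

```php
<?php
// Hypothetical helper: checks a string against the five-digit pattern,
// anchored at both ends. Without the anchors, \d{5} would also match
// five digits buried inside a longer string such as "abc123456".
function isFiveDigits($code)
{
    return preg_match('/^\d{5}$/', $code) === 1;
}

var_dump(isFiveDigits('90210'));  // true
var_dump(isFiveDigits('9021'));   // false: too short
var_dump(isFiveDigits('902101')); // false: too long

// \s with a quantifier: collapse each run of whitespace into one space.
echo preg_replace('/\s+/', ' ', "hello \t\n world"), "\n"; // hello world
```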

String Replacement

Both ereg_replace and eregi_replace use the same pattern matching rules you have already learned. The only difference is that eregi_replace is case insensitive.

string ereg_replace ( string $pattern , string $replacement , string $string )

The first parameter is the regex pattern you want to match. The second parameter is the string that you want to use as the replacement if your match is found. The third parameter is the string you are matching against. The function returns the modified string, if a match was found, or the original string if it was not.

$mystring = ereg_replace("hello", "howdy", "hello world");
echo $mystring;
   // prints: howdy world
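
For readers on a current PHP version, the same replacement can be sketched with preg_replace, the Perl-compatible counterpart. Treat this as an illustration rather than a drop-in translation: the pattern needs delimiters, and case insensitivity is requested with the i modifier instead of a separate function.

```php
<?php
// Case sensitive, like ereg_replace.
echo preg_replace('/hello/', 'howdy', 'hello world'), "\n";  // howdy world

// The i modifier makes the match case insensitive, like eregi_replace.
echo preg_replace('/hello/i', 'howdy', 'Hello world'), "\n"; // howdy world

// No match: the original string comes back unchanged.
echo preg_replace('/hello/', 'howdy', 'Hello world'), "\n";  // Hello world

// Parenthesized groups can be reused in the replacement as $1, $2, ...
echo preg_replace('/(\d{3})-(\d{4})/', '$2-$1', '555-1234'), "\n"; // 1234-555
```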


Summary

Read through this lesson a few times and try to create some of your own examples. This is a great tool for validating strings used in input forms. Here’s an example that validates a North American telephone number that was passed to a routine from a customer registration form.  For this example, the number should be in the format of 999-999-9999. Anything other than that is invalid.

Work your way through the pattern and identify each metacharacter used.

function checkPhone($phone)
   {
      if (ereg('^[2-9]{1}[0-9]{2}-[0-9]{3}-[0-9]{4}$', $phone)) {
         return true;
      } else {
         return false;
      }
   }
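
On a current PHP version, where ereg is not available, the same check could be sketched with preg_match. The function name checkPhonePreg and the sample numbers are made up for this example; the pattern is the one from checkPhone with delimiters added and the redundant {1} dropped.

```php
<?php
// Hypothetical preg_match version of checkPhone.
function checkPhonePreg($phone)
{
    // preg_match returns 1 on a match, 0 on no match and false on error,
    // so comparing against 1 yields a clean boolean.
    return preg_match('/^[2-9][0-9]{2}-[0-9]{3}-[0-9]{4}$/', $phone) === 1;
}

var_dump(checkPhonePreg('555-867-5309'));   // true
var_dump(checkPhonePreg('155-867-5309'));   // false: first digit must be 2-9
var_dump(checkPhonePreg('(555) 867-5309')); // false: wrong format
```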

As an exercise, modify the above regexp to validate a phone number in this format (999) 999-9999.

Look at this pattern, which is written for the Perl-compatible functions and uses # as the delimiter. Can you tell what it is used to match?

("#\d{3}-\d{2}-\d{4}#")
