Free Online Courses for Software Developers - MrBool
PHP Regular Expressions: An Introduction for PHP Compatibility

In this article, we explore regular expressions in PHP and compare the Perl-compatible style with the POSIX style.

Introduction

Regular expressions are one of the most powerful tools in modern programming languages; with them we can build a system for almost any kind of pattern matching. A regular expression is a pattern of characters that is matched against a text. The pattern may be a fixed word that you know in advance, e.g. the word “Mr-bool”, or something more general.

As you know, PHP has various pre-defined string functions, e.g. strstr(), which finds the first occurrence of a fixed string inside another string, or str_replace(), which replaces a fixed string with another one. These functions can only search for exact text. If we want to check that a user has entered a correctly formed email address, or find a line enclosed between the tags <h2></h2>, we need regular expressions.

PHP provides two kinds of regular expressions, each with its own syntax:

  1. Perl-compatible regular expressions (PCRE)
  2. POSIX regular expressions

In this tutorial we use PCRE, because it is the more widely used and more powerful of the two and because, according to the PHP manual, the POSIX regex extension is deprecated as of PHP 5.3.0 (it was removed in PHP 7.0.0).

It is also important to mention that if a job can be done with the ordinary string functions, there is no need for regular expressions: they are more powerful, but slower than the ordinary string functions.
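As a rough illustration of where the line falls, the sketch below uses a plain string function for fixed text and a regular expression only where a real pattern is needed. The sample strings and variable names are ours, chosen for the example.

```php
<?php
$subject = 'Mr-bool is for software developers';

// A fixed word: a plain string function is enough.
$plain   = strstr($subject, 'software');               // "software developers"
$renamed = str_replace('Mr-bool', 'MrBool', $subject); // "MrBool is for software developers"

// A pattern ("any run of digits"): this needs a regular expression.
$hasNumber = preg_match('/\d+/', 'PHP 5.3.0');         // 1

echo $plain, "\n", $renamed, "\n", $hasNumber, "\n";
```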

Differences between POSIX and PCRE

A number of differences exist between POSIX and PCRE. Some of them are:

  1. PCRE functions require the pattern to be enclosed by delimiters, while POSIX functions take the pattern without delimiters.
  2. POSIX has dedicated functions for case-insensitive matching, while PCRE uses a pattern modifier, e.g. “i”.
  3. The most important difference is that PCRE stops at the first valid match, while POSIX looks for the longest match starting at the leftmost position. For example, with the pattern one(self)?(selfsufficient)? on the string oneselfsufficient, PCRE returns oneself, while POSIX returns the complete string, i.e. oneselfsufficient.
  4. The POSIX functions are ereg(), ereg_replace(), eregi(), eregi_replace(), split(), spliti() and sql_regcase(), while the PCRE functions are preg_match(), preg_match_all(), preg_replace(), preg_replace_callback(), preg_split(), preg_grep() and preg_quote().
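The PCRE side of these differences can be tried directly. The following is only a sketch with sample strings of our own; the POSIX functions are deliberately not called, since they are deprecated. It shows the delimiters, the i modifier, and where PCRE stops when optional groups overlap.

```php
<?php
// PCRE patterns are wrapped in delimiters; "/" is used here.
// The trailing "i" modifier takes the place of the separate
// case-insensitive functions of the POSIX extension.
$caseInsensitive = preg_match('/mr-bool/i', 'Welcome to MR-BOOL');

// PCRE accepts the first way of matching that works instead of
// searching for the longest possible match.
preg_match('/one(self)?(selfsufficient)?/', 'oneselfsufficient', $match);

echo $caseInsensitive, "\n"; // 1
echo $match[0], "\n";        // oneself
```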

Before diving into the syntax of regular expressions, we should introduce their building blocks, i.e. meta-characters, quantifiers, character classes and modifiers.

Meta-characters

Meta-characters are special symbols that have a specific meaning when a pattern is defined. These symbols are shown in Table 1.

Table 1. Meta-characters list.

Character   Meaning
\           Escape character
.           Any single character except newline
^           Indicates the beginning of a string
$           Indicates the end of a string
|           Alternatives (or)
{           Start of a quantifier
}           End of a quantifier
(           Start of a sub pattern
)           End of a sub pattern
[           Start of a class
]           End of a class
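A few of these meta-characters in action, as a minimal sketch; the sample words are ours and the comments give the value each call returns.

```php
<?php
// "." matches any single character except a newline.
$dot = preg_match('/b.g/', 'bug');             // 1: "u" fills the dot

// "^" anchors the pattern to the beginning of the string.
$start = preg_match('/^bug/', 'a bug');        // 0: the string starts with "a"

// "|" separates alternatives; "(" and ")" group them into a sub pattern.
$either = preg_match('/^(bug|bool)/', 'bool'); // 1

// "\" escapes a meta-character so that it is matched literally.
$literalDot = preg_match('/3\.14/', '3x14');   // 0: "x" is not a dot

echo $dot, $start, $either, $literalDot, "\n";
```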

Quantifiers

Quantifiers allow us to dictate how many times something can or must appear. Table 2 shows the quantifiers and their use.

Table 2. Quantifiers list.

Character   Meaning
?           0 or 1
*           0 or more
+           1 or more
{2}         Exactly 2 occurrences
{2,5}       Between 2 and 5
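Each quantifier from the table can be checked with a one-line test. The sketch below uses sample strings of our own; the patterns are anchored with ^ and $ so that the whole string has to match, and \d stands for a digit.

```php
<?php
$tests = [
    '/^colou?r$/' => 'color',  // ?     : the "u" may appear 0 or 1 times
    '/^ab*c$/'    => 'ac',     // *     : "b" may appear 0 or more times
    '/^ab+c$/'    => 'abbbc',  // +     : "b" must appear 1 or more times
    '/^\d{2}$/'   => '42',     // {2}   : exactly 2 digits
    '/^\d{2,5}$/' => '12345',  // {2,5} : between 2 and 5 digits
];

// Every pair above matches, so each line ends with 1.
foreach ($tests as $pattern => $subject) {
    echo $pattern, ' on ', $subject, ': ', preg_match($pattern, $subject), "\n";
}

// Six digits are one too many for {2,5}.
$tooLong = preg_match('/^\d{2,5}$/', '123456'); // 0
```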

Character classes

Character classes are used in almost every regular expression. A character class is created by placing characters inside square brackets [ ], and it matches exactly one character out of those listed.

For example, the pattern pa[kis]tan matches paktan, paitan and pastan, but not pakistan, because the class stands for a single character. With a quantifier, pa[kis]+tan, the class may repeat and the pattern does match pakistan.

Table 3 shows some commonly used character classes.

Table 3. Character classes list.

Class            Short code   Meaning
[0-9]            \d           Any digit from 0 to 9
[A-Za-z0-9_]     \w           Any word character (letter, digit or underscore)
[ \f\r\t\n\v]    \s           Any white space character
[^0-9]           \D           Not a digit
[^A-Za-z0-9_]    \W           Not a word character
[^ \f\r\t\n\v]   \S           Not a white space character
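The sketch below, with sample strings of our own, shows that a class stands for exactly one character unless a quantifier follows it, and how the short codes combine with literal characters.

```php
<?php
// A class matches exactly ONE character from the set ...
$single = preg_match('/^pa[kis]tan$/', 'pastan');      // 1
$whole  = preg_match('/^pa[kis]tan$/', 'pakistan');    // 0: "kis" is three characters

// ... unless a quantifier is attached to it.
$repeated = preg_match('/^pa[kis]+tan$/', 'pakistan'); // 1

// Short codes can be mixed with literal characters.
$time    = preg_match('/^\d\d:\d\d$/', '09:30');       // 1
$noSpace = preg_match('/^\S+$/', 'mr bool');           // 0: contains a space

echo $single, $whole, $repeated, $time, $noSpace, "\n";
```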

Modifiers

Modifiers change the way a regular expression works and are written after the closing delimiter of the pattern. Some of them are shown in Table 4.

Table 4. Pattern modifiers list.

Character   Description
i           Performs a case-insensitive search.
m           Changes the meaning of ^ and $. By default ^ matches only at the beginning of the subject string and $ only at its end; with the m modifier they also match at the beginning and end of every line of a multi-line string.
s           Changes the meaning of the dot (.). By default the dot matches any character except a newline; with this modifier it also matches a newline.
x           Ignores white space and comments within the regular expression.

Note that modifiers are case-sensitive, so they must be written in lowercase as shown. PHP has no g (global) modifier as Perl and JavaScript do: to find all occurrences use preg_match_all(), and preg_replace() already replaces all occurrences by default.
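The effect of each modifier is easiest to see by running the same pattern with and without it. This is a small sketch with a two-line sample text of our own; note that in PHP the modifiers are written in lowercase.

```php
<?php
$text = "first line\nSecond line";

// i: case-insensitive search.
$i = preg_match('/second/i', $text);               // 1

// m: ^ and $ also match at the start and end of every line.
$withoutM = preg_match('/^Second/', $text);        // 0: ^ is only the start of the string
$withM    = preg_match('/^Second/m', $text);       // 1

// s: the dot also matches a newline.
$withoutS = preg_match('/first.+Second/', $text);  // 0: the dot cannot cross "\n"
$withS    = preg_match('/first.+Second/s', $text); // 1

// x: white space and # comments inside the pattern are ignored.
$x = preg_match('/ \d{2}  # two digits
                   -      # a hyphen
                   \d{4}  # four digits
                 /x', '55-5555');                  // 1

echo $i, $withoutM, $withM, $withoutS, $withS, $x, "\n";
```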

Perl-compatible regular expressions

Perl provides some of the most comprehensive regular expressions, which can search and replace even the most complicated string patterns; that is why the PHP developers adopted the same syntax for the Perl-style functions.

So let’s start with some PCRE functions and their use.

preg_match()

The preg_match() function is used to check whether a pattern matches a value or not. It is usually called with three arguments, i.e.

preg_match($pattern,$subject,$result)

The function returns 1 if the pattern matches and 0 if it does not. The $result variable is filled with an array that contains the match found, and is left empty if no match was found. Let’s take an example to see how preg_match() works.

Listing 1: Working of preg_match()

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Regular expression</title>
</head>
<body>
<?php
$subject="Mr-bool is for software developers";
$pattern="/\bMR-bool\b/i";
if(preg_match($pattern,$subject,$result)){
print_r($result);}
?>
</body>
</html> 

Figure 1. Listing 1 output.

In Listing 1, we assigned a sentence to the variable $subject. We want to find a particular word in that sentence, so we defined another variable, $pattern, which holds the pattern we are searching for, i.e.

$pattern="/\bMR-bool\b/i";
  • The two “/” characters are the delimiters; every PCRE pattern must be enclosed by delimiters.
  • “\b ... \b” marks word boundaries, i.e. the word is matched only when it stands on its own.
  • “i” after the closing delimiter performs a case-insensitive search.

The third variable, $result, is passed to preg_match() and receives the match that was found.

When we execute the script, the output contains the matched word “Mr-bool”, as shown in Figure 1.
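The return value and the contents of the third argument can be inspected a little further. The sketch below is a variation on Listing 1 in which we added parentheses to the pattern, purely for illustration, to show how sub patterns appear in the result array.

```php
<?php
$subject = 'Mr-bool is for software developers';

// preg_match() returns 1 when the pattern matches and 0 when it does not.
// Parentheses create sub patterns whose text is stored in
// $result[1], $result[2], ... after the full match in $result[0].
$found = preg_match('/\b(Mr)-(bool)\b/i', $subject, $result);

echo $found, "\n"; // 1
print_r($result);  // [0] => Mr-bool, [1] => Mr, [2] => bool

// No match: the return value is 0 and $none is an empty array.
$missing = preg_match('/\bjava\b/i', $subject, $none);
echo $missing, "\n"; // 0
```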

preg_match_all()

The third variable, $result, of preg_match() contains only the first match found. When we want every match in the sentence, we use preg_match_all(). Its syntax is the same as that of preg_match().

preg_match_all($pattern,$subject,$results)

This function stores every match found in $results and returns the number of matches (0 if nothing matched, FALSE on error). So let's update Listing 1 and run it to find all the matches.

Listing 2: Code for returning multiple matches found.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Regular expression</title>
</head>
<body>
<?php
$subject="Mr-bool is for software developers. You will find a number of tutorial courses on mr-bool";
$pattern="/\bMR-bool\b/i";
if(preg_match_all($pattern,$subject,$results)){
print_r($results);}
?>
</body>
</html>

Figure 2. Listing 2 output.

Listing 2 is the same as Listing 1, except that the sentence now contains the word twice and we use the preg_match_all() function to return all the matches found.
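The shape of the array filled by preg_match_all() depends on an optional fourth argument. As a sketch, with sub patterns added by us for illustration, the default order and PREG_SET_ORDER compare as follows.

```php
<?php
$subject = 'Mr-bool is for software developers. You will find courses on mr-bool';

// The return value is the number of complete matches.
$count = preg_match_all('/\b(mr)-(bool)\b/i', $subject, $results);
// Default order (PREG_PATTERN_ORDER): $results[0] holds all full matches,
// $results[1] all first sub patterns, and so on.

// PREG_SET_ORDER groups the data per match instead.
preg_match_all('/\b(mr)-(bool)\b/i', $subject, $sets, PREG_SET_ORDER);

echo $count, "\n";         // 2
echo $results[0][1], "\n"; // mr-bool (second full match)
echo $sets[0][1], "\n";    // Mr (first sub pattern of the first match)
```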

Replacing functions in PCRE

Now that you have learned how to find matches in a string, you can replace the matched text with the preg_replace() function.

Syntax of preg_replace()

preg_replace($pattern,$replacement,$subject)

This function also takes an optional fourth argument that limits the number of replacements made.

To understand how this function works, let’s go to Listing 3.

Listing 3: Working of preg_replace().

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Regular expression</title>
</head>
<body>
<?php
$subject="Pakistan cricket team best players are shahid afridi,junaidkhan, and younaskhan";
$pattern=array("/junaidkhan/","/younaskhan/");
$replacements=array("wasimakram","imrankhan");
echo preg_replace($pattern,$replacements,$subject);
?>
</body>
</html>

Figure 3. Showing preg_replace() working.

In Listing 3, we used three variables: $subject for the input string, $pattern for the patterns to be found in the string, and $replacements for the text that replaces them. Both $pattern and $replacements are arrays, so each pattern is replaced by the replacement at the same position. Finally, the return value of preg_replace() is echoed, which shows the sentence with the replaced words, as in Figure 3.
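Two further features of preg_replace() are worth a quick sketch: the optional limit mentioned above and references to sub patterns in the replacement text. The sample strings are ours.

```php
<?php
$subject = 'bug bug bug';

// Without a limit every match is replaced.
$all = preg_replace('/bug/', 'fix', $subject);    // fix fix fix

// The fourth argument limits the number of replacements.
$one = preg_replace('/bug/', 'fix', $subject, 1); // fix bug bug

// $1, $2, ... in the replacement refer to the sub patterns of the match.
// Single quotes keep PHP from treating them as variables.
$swapped = preg_replace('/(\w+)@(\w+)/', '$2 at $1', 'zohaib@gmail');

echo $all, "\n", $one, "\n", $swapped, "\n";      // last line: gmail at zohaib
```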

preg_grep()

The preg_grep() function returns the array entries that match the pattern. It accepts three arguments:

Syntax:

array preg_grep(string $pattern, array $input [, int $flags = 0])
  • string $pattern: the pattern to search for.
  • array $input: the input array.
  • int $flags: this optional parameter accepts one value, PREG_GREP_INVERT. Passing this flag returns the array elements that do not match the pattern.

To understand how preg_grep() works, let us go to Listing 4.

Listing 4: preg_grep() function.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>preg_grep</title>
</head>
<body>
<?php
$input=array("mr-bool","backingsoda","bugs");
$pattern="/^b/";
$result=preg_grep($pattern,$input);
print_r($result);
?>
</body>
</html> 

Figure 4. Showing output of Listing 4.

In Listing 4 we used two variables, $input and $pattern. The input array is assigned to $input, while $pattern holds the pattern, i.e. $pattern="/^b/";

Here

“/” and “/” are the delimiters of the pattern.

“^b” selects the entries that start with the character b. The result of Listing 4 is shown in Figure 4; note that the matching entries keep their original array keys.
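The PREG_GREP_INVERT flag described above is not used in Listing 4, so here is a short sketch of it next to the normal call, using the same input array.

```php
<?php
$input = array('mr-bool', 'backingsoda', 'bugs');

// Entries that start with "b"; the original array keys are preserved.
$startsWithB = preg_grep('/^b/', $input);
// array(1 => 'backingsoda', 2 => 'bugs')

// PREG_GREP_INVERT returns the entries that do NOT match.
$others = preg_grep('/^b/', $input, PREG_GREP_INVERT);
// array(0 => 'mr-bool')

// array_values() re-indexes the result from 0 when the keys do not matter.
$reindexed = array_values($startsWithB);

print_r($startsWithB);
print_r($others);
```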

preg_replace_callback()

The preg_replace_callback() function is similar to preg_replace(); the only difference is that the replacement is produced by a callback function. Its syntax is

mixed preg_replace_callback(mixed $pattern, callable $callback, mixed $str [, int $limit = -1])

Here in syntax

  • $pattern: the pattern you are looking for.
  • $callback: the function that is called for every match and returns the replacement text.
  • $str: the string you are searching.

The optional $limit parameter specifies the maximum number of replacements that should take place.

Listing 5: working of preg_replace_callback().

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>preg_replace_callback</title>
</head>
<body>
<?php 
$string  = '2015-02-21';

// search pattern
$pattern = '~(\d{4})-(\d{2})-(\d{2})~';

// the function call
$result = preg_replace_callback($pattern, 'callback', $string);

// the callback function
function callback ($matches) {
    print_r($matches);
    return $matches[3].'-'.$matches[2].'-'.$matches[1];
}
echo $result;
?>
</body>
</html>

Figure 5. Showing output of Listing 5.

In Listing 5, we defined two variables: $string for the input string and $pattern for the pattern, i.e.

$pattern = '~(\d{4})-(\d{2})-(\d{2})~';

Here “~” is used as the delimiter, \d matches any digit from 0 to 9, and the number in { } says how many times the digit must be repeated.

A function named “callback” is defined; it receives the match as an array (the full match yyyy-mm-dd followed by the three sub patterns), prints that array, and returns the date rearranged as dd-mm-yyyy.

This function is called by preg_replace_callback(), and the result is assigned to $result, which is then echoed as shown in Figure 5.
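The callback does not have to be a named function. As a sketch of the same date conversion, the variant below passes an anonymous function and uses a longer sample sentence of our own, so that the callback runs once per date.

```php
<?php
$result = preg_replace_callback(
    '~(\d{4})-(\d{2})-(\d{2})~',
    function ($matches) {
        // $matches[0] is the full match, [1] the year, [2] the month, [3] the day.
        return $matches[3] . '-' . $matches[2] . '-' . $matches[1];
    },
    'Released on 2015-02-21, updated on 2015-03-05'
);

echo $result; // Released on 21-02-2015, updated on 05-03-2015
```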

preg_split()

preg_split() splits a string by a regular expression.

Syntax:

array preg_split(string $pattern, string $subject [, int $limit = -1 [, int $flags = 0]])

Here

  • $pattern: the pattern to search for, as a string.
  • $subject: the input string.
  • $limit: if specified, at most $limit substrings are returned; the last one contains the rest of the string.
  • $flags: can be any combination of the following flags:
      • PREG_SPLIT_NO_EMPTY: only non-empty pieces are returned.
      • PREG_SPLIT_DELIM_CAPTURE: parenthesized expressions in the delimiter pattern are captured and returned as well.
      • PREG_SPLIT_OFFSET_CAPTURE: for every returned piece, its offset in the string is returned as well.

Listing 6: Example showing working of preg_split().

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Untitled Document</title>
</head>
<body>
<?php 
$string="welcome to Mr-bool website";
$pattern="/[\s, ]+/";
$result=preg_split($pattern,$string);
print_r($result);?>
</body>
</html>

Figure 6. Showing output of Listing 6.

In Listing 6, we assigned a string to the variable $string and our pattern to the variable $pattern. The pattern [\s, ]+ matches one or more white space characters or commas. Both variables are then passed to preg_split(), and the result is assigned to the variable $result.

The variable $result is passed to print_r() which will print it in array form as shown in Figure 6.
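Listing 6 uses neither the limit nor the flags. The following sketch, with sample strings of our own, shows what each of them changes; the comments give the resulting arrays.

```php
<?php
$string = 'welcome, to  Mr-bool website';

// Split on runs of commas and white space.
$words = preg_split('/[\s,]+/', $string);
// array('welcome', 'to', 'Mr-bool', 'website')

// $limit = 2: at most two pieces, the last one holds the rest of the string.
$firstAndRest = preg_split('/[\s,]+/', $string, 2);
// array('welcome', 'to  Mr-bool website')

// PREG_SPLIT_NO_EMPTY drops empty pieces, e.g. when splitting into characters.
$chars = preg_split('//', 'bool', -1, PREG_SPLIT_NO_EMPTY);
// array('b', 'o', 'o', 'l')

// PREG_SPLIT_DELIM_CAPTURE also returns the parenthesized part of the delimiter.
$parts = preg_split('/(-)/', 'Mr-bool', -1, PREG_SPLIT_DELIM_CAPTURE);
// array('Mr', '-', 'bool')

print_r($words);
```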

preg_quote()

The preg_quote() function inserts a backslash before every character that has a special meaning in regular expression syntax. These special characters are: . \ + * ? [ ^ ] $ ( ) { } = ! < > | : - # (the # character is escaped as of PHP 7.3.0).

Syntax:

string preg_quote(string $str [, string $delimiter = NULL])

Here

  • $str: the input string.
  • $delimiter: specifies the delimiter that is used for the regular expression, so that it is escaped by a backslash as well. This is needed because the delimiter required by the PCRE functions is not one of the special characters listed above. “/” is the most commonly used delimiter.

Listing 7: Example to explain working of preg_quote().

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>preg_quote</title>
</head>
<body>
<?php
$string="this software costs $1000";
echo preg_quote($string,'/');
?>
</body>
</html>

Figure 7. Showing output of Listing 7.

In Listing 7, we assigned a sentence to a string variable and passed it to preg_quote() with the delimiter ‘/’. When the result is echoed, a backslash appears before the special character $, i.e. \$1000, as shown in Figure 7.
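In practice preg_quote() is mostly used when part of a pattern comes from outside the program. The sketch below assumes a hypothetical search term, $search, that happens to contain meta-characters, and builds a pattern that matches it literally.

```php
<?php
// Hypothetical user input that contains meta-characters.
$search  = '$1000 (approx.)';
$subject = 'this software costs $1000 (approx.) per seat';

// Escape the input, including the "/" delimiter, before building the pattern.
$pattern = '/' . preg_quote($search, '/') . '/';

echo $pattern, "\n";                       // /\$1000 \(approx\.\)/
echo preg_match($pattern, $subject), "\n"; // 1
```

Without the call to preg_quote(), the $ and the parentheses would be interpreted as an anchor and a sub pattern, and the search would not find the literal text.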

Practical example of regular expressions:

Validation of email and phone number.

Listing 8: Code for form which will accept email address and mobile number.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Email validation</title>
</head>
<body>

<form method="post" action="submit.php">
<table width="500" border="2">
<tr><td>Email:</td>
 <td><input type="text" name="email"/></td>
 <td>Please provide a valid email address</td>
 </tr>
 <tr><td>Mobile :</td>
 <td><input type="text" name="mobile"/></td>
 <td>55-5555-55</td>
 </tr>
 <tr><td><input type="submit" name="submit" value="submit"/></td>
 </tr>
 </table>
</form>

</body>
</html> 

Figure 8. Showing output of Listing 8.

In Listing 8, we made a form with the method POST and the action “submit.php” for email and mobile number verification. The form is laid out with a <table> tag, and this table consists of three rows: the first row is for the email, the second for the mobile number, and the third contains the ‘Submit’ button.

After you enter values in the text boxes and hit the submit button, the form is submitted to “submit.php” for verification. The code for “submit.php” is shown in Listing 9.

Listing 9: submit.php

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Email validation</title>
</head>
<body>
<?php
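// Read the submitted values; a missing field becomes an empty string.
$email  = isset($_POST['email'])  ? trim($_POST['email'])  : '';
$mobile = isset($_POST['mobile']) ? trim($_POST['mobile']) : '';

// Email: text without @ or spaces, an @, a domain, a dot and a final part.
$pattern="/^[^@ ]+@[^@ ]+\.[^@\. ]+$/";
if (preg_match($pattern, $email)) {
    echo "your email is valid<br>";
} else {
    echo "your email is invalid<br>";
}

// Mobile: two digits, four digits and two digits, separated by hyphens.
$pattern2="/^\d{2}-\d{4}-\d{2}$/";
if (preg_match($pattern2, $mobile)) {
    echo "your mobile number is valid<br>";
} else {
    echo "your mobile number is not valid<br>";
}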
?>
</body>
</html>

In Listing 9 we validate the values entered in the email and mobile number text boxes of the form from Listing 8 (email.html).

Here

trim() strips white space from the beginning and end of a string.

While in

$pattern="/^[^@ ]+@[^@ ]+\.[^@\. ]+$/";
  • ^ matches at the beginning of the string.
  • [^@ ] matches any character except @ and a space, as neither may appear at the beginning of an email address.
  • + means that the preceding class must match one or more times.
  • @ is the literal @ sign, e.g. zohaib@.
  • [^@ ] again matches any character except @ and a space; this is the domain part.
  • + again means one or more of these characters.
  • \. matches the dot in an email address, e.g. in zohaib@gmail.com. The backslash escapes the dot (.), which otherwise has a special meaning in regular expressions.
  • [^@\. ]+ matches one or more characters other than @, dot (.) and a space, so the final part of the address (e.g. com) contains no further @, dot or white space.
  • $ means that the string must end here.

Now you have learnt how to define a pattern for email verification. If the user enters a wrongly formed email address, preg_match() does not match it and returns 0, so the “invalid” message is shown.

Another pattern is defined in Listing 9 for mobile number verification, which is

$pattern2="/^\d{2}-\d{4}-\d{2}$/";

Here

  • ^ matches at the beginning of the string.
  • \d{2} matches two digits.
  • - matches a literal hyphen.
  • \d{4} matches four digits.
  • - matches a second hyphen.
  • \d{2} matches the last two digits.
  • $ means that the string must end here.

Again this pattern is passed to preg_match(); when a wrongly formatted mobile number is entered, the function returns 0 and the “not valid” message is shown.
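The two checks can also be wrapped in small functions so that they are easy to try without the form. This is only a sketch: the function names isValidEmail() and isValidMobile() are hypothetical helpers of ours, while the patterns are the ones from Listing 9. The last line shows PHP's built-in filter_var() as a stricter alternative for email addresses.

```php
<?php
// Hypothetical helper names; the patterns are the ones from Listing 9.
function isValidEmail($email)
{
    return preg_match('/^[^@ ]+@[^@ ]+\.[^@\. ]+$/', $email) === 1;
}

function isValidMobile($mobile)
{
    return preg_match('/^\d{2}-\d{4}-\d{2}$/', $mobile) === 1;
}

var_dump(isValidEmail('zohaib@gmail.com')); // bool(true)
var_dump(isValidEmail('zohaib@gmail'));     // bool(false): no dot after the @
var_dump(isValidMobile('55-5555-55'));      // bool(true)
var_dump(isValidMobile('555-555-55'));      // bool(false): wrong grouping

// filter_var() returns the address on success and false on failure.
var_dump(filter_var('zohaib@gmail.com', FILTER_VALIDATE_EMAIL) !== false); // bool(true)
```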


Figure 9. Showing output of Listing 9.

If you enter an invalid email address and phone number and hit the submit button, the invalid messages are displayed as shown in Figure 10.


Figure 10. Shows display of invalid entry.

Conclusion

In this tutorial, we learnt the basics of regular expressions and discussed the two types that PHP has offered, POSIX and PCRE. After that, we went through the PCRE functions, which are the more powerful of the two. In the last part of the tutorial, we worked through a practical example that uses regular expressions for validation. For further study of regular expressions you may refer to the PHP manual.



Software developer doing B.E in Computer Science at Hamdard university karachi. Have good skills in php, html, javascript and css.
