mb_strtok() – A PHP implementation

While developing a web app, I needed PHP’s multibyte family of functions. Since I was dealing with Greek characters (encoded as UTF-8, as always), I needed a multibyte equivalent of strtok() to tokenize a stream of Greek text. A quick look through the PHP documentation turned up a multibyte version of almost every other string function, but nothing for tokenizing, so I decided to write my own. I’m sharing it with you guys, as Google won’t help you either.

function mb_strtok($delimiters, $str=NULL)
{
	static $pos = 0; // Keep track of the position on the string for each subsequent call.
	static $string = "";

	// If a new string is passed, reset the static parameters.
	if($str!==NULL) // Strict check, so that an empty string also resets the state.
	{
		$pos = 0;
		$string = $str;
	}

	// Initialize the token.
	$token = "";

	// Measure the string once instead of on every iteration.
	$length = mb_strlen($string);

	while ($pos < $length)
	{
		$char = mb_substr($string, $pos, 1);
		$pos++;

		if(mb_strpos($delimiters, $char)===FALSE)
		{
			$token .= $char;
		}
		else
		{
			// Don't return empty strings.
			if($token!="")
				return $token;
		}

	}

	// Check whether there is a last token to return.
	if ($token!="")
	{
		return $token;
	}
	else
	{
		return false;
	}
}

On the first call of mb_strtok(), you must pass a string containing the delimiters as the first parameter, and the string to tokenize as the second. Both parameters may be multibyte strings.

Every subsequent call of mb_strtok() must pass only the first parameter, i.e. the string containing the delimiters.

Calling mb_strtok() again with both parameters discards the state of the previous string and starts a new round of tokenization.

You should use this function as you would use strtok(), for example in a while loop. The function returns a boolean false when there are no more tokens to return.
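As a rough illustration of that loop, here is a minimal sketch. So that the snippet runs on its own, it includes a condensed copy of the function above (wrapped in a function_exists() guard); the sample sentence, the delimiter set and the $tokens array are just my assumptions for the demo.

```php
<?php
// Condensed copy of mb_strtok() from this post, included only so the example is self-contained.
if (!function_exists('mb_strtok')) {
    function mb_strtok($delimiters, $str = null)
    {
        static $pos = 0;
        static $string = "";

        if ($str !== null) {
            $pos = 0;
            $string = $str;
        }

        $token = "";
        $length = mb_strlen($string);

        while ($pos < $length) {
            $char = mb_substr($string, $pos, 1);
            $pos++;

            if (mb_strpos($delimiters, $char) === false) {
                $token .= $char;
            } elseif ($token !== "") {
                return $token;
            }
        }

        return $token !== "" ? $token : false;
    }
}

// Assumption: the script and its strings are UTF-8.
mb_internal_encoding('UTF-8');

$tokens = array();

// First call: delimiters first, then the string to tokenize.
$token = mb_strtok(" ,;", "Καλημέρα κόσμε, τι κάνεις;");

while ($token !== false) {
    $tokens[] = $token;
    // Subsequent calls: delimiters only.
    $token = mb_strtok(" ,;");
}

echo implode("|", $tokens), "\n";
```

Comparing against false with !== matters here: a token such as "0" is a perfectly valid result, and a loose check would end the loop early.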

You may have noticed that the order of the parameters is reversed compared with strtok(). I did this to keep the code simple: with the delimiters first, the string to tokenize can be an ordinary optional second parameter, and there is no need for func_get_args(), which would complicate the code.
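If you would rather have strtok()’s argument order, something along these lines might work without func_get_args(). Treat it as a sketch, not as part of the function above: the name mb_strtok_compat() is hypothetical, and it decides which argument is which by checking whether a second argument was passed.

```php
<?php
// Hypothetical variant: takes its arguments in the same order as strtok().
//   mb_strtok_compat($string, $delimiters)  starts a new tokenization
//   mb_strtok_compat($delimiters)           continues the current one
function mb_strtok_compat($first, $second = null)
{
    static $pos = 0;
    static $string = "";

    if ($second !== null) {
        // Two arguments: the first is the string, the second the delimiters.
        $string = $first;
        $delimiters = $second;
        $pos = 0;
    } else {
        // One argument: it is the delimiter list.
        $delimiters = $first;
    }

    $token = "";
    $length = mb_strlen($string);

    while ($pos < $length) {
        $char = mb_substr($string, $pos, 1);
        $pos++;

        if (mb_strpos($delimiters, $char) === false) {
            $token .= $char;
        } elseif ($token !== "") {
            // Hit a delimiter after collecting characters: the token is complete.
            return $token;
        }
    }

    // Return the last token, or false when nothing is left.
    return $token !== "" ? $token : false;
}

$first = mb_strtok_compat("άλφα,βήτα", ",");
$second = mb_strtok_compat(",");
$third = mb_strtok_compat(",");
```

The price is a slightly less obvious parameter list, since the meaning of the first argument depends on whether the second one is present.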
