mb_strtok() – A PHP implementation

While developing a web app, I needed PHP's multibyte (mb_*) family of functions. Because I was dealing with Greek characters specifically (encoded as UTF-8, which I always use), I needed a multibyte equivalent of strtok() to tokenize a stream of Greek characters. A quick look through the PHP documentation turned up almost every other function, but nothing that does what I needed, so I decided to write my own version. I'm sharing it with you guys, as Google won't help you either.

function mb_strtok($delimiters, $str = null)
{
	static $pos = 0;    // Character offset into the string, kept between calls.
	static $string = "";
	static $length = 0; // Cached character length of $string.

	// If a new string is passed, reset the static state.
	// The strict comparison means an empty string also starts a new round.
	if ($str !== null)
	{
		$pos = 0;
		$string = $str;
		$length = mb_strlen($string);
	}

	// Initialize the token.
	$token = "";

	while ($pos < $length)
	{
		$char = mb_substr($string, $pos, 1);
		$pos++;

		if (mb_strpos($delimiters, $char) === false)
		{
			$token .= $char;
		}
		elseif ($token !== "")
		{
			// A delimiter ends the current token.
			// Leading and repeated delimiters are skipped, so empty strings are never returned.
			return $token;
		}
	}

	// Return the last token, or false when there are none left.
	return ($token !== "") ? $token : false;
}

On the first call of mb_strtok(), you must pass a string containing the delimiters as the first parameter, and the string to tokenize as the second. Both parameters may be multibyte strings.

Every subsequent call of mb_strtok() must pass only the first parameter, i.e. the string containing the delimiters.

Calling mb_strtok() again with both parameters discards the state of the previous string and starts a new round of tokenization.

You should use this function as you would use strtok(), for example in a while loop. The function returns a boolean false when there are no more tokens to return.
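As a rough illustration of that loop, here is a minimal sketch. It assumes the mbstring extension is loaded and that the input is UTF-8; the sample sentence and the $tokens variable are made up for the example, and the function body is a condensed copy of the one in this post, repeated only so the snippet runs on its own.

```php
<?php
mb_internal_encoding("UTF-8"); // Assumption: all strings are UTF-8.

// Condensed copy of mb_strtok() so that this example is self-contained.
function mb_strtok($delimiters, $str = null)
{
    static $pos = 0, $string = "";
    if ($str !== null) {
        $pos = 0;
        $string = $str;
    }
    $token = "";
    while ($pos < mb_strlen($string)) {
        $char = mb_substr($string, $pos++, 1);
        if (mb_strpos($delimiters, $char) === false) {
            $token .= $char;
        } elseif ($token !== "") {
            return $token;
        }
    }
    return ($token !== "") ? $token : false;
}

$tokens = [];

// First call: delimiters first, then the string to tokenize.
$token = mb_strtok(" ,", "καλημέρα, κόσμε και καλώς ήρθες");

// Compare with !== false, so that a token such as "0" does not end the loop early.
while ($token !== false) {
    $tokens[] = $token;
    // Later calls: delimiters only.
    $token = mb_strtok(" ,");
}

print_r($tokens);
```

The comma followed by a space does not produce an empty token, because consecutive delimiters are skipped.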

You may have noticed that the order of the parameters is reversed compared with strtok(). I did this to keep the code simple and to avoid func_get_args(), which would complicate it.
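For comparison, this is how the built-in strtok() is called: the string comes first and the delimiters second, and later calls pass only the delimiters. A small sketch with plain ASCII input; the sample string and the $words variable are just for illustration.

```php
<?php
$words = [];

// Built-in strtok(): string first, delimiters second.
$word = strtok("alpha beta,gamma", " ,");

while ($word !== false) {
    $words[] = $word;
    // Later calls: delimiters only, as the single argument.
    $word = strtok(" ,");
}

print_r($words);
```

With mb_strtok() the same first call would be written as mb_strtok(" ,", "alpha beta,gamma"), while the later calls look identical.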
