And testing with random whitespace:
$text = ' style=" font-style : italic ; font-weight : bold ; " ';
$regex = "\s*([a-z][a-z0-9\-]*)\s*:\s*([a-z][a-z0-9\-]*)\s*;";
if (preg_match_all('/'. $regex .'/i', $text, $matches, PREG_SET_ORDER)) {
  print_r($matches);
}
$style = array();
foreach ($matches as $match) {
  $style[$match[1]] = $match[2];
}
print_r($style);
works:
Array
(
    [0] => Array
        (
            [0] => font-style : italic ;
            [1] => font-style
            [2] => italic
        )

    [1] => Array
        (
            [0] => font-weight : bold ;
            [1] => font-weight
            [2] => bold
        )

)
Array
(
    [font-style] => italic
    [font-weight] => bold
)
Testing with commonly practiced and supported syntactical errors and irregularities:
$text = 'style=font-style:italic;font-weight:bold';
only partly works; the last declaration has no trailing semicolon, so it is dropped:
Array
(
    [0] => Array
        (
            [0] => font-style:italic;
            [1] => font-style
            [2] => italic
        )

)
Array
(
    [font-style] => italic
)
It also doesn't work for values that aren't a single word (numbers, lengths, colours, multiple tokens), and property names starting with '-' lose the leading '-':
$text = 'style=border-width: 0 2em 10px 0; border-left: 1px solid #000; -moz-border-radius: foo;';
outputs
Array
(
    [0] => Array
        (
            [0] => moz-border-radius: foo;
            [1] => moz-border-radius
            [2] => foo
        )

)
Array
(
    [moz-border-radius] => foo
)
This regex deals with those issues (but not the missing trailing semicolon ';'):
$regex = "\s*([a-z\-][a-z0-9\-]*)\s*:\s*([^;]*)\s*;";
I'm not sure what the best way to deal with that is given the context. Perhaps something like
$text = 'font-style:italic;font-weight:bold';
$regex = "\s*([a-z\-][a-z0-9\-]*)\s*:\s*([^;]*)\s*";
// Split on ';' first so a missing trailing semicolon no longer matters.
$styles = explode(';', $text);
$all_matches = array();
foreach($styles as $style) {
  if (preg_match('/'. $regex .'/i', $style, $matches)) {
    print_r($matches);
    $all_matches[] = $matches;
  }
}
$style = array();
foreach ($all_matches as $match) {
  $style[$match[1]] = $match[2];
}
print_r($style);
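Another possibility, offered only as a sketch: keep a single preg_match_all() call and let the regex accept either a ';' or the end of the string after each value. The helper name parse_inline_style() is made up for illustration, and the sketch assumes $text is the bare attribute value (no surrounding style="..." quotes) and that values never contain a ';' themselves.

```php
<?php
// Hypothetical helper, not from the comment above: a sketch that assumes
// $text is the bare style attribute value and that values contain no ';'.
function parse_inline_style($text) {
  // ([^;]*?) is lazy so trailing whitespace is left out of the value;
  // (?:;|$) accepts either a semicolon or the end of the string.
  $regex = '/\s*([a-z\-][a-z0-9\-]*)\s*:\s*([^;]*?)\s*(?:;|$)/i';
  $style = array();
  if (preg_match_all($regex, $text, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $match) {
      $style[$match[1]] = $match[2];
    }
  }
  return $style;
}

print_r(parse_inline_style('font-style:italic;font-weight:bold'));
print_r(parse_inline_style('border-width: 0 2em 10px 0; border-left: 1px solid #000; -moz-border-radius: foo;'));
```

This avoids the explode() step, but it is just as loose as the regexes above about what counts as a valid property name or value.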
Bevan/