[development] preg_match bug or regex help needed?

Doug Green douggreen at douggreenconsulting.com
Mon Dec 31 04:27:31 UTC 2007


While this response wasn't the exact solution I was looking for, I want
to say "Thank You". 

When I tried to simplify the problem for the devel list, I got a couple
of things wrong: the double ++ was just a typo, and I also left off the
style= part of the regex, which is what made Bevan's solution possible.
But this convinced me to at least put something in code, even if it
takes a few extra lines.

BTW, this is for http://drupal.org/project/nicedit.  If anyone else is
not quite satisfied with our WYSIWYG editor options, please join me in
working on this new editor option for Drupal.  There are a couple of
open problems documented on the project page, but I think that this is
getting close to usable.

Bevan Rudge wrote:
> And testing with random whitespace;
>
>   
>> $text = ' style="  font-style  : italic  ; font-weight  : bold  ;  " ';
>> $regex = "\s*([a-z][a-z0-9\-]*)\s*:\s*([a-z][a-z0-9\-]*)\s*;";
>> if (preg_match_all('/'. $regex .'/i', $text, $matches, PREG_SET_ORDER)) {
>>   print_r($matches);
>> }
>> $style = array();
>> foreach ($matches as $match) {
>>   $style[$match[1]] = $match[2];
>> }
>> print_r($style);
>>     
>
>
> works
>
> Array
>   
>> (
>>     [0] => Array
>>         (
>>             [0] =>   font-style  : italic  ;
>>             [1] => font-style
>>             [2] => italic
>>         )
>>
>>     [1] => Array
>>         (
>>             [0] =>  font-weight  : bold  ;
>>             [1] => font-weight
>>             [2] => bold
>>         )
>>
>> )
>> Array
>> (
>>     [font-style] => italic
>>     [font-weight] => bold
>> )
>>     
>
>
> Testing with commonly practiced and supported syntactical errors and
> irregularities;
>
>   
>> $text = 'style=font-style:italic;font-weight:bold';
>>
>>     
> doesn't all work
>
>   
>> Array
>> (
>>     [0] => Array
>>         (
>>             [0] => font-style:italic;
>>             [1] => font-style
>>             [2] => italic
>>         )
>> )
>> Array
>> (
>>     [font-style] => italic
>> )
>>     
>
>
> It also doesn't work for non-textual properties, and property names starting
> with '-' exclude the '-';
>
> $text = 'style=border-width: 0 2em 10px 0; border-left: 1px solid #000; -moz-border-radius: foo;';
>
> outputs
>
>   
>> Array
>> (
>>     [0] => Array
>>         (
>>             [0] => moz-border-radius: foo;
>>             [1] => moz-border-radius
>>             [2] => foo
>>         )
>>
>> )
>> Array
>> (
>>     [moz-border-radius] => foo
>> )
>>
>>     
> This regex deals with those issues (but not the missing trailing semicolon
> ';'):
>
> $regex = "\s*([a-z\-][a-z0-9\-]*)\s*:\s*([^;]*)\s*;";
>
> I'm not sure what the best way to deal with that is given the context.
> Perhaps something like
>
>   
>> $text = 'font-style:italic;font-weight:bold';
>> $regex = "\s*([a-z\-][a-z0-9\-]*)\s*:\s*([^;]*)\s*";
>> $styles = explode(';', $text);
>> $all_matches = array();
>> foreach($styles as $style) {
>>   if (preg_match('/'. $regex .'/i', $style, $matches)) {
>>     print_r($matches);
>>     $all_matches[] = $matches;
>>   }
>> }
>> $style = array();
>> foreach ($all_matches as $match) {
>>   $style[$match[1]] = $match[2];
>> }
>> print_r($style);
>>     
>
>
> Bevan/
>
>   
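For the archive, here is a minimal sketch of the explode-based idea quoted
above, folded into one function.  The function name parse_inline_style() is
made up for illustration and is not part of nicedit or Drupal.  It assumes
the input is a single style attribute, optionally with the leading style=
and the surrounding quotes still attached, and it trims whitespace from each
value so that a missing trailing semicolon no longer matters.

```php
<?php
/**
 * Hypothetical helper (name is an assumption): parse an inline style
 * declaration list into an array of property => value.
 */
function parse_inline_style($text) {
  // Drop an optional leading "style=" and any surrounding quotes/whitespace.
  $text = preg_replace('/^\s*style\s*=\s*/i', '', $text);
  $text = trim($text, " \t\n\r\"'");

  $style = array();
  foreach (explode(';', $text) as $declaration) {
    // The property name may start with '-' (e.g. -moz-border-radius).
    // The value is everything after the first colon, minus outer whitespace.
    if (preg_match('/^\s*(-?[a-z][a-z0-9\-]*)\s*:\s*(.*?)\s*$/is', $declaration, $match)) {
      if ($match[2] !== '') {
        $style[strtolower($match[1])] = $match[2];
      }
    }
  }
  return $style;
}

// Example usage with one of the test strings from the quoted message.
print_r(parse_inline_style('style=font-style:italic;font-weight:bold'));
```

Splitting on ';' keeps the regex simple, but it will mis-split any value
that itself contains a semicolon (a quoted string or a data: URL, for
example), so treat this as a starting point rather than a full CSS parser.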


