On 15 Feb, 2005, at 9:13, Bèr Kessels wrote:
I would really like it if some regexp guru could give me a hand with creating a single regexp that can be used Drupal-wide.
Below is some code from the _create_re() function in Textile.php in the Textile module. That code is a PHP port of Brad Choate's Perl Textile module. Some/all of this may be helpful.

    // a URL discovery regex. This is from Mastering Regex from O'Reilly.
    // Some modifications by Brad Choate <brad at bradchoate dot com>
    $this->urlre = '(?:
    # Must start out right...
    (?=[a-zA-Z0-9./#])
    # Match the leading part (proto://hostname, or just hostname)
    (?:
        # ftp://, http://, or https:// leading part
        (?:ftp|https?|telnet|nntp)://(?:\w+(?::\w+)?@)?[-\w]+(?:\.\w[-\w]*)+
        |
        (?:mailto:)?[-\+\w]+@[-\w]+(?:\.\w[-\w]*)+
        |
        # or, try to find a hostname with our more specific sub-expression
        (?i: [a-z0-9] (?:[-a-z0-9]*[a-z0-9])? \. )+ # sub domains
        # Now ending .com, etc. For these, require lowercase
        (?-i: com\b
            | edu\b
            | biz\b
            | gov\b
            | in(?:t|fo)\b # .int or .info
            | mil\b
            | net\b
            | org\b
            | museum\b
            | aero\b
            | coop\b
            | name\b
            | pro\b
            | [a-z][a-z]\b # two-letter country codes
        )
    )?

    # Allow an optional port number
    (?: : \d+ )?

    # The rest of the URL is optional, and begins with / . . .
    (?:
        /?
        # The rest are heuristics for what seems to work well
        [^.!,?;:"\'<>()\[\]{}\s\x7F-\xFF]*
        (?:
            [.!,?;:]+ [^.!,?;:"\'<>()\[\]{}\s\x7F-\xFF]+ #\'"
        )*
    )?
    )';
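
For what it's worth, here is a rough sketch of how a pattern written in this style could be applied to a piece of text. This is not taken from Textile.php: the find_urls() helper is made up for illustration, and $urlre here is a trimmed-down stand-in that keeps only the scheme://hostname branch, the optional port, and the trailing-path heuristic. The one thing to watch is that the pattern relies on free-spacing mode, so it needs the "x" modifier and its line breaks have to survive, otherwise the first "#" comment swallows the rest of the expression.

```php
<?php
// Hypothetical, trimmed-down pattern for illustration only. It keeps the
// scheme://hostname branch, the optional port, and the path heuristic.
$urlre = '(?:
    (?:ftp|https?)://[-\w]+(?:\.\w[-\w]*)+    # scheme and hostname
    (?: : \d+ )?                               # optional port number
    (?:
        /?
        [^.!,?;:"\'<>()\[\]{}\s\x7F-\xFF]*
        (?: [.!,?;:]+ [^.!,?;:"\'<>()\[\]{}\s\x7F-\xFF]+ )*
    )?
)';

// Hypothetical helper: returns every substring of $text matched by $urlre.
// "~" is the delimiter because the pattern itself contains "/" and "#".
// The "x" modifier makes PCRE skip the whitespace and "#" comments.
function find_urls($text, $urlre) {
    preg_match_all('~' . $urlre . '~x', $text, $matches);
    return $matches[0];
}

$found = find_urls('See http://drupal.org/node/123, or ftp://example.com.', $urlre);
// $found holds "http://drupal.org/node/123" and "ftp://example.com"
```

Note how the heuristic at the end leaves the trailing comma and full stop out of the matches: punctuation is only accepted when more URL characters follow it.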