John Gruber put in some good work to derive and test a regex that extracts URLs from plain text: "An Improved Liberal, Accurate Regex Pattern for Matching URLs". I needed to use it today and found that it takes a bit of care to translate for use in Python, especially with regard to the Unicode characters in the pattern. My Python version is in a Gist, along with a super-simple harness that runs it against Gruber's test page. I'm not entirely sure I've translated the original with 100% fidelity, but it has worked fine for my purposes. I'm open to tweaks or suggestions, and will keep the Gist updated.
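To give a feel for the Unicode care involved, here is a minimal sketch of one way such a translation might look. It is not the Gist's code. The pattern is my own transcription of Gruber's regex and may differ from the published one, so check it against his post before relying on it. The names `TRAILING_EXCLUDED`, `BALANCED_PARENS`, `URL_PATTERN` and `find_urls` are hypothetical. The sketch assumes Python 3, where `str` patterns are Unicode-aware by default; in Python 2 the pattern would also need to be a `unicode` literal compiled with `re.UNICODE`. The typographic quotes and guillemets are written as `\u` escapes so the source file stays pure ASCII.

```python
import re

# Characters that should not end a URL: ASCII punctuation plus the
# typographic characters (guillemets and curly quotes), given as \u
# escapes to avoid source-encoding problems.
TRAILING_EXCLUDED = (
    r"""\s`!()\[\]{};:'".,<>?"""
    + "\u00ab\u00bb\u201c\u201d\u2018\u2019"
)

# A run of balanced parentheses, nested at most two levels deep.
BALANCED_PARENS = r"\(([^\s()<>]+|(\([^\s()<>]+\)))*\)"

# Assumption: transcribed from Gruber's pattern, not copied from the Gist.
URL_PATTERN = re.compile(
    r"\b("
    r"(?:[a-z][\w-]+:(?:/{1,3}|[a-z0-9%])"   # scheme such as http:// or mailto:
    r"|www\d{0,3}[.]"                        # or a www., www1., www2. prefix
    r"|[a-z0-9.\-]+[.][a-z]{2,4}/)"          # or a bare domain followed by a slash
    r"(?:[^\s()<>]+|" + BALANCED_PARENS + r")+"   # body of the URL
    r"(?:" + BALANCED_PARENS                      # must end with balanced parens
    + r"|[^" + TRAILING_EXCLUDED + r"])"          # or a non-punctuation character
    r")",
    re.IGNORECASE,
)


def find_urls(text):
    """Return every URL-like substring of text, in order of appearance."""
    return [match.group(1) for match in URL_PATTERN.finditer(text)]


if __name__ == "__main__":
    sample = "See \u201chttp://example.com/path\u201d and (www.example.org/foo_(bar)), ok."
    for url in find_urls(sample):
        print(url)
```

Building the character class from a separate string keeps the escaped Unicode characters out of the raw-string pieces, where a `\u201c` would otherwise be handed to the regex engine as a literal backslash sequence rather than being decoded by Python itself.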