How to Extract a Twitter Profile URL (But Not Status URL) with a Regex

I recently added support to my site for replacing Twitter profile URLs with the corresponding @-username, so that a note reads correctly when it is syndicated to Twitter. However, because this lives in the site's theme, it depends entirely on what is being used to render my site. This is less than ideal: if I were to move themes in the future, I would need to reimplement it.

On Thursday, I noticed that the regex I'd been using hadn't quite worked. In this case the URL hadn't been caught, because the regex didn't handle a URL at the end of a line. With both of these reasons in mind, I set out to rewrite it.

My goal was to match only profile URLs, i.e. https://twitter.com/JamieTanna, and not status URLs such as https://twitter.com/JamieTanna/status/1225494506558164992.

The regex I've come up with is:

(https:\/\/twitter\.com\/(?![a-zA-Z0-9_]+\/)([a-zA-Z0-9_]+))

I've implemented this with a negative lookahead, which rejects the whole match if the username is followed by a /, as that would indicate it's a status URL. The outer group captures the full URL, and the inner group captures just the username.
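To show how the lookahead behaves, here is a minimal sketch in Python. Python is my choice for illustration only, not necessarily what the site uses. The name `replace_profile_urls` is hypothetical. The pattern is the one from this post, written without the outer capture group and without the escaped slashes (Python doesn't need them), and with the dot in the hostname escaped.

```python
import re

# Pattern from this post, adapted for Python (an assumption, not the site's
# actual implementation). The negative lookahead rejects any username that is
# immediately followed by "/", which is what a status URL looks like.
PROFILE_URL = re.compile(
    r"https://twitter\.com/(?![a-zA-Z0-9_]+/)([a-zA-Z0-9_]+)"
)


def replace_profile_urls(text: str) -> str:
    """Replace Twitter profile URLs with @-usernames; leave status URLs as-is."""
    return PROFILE_URL.sub(r"@\1", text)


if __name__ == "__main__":
    print(replace_profile_urls("Thanks https://twitter.com/JamieTanna"))
    print(replace_profile_urls(
        "See https://twitter.com/JamieTanna/status/1225494506558164992"
    ))
```

The first call rewrites the profile URL even though it sits at the end of the line, while the second leaves the status URL untouched.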

You can see it in action at regexr.com/4tsfr.

Written by Jamie Tanna.

Content for this article is shared under the terms of the Creative Commons Attribution Non Commercial Share Alike 4.0 International, and code is shared under the Apache License 2.0.

#blogumentation #twitter #regex.
