December 19, 2022 · 6 min read

Regex for HTTPS URL validation

HTTP, or Hypertext Transfer Protocol, is a protocol for transmitting data on the internet. It is the foundation of the World Wide Web, and it is used to transfer data between a web server and a web client (usually a web browser).

An HTTP URL is a type of URL that specifies a resource on the internet using the HTTP protocol. HTTP URLs typically start with "http://" or "https://", and they are used to access web pages and other resources on the internet.

It's worth noting that the "s" in "https://" stands for "secure", which means that the connection between your computer and the server is encrypted to protect the privacy and security of the transmitted data. HTTP URLs that use the "https://" protocol are considered more secure than those that use the "http://" protocol, because they provide an additional layer of protection against eavesdropping and other types of cyber attacks.

In this article, let's look at how to build a regex for HTTPS URLs and how to match a string against it.

Regex (short for regular expression) is a powerful tool used for searching and manipulating text. It is composed of a sequence of characters that define a search pattern. Regex can be used to find patterns in large amounts of text, validate user input, and manipulate strings. It is widely used in programming languages, text editors, and command line tools.
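As a quick illustration of those three uses (finding, validating, manipulating), here is a minimal sketch with Python's built-in re module. The sample text, the example.com and example.org hosts, and the five-digit check are assumptions made up for this example.

```python
import re

text = "Docs: https://example.com and http://example.org"

# Find: collect the scheme of every URL-like substring in the text.
schemes = re.findall(r"\b(https?)://", text)

# Validate: fullmatch succeeds only if the whole input fits the pattern.
is_five_digits = re.fullmatch(r"\d{5}", "12345") is not None

# Manipulate: rewrite every "http://" prefix to "https://".
upgraded = re.sub(r"\bhttp://", "https://", text)

print(schemes)
print(is_five_digits)
print(upgraded)
```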

Structure of a Website HTTPS URL

An HTTPS URL should have the following structure:

  • It should start with https
  • then it has to be followed by ://
  • then it may or may not contain www.
  • then it must be followed by the domain name
  • then it will be followed by a top-level domain (TLD) such as .com, .net, or .io
  • then it can optionally have query parameters
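The structure above can be sketched as a deliberately simplified pattern with one named group per bullet. This is only an illustration, not the full validation regex from this article. The group names, the optional path group, and the helper split_https_url are hypothetical choices made for this sketch.

```python
import re
from typing import Optional

# Simplified, illustrative pattern: one piece per bullet in the list above.
SIMPLE_HTTPS = re.compile(
    r"""
    ^(?P<scheme>https)                       # starts with https
    ://                                      # followed by ://
    (?P<www>www\.)?                          # optional www.
    (?P<domain>[a-z0-9-]+(?:\.[a-z0-9-]+)*)  # domain name, possibly with subdomains
    \.(?P<tld>[a-z]{2,})                     # top-level domain such as com, net, io
    (?P<path>/[^\s?]*)?                      # optional path (an assumption of this sketch)
    (?:\?(?P<query>\S*))?                    # optional query parameters
    $
    """,
    re.IGNORECASE | re.VERBOSE,
)


def split_https_url(url: str) -> Optional[dict]:
    """Return the named parts of the URL, or None if it does not fit the sketch."""
    match = SIMPLE_HTTPS.match(url)
    return match.groupdict() if match else None


print(split_https_url("https://www.example.com/search?q=regex"))
print(split_https_url("debugpointer.com"))
```

A pattern this small misses many valid URLs (ports, IP addresses, internationalized domains), which is why the full regex in this article is much longer.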

Regex for checking if HTTPS URL is valid or not

Regular expression:

/^(?:(?:(?:https):)?\/\/)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z0-9\u00a1-\uffff][a-z0-9\u00a1-\uffff_-]{0,62})?[a-z0-9\u00a1-\uffff]\.)+(?:[a-z\u00a1-\uffff]{2,}\.?))(?::\d{2,5})?(?:[\/?#]\S*)?$/igm

Test string examples for the above regex:

| Input String | Match Output |
| --- | --- |
| .as10 | does not match |
| https://www.google.com | matches |
| #@$some .qwq.eras | does not match |
| https://www.debugpointer.com | matches |
| debugpointer.com | does not match |
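To try these test strings yourself, here is a minimal sketch in Python. It assumes the scheme in the pattern is written as https and the Unicode ranges as \u00a1-\uffff with a single backslash. The g and m flags are left out because each string is checked on its own, and the helper name is_https_url is hypothetical.

```python
import re

# The article's regex, split into commented pieces (assumptions: scheme is
# "https", Unicode ranges use single-backslash \u escapes).
HTTPS_URL = re.compile(
    r"^(?:(?:(?:https):)?//)"                          # optional "https:" then "//"
    r"(?:\S+(?::\S*)?@)?"                              # optional user:password@
    r"(?:"
    r"(?!(?:10|127)(?:\.\d{1,3}){3})"                  # reject 10.x.x.x and 127.x.x.x
    r"(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})"       # reject 169.254.x.x and 192.168.x.x
    r"(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})"  # reject 172.16.x.x to 172.31.x.x
    r"(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])"              # first octet of an IPv4 address
    r"(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}"         # second and third octets
    r"(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))"       # last octet
    r"|"
    r"(?:(?:[a-z0-9\u00a1-\uffff][a-z0-9\u00a1-\uffff_-]{0,62})?[a-z0-9\u00a1-\uffff]\.)+"  # labels
    r"(?:[a-z\u00a1-\uffff]{2,}\.?)"                   # top-level domain
    r")"
    r"(?::\d{2,5})?"                                   # optional port
    r"(?:[/?#]\S*)?$",                                 # optional path, query, fragment
    re.IGNORECASE,
)


def is_https_url(candidate: str) -> bool:
    """Return True if the candidate string matches the pattern above."""
    return HTTPS_URL.match(candidate) is not None


for candidate in [
    ".as10",
    "https://www.google.com",
    "#@$some .qwq.eras",
    "https://www.debugpointer.com",
    "debugpointer.com",
]:
    print(candidate, "->", "matches" if is_https_url(candidate) else "does not match")
```

Because the scheme group is optional in the pattern, a protocol-relative string such as //www.google.com also passes this check. If the https prefix must be present, the ? after the scheme group would have to be removed.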

Here is a detailed explanation of the above regex:

/^(?:(?:(?:https):)?\/\/)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z0-9\u00a1-\uffff][a-z0-9\u00a1-\uffff_-]{0,62})?[a-z0-9\u00a1-\uffff]\.)+(?:[a-z\u00a1-\uffff]{2,}\.?))(?::\d{2,5})?(?:[\/?#]\S*)?$/igm

^ asserts position at the start of a line
Non-capturing group (?:(?:(?:https):)?\/\/)
  Non-capturing group (?:(?:https):)?
    ? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy). Because this group is optional, a string that starts directly with // also passes this step.
    Non-capturing group (?:https)
      https matches the characters https literally (case insensitive)
    : matches the character : with index 58 (hex 3A, octal 72) literally (case insensitive)
  \/ matches the character / with index 47 (hex 2F, octal 57) literally (case insensitive)
  \/ matches the character / with index 47 (hex 2F, octal 57) literally (case insensitive)
Non-capturing group (?:\S+(?::\S*)?@)? (optional credentials in the form user:password@)
  ? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
  \S matches any non-whitespace character (equivalent to [^\r\n\t\f\v ])
  + matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
  Non-capturing group (?::\S*)?
    ? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
    : matches the character : with index 58 (hex 3A, octal 72) literally (case insensitive)
    \S matches any non-whitespace character (equivalent to [^\r\n\t\f\v ])
    * matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
  @ matches the character @ with index 64 (hex 40, octal 100) literally (case insensitive)
Non-capturing group (?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z0-9\u00a1-\uffff][a-z0-9\u00a1-\uffff_-]{0,62})?[a-z0-9\u00a1-\uffff]\.)+(?:[a-z\u00a1-\uffff]{2,}\.?))
  1st Alternative (?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))
    Negative Lookahead (?!(?:10|127)(?:\.\d{1,3}){3})
      Assert that the regex below does not match
      Non-capturing group (?:10|127)
        1st Alternative 10
          10 matches the characters 10 literally (case insensitive)
        2nd Alternative 127
          127 matches the characters 127 literally (case insensitive)
      Non-capturing group (?:\.\d{1,3}){3}
        {3} matches the previous token exactly 3 times
        \. matches the character . with index 46 (hex 2E, octal 56) literally (case insensitive)
        \d matches a digit (equivalent to [0-9])
        {1,3} matches the previous token between 1 and 3 times, as many times as possible, giving back as needed (greedy)
    Negative Lookahead (?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})
      Assert that the regex below does not match
      Non-capturing group (?:169\.254|192\.168)
        1st Alternative 169\.254
        2nd Alternative 192\.168
      Non-capturing group (?:\.\d{1,3}){2}
        {2} matches the previous token exactly 2 times
        \. matches the character . with index 46 (hex 2E, octal 56) literally (case insensitive)
        \d matches a digit (equivalent to [0-9])
    Negative Lookahead (?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})
      Assert that the regex below does not match
      172 matches the characters 172 literally (case insensitive)
      \. matches the character . with index 46 (hex 2E, octal 56) literally (case insensitive)
      Non-capturing group (?:1[6-9]|2\d|3[0-1])
      Non-capturing group (?:\.\d{1,3}){2}
    Non-capturing group (?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])
      1st Alternative [1-9]\d?
      2nd Alternative 1\d\d
      3rd Alternative 2[01]\d
      4th Alternative 22[0-3]
    Non-capturing group (?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}
      {2} matches the previous token exactly 2 times
      \. matches the character . with index 46 (hex 2E, octal 56) literally (case insensitive)
      Non-capturing group (?:1?\d{1,2}|2[0-4]\d|25[0-5])
    Non-capturing group (?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))
      \. matches the character . with index 46 (hex 2E, octal 56) literally (case insensitive)
      Non-capturing group (?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4])
  2nd Alternative (?:(?:[a-z0-9\u00a1-\uffff][a-z0-9\u00a1-\uffff_-]{0,62})?[a-z0-9\u00a1-\uffff]\.)+(?:[a-z\u00a1-\uffff]{2,}\.?)
    Non-capturing group (?:(?:[a-z0-9\u00a1-\uffff][a-z0-9\u00a1-\uffff_-]{0,62})?[a-z0-9\u00a1-\uffff]\.)+
    Non-capturing group (?:[a-z\u00a1-\uffff]{2,}\.?)
Non-capturing group (?::\d{2,5})?
  ? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
  : matches the character : with index 58 (hex 3A, octal 72) literally (case insensitive)
  \d matches a digit (equivalent to [0-9])
Non-capturing group (?:[\/?#]\S*)?
  ? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
  [\/?#] matches a single character from the set: /, ? or #
  \S matches any non-whitespace character (equivalent to [^\r\n\t\f\v ])
$ asserts position at the end of a line
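The three negative lookaheads in the breakdown above are easier to see in isolation. The sketch below copies only the IPv4 alternative of the regex into its own anchored Python pattern and tries a few addresses. The name is_public_ipv4 and the sample addresses are assumptions for this example.

```python
import re

# Only the IPv4 alternative of the article's regex, anchored on both sides.
PUBLIC_IPV4 = re.compile(
    r"^"
    r"(?!(?:10|127)(?:\.\d{1,3}){3})"                  # not 10.x.x.x or 127.x.x.x
    r"(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})"       # not 169.254.x.x or 192.168.x.x
    r"(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})"  # not 172.16.x.x to 172.31.x.x
    r"(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])"              # first octet: 1 to 223
    r"(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}"         # second and third octets: 0 to 255
    r"(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))"       # last octet: 1 to 254
    r"$"
)


def is_public_ipv4(address: str) -> bool:
    """Return True if the address passes the lookaheads and octet ranges."""
    return PUBLIC_IPV4.match(address) is not None


for address in ["8.8.8.8", "10.0.0.1", "127.0.0.1", "192.168.1.10", "172.16.0.5", "172.32.0.5"]:
    print(address, "->", "accepted" if is_public_ipv4(address) else "rejected")
```

Each lookahead inspects the text ahead without consuming it, so a private or loopback address is rejected before the octet groups are even tried.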
Global pattern flags
i modifier: insensitive. Case insensitive match (ignores case of [a-zA-Z])
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
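Here is a minimal sketch of what these flags do, using Python's re module as a stand-in for the /igm suffix. In Python, re.IGNORECASE plays the role of i, re.MULTILINE the role of m, and re.findall returns every match much like g. The short pattern and the a.example and b.example hosts are assumptions chosen to keep the example readable.

```python
import re

text = "https://a.example\nnot a url\nhttps://b.example"
pattern = r"^https://\S+$"  # short stand-in for the full regex

# Without MULTILINE, ^ and $ anchor to the whole string, so nothing matches.
single = re.findall(pattern, text)

# With MULTILINE (the m flag), ^ and $ anchor to each line, and
# findall returns all matches rather than stopping at the first one.
per_line = re.findall(pattern, text, re.MULTILINE)

# IGNORECASE (the i flag) lets an upper-case scheme match as well.
upper = re.findall(pattern, "HTTPS://A.EXAMPLE", re.IGNORECASE)

print(single)
print(per_line)
print(upper)
```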

Hope this article helped you check whether a string is a valid HTTPS URL.

About the Author
Satvik
Entrepreneur
Satvik is a passionate developer turned Entrepreneur. He is fascinated by JavaScript, Operating System, Deep Learning, AR/VR. He has published several research papers and applied for patents in the field as well. Satvik is a speaker in conferences, meetups talking about Artificial Intelligence, JavaScript and related subjects. His goal is to solve complex problems that people face with automation. Related projects can be seen at - [Projects](/projects)