DebugPointer

Regex for CIN validation


CIN stands for Corporate Identification Number and is a unique 21-character alphanumeric code assigned to a company by the Ministry of Corporate Affairs (MCA) in India. It identifies the company and appears on the company's registration documents. In this article, let's see how to build a regex for the CIN and how to match CIN values against it.

Regex (short for regular expression) is a powerful tool used for searching and manipulating text. It is composed of a sequence of characters that define a search pattern. Regex can be used to find patterns in large amounts of text, validate user input, and manipulate strings. It is widely used in programming languages, text editors, and command line tools.

Structure of CIN

  • CIN is a 21-character alphanumeric code.
    • It starts with the letter U or L.
    • The next five characters are digits (0-9).
    • The next two characters are letters (A-Z, a-z).
    • The next four characters are digits (0-9).
    • The next three characters are letters (A-Z, a-z).
    • The last six characters are digits (0-9).
  • It must not contain any special characters or whitespace.
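
To make the layout above concrete, here is a minimal Python sketch that walks a candidate string segment by segment without using a regex. The segment labels (`prefix`, `digits_1`, and so on) and the helper name `split_cin` are hypothetical names chosen for this illustration; only the widths and allowed characters come from the list above. Lowercase u and l are accepted on the assumption that case should not matter.

```python
import string

ASCII_LETTERS = set(string.ascii_letters)  # A-Z and a-z only
DIGITS = set(string.digits)                # 0-9 only

# (label, width, allowed characters). Labels are illustrative, not official terms.
CIN_LAYOUT = [
    ("prefix", 1, set("ULul")),   # assumption: lowercase u/l is also accepted
    ("digits_1", 5, DIGITS),
    ("letters_1", 2, ASCII_LETTERS),
    ("digits_2", 4, DIGITS),
    ("letters_2", 3, ASCII_LETTERS),
    ("digits_3", 6, DIGITS),
]


def split_cin(cin):
    """Return a dict of segments if cin follows the layout, otherwise None."""
    # The widths add up to 21, so any other length can be rejected early.
    if len(cin) != sum(width for _, width, _ in CIN_LAYOUT):
        return None
    segments = {}
    pos = 0
    for label, width, allowed in CIN_LAYOUT:
        chunk = cin[pos:pos + width]
        if not all(ch in allowed for ch in chunk):
            return None
        segments[label] = chunk
        pos += width
    return segments


parts = split_cin("U43245ZA3424ERE134343")
print(parts)
```

A regex expresses the same six segments far more compactly, which is why the rest of the article uses one.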

Regex for checking if CIN is valid

Regular Expression-

/^([LUu]{1})([0-9]{5})([A-Za-z]{2})([0-9]{4})([A-Za-z]{3})([0-9]{6})$/gmi

Test string examples for the above regex-

Input String          | Match Output
U12345ASDAS784CDE1234 | does not match
U43245ZA3424ERE134343 | matches
U12345AB6784CDE1234   | does not match
L75645XX2344FFE643322 | matches
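
The results above can be reproduced in code. The following is a minimal Python sketch, not the only way to do it: the helper name `is_valid_cin` is hypothetical, and the pattern is the article's regex written without the `/.../` delimiters. It uses `fullmatch()` instead of the `^` and `$` anchors, since in Python `$` also matches just before a trailing newline.

```python
import re

# The article's pattern without delimiters and anchors; fullmatch() already
# requires the entire string to match.
CIN_RE = re.compile(
    r"([LUu]{1})([0-9]{5})([A-Za-z]{2})([0-9]{4})([A-Za-z]{3})([0-9]{6})",
    # re.ASCII stops IGNORECASE from also accepting a few non-ASCII letters.
    re.IGNORECASE | re.ASCII,
)


def is_valid_cin(value: str) -> bool:
    """Return True if value has the 21-character CIN shape (hypothetical helper)."""
    return CIN_RE.fullmatch(value) is not None


samples = [
    "U12345ASDAS784CDE1234",
    "U43245ZA3424ERE134343",
    "U12345AB6784CDE1234",
    "L75645XX2344FFE643322",
]

for sample in samples:
    verdict = "matches" if is_valid_cin(sample) else "does not match"
    print(f"{sample}: {verdict}")
```

Note that this only checks the shape of the string; it cannot tell whether a CIN has actually been issued to a company.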


Here is a detailed explanation of the above regex-

/^([LUu]{1})([0-9]{5})([A-Za-z]{2})([0-9]{4})([A-Za-z]{3})([0-9]{6})$/gmi

^ asserts position at the start of a line
1st Capturing Group ([LUu]{1})
Match a single character present in the list below [LUu]
{1} matches the previous token exactly one time (meaningless quantifier)
LUu matches a single character in the list LUu (case insensitive)
2nd Capturing Group ([0-9]{5})
Match a single character present in the list below [0-9]
{5} matches the previous token exactly 5 times
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case insensitive)
3rd Capturing Group ([A-Za-z]{2})
Match a single character present in the list below [A-Za-z]
{2} matches the previous token exactly 2 times
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case insensitive)
a-z matches a single character in the range between a (index 97) and z (index 122) (case insensitive)
4th Capturing Group ([0-9]{4})
Match a single character present in the list below [0-9]
{4} matches the previous token exactly 4 times
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case insensitive)
5th Capturing Group ([A-Za-z]{3})
Match a single character present in the list below [A-Za-z]
{3} matches the previous token exactly 3 times
A-Z matches a single character in the range between A (index 65) and Z (index 90) (case insensitive)
a-z matches a single character in the range between a (index 97) and z (index 122) (case insensitive)
6th Capturing Group ([0-9]{6})
Match a single character present in the list below [0-9]
{6} matches the previous token exactly 6 times
0-9 matches a single character in the range between 0 (index 48) and 9 (index 57) (case insensitive)
$ asserts position at the end of a line
Global pattern flags
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
i modifier: insensitive. Case insensitive match (ignores case of [a-zA-Z])
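
To see the capturing groups and flags in action, here is a small Python sketch under a few assumptions: `finditer()` stands in for the `g` flag, `re.MULTILINE` for `m`, and `re.IGNORECASE` for `i`. The `re.ASCII` flag is an addition not present in the original pattern; it keeps case-insensitive matching limited to ASCII letters. The sample text and variable names are made up for illustration.

```python
import re

# The article's pattern without the /.../ delimiters.
PATTERN = r"^([LUu]{1})([0-9]{5})([A-Za-z]{2})([0-9]{4})([A-Za-z]{3})([0-9]{6})$"

# With MULTILINE, ^ and $ match at the start and end of every line.
per_line = re.compile(PATTERN, re.MULTILINE | re.IGNORECASE | re.ASCII)
# Without it, ^ and $ only match at the start and end of the whole string.
whole_string = re.compile(PATTERN, re.IGNORECASE | re.ASCII)

text = """U43245ZA3424ERE134343
U12345AB6784CDE1234
l75645xx2344ffe643322"""

# One tuple of six captured groups per matching line.
matches = [m.groups() for m in per_line.finditer(text)]
for groups in matches:
    print(groups)

# The same text treated as a single string does not match at all.
print(whole_string.search(text))
```

The second line of the sample text is skipped because its last segment has only four digits, and the lowercase third line is accepted because of the case-insensitive flag.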

Hope this article helped you understand how to match a CIN with a regex.