DebugPointer

Regex for India PAN Number validation

A PAN (Permanent Account Number) is a unique ten-character alphanumeric identifier issued by the Indian Income Tax Department to anyone who applies for it, individuals and entities alike. In this article, let's see how to create a regex for the India PAN number and how to match PAN card numbers against it.

Regex (short for regular expression) is a powerful tool used for searching and manipulating text. It is composed of a sequence of characters that define a search pattern. Regex can be used to find patterns in large amounts of text, validate user input, and manipulate strings. It is widely used in programming languages, text editors, and command line tools.

Structure of PAN numbers

The PAN (or PAN number) is a ten-character long alpha-numeric unique identifier.

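The PAN structure is as follows. Example of the overall layout: AEEEJ2124A (five letters, four numerals, one letter). The example illustrates the layout only; its fourth character is not one of the holder-type letters listed below.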

  • The first five characters are letters (in uppercase by default), followed by four numerals, and the last (tenth) character is a letter.
  • The first three characters are a sequence of letters from AAA to ZZZ.
  • The fourth character identifies the type of holder of the card. Each holder type is uniquely defined by a letter from the list below:
    • A — AOP (Association of persons)
    • B — BOI (Body of individuals)
    • C — Company
    • F — Firm
    • G — Government
    • H — HUF (Hindu Undivided Family)
    • L — Local authority
    • J — Artificial juridical person
    • P — Person (Individual)
    • T — Trust (AOP)
  • The fifth character of the PAN is the first character of either:
    • the surname (last name) of the person, in the case of a "personal" PAN card, where the fourth character is "P", or
    • the name of the entity, trust, society, or organisation, in the case of a company/HUF/firm/AOP/trust/BOI/local authority/artificial juridical person/government, where the fourth character is "C", "H", "F", "A", "T", "B", "L", "J", or "G".
  • The last (tenth) character is an alphabetic check character, used as a checksum to verify the validity of the code.
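The structure above can be sketched as a small lookup in Python. This is only an illustration: the names `HOLDER_TYPES` and `describe_pan` are hypothetical, and the helper assumes its input already follows the ten-character layout. It does not verify the check character.

```python
# Holder types keyed by the fourth PAN character, copied from the list above.
HOLDER_TYPES = {
    "A": "AOP (Association of persons)",
    "B": "BOI (Body of individuals)",
    "C": "Company",
    "F": "Firm",
    "G": "Government",
    "H": "HUF (Hindu Undivided Family)",
    "L": "Local authority",
    "J": "Artificial juridical person",
    "P": "Person (Individual)",
    "T": "Trust (AOP)",
}


def describe_pan(pan: str) -> dict:
    """Split a ten-character PAN into the parts described above.

    Hypothetical helper: it only slices the string and looks up the
    holder type; it does not validate the PAN.
    """
    if len(pan) != 10:
        raise ValueError("a PAN must be exactly ten characters long")
    return {
        "series": pan[:3],                        # first three letters
        "holder_type": HOLDER_TYPES.get(pan[3]),  # None if the letter is not listed
        "name_initial": pan[4],                   # fifth character
        "number": pan[5:9],                       # four numerals
        "check_character": pan[9],                # tenth character
    }


print(describe_pan("ASSPQ2423A"))
```

For the sample `ASSPQ2423A`, the fourth character is `P`, so the lookup reports an individual holder.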

Conditions to match PAN card number

Let's look at the conditions a PAN number has to satisfy for its format to be valid:

  • It should be ten characters long.
  • The first five characters should be uppercase letters.
  • The next four characters should be digits from 0 to 9.
  • The last (tenth) character should be an uppercase letter.
  • It should not contain any whitespace.
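Before turning these conditions into a regex, it can help to see them as explicit checks. The following Python sketch is one way to write them; `check_pan_conditions` is a hypothetical name, and the function checks format only, not whether the PAN was actually issued.

```python
import string

UPPERCASE = set(string.ascii_uppercase)
DIGITS = set(string.digits)


def _all_in(chars: str, allowed: set, expected_len: int) -> bool:
    # True only if `chars` has the expected length and every character is allowed.
    return len(chars) == expected_len and all(c in allowed for c in chars)


def check_pan_conditions(pan: str) -> list:
    """Return the list of conditions that `pan` fails (empty if it passes all)."""
    problems = []
    if len(pan) != 10:
        problems.append("must be ten characters long")
    if not _all_in(pan[:5], UPPERCASE, 5):
        problems.append("the first five characters must be uppercase letters")
    if not _all_in(pan[5:9], DIGITS, 4):
        problems.append("the next four characters must be digits")
    if not _all_in(pan[9:10], UPPERCASE, 1):
        problems.append("the tenth character must be an uppercase letter")
    if any(c.isspace() for c in pan):
        problems.append("must not contain whitespace")
    return problems


print(check_pan_conditions("ASSPQ2423A"))
print(check_pan_conditions("APOPQ24233"))
```

Returning the failed conditions rather than a bare true/false makes it easier to tell a user what to correct.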

Regex for checking if PAN number is valid or not

Regular expression:

/^[A-Z]{5}[0-9]{4}[A-Z]{1}$/gm

Test string examples for the above regex:

Input String    Match Output
APOPQ24233      does not match
ASSPQ2423A      matches
AP22Q2423A      does not match
$$OP^^423A      does not match
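The same four test strings can be run through the pattern in Python. This is a minimal sketch, not the only way to do it: `is_valid_pan_format` is a hypothetical name, and it uses `re.fullmatch` in place of the `^` and `$` anchors and the `g`/`m` flags, since a single input string is being validated. The redundant `{1}` is omitted, which does not change what the pattern matches.

```python
import re

# Same character classes and counts as the article's regex.
# fullmatch() requires the entire string to match, so no anchors are needed.
PAN_PATTERN = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")


def is_valid_pan_format(candidate: str) -> bool:
    """Return True if `candidate` has the PAN layout (format check only)."""
    return PAN_PATTERN.fullmatch(candidate) is not None


for sample in ["APOPQ24233", "ASSPQ2423A", "AP22Q2423A", "$$OP^^423A"]:
    verdict = "matches" if is_valid_pan_format(sample) else "does not match"
    print(f"{sample}: {verdict}")
```

Only `ASSPQ2423A` is reported as a match.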

Here is a detailed explanation of the above regex:

/^[A-Z]{5}[0-9]{4}[A-Z]{1}$/gm

^ asserts position at the start of a line
[A-Z] matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
{5} matches the previous token exactly 5 times
[0-9] matches a single character in the range between 0 (index 48) and 9 (index 57)
{4} matches the previous token exactly 4 times
[A-Z] matches a single character in the range between A (index 65) and Z (index 90) (case sensitive)
{1} matches the previous token exactly one time (a redundant quantifier, since a token matches once by default)
$ asserts position at the end of a line
Global pattern flags:
g modifier: global. Returns all matches instead of stopping after the first match
m modifier: multi line. Causes ^ and $ to match the beginning/end of each line (not only the beginning/end of the string)
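The effect of the g and m flags can be sketched in Python, where `findall` plays the role of the global flag and `re.MULTILINE` corresponds to the multi-line flag. The function name `find_pans_by_line` and the sample text are assumptions made for this illustration.

```python
import re

# The article's pattern, compiled with MULTILINE so ^ and $ match at each line.
PAN_LINE = re.compile(r"^[A-Z]{5}[0-9]{4}[A-Z]{1}$", re.MULTILINE)


def find_pans_by_line(text: str) -> list:
    """Return every line of `text` that consists of exactly one PAN-shaped token."""
    # findall() returns all matches, like the g flag in the article's regex.
    return PAN_LINE.findall(text)


sample_text = "APOPQ24233\nASSPQ2423A\nAP22Q2423A\n$$OP^^423A"
print(find_pans_by_line(sample_text))  # ['ASSPQ2423A']
```

Without the multi-line flag, `^` and `$` would anchor to the whole text rather than to each line, and this sample would produce no matches. When validating a single input field, neither flag is needed.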

Hope this article helped you understand how to build a regex pattern that matches India PAN numbers.