HackerRank - Validating and Parsing Email Addresses Solution
A valid email address meets the following criteria:
- It's composed of a username, domain name, and extension assembled in this format:
[email protected]
- The username starts with an English alphabetical character, and any subsequent characters consist of one or more of the following: alphanumeric characters, -, ., and _.
- The domain and extension contain only English alphabetical characters.
- The extension is 1, 2, or 3 characters in length.
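The criteria above can also be checked step by step with plain string operations. This is only a minimal sketch, not part of the original solution: the function name `is_valid_email` and the helper sets are hypothetical, it accepts ASCII characters only, and it assumes (like the regex solution further down) that the username needs at least one character after its first letter and that the extension is 1 to 3 characters long.

```python
import string

# Hypothetical helper sets for this sketch (ASCII only).
LETTERS = set(string.ascii_letters)
ALLOWED_IN_USERNAME = set(string.ascii_letters + string.digits + "-._")


def is_valid_email(address):
    """Check an address against the criteria, one rule at a time."""
    # Exactly one '@' separates the username from the rest.
    if address.count("@") != 1:
        return False
    username, host = address.split("@")
    # Exactly one '.' separates the domain from the extension.
    if host.count(".") != 1:
        return False
    domain, extension = host.split(".")
    # Username: a letter first, then at least one allowed character.
    if len(username) < 2 or username[0] not in LETTERS:
        return False
    if not set(username[1:]) <= ALLOWED_IN_USERNAME:
        return False
    # Domain: non-empty, letters only.
    if not domain or not set(domain) <= LETTERS:
        return False
    # Extension: letters only, 1 to 3 characters long.
    return 1 <= len(extension) <= 3 and set(extension) <= LETTERS
```

For instance, `is_valid_email("virus!@variable.:p")` fails at the username check because `!` is not in the allowed set.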
Given n pairs of names and email addresses as input, print each name and email address pair having a valid email address on a new line.
Hint: Try using the email.utils module to complete this challenge. For example, this code:
import email.utils
print(email.utils.parseaddr('DOSHI <[email protected]>'))
print(email.utils.formataddr(('DOSHI', '[email protected]')))
produces this output:
('DOSHI', '[email protected]')
DOSHI <[email protected]>
Input Format
The first line contains a single integer, n, denoting the number of email addresses.
Each of the n subsequent lines contains a name and an email address as two space-separated values following this format:
name <[email protected]>
Constraints
Output Format
Print the space-separated name and email address pairs containing valid email addresses only. Each pair must be printed on a new line in the following format:
name <[email protected]>
You must print each valid email address in the same order as it was received as input.
Sample Input
2
DEXTER <[email protected]>
VIRUS <virus!@variable.:p>
Sample Output
DEXTER <[email protected]>
Explanation
[email protected] is a valid email address, so we print the name and email address pair received as input on a new line.
virus!@variable.:p is not a valid email address because the username contains an exclamation point (!) and the extension contains a colon (:). As this email is not valid, we print nothing.
Solution in Python
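import re
import email.utils

# Raw string so that \w and \. reach the regex engine unchanged.
pattern = r"^[a-z][\w.-]+@[a-z]+\.[a-z]{1,3}$"

for _ in range(int(input())):
    s = input()
    # parseaddr returns (name, address); only the address is validated.
    u = email.utils.parseaddr(s)
    if re.search(pattern, u[-1], re.I):
        print(s)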
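To try the same logic without typing into standard input, the loop can be wrapped in a function that takes a list of lines. This is a sketch under stated assumptions: `filter_valid` and `VALID_ADDRESS` are hypothetical names, and `dexter@example.com` is a made-up address used only for illustration.

```python
import email.utils
import re

# Same pattern as the solution, compiled once and case-insensitive.
VALID_ADDRESS = re.compile(r"^[a-z][\w.-]+@[a-z]+\.[a-z]{1,3}$", re.IGNORECASE)


def filter_valid(lines):
    """Return the input lines whose address part matches VALID_ADDRESS."""
    kept = []
    for line in lines:
        # parseaddr splits 'name <address>' into a (name, address) tuple.
        _name, address = email.utils.parseaddr(line)
        if VALID_ADDRESS.search(address):
            kept.append(line)  # keep the original text and the input order
    return kept


# Hypothetical input; example.com stands in for a real domain.
sample = [
    "DEXTER <dexter@example.com>",
    "VIRUS <virus!@variable.:p>",
]
print("\n".join(filter_valid(sample)))
```

Returning the original line, rather than rebuilding it with formataddr, keeps the output identical to the input text, which is what the output format asks for.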
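^ : start of line
[a-z] : one lowercase letter
\w : one alphanumeric character or _
[\w.-] : one alphanumeric character, ., -, or _
+ : one or more occurrences of the preceding item
\. : a literal . (unescaped, . matches any character)
{1,3} : between 1 and 3 occurrences of the preceding item
$ : end of line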
re.I makes the match case-insensitive, so [a-z] matches both uppercase and lowercase letters; without it, [a-z] matches lowercase letters only.
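The breakdown above can be written into the pattern itself with re.VERBOSE, which lets whitespace and comments sit inside the regex. This is a sketch of the same pattern, not a different one; the name `EMAIL_PATTERN` is hypothetical, and the last two lines illustrate what the re.I flag changes.

```python
import re

EMAIL_PATTERN = re.compile(
    r"""
    ^            # start of the string
    [a-z]        # first character of the username: a letter
    [\w.-]+      # one or more alphanumerics, underscores, dots, or hyphens
    @            # the separator between username and domain
    [a-z]+       # domain: one or more letters
    \.           # a literal dot
    [a-z]{1,3}   # extension: 1 to 3 letters
    $            # end of the string
    """,
    re.VERBOSE | re.IGNORECASE,
)

# Effect of the case-insensitive flag on [a-z]:
print(bool(re.search(r"^[a-z]+$", "ABC")))        # False: lowercase only
print(bool(re.search(r"^[a-z]+$", "ABC", re.I)))  # True: case is ignored
```

The verbose form is longer, but each token sits next to its explanation, which helps when the pattern needs to change later.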