Hackerrank HTML Parser - Part 2 Solution

*This section assumes that you understand the basics discussed in HTML Parser - Part 1.

.handle_comment(data)
This method is called when a comment is encountered (e.g. <!--comment-->).
The data argument is the content inside the comment tag:

from HTMLParser import HTMLParser

class MyHTMLParser(HTMLParser):
    def handle_comment(self, data):
        print "Comment  :", data
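
The snippet above uses Python 2 syntax. As a minimal Python 3 sketch of the same callback (where the class lives in html.parser), the example below records each comment body in a list instead of printing it. The class name CommentCollector and the comments attribute are hypothetical names chosen for illustration.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Hypothetical helper: stores each comment body instead of printing it."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # data is the text between <!-- and -->, delimiters excluded
        self.comments.append(data)


collector = CommentCollector()
collector.feed("<p>text</p><!--comment--><!-- two\nlines -->")
collector.close()
print(collector.comments)  # ['comment', ' two\nlines ']
```

Note that a comment spanning several lines still arrives as a single string with the newline embedded, which is what the task relies on to tell single-line and multi-line comments apart.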


.handle_data(data)
This method is called to process arbitrary data (e.g. text nodes and the content of <script>...</script> and <style>...</style>).
The data argument is the text content of HTML.

from HTMLParser import HTMLParser

class MyHTMLParser(HTMLParser):
    def handle_data(self, data):
        print "Data     :", data
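
As a rough Python 3 illustration of when handle_data fires, the sketch below feeds a text node and a script body through the parser and collects what the callback receives. DataCollector and chunks are hypothetical names, not part of the problem statement.

```python
from html.parser import HTMLParser


class DataCollector(HTMLParser):
    """Hypothetical helper: stores every chunk passed to handle_data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for text nodes and for the raw content of <script>/<style>
        self.chunks.append(data)


collector = DataCollector()
collector.feed("<div>Hello</div><script>var x = 1;</script>")
collector.close()
print(collector.chunks)  # ['Hello', 'var x = 1;']
```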

Task

You are given an HTML code snippet of N lines.
Your task is to print the single-line comments, multi-line comments and the data.

Print the result in the following format:

>>> Single-line Comment  
Comment
>>> Data                 
My Data
>>> Multi-line Comment  
Comment_multiline[0]
Comment_multiline[1]
>>> Data
My Data
>>> Single-line Comment:  

Note: Do not print data if data == '\n'.
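
The note matters because the line break between two tags or comments reaches handle_data as its own '\n' chunk. The Python 3 sketch below, with the hypothetical names RawDataLogger and seen, shows those chunks and one way to filter them out.

```python
from html.parser import HTMLParser


class RawDataLogger(HTMLParser):
    """Hypothetical helper: logs every data chunk, including bare newlines."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_data(self, data):
        self.seen.append(data)


logger = RawDataLogger()
logger.feed("<!--a-->\n<div> Welcome</div>\n")
logger.close()
print(logger.seen)  # ['\n', ' Welcome', '\n']

# Applying the rule from the note: drop chunks that are exactly '\n'
kept = [chunk for chunk in logger.seen if chunk != '\n']
print(kept)  # [' Welcome']
```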

Input Format

The first line contains an integer N, the number of lines in the HTML code snippet.
The next N lines contain the HTML code.

Constraints

Output Format

Print the single-line comments, multi-line comments and the data in order of their occurrence from top to bottom in the snippet.

Format the answers as explained in the problem statement.

Sample Input

4
<!--[if IE 9]>IE9-specific content
<![endif]-->
<div> Welcome to HackerRank</div>
<!--[if IE 9]>IE9-specific content<![endif]-->

Sample Output

>>> Multi-line Comment
[if IE 9]>IE9-specific content
<![endif]
>>> Data
 Welcome to HackerRank
>>> Single-line Comment
[if IE 9]>IE9-specific content<![endif]

Solution in python3

Approach 1.

from html.parser import HTMLParser
class MyHTMLParser(HTMLParser):
    def handle_comment(self, data):
        prefix = 'Multi-line Comment' if '\n' in data else 'Single-line Comment'
        print('>>> {0}\n{1}'.format(prefix, data))
    def handle_data(self, data):
        if data != '\n':
            print('>>> Data\n{0}'.format(data))
html = ""
for i in range(int(input())):
    html += input().rstrip()
    html += '\n'
parser = MyHTMLParser()
parser.feed(html)
parser.close()

Approach 2.

from html.parser import HTMLParser
html = ""
class MyHTMLParser(HTMLParser):
    def handle_comment(self, data):
        if '\n' in data:
            print(">>> Multi-line Comment")
        else:
            print(">>> Single-line Comment")
        print(data)
    def handle_data(self, data):
        if data != '\n':
            print(">>> Data")
            print(data)
for i in range(int(input())):
    html += input().rstrip()
    html += '\n'
parser = MyHTMLParser()
parser.feed(html)
parser.close()

Approach 3.

from html.parser import HTMLParser
class MyHTMLParser(HTMLParser):
    def handle_comment(self, data):
        if data.count('\n') == 0:
            print('>>> Single-line Comment')
        else:
            print('>>> Multi-line Comment')
        print(data)
    def handle_data(self, data):
        if data.strip() != '':
            print('>>> Data')
            print(data)
html = ""       
for i in range(int(input())):
    html += input().rstrip()
    html += '\n'
parser = MyHTMLParser()
parser.feed(html)
parser.close()
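
To check a Python 3 solution against the sample without typing input by hand, one option is to capture what the parser prints and compare it with the expected text. The sketch below assumes a parser equivalent to the approaches above; run_on is a hypothetical helper, and the sample lines and expected output are copied from the problem statement.

```python
import io
from contextlib import redirect_stdout
from html.parser import HTMLParser


class MyHTMLParser(HTMLParser):
    def handle_comment(self, data):
        label = 'Multi-line Comment' if '\n' in data else 'Single-line Comment'
        print('>>> ' + label)
        print(data)

    def handle_data(self, data):
        if data != '\n':
            print('>>> Data')
            print(data)


def run_on(lines):
    """Hypothetical helper: parse the lines and return everything printed."""
    html = ''.join(line.rstrip() + '\n' for line in lines)
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        parser = MyHTMLParser()
        parser.feed(html)
        parser.close()
    return buffer.getvalue()


sample = [
    '<!--[if IE 9]>IE9-specific content',
    '<![endif]-->',
    '<div> Welcome to HackerRank</div>',
    '<!--[if IE 9]>IE9-specific content<![endif]-->',
]
expected = (
    '>>> Multi-line Comment\n'
    '[if IE 9]>IE9-specific content\n'
    '<![endif]\n'
    '>>> Data\n'
    ' Welcome to HackerRank\n'
    '>>> Single-line Comment\n'
    '[if IE 9]>IE9-specific content<![endif]\n'
)
print(run_on(sample) == expected)  # True
```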

Solution in python2

Approach 1.

from HTMLParser import HTMLParser
class MyHTMLParser(HTMLParser):
    def handle_comment(self, data):
        if "\n" in data:
            print ">>> Multi-line Comment"
        else:
            print ">>> Single-line Comment"
        print data
    def handle_data(self, data):
        if len(data.strip())>0:
            print ">>> Data"
            print data
html = ""       
for i in range(int(raw_input())):
    html += raw_input().rstrip()
    html += '\n'
parser = MyHTMLParser()
parser.feed(html)
parser.close()

Approach 2.

from HTMLParser import HTMLParser
class MyHTMLParser(HTMLParser):
    def handle_comment(self, data):
        if "\n" in data:
            print ">>> Multi-line Comment"
        else:
            print ">>> Single-line Comment"
        print data
    def handle_data(self, data):
        if not data == '\n':
            print ">>> Data"
            print data 
html = ""       
for i in range(int(raw_input())):
    html += raw_input().rstrip()
    html += '\n'
parser = MyHTMLParser()
parser.feed(html)
parser.close()

Approach 3.

from HTMLParser import HTMLParser
class MyHTMLParser(HTMLParser):
    def handle_data(self, data):
        if data != "\n":
            print ">>> Data\n{}".format(data)
    def handle_comment(self, data):
        if "\n" not in data:
            print ">>> Single-line Comment\n{}".format(data)
        else:
            print ">>> Multi-line Comment\n{}".format(data)
html = ""       
for i in range(int(raw_input())):
    html += raw_input().rstrip()
    html += '\n'
parser = MyHTMLParser()
parser.feed(html)
parser.close()
