By Vishal Basumatary in javascript — Mar 23, 2023

Using the Cheerio library to parse the meta tags of a URL

Cheerio is a JavaScript library for parsing and manipulating HTML on the server side. It implements a subset of core jQuery, so you can use the same selectors you're used to in the browser to select and manipulate elements in the page.

Using Cheerio, you can extract the meta tags from a given URL in a few simple steps. First, make an HTTP request to the URL in question. This post uses the request library, which provides a simple callback API for making HTTP requests; note that request has been deprecated since 2020, so for new projects you may prefer another HTTP client. Once the response is received, parse the HTML content with the cheerio.load() method, which takes an HTML string and returns a function, conventionally named $, that works much like jQuery's $.

With $ in hand, you can use the familiar jQuery selectors to extract the meta tags. For example, to get all the meta tags on the page, you could use the following code:


const cheerio = require('cheerio');
const request = require('request');

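request('http://example.com', (err, response, html) => {
  // Stop early on network errors or non-200 responses
  if (err || response.statusCode !== 200) {
    console.error('Request failed:', err || response.statusCode);
    return;
  }

  // Parse the HTML so it can be queried with jQuery-style selectors
  const $ = cheerio.load(html);

  // Get all the meta tags
  const metaTags = $('meta');

  // Print the meta tags
  console.log(metaTags);
});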

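The code will print out a single Cheerio object that wraps every matched meta element on the page, not the contents of the tags themselves. From there, you can iterate over the selection with .each() and use the .attr() method to read the individual attributes of each meta tag, such as name, property, and content. (The .val() method is intended for form elements and is not useful for meta tags.) The page's title is not a meta tag; it lives in the title element. To get it, inside the same callback where $ is defined, you could use the following code: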


const title = $('title').text();

console.log(title);

The code above will print out the title of the page. By using Cheerio, you can easily extract and parse the meta tags from a given URL.
