Puppeteer: End-to-End Testing Framework

Sep 11, 2019

Puppeteer & Headless Chrome for End-to-End Testing

A few days ago I saw a DevTips video in which they tried out Puppeteer. I had never used it myself and thought it looked really cool, so I gave it a try. Here is what I learned.

What is Puppeteer?

Before we dive into the code, it’s important to understand what the technology we’re using is and why it exists.

A Headless Browser

Puppeteer comes with Chromium and runs “headless” by default. What is a headless browser? A headless browser is a browser for machines. It has no UI and allows a program — often called a scraper or a crawler — to read and interact with it.

An API

Headless browsers are great, but driving one directly can be a pain. Puppeteer provides a really nice API, a set of functions, for controlling the browser from Node.js.

Why use any of this?

There’s so much you can do with Puppeteer and web scraping in general!

Run automated tests against a real web page
Generate PDFs
Take screenshots
Grab data from websites and save it
Automate boring tasks

In my opinion, Puppeteer is perhaps the best tool for these jobs.

On with the code!

Let’s get started!

Prerequisites

If you’re following along, you’ll need Node.js installed, plus basic knowledge of the command line, JavaScript, and the DOM.

Note: Your scraper code doesn’t have to be perfect. When doing your own projects don’t overthink it.

Project Setup

Make a folder (name it whatever you like).
Open the folder in your terminal or command prompt.
In your terminal, run npm init -y. This generates a package.json for managing project dependencies.
Then run npm install puppeteer. This installs Puppeteer, which downloads its own copy of Chromium, so don’t be surprised if the install is large.
Finally, open the folder in your favorite code editor and create an index.js file. If you’re following my example exactly, you’ll also need three folders: screenshots, pdfs, and json.

A Simple Example

Now let’s try something simple (but really cool!) to verify that our setup is working. We’re going to take a screenshot of a web page and generate a PDF file.

Grabbing Data — Preparations

Using the same site as in the example above, we will grab some data and save it to a file. Let’s say that in this scenario we only want the team name, year, wins, and losses. The first step is to create some selectors.

A selector is just a path to the data (think CSS selectors). We’ll work out the paths using the browser’s developer tools. To open them, look for “Developer tools” in your browser menu. I’ll be using Chrome, where you can press Ctrl + Shift + I.

On the site, open the Elements tab in your developer tools and find the data you want to grab. Take note of its structure, classes, and so on.
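One way to check a selector before using it in code is to try it in the DevTools console. The helper below is a small sketch for that. previewSelector is a made-up name, and 'tr.team .name' is a hypothetical selector standing in for whatever structure you noted on the page.

```javascript
// Paste into the DevTools console on the page you want to scrape.
// Returns the trimmed text of every element that matches the selector.
function previewSelector(selector, root = document) {
  return Array.from(root.querySelectorAll(selector), (element) => element.textContent.trim());
}

// Example call in the console (hypothetical selector):
// previewSelector('tr.team .name');
```

If the returned array contains exactly the values you expect, the selector is probably a good candidate for the scraper.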

Grabbing Data — In Code

Time to apply this to our code.

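The main part of this is page.evaluate(), which lets us run JS code in the browser and send back any data we want. The value we return has to be serializable (plain objects, arrays, strings, numbers), because it is passed from the browser back to Node.js. That is all it takes to fetch data.

Inside page.evaluate() we have access to the DOM, so the familiar browser API (document.querySelector() and friends) works just as it would in a script on the page.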

Saving Data to a File

As a final touch, we’ll save this data to a file. In my case, I want the data in JSON format because that’s most easily used with JS.

Load the file system (fs) module from Node.js
Convert the data to JSON with JSON.stringify()
Write the file with fs.writeFile()

More Advanced Scraping

Puppeteer supports things like single-page applications (SPAs), simulated input, tests, and more. They’re beyond the scope of this tutorial, but you can find examples in the Puppeteer documentation.

References and Links

Getting Started with Headless Chrome, Google Developers (developers.google.com)

GoogleChrome/puppeteer: Headless Chrome Node API (github.com)

Thanks for reading! Leave any feedback or questions in the comments below.