Puppeteer is a Node.js library that provides a high-level API to control Chrome or Chromium over the DevTools Protocol. One common use case is automating the process of downloading files. While Puppeteer does not offer a direct method for file downloads, you can achieve this by intercepting network requests or manipulating browser settings.
1. Set Up Puppeteer:
npm install puppeteer
const puppeteer = require('puppeteer');
2. Launch Browser and Open a New Page:
// The remaining snippets use await, so run them inside an async function.
const browser = await puppeteer.launch();
const page = await browser.newPage();
3. Set Download Behavior:
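Open a Chrome DevTools Protocol (CDP) session for the page and use its send() method to specify the download path. Older tutorials call the private page._client member for this, but it is not part of the public API and has changed shape between Puppeteer releases.
const downloadPath = '/path/to/downloads'; // an absolute path is safest
// page.createCDPSession() is available in recent Puppeteer versions;
// on older ones, use page.target().createCDPSession() instead.
const client = await page.createCDPSession();
await client.send('Page.setDownloadBehavior', {
  behavior: 'allow',
  downloadPath: downloadPath
});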
4. Trigger File Download:
await page.goto('https://example.com/download');
await page.click('a#downloadLink');
5. Wait for the Download to Complete:
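// page.waitForTimeout() was removed in newer Puppeteer versions, so use a plain timer.
await new Promise((resolve) => setTimeout(resolve, 5000)); // Wait for 5 seconds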
6. Close the Browser:
await browser.close();
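The fixed five-second wait in step 5 is a guess: large files may need longer, and small ones finish much sooner. One alternative is to poll the download directory until the file is complete. The sketch below assumes that the directory is empty before the download starts and that Chrome marks unfinished downloads with a .crdownload suffix. waitForDownload is a hypothetical helper name, not part of Puppeteer.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper (not a Puppeteer API): resolves with the paths of the
// files in `dir` once at least one file exists and none of them still carries
// Chrome's temporary ".crdownload" suffix. Rejects after `timeoutMs`.
async function waitForDownload(dir, { timeoutMs = 30000, pollMs = 250 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const files = fs.existsSync(dir) ? fs.readdirSync(dir) : [];
    const inProgress = files.some((name) => name.endsWith('.crdownload'));
    if (files.length > 0 && !inProgress) {
      return files.map((name) => path.join(dir, name));
    }
    // Nothing finished yet: sleep briefly, then look again.
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
  throw new Error(`No completed download in ${dir} after ${timeoutMs} ms`);
}
```

In the flow above, a call such as await waitForDownload(downloadPath) could take the place of the fixed delay, just before browser.close().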
To download a PDF file, you would use the same steps outlined above. If the link to the PDF is a direct URL, you can also intercept the request and save the file locally.
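// Open the page that links to the PDF (placeholder URL), then click the link.
// Navigating straight to the PDF URL typically makes page.goto() reject in headless Chrome, because the navigation turns into a download.
await page.goto('https://example.com/reports');
await page.click('a#downloadPdf');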
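When the PDF has a direct URL, the simplest way to save it locally may be to skip the browser for that step. The sketch below does not intercept the request inside Chrome; it fetches the URL from Node instead. It assumes Node 18 or later (for the built-in fetch) and a URL that needs no browser session. saveUrlToFile is a hypothetical helper name.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: downloads `url` with Node's built-in fetch (Node 18+)
// and writes the body into `destDir`, named after the last URL path segment.
// Optional `headers` are sent with the request.
async function saveUrlToFile(url, destDir, headers = {}) {
  const response = await fetch(url, { headers });
  if (!response.ok) {
    throw new Error(`Request for ${url} failed with status ${response.status}`);
  }
  // Fall back to a generic name when the URL path ends in "/".
  const fileName = path.basename(new URL(url).pathname) || 'download';
  const filePath = path.join(destDir, fileName);
  await fs.promises.mkdir(destDir, { recursive: true });
  await fs.promises.writeFile(filePath, Buffer.from(await response.arrayBuffer()));
  return filePath;
}
```

For example, await saveUrlToFile('https://example.com/report.pdf', '/path/to/downloads') would write report.pdf into that directory, provided the server returns the file without requiring a login.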
If the file is behind a login, you’ll need to automate the login process before triggering the download.
await page.goto('https://example.com/login');
await page.type('#username', 'yourUsername');
await page.type('#password', 'yourPassword');
// Start waiting for the navigation before clicking, so a fast redirect is not missed.
await Promise.all([
  page.waitForNavigation(),
  page.click('button#login'),
]);
await page.goto('https://example.com/download');
await page.click('a#downloadLink');
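If the protected file is fetched from Node rather than by clicking a link in the page, the request needs the session cookies the browser received at login. As a minimal sketch, the hypothetical helper below turns cookie objects of the shape Puppeteer returns (each with at least a name and a value) into a Cookie header value. It assumes the site authenticates by cookie alone.

```javascript
// Hypothetical helper (not a Puppeteer API): joins cookie objects such as
// { name: 'sid', value: 'abc' } into one Cookie request header value.
function toCookieHeader(cookies) {
  return cookies.map(({ name, value }) => `${name}=${value}`).join('; ');
}
```

After logging in, something like toCookieHeader(await page.cookies()) could supply the Cookie header of a request made from Node, for example with fetch. Note that newer Puppeteer versions prefer reading cookies from the browser or browser context rather than from the page.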
This article gives you a practical overview of how to download files using Puppeteer. For further details, consult the official Puppeteer documentation.
Jorge García
Fullstack developer