Today I figured out which key APIs are needed in order to turn a Chrome extension into an automated scraper/text-inputter.
browser.tabs.onUpdated
That's what you're going to want when you load up a new page. It's an event listener in the background page that fires as a page goes through its loading lifecycle. The loading status is accessible through
changeInfo.status
changeInfo is passed to the onUpdated callback, and once changeInfo.status becomes "complete", you can send a message to your content script to fire off any scraping/scripting.
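As a minimal sketch of that filtering step (with a tiny stand-in for browser.tabs.onUpdated so it can run outside an extension; in a real background script, browser is the global the extension runtime provides):

```javascript
// Stand-in for browser.tabs.onUpdated so this sketch runs outside an
// extension. In a real background script, use the real browser global.
const listeners = [];
const browser = {
  tabs: {
    onUpdated: {
      addListener: (fn) => listeners.push(fn),
      removeListener: (fn) => listeners.splice(listeners.indexOf(fn), 1),
    },
  },
};

const seen = [];
browser.tabs.onUpdated.addListener(function onLoad(tabId, changeInfo) {
  // The event fires for every lifecycle change; only act on "complete".
  if (changeInfo.status === "complete") {
    browser.tabs.onUpdated.removeListener(onLoad);
    seen.push(`tab ${tabId} finished loading`);
  }
});

// Simulate a page load firing the event several times: the listener
// ignores "loading" and, having removed itself, reacts exactly once.
for (const status of ["loading", "loading", "complete", "complete"]) {
  listeners.slice().forEach((fn) => fn(42, { status }));
}
console.log(seen); // one entry, despite four events
```

Removing the listener after the first "complete" is what keeps a single navigation from kicking off the scrape more than once.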
So a basic example would look like
//content-script
sendMessage('start-scrape', { link: link, nextMessage: 'find-elements' })

onMessage('find-elements', () => {
  // this function will get triggered by the background page,
  // after the page is fully loaded.
  // so the scraping logic would go here
})
// background-script
onMessage('start-scrape', ({ data }) => {
  // this updates the URL of the currently active tab
  browser.tabs.update({ url: data.link });

  // we name the function openPage so we can remove this event listener
  browser.tabs.onUpdated.addListener(async function openPage(
    tabId,
    changeInfo
  ) {
    // this event gets fired multiple times on a page load
    // so we only want to start scraping when loading is complete
    if (changeInfo.status == "complete") {
      browser.tabs.onUpdated.removeListener(openPage);
      sendMessage(
        data?.nextMessage,
        {},
        { context: "content-script", tabId: tabId }
      );
    }
  });
})
It was tempting to assume that I could just use some familiar in-page APIs to check whether the document was loaded, but the browser's message-passing system and the onUpdated / onCreated APIs are much more reliable when you need to know the page content is fully loaded before beginning more complex automations.
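For contrast, the familiar in-page check I had in mind looks roughly like the sketch below (with a stubbed document object so it runs outside a browser; in a real content script, document is the page's global). It only tells the content script about its own page, and the background page never hears about it unless a message is sent:

```javascript
// Stub of the page's document so this sketch runs outside a browser;
// in a real content script, document is provided by the page itself.
const document = { readyState: "complete" };

function startScrape() {
  return "scraping started";
}

// A content script can check its own page's load state like this, but
// this signal never reaches the background page that drove the
// navigation -- which is why the onUpdated approach above is sturdier.
let result = null;
if (document.readyState === "complete") {
  result = startScrape();
}
console.log(result);
```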
I look forward to being able to share some more progress as this prototype becomes a product.