How to Scrape Amazon Prices with Google Sheets

Scraping Amazon prices with Google sheets used to be “easy” since you could use an ImportXML formula with an xpath to target the price of an Amazon listing directly on Amazon.com. However, Amazon seems to have currently blocked that capability in Google sheets, leaving users of that method with a parse error. This new method I’ve discovered will use make use of a 3rd party Amazon price tracking website to scrape the prices instead of scraping them from Amazon itself.

This guide is a bit on the advanced side, but as long as you follow the code exactly as seen, you shouldn’t have any problems. If you need something robust to convert barcodes to ASINs or get prices using the AWS API, I've developed a sheet that you can purchase here. In any case, the code used in this tutorial also allows you to dynamically generate the prices of any number of Amazon products (up to 1000), using a keyword search within the sheet. The example below shows what you should end up with once you’ve completed it.
 

July 2016 Update: I now offer an Amazon Price Tracker tool that uses the Product Advertising API to track prices (but this blog post still works, too)

Scraping Amazon Listing URLs from Google

The first step to getting prices from Amazon is to generate a set of Amazon product links. If you already have a set of links, you can skip this step. However, I figured that most people will also need to generate a list or won’t want to generate it by hand. So, start by copying the selected code into your sheet (if you want to follow the example sheet above, copy this code into cell A6):

=ImportXML(CONCATENATE("http://www.google.com/search?q={site:www.amazon.com ",B3,"}&num=",D3), "//h3[@class='r']/a/@href")

Using the ImportXML formula, this code exclusively scrapes Amazon.com links found on a Google search results page. And, if you’re using this snippet, you’ll be able to input a keyword into B3 to return Amazon listings relating only to that keyword. Likewise, by entering a number into D3, you’ll be able to return that number of URLs from Google. For example, typing “beard oil” in B3 and “17” in D3, you’ll return 17 beard oil Amazon product listing URLs.

Cleaning Up Your Scraped URLs

So you’ll just have scraped a few links, but the URLs you’ve just scraped aren’t going to be of any use in their current state and they’ll probably look a little bit like this:

/url?q=http://www.amazon.com/Leven-Rose-Beard-Leave-In-Conditioner/dp/B00IB6IAOS&sa=U&ved=0CBQQFjAAahUKEwiztb63qpbJAhVMWj4KHTdvB-U&usg=AFQjCNHCgEto98Yas5bWR_3sWd9pbsRZPA

So to be able to work with them, use the following code (paste it into B6 if you’re following the example):

=iferror(substitute((RIGHT(A6,LEN(A6)-7)),REGEXEXTRACT(RIGHT(A6,LEN(A6)-7),"&s a=.*"),""),"")

This code uses a SUBSTITUTE and REGEXEXTRACT to find the text “&sa=” within the link and remove anything following it. It’ll also remove the first 7 characters of the scraped text. The function also includes error handling so that you can feed it with empty inputs and not have your entire sheet riddled with “N/As”. In the end, this code should leave you with a workable URL that looks something like this

http://www.amazon.com/Leven-Rose-Beard-Leave-In-Conditioner/dp/B00IB6IAOS/

Scraping Prices from Amazon URLs

Now that you’ve cleaned up your URLs, it’s time to scrape the prices of these products. Since Amazon currently blocks xpath scraping, we’ll be using a site that doesn’t. ezpricealerts.com is a site that keeps track of all prices on Amazon and lets us scrape the prices with an ImportXML formula in Google sheets. To generate your price, simply paste the code below (to keep with the example above, copy this code into cell D6):

New updated code (November 7, 2016):

=iferror(INDEX(importxml(concatenate("https://www.ezpricealerts.com/products/US/",right(substitute((RIGHT(A6,LEN(A6)-7)),REGEXEXTRACT(RIGHT(A6,LEN(A6)-7),"&sa=.*"),""),10)),"//p[@class='form-control-static text-success']"),1),"")

Using a combination of a few REGEXEXTRACT and SUBSTITUTE formulas, this code reassembles the URL of your Amazon listing into a URL that thetracktor.com recgonizes. For example, it’ll end up looking something a bit like this:

Once complete, this code then extracts the price of the ASIN found on this page using an ImportXML function referencing the xpath of the price. If you’re unsure of the definition of an xpath, it can be easily summarized as a unique address of a piece of data containined within a tag in an XML or HTML file. The ImportXML function uses this xpath to find the price tag on the webpage. To generate one easily, you can install a browser extension such as xPath helper. In our case, the xpath of the price on thetracktor.com was this:

//p[@class='form-control-static text-success']

There it is. A new way to use Google Sheets to scrape Amazon prices. The one downfall of this method is that sometimes ezpricealerts won’t update sale prices right away, so from time to time your prices might be a little bit high. But for quick research or generating Amazon review articles, this is a great way to get started. And, technically, you could directly scrape the price from the first step without “cleaning” the URL. However, it’s sometimes good to have a URL to go along with your price.

Found this guide useful? Feeling generous?