Web scraping using Google Docs – Xpath


Google Docs makes web scraping straightforward through simple XPath expressions. This example walks you through scraping headlines from NYTimes.com. The same approach can be adapted to scrape almost anything from a website. XPath syntax behaves differently in Google Docs, so you may have to rearrange the code to make it work. Use the code in this video as a starting point.
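To see what this kind of XPath expression selects, outside of a spreadsheet, here is a minimal Python sketch. It is only an illustration, not the method used in the video: the sample markup is made up, the `select_text` helper is hypothetical, and the class names `story` and `summary` are borrowed from the formula a reader posted in the comments. Python's standard `xml.etree.ElementTree` supports only a subset of XPath and has no `|` union operator, so the sketch runs the two paths one after the other.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup (an assumption for illustration;
# this is not the real NYTimes.com page structure).
SAMPLE_PAGE = """
<html>
  <body>
    <div class="story"><h2>First headline</h2></div>
    <div class="story"><h2>Second headline</h2></div>
    <p class="summary">A short summary.</p>
    <div class="ad">Not a story</div>
  </body>
</html>
"""


def select_text(root, paths):
    """Return the stripped text of every element matched by any of the paths.

    Results are grouped by path, in the order the paths are given. A real
    XPath union (a|b) would return nodes in document order instead.
    """
    results = []
    for path in paths:
        for element in root.findall(path):
            # itertext() gathers the text of the element and its children.
            results.append("".join(element.itertext()).strip())
    return results


root = ET.fromstring(SAMPLE_PAGE)
matches = select_text(root, [".//div[@class='story']", ".//p[@class='summary']"])
for text in matches:
    print(text)
```

The `[@class='story']` predicate keeps only elements whose `class` attribute is exactly `story`, which is why the `ad` block is skipped. Real web pages are rarely well-formed XML, so a sketch like this would likely need a more forgiving HTML parser before it could be pointed at a live site.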

21 comments

  1. Thanks very much for this video; it was easy to put into practice. Do you think you could make a video on how to scrape pictures and text from websites using XPath? Thanks.

  2. @testresearch099 Thanks for getting back to me and taking the time to put the videos together. It makes computing a lot simpler to see it in a movie rather than reading text. Many thanks.

  3. Hi,
    I found your video very useful, but I am having a problem where the data on the web page I want to scrape takes a few seconds to load, so the script isn't able to capture that data.

    I'm wondering if there is a delay() or sleep() function that can be used to load the page and then delay the data scrape for approx. 5 seconds?

    Thanks!

  4. @jolt472 Hey, I'm actually Testresearch099 – I've had this problem as well and haven't managed to solve it. In your case, I've never had a problem before. It would help if you could tell me the URL – send me a message if you like, be happy to *try* and help.

  5. I'm new and need to ask this: what exactly am I doing with this? If I'm new to SEO, what good would this do me? Sorry, I'm trying to figure this out. Thanks.

  6. This was also my problem when I tried this. Have you solved it? Do you know what to use for "and also"? Can you share it with me? Thanks.

  7. Hi, I am trying to use this to scrape the view count of a YouTube video that is not mine. Any help? I am getting stuck trying to use the span element.

  8. Hello friend, this video was only a test, so it may contain mistakes. If you get an error, press Enter after typing the following formula:

    =importxml(B5,"//div[@class='story']|//p[@class='summary']")
