Check whether a URL is indexed by Google using Python


Being indexed by Google is crucial for increasing the number of visitors to a site. You can check whether a URL is indexed by Google using Python. This page shows how to do that for all posts of this blog. The post URLs are extracted from the sitemap, so you can reuse the code for your own site by changing only the sitemap URL. If you know how to use Heroku, crontab, or a similar lightweight tool that runs scripts on a schedule, and you know how to send e-mail from Python, you can report the indexed and not-indexed pages hourly, daily, weekly, or at whatever interval you like. The notebook below shows the list of indexed and not-indexed URLs at the time this post was written.

This code is based on "How to check which URLs have been indexed by Google using Python".
In [1]:
import requests
import time
from bs4 import BeautifulSoup
from urllib.parse import urlencode
from urllib.request import urlopen
In [2]:
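# Get the sitemap and extract the post URLs

sitemap_url = 'https://pythonmatplotlibtips.blogspot.com/sitemap.xml'
data = urlopen(sitemap_url).read()
# Decode the bytes to text; str(data) would give the "b'...'" representation instead
data = data.decode('utf-8')
# Replacing </loc> with <loc> makes every URL its own element after the split
parts = data.replace("</loc>", "<loc>").split("<loc>")
urls = [url.strip() for url in parts if 'pythonmatplotlibtips.blogspot.com' in url]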
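
As an alternative to the string splitting above, the `<loc>` entries can also be read with an XML parser. The following is only a minimal sketch, assuming the sitemap uses the standard sitemaps.org namespace; the function name `extract_locs` and the `example.com` sample data are hypothetical and not part of the original notebook.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files (assumed to apply to the sitemap in question)
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_locs(xml_data):
    """Return the text of every <loc> element in the sitemap XML (bytes or str)."""
    root = ET.fromstring(xml_data)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


# Hypothetical sample standing in for the bytes returned by urlopen(sitemap_url).read()
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/2018/01/first-post.html</loc></url>
  <url><loc>https://example.com/2017/12/second-post.html</loc></url>
</urlset>"""

sample_urls = extract_locs(sample)
print(sample_urls)
```

With the real sitemap, `extract_locs(urlopen(sitemap_url).read())` would take the place of the split-and-filter lines, and the domain filter could still be applied to the returned list if needed.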
In [3]:
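# Check whether each URL has been indexed

seconds = 10  # pause between requests so that Google is not queried too quickly

# Optional proxy settings. They are not used below; pass proxies=proxies
# to requests.get() if the requests should go through a local proxy.
proxies = {
    'https' : 'https://localhost:8123',
    'http' : 'http://localhost:8123'
    }

user_agent = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36'
headers = {'User-Agent': user_agent}

indexed = []
notindexed = []
for line in urls:
    # Search Google for this URL with the "info:" operator
    query = {'q': 'info:' + line}
    google = "https://www.google.com/search?" + urlencode(query)
    data = requests.get(google, headers=headers)
    data.encoding = 'ISO-8859-1'
    # data.text is the decoded page; str(data.content) would give the "b'...'" representation
    soup = BeautifulSoup(data.text, "html.parser")
    try:
        # A link inside the result block (id="rso") means the URL is indexed
        check = soup.find(id="rso").find("div").find("div").find("h3").find("a")["href"]
        indexed.append(line)
    except (AttributeError, TypeError, KeyError):
        # Part of the expected structure is missing, so treat the URL as not indexed
        notindexed.append(line)
    time.sleep(float(seconds))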
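
To see what the loop actually requests, the query construction can be tried on its own. This is a small sketch of the same `urlencode` step; the helper name `build_query_url` and the `example.com` address are hypothetical and only for illustration.

```python
from urllib.parse import urlencode


def build_query_url(page_url):
    """Build the Google search URL that looks up page_url with the 'info:' operator."""
    query = {'q': 'info:' + page_url}
    return "https://www.google.com/search?" + urlencode(query)


# example.com is a placeholder, not one of the blog's URLs
example = build_query_url("https://example.com/2018/01/post.html")
print(example)
```

`urlencode` percent-encodes the `:` and `/` characters of the page URL, so the whole address is passed safely as the value of the single `q` parameter.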
In [4]:
# Format the result

yind = "\n".join(indexed)
ny = len(indexed)
nind = "\n".join(notindexed)
nn = len(notindexed)
content1 = "URL of sitemap:\n%s"%(sitemap_url)
content2 = "There are %d indexed URLs:\n%s"%(ny,yind)
content3 = "There are %d NOT indexed URLs\n%s"%(nn,nind)
content  = "\n%s\n\n\n%s\n\n%s"%(content1,content2,content3)
print(content)
URL of sitemap:
https://pythonmatplotlibtips.blogspot.com/sitemap.xml


There are 39 indexed URLs:
https://pythonmatplotlibtips.blogspot.com/2018/01/solve-animate-single-pendulum-odeint-artistanimation.html
https://pythonmatplotlibtips.blogspot.com/2018/01/try-using-all-mathtext-fontset-in-python-matplotlib.html
https://pythonmatplotlibtips.blogspot.com/2018/01/generate-average-image-using-python-and-PIL.html
https://pythonmatplotlibtips.blogspot.com/2018/01/combine-3d-two-2d-animations-in-one-figure-artistdanimation.html
https://pythonmatplotlibtips.blogspot.com/2018/01/combine-3d-two-2d-animations-in-one-figure-timedanimation.html
https://pythonmatplotlibtips.blogspot.com/2018/01/combine-two-2d-animations-in-one-figure-matplotlib-artistanimation.html
https://pythonmatplotlibtips.blogspot.com/2018/01/combine-two-2d-animations-in-one-figure-matplotlib-timedanimation.html
https://pythonmatplotlibtips.blogspot.com/2018/01/plot-three-wave-in-one-plot-pwm.html
https://pythonmatplotlibtips.blogspot.com/2017/12/try-all-legend-options-in-python-matplotlib-pyplot.html
https://pythonmatplotlibtips.blogspot.com/2017/12/air-flow-contourf-animation-matplotlib-artist-animation.html
https://pythonmatplotlibtips.blogspot.com/2017/12/cycloid-animation-artistanimation.html
https://pythonmatplotlibtips.blogspot.com/2017/12/draw-3d-line-animation-using-python-matplotlib-artistanimation.html
https://pythonmatplotlibtips.blogspot.com/2017/12/draw-3d-line-animation-using-python-matplotlib-funcanimation.html
https://pythonmatplotlibtips.blogspot.com/2017/12/better-way-to-chose-numbers-ticklables.html
https://pythonmatplotlibtips.blogspot.com/2017/12/arrange-multiple-images-in-one-large-image.html
https://pythonmatplotlibtips.blogspot.com/2017/12/draw-electric-field-lines-without-mayavi.html
https://pythonmatplotlibtips.blogspot.com/2017/12/plot-on-image-matplotlib-pyplot.html
https://pythonmatplotlibtips.blogspot.com/2017/12/cycloid-animation-funcanimation.html
https://pythonmatplotlibtips.blogspot.com/2017/12/speed-up-plotting-magnified-waveforms.html
https://pythonmatplotlibtips.blogspot.com/2017/12/draw-continuous-electric-field-lines-3d-plotly.html
https://pythonmatplotlibtips.blogspot.com/2017/12/change-space-betwen-label-line-handletextpad.html
https://pythonmatplotlibtips.blogspot.com/2017/12/draw-electric-field-lines-with-changing-color.html
https://pythonmatplotlibtips.blogspot.com/2017/12/plotly-first-time-operation-check-copy.html
https://pythonmatplotlibtips.blogspot.com/2017/12/plot-continuous-magnetic-field-lines.html
https://pythonmatplotlibtips.blogspot.com/2017/12/plot-electric-field-lines-around-point.html
https://pythonmatplotlibtips.blogspot.com/2017/12/draw-beautiful-electric-field-lines.html
https://pythonmatplotlibtips.blogspot.com/2017/12/the-effect-of-padinches-in-python.html
https://pythonmatplotlibtips.blogspot.com/2017/12/display-same-figure-with-changing-lines.html
https://pythonmatplotlibtips.blogspot.com/2017/12/draw-minor-ticks-at-arbitrary-place.html
https://pythonmatplotlibtips.blogspot.com/2017/12/draw-animation-graph-using-python.html
https://pythonmatplotlibtips.blogspot.com/2017/11/simple-way-to-draw-3d-random-walk-matplotlib.html
https://pythonmatplotlibtips.blogspot.com/2017/11/make-figures-changing-math-font.html
https://pythonmatplotlibtips.blogspot.com/2017/11/write-mu-greek-letter-symbol-in-python.html
https://pythonmatplotlibtips.blogspot.com/2017/11/set-aspect-ratio-figure-python-matplotlib-pyplot.html
https://pythonmatplotlibtips.blogspot.com/2017/10/how-to-arrange-two-ylabels-using-python.html
https://pythonmatplotlibtips.blogspot.com/2017/10/draw-several-plots-in-one-figure-python-matplotlib-pyploy.html
https://pythonmatplotlibtips.blogspot.com/2017/09/draw-two-legends-in-one-figure-using.html
https://pythonmatplotlibtips.blogspot.com/2017/08/one-ylabel-two-subplots-python-matplotlib-pyplot.html
https://pythonmatplotlibtips.blogspot.com/2017/07/write-text-annotation-on-image-using.html

There are 6 NOT indexed URLs
https://pythonmatplotlibtips.blogspot.com/2018/01/add-second-x-axis-below-first-x-axis-python-matplotlib-pyplot.html
https://pythonmatplotlibtips.blogspot.com/2018/01/add-second-x-axis-at-top-of-figure-python-matplotlib-pyplot.html
https://pythonmatplotlibtips.blogspot.com/2017/12/draw-axes-in-axes-using-zoomed-inset-axes.html
https://pythonmatplotlibtips.blogspot.com/2017/12/change-hatch-density-barplot-matplotlib-pyplot.html
https://pythonmatplotlibtips.blogspot.com/2017/12/simple-way-to-draw-electric-field-lines-using-plotly-offline.html
https://pythonmatplotlibtips.blogspot.com/2017/12/draw-flow-with-continuous-stream-line.html
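
The introduction mentions reporting the result by e-mail. One possible way to do that with the standard library is sketched below; it is not taken from the original notebook. The function names, the subject line, the addresses, and the SMTP host, port, and credentials are all hypothetical placeholders, and the sketch assumes a mail server that accepts STARTTLS and a login.

```python
import smtplib
from email.message import EmailMessage


def build_report_message(content, sender, recipient):
    """Wrap the report text in an e-mail message (subject line is a placeholder)."""
    msg = EmailMessage()
    msg["Subject"] = "Google index report"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(content)
    return msg


def send_report(msg, host, port, user, password):
    """Send the message through an SMTP server (assumed to support STARTTLS and login)."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(user, password)
        server.send_message(msg)


# Build a message from a shortened, made-up report; nothing is sent here
report = build_report_message(
    "URL of sitemap:\nhttps://example.com/sitemap.xml",
    "sender@example.com",
    "recipient@example.com",
)
print(report["Subject"])
```

In the notebook, the `content` string from cell 4 would be passed to `build_report_message`, and `send_report` would be called with the details of your own mail server. Running the whole script from crontab or a Heroku scheduler then gives the periodic report described at the top of this post.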