Automatic pagerank checking script

Checking the pagerank of a website is easy. Just search Google for "google pagerank checker" and you instantly get more than 3 million results. Each of these websites lets you enter a site URL manually and returns its rank. That's great, but what if you need to know the pagerank of more than 200 websites? And what if you need to know other rankings as well?

I'm preparing an article with a top 100 of software testing blogs. I will not write down my personal choice; instead I will make a huge list of blogs and rank them according to the "how to make a top blog list" article. Several rankings are taken into account, such as the Google pagerank (PR), Alexa reach, Alexa popularity, Technorati authority and the number of referring links according to Google.

Gathering a list of 300+ blogs is already a good start. But retrieving the rankings of each blog by hand is something I don't intend to do. Therefore I created a script that checks the rankings for a list of sites automatically. For each site in the input file, the script fetches the rank from each of the selected ranking services. Each site and its ranks are written to a file that can be opened with MS Excel afterwards.

To run the script you need AutoIt. Download the latest version from the AutoIt website and pick "AutoIt Full Installation", which includes SciTE, an excellent AutoIt editor providing syntax highlighting, autocompletion, and more.

The script is completely free to use, though a comment in the comment section of this post would be appreciated :-)

Preparation of the pagerank script

The rank check script consists of four main files:
A configuration file (.ini)
An input file containing all sites to get the ranking for (.txt)
The script file (.au3)
An output file containing all sites with their rankings (.csv)
Three out of four files need to be created manually by copying and pasting the code below. The fourth file is generated automatically by the script.

The configuration file

Create a file "checkRanking.ini" and add the following code to it:


#The name of the ranking input file
#The name of the output ranking file
#The separator of the output ranking file
#The separator of the ranking service details
#The tag in the ranking service url to replace by the site to rank
#The ranking value for sites not having a rank
#The ranking value for sites in the excluded domains

#The ranking services:
#parameter 1 = the url of the service to request
#parameter 2 = the regular expression to fetch the ranking

#The domain names which shouldn't get a ranking at a certain ranking-service

You can further tweak the configuration and add ranking services to work with.
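
The section and key names that the script expects can be read from its IniRead calls, so a configuration skeleton might look like the one below. This is only a sketch: the section and key names come from the script, but every value is an assumption, and the "exampleRank" service, its URL, its regular expression and the "[SITE]" tag are hypothetical placeholders rather than a real ranking service.

```ini
[common]
#The name of the ranking input file
rankingInputFileName=rankingInput.txt
#The name of the output ranking file
rankingOutputFileName=rankingOutput.csv
#The separator of the output ranking file (assumed value)
rankingOutputSeparator=,
#The separator of the ranking service details (assumed value)
rankingServiceSeparator=|
#The tag in the ranking service url to replace by the site to rank (assumed value)
rankingServiceUrlTag=[SITE]
#The ranking value for sites not having a rank (assumed value)
rankingValueNoRank=0
#The ranking value for sites in the excluded domains (assumed value)
rankingValueExcluded=excluded

[launch]
#The selected ranking services, separated by rankingOutputSeparator
selectedRankings=exampleRank

[ranking-service]
#hypothetical service: the url, then a regular expression with one capturing group
exampleRank=http://rank.example.com/?url=[SITE]|<b>([0-9.,]+)</b>

[domains-to-exclude]
#hypothetical: sites containing this text get the excluded value for exampleRank
exampleRank=example.org
```

Because StringSplit treats each character of the separator as a delimiter, a single-character separator that cannot occur in the service URL or in the regular expression is the safer choice.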

The site input file

Create a file "rankingInput.txt" and add the sites of your choice to it, one site per line.

The script file

Create a file "checkRanking.au3" and add the following code to it:

;----------------------- TESTINGMINDED -----------------------
;--- pagerank checking script from -----
;-------------------- DECLARATION SECTION --------------------

#include <ARRAY.AU3>
#include <FILE.AU3>
#include <INET.AU3>

Opt("TrayAutoPause", 0)

Global $sitesToRank


$scriptName = StringLeft(@ScriptName, StringLen(@ScriptName) - 4)
$iniLocation = @ScriptDir & '\' & $scriptName & '.ini'
If Not (FileExists($iniLocation)) Then
    MsgBox(0, "Exception", "The configuration file could not be found: " & $iniLocation)
    Exit
EndIf

$rankingInputFileName = IniRead($iniLocation, "common", "rankingInputFileName", "NotFound")
$rankingOutputFileName = IniRead($iniLocation, "common", "rankingOutputFileName", "NotFound")
$rankingOutputSeparator = IniRead($iniLocation, "common", "rankingOutputSeparator", "NotFound")
$rankingServiceSeparator = IniRead($iniLocation, "common", "rankingServiceSeparator", "NotFound")
$rankingServiceUrlTag = IniRead($iniLocation, "common", "rankingServiceUrlTag", "NotFound")
$rankingValueNoRank = IniRead($iniLocation, "common", "rankingValueNoRank", "NotFound")
$rankingValueExcluded = IniRead($iniLocation, "common", "rankingValueExcluded", "NotFound")
$selectedRankings = IniRead($iniLocation, "launch", "selectedRankings", "NotFound")
$selectedRankingSettings = _arraycreate("", "", "", "", "", "", "", "", "", "", "", "", "", "")

; ------------------------------------------------------------------------------
; ------------------------------------------------------------------------------

$rankingInputFile = @ScriptDir & "\" & $rankingInputFileName

If Not (FileExists($rankingInputFile)) Then
    MsgBox(0, "Exception", "The ranking input file could not be found: " & $rankingInputFile)
    Exit
EndIf

$timeStamp = @YEAR & '-' & @MON & '-' & @MDAY & '_' & @HOUR & '-' & @MIN & '-' & @SEC
$RankingOutputFile = @ScriptDir & "\" & $rankingOutputFileName
_FileReadToArray($rankingInputFile, $sitesToRank)

;retrieve the ranking service details, but only for the selected ranking services
$rankingSettings = _getSelectedRankingSettings($selectedRankings)

;create the ranking output file (emptying it if it already exists)
_FileCreate($RankingOutputFile)

;create the header of the output file
$rankHeader = ""
_addItemToList($rankHeader, "site")
For $i = 1 To UBound($rankingSettings) - 1
    $rankingSetting = $rankingSettings[$i]
    $rankingName = $rankingSetting[1]
    _addItemToList($rankHeader, $rankingName)
Next

FileWrite($RankingOutputFile, $rankHeader & @CRLF)

;loop through the sites to rank
For $i = 1 To UBound($sitesToRank) - 1
    $rankedSite = $sitesToRank[$i]

    ;append the rank from each selected ranking service to the site's line
    For $j = 1 To UBound($rankingSettings) - 1
        $rank = _getRank($sitesToRank[$i], $rankingSettings[$j])
        _addItemToList($rankedSite, $rank)
    Next

    FileWrite($RankingOutputFile, $rankedSite & @CRLF)
Next

MsgBox(0, "Information", "The ranking of " & UBound($sitesToRank) - 1 & " sites at " & UBound($rankingSettings) - 1 & " ranking services has completed." & @CRLF & "Please check the ranking output file for the results: " & $RankingOutputFile)

; ------------------------------------------------------------------------------
; ------------------------------------------------------------------------------

Func _addItemToList(ByRef $list, $item)
If $list == "" Then
    $list = $item
Else
    $list = $list & $rankingOutputSeparator & $item
EndIf
EndFunc ;==>_addItemToList

Func _getSelectedRankingSettings($selectedRankings)

$selectedRankings = StringSplit($selectedRankings, $rankingOutputSeparator)
ReDim $selectedRankingSettings[UBound($selectedRankings)]
$selectedRankingSettings[0] = UBound($selectedRankings)

For $i = 1 To UBound($selectedRankings) - 1
    $line = IniRead($iniLocation, "ranking-service", $selectedRankings[$i], "NotFound")

    If $line == "NotFound" Then
        MsgBox(0, "Exception", "Could not find ranking-service parameter " & $selectedRankings[$i] & " in the configuration file")
        Exit
    EndIf
    $lineParameters = StringSplit($line, $rankingServiceSeparator)

    ;a ranking service needs exactly two parameters: url and regular expression
    If ($lineParameters[0] == 2) Then

        $domainLine = IniRead($iniLocation, "domains-to-exclude", $selectedRankings[$i], "NotFound")
        Dim $domains[1]

        If Not ($domainLine == "NotFound") Then
            $domains = StringSplit($domainLine, $rankingServiceSeparator)
        EndIf

        Dim $rankingSetting[5] = [4, $selectedRankings[$i], $lineParameters[1], $lineParameters[2], $domains]

        $selectedRankingSettings[$i] = $rankingSetting
    Else
        MsgBox(0, "Exception", "Ranking-service parameter " & $line & " does not contain the correct amount of subitems: ranking-service url, regular expression")
        Exit
    EndIf

Next

Return $selectedRankingSettings

EndFunc ;==>_getSelectedRankingSettings

Func _getRank($site, $rankingSettings)

$rankingName = $rankingSettings[1]
$rankingUrl = StringReplace($rankingSettings[2], $rankingServiceUrlTag, $site)
$rankingRegex = $rankingSettings[3]
$domainsToExclude = $rankingSettings[4]
$toExclude = False

For $i = 1 To UBound($domainsToExclude) - 1
    If StringInStr($rankingUrl, $domainsToExclude[$i]) Then
        $toExclude = True
        ;one matching domain is enough, stop searching
        ExitLoop
    EndIf
Next

If $toExclude = False Then

    ;only the googleLink service needs the site to be url-encoded
    If $rankingName = "googleLink" Then
        $rankingUrl = StringReplace($rankingSettings[2], $rankingServiceUrlTag, _urlEncode($site))
    Else
        $rankingUrl = StringReplace($rankingSettings[2], $rankingServiceUrlTag, $site)
    EndIf

    _inform("Getting " & $rankingName & " rank for " & $site)

    ;launch the ranking service request and get the source code
    $source = _INetGetSource($rankingUrl)
    ;retrieve the ranking value
    $rankingRegResult = StringRegExp($source, $rankingRegex, 1)

    If @error Or ($rankingRegResult[0] == "") Then
        Return $rankingValueNoRank
    Else
        ;strip thousands separators from the rank
        Return StringReplace(StringReplace($rankingRegResult[0], ",", ""), ".", "")
    EndIf
Else
    Return $rankingValueExcluded
EndIf


EndFunc ;==>_getRank

Func _urlEncode($url)
$url = StringReplace($url, "/", "%2F")
$url = StringReplace($url, ":", "%3A")
Return $url
EndFunc ;==>_urlEncode

Func _inform($message, $timeout = 3)
TrayTip("checkRanking progress...", $message, $timeout)
EndFunc ;==>_inform

Test running the script

At this stage, you should be able to run your script and get results. Save your script, ranking input file and configuration file, and run the script by pressing F5 in the SciTE editor. If you followed all steps correctly, a tray tip appears to indicate that the ranking has started. If not, please read the error information carefully.
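
When a service keeps returning the no-rank value, its regular expression can be tested in isolation. The following is a minimal sketch and not part of the script: the page source and the pattern are hypothetical and do not come from a real ranking service. It mirrors the StringRegExp call and the separator stripping used in _getRank.

```autoit
; Hypothetical sample of a ranking page's source; paste real page source here.
Local $source = '<div class="rank">Rank: <b>1,234</b></div>'
; Hypothetical pattern: the first capturing group holds the rank.
Local $rankingRegex = '<b>([0-9.,]+)</b>'

; Flag 1 returns an array with the captured groups of the first match, as in _getRank.
Local $result = StringRegExp($source, $rankingRegex, 1)

If @error Then
    ConsoleWrite("No match: the script would write the no-rank value." & @CRLF)
Else
    ; Strip thousands separators, as _getRank does.
    Local $rank = StringReplace(StringReplace($result[0], ",", ""), ".", "")
    ConsoleWrite("Rank: " & $rank & @CRLF)
EndIf
```

Run from SciTE with F5, the result shows up in the output window. Once the pattern matches the real page source, it can be copied into the ranking-service line of the configuration file.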

The ranking output file

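After running the ranking script you should get a ranking output file named "rankingOutput.csv" in the same directory as the script. The first line is a header containing "site" followed by the name of each selected ranking service. Each following line contains a site and its ranks, separated by the configured output separator.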

If you run into difficulties with these steps and you don't succeed in fixing the problem yourself, leave a message in the comment section. Don't forget to include the error message shown in the SciTE output window. I will answer your question as soon as possible.
