I'm preparing an article with a top 100 of software testing blogs. Rather than writing down my personal choice, I will compile a large list of blogs and rank them using the approach from the article on how to make a top blog list on Noop.nl. Several rankings are taken into account: Google PageRank (PR), Alexa reach, Alexa popularity, Technorati authority and the number of referring links according to Google.
Gathering a list of 300+ blogs is a good start, but retrieving the rankings of each blog by hand is something I don't intend to do. I therefore created a script that checks the rankings for a list of sites automatically. For each site in the input file, the script fetches the rank from each of the selected ranking services. Each site and its ranks are written to a file that can be opened with MS Excel afterwards.
To run the script you need AutoIt. You can download the latest version from http://www.autoitscript.com/. Pick "AutoIt Full Installation", which includes SciTE, an excellent AutoIt editor providing syntax highlighting, autocompletion, and more.
The script is completely free to use, though a comment in the comment section of this post would be appreciated :-)
Preparation of the pagerank script
The rank check script consists of four main files:
A configuration file (.ini)
An input file containing all sites to get the ranking for (.txt)
The script file (.au3)
An output file containing all sites with their rankings (.csv)
Three out of four files need to be created manually by copying and pasting the code below. The fourth file is generated automatically by the script.
The configuration file
Create a file "checkRanking.ini" and add the following code to it. You can tweak the configuration further, for example by adding ranking services to work with:
[launch]
selectedRankings=googlePR,googleLink,alexaPopularity,technorati
[common]
#The name of the ranking input file
rankingInputFileName=rankingInput.txt
#The name of the output ranking file
rankingOutputFileName=rankingOutput.csv
#The separator of the output ranking file
rankingOutputSeparator=,
#The separator of the ranking service details
rankingServiceSeparator=§
#The tag in the ranking service url to replace by the site to rank
rankingServiceUrlTag=[url]
#The ranking value for sites not having a rank
rankingValueNoRank=N/A
#The ranking value for sites in the excluded domains
rankingValueExcluded=Excluded
[ranking-service]
#The ranking services:
#parameter 1 = the url of the service to request
#parameter 2 = the regular expression to fetch the ranking
googlePR=http://www.google.com/search?client=navclient-auto&ch=6-1484155081&features=Rank&q=info:[url]§.*:\d:(.*)
alexaPopularity=http://data.alexa.com/data?cli=10&dat=snbamz&url=[url]§POPULARITY\s\S+\sTEXT=\"(.\d+)\"
alexaReach=http://data.alexa.com/data?cli=10&dat=snbamz&url=[url]§REACH\sRANK=\"(\d+)\"
technorati=http://www.technorati.com/blogs/[url]§>Authority:\s([\d,]+)
googleLink=http://www.google.be/search?hl=nl&q=link%3A[url]§([\d\.]+)\smet\slinks\stot
#The domain names which shouldn't get a ranking at a certain ranking-service
[domains-to-exclude]
alexaPopularity=.thoughtworks.com§.msdn.com
alexaReach=.thoughtworks.com§.msdn.com
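To make the format of a [ranking-service] entry concrete, here is a minimal sketch of how one entry splits into a request URL and a regular expression. It is written in Python purely for illustration and is not part of the AutoIt script; the function name parse_service and the sample site are assumptions of this sketch.

```python
import re

SERVICE_SEPARATOR = "§"   # same role as rankingServiceSeparator
URL_TAG = "[url]"         # same role as rankingServiceUrlTag


def parse_service(line, site):
    """Split one ranking-service value into (request_url, compiled_regex)."""
    parts = line.split(SERVICE_SEPARATOR)
    if len(parts) != 2:
        raise ValueError("expected two subitems: service url and regular expression")
    url_template, pattern = parts
    # The tag in the url template is replaced by the site to rank.
    return url_template.replace(URL_TAG, site), re.compile(pattern)


# The alexaReach entry from the configuration above.
entry = r'http://data.alexa.com/data?cli=10&dat=snbamz&url=[url]§REACH\sRANK=\"(\d+)\"'
url, regex = parse_service(entry, "http://www.noop.nl")
print(url)
```

An entry with more or fewer than two subitems is rejected, which mirrors the check the script performs when it reads the configuration.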
The site input file
Create a file "rankingInput.txt" and add the sites of your choice to it, for instance:
http://www.testingminded.com
http://www.testsquad.org
http://www.problogdesign.com
http://www.problogger.net
http://www.noop.nl
http://blogs.msdn.com/joshpoley/
http://blogs.thoughtworks.com/testblog
http://notexistingsite.com
The script file
Create a file "checkRanking.au3" and add the following code to it:
;----------------------- TESTINGMINDED -----------------------
;--- pagerank checking script from www.testingminded.com -----
;-------------------- DECLARATION SECTION --------------------
#include <ARRAY.AU3>
#include <FILE.AU3>
#include <INET.AU3>
Opt("TrayAutoPause", 0)
Global $sitesToRank
_inform("Initializing...")
$scriptName = StringLeft(@ScriptName, StringLen(@ScriptName) - 4)
$iniLocation = @ScriptDir & '\' & $scriptName & '.ini'
If Not (FileExists($iniLocation)) Then
    MsgBox(0, "Exception", "The configuration file could not be found: " & $iniLocation)
    Exit
EndIf
$rankingInputFileName = IniRead($iniLocation, "common", "rankingInputFileName", "NotFound")
$rankingOutputFileName = IniRead($iniLocation, "common", "rankingOutputFileName", "NotFound")
$rankingOutputSeparator = IniRead($iniLocation, "common", "rankingOutputSeparator", "NotFound")
$rankingServiceSeparator = IniRead($iniLocation, "common", "rankingServiceSeparator", "NotFound")
$rankingServiceUrlTag = IniRead($iniLocation, "common", "rankingServiceUrlTag", "NotFound")
$rankingValueNoRank = IniRead($iniLocation, "common", "rankingValueNoRank", "NotFound")
$rankingValueExcluded = IniRead($iniLocation, "common", "rankingValueExcluded", "NotFound")
$selectedRankings = IniRead($iniLocation, "launch", "selectedRankings", "NotFound")
$selectedRankingSettings = _arraycreate("", "", "", "", "", "", "", "", "", "", "", "", "", "")
; ------------------------------------------------------------------------------
; MAIN
; ------------------------------------------------------------------------------
$rankingInputFile = @ScriptDir & "\" & $rankingInputFileName
If Not (FileExists($rankingInputFile)) Then
    MsgBox(0, "Exception", "The ranking input file could not be found: " & $rankingInputFile)
    Exit
EndIf
$timeStamp = @YEAR & '-' & @MON & '-' & @MDAY & '_' & @HOUR & '-' & @MIN & '-' & @SEC
$RankingOutputFile = @ScriptDir & "\" & $rankingOutputFileName
_FileReadToArray($rankingInputFile, $sitesToRank)
;retrieve the ranking service details, but only for the selected ranking services
$rankingSettings = _getSelectedRankingSettings($selectedRankings)
;create the ranking output file
_FileCreate($RankingOutputFile)
;create the header of the output file
$rankHeader = ""
_addItemToList($rankHeader, "site")
For $i = 1 To UBound($rankingSettings) - 1
    $rankingSetting = $rankingSettings[$i]
    $rankingName = $rankingSetting[1]
    _addItemToList($rankHeader, $rankingName)
Next
FileWrite($RankingOutputFile, $rankHeader & @CRLF)
;loop through the sites to rank
For $i = 1 To UBound($sitesToRank) - 1
    $rankedSite = $sitesToRank[$i]
    For $j = 1 To UBound($rankingSettings) - 1
        $rank = _getRank($sitesToRank[$i], $rankingSettings[$j])
        _addItemToList($rankedSite, $rank)
    Next
    FileWrite($RankingOutputFile, $rankedSite & @CRLF)
Next
MsgBox(0, "Information", "The ranking of " & UBound($sitesToRank) - 1 & " sites at " & UBound($rankingSettings) - 1 & " ranking services has completed." & @CRLF & "Please check the ranking output file for the results: " & $RankingOutputFile)
; ------------------------------------------------------------------------------
; FUNCTIONS
; ------------------------------------------------------------------------------
;append an item to a separator-delimited list
Func _addItemToList(ByRef $list, $item)
    If $list == "" Then
        $list = $item
    Else
        $list = $list & $rankingOutputSeparator & $item
    EndIf
EndFunc ;==>_addItemToList
Func _getSelectedRankingSettings($selectedRankings)
    ;selectedRankings is comma-separated in the .ini file, whatever the output separator is
    $selectedRankings = StringSplit($selectedRankings, ",")
    ReDim $selectedRankingSettings[UBound($selectedRankings)]
    ;element 0 holds the number of selected ranking services
    $selectedRankingSettings[0] = $selectedRankings[0]
    For $i = 1 To UBound($selectedRankings) - 1
        $line = IniRead($iniLocation, "ranking-service", $selectedRankings[$i], "NotFound")
        If $line == "NotFound" Then
            MsgBox(0, "Exception", "Could not find ranking-service parameter " & $selectedRankings[$i] & " in the configuration file")
            Exit
        Else
            $lineParameters = StringSplit($line, $rankingServiceSeparator)
            If ($lineParameters[0] == 2) Then
                ;look up the domains to exclude for this ranking service, if any
                $domainLine = IniRead($iniLocation, "domains-to-exclude", $selectedRankings[$i], "NotFound")
                Dim $domains[1]
                If Not ($domainLine == "NotFound") Then
                    $domains = StringSplit($domainLine, $rankingServiceSeparator)
                EndIf
                Dim $rankingSetting[5] = [4, $selectedRankings[$i], $lineParameters[1], $lineParameters[2], $domains]
                $selectedRankingSettings[$i] = $rankingSetting
            Else
                MsgBox(0, "Exception", "Ranking-service parameter " & $line & " does not contain the correct number of subitems: ranking-service url, regular expression")
                Exit
            EndIf
        EndIf
    Next
    Return $selectedRankingSettings
EndFunc ;==>_getSelectedRankingSettings
Func _getRank($site, $rankingSettings)
    $rankingName = $rankingSettings[1]
    $rankingRegex = $rankingSettings[3]
    $domainsToExclude = $rankingSettings[4]
    ;check the site itself (not the request url) against the excluded domains
    $toExclude = False
    For $i = 1 To UBound($domainsToExclude) - 1
        If StringInStr($site, $domainsToExclude[$i]) Then
            $toExclude = True
            ExitLoop
        EndIf
    Next
    If Not $toExclude Then
        ;the googleLink request needs the site to be url-encoded
        If $rankingName = "googleLink" Then
            $rankingUrl = StringReplace($rankingSettings[2], $rankingServiceUrlTag, _urlEncode($site))
        Else
            $rankingUrl = StringReplace($rankingSettings[2], $rankingServiceUrlTag, $site)
        EndIf
        _inform("Getting " & $rankingName & " rank for " & $site)
        ;launch the ranking service request and get the source code
        $source = _INetGetSource($rankingUrl)
        ;retrieve the ranking value
        $rankingRegResult = StringRegExp($source, $rankingRegex, 1)
        If @error Or ($rankingRegResult[0] == "") Then
            Return $rankingValueNoRank
        Else
            ;strip thousands separators so the value is a plain number
            Return StringReplace(StringReplace($rankingRegResult[0], ",", ""), ".", "")
        EndIf
    Else
        Return $rankingValueExcluded
    EndIf
EndFunc ;==>_getRank
;encode the slashes and colons of a url for use inside a query string
Func _urlEncode($url)
    $url = StringReplace($url, "/", "%2F")
    $url = StringReplace($url, ":", "%3A")
    Return $url
EndFunc ;==>_urlEncode
;show a progress message in the system tray
Func _inform($message, $timeout = 3)
    TrayTip("checkRanking progress...", $message, $timeout)
EndFunc ;==>_inform
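The heart of the script is what happens to a response once it has been fetched: skip excluded domains, apply the regular expression, and strip thousands separators. The sketch below restates that logic in Python as an illustration only. It is not a port of the script, it does no network requests, and the function names and the sample response text are assumptions of this sketch.

```python
import re

NO_RANK = "N/A"          # same role as rankingValueNoRank
EXCLUDED = "Excluded"    # same role as rankingValueExcluded


def is_excluded(site, excluded_domains):
    """True when the site url contains one of the excluded domain fragments."""
    # Case-insensitive, like AutoIt's StringInStr with its default settings.
    return any(domain.lower() in site.lower() for domain in excluded_domains)


def extract_rank(source, pattern):
    """Return the first captured group with ',' and '.' separators removed."""
    match = re.search(pattern, source)
    if match is None or match.group(1) == "":
        return NO_RANK
    return match.group(1).replace(",", "").replace(".", "")


def get_rank(site, source, pattern, excluded_domains=()):
    """Rank for one site at one service, given the already fetched source."""
    if is_excluded(site, excluded_domains):
        return EXCLUDED
    return extract_rank(source, pattern)


# Made-up response text, shaped to fit the technorati expression from the .ini file.
page = "<span>Authority: 4,130</span>"
print(get_rank("http://www.problogger.net", page, r">Authority:\s([\d,]+)"))
```

Passing the fetched source in as a parameter keeps the extraction testable without contacting any ranking service, which is handy when a service changes its page layout and a regular expression needs adjusting.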
Test running the script
At this stage, you should be able to run your script and get results. Save your script, ranking input file and configuration file, and run the script by pressing F5 in your SciTE editor. If you followed all steps correctly, a traytip appears to indicate the ranking has started. If not, please read the error information carefully. If you encounter difficulties with these steps and don't succeed in fixing the problem yourself, leave a message in the comment section. Don't forget to include the error message shown in the SciTE output window. I will answer your question as soon as possible.
The ranking output file
After running the ranking script you should get a ranking output file named "rankingOutput.csv" in the same directory as the script. The content should look like this:
site,googlePR,googleLink,alexaPopularity,technorati
http://www.testingminded.com,1,2,2020990,N/A
http://www.testsquad.org,3,13,1595485,5
http://www.problogdesign.com,5,392,44114,304
http://www.problogger.net,6,11000,4202,4130
http://www.noop.nl,4,420,166693,174
http://notexistingsite.com,N/A,N/A,N/A,N/A
http://blogs.msdn.com/joshpoley/,4,108,Excluded,N/A
http://blogs.thoughtworks.com/testblog,N/A,N/A,Excluded,N/A
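If you prefer not to sort the output in Excel, a few lines of code can order the rows by one of the ranking columns. The following is a minimal sketch in Python, assuming the comma-separated layout shown above; the function name sort_by_rank is hypothetical and the sample rows are copied from the output above.

```python
import csv
import io


def sort_by_rank(csv_text, column, descending=True):
    """Return the rows ordered by a numeric column; non-numeric values go last."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    ranked = [r for r in rows if r[column].isdigit()]
    # Rows holding "N/A" or "Excluded" cannot be compared, so they are appended.
    unranked = [r for r in rows if not r[column].isdigit()]
    ranked.sort(key=lambda r: int(r[column]), reverse=descending)
    return ranked + unranked


sample = """site,googlePR,googleLink,alexaPopularity,technorati
http://www.testsquad.org,3,13,1595485,5
http://www.problogger.net,6,11000,4202,4130
http://notexistingsite.com,N/A,N/A,N/A,N/A
http://www.noop.nl,4,420,166693,174
"""

for row in sort_by_rank(sample, "googlePR"):
    print(row["site"], row["googlePR"])
```

For columns where a lower number is better, such as alexaPopularity, pass descending=False.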



Hi,
Thanks for this extremely useful resource. But I have a problem: pageranks are not being calculated in my case. It shows N/A only. Where did I go wrong? Please help me.
I can't think of a reason right away. Could you please paste the content of your configuration and site input files here? I will have a look then.
Nice script dude, but the result is N/A for every site, even the ones with a known PR.
-Are you running this script from behind a proxy?
-Can you upload a zip with your script and configuration files and send me the link?
Hey, in my scenario I also have to capture the result and save it to an Excel or text file. For example, PuTTY logs in to a server and the df -k command is run; I then want to save those results to an Excel file through AutoIt. Is that possible?
PuTTY supports outputting to a text file and AutoIt has user-defined functions supporting Excel automation. So yes, this should be possible.
I think the reason would be the same as when bulk pagerank checkers are used repeatedly: the site errors out and gives N/A results instead.
This is a really nice little script but I'm getting an N/A for every site, even the ones that I know have PR.
Any suggestions?
Hello,
I copied the URL into the browser and got a 403 error. It appears that Google has changed its hashing algorithm. The URL I provided in this article therefore no longer works.
One option is to switch to Ruby and use the following script: https://github.com/pstadler/pagerank-service
Porting the above Ruby script to AutoIt is also an option :-)