@ozdemirburak opened this Pull Request on October 7th 2018 Contributor

fixes #13427


I've added an alternative method for fetching the global ranking from Alexa.

That said, calling http://data.alexa.com/data?cli=10&url=DOMAIN currently works for me. I did hit this issue last month, though, so I do not know whether they are blacklisting or rate-limiting certain IP ranges.

Finally, anyone who calls this new method too frequently will probably end up blacklisted.

@Findus23 commented on October 9th 2018 Member


Many thanks for your contribution. Can you explain what the fallback does?

I still only get Okay when I request http://data.alexa.com/data?cli=10&url=example.com

@ozdemirburak commented on October 9th 2018 Contributor


First, if an exception occurs, as in your case, it sends an HTTP request to the website's public Alexa ranking page, for instance https://www.alexa.com/siteinfo/example.com, and then matches the global ranking and local ranking rows.

Since that HTML is kinda messy, it first collapses runs of whitespace into a single space, and what we get is something like the below.

<strong class="metrics-data align-vmiddle">19,460</strong>
<strong class="metrics-data align-vmiddle">6,517</strong>

Finally, it matches the text inside the strong element with the class metrics-data align-vmiddle, which is 19,460 here, and returns it as an integer, 19460.

If it can't match anything, for instance if the Amazon/Alexa developers decide to change the class attribute of the strong element, it returns null.
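
The steps described above (collapse whitespace, match the strong element, strip the thousands separator, return an integer or null) can be sketched roughly as follows. This is an illustrative Python sketch, not the PHP code from this pull request; the function name `parse_global_rank`, the pattern, and the sample markup are assumptions made up for the example, and the live page may look different.

```python
import re
from typing import Optional

# Hypothetical sample, modelled on the two rows quoted above but with
# messy whitespace added to show why the collapse step is needed.
SAMPLE_HTML = """
<strong   class="metrics-data align-vmiddle">
    19,460
</strong>
<strong class="metrics-data align-vmiddle">6,517</strong>
"""

# Matches the first number inside a strong element with the expected class.
RANK_PATTERN = re.compile(
    r'<strong class="metrics-data align-vmiddle">\s*(\d[\d,]*)\s*</strong>'
)


def parse_global_rank(html: str) -> Optional[int]:
    """Return the first ranking found in the markup, or None if nothing matches."""
    # Step 1: replace every run of whitespace with a single space.
    collapsed = re.sub(r"\s+", " ", html)
    # Step 2: look for the strong element carrying the ranking.
    match = RANK_PATTERN.search(collapsed)
    if match is None:
        # For example, the class name was changed upstream.
        return None
    # Step 3: drop the thousands separator and convert to an integer.
    return int(match.group(1).replace(",", ""))


if __name__ == "__main__":
    print(parse_global_rank(SAMPLE_HTML))
```

Returning `None` rather than raising mirrors the "return null" behaviour described above, so a caller can treat a changed page layout the same way as a missing ranking.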

BTW, this is my output right now, queried from a Turkish IP address. Is the OK message you get also in XML format? I can't remember.

<?xml version="1.0" encoding="UTF-8"?>
<!-- Need more Alexa data?  Find our APIs here: https://aws.amazon.com/alexa/ -->
<ALEXA VER="0.9" URL="example.com/" HOME="0" AID="=" IDN="example.com/">
<SD><POPULARITY URL="example.com/" TEXT="19460" SOURCE="panel"/><REACH RANK="16581"/><RANK DELTA="-3629"/><COUNTRY CODE="IN" NAME="India" RANK="6517"/></SD></ALEXA>
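
For comparison, reading the global ranking out of an XML response like the one above could look roughly like this. It is a hedged Python sketch, not the project's PHP code: the function name `parse_xml_rank` is made up for illustration, and treating the "Okay" reply as a non-XML body is an assumption, since its exact format is not confirmed in this thread.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# The response quoted above, kept as bytes so the encoding declaration is honoured.
SAMPLE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<!-- Need more Alexa data?  Find our APIs here: https://aws.amazon.com/alexa/ -->
<ALEXA VER="0.9" URL="example.com/" HOME="0" AID="=" IDN="example.com/">
<SD><POPULARITY URL="example.com/" TEXT="19460" SOURCE="panel"/><REACH RANK="16581"/><RANK DELTA="-3629"/><COUNTRY CODE="IN" NAME="India" RANK="6517"/></SD></ALEXA>"""


def parse_xml_rank(payload: bytes) -> Optional[int]:
    """Return the TEXT attribute of POPULARITY as an integer, or None."""
    try:
        root = ET.fromstring(payload)
    except ET.ParseError:
        # Assumption: a bare "Okay" body is not well-formed XML.
        return None
    node = root.find("./SD/POPULARITY")
    if node is None:
        return None
    text = node.get("TEXT", "")
    # Guard against an empty or non-numeric attribute value.
    return int(text) if text.isdigit() else None


if __name__ == "__main__":
    print(parse_xml_rank(SAMPLE_XML))
```

A `None` result here is the point where a caller would fall back to the HTML page instead.
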

@Findus23 commented on October 11th 2018 Member

Ah, I misunderstood your code. Now it makes sense, and at least according to https://stackoverflow.com/questions/50279057/alexa-site-rank-api your solution is the only way that still works (you could maybe try https://www.alexa.com/minisiteinfo/stackoverflow.com, as it is probably simpler).

I have now tried from multiple networks, including a university network and IPv6, and it never worked for me.

@ozdemirburak commented on October 11th 2018 Contributor

Updated the URL. It now uses DOMXPath to extract the node value, if that is OK, since a regex for that value would be complex and ugly.

@diosmosis commented on December 4th 2018 Member

Works for me, thanks for the great contribution @ozdemirburak !

This Pull Request was closed on December 4th 2018