@tsteur opened this Issue on August 25th 2021 Member

Seeing this in the tests

[screenshot: failing test output]

and an error in the UI

[screenshot: error in the UI]

Seems this regressed.

@sgiehl commented on August 25th 2021 Member

That actually seems to fail randomly

@Findus23 commented on August 25th 2021 Member

This is more or less expected as it is crawling websites that randomly block or rate limit the requests.

@tsteur commented on August 25th 2021 Member

@Findus23 so maybe it's now also blocked on Travis because we run tests more often? In that case we may need to write the tests slightly differently: ideally detect the rate limit and ignore it (skip the test or so), and only fail when the data or URL changes (not for any other error).

To fix it on cloud, maybe we would need to request the data using JavaScript so it goes directly to the search engines? But I assume it might then be a bit harder to test that things work?

@Findus23 commented on August 26th 2021 Member

Then again, I thought we had changed the tests to run against a local copy of the HTTP response, but I might be wrong here.

@sgiehl commented on August 26th 2021 Member

I guess there are still some tests that try to fetch the data from Bing, Alexa and Google to ensure the response doesn't change.

@tsteur commented on August 26th 2021 Member

Created an internal issue for cloud to fix the rate limiting there, where we might simply try to get the data using JS directly from the search engine or so.

Meanwhile, as part of this issue it would be great to look into making the test stable. For example, we could force the test to run into the rate limit and check whether the response code or response message includes rate-limit information; then we can catch this error specifically and ignore it in tests.
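
A minimal sketch of that approach (`fetchRankData()` and the rate-limit detection heuristics here are assumptions for illustration, not the actual SEOTest code):

```php
<?php

use PHPUnit\Framework\TestCase;

class SEOTestSketch extends TestCase
{
    public function testRankIsReturned()
    {
        try {
            $rank = $this->fetchRankData('matomo.org');
        } catch (\Exception $e) {
            // Assumed detection: a 429 status or "rate limit" wording in the
            // message means we were throttled, not that the code is broken.
            if (stripos($e->getMessage(), 'rate limit') !== false
                || strpos($e->getMessage(), '429') !== false) {
                $this->markTestSkipped('Search engine rate limited the request: ' . $e->getMessage());
            }
            throw $e; // any other error is still a real failure
        }

        $this->assertNotEmpty($rank);
    }

    /** Hypothetical stand-in for the code that queries the search engine. */
    private function fetchRankData(string $domain): string
    {
        // ... perform the HTTP request and parse the rank here ...
        return '42';
    }
}
```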

@geekdenz commented on September 14th 2021 Contributor

> That actually seems to fail randomly

You are right, @sgiehl! That was the clue.

See https://github.com/matomo-org/matomo/blob/41b8240fcf79291dbabb2e30a38f321d44b64cf5/plugins/SEO/tests/Integration/SEOTest.php#L42

I debugged it, and it seems it fails randomly depending on the user agent that is passed.

So, I wrote this script:
https://gist.github.com/geekdenz/2496eaaf7c437ba49bc389e75a10b880

Ran it and found that it outputs an empty search result at offsets where `$i % 3 === 2`:

[screenshot: script output showing the empty search result]

So, it seems this is the UA that Bing does not like:

https://github.com/matomo-org/matomo/blob/41b8240fcf79291dbabb2e30a38f321d44b64cf5/plugins/SEO/tests/Integration/SEOTest.php#L39

I'll change it to a different one, as Bing seems not to work with that one. Also, I think it should maybe not be completely random, but should be different on every subsequent call. I realise that might be hard to do, though. Could it be `$build_no % count($user_agents)`?
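
For reference, a sketch of that deterministic selection (the `TRAVIS_BUILD_NUMBER` environment variable and the UA strings are assumptions, not the exact array from SEOTest.php):

```php
<?php

// Pick the user agent from the CI build number instead of mt_rand(), so each
// run uses a different but reproducible UA.
$userAgents = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1 Safari/605.1.15',
    'Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0',
];

$buildNo   = (int) (getenv('TRAVIS_BUILD_NUMBER') ?: 0);
$userAgent = $userAgents[$buildNo % count($userAgents)];
```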

@geekdenz commented on September 14th 2021 Contributor

FYI I was able to reproduce this in Safari by changing the UA string to `Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36`:

[screenshot: empty Bing search results in Safari]

@justinvelluppillai commented on September 14th 2021 Contributor

@geekdenz nice find. I don't know if `mt_rand` is great in a test either. I wonder if it could either use each of the user agents one after the other, or do what @tsteur suggested and keep trying a different user agent if one fails (depending on whether there's a reason to test different user agents at all).
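
One way to try each of the user agents one after the other would be a PHPUnit data provider, so a failure names the offending UA. A sketch; `getRank()` is a hypothetical stand-in for the real SEO lookup:

```php
<?php

use PHPUnit\Framework\TestCase;

class SEOUserAgentTest extends TestCase
{
    public function userAgentProvider(): array
    {
        return [
            'Chrome on Windows' => ['Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36'],
            'Firefox on Linux'  => ['Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0'],
        ];
    }

    /** @dataProvider userAgentProvider */
    public function testRankWithUserAgent(string $userAgent)
    {
        $this->assertNotEmpty($this->getRank('matomo.org', $userAgent));
    }

    /** Hypothetical stand-in for the real SEO rank lookup. */
    private function getRank(string $domain, string $userAgent): string
    {
        // ... send the request with the given User-Agent header ...
        return '42';
    }
}
```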

@geekdenz commented on September 14th 2021 Contributor

I think using the same user agent every time would suffice actually.

@geekdenz commented on September 14th 2021 Contributor

@sgiehl @tsteur

Why are we using an array of user agents here? I feel like there must have been an original reason/issue with using just one. Or was it just added to avoid a problem that was merely anticipated?

@sgiehl commented on September 15th 2021 Member

@geekdenz The code that queries the metrics uses `$_SERVER['HTTP_USER_AGENT']`, so the user agent of the current user. I wonder if it might make sense to check if the results are empty and retry the query with another user agent, so the results are fetched correctly even if the user agent is not supported 🤔
Using random user agents in tests sounds valid; otherwise we might not have discovered this. But it would be good to comment somewhere in the code that Bing might not return results for some user agents.
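
A rough sketch of that retry idea (`fetchResults()` and the fallback UA string are assumptions for illustration, not the actual Matomo code):

```php
<?php

function getSearchMetrics(string $domain): array
{
    $fallbackUa = 'Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0';
    $currentUa  = $_SERVER['HTTP_USER_AGENT'] ?? $fallbackUa;

    $results = fetchResults($domain, $currentUa);

    // Bing may return an empty page for some user agents, so an empty result
    // does not necessarily mean there is no data; retry once with the fallback.
    if (empty($results) && $currentUa !== $fallbackUa) {
        $results = fetchResults($domain, $fallbackUa);
    }

    return $results;
}

/** Hypothetical stand-in for the HTTP request against the search engine. */
function fetchResults(string $domain, string $userAgent): array
{
    // ... send the request with the given User-Agent header and parse it ...
    return [];
}
```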

@justinvelluppillai commented on September 15th 2021 Contributor

@sgiehl @geekdenz I think the test should either test a few different UAs or just one, rather than a random one. If we test one, we could use that same one with confidence as the backup user agent to retry with on empty results. Maybe worth making that a separate issue. I've merged this one for now to prevent the failing test in the meantime.

@tsteur commented on September 17th 2021 Member

@geekdenz this still seems to be failing randomly, see https://app.travis-ci.com/github/matomo-org/matomo/jobs/537951969#L1130-L1130

[screenshot: Travis test failure]

Or am I looking at it wrong? The test ran after the other PR was merged, AFAIK.

@geekdenz commented on September 17th 2021 Contributor

Yeah, thanks for pointing it out. I saw this earlier as well but thought it might be an artifact of a merge not having happened somewhere. I think maybe the find I made just looked like this in Safari; I'll have another look in Chrome. Then I might reinstate the array and make it retry with another UA if the request fails on the first one, etc., and only have it fail if all of them fail.
