Log analytics - UTF-8 link from search bug #5885

Closed
faitno opened this issue Jul 26, 2014 · 6 comments
Labels
Bug: For errors / faults / flaws / inconsistencies etc.
duplicate: For issues that already existed in our issue tracker and were reported previously.

Comments


faitno commented Jul 26, 2014

I use piwik/misc/log-analytics/import_logs.py for my log analytics.
A small share of search queries is displayed as "????". However, if you look at the source HTML, everything is OK.
The screenshot shows the link address http://yandex.ru/yandsearch?text=%EA%F3%EF%E8%F2%FC+%E2%FB%EF%F3%F1%EA%ED%EE%E5+%EF%EB%E0%F2%FC%E5&lr=213
which was converted into Russian correctly.
This is observed on about 20% of all requests coming from Russian Yandex and Google.
[Screenshot: piwik log analytics]

mattab added the Bug label Aug 3, 2014
mattab added this to the Short term milestone Aug 3, 2014
Member

mattab commented Aug 3, 2014

Thanks for the report! Could you please attach a small log file that helps us reproduce the issue?

Also which command line did you run to import it?

We will investigate and fix the issue then!

Author

faitno commented Aug 5, 2014

I use this bash script:
#!/bin/bash
SCRIPT="python piwik/misc/log-analytics/import_logs.py --url=http://........../piwik/"
PARAM="--enable-http-redirects --enable-static --enable-bots --enable-http-errors --enable-reverse-dns --recorders=2 --add-sites-new-hosts"
PATHLOG="/var/log/nginx/"
# Import every file in $PATHLOG whose name ends in "-access.log"
for i in $(ls "$PATHLOG")
do
    if [[ "$i" = *"-access.log" ]]
    then
        # $SCRIPT and $PARAM are left unquoted on purpose so they split into separate arguments
        $SCRIPT "$PATHLOG$i" $PARAM
        echo "$PATHLOG$i"
    fi
done

Author

faitno commented Aug 8, 2014

In the log file:
******.ru 95...* - - [08/Aug/2014:13:31:19 +0600] "GET /%D0%B8%D0%BC%D0%BF%D0%BB%D0%B0%D0%BD%D1%82%D0%B0%D1%86%D0%B8%D1%8F-%D0%B7%D1%83%D0%B1%D0%BE%D0%B2 HTTP/1.0" 200 68785 "http://yandex.ru/yandsearch?text=%E8%EC%EF%EB%E0%ED%F2%E0%F6%E8%FF+%E7%F3%E1%EE%E2+%E5%EA%E0%F2%E5%F0%E8%ED%E1%F3%F0%E3&lr=213" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0)"
And this is how that visitor appears in Piwik:
[Screenshot: piwik log analytics2]
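
The log line above appears to mix two encodings: the request path percent-decodes to valid UTF-8, while the referrer's `text` parameter does not. The following is a minimal Python 3 sketch for checking this, not code taken from import_logs.py. The pattern `LINE_RE`, the helper `is_valid_utf8`, and the host and IP in `SAMPLE` (which are masked in the report) are assumptions made for illustration.

```python
import re
from urllib.parse import unquote_to_bytes

# Hypothetical pattern for the line format shown in this comment:
# host, client IP, two unused fields, timestamp, request line,
# status, response size, referrer, user agent.
LINE_RE = re.compile(
    r'(?P<host>\S+) (?P<ip>\S+) \S+ \S+ \[(?P<date>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def is_valid_utf8(percent_encoded: str) -> bool:
    """Return True if the percent-decoded bytes form valid UTF-8."""
    try:
        unquote_to_bytes(percent_encoded).decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Host and IP are placeholders; the rest is copied from the reported line.
SAMPLE = (
    'example.ru 192.0.2.1 - - [08/Aug/2014:13:31:19 +0600] '
    '"GET /%D0%B8%D0%BC%D0%BF%D0%BB%D0%B0%D0%BD%D1%82%D0%B0%D1%86'
    '%D0%B8%D1%8F-%D0%B7%D1%83%D0%B1%D0%BE%D0%B2 HTTP/1.0" 200 68785 '
    '"http://yandex.ru/yandsearch?text=%E8%EC%EF%EB%E0%ED%F2%E0%F6%E8%FF'
    '+%E7%F3%E1%EE%E2+%E5%EA%E0%F2%E5%F0%E8%ED%E1%F3%F0%E3&lr=213" '
    '"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0)"'
)

fields = LINE_RE.match(SAMPLE).groupdict()
print("path is UTF-8:", is_valid_utf8(fields["path"]))
print("referrer is UTF-8:", is_valid_utf8(fields["referrer"]))
```

For this line the path passes the UTF-8 check and the referrer does not, which would be consistent with the keyword being sent in a legacy single-byte encoding such as Windows-1251.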

Member

mattab commented Dec 18, 2014

Hi @soulcreate, thanks for the report. I tried to add detection for this search engine, but it didn't work. @sgiehl, maybe you have an idea? My test fixture was:


- url: 'http://yandex.ru/yandsearch?text=%EA%F3%EF%E8%F2%FC+%E2%FB%EF%F3%F1%EA%ED%EE%E5+%EF%EB%E0%F2%FC%E5&lr=213'
  engine: 'Yandex'
  keywords: 'купить выпускное платье'
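
The `text` parameter in this fixture yields the expected keywords when its bytes are read as Windows-1251 (cp1251) rather than UTF-8. The following is a minimal Python 3 sketch of a decoder that tries UTF-8 first and falls back to cp1251. The function names `decode_keyword` and `keyword_from_referrer` are hypothetical, and using cp1251 as the fallback is an assumption that happens to fit the Yandex URLs in this thread; it is not a description of how Piwik detects keywords.

```python
from typing import Optional
from urllib.parse import urlsplit, unquote_to_bytes


def decode_keyword(raw: str, fallback: str = "cp1251") -> str:
    """Percent-decode a query value, trying UTF-8 before a legacy fallback."""
    # In query strings '+' stands for a space.
    data = unquote_to_bytes(raw.replace("+", " "))
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: non-UTF-8 keywords from Russian engines are cp1251.
        return data.decode(fallback, errors="replace")


def keyword_from_referrer(url: str, param: str = "text") -> Optional[str]:
    """Return the decoded value of `param` from the URL query, or None."""
    # Split manually so the raw percent-encoded value is preserved.
    for pair in urlsplit(url).query.split("&"):
        name, _, value = pair.partition("=")
        if name == param:
            return decode_keyword(value)
    return None


FIXTURE_URL = (
    "http://yandex.ru/yandsearch?text=%EA%F3%EF%E8%F2%FC"
    "+%E2%FB%EF%F3%F1%EA%ED%EE%E5+%EF%EB%E0%F2%FC%E5&lr=213"
)
print(keyword_from_referrer(FIXTURE_URL))
```

Trying UTF-8 first keeps already-correct keywords untouched, because cp1251-encoded Cyrillic text like the one here is not valid UTF-8 and so triggers the fallback.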

Author

faitno commented Jan 5, 2015

In PHP this is very easy to fix using the "rawurldecode" function:
http://php.net/manual/en/function.rawurldecode.php

Member

mattab commented Mar 12, 2015

Issue was moved to the new repository for Piwik Log Analytics: https://github.com/piwik/piwik-log-analytics/issues

refs #7163

mattab closed this as completed Mar 12, 2015
mattab added the duplicate label Mar 12, 2015