import_logs.py ignores lines after a line with http 200 status is processed #5161
Comments
Hi again,
After some investigation I found the following. The only difference between the two HTTP calls to the Piwik server (each call corresponds to one line in the file) is:
'action_name': u'303/URL=http%3A%2F%2Fwww.british-library.co.uk%2Fid%2Fresource%2F013541591'
Specifically, 'action_name' is only included if the hit (i.e. the line in the file) is an error or a redirect. If it is not included (i.e. HTTP status 200), the line is ignored by the Piwik server completely.
Is this behavior a bug? Am I missing something?
If I alter the import_logs.py script to always include 'action_name' for all lines, then all lines are included in the results on the Piwik server. But I am not sure whether doing so could cause other issues.
Looking forward to a reply from a programmer more familiar with Piwik. :-) Thanks in advance!
In detail:
|
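The behavior described in the comment above can be sketched roughly as follows. This is only an illustration, not the actual import_logs.py code: the helper name `build_tracking_args`, the `always_set_action_name` switch, and the status ranges used to classify redirects and errors are all assumptions. Only the `action_name` format is taken from the value quoted in the comment.

```python
from urllib.parse import quote


def build_tracking_args(url, status, always_set_action_name=False):
    """Hypothetical helper: build the parameters sent for one log line.

    Assumption: 3xx counts as a redirect and 4xx/5xx as an error.
    """
    args = {"url": url}
    is_redirect = 300 <= status < 400
    is_error = status >= 400
    # 'action_name' is only added for errors and redirects, unless the
    # (hypothetical) workaround switch forces it for every line.
    if is_redirect or is_error or always_set_action_name:
        args["action_name"] = "%d/URL=%s" % (status, quote(url, safe=""))
    return args


redirect_args = build_tracking_args(
    "http://www.british-library.co.uk/id/resource/013541591", 303
)
ok_args = build_tracking_args("http://example.org/doc/resource/1", 200)
print("action_name" in redirect_args, "action_name" in ok_args)
```

With the switch left off, the 200 line carries no `action_name`, which is the one difference between the two calls reported above. Setting `always_set_action_name=True` mirrors the workaround of always including it.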
What happens is that these two lines are for the same pageview, in the same second. By design Piwik tracks a given pageview only once per second. So could you try to set the other pageview 1 or 2 seconds later and try again? |
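To try the suggestion of moving one pageview a second or two later, a small helper like the one below could rewrite the timestamp of a log line. It is a sketch under assumptions: it expects the usual Apache-style `[day/month/year:hour:minute:second zone]` field, and the sample line uses a made-up timestamp and path, since the timestamps of the real log lines are not shown in this thread. The function name `shift_timestamp` is hypothetical.

```python
import re
from datetime import datetime, timedelta

# Assumed Apache-style timestamp layout, e.g. 12/May/2014:10:00:00 +0100
APACHE_TS = "%d/%b/%Y:%H:%M:%S %z"
TS_PATTERN = re.compile(r"\[([^\]]+)\]")


def shift_timestamp(line, seconds):
    """Return the log line with its first [timestamp] moved by `seconds`."""
    def repl(match):
        ts = datetime.strptime(match.group(1), APACHE_TS)
        ts += timedelta(seconds=seconds)
        return "[" + ts.strftime(APACHE_TS) + "]"
    return TS_PATTERN.sub(repl, line, count=1)


# Hypothetical sample line; the timestamp and path are invented.
sample = '66.249.76.11 - - [12/May/2014:10:00:00 +0100] "GET /example HTTP/1.1" 200 -'
print(shift_timestamp(sample, 2))
```

Writing the shifted line back into the test log and re-importing it shows whether the one-pageview-per-second rule is what drops the second hit.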
Hi Matt, I tried (on a clean Piwik installation) with the following lines: 66.249.76.11 - - +0100 "GET /id/resource/013541581 HTTP/1.1" 303 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" And still the result is the same. Also, the requests are for different resources in both cases, so I assumed they should not be counted as a single page view even if the timestamp was exactly the same. Any ideas? Thanks, |
Attachment: |
I created test-log.log as follows:
Then imported it with: Then I see both requests in the visitor log, see screenshot: http://issues.piwik.org/attachments/5161/two%20requests%20imported.png So please upgrade to latest beta: http://piwik.org/faq/how-to-update/faq_159/ and let me know if you still have the problem? |
Hi Matt, I followed the instructions at http://piwik.org/faq/how-to-update/faq_159/. I selected "The latest beta release" under "When checking for new version of Piwik, always get:", but when I press "Check for updates" I get "You are using the latest piwik version: 2.2.2". Can you please help? Is there a direct link to the latest Piwik beta? Thanks again! |
I found this one: http://builds.piwik.org/piwik-2.2.3-b4.zip, which seems to be the latest. I will try it and come back to you. |
OK, it works fine with 2.2.3-b4! Thanks for the help Matt! When can we expect the stable 2.2.3? Do you think I could go live with 2.2.3-b4? I think the best solution is a patch, as the bug is quite serious. Is there a patch I could apply to 2.2.2? Regards, |
Attachment: two_lines_import |
Hi again, sorry for the wrong feedback. It still does not work with 2.2.3-b4. After a fresh install I imported the two lines and only the 303 is shown. Is the latest beta 2.2.3-b4? Thanks, |
Hi @vspiliop, do you still experience the issue with 2.9.1? Please let us know, thanks |
Issue was moved to the new repository for Piwik Log Analytics: https://github.com/piwik/piwik-log-analytics/issues refs #7163 |
Hello to all!
I am using piwik for a customer and just found out the following very serious issue.
I am using the latest piwik (2.2.2), php 5.4.26 and Python 2.7 (r27:82525, Jul 4 2010, 09:01:59) v.1500 32 bit (Intel) on win32.
PROBLEM:
All lines (in the web log) after a line with HTTP status 200 are ignored! For instance, in the following example only the first entry is included in both the Visits and the Actions. This happens both before and after I run the archiving, so archiving is irrelevant.
I just import (access.log : file with just 2 lines):
66.249.76.11 - - +0100 "GET /id/resource/013541589 HTTP/1.1" 303 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.76.11 - - +0100 "GET /doc/resource/007667232 HTTP/1.1" 200 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
via command:
analytics/ access.log --idsite=1 --recorders=2 --enable-http-errors --enable-http-redirects --enable-static --enable-bots
Result:
0 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
Parsing log access_006_bl.services.tso.co.uk.2014.05.12.log...
Purging Piwik archives for dates: 2014-05-11
To re-process these reports with your new update data, execute the following command:
Reference: http://piwik.org/docs/setup-auto-archiving/
Logs import summary
Website import summary
Performance summary
Kind Regards,
Vassilis