import_logs.py fail to populate actions/page tables #4946

Closed
anonymous-matomo-user opened this issue Apr 3, 2014 · 10 comments
Labels
Bug For errors / faults / flaws / inconsistencies etc. Major Indicates the severity or impact or benefit of an issue is much higher than normal but not critical.

@anonymous-matomo-user

In Piwik 2.1, after importing NCSA extended logs (Apache), I am not able to view page actions; only errors, static resources and redirects show up.

Version: 2.1.1b10

python /var/www/html/logimport/misc/log-analytics/import_logs.py --recorders=8 --url=http://localhost/logimport/ /tmp/test30032014.log --login=admin --password=pass --token-auth=xxxxxxx --idsite=1

0 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
Parsing log /tmp/test30032014.log...
Purging Piwik archives for dates: 2014-03-30
To re-process these reports with your new update data, execute the piwik/misc/cron/archive.php script, or see: [piwik.org] for more info.

Logs import summary

102 requests imported successfully
107 requests were downloads
398 requests ignored:
0 invalid log lines
11 requests done by bots, search engines, ...
19 HTTP errors
76 HTTP redirects
292 requests to static resources (css, js, ...)
0 requests did not match any known site
0 requests did not match any requested hostname

Website import summary

102 requests imported to 1 sites
1 sites already existed
0 sites were created:

0 distinct hostnames did not match any existing site:

Performance summary

Total time: 0 seconds
Requests imported per second: 148.91 requests per second
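
As a sanity check on the summary above, the "ignored" breakdown can be tallied against its total. This is only an illustrative sketch: `leading_counts` is a hypothetical helper, not part of import_logs.py, and the text is copied from the output pasted above.

```python
import re

# "Requests ignored" section, copied from the import summary above.
SUMMARY = """\
398 requests ignored:
0 invalid log lines
11 requests done by bots, search engines, ...
19 HTTP errors
76 HTTP redirects
292 requests to static resources (css, js, ...)
0 requests did not match any known site
0 requests did not match any requested hostname
"""


def leading_counts(text):
    """Return the integer that starts each line (hypothetical helper)."""
    return [int(m.group(1)) for m in re.finditer(r"^(\d+)\s", text, re.MULTILINE)]


counts = leading_counts(SUMMARY)
total_ignored, breakdown = counts[0], counts[1:]

# True when every ignored request is accounted for by one of the categories.
breakdown_matches = sum(breakdown) == total_ignored
print(total_ignored, sum(breakdown), breakdown_matches)
```

If the categories add up, the ignored requests are fully explained by the listed reasons, and the 102 requests reported as imported are the ones to look for in the reports.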

/usr/bin/php /var/www/html/logimport/console core:archive --url=http://localhost/logimport/ -v

INFO CoreConsole07:32:39 ---------------------------
INFO CoreConsole07:32:39 INIT
INFO CoreConsole07:32:39 Piwik is installed at: http://localhost/logimport/index.php
INFO CoreConsole07:32:39 Running Piwik 2.1.1-b10 as Super User: piwikadmin
INFO CoreConsole07:32:40 ---------------------------
INFO CoreConsole07:32:40 NOTES
INFO CoreConsole07:32:40 - If you execute this script at least once per hour (or more often) in a crontab, you may disable 'Browser trigger archiving' in Piwik UI > Settings > General Settings.
INFO CoreConsole07:32:40 See the doc at: [piwik.org]
INFO CoreConsole07:32:40 - Reports for today will be processed at most every 3600 seconds. You can change this value in Piwik UI > Settings > General Settings.
INFO CoreConsole07:32:40 - Reports for the current week/month/year will be refreshed at most every 3600 seconds.
INFO CoreConsole07:32:40 - Archiving was last executed without error 5 min 1s ago
INFO CoreConsole07:32:40 - Will process 0 websites with new visits since 5 min 0s
INFO CoreConsole07:32:40 - Will process 1 other websites because some old data reports have been invalidated (eg. using the Log Import script) , IDs: 1
INFO CoreConsole06:32:40 ---------------------------
INFO CoreConsole06:32:40 START
INFO CoreConsole06:32:40 Starting Piwik reports archiving...
INFO CoreConsole06:32:41 Archived website id = 1, period = day, Time elapsed: 1.113s
INFO CoreConsole06:32:42 Archived website id = 1, period = week, 29 visits, Time elapsed: 0.943s
INFO CoreConsole06:32:53 Archived website id = 1, period = month, 0 visits, Time elapsed: 10.732s
INFO CoreConsole06:33:00 Archived website id = 1, period = year, 47440 visits, Time elapsed: 7.205s
INFO CoreConsole06:33:00 Archived website id = 1, today = 0 visits, 4 API requests, Time elapsed: 20.004s done
INFO CoreConsole06:33:00 Done archiving!
INFO CoreConsole06:33:00 ---------------------------
INFO CoreConsole06:33:00 SUMMARY
INFO CoreConsole06:33:00 Total daily visits archived: 0
INFO CoreConsole06:33:00 Archived today's reports for 1 websites
INFO CoreConsole06:33:00 Archived week/month/year for 1 websites
INFO CoreConsole06:33:00 Skipped 0 websites: no new visit since the last script execution
INFO CoreConsole06:33:00 Skipped 0 websites day archiving: existing daily reports are less than 3600 seconds old
INFO CoreConsole06:33:00 Skipped 0 websites week/month/year archiving: existing periods reports are less than 3600 seconds old
INFO CoreConsole06:33:00 Total API requests: 4
INFO CoreConsole06:33:00 done: 1/1 100%, 0 v, 1 wtoday, 1 wperiods, 4 req, 20073 ms, no error
INFO CoreConsole06:33:00 Time elapsed: 20.074s
INFO CoreConsole06:33:00 ---------------------------
INFO CoreConsole06:33:00 SCHEDULED TASKS
INFO CoreConsole06:33:00 Starting Scheduled tasks...
INFO CoreConsole06:33:00 No task to run
INFO CoreConsole06:33:00 done
INFO CoreConsole06:33:00 ---------------------------
Keywords: import_logs.py

@anonymous-matomo-user
Author

Attachment:
test.log.zip

@anonymous-matomo-user
Author

Attachment: Overview with no pageviews recorded
Screenshot from 2014-04-01 10:06:02.png

@anonymous-matomo-user
Author

Attachment:
Screenshot from 2014-04-01 07:32:41.png

@mattab
Member

mattab commented May 1, 2014

This seems to be a frequent problem, also reported by several users in the forums. Check out the forum post, which includes a patch that may solve the issue.

Increasing priority because many users reported it, and it results in data loss when importing logs.

@tsteur
Member

tsteur commented May 5, 2014

In fe7b59c: refs #4946 committing the patch from the forum post, which makes sure there is always an action_name and therefore always a title. From what I can see so far, this does not fix the actual issue, as I am still not able to import data from yesterday.

@tsteur
Member

tsteur commented May 5, 2014

Tried again a few times and it might work now. Would you mind testing?

@mattab
Member

mattab commented May 5, 2014

In 5396cc9: Fixes #4946

  • Set idaction_name to 0 instead of NULL. This should fix the error of requests not recorded.
  • Only set a custom variable if it is not already set. Tests show that overwriting it breaks log replay of ecommerce logs (which use custom variable slots).
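
The two points above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from commit 5396cc9: `apply_defaults`, the `hit_args` dict and its `cvar` / `idaction_name` keys are hypothetical names chosen for the example.

```python
def apply_defaults(hit_args, slot, name, value):
    """Hypothetical helper illustrating the two changes; not Piwik's actual code."""
    cvars = hit_args.setdefault("cvar", {})

    # Only fill the custom variable slot when nothing occupies it yet, so that
    # replayed ecommerce hits keep the custom variables they already carry.
    if slot not in cvars:
        cvars[slot] = [name, value]

    # Fall back to 0 rather than None (SQL NULL) when no action name id is known.
    if hit_args.get("idaction_name") is None:
        hit_args["idaction_name"] = 0

    return hit_args


# A slot that is already in use is left untouched.
replayed = apply_defaults({"cvar": {1: ["ecommerce", "x"]}}, 1, "HTTP-code", "404")

# An empty request gets both defaults.
fresh = apply_defaults({}, 1, "HTTP-code", "404")
```

The check-before-write keeps the default from clobbering data supplied by the log line itself, which is the failure mode the ecommerce tests exposed.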

@mattab
Member

mattab commented May 5, 2014

In 7f50bdf: Refs #4946 Fix typo + tests

@mattab
Member

mattab commented May 5, 2014

In 1bcac2d: Fix tests refs #4946

@mattab
Member

mattab commented May 14, 2014

The fix was incorrect; see follow-up ticket #5113: 'Page Name not defined' in page title reports - Outlinks being tracked as pages

anonymous-matomo-user added this to the 2.2.1 - Piwik 2.2.1 milestone Jul 8, 2014
sabl0r pushed a commit to sabl0r/piwik that referenced this issue Sep 23, 2014
… to always have an action_name and therefore always have a title. From what I can see so far this does not fix the actual issue as I am still not able to import data from yesterday
sabl0r pushed a commit to sabl0r/piwik that referenced this issue Sep 23, 2014
 * Set idaction_name to 0 instead of NULL. This should fix the error of requests not recorded.
 * Only set the custom variables if it's not already set. Tests show it breaks log replay of ecommerce logs (which use custom variable slots)
sabl0r pushed a commit to sabl0r/piwik that referenced this issue Sep 23, 2014
sabl0r pushed a commit to sabl0r/piwik that referenced this issue Sep 23, 2014
This issue was closed.