@anonymous-matomo-user opened this Issue on July 11th 2013

The import_logs.py script seems to hang at log format detection if the first line of the log is a weird request.

I run import_logs.py from a daily crontab: I first grep the access logs for yesterday's date and then import the result with import_logs.py. Unfortunately, the first request yesterday was a bit weird. That seems to break the script's log format detection, and the script hangs indefinitely.

2013-07-11 04:11:36,728: [DEBUG] Detecting the log format
2013-07-11 04:11:36,728: [DEBUG] Format icecast2 does not match
2013-07-11 04:11:36,728: [DEBUG] Format iis does not match
2013-07-11 04:11:36,728: [DEBUG] Format s3 does not match
2013-07-11 04:11:36,728: [DEBUG] Format common_complete does not match
2013-07-11 04:11:36,728: [DEBUG] Format common does not match
2013-07-11 04:11:36,728: [DEBUG] Format common_vhost does not match
2013-07-11 04:11:36,729: [DEBUG] Format ncsa_extended does not match
0 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
0 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
(and this repeats indefinitely)

This is the log file which causes it to hang:

204.195.126.203 - - [10/Jul/2013:00:48:51 -0500] "\x80w\x01\x03\x01" 302 287 "-" "-"
204.195.126.203 - - [10/Jul/2013:00:48:51 -0500] "GET /HNAP1/ HTTP/1.1" 302 295 "http://208.82.205.193/" "Opera/9.0 (Windows NT 5.1; U; en)"
180.76.5.20 - - [10/Jul/2013:00:50:28 -0500] "GET /index.page HTTP/1.1" 302 314 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
@mattab commented on August 9th 2013 Member

Linked from #3163

@anonymous-matomo-user commented on September 25th 2013

In 1fac8e8e5c004b0c44abbd89c894ffc3b4120d0a: Fixes #4045, fail log importer if we cannot match a format to the first line of the logfile.
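
For readers who want to see the idea behind this fix, here is a minimal sketch of "fail if no format matches the first line". It is not the code from the commit: the regex is a simplified, hypothetical stand-in for an ncsa_extended pattern, and the names `FORMATS`, `UnknownLogFormat` and `detect_format` are made up for illustration. The assumption is that a format expects the request field to look like "METHOD PATH PROTOCOL", which the binary junk in the first line of the sample log does not.

```python
import re

# Hypothetical, simplified pattern for illustration only.
# This is NOT the regex that import_logs.py actually uses.
NCSA_EXTENDED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<date>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>\S+)" '
    r'(?P<status>\d{3}) (?P<length>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

# Hypothetical registry of known formats (only one here, to keep it short).
FORMATS = {"ncsa_extended": NCSA_EXTENDED}


class UnknownLogFormat(Exception):
    """Raised when the first line matches none of the known formats."""


def detect_format(first_line):
    """Return the name of the first format matching first_line.

    Raises UnknownLogFormat instead of returning None, so the caller
    stops right away rather than waiting on a parser that never starts.
    """
    for name, pattern in FORMATS.items():
        if pattern.match(first_line):
            return name
    raise UnknownLogFormat(
        "Cannot detect the log format from the first line: %r" % first_line
    )
```

With this sketch, the second line of the sample log (the `GET /HNAP1/` request) is detected as ncsa_extended, while the first line (the `"\x80w\x01\x03\x01"` request) raises `UnknownLogFormat`, so a cron job would exit with an error instead of printing the "0 lines parsed" status forever.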

This Issue was closed on September 25th 2013