Problem importing w3c logs #11824

Closed
magnus-84 opened this issue Jun 28, 2017 · 1 comment
Labels
answered For when a question was asked and we referred to forum or answered it.

Comments

magnus-84 commented Jun 28, 2017

Hello

I am having problems importing W3C logs from Incapsula services into Piwik. Below is the command I use to import the log file. IP and domain information has been changed for privacy.

/usr/bin/python /var/www/html/piwik/misc/log-analytics/import_logs.py --url=http://10.1.2.3 --idsite=8 --recorders=4 --enable-http-errors --enable-http-redirects --enable-static --enable-bots --log-format-name=w3c_extended --w3c-fields='#Fields: date time cs-vid cs-clapp cs-browsertype cs-js-support cs-co-support c-ip s-caip cs-clappsig s-capsupport s-suid cs(User-Agent) cs-sessionid s-siteid cs-countrycode s-tag cs-cicode s-computername cs-lat cs-long s-accountname cs-uri cs-postbody cs-version sc-action s-externalid cs(Referrer) s-ip s-port cs-method cs-uri-query sc-status s-xff cs-bytes cs-start cs-rule cs-severity cs-attacktype cs-attackid s-ruleName' /root/web.log --debug --debug

Debug output below

2017-06-28 11:21:17,172: [DEBUG] Accepted hostnames: all
2017-06-28 11:21:17,172: [DEBUG] Piwik Tracker API URL is: http://10.1.2.3
2017-06-28 11:21:17,172: [DEBUG] Piwik Analytics API URL is: http://10.1.2.3
2017-06-28 11:21:17,172: [DEBUG] No token-auth specified
2017-06-28 11:21:17,172: [DEBUG] No credentials specified, reading them from "/var/www/html/piwik/config/config.ini.php"
2017-06-28 11:21:17,240: [DEBUG] Authentication token token_auth is: 90871c8584ddf2265f54553a305b6ae1
2017-06-28 11:21:17,240: [DEBUG] Resolver: static
0 lines parsed, 0 lines recorded, 0 records/sec (avg), 0 records/sec (current)
2017-06-28 11:21:17,343: [DEBUG] Launched recorder
2017-06-28 11:21:17,343: [DEBUG] Launched recorder
2017-06-28 11:21:17,344: [DEBUG] Launched recorder
2017-06-28 11:21:17,344: [DEBUG] Launched recorder
Parsing log /root/web.log...
2017-06-28 11:21:17,345: [DEBUG] Based on 'Fields:' line, computed regex to be (?P<date>\d+[-\d+]+\s+[\d+:]+)[.\d]*?\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+"?(?P<ip>[\w*.:-]*)"?\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?P<user_agent>".*?"|\S*)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?P<query_string>\S*)\s+(?P<status>\d+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)\s+(?:".*?"|\S+)
2017-06-28 11:21:17,350: [DEBUG] Invalid line detected (line did not match): #Software: Incapsula LOGS API

2017-06-28 11:21:17,350: [DEBUG] Invalid line detected (line did not match): #Version: 1.1

2017-06-28 11:21:17,350: [DEBUG] Invalid line detected (line did not match): #Date: 28/Jun/2017 07:28:59

2017-06-28 11:21:17,350: [DEBUG] Invalid line detected (line did not match): #Fields: date time cs-vid cs-clapp cs-browsertype cs-js-support cs-co-support c-ip s-caip cs-clappsig s-capsupport s-suid cs(User-Agent) cs-sessionid s-siteid cs-countrycode s-tag cs-cicode s-computername cs-lat cs-long s-accountname cs-uri cs-postbody cs-version sc-action s-externalid cs(Referrer) s-ip s-port cs-method cs-uri-query sc-status s-xff cs-bytes cs-start cs-rule cs-severity cs-attacktype cs-attackid s-ruleName

2017-06-28 11:21:17,351: [DEBUG] Invalid line detected (line did not match): "2017-06-28" "07:26:35" "a1f36498-c34a-45b9-b3a5-ee0bd00f91b6" "Chrome" "Browser" "false" "true" "123.123.123.123" "" "62a660e57ba257275cf7ccf699919eae18e07e84cb11c1075e99b1be98456059d3064ec14d3932ba6e89f5393a158b8b8c2572ad7ad7dadb0fe02a34ae4c3d504c035017bf9a6a7802bb898226378938" "NA" "774502" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" "452000660051880893" "44850949" "SE" "LS" "Stockholm" "www.example.com" "32.0000" "32.0000" "Customer" "www.example.com/artiklar/x/y/z/" "" "HTTP" "REQ_PASSED" "118866685985031205" "" "124.124.124.124" "80" "GET" "" "200" "123.123.123.123" "10117" "1498634795555" "" "" "" "" ""

Logs import summary

0 requests imported successfully
0 requests were downloads
5 requests ignored:
    0 HTTP errors
    0 HTTP redirects
    5 invalid log lines
    0 requests did not match any known site
    0 requests did not match any --hostname
    0 requests done by bots, search engines...
    0 requests to static resources (css, js, images, ico, ttf...)
    0 requests to file downloads did not match any --download-extensions

Website import summary

0 requests imported to 1 sites
    1 sites already existed
    0 sites were created:

0 distinct hostnames did not match any existing site:

Performance summary

Total time: 0 seconds
Requests imported per second: 0.0 requests per second

Original logfile example below.

#Software: Incapsula LOGS API
#Version: 1.1
#Date: 28/Jun/2017 07:28:59
#Fields: date time cs-vid cs-clapp cs-browsertype cs-js-support cs-co-support c-ip s-caip cs-clappsig s-capsupport s-suid cs(User-Agent) cs-sessionid s-siteid cs-countrycode s-tag cs-cicode s-computername cs-lat cs-long s-accountname cs-uri cs-postbody cs-version sc-action s-externalid cs(Referrer) s-ip s-port cs-method cs-uri-query sc-status s-xff cs-bytes cs-start cs-rule cs-severity cs-attacktype cs-attackid s-ruleName
"2017-06-28" "07:26:35" "a1f36498-c34a-45b9-b3a5-ee0bd00f91b6" "Chrome" "Browser" "false" "true" "123.123.123.123" "" "62a660e57ba257275cf7ccf699919eae18e07e84cb11c1075e99b1be98456059d3064ec14d3932ba6e89f5393a158b8b8c2572ad7ad7dadb0fe02a34ae4c3d504c035017bf9a6a7802bb898226378938" "NA" "774502" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" "452000660051880893" "44850949" "SE" "LS" "Stockholm" "www.example.com" "32.0000" "32.0000" "Customer" "www.example.com/artiklar/x/y/z/" "" "HTTP" "REQ_PASSED" "118866685985031205" "" "124.124.124.124" "80" "GET" "" "200" "123.123.123.123" "10117" "1498634795555" "" "" "" "" ""

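To see how the values line up with the `#Fields:` header, the sample row above can be tokenized with Python's `shlex.split`, which honours the double quotes. This is only an illustrative sketch: the header and row below are shortened to the first eight fields for brevity, and the helper name `pair_fields` is hypothetical, not part of import_logs.py.

```python
import shlex

FIELDS_PREFIX = "#Fields: "


def pair_fields(fields_line, row):
    """Map each W3C field name to the value at the same position in the row."""
    names = fields_line[len(FIELDS_PREFIX):].split()
    # shlex.split keeps double-quoted values together and drops the quotes
    values = shlex.split(row)
    if len(names) != len(values):
        raise ValueError("expected %d values, got %d" % (len(names), len(values)))
    return dict(zip(names, values))


# Shortened excerpt of the header and row above (first eight fields only)
fields_line = ("#Fields: date time cs-vid cs-clapp cs-browsertype "
               "cs-js-support cs-co-support c-ip")
row = ('"2017-06-28" "07:26:35" "a1f36498-c34a-45b9-b3a5-ee0bd00f91b6" '
       '"Chrome" "Browser" "false" "true" "123.123.123.123"')

record = pair_fields(fields_line, row)
print(record["date"], record["time"], record["c-ip"])
```

Note that every value in the Incapsula export is wrapped in double quotes, including the date, the time and the status code.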
I guess the problem is something in the regex? Any help would be appreciated. I have no knowledge of regex myself.

Regards
Magnus
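One plausible explanation, offered as a sketch and not a confirmed diagnosis: the computed regex expects the date and time as one unquoted token pair (`2017-06-28 07:26:35`) and the status as bare digits, while the Incapsula log wraps every value in double quotes. The patterns below are a reading of the date/time and status portions of the computed regex from the debug output; treat the exact group names and quantifiers as assumptions. The helper `leading_date` is hypothetical.

```python
import re

# Assumed leading portion of the computed regex (date and time fields)
DATE_TIME = re.compile(r'(?P<date>\d+[-\d+]+\s+[\d+:]+)[.\d]*?\s+')
# Assumed portion of the computed regex for the sc-status field
STATUS = re.compile(r'(?P<status>\d+)')

# Start of the log row as exported, and the same row with date/time unquoted
quoted = '"2017-06-28" "07:26:35" "a1f36498-c34a-45b9-b3a5-ee0bd00f91b6"'
unquoted = '2017-06-28 07:26:35 "a1f36498-c34a-45b9-b3a5-ee0bd00f91b6"'


def leading_date(line):
    """Return the date/time captured at the start of the line, or None."""
    match = DATE_TIME.match(line)
    return match.group("date") if match else None


print(leading_date(quoted))            # None: the line starts with a quote
print(leading_date(unquoted))          # 2017-06-28 07:26:35
print(STATUS.fullmatch('"200"'))       # None: the quotes are not digits
print(STATUS.fullmatch('200') is not None)  # True
```

If this reading is right, the row fails at its very first character, which would be consistent with "line did not match" being reported for the data row. Whether removing the quotes is enough for a full import is not something this sketch establishes.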

Member

sgiehl commented Jun 30, 2017

@magnus-84: I've recreated the issue in the log importer repo: matomo-org/matomo-log-analytics#179

sgiehl closed this as completed Jun 30, 2017
sgiehl added the answered label Jun 30, 2017