@vsoch opened this Issue on May 25th 2021

Hi Matomo! I am wondering if there is a best practice for ingesting S3 logs directly, ideally reading from S3 without needing to sync them to the Matomo server first and then running the script? Thank you!

@sgiehl commented on May 26th 2021 Member

Hi @vsoch. Thanks for creating the issue. I assume you are talking about our log-importer?
Actually it should be possible to run the script on any server that supports Python 3. You can send the extracted tracking requests to any Matomo server that is reachable, using the --url option. It might also be necessary to set the --token-auth parameter manually in this case, as the log importer might not be able to determine the token_auth automatically when running on a different server.
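
For illustration, a remote invocation might look like the following minimal sketch (the Matomo URL, token, and log path are placeholders, not values from this thread; it assumes `import_logs.py` from matomo-log-analytics is available locally):

```python
import subprocess

# Minimal sketch: run the log importer on one server, sending the
# extracted tracking requests to a remote Matomo instance.
# URL, token, and path below are placeholders.
subprocess.run(
    [
        "python3", "import_logs.py",
        "--url=https://matomo.example.com",  # any reachable Matomo server
        "--token-auth=YOUR_TOKEN_AUTH",      # set manually when running remotely
        "/path/to/access.log",
    ],
    check=True,
)
```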

@vsoch commented on May 26th 2021

Thank you for the speedy response! So what would be the best practice for consistently uploading new logs from S3 - running a server, or something like a Lambda alongside a Kubernetes deployment to run the log importer? Something else?

@diosmosis commented on May 26th 2021 Member

Hi @vsoch, we don't have an established best practice for this specific use case. A Lambda probably wouldn't work since there's a hard 15-minute run time limit (if I recall correctly). Can you launch a Kubernetes job for this? E.g., when a log file is uploaded to S3 (if that is how you are using S3), launch a Kubernetes job to download and import it. There are a lot of ways to accomplish this; it really depends on what you want and how your architecture is set up.
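
A minimal sketch of what such a job's entrypoint could run, assuming boto3 for the S3 download; the bucket, key, Matomo URL, and token are all hypothetical placeholders:

```python
import subprocess

import boto3

# Hypothetical Kubernetes job entrypoint: fetch one newly uploaded log
# file from S3, then hand it to the Matomo log importer.
def import_s3_log(bucket: str, key: str) -> None:
    local_path = "/tmp/" + key.split("/")[-1]
    # Assumes the job has S3 read credentials configured (e.g. via IAM).
    boto3.client("s3").download_file(bucket, key, local_path)
    subprocess.run(
        [
            "python3", "import_logs.py",
            "--url=https://matomo.example.com",  # placeholder Matomo URL
            "--token-auth=YOUR_TOKEN_AUTH",      # placeholder token
            local_path,
        ],
        check=True,
    )

if __name__ == "__main__":
    # Placeholder bucket and key; in practice these would come from the
    # S3 event that triggered the job.
    import_s3_log("my-log-bucket", "logs/access.log")
```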

@vsoch commented on May 26th 2021

Ah, gotcha! Thank you for this discussion - we also had a Kubernetes job in mind, and wanted to check whether there was a suggested best practice first. I can come back here and comment after we get it working. But it's safe to close the issue - thanks again for your help!

This Issue was closed on May 26th 2021