The goal of this ticket is to create an easy-to-use tool that lets anyone import their Google Analytics data into Piwik.
Google Analytics Importer
This issue requires #6094 (Create an API to let users import historical report data in Piwik)
Will importing history from GA cause duplication in Piwik? e.g. For a site which has been using GA for a year, but only 3 months in Piwik, we might want to import the previous 9 months to get 1 year of history.
But not the overlapping recent 3 months.
Do you plan on deduping or providing a data range filter?
Thanks
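One way to picture the date range filter asked about above: import from GA only up to the day before Piwik tracking started. This is a minimal sketch, not an existing Piwik feature; the function name `ga_import_range` and the example dates are hypothetical.

```python
from datetime import date, timedelta


def ga_import_range(ga_first_day, piwik_first_day):
    """Return the inclusive (start, end) date range to import from GA.

    The range stops the day before Piwik started tracking, so the
    overlapping period is never imported twice. Returns None when
    Piwik already covers the whole GA history.
    """
    end = piwik_first_day - timedelta(days=1)
    if end < ga_first_day:
        return None
    return ga_first_day, end


# Hypothetical site: GA since January, Piwik since October of the same year.
print(ga_import_range(date(2013, 1, 1), date(2013, 10, 1)))
```

With these dates the import would cover 1 January to 30 September, leaving the last three months to the data Piwik tracked itself.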
What's the overall status of this? It seems that there's no activity at all in both issues. Do you have this in your short term roadmap?
There is no update. While this is currently in the short term roadmap, it may not be done in the short term due to the complexity of the task.
We are really looking forward to this feature in our company. We cannot ask our clients to use our central Piwik installation until this has been released, since they have been relying on Google Analytics for too long and wish to keep their historical data.
What about importing raw google analytics data? I have a copy of every hit saved, can these be imported into piwik?
We won't be able to work on this, unless someone can sponsor us to work on it. Please contact us via https://piwik.org/development/ if you're able to sponsor the team!
Why don't we/you create a crowdfunding campaign? I think it would be a great success...
+10 (+10$ at least in a crowdfunding)
Possible already-available solution:
Is there a tool that allows us to export from Google Analytics to files similar to an Apache access.log?
Then it would be easy to re-import them into Piwik with import_logs.py
...
What do you think?
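To illustrate the idea, here is a rough sketch of turning one saved hit into an Apache "combined" style log line. It assumes you already hold raw hits yourself (as one commenter above does). The hit fields (`ip`, `time`, `path`, `referrer`, `user_agent`), the fixed `GET` method and the `200` status are all assumptions made for illustration, and whether import_logs.py accepts the result for your setup would need checking.

```python
from datetime import datetime, timezone


def to_combined_log_line(hit):
    """Format a hit (a dict with hypothetical keys) as a combined-format log line."""
    # Apache timestamps look like 27/Aug/2014:10:15:00 +0000
    timestamp = hit["time"].strftime("%d/%b/%Y:%H:%M:%S %z")
    return '{ip} - - [{ts}] "GET {path} HTTP/1.1" 200 - "{referrer}" "{user_agent}"'.format(
        ip=hit["ip"],
        ts=timestamp,
        path=hit["path"],
        referrer=hit["referrer"],
        user_agent=hit["user_agent"],
    )


# Made-up hit, only to show the output shape.
example_hit = {
    "ip": "203.0.113.7",
    "time": datetime(2014, 8, 27, 10, 15, 0, tzinfo=timezone.utc),
    "path": "/pricing",
    "referrer": "https://example.org/",
    "user_agent": "Mozilla/5.0",
}
print(to_combined_log_line(example_hit))
```

The sketch does not escape quotes inside the referrer or user agent, which a real converter would have to handle.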
@mattab how much money would you need to do the programming? I am very interested in getting this.
Do you have some estimate price/hours to develop this feature?
+1
I am absolutely willing to pay for such a plugin/script.
+1 I would pay. @mattab can you create a kickstarter or similar?
Same for me. I'd be happy to contribute financially to help this feature / tool occur!
I have years of Google Analytics data logged before I learnt about Piwik, and I'm sure I'm not the only one. There'd be great value in implementing a tool like this if you guys can pull it off.
I would like to offer a development of such functionalities (that includes the API and the plugin to import GA data). A monthly financial contribution is crucial. Do you want me to set up a Patreon (or similar)?
In case of questions, please ask on sd@szymondukla.co.uk
We'd also like to work on this, but so far the problem is how would it technically work? @SzymonDukla do you have some thoughts on this? we haven't found a GA API that we can use to import all the data safely and accurately yet...
I have a project behind me now that involves gathering Analytics data through the GA API. It's nearly raw data (as close as we can really get). I have a nice and flexible API wrapper written in PHP that I'm going to use.
I've set up a Patreon: https://www.patreon.com/ga_for_piwik
I'll be answering all the questions on there.
Hi @SzymonDukla what's your idea for getting as close as you can to raw data? Would you call the GA API with many dimensions to get raw data or so?
+1 I'd pay too. Please, please, help us opt out of GA. I guess it would be a great functionality for Piwik!
I can see chats are going on on this subject, nice.
regards.
f.
Sorry for the delay, last two weeks have been crazy...
Getting back to this project: that's correct @mattab. I have a side project that involves getting and storing all the possible data from GA accounts and then generating monthly/weekly/yearly SEO reports based on that. So I think I will take the same approach: get as many metrics in as many configurations (as some metrics are now working with other ones), export them into a JSON or similar file, and then have a Piwik plugin import, process and merge that file into Piwik itself.
@fabrice-regnier, thank you very much for your interest! I do that because I have the same problem - years worth of data on Google I have no way to export.
As mentioned, I've set up a Patreon (https://www.patreon.com/ga_for_piwik), let's move all further discussion on this project there for those interested :)
> will have the same approach which is getting as many metrics in as many configurations (as some metrics are now working with other ones) exported into a JSON or similar file that then will be imported via Piwik plugin, processed and merged into Piwik itself.
Did you mean 'metrics' or rather 'dimensions'? Our previous version of the GA importer (since deprecated) used this trick. The problem was that back in the day there were not enough precise dimensions, so we couldn't get useful "Per visitor" information and import all visitors.
Could you show us an example of how you'll export the GA data to then import it in Piwik?
> As mentioned, I've set up a Patreon (https://www.patreon.com/ga_for_piwik), let's move all further discussion on this project there for those interested :)
I was really interested in this and went to back your patreon and then I noticed that you are asking for over $2000/month to create the plugin, and then expect most of us to purchase it on top of that. I have a great appreciation for the amount of work involved and needing to be able to divert your attention from other projects, but I really prefer a game to have either free to play, pay to upgrade or pay to play, free to upgrade models of pricing, and REALLY hate when they stick it to me on both ends, which is exactly what this feels like.
It looks like @SzymonDukla is no longer supporting this issue and the patreon is not running. Can someone update the issue, outline who is responsible for creating or outsourcing this piece of work?
Hey @mattab, any updates on this?
No update @suzigrishpul
But so far we are not sure of the technical feasibility of a GA->Matomo migration. If anyone is an expert in the GA API and has some idea how we could migrate data over to Matomo, we'd love to hear it. We are quite keen to implement it in the next few months if it's technically possible.
(We will likely spend a few days investigating the feasibility later this year, if nobody comments here with pointers.)
@mattab cool, thanks! I wish I could help, will just watch for an update for now.
@mattab I can look at getting some of my guys involved in this later in February, no problem. We are in the process of developing an additional implementation with our CMS engine and Matomo. We did this previously with the Google API. So when we are finished we could dig much deeper into this.
that's really exciting news @alexgogan !
@alexgogan be good to keep us up to date and let us know if we can be of any help. We're keen on investigating this as well.
Also in general: it's obviously not the same, but using Log Analytics to import old data may sometimes be better than nothing, if the logs still exist.
I've done some research into the feasibility of importing log data from Google Analytics into Matomo. I've come up with two possible solutions, one which probably isn't workable, and another whose feasibility won't become apparent until a PoC is built. Both solutions allow the visitor log to function w/ the imported data. If we just import reports, this won't be the case.
First, here are the limitations with Google Analytics API:
Here is the solution that probably isn't workable, but would be easier to understand:
`userActivity.search` API method (https://developers.google.com/analytics/devguides/reporting/core/v4/rest/v4/userActivity/search?authuser=2) to query actions for all users in a site.
Pros
Cons
The other solution uses the reporting API to make multiple requests that are combined into a result we can use:
Explanation for combining reports to get a report for every dimension:
Ideally, we want to get the number of hits for every dimension value (eg, browser name, action time, referrer name, etc.). Since there's a limit of 7 dimensions, we can't do this. Instead we have to find some way to combine queries for fewer dimensions. To illustrate how we do this, let's limit the number of dimensions and dimension values. We'll say there are 3 dimensions total (DA, DB, DC), 1 dimension value for each (A, B, C), and GA only allows selecting 2 dimensions.
So w/ these limitations, we have to find out what the report for dimensions = [DA, DB, DC] is when we can only query two at a time. We'll do this by first querying every possible combination of two dimensions:
- | DA | DB | DC |
---|---|---|---|
DA | - | N(A, B) | N(A, C) |
DB | N(A, B) | - | N(B, C) |
DC | N(A, C) | N(B, C) | - |
(where `N(...)` is just a placeholder for the hits)
The thing to understand here is that the hits for dimensions = [DA, DB] is the cardinality of the subset of all actions where DA = A & DB = B. We want to find the hits for dimensions = [DA, DB, DC], which is the cardinality of the subset of all actions where DA = A, DB = B & DC = C.
This is the intersection of sets (DA + DB), (DA + DC), (DB + DC), which means we can calculate N(A, B, C) by getting the minimum value of the cardinality of all these sets, ie: `|DA + DB + DC| = min(|DA + DB|, |DA + DC|, |DB + DC|)`.
The idea is to use this property and apply to all the dimensions we're interested in and all their queried values. This will give us essentially the properties for every action.
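A small sketch of the combination step described above, under stated assumptions: hits are modelled as plain dicts of dimension to value, the pairwise counts stand in for what two-dimension GA queries would return, and the helper names `pairwise_counts` and `estimate_joint_hits` are hypothetical. One caveat worth keeping in mind: the minimum of the pairwise counts is in general only an upper bound on the size of the intersection, so the result is an estimate, as the second example shows.

```python
from collections import Counter
from itertools import combinations


def pairwise_counts(hits):
    """Count hits for every pair of (dimension, value), as 2-dimension queries would."""
    counts = Counter()
    for hit in hits:
        for pair in combinations(sorted(hit.items()), 2):
            counts[frozenset(pair)] += 1
    return counts


def estimate_joint_hits(pair_counts, combo):
    """Estimate hits matching all values in `combo` as the minimum pairwise count."""
    pairs = combinations(sorted(combo.items()), 2)
    return min(pair_counts[frozenset(pair)] for pair in pairs)


target = {"DA": "A", "DB": "B", "DC": "C"}

# Case 1: every hit has the same values, so the estimate is exact.
uniform = [dict(target) for _ in range(3)]
print(estimate_joint_hits(pairwise_counts(uniform), target))

# Case 2: each pair of values occurs once, but never all three together.
skewed = [
    {"DA": "A", "DB": "B", "DC": "x"},
    {"DA": "A", "DB": "y", "DC": "C"},
    {"DA": "z", "DB": "B", "DC": "C"},
]
estimate = estimate_joint_hits(pairwise_counts(skewed), target)
actual = sum(1 for hit in skewed if hit == target)
print(estimate, actual)
```

In the second case the estimate is 1 while no hit actually matches all three values, which is the kind of inaccuracy a PoC would need to measure on real data.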
Pros
`N * (N - 1) * D` where N is the number of dimensions we need & D is the number of days. It will take some time to hit the query limit, no matter what the scale of the site is.
Cons
EDIT: One way to get user IDs/client IDs would be to issue requests like those that are made in the UI. I'm not sure how well that would work, and I'm guessing the page size will be smaller so it would take more requests. Also I am not sure if there are any limits to doing that, since that's outside of the API.
I would say raw data may not be as important? Of course be great to have it for segmentation, but maybe not needed eg if there's a problem with the 50K req/day limit?
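For a rough sense of scale, the `N * (N - 1) * D` expression from the report can be compared against the 50K req/day limit mentioned here. This sketch assumes the expression counts API requests; the values N = 20 and D = 365 are made up for illustration.

```python
import math


def total_requests(num_dimensions, num_days):
    """N * (N - 1) * D, assumed here to be the number of API requests."""
    return num_dimensions * (num_dimensions - 1) * num_days


def days_to_finish(num_dimensions, num_days, daily_limit=50_000):
    """Days of importing needed if the daily request limit is used in full."""
    return math.ceil(total_requests(num_dimensions, num_days) / daily_limit)


# Hypothetical import: 20 dimensions over one year of data.
print(total_requests(20, 365))
print(days_to_finish(20, 365))
```

For these made-up numbers that is 138,700 requests, or about three days of importing at 50K requests per day.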
@diosmosis thanks for the report. Sounds like importing RAW data is tricky and not easily doable at this time.
So we probably want to import aggregated reports only, and forget about RAW data.
-> As a next step, it would be interesting to understand what it would take to build a Google Analytics report data importer.
@mattab We would map Matomo dimensions to GA dimensions, query for report data, then map GA values to Matomo values (ie, dimensions). It's pretty straightforward (though value mapping might be difficult). We could also allow users to supply segments to make available (we can add segments to the report fetching API method).
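A minimal sketch of the mapping idea, not the plugin's actual code: the two lookup tables, the function name `to_matomo_rows` and the sample rows are illustrative assumptions, and a real importer would fetch the rows from the GA API rather than take them as a list.

```python
# Hypothetical mapping from a Matomo report to the GA dimension that feeds it.
DIMENSION_MAP = {
    "DevicesDetection.getBrowsers": "ga:browser",
    "UserCountry.getCountry": "ga:country",
}

# Hypothetical value mapping: GA labels to the labels Matomo would store.
VALUE_MAP = {
    "ga:browser": {"Chrome": "CH", "Firefox": "FF"},
    "ga:country": {"France": "fr", "Germany": "de"},
}


def to_matomo_rows(matomo_report, ga_rows):
    """Translate (ga_value, visits) rows into {matomo_label: visits}."""
    ga_dimension = DIMENSION_MAP[matomo_report]
    value_map = VALUE_MAP.get(ga_dimension, {})
    rows = {}
    for ga_value, visits in ga_rows:
        # Unmapped values are kept as-is; this is where value mapping gets hard.
        label = value_map.get(ga_value, ga_value)
        rows[label] = rows.get(label, 0) + visits
    return rows


# Made-up GA result for the browser dimension.
sample = [("Chrome", 120), ("Firefox", 30), ("Lynx", 1)]
print(to_matomo_rows("DevicesDetection.getBrowsers", sample))
```

Values without a known mapping (like "Lynx" here) pass through unchanged, which shows why the value mapping is the harder part of the work.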
In V1 we would just want to keep it simple and import "All visits" segment (ie. not import any segment and segment data from GA)
Status update:
Next things to do:
Hi Everyone! good news, we are almost done with the Google Analytics to Matomo Importer implementation. We'll start the beta release in a few days. Get ready for testing to import your Google Analytics data in Matomo :sunglasses:
OMG THIS IS THE BEST NEWS!!!! thank you so much!
On Sun, Aug 11, 2019, 4:59 PM Matthieu Aubry wrote:
> Hi Everyone! good news, we are almost done with the Google Analytics to Matomo Importer implementation. We'll start the beta release in a few days. Get ready for testing to import your Google Analytics data in Matomo 😎
@mattab Thanks for this great work. I have upgraded from 3.11 to 3.12 but I don't see the Google Analytics import section.
Thank you,
Hi @kimoudev the plugin hasn't been released yet, there are still a couple import accuracy bugs to work through before it's ready for a public beta. Should be soon though.
Hi @diosmosis . Ok, Thank you so much.
Hi Everyone,
We're looking forward to hearing how it goes for you to import your GA data into Matomo.
Kudos @diosmosis for building this tool :muscle:
We're happy to help you migrate your GA data into Matomo and stay in control! :rocket:
Thanks to everyone who will help test this tool and report their feedback.
Hello
Thanks, thanks a lot. You have made a big contribution that makes the internet more free and less controlled by Google, the Big Brother nowadays.
Thanks again
On 22/8/19 12:00, Matthieu Aubry wrote:
> Hi Everyone,
>
> The Google Analytics Importer for Matomo is now released and available to all! 🎉
>
> - Get the plugin from: https://plugins.matomo.org/GoogleAnalyticsImporter (see also: How do I install a plugin in Matomo? https://matomo.org/faq/plugins/faq_21/)
> - Read the user guide at: https://matomo.org/docs/google-analytics-importer/
> - Report your feedback here in this issue, or if you find a bug please create a new issue in the plugin's issue tracker at: https://github.com/matomo-org/plugin-GoogleAnalyticsImporter/issues
>
> We're looking forward to hearing how it goes for you to import your GA data into Matomo.
>
> Kudos @diosmosis https://github.com/diosmosis for building this tool 💪
>
> We're happy to help you migrate your GA data into Matomo and stay in control! 🚀
>
> Thanks to everyone who will help test this tool and report their feedback.
Hope you enjoy the Google Analytics importer tool. You can post your feedback in the issue tracker: https://github.com/matomo-org/plugin-GoogleAnalyticsImporter/issues
Will close it now as we consider it done. Exciting! :tada: