@jmumby opened this Issue on July 1st 2020

Google Sheets allows you to import data directly from HTTPS in the formula field. An example could be:

=IMPORTDATA("https://example.com/analytics/index.php?date=yesterday&filter_limit=10&flat=1&format=CSV&idCustomReport=3&idSite=5&language=en&method=CustomReports.getCustomReport&module=API&period=day&reportUniqueId=CustomReports_getCustomReport_idCustomReport--3&segment=&token_auth=XYZ&translateColumnNames=1")

You would hope the data would look like as follows:

image

However, it imports like this:

image

https://support.google.com/docs/answer/3093335?hl=en

@tsteur commented on July 1st 2020 Member

If anyone has some insights why that is the case, be great to let us know

@jmumby commented on July 1st 2020

I have investigated this further. Importing via IMPORTDATA() with the file in UTF-8 encoding does not have this issue. Importing with the file import feature (File > Import) of sheets also does not have this issue with either UTF-8 or UTF-16. It seems IMPORTDATA() does not support UTF-16 encoding.

@jmumby commented on July 2nd 2020

The changes made here: https://github.com/matomo-org/matomo/issues/8898#issuecomment-148374890 let you import directly.

@Findus23 commented on July 2nd 2020 Member

@jmumby Welcome in the hell of CSV files and Excel.
https://donatstudios.com/CSV-An-Encoding-Nightmare

For whatever insane reasons Excel cannot read a simple UTF-8 file that contains text separated by simple ";" or "," characters.

If you try to open this very simple UTF-8 file as a example.csv in Excel, it won't work any will only see corrupted characters:

product,€
test,2
something,20

(I can't test this myself as I don't have Windows, but when I tested this years ago, I already lost enough nerves)

The only way one can output a text file that is readable and writable* by Excel is to encode it as UTF-16 and add a Byte-Order-Mark (BOM) at the start of the file (which is a simple FEFF character that should not be needed by any software). And to make Matomo work with Excel, this is what Matomo does at the reporting API:

➜  ~/tmp curl "https://demo.matomo.org/index.php?date=yesterday&expanded=1&filter_limit=10&format=CSV&idSite=62&language=de&method=UserCountry.getContinent&module=API&period=day&segment=&token_auth=anonymous&translateColumnNames=1" -o test.csv
➜  ~/tmp file -kr test.csv
test.csv: CSV text
- , Little-endian UTF-16 Unicode text, with very long lines
➜  ~/tmp hexdump test.csv -n 16
0000000 feff 0042 0065 007a 0065 0069 0063 0068
0000010

Now interestingly not even less can read the file anymore as it is detected as a binary format:

➜  ~/tmp less test.csv 
"test.csv" may be a binary file.  See it anyway? 
<FF><FE>B^<a class='mention' href='https://github.com/e'>@e</a>^<a class='mention' href='https://github.com/z'>@z</a>^<a class='mention' href='https://github.com/e'>@e</a>^<a class='mention' href='https://github.com/i'>@i</a>^<a class='mention' href='https://github.com/c'>@c</a>^<a class='mention' href='https://github.com/h'>@h</a>^<a class='mention' href='https://github.com/n'>@n</a>^<a class='mention' href='https://github.com/u'>@u</a>^<a class='mention' href='https://github.com/n'>@n</a>
[...]

And this is also the reason why Google sheets doesn't work anymore. Google doesn't detect a normal text encoding at the beginning of the file, so it falls back to Windows-1252 where FF FE corresponds to ÿþ

I don't really have a solution for this apart from adding at GET parameter that makes Matomo output plain normal UTF-8 files.

* bonus fun fact: There are combinations of encodings Excel can read, but then not write again or even worse, write but then not read again.
bonus bonus fun fact: The behavior of Excel also differs between versions and whether you open the file normally or use the import CSV dialog.

@jmumby commented on July 2nd 2020

Adding an encoding parameter in the request may be a good solution.

@tsteur commented on July 29th 2020 Member

The changes made here: #8898 (comment) let you import directly.

@jmumby do you remember if both changes where needed or was maybe &convertToUnicode=0 change enough?

@jmumby commented on July 29th 2020

The changes made here: #8898 (comment) let you import directly.

@jmumby do you remember if both changes where needed or was maybe &convertToUnicode=0 change enough?

It appears to work with application/vnd.ms-excel

@tsteur commented on July 29th 2020 Member

Sweet. so maybe we don't need to change anything and only document to set the URL parameter &convertToUnicode=0 to make this work @jmumby @mattab

@sgiehl commented on November 27th 2020 Member

Can we close this issue? Guess nothing needs to be changed in the code. We could maybe only update this FAQ https://matomo.org/faq/new-to-piwik/how-to-import-matomo-data-in-to-google-data-studio/ or create a new one for importing into google spreadsheet directly...

Powered by GitHub Issue Mirror