Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

As log_action.idaction is just an INT, Matomo stops recording actions if idaction > 2^31 #19524

Open
stefan-work opened this issue Jul 13, 2022 · 6 comments
Labels
Bug For errors / faults / flaws / inconsistencies etc.
Milestone

Comments

@stefan-work
Copy link

stefan-work commented Jul 13, 2022

Summary

We're running Matomo in an environment with a rather high number of trackings. Recently we lost some until we noticed that log_action.idaction reached 2^31. In the log we saw that errror: Error in Matomo (tracker): Error query: Mysqli statement execute error : Out of range value for column 'idaction' at row 1 In query: INSERT INTO matomo_log_action (name, hash, type, url_prefix) VALUES (?,CRC32(?),?,?)

We altered all relevant tables and switched to BIGINT by hand. But from my point of view it would be better if Matomo wouldn't require such a "hack" and use BIGINT as default.

Do you still consider it better to save storage and force users to modify Matomo's tables. Or wouldn't be better to revert #10569 and use BIGINT as default?

Expected Behavior

Matomo can handle that number of actions without the need to change the database schema manually.

Current Behavior

Changes to the database schema need to be done on high-traffic sites.

Possible Solution

Revert #10569

Steps to Reproduce (for Bugs)

Use Matomo until log_action.id reaches 2^31

Your Environment

  • Matomo Version: 4.10.1
  • PHP Version: 7.4.6
  • Server Operating System: SLES 15.3 SP3 (Docker-Image)
@stefan-work stefan-work added the Potential Bug Something that might be a bug, but needs validation and confirmation it can be reproduced. label Jul 13, 2022
@bx80
Copy link
Contributor

bx80 commented Jul 13, 2022

Hi @stefan-work, thanks for sharing this experience, I'm sorry to hear that this limitation caused you problems and I'm glad you managed to work around it.

Database limitation are something we're currently evaluating and this sort of feedback is a valuable input to that process . 👍

@bx80 bx80 added Bug For errors / faults / flaws / inconsistencies etc. and removed Potential Bug Something that might be a bug, but needs validation and confirmation it can be reproduced. labels Jul 13, 2022
@bx80 bx80 added this to the 5.0.0 milestone Jul 13, 2022
@sgiehl
Copy link
Member

sgiehl commented Oct 7, 2022

The reason for not having idaction a BIGINT was explained here #3288 (comment)
Not sure if this is still something that is relevant. ping @tsteur @mattab

@bx80
Copy link
Contributor

bx80 commented Oct 9, 2022

BIGINT keys will be required to support distributed databases, especially if using random keys. Any future support for single schema would also depend on having a large enough key space. We could have a different schema in that case, but it would be better for data migration and interoperability to standardize on BIGINT keys unless there is still a good reason not to.

@sgiehl
Copy link
Member

sgiehl commented Oct 10, 2022

I don't think preserved disc space for keys should be that much of an issue nowadays, but just wanted to get a confirmation from @tsteur or @mattab as it seems we made the decision on purpose not to use bigint there in the past...
Also changing that columns means a potential big database migration, which we imho wanted to avoid for Matomo 5. If we decide to do that nevertheless, we could also consider merging #17466 to Matomo 5, as it was moved to Matomo 6, due to the required database migration.

@mattab mattab added the 5.0.0 label Jan 4, 2023
@mattab mattab modified the milestones: 5.0.0, 6.0.0 Jan 23, 2023
@mattab
Copy link
Member

mattab commented Jan 23, 2023

For new installations only (not during upgrade - this will be covered in #19524)
then we could change log_action.idaction to be BIGINT and also change all other fields.

But there are disk space implications as we found in #3288 (comment)

We just realised that InnoDB will book the row space (ie. 8 bytes per BIGINT) even for NULLable columns. therefore we decide that it's not needed for now, to make log_action.idaction BIGINT, as it would add a large overhead of (10 idaction* fields * 4 bytes overhead) per action, ie. an overhead of 10 x 4 bytes = 40b per action. Not willing to add such overhead for all users when only < 0.01% will have have a log_action table with more than 4 billion entries. So i'll partially revert the changes and only make BIGINT the primary/foreign keys log_visit.idvisit and log_link_visit_action.idlink_va

Maybe another solution could be to have a console command to let people optionally trigger the column upgrade when they need it.

@bx80
Copy link
Contributor

bx80 commented Jan 23, 2023

Out of interest I did some rough partial estimates on the monetary cost of increasing the idaction key size.
Typical AWS RDS storage cost: ~$0.115 per GiB/month.
4 bytes extra per BIGINT x 10 idaction keys = 40bytes extra per action.

Give these cost increases in table storage only (excluding index sizes, memory use and backup):

1,000,000 actions = 38MiB = $0.004 per month increase or $0.05 per year
10,000,000 actions = 381MiB = $0.043 per month increase or $0.54 per year
100,000,000 actions = 3,814MiB = $0.428 per month increase or $5.14 per year
1,000,000,000 actions = 38,146MiB = $4.28 per month increase or $51.40 per year

Since data was lost because this limit was hit, perhaps in addition to a console command to upgrade the columns we could also add a simple (weekly?) scheduled task to check tables with INT primary keys and send an admin warning if they have reached 90% (?) of the keyspace along with a link to an FAQ explaining how to upgrade the columns?

@mattab mattab removed the 5.0.0 label Aug 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug For errors / faults / flaws / inconsistencies etc.
Projects
None yet
Development

No branches or pull requests

4 participants