Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ecommerce Product average price may be incorrect, review implementation without using Custom Variables #12035

Closed
rosnoun opened this issue Sep 12, 2017 · 30 comments · Fixed by #15999
Assignees
Labels
Bug For errors / faults / flaws / inconsistencies etc.
Milestone

Comments

@rosnoun
Copy link

rosnoun commented Sep 12, 2017

When looking to the ecommerce products report, the average product price display incorrect value. The value seems to be multiplied by number of days for which there has been visit on the product page.

I've check that all my visits log product information report correct values directly into the database...

Piwik 3.1.0
Custom e-commerce report plugin (woocommerce)

@tsteur
Copy link
Member

tsteur commented Sep 12, 2017

Just had a look and the queries itself in the standard ecommerce plugin looks good.

Just to be sure: Are you using the standard ecommerce reporting plugin in Piwik or did you write your own custom reporting plugin? If so, it is open source and we can have a look at it?

@rosnoun
Copy link
Author

rosnoun commented Sep 12, 2017

I'm using the standard ecommerce build in reporting functions of piwik yes...
I've developped my own woocommerce javascript plugin to track ecommerce data, sorry for the confusion. This plugins modifies the javascript tracking code from WP-piwik to include ecommerce data, that's all...

@tsteur
Copy link
Member

tsteur commented Sep 12, 2017

Just to avoid any confusion. Do you mind letting me know which report you are looking at and experience this issue?

@rosnoun
Copy link
Author

rosnoun commented Sep 12, 2017

I'm looking at ecommerce -> product then both "product code" and "product name" reports show incorrect average price data
"Product category" does seems to be affected

@tsteur
Copy link
Member

tsteur commented Sep 12, 2017

When you see an Average Price... does it show next to it a number for "Unique purchases"?

image

Do you mind commenting this line https://github.com/piwik/piwik/blob/3.1.0/plugins/Goals/API.php#L324 meaning putting // (two slashes) in front of that line like // $rowView->renameColumn(Metrics::INDEX_ECOMMERCE_ITEM_PRICE_VIEWED, self::AVG_PRICE_VIEWED);

By looking at the code I can think of two problems which makes me think there really is a bug. The referenced link which renames "privce_viewed" to "avg_price" and then it does not calculate a proper value anymore...

and there are also filters like Average Price https://github.com/piwik/piwik/blob/3.1.0/plugins/Goals/Columns/Metrics/AveragePrice.php#L42-L46 where we fall back and divide by abandoned cards instead of orders (but I don't think this is the problem)

@rosnoun
Copy link
Author

rosnoun commented Sep 12, 2017

As an example, here after the report gives me from september 8 to september 12:
HE-orange-douce-10 | 0 € | - | - | 4 € | - | 3 | 0 %
HE-orange-douce-5 | 0 € | - | - | 3.40 € | - | 2 | 0 %

And here after the report gives me from september 12 to september 12:
HE-orange-douce-10 | 0 € | - | 2 € | - | 1 | -
HE-orange-douce-5 | 0 € | - | 1.70 € | - | 1 | -

In the database, I extract with this query:
SELECT server_time, custom_var_k2, custom_var_v2, custom_var_k3, custom_var_v3, custom_var_k4, custom_var_v4, custom_var_k5, custom_var_v5 FROM piwik_log_link_visit_action WHERE idsite = 3 AND custom_var_v3 LIKE 'HE-orange%';

And I obtain (CSV):
2017-09-08 22:24:12,"_pkp","2","_pks","HE-orange-douce-10","_pkn","Orange douce - Huile essentielle bio","_pkc","[""Huiles essentielles""]"
2017-09-08 22:24:17,"_pkp","2","_pks","HE-orange-douce-10","_pkn","Orange douce - Huile essentielle bio","_pkc","[""Huiles essentielles""]"
2017-09-08 22:24:17,"_pkp","1.7","_pks","HE-orange-douce-5","_pkn","Orange douce - Huile essentielle bio","_pkc","[""Huiles essentielles""]"
2017-09-08 22:24:21,"_pkp","2","_pks","HE-orange-douce-10","_pkn","Orange douce - Huile essentielle bio","_pkc","[""Huiles essentielles""]"
2017-09-08 22:24:21,"_pkp","1.7","_pks","HE-orange-douce-5","_pkn","Orange douce - Huile essentielle bio","_pkc","[""Huiles essentielles""]"
2017-09-09 21:31:47,"_pkp","2","_pks","HE-orange-douce-10","_pkn","Orange douce - Huile essentielle bio","_pkc","[""Huiles essentielles""]"
2017-09-12 21:11:11,"_pkp","2","_pks","HE-orange-douce-10","_pkn","Orange douce - Huile essentielle bio","_pkc","[""Huiles essentielles""]"
2017-09-12 21:11:31,"_pkp","2","_pks","HE-orange-douce-10","_pkn","Orange douce - Huile essentielle bio","_pkc","[""Huiles essentielles""]"
2017-09-12 21:11:31,"_pkp","1.7","_pks","HE-orange-douce-5","_pkn","Orange douce - Huile essentielle bio","_pkc","[""Huiles essentielles""]"
2017-09-12 21:23:07,"_pkp","2","_pks","HE-orange-douce-10","_pkn","Orange douce - Huile essentielle bio","_pkc","[""Huiles essentielles""]"
2017-09-12 21:23:07,"_pkp","1.7","_pks","HE-orange-douce-5","_pkn","Orange douce - Huile essentielle bio","_pkc","[""Huiles essentielles""]"

@rosnoun
Copy link
Author

rosnoun commented Sep 12, 2017

In the example I provided, the unique purchase is empty. But I've found other example were the unique purchase is 4 (for also 4 unit purchased) and the problem still occurs

I've commented the line as indicated.
After that, the average price shows 0

@rosnoun
Copy link
Author

rosnoun commented Sep 13, 2017

any update on the issue?

@tsteur
Copy link
Member

tsteur commented Sep 14, 2017

The price "should" be correct when there are unique purchases. When there are no purchases, the price should be zero as well I'd presume but maybe not @mattab ? I'd find it very confusing to not see any purchases but then still see a price. Are you sure your custom woocommerce tracker plugin is tracking the orders etc correctly? You should probably compare the shown pricing information with data from log_conversion and log_conversion_item.

Commenting the other line I mentioned should still leave the price when there are purchases but set all others to zero.

@mattab
Copy link
Member

mattab commented Sep 14, 2017

The price "should" be correct when there are unique purchases. When there are no purchases, the price should be zero as well I'd presume but maybe not @mattab ? I'd find it very confusing to not see any purchases but then still see a price.

FYI: the reason you can still see a price when there is no purchase, is when the following feature is used: "Tracking Product Page Views & Category Page Views (optional)" see the guide: https://piwik.org/docs/ecommerce-analytics/#tracking-product-page-views-category-page-views-optional
The tracking of product/category pageviews also include tracking of the price.

@rosnoun
Copy link
Author

rosnoun commented Sep 15, 2017

Indeed, as matab pointed out, this is what I have implemented in my code.
That's why I have product information in the piwik_log_link_visit_action table, which wouldn't be the case if I only had cart and order tracking (in such case, the data are indeed in log_conversion and log_conversion_item)
For the case in my example, I don't necessarily have data related HE-orange-douce-5 or HE-orange-douce-10 in log_conversion and log_conversion_item tables.

@tsteur
Copy link
Member

tsteur commented Sep 17, 2017

@mattab be good to make this more clear in the UI. I did not expect this behaviour... Eg beside "Ecommerce orders and Ecommerce abandoned carts" I would have expected another item "Ecommerce views" or "Product views" etc.

@rosnoun do you have this problem only for week / month / year periods or also when only viewing a day? Looks like the average is calculated directly in MySQL which looks to me like this might not be showing correct values when viewing for week/month/year/range in https://github.com/piwik/piwik/blob/3.1.1-b1/plugins/CustomVariables/Archiver.php#L125

Next problem is that it looks like it generates an average of all custom variable values here even if the custom variable slot is used for something else: https://github.com/piwik/piwik/blob/3.1.1-b1/plugins/CustomVariables/Archiver.php#L112-L114

After seeing this and quickly debugging it I stopped any further investigation as it is clear this feature was not properly implemented and needs to be re-built in a less "hacky" way. The data it generates may be quite buggy as you pointed out @rosnoun .

As the feature might require schema changes we cannot work on it before Piwik 4.0

@tsteur tsteur added the Bug For errors / faults / flaws / inconsistencies etc. label Sep 17, 2017
@mattab mattab added this to the 4.0.0 milestone Sep 18, 2017
@mattab
Copy link
Member

mattab commented Sep 18, 2017

@tsteur as per your suggestion, moving this issue to Piwik 4 where we'll try to improve the implementation of Ecommerce view tracking.

@mattab mattab changed the title Product average price is incorrect Ecommerce Product average price may be incorrect, review implementation without using Custom Variables Sep 18, 2017
@mattab mattab removed this from the 4.0.0 milestone Oct 6, 2017
@mattab mattab added this to the 4.0.0 milestone Nov 29, 2017
@katebutler katebutler self-assigned this Dec 19, 2019
@sgiehl
Copy link
Member

sgiehl commented Mar 2, 2020

@mattab @tsteur what exactly should be done here?

@tsteur
Copy link
Member

tsteur commented Mar 2, 2020

@sgiehl not quite sure. Could maybe move it out of Matomo 4. Maybe look at the initial comment again and this

@rosnoun do you have this problem only for week / month / year periods or also when only viewing a day? Looks like the average is calculated directly in MySQL which looks to me like this might not be showing correct values when viewing for week/month/year/range in https://github.com/piwik/piwik/blob/3.1.1-b1/plugins/CustomVariables/Archiver.php#L125

Next problem is that it looks like it generates an average of all custom variable values here even if the custom variable slot is used for something else: https://github.com/piwik/piwik/blob/3.1.1-b1/plugins/CustomVariables/Archiver.php#L112-L114

I'm not quite into the topic anymore so not quite sure what is happening there. Maybe we can fix some archiving queries.

In general it seems like ecommerce misuses the custom variables feature which we remove in Matomo 4? In that case would need to migrate this logic.

@mattab
Copy link
Member

mattab commented Mar 5, 2020

@sgiehl Maybe we could hijack this issue and change scope to "Remove custom variables in ecommerce implementation" (or create a separate issue, and move this one out of Matomo 4)

@sgiehl
Copy link
Member

sgiehl commented Apr 27, 2020

@tsteur @mattab just had a look at the way custom variables are used for ecommerce. It actually only affect ecommerce product views, where category, price, name and sku are tracked in custom variables.

Thought about just moving those in separate columns, like we did for search category and count, but I'm unsure if that would be the best solution. We actually already have dimensions for product name, price, category and sku, but those are in the log_conversion_item table and they are implemented using idactions and are segmentable that way. Should we convert the product view fields also using idactions? Guess that would be a longer conversion process, as we would need to look up and maybe insert new values in the log_action table.

Actually I wonder if it would make sense to change the log_conversion_item to not only track items that have been converted, but also those that have been viewed. Some fields (like quantity) would need to be nullable and we would need to add an additional type field to indicate if it's part of an order or a view. But then the existing ecommerce segments would automatically show product views. not sure if that would be expected, or if we would need to have new segments for product views or if they should not have any at all.

Any thoughts / suggestions?

@mattab
Copy link
Member

mattab commented Apr 27, 2020

Might work to use the existing table 👍
Just need to make sure the segment don't break and change meaning.
Would need introduce new segments for product views and rename labels for the ones we have maybe for clarity to show they are filtering visits that purchased a given product.

@tsteur
Copy link
Member

tsteur commented Apr 27, 2020

Just had a look into tracking, archiving, and storage etc how it all works.

If I see this correctly, so far as part of a page view we track basically these additional 4 values in the log_link_visit_action table

  • product category (only one)
  • product price (optional)
  • product sku (optional)
  • product name

Random first thought I had would it make sense to keep 5 custom_var_vX columns so log_link_visit_actions generally become more flexible and can store additional values? It's of course in some ways limited such as "what if we now want more data per action and the fields aren't enough etc". Also it wouldn't be always clear what is stored in which field. Could work generally for site search though as well etc.

On the other side conversion_item may allow us to potentially store more values, but then also wondering are we removing the hack that was used with custom variables just to now add a hack around "conversion items". In this case it's not a conversion item, there is no value for the orderId primary key etc.

Another random thought was what if "product view" was a specific action just like outlink etc. Then we could reuse idaction_name, custom_float for price, etc. but of course it's also not ideal (eg more tracking requests, not sure how this be shown in the visits log since it is basically only an enrichment maybe of the page view, etc).

With conversion item would indeed need to make sure existing segments still work the same way and they apply a filter on the item type. How would it impact the idvisit,idorder,idaction_sku), primary key? Can you see any other things it impacts?

@sgiehl
Copy link
Member

sgiehl commented Apr 28, 2020

I think the solution depends on the goals we want to achieve. Having additional "product view" actions has various advantages. Same applies for modifying and reusing the log_conversion_item table.
Basically I would say we need to decide what should be possible or not:

  • Do we want to support multiple product views per page view
  • Do we want to have new segments for product views or shall the old product segments include those
  • Shall product views support the same arguments as product sales (e.g. multiple categories)
  • Do we want to be able to generate reports based on product views easily and compare them with e.g. product sales

@tsteur
Copy link
Member

tsteur commented Apr 28, 2020

Not sure on the multiple product per page view, multiple categories, and more reports. It be nice probably, at the same time it be great to maybe ignore this for now and focus on the problem itself as these feature wouldn't have a big impact for now. In a few years we might want the feature though (could then still migrate the DB who knows what the DB will look like then as we then might also add other things like refunds, etc).

Do we want to have new segments for product views or shall the old product segments include those

The old product segments should stay and they should behave the same. Eg if we used the items table, then the segment should automatically only consider the items from orders. A new segment for product view items is not needed for now (be a different issue). If it comes easily out of the box with whatever implementation we go with, we take it I guess. It maybe shouldn't influence the decision though (hoping that helps).

I thought it be great to keep the data in log_link_visit_action in the columns they are already in so we wouldn't even need to migrate any data and we'd need to delete less custom variables columns. However, I realise this might not be as easy because we will be still having the custom variables feature just not in core. If we removed custom variables feature completely, then I'd be quite interested in keeping the data where they are and reuse those columns so actions can store additional data, just like site search does. However, we will still have custom variables so it's probably not simple.

I have not really a big preference. Is there a way you can estimate how much work it be to use the items table?

@sgiehl
Copy link
Member

sgiehl commented Apr 30, 2020

Actually I didn't even know how product views are displayed in Matomo, so I've now adjusted the VisitorGenerator to add some random views (matomo-org/plugin-VisitorGenerator#54). So we currently have no reports that displays the product views. They are filtered out of the custom variables reports and even in the visitor log they are displayed with their tracking parameter:

image

But as they are stored as custom variables it's kind of possible to segment by it if you know which parameter is what.

Back to the new implementation. Simply adding new columns in log_link_visit_action table, as we did for search category / count is quite easy to do and to migrate. Adjusting the log_conversion_item table would be a bigger load of work I guess as we would also need to adjust the whole implementation of ecommerce sales. Unfortunately I'm not very deep into that whole ecommerce implementation, so it's hard for me to estimate that properly without spending some hours to look that through.

@tsteur
Copy link
Member

tsteur commented Apr 30, 2020

@mattab can you confirm it's only used there? Is the feature even all that valuable then? Maybe could deprecate it until we eventually decide to implement a better implementation of it with more features?

Just looking at the UI (not code)... is it maybe used in the product conversion rate?

image

@sgiehl
Copy link
Member

sgiehl commented Apr 30, 2020

No doesn't seem so:

public function compute(Row $row)
{
$orders = $this->getMetric($row, 'orders');
$abandonedCarts = $this->getMetric($row, 'abandoned_carts');
$visits = $this->getMetric($row, 'nb_visits');
return Piwik::getQuotientSafe($orders === false ? $abandonedCarts : $orders, $visits, GoalManager::REVENUE_PRECISION + 2);
}

@tsteur
Copy link
Member

tsteur commented Apr 30, 2020

Thanks. Honestly, if that's all it does, I'd remove this feature and add it back later. @mattab

@mattab
Copy link
Member

mattab commented Apr 30, 2020

The feature is used to calculate:

  • Visits on a product (product views)
  • Conversion rate of a product (as seen in the code above where it uses $visits = $this->getMetric($row, 'nb_visits'); )

both are visible in the Ecommerce > Product reports. They are valuable, but if it's too hard to keep this feature we could possibly remove it...

Simply adding new columns in log_link_visit_action table, as we did for search category / count is quite easy to do and to migrate.

What would be estimate for this? or is there a downside with it?

(btw If we decide to remove the feature, users could still get the same "Product views" metric by Tracking a custom dimension of action scope in each product view. But I don't think they could get "Product conversion rate" in a custom report (because custom dimension of action scope don't yet have "per-goal" conversion metrics. if we had that, we could easily delete the "Product view" feature without any loss of feature))

@sgiehl
Copy link
Member

sgiehl commented May 15, 2020

Ok. So how shall we proceed? moving that into new columns should be quite easy and fast to implement.

@tsteur
Copy link
Member

tsteur commented May 17, 2020

@sgiehl no big preference. I'm tempted to simply remove it but if it's quick and easy to do feel free to put it into new columns. Wonder if then we just have these as custom link action columns that could be also used by other types eventually? Similar to custom_float. Not sure it makes sense but sometimes may be useful and would prevent needing random extra columns. Haven't thought too much about it though.

@sgiehl
Copy link
Member

sgiehl commented May 25, 2020

Tried to start implementing that by moving the custom variables into new fields. But it imho doesn't make much sense to continue here. That seems much more comprehensive than I thought in the first place.

I've now realized how the product views are integrated into the product reports.
There is some special code in the custom variables plugin like this

// IF we query Custom Variables scope "page" either: Product SKU, Product Name,
// then we also query the "Product page view" price which was possibly recorded.
$additionalSelects = false;
if (in_array($slot, array(\MatomoTracker::CVAR_INDEX_ECOMMERCE_ITEM_SKU, \MatomoTracker::CVAR_INDEX_ECOMMERCE_ITEM_NAME, \MatomoTracker::CVAR_INDEX_ECOMMERCE_ITEM_CATEGORY))) {
$additionalSelects = array($this->getSelectAveragePrice());
}

protected function aggregateActionByKeyAndValue($key, $value, $row)
{
$this->dataArray->sumMetricsActionsPivot($key, $value, $row);
if ($this->isReservedKey($key)) {
// Price tracking on Ecommerce product/category pages:
// the average is returned from the SQL query so the price is not "summed" like other metrics
$index = Metrics::INDEX_ECOMMERCE_ITEM_PRICE_VIEWED;
if (!empty($row[$index])) {
$this->dataArray->setRowColumnPivot($key, $value, $index, (float)$row[$index]);
}
}
}

It also seems possible to set multiple categories for a page view, which is then splitted while archiving:

protected function aggregateEcommerceCategories($key, $value, $row)
{
$ecommerceCategoriesAggregated = false;
if ($key == '_pkc'
&& $value[0] == '[' && $value[1] == '"'
) {
// In case categories were truncated, try closing the array
if (substr($value, -2) != '"]') {
$value .= '"]';
}
$decoded = json_decode($value);
if (is_array($decoded)) {
$count = 0;
foreach ($decoded as $category) {
if (empty($category)
|| $count >= GoalManager::MAXIMUM_PRODUCT_CATEGORIES
) {
continue;
}
$this->aggregateActionByKeyAndValue($key, $category, $row);
$ecommerceCategoriesAggregated = true;
$count++;
}
}
}
return $ecommerceCategoriesAggregated;
}

All those special reports are than used to enrich the product reports when the API is called:

protected function enrichItemsTableWithViewMetrics($dataTable, $recordName, $idSite, $period, $date, $segment)
{
// Enrich the datatable with Product/Categories views, and conversion rates
$customVariables = \Piwik\Plugins\CustomVariables\API::getInstance()->getCustomVariables($idSite, $period, $date, $segment, $expanded = false,
$_leavePiwikCoreVariables = true);
$mapping = array(
'Goals_ItemsSku' => '_pks',
'Goals_ItemsName' => '_pkn',
'Goals_ItemsCategory' => '_pkc',
);
$reportToNotDefinedString = array(
'Goals_ItemsSku' => Piwik::translate('General_NotDefined', Piwik::translate('Goals_ProductSKU')), // Note: this should never happen
'Goals_ItemsName' => Piwik::translate('General_NotDefined', Piwik::translate('Goals_ProductName')),
'Goals_ItemsCategory' => Piwik::translate('General_NotDefined', Piwik::translate('Goals_ProductCategory'))
);
$notDefinedStringPretty = $reportToNotDefinedString[$recordName];
$customVarNameToLookFor = $mapping[$recordName];
// Handle case where date=last30&period=day
if ($customVariables instanceof DataTable\Map) {
$customVariableDatatables = $customVariables->getDataTables();
$dataTables = $dataTable->getDataTables();
foreach ($customVariableDatatables as $key => $customVariableTableForDate) {
$dataTableForDate = isset($dataTables[$key]) ? $dataTables[$key] : new DataTable();
// we do not enter the IF
// if case idSite=1,3 AND period=day&date=datefrom,dateto,
if ($customVariableTableForDate instanceof DataTable
&& $customVariableTableForDate->getMetadata(Archive\DataTableFactory::TABLE_METADATA_PERIOD_INDEX)
) {
$dateRewrite = $customVariableTableForDate->getMetadata(Archive\DataTableFactory::TABLE_METADATA_PERIOD_INDEX)->getDateStart()->toString();
$row = $customVariableTableForDate->getRowFromLabel($customVarNameToLookFor);
if ($row) {
$idSubtable = $row->getIdSubDataTable();
$this->enrichItemsDataTableWithItemsViewMetrics($dataTableForDate, $idSite, $period, $dateRewrite, $segment, $idSubtable);
}
$dataTable->addTable($dataTableForDate, $key);
}
$this->renameNotDefinedRow($dataTableForDate, $notDefinedStringPretty);
}
} elseif ($customVariables instanceof DataTable) {
$row = $customVariables->getRowFromLabel($customVarNameToLookFor);
if ($row) {
$idSubtable = $row->getIdSubDataTable();
$this->enrichItemsDataTableWithItemsViewMetrics($dataTable, $idSite, $period, $date, $segment, $idSubtable);
}
$this->renameNotDefinedRow($dataTable, $notDefinedStringPretty);
}
}

Actually I guess we could move that all to the Goals archiver and do some additional queries there, to get those data directly while archiving (and no longer combine that in API).

But I doubt it make sense to store that in those new fields at all. Storing multiple categories in one field isn't that good. Also the conversion_item table only stores ids for name, sku and category[1-5] (linked to log_action table), so imho it would make sense to do the same for product views. But adding id fields to the log_link_visit_action table makes it impossible to convert existing data while updating. So we would need to break BC and need to provide a migration script for those who need the data...

Long story short:
We could migrate those fields to their own fields nevertheless and ignore that there is much space for improvements. Guess it would take a couple days to do that, as all the special handling for custom variables needs to be migrated to the goals plugin.

Shall I go on with that, or shall we discuss other solutions?

@tsteur
Copy link
Member

tsteur commented May 25, 2020

@sgiehl sure happy to discuss other solutions. Do you have any preference? Maybe give that preference a quick try for few hours to see if would work?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug For errors / faults / flaws / inconsistencies etc.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants