I had another thread about this in General Support, but it got locked.
If your product page URL looks like this: domain/category/product,
you may notice that the AC-generated canonical URL in the head section looks different: domain/product.
Reasonably, you can't stop a bot from crawling both sets of product-page urls, especially if the canonical url specified in the head doesn't match the native url that results from navigation.
This means a bot will catalog two different URLs for each product page. From the bot's perspective, those are two different links to identical content, or "duplicate content".
My solution: change the code so that the AC-specified canonical url matches the native url.
Method for AC 1.2.10 -
root/storefront/controller/pages/product/product.php
About line 170, change
'href' => $this->html->getSEOURL('product/product', '&product_id=' . $product_id, '&encode'),
to
'href' => "https://$_SERVER[HTTP_HOST]$_SERVER[REQUEST_URI]",
Method for AC 1.2.11 -
root/storefront/controller/common/seo_url.php
About line 121, change
'href' => $url
to
'href' => "https://$_SERVER[HTTP_HOST]$_SERVER[REQUEST_URI]"
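One thing to watch with the replacement line: REQUEST_URI includes the query string, so a visit to domain/category/product?sort=price would advertise that full address as the canonical URL. Here is a minimal sketch of one way around that, assuming PHP 7 or newer. canonical_from_request() is a hypothetical helper name, not an AC function, and the https default is an assumption carried over from the lines above:

```php
<?php
// Hypothetical helper, NOT part of AbanteCart core.
// Builds a canonical URL from the current request, keeping only the path.
function canonical_from_request(array $server, string $scheme = 'https'): string
{
    // Host and URI as sent with the request; fall back to safe defaults.
    $host = $server['HTTP_HOST'] ?? '';
    $uri  = $server['REQUEST_URI'] ?? '/';

    // Cut the URI at the first "?" or "#" so query strings and
    // fragments never end up in the canonical URL.
    $path = substr($uri, 0, strcspn($uri, '?#'));
    if ($path === '') {
        $path = '/';
    }

    return $scheme . '://' . $host . $path;
}
```

With something like that in place, the replacement line would read 'href' => canonical_from_request($_SERVER) instead of the raw string. Treat it as untested against a stock AC install.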
I have a small site, only 3 levels deep at any product page. This mod works for me. There are probably other considerations for larger sites.
It's a core mod. Keep a record of your core mods so that you can reapply them after system updates, which overwrite the core files.