A Theory About Duplicate Content

This is more of a question for those who have more experience in search engine optimization. I’m asking because I haven’t yet tried to test this empirically and I’m wondering if it’s even worth the effort to do so — better to know I’m completely wrong before I invest, right?

It’s my understanding that Google groups search terms with a common meaning into categories. Put another way, a search may be matched against the category’s meaning rather than against the individual words typed. I wonder if Google does the same thing when judging duplicate content. Suppose that I have a website on Friedrich Hayek and the content sucks. One page might give a very superficial overview of his business cycle theory, and the other three pages might cover exactly the same points, just worded differently. My theory is that Google would know that the content is duplicate, because it can use term categories to figure out the general ideas being covered on each page.
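To make the idea concrete, here is a toy sketch of the difference between a check that only catches copied wording and one that compares the ideas covered. It is not a description of how Google works; the `TERM_CATEGORIES` table, the two sample sentences, and the function names are all made up for illustration.

```python
import re

# Hypothetical term categories: words with a common meaning share one label.
# This table is invented for the example and is not taken from any search engine.
TERM_CATEGORIES = {
    "cheap": "EASY_MONEY", "easy": "EASY_MONEY",
    "credit": "CREDIT", "loans": "CREDIT",
    "fuels": "CAUSES", "drive": "CAUSES",
    "boom": "EXPANSION", "expansion": "EXPANSION",
    "bust": "CONTRACTION", "recession": "CONTRACTION",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def jaccard(a, b):
    """Size of the overlap divided by the size of the union (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def shingle_similarity(text_a, text_b, size=3):
    """Copy-and-paste style check: compare runs of `size` consecutive words."""
    def shingles(text):
        words = tokenize(text)
        return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}
    return jaccard(shingles(text_a), shingles(text_b))


def concept_similarity(text_a, text_b):
    """Category style check: compare the sets of concepts the words map to."""
    def concepts(text):
        return {TERM_CATEGORIES[w] for w in tokenize(text) if w in TERM_CATEGORIES}
    return jaccard(concepts(text_a), concepts(text_b))


page_one = "Cheap credit fuels a boom, and the boom ends in a bust."
page_two = "Easy loans drive an expansion, then the expansion collapses into a recession."

print(shingle_similarity(page_one, page_two))  # no shared three-word runs
print(concept_similarity(page_one, page_two))  # same concepts on both pages
```

The two sample pages share no three-word run, so the wording check sees them as unrelated, while the category check sees the same five concepts on both. If something along these lines happens at scale, rewording a page would not be enough to make it count as new content.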

A more concrete example is Teeth Tomorrow™, a site with duplicate content where each page is still unique enough to pass a simple duplication check (one that might test only for copying and pasting). Check out the product description page, the “Why Teeth Tomorrow™” page, and the “Benefits” page. They all cover pretty much the same points, just written in slightly different ways. My theory is that Google is penalizing that site. Although it ranks fifth when you search for “Teeth Tomorrow”, it sits below the doctor’s main site and below an infosite on dental implants (also owned by the same doctor), both of which have only one page on the product rather than seven or more.

Am I re-inventing the wheel? Am I wrong? Just wondering, because the answer should offer a clue about how content marketers should train their writers.