This follow-up project to vector search asks a slightly different question:
What happens if the input is not a query, but a piece of content itself?
Instead of searching “for” something, we let the page behind a URL describe what it means, and then find other content that means the same thing.
The Idea
Every page in the content database has already been processed and converted into a vector embedding that represents its meaning.
When someone enters a URL:
The content of that page is fetched
The page content is converted into a vector using the same embedding model
That vector is compared against the database of content vectors
Cosine similarity is used to measure semantic closeness
Results are ranked from highest to lowest similarity
Only results above a similarity threshold are returned
The output is a list of URLs that are semantically similar to the page you started with.
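The steps above can be sketched in a few lines of Python. This is a minimal, self-contained illustration: the `embed` function here is a toy bag-of-words stand-in (a real system would fetch the page's HTML and run it through the same embedding model used to build the database), and the vocabulary, URLs, and threshold value are all made up for the example.

```python
import math

# Toy embedding over a tiny fixed vocabulary. A real system would
# fetch the URL's content and use a sentence-embedding model instead;
# this stand-in only illustrates the flow of the pipeline.
VOCAB = ["garden", "soil", "plants", "stocks", "market", "investing"]

def embed(text: str) -> list[float]:
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def similar_urls(page_text: str, database: dict[str, list[float]],
                 threshold: float = 0.3) -> list[tuple[str, float]]:
    """Embed the input page, compare against every stored vector,
    keep matches above the threshold, and rank high to low."""
    v = embed(page_text)
    scored = [(url, cosine(v, vec)) for url, vec in database.items()]
    return sorted((s for s in scored if s[1] >= threshold),
                  key=lambda s: s[1], reverse=True)

# Pre-embedded "content database" (normally built offline).
db = {
    "example.com/gardening": embed("soil plants garden soil"),
    "example.com/finance":   embed("stocks market investing"),
}

# The gardening page ranks high; the finance page falls below the threshold.
print(similar_urls("my garden soil and plants", db))
```

The threshold is the knob that separates "related" from "merely nonzero": without it, every page in the database would appear in the output, just with a low score.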
No keywords. No tags. No predefined categories.
Just meaning compared to meaning.
Why This Is Useful
This approach answers a very natural question:
If a visitor is engaging with a piece of content, what other similar content would they be interested in?
Because the comparison happens in vector space, similarity is based on:
Topic overlap
Shared themes
Implied intent
Emphasis and tone
Two pages do not need to share the same words to be considered relevant. They just need to express similar ideas.
This makes the system especially effective for:
Content discovery and recommendations
Contextual advertising (without needing consumer data)
How This Extends Vector Search
Traditional search starts with intent expressed as text.
This project treats content itself as intent.
By embedding both the input URL and the content database into the same semantic space, the system can answer questions like:
“What content is most like this?”
“What themes does this page naturally align with?”
It is the same underlying mechanism as vector search, applied to a more contextual input.
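That shared mechanism can be made concrete: once everything lives in one vector space, the same ranking function serves a typed query and a fetched page alike. The vectors below are hand-made placeholders standing in for real embeddings, purely to show that the machinery does not care where the input vector came from.

```python
import numpy as np

# Illustrative only: three hand-made "embeddings" in a shared space.
# In practice all of these come from one embedding model, applied both
# to the database content and to the input (query text OR page content).
database = np.array([
    [0.9, 0.1, 0.0],   # doc 0
    [0.1, 0.9, 0.1],   # doc 1
    [0.8, 0.2, 0.1],   # doc 2
], dtype=float)

def rank(input_vec: np.ndarray) -> list[int]:
    """Same mechanism for any input vector: cosine similarity
    against every document, indices sorted most-similar first."""
    docs = database / np.linalg.norm(database, axis=1, keepdims=True)
    q = input_vec / np.linalg.norm(input_vec)
    sims = docs @ q                    # cosine similarity per document
    return list(np.argsort(-sims))     # highest similarity first

query_vec = np.array([1.0, 0.0, 0.0])  # e.g. an embedded search query
page_vec  = np.array([0.2, 1.0, 0.1])  # e.g. an embedded input page
print(rank(query_vec))  # nearest documents to the query
print(rank(page_vec))   # nearest documents to the page
```

The only difference between "search" and "content-to-content relevance" in this sketch is which vector you pass to `rank`.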
A Simple Takeaway
Keyword systems compare words. Vector systems compare meaning.
By using a URL as the input instead of a query, this project demonstrates how vector embeddings can power content-to-content relevance, not just search.
Once content is represented as vectors, finding related content becomes a geometry problem, not a text-matching problem.
And that turns out to be a much better fit for how content actually works.