How to use the Text Similarity Checker
Enter a string of text in each of the two boxes and hit calculate. The tool will report, as a decimal, how similar the two text strings are to one another.
The tool is based on the Levenshtein distance algorithm, which calculates the number of single-character changes (additions, deletions or substitutions) required to get from one text string to another.
What is Levenshtein Distance?
The Levenshtein distance is a count of the single-character changes needed to turn a starting string into an alternative string. For example, converting the word “ten” to “top” requires 2 changes.
First, you would change “e” to “o”, leaving you with “ton”.
The second change would be to replace “n” with “p”, giving you the output string: “top”.
Each character substituted, added or removed counts as one change. The comparison is usually case-sensitive, so “Ten” to “top” would count a third change for the capital “T”.
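The counting described above can be sketched in a few lines of Python. This is a minimal illustration of the standard dynamic-programming approach, not the tool's actual code; the function name levenshtein is our own.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning i characters of a into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep a match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("ten", "top"))  # 2
print(levenshtein("Ten", "top"))  # 3, because the comparison is case-sensitive
```

Only two rows of the comparison grid are kept at any time, which keeps memory use small even for long passages.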
How to use the Levenshtein Distance Tool for SEO
This tool can help with SEO in several ways: some are obvious, while for other tasks the immediate benefit is less clear.
Close Duplication Checker
Do you have a paragraph of text or the body of an article that you suspect closely duplicates another piece? You can use the text similarity checker to determine how similar two pieces of text are. The score is a decimal that can be read as a percentage.
For example, take the text string:
Climate change is one of the biggest challenges facing our planet today. Rising global temperatures, extreme weather patterns, and sea level rise are just a few of the impacts that we are already seeing. Here is a sustainable bin bag that degrades naturally over time so you can throw it out with your compost.
and compare it to:
Climate change is one of the biggest challenges facing our planet today. Rising global temperatures, extreme weather patterns, and sea level rise are just a few of the impacts that we are already seeing. Here is a sustainable trash can bag that degrades naturally over time so you can throw it out with your trash.
It is clear the text is very similar, but the string similarity tool gives you a numeric score that shows exactly how similar the passages are. In this case the similarity score is 0.96, indicating a 96% match.
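The tool's exact formula for turning an edit distance into a score is not given here. A common approach, assumed in the sketch below, is to divide the distance by the length of the longer string and subtract the result from 1. The function names are our own, and scores from this sketch may differ slightly from the tool's output.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalisation: 1.0 means identical, 0.0 means nothing in common."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1 - levenshtein(a, b) / longest


print(round(similarity("ten", "top"), 2))  # 0.33 (2 changes over 3 characters)
print(similarity("climate", "climate"))    # 1.0
```

Pasting the two passages above into similarity() as plain strings gives a score of the same kind as the one the tool reports.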
Content Scraping Checks
Have you found an article on another website that reads very similarly to one first published on your site?
If you suspect that a content piece or passage of text on a competitor website was copied from your own, you can use the tool to quickly determine how similar the two passages actually are.
A text string taken from the page “5 common mistakes with rel=canonical”, hosted on the Google Developers Blog, has clearly been copied across a number of websites.
Comparing the paragraph from the Google Developers Blog with the first paragraph of the WebmasterWorld.com page, the text similarity checker scores a 0.98 match, indicating the text is 98% similar: enough to be confident it is a near-exact duplicate.
It is, however, worth noting that WebmasterWorld.com is not a scraping website. It references and links to the original source of the information, and the content is correctly attributed as part of a message board thread about rel=canonical tags.
Targeting Relevance Tests
Combine this tool with our keyword frequency tool to assess how relevant your title tag and meta description are to the most commonly occurring words in your keyword set.
Suppose you are working on SEO for a financial services client that offers credit cards. Your keyword list might look something like:
credit cards
best credit cards 2023
chase credit cards
best cash back credit cards
0 interest credit cards
credit cards apply
credit cards applications
credit cards accepted
By running this list through the keyword frequency tool you’d get the following list:
Word           Count
credit         8
cards          8
best           2
0              1
2023           1
chase          1
cash           1
back           1
interest       1
apply          1
applications   1
accepted       1
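The same word counts can be reproduced with a short script. This is a minimal sketch using Python's standard library, not the keyword frequency tool itself; word_frequency is a name chosen for illustration, and it assumes words are split on whitespace and lowercased.

```python
from collections import Counter

keywords = [
    "credit cards",
    "best credit cards 2023",
    "chase credit cards",
    "best cash back credit cards",
    "0 interest credit cards",
    "credit cards apply",
    "credit cards applications",
    "credit cards accepted",
]


def word_frequency(phrases):
    """Count how often each whitespace-separated word occurs across all phrases."""
    counts = Counter()
    for phrase in phrases:
        counts.update(phrase.lower().split())
    return counts


counts = word_frequency(keywords)
# Most frequent words first; words with equal counts may appear in a
# different order from the table above.
for word, count in counts.most_common():
    print(word, count)
```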
Using the commonly occurring words you could write a title tag like: Apply for Credit Cards Online Today
Use the title tag length checker to make sure your title is the correct length.
Then, run your list of commonly occurring words through the text similarity checker with your title tag to see how well the title tag “scores”.
The score for this title tag is quite low: 0.19. A second title tag, such as “0% Cashback Credit Cards Apply Today”, scores 0.29, so it could be considered “more optimised” and may, in a comparative test, attract more organic traffic from search engines than the lower-scoring, lower-relevance title tag.
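If you have several candidate title tags, the comparison can be scripted. The sketch below is an illustration under stated assumptions: the word list is joined with spaces, both inputs are lowercased, and the score is 1 minus the edit distance divided by the longer length. The tool may prepare its inputs differently, so the scores printed here are not guaranteed to match the figures quoted above. rank_titles is a hypothetical helper name.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalisation of the edit distance to a 0 to 1 score."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


def rank_titles(common_words, titles):
    """Score each title against the joined word list, highest score first."""
    reference = " ".join(common_words).lower()
    scored = [(similarity(reference, title.lower()), title) for title in titles]
    return sorted(scored, reverse=True)


common_words = ["credit", "cards", "best", "0", "2023", "chase", "cash",
                "back", "interest", "apply", "applications", "accepted"]
titles = [
    "Apply for Credit Cards Online Today",
    "0% Cashback Credit Cards Apply Today",
]

for score, title in rank_titles(common_words, titles):
    print(f"{score:.2f}  {title}")
```

Because the score depends on character order as well as word choice, treat it as a rough comparative signal between candidate titles rather than an absolute measure of relevance.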
Repeat the process for your meta descriptions and you’ll have a set of title tags and meta descriptions, each with a “relevance” score based on how closely its characters match your list of commonly occurring words.
You can schedule each title tag for testing to see whether the highest scoring title tags perform best in search results for the pages you are optimising.
Here is a beginner’s guide to writing title tags.