15,609,601 members
Articles / General Programming / Algorithms
Tip/Trick
Posted 12 Apr 2021

11.2K views
12 bookmarked

# How to Unscramble Any Word

Rate me:
This is an unscramble class that can be used to decypher any word.
I could never find a good way to unscramble a word on the interwebs. Every algorithm was either brute-force or permutations...

## Introduction

I could never find a good way to unscramble a word on the interwebs. Every algorithm was either brute-force or permutations.

The problem with brute-force is that it's a guessing game... very slow, or if you're lucky, very fast.

The problem with permutations is that once you go over 7 characters, you're using a bunch of memory.
e.g., a 12 character scrambled word has 479,001,600 configurations!

It finally dawned on me that if you sort the scrambled word and then sort the dictionary entries, then if we equate any sorted dictionary entry to our sorted scramble, then they must be a match!

There is probably some fancy machine learning algorithm that could do this, but my method works perfectly and instantly.

## Using the Code

You'll want to embed your favorite dictionary into your project (for speed and portability).

There are a lot of free dictionary files out there; here's the one I used... https://github.com/dwyl/english-words.

The work-horse is the `UnscrambleWord` method; this will take care of loading the dictionary, filtering and then sorting the results and storing them in a `List<string>` object that will be returned to you from the call.

C#
```class Unscramble
{
private static bool _dictionaryLoaded = false;
private static string _wordToUnscramble = "";
private static int _totalEntries = 0;
private static Dictionary<string, string> _sortedDictionary =
new Dictionary<string, string>();
private static List<string> _results = new List<string>();
private static Stopwatch _stopwatch;

//====================================================================================
/** We don't really need a constructor **/
//public Unscramble(string wordToUnscramble)
//{
//    _WordToUnscramble = wordToUnscramble;
//}

//====================================================================================
public List<string> UnscrambleWord(string wordToUnscramble, bool useFiltering = true)
{
_stopwatch = Stopwatch.StartNew();

if (string.IsNullOrEmpty(_wordToUnscramble))
{
_wordToUnscramble = wordToUnscramble;
}
else if (!_wordToUnscramble.Equals
(wordToUnscramble, StringComparison.OrdinalIgnoreCase) && useFiltering)
{   //If re-using the object and the word is different,
//we'll need to reload the dictionary
_wordToUnscramble = wordToUnscramble;
_results.Clear();
}
else if (_wordToUnscramble.Equals
(wordToUnscramble, StringComparison.OrdinalIgnoreCase))
{
_results.Clear(); //we should clear the results array so they don't stack
}

if (!_dictionaryLoaded) //the first call will be slightly slower

string scrambleSorted = SortWord(wordToUnscramble);

//var kvp = SortedDictionary.FirstOrDefault
//(p => SortedDictionary.Comparer.Equals(p.Value, scrambledSort));
var matchList = _sortedDictionary.Where
(kvp => kvp.Value == scrambleSorted).Select(kvp => kvp.Key).ToList();

if (matchList.Count > 0)
{
foreach (string result in matchList)
{
System.Diagnostics.Debug.WriteLine(\$"> Match: {result}");
}

_stopwatch.Stop();
System.Diagnostics.Debug.WriteLine(\$"> Elapsed time: {_stopwatch.Elapsed}");
return _results;
}
else //no matches
{
_stopwatch.Stop();
_results.Clear();
System.Diagnostics.Debug.WriteLine(\$"> Elapsed time: {_stopwatch.Elapsed}");
return _results;
}
}

//==================================================================================
private static void LoadEmbeddedDictionary(string wordText, bool filter = false)
{
char[] delims = new char[1] { '\n' };
string[] chunks;
int chunkCount = 0;
if (filter)
chunks = global::Utility.Properties.Resources.
DictionaryNums.ToUpper().Split(delims);
else
chunks = global::Utility.Properties.Resources.
DictionaryNums.ToUpper().Split(delims);

System.Diagnostics.Debug.WriteLine(\$"> Length filter: {wordText.Length}");
_sortedDictionary.Clear();
foreach (string str in chunks)
{
chunkCount++;
if (filter)
{
//we're assuming the word will have at least 3 characters...
//I mean would you really need this program if it was only two?
if ((str.Length == wordText.Length) &&
str.Contains(wordText.Substring(0, 1)) &&
str.Contains(wordText.Substring(1, 1)) &&
str.Contains(wordText.Substring(2, 1))) //just checking the 1st,
//2nd & 3rd letter will trim our search considerably
{
try
{
}
catch
{
//probably a key collision, just ignore
}
}
}
else
{
try
{
}
catch
{
//probably a key collision, just ignore
}
}
}
System.Diagnostics.Debug.WriteLine(\$">
Loaded {_sortedDictionary.Count} possible matches out of {chunkCount.ToString()}");
_totalEntries = chunkCount;
}

//=================================================================================
private static string SortWord(string str)
{
return String.Concat(str.OrderBy(c => c));

/*** Character Array Method ***
return String.Concat(str.OrderBy(c => c).ToArray());
*******************************/

char[] chars = input.ToArray();
Array.Sort(chars);
return new string(chars);
***************************/
}

#region [Helper Methods]
//=================================================================================
public TimeSpan GetMatchTime()
{
return _stopwatch.Elapsed;
}

//=================================================================================
public List<string> GetMatchResults()
{
return _results;
}

//=================================================================================
public int GetMatchCount()
{
return _results.Count;
}

//=================================================================================
public int GetFilterCount()
{
return _sortedDictionary.Count;
}

//=================================================================================
public int GetDictionaryCount()
{
return _totalEntries;
}
#endregion
}
```

### Testing/Implementation

To drive the code, you would do this...

C#
```string scrambled = "mctmouicnaino";
Unscramble obj1 = new Unscramble();
List<string> results = obj1.UnscrambleWord(scrambled);
if (results.Count > 0)
{
Console.WriteLine(\$"> Total matches: {obj1.GetMatchCount()}");
foreach (string str in results)
{
Console.WriteLine(\$">> {str}");
}
Console.WriteLine(\$"> Total time: {obj1.GetMatchTime()}");
Console.WriteLine(\$"> Filtered set: {obj1.GetFilterCount()}
out of {obj1.GetDictionaryCount()}");
}
else
{
Console.WriteLine("> No matches available:
Check your spelling, or the dictionary may be missing this word.");
}
```

In the class, we could add some more LINQ methods to change the order, take the top result, etc., but this should be a good remedy for any unscramble engine base.

## History

• 11th April, 2021: Initial version

Written By
United States
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.

 First PrevNext
 Where is Utility.Properties.Resources.DictionaryNums ? largenqcd15-Apr-21 5:33 largenqcd 15-Apr-21 5:33
 Re: Where is Utility.Properties.Resources.DictionaryNums ? GuildOfCalamity15-Apr-21 19:44 GuildOfCalamity 15-Apr-21 19:44
 Re: Where is Utility.Properties.Resources.DictionaryNums ? largenqcd16-Apr-21 6:11 largenqcd 16-Apr-21 6:11
 Re: Where is Utility.Properties.Resources.DictionaryNums ? GuildOfCalamity16-Apr-21 6:41 GuildOfCalamity 16-Apr-21 6:41
 Kudos to the author for the high-level heuristic Martin Fierro14-Apr-21 5:59 Martin Fierro 14-Apr-21 5:59
 Re: Kudos to the author for the high-level heuristic GuildOfCalamity15-Apr-21 15:13 GuildOfCalamity 15-Apr-21 15:13
 Re: Kudos to the author for the high-level heuristic Martin Fierro22-Apr-21 6:45 Martin Fierro 22-Apr-21 6:45
 Please post working code.....working code included. Member 1088480414-Apr-21 3:47 Member 10884804 14-Apr-21 3:47
 Re: Please post working code.....working code included. GuildOfCalamity15-Apr-21 15:11 GuildOfCalamity 15-Apr-21 15:11
 Nice Approach - Another Variant Bob McGowan13-Apr-21 17:39 Bob McGowan 13-Apr-21 17:39
 Re: Nice Approach - Another Variant Member 1341177115-Apr-21 0:01 Member 13411771 15-Apr-21 0:01
 Words with wrong or missing letters from OCR Curt Krueger13-Apr-21 19:41 Curt Krueger 13-Apr-21 19:41
 Re: Words with wrong or missing letters from OCR Bob McGowan14-Apr-21 4:21 Bob McGowan 14-Apr-21 4:21
 Re: Words with wrong or missing letters from OCR Member 1341177115-Apr-21 0:02 Member 13411771 15-Apr-21 0:02
 How to Unscramble Any Word Doncp13-Apr-21 11:01 Doncp 13-Apr-21 11:01
 Re: How to Unscramble Any Word largenqcd13-Apr-21 14:01 largenqcd 13-Apr-21 14:01
 Re: How to Unscramble Any Word GuildOfCalamity16-Apr-21 5:34 GuildOfCalamity 16-Apr-21 5:34
 Re: How to Unscramble Any Word largenqcd16-Apr-21 6:32 largenqcd 16-Apr-21 6:32
 Re: How to Unscramble Any Word GuildOfCalamity16-Apr-21 6:42 GuildOfCalamity 16-Apr-21 6:42
 Message Closed 14-Nov-21 19:45 kuki yang 14-Nov-21 19:45
 Re: How to Unscramble Any Word largenqcd16-Apr-21 6:46 largenqcd 16-Apr-21 6:46
 My vote of 5 rspercy6513-Apr-21 8:32 rspercy65 13-Apr-21 8:32
 Re: My vote of 5 GuildOfCalamity15-Apr-21 15:03 GuildOfCalamity 15-Apr-21 15:03
 Anagrams and suggestions in VB HenkAlles18-Apr-21 23:54 HenkAlles 18-Apr-21 23:54
 There are two use cases for the code. One is solving anagrams and the other - as mentions in messages - is to fix misspelled search terms. For the latter, a ranked list of words is more helpful than an unsorted list. Adding Levenstein distance measure to calculate the rank of each anagram returns a list of words with the most likely - least number of changes - first. ```''' ''' Anagram and scrambled term alternatives generator. The anagrams and term alternatives are ''' matched with the supplied dictionary. Typically the Unscamble method is used in search ''' applications where alternative terms might yield better recall for the application. ''' ''' ''' ''' Dim unscrambler As New Anagram("words.txt") ''' ''' Console.WriteLine("Anagrams for 'asphalt'") ''' Dim newResults As List(Of String) = unscrambler.Anagrams("asphalt") ''' For Each str As String In newResults ''' Console.WriteLine(str) ''' Next ''' ''' Console.WriteLine() ''' Console.WriteLine("Alternatives for 'saphalt'") ''' Dim newnewResults = unscrambler.Unscramble("saphalt") ''' For Each item In newnewResults ''' Console.WriteLine(item.Term & vbTab & item.Rank) ''' Next ''' ''' Console.WriteLine() ''' Console.WriteLine("Alternatives for 'asphalt'") ''' Dim newnewnewResults = unscrambler.Unscramble("asphalt") ''' For Each item In newnewnewResults ''' Console.WriteLine(item.Term & vbTab & item.Rank) ''' Next ''' ''' ' Produced output from an English list of words. ''' ' ''' ' Anagrams for 'asphalt' ''' ' asphalt ''' ' spathal ''' ' taplash ''' ''' ' Alternatives for 'saphalt' ''' ' asphalt 0.714285714285714 ''' ' spathal 0.571428571428571 ''' ' taplash 0.428571428571429 ''' ''' ' Alternatives for 'asphalt' ''' ' asphalt 1 ''' ' spathal 0.428571428571429 ''' ' taplash 0.285714285714286 ''' Public Class Anagram Private terms As Dictionary(Of String, List(Of String)) Private termFileName As String Private termEnum As IEnumerable(Of String) ''' ''' Initializes a new instance of the class. ''' ''' ''' The CSV UTF-8 encoded term file where each term is on a separate line. ''' ''' File name cannot be null. ''' File doesn't exist. ''' ''' ''' Dim unscrambler As New Anagram("words.txt") ''' ''' Console.WriteLine("Anagrams for 'asphalt'") ''' Dim newResults As List(Of String) = unscrambler.Anagrams("asphalt") ''' For Each str As String In newResults ''' Console.WriteLine(str) ''' Next ''' ''' Console.WriteLine() ''' Console.WriteLine("Alternatives for 'saphalt'") ''' Dim newnewResults = unscrambler.Unscramble("saphalt") ''' For Each item In newnewResults ''' Console.WriteLine(item.Term & vbTab & item.Rank) ''' Next ''' Public Sub New(FileName As String) If FileName.IsNullOrEmpty() Then Throw New ArgumentNullException("File name cannot be null.") If Not File.Exists(FileName) Then Throw New IOException("File doesn't exist.") Me.termFileName = FileName End Sub ''' ''' Initializes a new instance of the class. ''' ''' List of terms. ''' TermList enumerator cannot be null. Public Sub New(TermList As IEnumerable(Of String)) If TermList Is Nothing Then Throw New ArgumentNullException("TermList enumerator cannot be null.") termEnum = TermList End Sub ''' ''' Generates anagrams of the specified word. Only terms that occur in the supplied ''' dictionary are returned. ''' ''' The word or term to generate anagram(s) for. ''' IEnumerable(Of System.String). ''' ''' Word parameter cannot be null, empty or just whitespace. ''' Public Function Anagrams(ByVal Word As String) As IEnumerable(Of String) If Word.IsNullOrEmptyOrWhiteSpace Then Throw New ArgumentException("Word parameter cannot be null, empty or just whitespace.") If Terms Is Nothing Then LoadTermDictionary() Dim results = Terms(SortByLetter(Word.ToUpper())) Return results End Function ''' ''' Unscrambles the specified word. Common typos in queries are based on the transposition ''' of 2 or more letter. For eample 'saphalt' might be entered instead of 'asphalt'. The ''' unscramble method returns a ranked list of alternatives based on the supplied ''' dictionary. The ranking is on a scale of [1.0 .. 0.0] where higher scores denote a ''' better (closer) alternative for the supplied term. ''' ''' The word. ''' IEnumerable(Of System.ValueTuple(Of System.String, System.Double)). ''' ''' Word parameter cannot be null, empty or just whitespace. ''' Public Function Unscramble(Word As String) As IEnumerable(Of (Term As String, Rank As Double)) If Word.IsNullOrEmptyOrWhiteSpace Then Throw New ArgumentException("Word parameter cannot be null, empty or just whitespace.") If Terms Is Nothing Then LoadTermDictionary() Word = Word.ToUpper() Dim results As New List(Of (Term As String, Rank As Double)) For Each term In Terms(SortByLetter(Word)) results.Add((term, 1 - (LevenshteinDistance(term.ToUpper(), Word) / term.Length()))) Next Return results.OrderByDescending(Function(x) x.Rank) End Function Private Sub LoadTermDictionary() terms = New Dictionary(Of String, List(Of String))() If termFileName IsNot Nothing Then For Each term As String In File.ReadAllText(termFileName).Split(vbLf) AddTerm(term) Next Else If termEnum IsNot Nothing Then For Each term As String In termEnum AddTerm(term) Next Else Throw New Exception("Cannot load dictionary terms.") End If End If End Sub Private Sub AddTerm(Term As String) Dim key As String = SortByLetter(Term.ToUpper()) If terms.ContainsKey(key) Then terms(key).Add(Term) Else Dim tl = New List(Of String)() From {Term} terms(key) = tl End If End Sub Private Shared Function SortByLetter(ByVal Term As String) As String Return String.Concat(Term.OrderBy(Function(c) c)) End Function ''' ''' Calculates the Levenshtein edit distance between strings. ''' ''' First string ''' Second string ''' Number of edits necessary to get from Text1 to Text2 Private Shared Function LevenshteinDistance(ByVal Text1 As String, ByVal Text2 As String) As Integer Const INSERT_PENALTY As Integer = 1 Const DELETE_PENALTY As Integer = 1 Const SUBSTITUTE_PENALTY As Integer = 1 Dim diffs As Integer(,) = New Integer(Text1.Length + 1, Text2.Length + 1) {} Dim cost As Integer For i As Integer = 0 To Text1.Length diffs(i, 0) = i Next For i As Integer = 0 To Text2.Length diffs(0, i) = i Next For i As Integer = 1 To Text1.Length For j As Integer = 1 To Text2.Length If (Text1.Chars(i - 1) = Text2.Chars(j - 1)) Then cost = 0 Else cost = SUBSTITUTE_PENALTY End If diffs(i, j) = Min(Min(diffs(i - 1, j) + INSERT_PENALTY, diffs(i, j - 1) + DELETE_PENALTY), diffs(i - 1, j - 1) + cost) Next Next Return diffs(Text1.Length, Text2.Length) End Function End Class ``` ​I'd rather live in a place where nothing makes sense but things work rather than a place where things make sense but don't work.
 An alternate algorithm Member 1122079213-Apr-21 8:31 Member 11220792 13-Apr-21 8:31
 Last Visit: 31-Dec-99 18:00     Last Update: 23-Mar-23 16:35 Refresh 12 Next ᐅ