Introduction
In this article, I will explain an algorithm which is able to generate permutations and combinations without using the classical mathematical concepts. Programmatically, I introduced a multi-sized buffer to make the algorithm capable of working efficiently with any generated sequence size. All of this is encapsulated in a C++ class that I named as CStrCombin
.
CStrCombin
class is handy in many fields like mathematical calculations, password generator softwares, word generators ...I tested this class in a demonstrative program and I obtained good results. I also compared the time execution of this algorithm with some softwares that do the same thing and I found that using CStrCombin
is much faster.
License
The algorithm and CStrCombin
class discussed in this article are entirely of my creation. However they're free for non commercial purposes. You can use the code as you want, as long as you don't claim it as your own. If you use it in your product, I hope to be notified.
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.
Background
Permutations (arrangements) Without Repetitions
A permutation is an arrangement of objects in different orders. The order of arrangements is important. The notation for a permutation is nPr
. n
is the total number of objects (characters) and r
is the number of objects to be chosen. r
is also called the number of places that can hold the generated sequence of chosen objects. We have always n>=r
.
Let's take an example:
The set of characters is {A,B,C,D}
(n=4
). We want to obtain all arrangements r=3
among n=4
. So, the result is:
BCD, ACD, CBD, ABD, CAD, BAD, BDC, ADC, DBC, ABC, DAC, BAC,
CDB, ADB, DCB, ACB, DAB, CAB, CDA, BDA, DCA, BCA, DBA, CBA.
4P3= 24
Permutations with Repetitions
It's calculated as n<sup>r</sup>
. n
can be less than r
.
Let's take an example:
The set of characters is {A,B,C,D}
(n=4
)
We want to obtain all arrangements with repetitions r=2
among n=4
. So, the result is:
DD, CD, BD, AD, DC, CC, BC, AC,
DB, CB, BB, AB, DA, CA, BA, AA.
42 = 16
Combinations
A combination is an unordered collection of unique elements. Given S
, the set of all possible unique elements, a combination is a subset of the elements of S
. The order of the elements in a combination is not important (two lists with the same elements in different orders are considered to be the same combination). Also, the elements cannot be repeated in a combination (every element appears uniquely once); this is often referred to as "without replacement/repetition".
Let's take an example:
The set of characters is {A,B,C,D}
(n=4
)
We want to obtain all combinations r=3
among n=4
. So, the result is:
BCD, ACD, ABD, ABC.
4C3= 4
Algorithm Basics
Our starting point is a set of characters given by the user. To add to that, the user also gives the number of places that will hold combinations/arrangements as desired. The algorithm is capable of generating arrangements with/without repetition and combinations. The first step of proceeding is the same for the 3 possible goals. As a first step, we assign to each character in the collection a number that represents an index into the collection. This is shown in the example below :
<small> collection | A 0 C 0 E F G 7 " J K L 9 N O , T R 7 ; U V # X Y Z ... @
---------- + ------------------------------------------------------------------------------------
index | 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 10 11 12 13 14 15 16 17 18 19 ... FF</small>
As you have noticed, the index value is an unsigned integer value that occupies 1 byte. This algorithm supports a maximum of 256 characters. From this point, the index values will always be manipulated in the hexadecimal form.
To simplify the explanation, let's have a collection of 4 letters and a number of 3 places. So we have our assignment like this :
collection | A B C D +---+---+---+
---------- + --------------- The generation sequence have | ? | ? | ? |
index | 00 01 02 03 to be composed of 3 places +---+---+---+
| LSB RSB
LSB: Left Side Byte
RSB: Right Side Byte
The second step consists of initializing the generation mechanism. Here resides the main idea of the algorithm. If we want to generate arrangements/combinations we have to calculate all the numbers between LSB LSB LSB
and RSB RSB RSB
. This means 00 00 00
and 03 03 03
. After doing this, we have to keep in mind that index values are simply a replacement of the real given characters. So when we have the number 00 00 00
, it automatically refers to the string AAA
and 03 03 03
refers to DDD
. It's evident to make the deduction that calculated numbers that come between 00 00 00
and 03 03 03
and have one or more elements that don't correspond to one of the indexes chosen earlier will not interest us in arrangements/combinations search.
Arrangements/combinations are selected by the following operations:
- Arrangements with repetitions are all collected sequences
- Arrangements without repetitions are all collected sequences that do not have any repeated element
- Combinations are all arrangements without repetitions in which the index of the elements of a sequence are sorted
In the following sections, we will view how to create the CStrCombin
class that applies the exposed algorithm.
CBuffer Definition
The CBuffer
class is defined as given below:
class CBuffer
{
public:
int m_size; unsigned char* m_buf; CBuffer(int,int); CBuffer(int); ~CBuffer(void);
void operator++ ();
bool operator< (CBuffer&);
bool operator== (CBuffer&);
bool operator<= (CBuffer&);
};
CBuffer Initialization with a Max Buffer Value
Constructor version 1 allows us to create a buffer which is initialized with a precise value. This value is based on an integer which is the max index in a given character collection.
This constructor is defined below:
CBuffer::CBuffer(int hex,int places) {
m_size=places;
m_buf=new unsigned char[m_size];
for(int i=0;i<m_size;i++)
m_buf[i]=hex;
}
Note that the buffer data is pointed by m_buf
.
CBuffer Initialization with a Null Buffer Value
To create the buffer containing 0 we can use constructor 2:
CBuffer::CBuffer(int places){
m_size=places;
m_buf=new unsigned char[m_size];
for(int i=0;i<m_size;i++)
m_buf[i]=0;
}
Counting Numbers from 0 to the Defined Max Value
This is done by the ++
operator:
void CBuffer::operator++ ()
{
for(int i=0;i<m_size;i++)
{
if(m_buf[i]!=0xff)
{
m_buf[i]++;
return;
}
else
m_buf[i]=0;
}
throw "Buffer overflow !"; }
Comparison Functions
Comparisons between CBuffer
objects may be needed especially with <=
operation. So let's write the corresponding operator. For doing that, we need to write <
operator and ==
operator. These will be called by <=
operator. The code is as follows:
bool CBuffer::operator< (CBuffer& obj)
{ for(int i=m_size-1;i>=0;i--)
{
if(obj.m_buf[i]>m_buf[i])
return true;
else
if(obj.m_buf[i]<m_buf[i])
return false;
}
return false;
}
bool CBuffer::operator== (CBuffer& obj)
{ for(int i=m_size-1;i>=0;i--)
if(obj.m_buf[i]!=m_buf[i])
return false;
return true;
}
bool CBuffer::operator<= (CBuffer& obj)
{
if(*this<obj || *this==obj)
return true;
else
return false;
}
Cleaning Memory
Before having destroyed the buffer, we can delete pointed data like this:
CBuffer::~CBuffer(void)
{
delete []m_buf;
}
Now we are able to create a buffer with any size. This work will be used in the CStrCombin
class.
CStrCombin Definition
The CStrCombin
class is defined as below:
class CStrCombin {
public:
struct stack_cell
{
unsigned char* buf;
struct stack_cell* prev;
};
private:
#define m_StrSize 256
int m_mode;
char m_str[m_StrSize+1];
int m_places;
stack_cell* m_pHead;
unsigned long long m_combNbr; unsigned long long m_combTime;
public:
enum RepetitionMode
{
combWithoutRep=1, combWithRep, combComb };
CStrCombin();
~CStrCombin();
void SetGenMode(int);
void SetGenStr(char*);
void SetGenPlaces(int);
void GenAndSaveCombToFile(char*);
void push(unsigned char*);
void pop(HANDLE);
unsigned long long GetCombFoundNbr();
unsigned long long GetCombTime();
};
In the following sections, we'll see how to use different class members for generating combinations/arrangements.
CStrCombin Initialization
Through the class constructor, we set different data members to default values :
CStrCombin::CStrCombin():m_mode(0),m_places(0),m_pHead(NULL),
m_combNbr(0),m_combTime(0)
{ }
SetGenMode
It allows us to choose the calculating mode that the algorithm will use. We have 3 choices:
combWithRep
: If chosen, the algorithm will calculate arrangements with repetitions combWithoutRep
: If chosen, the algorithm will calculate arrangements without repetitions combComb
: If chosen, the algorithm will calculate combinations
The function code is as follows:
void CStrCombin::SetGenMode(int mode)
{
if( mode == combWithRep || mode == combWithoutRep || mode == combComb)
m_mode=mode;
else
throw "Illegal mode value !";
}
This must be called before other functions members.
SetGenStr
It saves the sequence from which the generated sequences will be built.
The function code is as follows:
void CStrCombin::SetGenStr(char* str)
{
if(!str)
throw "Illegal string address !";
if(!strlen(str))
throw "Illegal string value !";
memset(m_str,0,m_StrSize+1);
strcpy(m_str,str);
}
This must be called after SetGenMode
.
SetGenPlaces
This function sets the number of places. The code is as follows:
void CStrCombin::SetGenPlaces(int places)
{
if( m_mode != combWithRep && m_mode != combWithoutRep &&
m_mode != combComb)
throw "Illegal mode value !";
if ( places<=0 || ( (m_mode==combWithoutRep || m_mode == combComb)
&& strlen(m_str)<places) )
throw "Illegal places value !";
m_places=places;
}
GenAndSaveCombToFile
It calculates and generates combinations or arrangements. Then it saves them to a specified file. This function uses the CBuffer
class.
The code is as follows:
void CStrCombin::GenAndSaveCombToFile(char* path)
{
m_combTime=GetTickCount();
HANDLE hFile=CreateFile(path,GENERIC_WRITE,0,NULL,
CREATE_ALWAYS,FILE_ATTRIBUTE_NORMAL,NULL);
if(hFile==INVALID_HANDLE_VALUE)
throw "Unable to create file";
CBuffer maxRge(strlen(m_str)-1,m_places);
CBuffer counter(m_places);
for(counter;counter<=maxRge;counter++)
{ bool save=true;
for(int i=0;i<counter.m_size;i++)
if(counter.m_buf[i]>=strlen(m_str))
{
save=false;
break;
}
if(m_mode == combWithoutRep || m_mode == combComb)
{
for(int i=0;i<counter.m_size-1;i++)
for(int j=i+1;j<counter.m_size;j++)
if(counter.m_buf[i]==counter.m_buf[j])
{
save=false;
goto _skip;
}
if(m_mode == combComb)
for(int i=0;i<counter.m_size-1;i++)
for(int j=i+1;j<counter.m_size;j++)
if(counter.m_buf[j]<counter.m_buf[i])
{
save=false;
goto _skip;
}
}
_skip:
if(!save) continue;
push(counter.m_buf);
m_combNbr++;
}
while(m_pHead)
pop(hFile);
CloseHandle(hFile);
m_combTime=GetTickCount()-m_combTime;
}
push
void CStrCombin::push(unsigned char* str)
{
stack_cell* pTmp=m_pHead;
m_pHead=new stack_cell;
m_pHead->prev=pTmp;
m_pHead->buf=new unsigned char[m_places];
memcpy(m_pHead->buf,str,m_places);
}
pop
void CStrCombin::pop(HANDLE hFile)
{
if(m_pHead)
{
stack_cell* pTmp=m_pHead->prev;
char* str=new char[m_places+3];
memset(str,0,m_places+3);
for(int i=0;i<m_places;i++)
str[i]=m_str[m_pHead->buf[i]];
strcat(str,"\r\n");
delete []m_pHead->buf;
delete m_pHead;
m_pHead=pTmp;
DWORD dw;
SetFilePointer(hFile,0,0,FILE_END);
WriteFile(hFile,str,strlen(str),&dw,NULL);
}
}
GetCombFoundNbr
^__strong style=<span class="code-string">FONT-WEIGHT: 400">//found combinations number
unsigned long long CStrCombin::GetCombFoundNbr()
{
return m_combNbr;
}
GetCombTime
^__strong style=<span class="code-string">FONT-WEIGHT: 400">// elapsed calculation time
unsigned ^__strong style=<span class="code-string">FONT-WEIGHT: 400">long long CStrCombin::GetCombTime()
{
return m_combTime;
}
Now the CStrCombin
class is ready for action. In the next section, we'll see a proof of concept of the explained technique: a concrete example that uses CStrCombin
.
Example
Now we'll take an example for testing the CStrCombin
class. The code finds all arrangements with repetitions for ABCD
sequence and stores the results in a file.
#include "stdafx.h"
#include <iostream>
#include "StrCombin.h"
#include <windows.h>
using namespace std;
int _tmain(int argc, _TCHAR* argv[])
{
try
{
CStrCombin obj;
obj.SetGenMode(CStrCombin::combWithRep);
obj.SetGenStr("ABCD");
obj.SetGenPlaces(4);
obj.GenAndSaveCombToFile("c:\\GenCombs.txt");
HANDLE hF=CreateFile("c:\\CombInfos.txt",GENERIC_WRITE,0,NULL,
CREATE_ALWAYS,FILE_ATTRIBUTE_NORMAL,NULL);
DWORD dw;
char w[100];
sprintf(w,"%d combinations found.\r\n%d ms elapsed.\r\n",
(unsigned int)obj.GetCombFoundNbr(),
unsigned int)obj.GetCombTime());
SetFilePointer(hF,0,0,FILE_END);
WriteFile(hF,w,strlen(w),&dw,NULL);
CloseHandle(hF);
}
catch(char* str)
{
cout<<str<<endl;
}
return 0;
}
History
- 01/09/2007 -
CStrCombin
Version 1.0