Mar 16
Shuffle shuffle
icon1 Krishnamurthy Koduvayur Viswanathan | icon2 programming | icon4 03 16th, 2009| icon3No Comments »

Several weeks ago, I found myself thinking, how I could shuffle a list/array given to me in a random order. This is a typically commonplace thing to do in several applications: online card games, your favourite music player etc.

The interesting thing about this problem is how some naive approaches, even though easy to code and efficient enough, do not achieve the desired randomness.

Lets define the problem: You have an input array which has elements in positions 1 through n. The objective is to produce a random permutation of the array. By a random permutation, we mean to say that each permutation of the array is equally likely. So a truly randomized algorithm will generate each permutation with a probability of 1/n!

The first approach that comes to mind is the naive approach of generating a random number between 1 and n for each element in the array and placing the element at the position indicated by the random number. This could be accomplished by using another auxillary array and placing elements into the new positions generated. The pseudocode could look something like the following:

NaiveShuffle(A[1...n], B[1....n])            //randomly permutes the elements in array A
for i from 1 to n
random <- RandomNumber(1,n)
B[random] <- A[i]
for i from 1 to n
A[i] <- B[i]

The above approach is very crude in the sense it uses an auxillary array and also parses through the array twice instead of just once. Still, the worst case running time of the above algorithm is O(n).

Random(1, n) generates a random number 1 and n (both inclusive). We assume that the random number generator generates truly random numbers in the interval specified. Also, we assume some kind of collision resolution mechanism. We are assuming that this method returns a random number in O(1) time.

All that said, a O(n) algorithm is not bad at all for this purpose. There is only one problem, this algorithm is WRONG! (hah…fat chance, didn’t we name it a naive algorithm?). Why is that? This is because in every iteration, we generate n possible choices. Since there are n such iterations, the total number of permutations generated is n.n.n.n…….n times = n^n

The  total number of permutations possible while shuffling an array is n!. Since n^n is not exactly divisible by n!, there have to be some permutations which appear more frequently than the others (basic pigeonhole principle). Thus this naive algorithm does not generate truly random permutations.

Lets try something else. We generate a random priority between 1 and n^3 the Random(1, n^3)  routine, and assign a it to each element in the array. Then sort the array based on these weights.

e.g. if the original array is A<1,2,3,4> and we generate priorities randomly as P<34,56,8,77>, then when we sort array A based on the increasing order of the priorities assigned using array P, then we get the shuffled array as <3, 1, 2, 4>.

We used the interval [1, n^3] to generate priorities so as to reduce collisions. I will not delve into the correctness of this algorithm. The running time of this shuffling by sorting algorithm depends on which sorting algorithm we use. Typically it is Big-Theta(n lg n).

Another approach to solve this problem, is to shuffle by swapping:
Shuffle[1...n]
for i from 2 to n
temp = Generate(x from 1 to i)
swap(i,temp)

Notice that the interval for generating random numbers keeps decreasing. For the first iteration, there are n choices. For the 2nd iteration, there are (n-1) choices and so on.

Hence the total number of choices generated by this algorithm is: n(n-1)(n-2)…..3.2.1 = n!

which is exactly equal to the total number of possible permutations. Moreover, since we iterate over the array only once, this algorithm runs in O(n) time.

Dec 24
Sending bulk emails using Outlook and C#
icon1 Krishnamurthy Koduvayur Viswanathan | icon2 development, programming | icon4 12 24th, 2008| icon31 Comment »

I have always derived pleasure writing programs that solve real world problems. This is one such problem that I was able to solve. With the holiday season on, you might want to send greetings to your numerous business contacts. If you have several contacts that you want to send personalized messages to, then you very well can imagine how much time and effort that will take.
So this is what I set out to do: create an application that would send out emails to several contacts, with a personalized greeting line, but similar message body. Also, depending on the type of contact, you might want to send a different message. E.g. if it is a close colleague of yours, then you might want to send a more personalized mail rather than a one liner. Since these are personalized emails, these need to be sent from your actual email id rather than an SMTP server on your dev machine. Also, I needed this application to work for someone else who runs only MS Office on his machine. So I decided to use Microsoft Office Outlook 2007 for this task.
The first thing to do was to decide the fomat in which I would store all the configuration information that would be used by the application: So I created two different text files, mail1.txt and mail2.txt each with a separate email message:

Hello {0}
Mail body

Where {0} is a placeholder that will be replaced by the receiver name. mail1.txt and mail2.txt have the same structure except for the mail body depending on the requirement.
Next, I needed to create a list of names and the corresponding email IDs to which the mails are to be sent. Also, I needed a flag that will indicate the type of message that is to be sent, i.e. mail1 or mail2. I created a comma separated file with the following format:

<Receiver’s name>, <email id>, <mail body to be sent, i.e. 1 or 2>

I wrote a console application in C# that uses the Microsoft Office 2007 Primary Interop assemblies to automate sending emails to all these contacts specified. The emails get sent using the default account configured in your Outlook. The code looks something like the following. Please note that this is a quick hack which actually works and that I have not really done a lot of error handling or exception management on this because I know the conditions under which this will be used.

Microsoft.Office.Interop.Outlook.Application app = null;
Microsoft.Office.Interop.Outlook._NameSpace ns = null;
Microsoft.Office.Interop.Outlook.PostItem item = null;
Microsoft.Office.Interop.Outlook.MAPIFolder inboxFolder = null;
Microsoft.Office.Interop.Outlook.MAPIFolder subFolder = null;
Microsoft.Office.Interop.Outlook.MailItem memo = null;
Microsoft.Office.Interop.Outlook.MAPIFolder sentFolder = null;
StreamReader addressReader = null;
StreamReader contentsReader = null;
StreamWriter logWriter = null; 

try
{
addressReader = new StreamReader(ConfigurationManager.AppSettings["addresses"]);
String currentLine = String.Empty;
String[] currentReceiver = null;
String messageBodyFile = String.Empty;
logWriter = new StreamWriter(Path.Combine(Environment.CurrentDirectory, "Log.txt"), false);
while (!addressReader.EndOfStream)
{
currentLine = addressReader.ReadLine();
currentReceiver = currentLine.Split(',');
switch (currentReceiver[2])
{
case "1":
messageBodyFile = ConfigurationManager.AppSettings["contentsFile1"];
break; 

case "2":
messageBodyFile = ConfigurationManager.AppSettings["contentsFile2"];
break; 

default:
Console.WriteLine("Could not send email to ", currentReceiver[0]);
logWriter.WriteLine("Could not send email to ", currentReceiver[0]);
currentReceiver[1] = String.Empty;
break;
} 

#region EmailInit 

app = new Microsoft.Office.Interop.Outlook.Application();
ns = app.GetNamespace("MAPI");
ns.Logon(null, null, false, false);
sentFolder = ns.GetDefaultFolder(OlDefaultFolders.olFolderSentMail);
memo = (Microsoft.Office.Interop.Outlook.MailItem)app.CreateItem(OlItemType.olMailItem); 

#endregion 

contentsReader = new StreamReader(messageBodyFile);
memo.To = currentReceiver[1].Trim();
memo.Subject = ConfigurationManager.AppSettings["mailSubject"].Trim();
memo.Body = String.Format(contentsReader.ReadToEnd(), currentReceiver[0]);
memo.BodyFormat = OlBodyFormat.olFormatHTML;
memo.Send();
Console.WriteLine("{0}: Sent email with body {1} to {2}:{3}", DateTime.Now, currentReceiver[2], currentReceiver[0], currentReceiver[1]);
logWriter.WriteLine("{0}: Sent email with body {1} to {2}:{3}", DateTime.Now, currentReceiver[2], currentReceiver[0], currentReceiver[1]);
contentsReader.Close();
contentsReader.Dispose();
}
} 

catch (System.Exception ex)
{
Console.WriteLine(ex.ToString());
EventLog.WriteEntry("Email Automation", ex.Message, EventLogEntryType.Error);
} 

finally
{
ns = null;
app = null;
inboxFolder = null;
addressReader.Close();
addressReader.Dispose();
logWriter.Close();
logWriter.Dispose();
}
Oct 9
Touchy feely gcc
icon1 Krishnamurthy Koduvayur Viswanathan | icon2 programming | icon4 10 9th, 2008| icon31 Comment »

I am writing code in C after several years. Needless to say, I am woefully out of touch and don’t remember the most basic of things. Add to that, I am writing code using a simple text editor and compiling it using gcc on commandline. Every time I see a funny error, it takes me a while to actually understand what is wrong. A really good IDE with awesome intellisense really does spoil you!

So I got this funny little compilation error which left me stumped:

/tmp/cckI2FzP.o:(.eh_frame+0×11): undefined reference to `__gxx_personality_v0′
collect2: ld returned 1 exit status

I googled and found that this error is normally related to C++, but I was writing code in plain old C. So what was wrong? I found later that I had named my code file as List.C instead of List.c. After renaming it to List.c, all was well.

Turns out that filename extensions in linux are case sensitive (wonder why I did not run into that problem all these years),  and that C is a commonly used extension for C++

Sep 3
Development man pages
icon1 Krishnamurthy Koduvayur Viswanathan | icon2 linux, programming | icon4 09 3rd, 2007| icon3No Comments »

I was doing some elementary IPC programming on linux, and was looking for the man pages of the library calls that I was using in my programs (for functions like “perror”, “execlp” etc.) ‘cos its been a while since I have done anything in C. The way to get the manual page for these calls is:

kv$ man 3 “functionName”

where the number 3 specifies section number 3, which stands for library calls.

Now, it so happens that the development man pages are excluded by default in Ubuntu linux. A quick google search told me that the name of the package I was looking for is “manpages-dev.

Do:
kv$ sudo apt-get install manpages-dev

and you get the documentation for all the methods that you want!