The hash calculation is only a very small percentage of the execution time, so changing from MD5 to another algorithm wouldn't make a significant difference.
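If you want to check this on your own machine, something like the following rough sketch would time the hash step in isolation. It is not code from the component: the class name HashTiming, the 2000-byte row size and the iteration count are all assumptions made up for the illustration, and the numbers it prints will vary by machine.

```csharp
using System;
using System.Diagnostics;
using System.Security.Cryptography;

// Hypothetical timing harness, not part of the component.
public static class HashTiming
{
    // Hashes the same buffer repeatedly and returns the elapsed time.
    public static TimeSpan TimeHash(HashAlgorithm algorithm, byte[] row, int iterations)
    {
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            algorithm.ComputeHash(row);
        }
        watch.Stop();
        return watch.Elapsed;
    }

    public static void Main()
    {
        // Assumed row size: 2000 bytes of random data stands in for one row.
        var row = new byte[2000];
        new Random(42).NextBytes(row);
        const int iterations = 100000;

        using (var md5 = MD5.Create())
        using (var sha1 = SHA1.Create())
        using (var sha256 = SHA256.Create())
        {
            Console.WriteLine("MD5:    " + TimeHash(md5, row, iterations));
            Console.WriteLine("SHA1:   " + TimeHash(sha1, row, iterations));
            Console.WriteLine("SHA256: " + TimeHash(sha256, row, iterations));
        }
    }
}
```

Comparing those timings against the total run time for the same number of rows gives a feel for how much of the execution is really spent hashing.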
I've done some performance testing, using a file of 50 columns (all string) with 512 MB of data, on a dual-CPU virtual machine. The data is pushed back out to another text file.
V1.5 takes 153 seconds to process this with three hashes on all columns and threading turned off, and 150 seconds with threading turned on.
V1.5.1 (not released yet) takes 86 seconds to process this with three hashes on all columns and threading turned off. I haven't tested it with threading turned on, but I don't expect much difference.
The difference between the versions comes down to ArrayList versus List performance in C#, and to extending the byte array only every 1000 bytes instead of on every append.
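To illustrate the chunked growth idea, here is a minimal sketch. It is not the checked-in code: the ChunkedBuffer class and its members are hypothetical names, and only the 1000-byte step comes from the description above. The sketch rounds the new size up to a whole number of chunks, so a single value larger than one chunk still fits.

```csharp
using System;
using System.Text;

// Hypothetical helper, not the component's actual code.
public sealed class ChunkedBuffer
{
    private const int ChunkSize = 1000; // growth step mentioned in the post
    private byte[] _data = new byte[ChunkSize];
    private int _length;

    public int Length => _length;
    public int Capacity => _data.Length;

    // Appends value, growing the backing array in whole chunks when needed.
    public void Append(byte[] value)
    {
        int required = _length + value.Length;
        if (required > _data.Length)
        {
            // Round up to a whole number of chunks so that a value longer
            // than one chunk does not overrun the array.
            int chunks = (required + ChunkSize - 1) / ChunkSize;
            Array.Resize(ref _data, chunks * ChunkSize);
        }
        Buffer.BlockCopy(value, 0, _data, _length, value.Length);
        _length += value.Length;
    }

    // Returns a copy containing only the bytes written so far.
    public byte[] ToArray()
    {
        var result = new byte[_length];
        Buffer.BlockCopy(_data, 0, result, 0, _length);
        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        var buffer = new ChunkedBuffer();
        buffer.Append(Encoding.UTF8.GetBytes("short value"));
        buffer.Append(Encoding.UTF8.GetBytes(new string('x', 2500)));
        Console.WriteLine($"{buffer.Length} bytes used, {buffer.Capacity} allocated");
    }
}
```

Growing in fixed steps means the array is reallocated once per 1000 bytes rather than once per value, which is where the saving comes from. On the ArrayList side, List<byte> stores bytes directly, while ArrayList stores each one as a boxed object, so the generic list generally avoids that per-item overhead.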
I don't recommend using the currently checked-in code (although it does perform faster), as it hasn't been tested yet, and I know there is at least one unwanted "feature" in it at the moment: if you hash a column longer than 1000 characters, it will crash.
I'll be doing some more performance testing, with data that matches your profile above, to see what other improvements can be made.