June 10th, 2013, 11:36 AM
Checking large strings of data for errors
If I have a very long string of data, such as an essay, and an almost identical essay with a few minor changes and/or errors, how would I go about comparing the two and counting how many differences there are?
When comparing short strings (one sentence), I have used a server and a client with an engine (EssayEngine) which extends Remote and imports java.rmi.Remote.
In the client program, I have declared a final string with the original sentence, and a final string with the changed sentence.
I have then compared the two using a registry lookup on the engine and a call to the engine's check method. This only shows the position of one error (the first).
int h = engine.check(string1, string2);
(This is wrapped in a try catch.)
I just don't know how to scale up the program to check for multiple errors and count them. Any pointers would be great, thanks.
June 11th, 2013, 06:19 AM
Can the long Strings be broken up into smaller Strings that can be compared?
If not, then the two Strings could be compared character by character.
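A minimal sketch of that position-by-position compare, in case it helps. The class and method names (PositionalDiff, countMismatches) are made up for illustration and are not part of the EssayEngine code from the first post. It walks both Strings in step, counts every position where the characters differ, and counts any leftover characters in the longer String as differences too.

```java
class PositionalDiff {
    // Hypothetical helper: counts the positions at which the two strings
    // differ. Extra trailing characters in the longer string each count
    // as one difference.
    static int countMismatches(String a, String b) {
        int shared = Math.min(a.length(), b.length());
        int count = 0;
        for (int i = 0; i < shared; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                count++;
            }
        }
        return count + Math.abs(a.length() - b.length());
    }

    public static void main(String[] args) {
        // One substituted character: one difference.
        System.out.println(countMismatches("abcdef", "abXdef")); // prints 1
        // One inserted character: every later position is shifted,
        // so the count is much higher than the single real change.
        System.out.println(countMismatches("abxcdef", "abcdef")); // prints 5
    }
}
```

This works when the changes are substitutions only. As the second call shows, a single inserted or deleted character throws off every position after it.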
A problem is re-synching the compare when a difference is found.
For example with these two Strings:
The first one has an extra x in it. When the compare finds that x in the first string does not match c in the second string, what should it do? If the x is treated as an insert, then the rest of the characters will match.
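One common way to handle the re-synching is to compute an edit distance (Levenshtein distance), which counts the fewest single-character inserts, deletes and substitutions needed to turn one String into the other. The sketch below is one possible approach, not the only one. The class name EditDistance and the sample Strings in main are assumptions made up for illustration, chosen so that the first has an extra x where the second has c.

```java
class EditDistance {
    // Levenshtein distance: the minimum number of single-character
    // inserts, deletes and substitutions that turn a into b.
    // Keeps only two rows of the table, so memory is O(b.length()).
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Turning an empty prefix of a into the first j chars of b takes j inserts.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // i deletes turn the first i chars of a into ""
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), substitute);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // Hypothetical sample strings: the extra x is treated as one insert.
        System.out.println(distance("abxcdef", "abcdef")); // prints 1
    }
}
```

The running time grows with the product of the two lengths, so for essay-sized Strings it may be worth splitting the text into words or sentences first and comparing those units instead of single characters.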
Last edited by NormR; June 11th, 2013 at 06:22 AM.