Comparing arrays deleting duplicates

January 19th, 2004, 03:54 AM
Junior Member
Join Date: Jan 2004
Posts: 3
Comparing arrays deleting duplicates
I have an array with the contents of a .txt file in it. I need to print all the unique words in the array, so if the same word appears twice I only print it once. I just need some help with the code in C. I know that I need a second array, and that I copy strings across one at a time, checking whether each is already in there or not.
cheers, local 

January 20th, 2004, 03:35 AM
Left due to despotic ad-min
Join Date: Jun 2003
Posts: 1,044
It really depends on what you're doing. There are a few possible approaches.
1) As you're reading the text file, check whether the word just read is already in your array. If it is, don't add it.
2) Sort the array (e.g. using qsort). This guarantees that identical entries end up at adjacent locations in the array. After sorting you can then do a check of the form
Code:
if (strcmp(array[i], array[i-1]) == 0) /* i assumed > 0 and valid */
{
/* don't print it. */
}
else
{
/* print it */
}
3) Sort the array while reading it. This probably works better if you store to a linked list rather than an array, as it's more efficient to add entries into the middle of a linked list than into an array.
There are other strategies as well, but the above are probably the easiest to understand and code, even if they're not always efficient.
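Approach 1 above can be sketched roughly as follows. This is only an illustration under assumptions that are not in the thread: the names contains, add_unique and read_unique are made up, words are stored in a fixed 2D char array, and the limits MAX_WORDS and MAX_LEN are arbitrary. The linear search makes it slow for large files, but it is easy to follow.

```c
#include <stdio.h>
#include <string.h>

#define MAX_WORDS 300   /* assumed capacity, not from the original post */
#define MAX_LEN   64    /* assumed maximum word length, including the '\0' */

/* Returns 1 if word is already among the first count entries, else 0. */
int contains(char words[][MAX_LEN], size_t count, const char *word)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(words[i], word) == 0)
            return 1;
    }
    return 0;
}

/* Stores word only if it is new and fits; returns the updated count. */
size_t add_unique(char words[][MAX_LEN], size_t count, const char *word)
{
    if (count >= MAX_WORDS || strlen(word) >= MAX_LEN)
        return count;               /* no room: silently skip */
    if (contains(words, count, word))
        return count;               /* duplicate: don't add it */
    strcpy(words[count], word);
    return count + 1;
}

/* Reads whitespace-separated words from in, keeping each distinct word once.
   The width 63 in the format must stay equal to MAX_LEN - 1; longer words
   are split into pieces by fscanf. */
size_t read_unique(FILE *in, char words[][MAX_LEN])
{
    char buf[MAX_LEN];
    size_t count = 0;

    while (fscanf(in, "%63s", buf) == 1)
        count = add_unique(words, count, buf);
    return count;
}
```

A caller would declare char words[MAX_WORDS][MAX_LEN], pass an opened FILE * to read_unique, and then print the first count entries.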

January 20th, 2004, 06:24 AM
Junior Member
Join Date: Jan 2004
Posts: 3
Basically, all the words are stored in an array and I need to print out all the unique words in it. So if "the" appears 3 times, it should only be printed once.

January 20th, 2004, 07:30 AM
Junior Member
Join Date: Dec 2003
Location: Bangalore
Posts: 3
If you have all the words in an array, as was suggested, you could sort the elements and then write something like:
/* n is the number of elements in the array */
for(int i=0;i<n;) /* no increment here: i jumps to j at the end of the body */
{
int j = i;
while((j<n) && (strcmp(word_arr[i],word_arr[j]) == 0))
j++;
printf("%s",word_arr[i]);
i = j;
}
But as was mentioned, there are better methods available if you are willing to construct a linked list (keeping elements in a lexicographical order, and inserting only if the element is not present), keeping the elements in a binary tree (eliminating duplicates), or hashing.
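The linked-list idea mentioned above (keep the list in lexicographical order and insert only if the word is absent) might look something like this. It is a sketch, not a reference implementation: struct node, insert_sorted and free_list are hypothetical names, and each word is copied into heap memory.

```c
#include <stdlib.h>
#include <string.h>

/* One list entry; the list is kept sorted by strcmp order. */
struct node {
    char *word;
    struct node *next;
};

/* Inserts a copy of word into the sorted list at *head unless it is
   already there. Returns 1 if inserted, 0 if it was a duplicate,
   -1 if memory allocation failed. */
int insert_sorted(struct node **head, const char *word)
{
    struct node **link = head;
    int cmp = 1;

    /* Walk forward while the current entry sorts before word. */
    while (*link != NULL && (cmp = strcmp((*link)->word, word)) < 0)
        link = &(*link)->next;

    if (*link != NULL && cmp == 0)
        return 0;                       /* already present */

    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->word = malloc(strlen(word) + 1);
    if (n->word == NULL) {
        free(n);
        return -1;
    }
    strcpy(n->word, word);

    n->next = *link;                    /* splice in before the larger entry */
    *link = n;
    return 1;
}

/* Releases every node and its word. */
void free_list(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head->word);
        free(head);
        head = next;
    }
}
```

Calling insert_sorted once per word read from the file leaves a list that can be walked from head to print every distinct word exactly once, already in sorted order.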

January 22nd, 2004, 08:21 AM
Junior Member
Join Date: Jan 2004
Posts: 3
int fncompare(const void*elem1, const void*elem2)// because this is declared as int they can not compare char
{
return strcmp((char*)elem1,(char*)elem2);
}
..................
qsort(word, 300, sizeof(word[0]), fncompare);
printf("%s", word[0]);
int uniqueCount;
uniqueCount = 0;
for(i=1; i<300; i++)
{
if (strcmp(word[i], word[i-1]) == 0)
{
printf("%s", word[i]);
}
else
{
break;
}
uniqueCount++;
}
printf("\nThere are %d unique words in the text file\n\n",uniqueCount);
}
-------------------------------------------------------------
Just when I thought I had it: it prints out the words that are the same rather than all the unique words, and when counting unique words it gives something like 278 instead of about 13, which is what it should be.

January 22nd, 2004, 01:13 PM
Junior Member
Join Date: Jan 2004
Location: California/Netherlands
Posts: 1
If you care about performance:
use std::set to do this for you (that is C++, not plain C).
It's much faster and made for this.

January 23rd, 2004, 08:32 AM
Junior Member
Join Date: Dec 2003
Location: Bangalore
Posts: 3
There seems to be an error in the algorithm you are using. Since you wish to eliminate all duplicates (which will occur in chunks once you sort the array), you need code along the lines of what I gave last time, in place of the loop you are using currently.
Note that the loop header in that code is for(int i=0;i<n;) with an empty increment clause; the forum software turned the semicolon and closing parenthesis into a smiley in my original post.
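For comparison, here is one way the sort-then-skip logic could be written so that it prints each word once and counts the distinct ones. Several things here are assumptions, not facts from the thread: that word is a 2D char array (which is what the cast in fncompare suggests), the row width MAX_LEN, the helper name print_unique, and that only the first n slots actually hold words. If the original array had unused slots containing empty strings, sorting all 300 entries would put those first and they would all compare equal, which may be where a count like 278 came from.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LEN 64   /* assumed row width of the word array */

/* qsort hands over pointers to elements; each element is a char[MAX_LEN]
   row, so the pointer can be treated as the string itself. */
int fncompare(const void *elem1, const void *elem2)
{
    return strcmp((const char *)elem1, (const char *)elem2);
}

/* Sorts the first n rows, prints every distinct word once, and
   returns how many distinct words there were. */
size_t print_unique(char word[][MAX_LEN], size_t n)
{
    size_t uniqueCount = 0;

    qsort(word, n, sizeof word[0], fncompare);

    for (size_t i = 0; i < n; i++) {
        /* After sorting, a word is new exactly when it differs
           from the entry just before it. */
        if (i == 0 || strcmp(word[i], word[i - 1]) != 0) {
            printf("%s\n", word[i]);
            uniqueCount++;
        }
    }
    return uniqueCount;
}
```

The two differences from the loop posted earlier are that the word is printed when it differs from its predecessor rather than when it matches, and that the loop never breaks out early.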