Dev Shed Lounge
 
  #1  
Old May 26th, 2001, 03:12 PM
Warped
Eliminating duplicate lines in TXT files

Does anybody know of a program that eliminates duplicate lines from CSV files (usually 1-20 MB in size)? I've found one, but it is actually an HTML parser and doesn't support files over 2.5 MB.

For example:


Barbara Meyers,Associate,97401
Barbara Yvette Santiago,Member,90210
Barbara J Green,Associate,67541
Barbara Jean Hall,Moderator,43521
Barbara Yvette Santiago,Member,90210
Barbara Yvette Santiago,Member,90210
Barbara Jean Hall,Moderator,43521
Barbara Yvette Santiago,Member,90210


After processing, I'd want each original record kept intact, with only the exact duplicates removed:


Barbara Meyers,Associate,97401
Barbara Yvette Santiago,Member,90210
Barbara J Green,Associate,67541
Barbara Jean Hall,Moderator,43521


Thanks,
Zach Sniezko
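
The request above can be sketched in a few lines of Python. This is only a minimal illustration, not a program anyone in the thread named: the function name `dedupe_lines` is made up for the example, and it assumes that "duplicate" means an exact match of the whole line (ignoring the line ending) and that the first copy of each line should be kept in its original position.

```python
import io


def dedupe_lines(src, dst):
    """Copy lines from src to dst, keeping only the first copy of each line.

    src and dst are text file objects. Returns the number of lines written.
    """
    seen = set()  # every distinct line content encountered so far
    kept = 0
    for line in src:
        key = line.rstrip("\r\n")  # compare content, ignoring the line ending
        if key in seen:
            continue  # exact duplicate of an earlier line: skip it
        seen.add(key)
        dst.write(key + "\n")
        kept += 1
    return kept


if __name__ == "__main__":
    # Small demonstration using three of the sample records from the post.
    sample = io.StringIO(
        "Barbara Meyers,Associate,97401\n"
        "Barbara Yvette Santiago,Member,90210\n"
        "Barbara Yvette Santiago,Member,90210\n"
    )
    out = io.StringIO()
    dedupe_lines(sample, out)
    print(out.getvalue(), end="")
```

For a real file, the same function could be called with two objects returned by `open()`, one for reading and one for writing. The set of seen lines is held in memory, which should be unproblematic for files in the 1-20 MB range mentioned above.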

  #2  
Old May 30th, 2001, 10:56 AM
Flink
Write one...

I don't know if you've ever dabbled in VBA, but it's not at all difficult, and you could probably write that program in 10 minutes. Press Alt+F11 in Microsoft Word and you're in the IDE.
There should be a few tutorials that do exactly what you're after.
Other than that, I don't know of any programs out there.

Give it a go - Visual Basic for Applications (VBA).
It basically automates anything in the Office Suite and then some.

Adam Mellor
www.chamele.com

  #3  
Old May 30th, 2001, 03:30 PM
epl
If you are using Office VBA, then you might as well use Access to load this into a recordset and then write the output of a query (no duplicates) back to the text file...
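
The database approach described above can be illustrated roughly as follows. This sketch does not use Access: it swaps in SQLite through Python's standard `sqlite3` module purely to show the idea of loading the lines into a table and querying them back without duplicates. The table name `records`, the column names, and the function name `dedupe_with_sql` are all made up for the example.

```python
import sqlite3


def dedupe_with_sql(lines):
    """Return the distinct lines, ordered by where each first appeared."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        # The integer primary key records the insertion order of each line.
        conn.execute(
            "CREATE TABLE records (id INTEGER PRIMARY KEY, line TEXT NOT NULL)"
        )
        conn.executemany(
            "INSERT INTO records (line) VALUES (?)",
            ((ln,) for ln in lines),
        )
        # One row per distinct line, sorted by its earliest id.
        rows = conn.execute(
            "SELECT line FROM records GROUP BY line ORDER BY MIN(id)"
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()
```

A plain `SELECT DISTINCT line FROM records` would also remove the duplicates, but it makes no promise about the order of the result, which is why the sketch groups by line and sorts by the earliest id instead.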

  
 




