Dev Shed Lounge
  #1  
May 13th, 2001, 07:57 PM
Warped
Dealing with Very Large Text Files...

In the course of my work I commonly parse data output stored in ASCII text files. This output is often hundreds of megabytes, if not larger. At times I want to inspect the data, but I often find that even the most powerful systems are no match for a 2 GB text file. I know things would be much easier if I just had this all in a database, but we're just starting that integration now.

Here's my question: what is the best program to view and tinker with these VERY large text files? Usually I like to stay in Win2k, but Linux is also a viable option. Simplicity is good, though functionality and ease of use are also considered. WordPad has its limits; anything above 700 MB or so seems to lock up my machine pretty well. I don't even need to see the entire thing, maybe just a preview of, say, the first X lines. Something like a Notepad that could open a 2 GB file would be great.

Would buying more RAM or adding an additional processor (already running at 700 MHz) aid me in this pursuit?

On a side note, what would be an optimal system configuration to parse such files? Is CPU or RAM more important when doing this? My system usually does fine when parsing these files, even the very big ones (the parsing scripts are a mixed bag: some PHP, others Perl, still others VB; whatever seems easiest or whoever has time to program them). It does take a while, and sometimes the data comes out in an incorrect format, so it's nice to preview the raw data and make adjustments before I parse it incorrectly by mistake.

Thanks,

Zach Sniezko

  #2  
May 16th, 2001, 08:15 AM
pieux
RAM is almost always beneficial. At the current prices, you should be able to get 1 GB of RAM for under $300 (two 512 MB PC133 SDRAM DIMMs).

However, take a look at UltraEdit. I've never had to edit 500 MB files, but I have had to open 2-5 MB files before and UltraEdit didn't even flinch. You can download an evaluation copy off their site.

It's also a great editor for coding -- it has many features I can no longer live without.
__________________
Michael

  #3  
May 16th, 2001, 08:23 AM
Shmengy
Watch out for Linux file size limits

Hi. I used to do computational chemistry (in a past life), which would generate very large "scratch" files. These files would range in size from 2 to 30 GB.

Linux (at least Red Hat) does not like files larger than 2 GB. Fortunately, we were able to split the file into multiple 2 GB chunks.
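
For anyone wanting to try the same workaround, here is a minimal sketch of splitting a big file into fixed-size parts. It is written in Python purely for illustration (it is not the tool used for the scratch files above), and the function name `split_file` and the `.000`, `.001` part suffixes are made up for this example. It cuts at byte boundaries, so a line may end up split across two parts.

```python
def split_file(src_path, part_size, buf_size=1 << 20):
    """Split src_path into src_path.000, src_path.001, ...

    Each part holds at most part_size bytes. Data is copied through a
    small buffer, so memory use stays near buf_size regardless of file size.
    Returns the list of part paths that were written.
    """
    part_paths = []
    with open(src_path, "rb") as src:
        while True:
            # Read the first buffer of the next part; empty means end of file.
            buf = src.read(min(buf_size, part_size))
            if not buf:
                break
            part_path = "%s.%03d" % (src_path, len(part_paths))
            part_paths.append(part_path)
            written = 0
            with open(part_path, "wb") as dst:
                while buf:
                    dst.write(buf)
                    written += len(buf)
                    if written >= part_size:
                        break
                    # Never read past the end of the current part.
                    buf = src.read(min(buf_size, part_size - written))
    return part_paths
```

Something like `split_file("scratch.dat", 2 * 1000**3)` would then produce parts of at most 2 GB each (the file name is hypothetical). The script itself still has to run on a system that can open the original large file.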

My point is, Linux may not be as viable an alternative as you would like.

Cheers,
Tim

  #4  
May 16th, 2001, 05:49 PM
tim snl
Linux would be fine ... 2GB is a long way from 30GB

Linux has a command called head. The head command will take a file and give you just the first bit you ask for.
"head -c 2000000 thefile > afile" will create afile with the first 2 MB of thefile. You can then open afile in an editor easily.
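
If head isn't available (on Win2k, say), the same idea can be sketched in a few lines of script. The version below is in Python as an illustration only; `head_bytes` and `head_lines` are names invented for this example, not standard tools. Both stop reading as soon as they have what was asked for, so the size of the source file should not matter much.

```python
from itertools import islice


def head_bytes(src_path, dst_path, num_bytes, chunk_size=1 << 20):
    """Copy the first num_bytes of src_path into dst_path, a chunk at a time."""
    remaining = num_bytes
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while remaining > 0:
            chunk = src.read(min(chunk_size, remaining))
            if not chunk:  # source is shorter than num_bytes
                break
            dst.write(chunk)
            remaining -= len(chunk)


def head_lines(src_path, num_lines):
    """Return the first num_lines lines; the rest of the file is never read."""
    with open(src_path, "r", errors="replace") as src:
        return list(islice(src, num_lines))
```

For example, `head_bytes("thefile", "afile", 2000000)` mirrors the head command above, and `head_lines("thefile", 50)` gives a quick look at the first 50 lines without creating a second file.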
__________________
Beware of a programmer with a screwdriver!

Last edited by tim snl : May 20th, 2001 at 05:59 PM.

  #5  
May 17th, 2001, 10:50 PM
Warped
Thanks for the help!

  #6  
May 18th, 2001, 01:27 PM
Helios
If you want to stay on Win2k like you said you may want to check out thegun.exe. I've been using it for quite a while and find it to be quite practical. The author claims that it has no max. file size limitations, the only limitation being your system's memory. I haven't been able to test this claim but I have loaded pretty large files and it is very fast.

It is completely free and the download is only 6k. You can get it here:

http://www.pbq.com.au/home/hutch/thegun.htm

  #7  
May 18th, 2001, 02:58 PM
pieux
I looked into UltraEdit further and found the following (excerpt from their site):

Disk based text editing - up to 2GB file size, minimum RAM used even for multi-megabyte files.
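
To illustrate what "disk based" buys you in general (this is not a description of how UltraEdit works internally, which I don't know), here is a rough Python sketch that pulls a small window out of the middle of a file by seeking to a byte offset. The function name `read_window` is made up for this example.

```python
def read_window(path, offset, length):
    """Return up to `length` bytes starting at byte `offset`.

    Only the requested window is read from disk, so memory use depends
    on `length`, not on the size of the file.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)
```

Calling `read_window("thefile", 500000000, 4096)` would show 4 KB from roughly the 500 MB mark. The window will usually start and end in the middle of a line, so expect a partial first and last line.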

© 2003-2009 by Developer Shed. All rights reserved. DS Cluster 4 hosted by Hostway