Thread: Read text file

    #1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2003
    Posts
    61
    Rep Power
    12

    Read text file


    I am working on a class, for use with servlets, that processes HTML templates.

    I need to read an entire HTML file at once, not line by line.

    How do I go about doing that?
  2. #2
  3. No Profile Picture
    Clueless llama
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Feb 2001
    Location
    Lincoln, NE. USA
    Posts
    2,353
    Rep Power
    117
    I am not exactly sure why you need to read it all at once, but you can accomplish this by reading the FileInputStream into a StringBuffer. The StringBuffer's toString() method will then give you the contents all at once.
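
    A minimal sketch of that approach, assuming Java 7 or later for try-with-resources. The class name ReadIntoStringBuffer, the method readAll, and the 4096-character chunk size are illustrative choices, not something from this thread. It wraps the FileInputStream in an InputStreamReader so that bytes are decoded to characters before they are appended.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;

// Hypothetical helper class; the name and the 4096-char chunk size are illustrative.
class ReadIntoStringBuffer {

    // Reads the whole file by appending chunks to a StringBuffer until end of stream.
    static String readAll(File file) throws IOException {
        StringBuffer contents = new StringBuffer();
        char[] chunk = new char[4096];
        // InputStreamReader decodes bytes using the platform default charset here.
        try (Reader reader = new InputStreamReader(new FileInputStream(file))) {
            int n;
            while ((n = reader.read(chunk)) != -1) {
                contents.append(chunk, 0, n);
            }
        }
        return contents.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAll(new File(args[0])));
    }
}
```

    The loop is needed because a single read() call is allowed to return fewer characters than the array can hold.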
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2003
    Posts
    61
    Rep Power
    12
    Oh well....

    I did it like this, and it works OK for me:

    File file = new File(filename);
    RandomAccessFile raf = new RandomAccessFile(file, "r");
    int len = (int)raf.length();
    byte[] bytes = new byte[len];
    raf.readFully(bytes); // readFully() keeps reading until the array is full; read() may return early
    raf.close();
    String data = new String(bytes);


    I have also made a test using FIS instead of RAF, but it seems that RAF is a bit faster.
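
    For comparison, the same whole-file read can be sketched with java.nio.file.Files.readAllBytes, which only exists on Java 7 or later and so postdates this thread. The class name ReadWholeFile and the method readWhole are hypothetical. The charset is passed in explicitly, since new String(bytes) silently uses the platform default.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;

// Hypothetical helper; requires Java 7 or later for java.nio.file.Files.
class ReadWholeFile {

    // Reads every byte of the file in one call and decodes it with the given charset.
    static String readWhole(File file, Charset charset) throws IOException {
        byte[] bytes = Files.readAllBytes(file.toPath());
        return new String(bytes, charset);
    }

    public static void main(String[] args) throws IOException {
        // Charset.defaultCharset() mirrors what new String(bytes) does in the snippet above.
        System.out.println(readWhole(new File(args[0]), Charset.defaultCharset()));
    }
}
```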
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2003
    Posts
    61
    Rep Power
    12
    How exactly should I read a FIS into a StringBuffer?
  8. #5
  9. No Profile Picture
    Clueless llama
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Feb 2001
    Location
    Lincoln, NE. USA
    Posts
    2,353
    Rep Power
    117
    RandomAccessFile appears to have been written because that is similar to how you would read a file in C. I have never used it.

    If you want to read a file using the classes that Sun suggests, you can still do it similar to the way you were doing it with the RandomAccessFile. Something like this:
    Code:
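    import java.io.*;
    
    public class TestFileReader {
    
    	public static void main(String[] args) {
    		File file = new File(args[0]);
    		// file.length() is a byte count; it matches the character count only for single-byte encodings
    		char[] chars = new char[(int) file.length()];
    		try {
    			BufferedReader reader = new BufferedReader(new FileReader(file));
    			reader.read(chars);
    			reader.close();
    		} catch (FileNotFoundException e) {
    			e.printStackTrace(); // report the failure instead of swallowing it
    		} catch (IOException e) {
    			e.printStackTrace();
    		}
    		String data = new String(chars);
    		//System.out.println(data);
    	}
    }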
    Hope this helps.
  10. #6
  11. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2003
    Posts
    61
    Rep Power
    12
    I have made a test where I do it your way using a 78 kB file in a loop (100 times) and it seems slower than using RAF.

    Result with FileReader: 1141 ms
    Result with RandomAccessFile: 954 ms
  12. #7
  13. No Profile Picture
    Clueless llama
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Feb 2001
    Location
    Lincoln, NE. USA
    Posts
    2,353
    Rep Power
    117
    Then by all means use the RandomAccessFile. I was merely pointing out how to do it the "Sun" way using Streams. Using the RandomAccessFile class is not wrong. When you originally asked you did not mention any particular class you were using. There is often more than one "right" way to do something.

    However, I must say that when I ran my own tests I got a different answer. I made two separate classes so there would be no chance the O/S or JVM was caching values between runs. Those classes were:
    Code:
    import java.io.*;
    
    public class TestFileAccessRAF {
    
    	public static void main(String[] args) {
    		File file = new File("./licence.txt");
    		byte[] chars = new byte[(int) file.length()];
    		long start = System.currentTimeMillis();
    		try {
    			RandomAccessFile reader = new RandomAccessFile(file, "r");
    			for(int i = 0; i < 10000; i++) {
    				reader.read(chars);
    			}
    			reader.close();
    		} catch (FileNotFoundException e) {
    		} catch (IOException e) {
    		}
    		long end = System.currentTimeMillis();
    		String data = new String(chars);
    		//System.out.println(data);
    		System.out.println("It took " + (end - start) + " milliseconds to run");
    	}
    }
    
    ---------------------------------------
    import java.io.*;
    
    public class TestFileAccessStream {
    
    	public static void main(String[] args) {
    		File file = new File("./licence.txt");
    		char[] chars = new char[(int) file.length()];
    		long start = System.currentTimeMillis();
    		try {
    			BufferedReader reader = new BufferedReader(new FileReader(file));
    			for(int i = 0; i < 10000; i++) {
    				reader.read(chars);
    			}
    			reader.close();
    		} catch (FileNotFoundException e) {
    		} catch (IOException e) {
    		}
    		long end = System.currentTimeMillis();
    		String data = new String(chars);
    		//System.out.println(data);
    		System.out.println("It took " + (end - start) + " milliseconds to run");
    	}
    
    }
    The file was a 95k file. You'll notice I ran them 10,000 times in each loop. The results I got back in three runs were:

    TestFileAccessStream:
    It took 78 milliseconds to run
    It took 62 milliseconds to run
    It took 63 milliseconds to run


    TestFileAccessRAF:
    It took 2031 milliseconds to run
    It took 2063 milliseconds to run
    It took 2031 milliseconds to run


    I ran these tests out of Eclipse IDE. Odd how our results were so different, I am curious why. Can you post the code you used for testing?
    Last edited by Nemi; March 14th, 2003 at 12:10 PM.
  14. #8
  15. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2003
    Posts
    61
    Rep Power
    12
    Actually, I was timing everything, including opening, reading, and closing the file, and creating the string.

    Here is my code:
    Code:
    import java.io.*;
    
    public class filetest {
        public static void main(String[] args) {
    
            String filename = "C:/somefile.txt";
    
            // Test 1: BufferedReader wrapped around a FileReader
            try {
                long start = System.currentTimeMillis();
                for (int i = 0; i < 100; i++) {
                    File file = new File(filename);
                    char[] chars = new char[(int) file.length()];
                    BufferedReader reader = new BufferedReader(new FileReader(file));
                    reader.read(chars);
                    reader.close();
                    String data = new String(chars);
                }
                long end = System.currentTimeMillis();
                System.out.println(end - start);
            } catch (Exception e) {
                e.printStackTrace();
            }
    
            // Test 2: FileInputStream
            try {
                long start = System.currentTimeMillis();
                for (int i = 0; i < 100; i++) {
                    File f = new File(filename);
                    byte[] buffer = new byte[(int) f.length()];
                    FileInputStream fis = new FileInputStream(f);
                    fis.read(buffer);
                    fis.close();
                    String str = new String(buffer);
                }
                long end = System.currentTimeMillis();
                System.out.println(end - start);
            } catch (Exception e) {
                e.printStackTrace();
            }
    
            // Test 3: RandomAccessFile
            try {
                long start = System.currentTimeMillis();
                for (int i = 0; i < 100; i++) {
                    RandomAccessFile raf = new RandomAccessFile(filename, "r");
                    byte[] buffer = new byte[(int) raf.length()];
                    raf.readFully(buffer);
                    raf.close();
                    String str = new String(buffer);
                }
                long end = System.currentTimeMillis();
                System.out.println(end - start);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
  16. #9
  17. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2003
    Posts
    61
    Rep Power
    12
    As you can see, I also tried using FileInputStream just for fun.
  18. #10
  19. No Profile Picture
    Clueless llama
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Feb 2001
    Location
    Lincoln, NE. USA
    Posts
    2,353
    Rep Power
    117
    OK, the problem with your testing is that you create a lot of temporary objects. By the time it gets to the second or third loop the garbage collector may have kicked in, affecting your times. This is why I put each test in a separate file/class and ran them independently.
  20. #11
  21. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2003
    Posts
    61
    Rep Power
    12
    I ran my test again, this time using three different files/classes.
    RandomAccessFile still seems to be the fastest, but the performance of FileInputStream is almost as good.
    FileReader, however, is also slower in this test.

    I think that your test is incomplete, because the contents of a BufferedReader may be "buffered", so you don't actually read from the file 10,000 times, but only the first time.
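
    Buffering aside, one thing that can be checked directly is what the repeated read() calls return. Once a Reader or a RandomAccessFile has reached the end of the file, further read() calls on the same open object return -1 rather than starting over, so a loop that only calls read() moves real data on its first pass at most. The sketch below is a hypothetical probe (the class and method names are not from this thread) that counts how many calls hit end of file.

```java
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.Reader;

// Hypothetical demo class; names are illustrative and not from the thread.
class RepeatedReadCheck {

    // Calls read(buf) 'attempts' times on one open Reader.
    // Returns {total chars delivered, number of calls that returned -1}.
    static long[] probeReader(File file, int attempts) throws IOException {
        char[] buf = new char[(int) file.length()];
        long delivered = 0, eofCalls = 0;
        try (Reader reader = new FileReader(file)) {
            for (int i = 0; i < attempts; i++) {
                int n = reader.read(buf);
                if (n == -1) eofCalls++; else delivered += n;
            }
        }
        return new long[] { delivered, eofCalls };
    }

    // Same probe for RandomAccessFile, which does not rewind on its own either.
    static long[] probeRaf(File file, int attempts) throws IOException {
        byte[] buf = new byte[(int) file.length()];
        long delivered = 0, eofCalls = 0;
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            for (int i = 0; i < attempts; i++) {
                int n = raf.read(buf);
                if (n == -1) eofCalls++; else delivered += n;
            }
        }
        return new long[] { delivered, eofCalls };
    }

    public static void main(String[] args) throws IOException {
        File file = new File(args[0]);
        long[] r = probeReader(file, 10);
        long[] f = probeRaf(file, 10);
        System.out.println("FileReader: chars=" + r[0] + ", EOF calls=" + r[1]);
        System.out.println("RandomAccessFile: bytes=" + f[0] + ", EOF calls=" + f[1]);
    }
}
```

    If most calls report end of file, the timings in such a loop mainly measure the cost of an end-of-file read() call, not the cost of reading the file.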
  22. #12
  23. No Profile Picture
    Clueless llama
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Feb 2001
    Location
    Lincoln, NE. USA
    Posts
    2,353
    Rep Power
    117
    That is possible, but you must realize that a BufferedReader will generally speed up disk access considerably. This is why it was made and why I used it. If you are doing disk access and not using a BufferedReader or BufferedInputStream, you are needlessly slowing down your application.

    Here is a Sun tutorial on Streams
    http://java.sun.com/docs/books/tutorial/essential/io/
    Last edited by Nemi; March 14th, 2003 at 05:29 PM.
  24. #13
  25. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2003
    Posts
    61
    Rep Power
    12
    I don't think that disk access actually takes place.
    The OS is caching files in RAM, otherwise it would not be possible to read a file in less than 1 ms.

    BufferedReader or BufferedInputStream is not of any use if the file is only read once. If there were to be many read() operations on the same stream during execution, using BufferedStream/Reader would be the wisest choice. But since there is only one read(), the buffering probably only introduces some overhead (however small).
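
    A rough way to see this distinction is to count how often the underlying stream is actually asked for data. The sketch below is hypothetical: the CountingStream wrapper and the method names are made up for illustration, and it reads from an in-memory ByteArrayInputStream rather than a real file, so it shows call counts only, not timings.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo: counts how often the underlying stream is asked for data.
class BufferingCallCount {

    // Wrapper that counts the read calls reaching the wrapped stream.
    static class CountingStream extends FilterInputStream {
        int calls = 0;
        CountingStream(InputStream in) { super(in); }
        @Override public int read() throws IOException { calls++; return in.read(); }
        @Override public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return in.read(b, off, len);
        }
    }

    // Reads the source one byte at a time, optionally through a BufferedInputStream.
    static int callsForSingleByteReads(byte[] source, boolean buffered) throws IOException {
        CountingStream counting = new CountingStream(new ByteArrayInputStream(source));
        InputStream in = buffered ? new BufferedInputStream(counting) : counting;
        while (in.read() != -1) { /* discard the data */ }
        in.close();
        return counting.calls;
    }

    // Reads the source with bulk read calls sized to the whole input.
    static int callsForBulkRead(byte[] source, boolean buffered) throws IOException {
        CountingStream counting = new CountingStream(new ByteArrayInputStream(source));
        InputStream in = buffered ? new BufferedInputStream(counting) : counting;
        byte[] target = new byte[source.length];
        int total = 0;
        while (total < target.length) {
            int n = in.read(target, total, target.length - total);
            if (n == -1) break;
            total += n;
        }
        in.close();
        return counting.calls;
    }

    public static void main(String[] args) throws IOException {
        byte[] source = new byte[10000];
        System.out.println("single-byte, unbuffered: " + callsForSingleByteReads(source, false));
        System.out.println("single-byte, buffered:   " + callsForSingleByteReads(source, true));
        System.out.println("bulk, unbuffered:        " + callsForBulkRead(source, false));
        System.out.println("bulk, buffered:          " + callsForBulkRead(source, true));
    }
}
```

    With many small reads the buffered version should reach the underlying stream far less often, while for one bulk read the two counts should be about the same, which is the case discussed here.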
  26. #14
  27. No Profile Picture
    Clueless llama
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Feb 2001
    Location
    Lincoln, NE. USA
    Posts
    2,353
    Rep Power
    117
    You make a good point, and that may be true.
  28. #15
  29. No Profile Picture
    Clueless llama
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Feb 2001
    Location
    Lincoln, NE. USA
    Posts
    2,353
    Rep Power
    117
    It is very interesting. I reran some tests just now (I must be really bored) and got some new data.

    I ran the tests without the BufferedReader. I also tried two different loops. One set of tests had the construction and closing of the reader/RAF in the loop; the other had only the reading in the loop.
    Code:
    Code with reading only in loop:
    -------------------------------------
    import java.io.*;
    
    public class TestFileRAF {
    
    	public static void main(String[] args) {
    		File file = new File(args[0]);
    		byte[] chars = new byte[(int) file.length()];
    		long start = System.currentTimeMillis();
    		try {
    			RandomAccessFile reader = new RandomAccessFile(file, "r");
    			for (int i = 0; i < 10000; i++) {
    				reader.read(chars);
    			}
    			reader.close();
    		} catch (FileNotFoundException e) {
    		} catch (IOException e) {
    		}
    		long end = System.currentTimeMillis();
    		String data = new String(chars);
    		//System.out.println(data);
    		System.out.println("Took " + (end - start) + " milliseconds to run");
    	}
    }
    -----------
    import java.io.*;
    
    public class TestFileReader {
    
    	public static void main(String[] args) {
    		File file = new File(args[0]);
    		char[] chars = new char[(int) file.length()];
    		long start = System.currentTimeMillis();
    		try {
    			FileReader reader = new FileReader(file);
    			for (int i = 0; i < 10000; i++) {
    				reader.read(chars);
    			}
    			reader.close();				
    		} catch (FileNotFoundException e) {
    		} catch (IOException e) {
    		}
    		long end = System.currentTimeMillis();
    		String data = new String(chars);
    		//System.out.println(data);		
    		System.out.println("Took " + (end - start) + " milliseconds to run");
    	}
    }
    Code:
    Code with new object creation in loop:
    --------------------------------------
    import java.io.*;
    
    public class TestFileReader {
    
    	public static void main(String[] args) {
    		File file = new File(args[0]);
    		char[] chars = new char[(int) file.length()];
    		long start = System.currentTimeMillis();
    		try {
    			for (int i = 0; i < 1000; i++) {
    				FileReader reader = new FileReader(file);
    				reader.read(chars);
    				reader.close();				
    			}
    		} catch (FileNotFoundException e) {
    		} catch (IOException e) {
    		}
    		long end = System.currentTimeMillis();
    		String data = new String(chars);
    		//System.out.println(data);		
    		System.out.println("Took " + (end - start) + " milliseconds to run");
    	}
    }
    ---------------
    import java.io.*;
    
    public class TestFileRAF {
    
    	public static void main(String[] args) {
    		File file = new File(args[0]);
    		byte[] chars = new byte[(int) file.length()];
    		long start = System.currentTimeMillis();
    		try {
    			for (int i = 0; i < 1000; i++) {
    				RandomAccessFile reader = new RandomAccessFile(file, "r");
    				reader.read(chars);
    				reader.close();
    			}
    		} catch (FileNotFoundException e) {
    		} catch (IOException e) {
    		}
    		long end = System.currentTimeMillis();
    		String data = new String(chars);
    		//System.out.println(data);
    		System.out.println("Took " + (end - start) + " milliseconds to run");
    	}
    }
    What I found is that when you create and destroy the FileReader/RAF in the loop, the RAF wins. However, if you are only reading in the loop, the FileReader wins (FileInputStream should be even faster).

    This suggests that the overall cost of constructing a stream is greater, but the actual usage of a stream appears to be cheaper. Opinions?
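
    One way to separate the two costs more cleanly for RandomAccessFile is sketched below, assuming Java 7 or later. The class and method names are hypothetical. The second variant calls seek(0) before every read so that each iteration transfers the file contents again, rather than calling read() on a file pointer that is already at the end. A FileReader has no equivalent rewind, so it is not covered here.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical timing sketch; class and method names are illustrative.
class OpenVersusReadCost {

    // Opens, reads fully, and closes the file on every iteration.
    static byte[] reopenEachTime(File file, int iterations) throws IOException {
        byte[] buf = new byte[(int) file.length()];
        for (int i = 0; i < iterations; i++) {
            try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
                raf.readFully(buf);
            }
        }
        return buf;
    }

    // Opens once, then rewinds with seek(0) so each iteration reads real data.
    static byte[] rewindEachTime(File file, int iterations) throws IOException {
        byte[] buf = new byte[(int) file.length()];
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            for (int i = 0; i < iterations; i++) {
                raf.seek(0);
                raf.readFully(buf);
            }
        }
        return buf;
    }

    public static void main(String[] args) throws IOException {
        File file = new File(args[0]);
        int iterations = 1000;

        long t0 = System.nanoTime();
        reopenEachTime(file, iterations);
        long t1 = System.nanoTime();
        rewindEachTime(file, iterations);
        long t2 = System.nanoTime();

        System.out.println("open+read+close: " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("seek+read only:  " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

    The difference between the two figures gives a rough estimate of the open/close overhead. As noted earlier in the thread, OS file caching and garbage collection can still skew a simple timing like this.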