I'd expect the operating system to cache the file if it's small enough. If you're concerned about performance, have you proved it to be a bottleneck? I'd just do the simplest thing and not worry about it until you have a specific reason to.
I mean, you could just read the whole thing into memory and then do the two passes on the result, but again that's going to be more complicated than just reading from the start again with a new reader.
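To make the "just read from the start again with a new reader" idea concrete, here is a minimal sketch. The class name, the method names, and the task itself (find the longest line, then count the lines of that length) are made up for illustration; only the pattern of opening a fresh BufferedReader per pass comes from the answer above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical example class: two sequential passes, one fresh reader per pass.
class TwoPassDemo {

    // First pass: find the length of the longest line.
    static int longestLineLength(Path file) throws IOException {
        int longest = 0;
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                longest = Math.max(longest, line.length());
            }
        }
        return longest;
    }

    // Second pass: a brand-new reader starts at the top of the file again.
    static int countLinesOfLength(Path file, int length) throws IOException {
        int count = 0;
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.length() == length) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Small temporary file so the sketch is self-contained.
        Path file = Files.createTempFile("two-pass", ".txt");
        Files.write(file, Arrays.asList("a", "bbb", "cc", "ddd"), StandardCharsets.UTF_8);
        int longest = longestLineLength(file);
        System.out.println("longest=" + longest
                + ", lines of that length=" + countLinesOfLength(file, longest));
        Files.delete(file);
    }
}
```

Each pass owns its reader and closes it, so nothing has to be rewound and no mark/reset bookkeeping is needed.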
Buffered readers are meant to read a file sequentially. What you are looking for is java.io.RandomAccessFile; you can then use seek() to move to wherever you want in the file.
The random access approach looks like this:

    import java.io.File;
    import java.io.FileNotFoundException;
    import java.io.IOException;
    import java.io.RandomAccessFile;

    try {
        String fileName = "c:/myraffile.txt";
        File file = new File(fileName);
        RandomAccessFile raf = new RandomAccessFile(file, "rw");
        raf.readChar();  // read something, which advances the file pointer
        raf.seek(0);     // jump back to the start of the file
    } catch (FileNotFoundException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }

The "rw" is the access mode, which is detailed here: http://java.sun.com/j2se/1.4.2/docs/api/java/io/RandomAccessFile.html#mode

The reason the sequential access readers are set up like this is so that they can implement their buffers and so that things cannot be changed beneath their feet.
For example, the file reader that is given to the buffered reader should only be operated on by that buffered reader. If something else could also affect it, you could get inconsistent behavior: one reader advances the position in the underlying file reader while the other expects it to stay the same, so when you switch to the other reader it is at an undetermined location.
The best way to proceed is to change your algorithm so that you do NOT need the second pass. I have used this approach a couple of times when I had to deal with huge (but not terrible, i.e. a few GB) files which didn't fit in the available memory. It might be hard, but the performance gain is usually worth the effort.
The usual reason why you don't want to do that is there might be memory issues – Davide Nov 4 '08 at 17:53.
MattB: Huge text file, would rather not store it. But I can't argue with the simplicity of your answer. It's certainly worth a try. Thanks.
– Jon Skeet Nov 4 '08 at 17:45
What I was trying to get at in my answer wasn't so much "read the whole thing into memory" but to have you take a look at your algorithm to see if you really need to read it twice. – matt be Nov 4 '08 at 18:28
(continued) If you're having problems with BufferedReader, but you really don't need to use it like that, you can save time by not having to solve problems that you don't need to. – matt be Nov 4 '08 at 18:29.
Jon Skeet: That's the way I currently have it implemented. It works fine, performance is certainly NOT an issue. I just felt a tinge of shame from not having a more elegant approach. Thanks.
(It's generally a good idea to hit "Add comment" on the answer you're replying to, rather than adding a new answer.) You're doing exactly what you need to, with no over-engineering. Seems elegant enough to me :) – Jon Skeet Nov 4 '08 at 17:48
He can't add a comment until he's got 50 rep – Dave Nov 4 '08 at 17:53
Yep.
If you really, truly can't do it all in one pass, I don't think that you should feel bad about leveraging the file system and whatever caching the OS is giving you. Sometimes a format is poorly designed for one-pass processing, and you can't change it (X.509 CRL structures, for example). – erickson Nov 4 '08 at 17:56.
I agree with Jon Skeet. Why is it so bad to have two BufferedReaders? I guess if the file is too large you could have some gain by keeping it in memory.
But in that case, you must choose between memory consumption and some gain in performance.
Ryan P: I didn't know about RandomAccessFile, thanks. It seems (I'm biased, I admit) that BufferedReader should have a method such as "topOfFile" or "startOfStream" that does what RandomAccessFile.seek(0) does.
The whole business about mark() and reset() in BufferedReader smacks of poor design. This may be the last time I ever use BufferedReader. Thanks again.
"The whole business about mark() and reset() in BufferedReader smacks of poor design." Why don't you extend this class and have it do a mark() in the constructor and then do a seek(0) in a topOfFile() method? BR, ~A.
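One caveat on the subclassing idea: BufferedReader has no seek() method, and a mark() made in the constructor is still bound by its readAheadLimit. A different way to get a "topOfFile" operation, sketched here under the assumption that the input is a regular file that can simply be reopened, is a small wrapper that closes the current reader and opens a new one. RewindableLineReader and topOfFile are hypothetical names, not part of the JDK.

```java
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: "rewinds" by reopening the file instead of using mark()/reset().
class RewindableLineReader implements Closeable {
    private final Path file;
    private final Charset charset;
    private BufferedReader reader;

    RewindableLineReader(Path file, Charset charset) throws IOException {
        this.file = file;
        this.charset = charset;
        this.reader = Files.newBufferedReader(file, charset);
    }

    // Returns the next line, or null at end of file (same contract as BufferedReader.readLine).
    String readLine() throws IOException {
        return reader.readLine();
    }

    // Closes the current reader and opens a fresh one positioned at the start of the file.
    void topOfFile() throws IOException {
        reader.close();
        reader = Files.newBufferedReader(file, charset);
    }

    @Override
    public void close() throws IOException {
        reader.close();
    }
}
```

This is the same "open a new reader for the second pass" approach suggested earlier in the thread, just hidden behind one method call. It will not work for streams that cannot be reopened, such as sockets or standard input.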
About mark/reset: The mark method in BufferedReader takes a readAheadLimit parameter which limits how far you can read after a mark before reset becomes impossible. Resetting doesn't actually mean a file system seek(0), it just seeks inside the buffer. To quote the Javadoc: readAheadLimit - Limit on the number of characters that may be read while still preserving the mark.
After reading this many characters, attempting to reset the stream may fail. A limit value larger than the size of the input buffer will cause a new buffer to be allocated whose size is no smaller than limit. Therefore large values should be used with care.
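A minimal sketch of mark() and reset() staying within the limit, to show what the Javadoc describes. The class and method names are made up for illustration, and the input comes from a StringReader so the example is self-contained.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical example class: reads the first line, resets to the mark, reads it again.
class MarkResetDemo {

    static String readFirstLineTwice(String text, int readAheadLimit) throws IOException {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            // Remember the current position; the mark is only guaranteed to survive
            // while at most readAheadLimit characters are read after this call.
            reader.mark(readAheadLimit);
            String first = reader.readLine();
            // Go back to the marked position (a move inside the buffer, not a file seek).
            reader.reset();
            String again = reader.readLine();
            return first + "|" + again;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readFirstLineTwice("hello\nworld", 64));
    }
}
```

Here the first line is far shorter than the limit of 64 characters, so reset() is safe. For a whole-file second pass the limit would have to cover the entire file, which is exactly the memory cost the Javadoc warns about.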
Anjan b: or I could just post non-constructive answers to other people's questions. Ryan P's suggestion to use RandomAccessFile makes your post moot at best. @Zarkonnen: I UNDERSTAND and DISLIKE the mark/reset paradigm.
Your post implies I dislike it because I don't get it. Incorrect. I don't believe I should have to write code that's aware of the structure and length of the file it's buffering in order to simply go to an arbitrary point in it.
I should be able to call mark() before I read the nth line/char/String and go back there whenever I please, not if and only if I haven't passed some arbitrary number. What's worse is the behavior exhibited if you incorrectly compute/guess/estimate the readAheadLimit. Suffice it to say, anyone who likes cookie dough will be in for a treat, because mark()/reset() is definitely half-baked.
Thanks again to everyone who posted. I enjoyed thinking about and discussing the issue from all angles.
RandomAccessFile is an extremely slow class (its reads are unbuffered); do not use it unless performance isn't an issue.