Java - is it possible to read a file line by line, stop, and then immediately start reading bytes where I stopped?

My first thought is to read the whole thing into a ByteBuffer or a ByteArrayOutputStream without trying to process it, then locate the tag by comparing byte values. Once you know where the text part ends and the binary part begins, you process each part as appropriate.
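A minimal sketch of this approach, assuming the file fits comfortably in memory and the text part is ASCII. The class name `TagSplitter` and the tag value `<END>` are hypothetical placeholders, not something from the question.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

final class TagSplitter {

    /** Returns the index of the first occurrence of tag in data, or -1 if it is absent. */
    static int indexOf(byte[] data, byte[] tag) {
        outer:
        for (int i = 0; i <= data.length - tag.length; i++) {
            for (int j = 0; j < tag.length; j++) {
                if (data[i + j] != tag[j]) {
                    continue outer; // mismatch: try the next start position
                }
            }
            return i;
        }
        return -1;
    }

    /** Decodes the bytes that precede the tag as ASCII text. */
    static String textPart(byte[] data, int tagIndex) {
        return new String(data, 0, tagIndex, StandardCharsets.US_ASCII);
    }

    /** Returns a copy of the bytes that follow the tag. */
    static byte[] binaryPart(byte[] data, int tagIndex, int tagLength) {
        return Arrays.copyOfRange(data, tagIndex + tagLength, data.length);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: TagSplitter <file>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        // Hypothetical end tag; substitute whatever the real file format uses.
        byte[] tag = "<END>".getBytes(StandardCharsets.US_ASCII);
        int idx = indexOf(data, tag);
        if (idx < 0) {
            System.err.println("end tag not found");
            return;
        }
        System.out.print(textPart(data, idx));
        System.out.println(binaryPart(data, idx, tag.length).length + " binary bytes follow the tag");
    }
}
```

The naive search is quadratic in the worst case, which should not matter for a short tag and a small file.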

Not a very big file; I like the simplicity of this. I'll give it a try. – NMoney Aug 27 '09 at 18:43

Actually, I really like this now that I have read about it. So the plan would be to read the whole thing into a ByteBuffer (I know how big the file is in bytes, so this buffer will be the right size). Then I search the ByteBuffer for my end tag and slice the buffer right there. Would that work? I imagine searching for my end tag would involve searching for the first byte and, if found, checking the second, third, etc. to confirm. – NMoney Aug 27 '09 at 18:48

That's what I was thinking. – Alan Moore Aug 27 '09 at 18:55

Is the most flexible option, in terms of processing the bytes, to read both sections into separate byte arrays (byte[])? Is there some way to pass a byte array, instead of a FileInputStream, into a FileReader? One of the byte arrays would be full of ASCII-encoded text, and I would like to buffer it if possible and read out lines (like with BufferedReader or Scanner). Is such a thing possible? – NMoney Aug 27 '09 at 20:25

Ah, I could just pass my byte array into a ByteArrayInputStream, which I could pass into an InputStreamReader to convert bytes to chars, right? And from there to a FileReader and then maybe a BufferedReader? – NMoney Aug 27 '09 at 20:31
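The chain asked about in the last comment can be sketched roughly as follows, assuming the text section is already sitting in a byte array. No FileReader is involved, since that class reads from files only; the InputStreamReader can be wrapped directly in a BufferedReader. The class name `AsciiLines` is a hypothetical placeholder.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class AsciiLines {

    /** Reads an in-memory ASCII byte array line by line. */
    static List<String> readLines(byte[] asciiBytes) throws IOException {
        List<String> lines = new ArrayList<>();
        // byte[] -> InputStream -> Reader (bytes to chars) -> BufferedReader (lines)
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(
                        new ByteArrayInputStream(asciiBytes),
                        StandardCharsets.US_ASCII))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        byte[] text = "first line\nsecond line\n".getBytes(StandardCharsets.US_ASCII);
        for (String line : readLines(text)) {
            System.out.println(line);
        }
    }
}
```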

It is possible, but as far as I know not with a single class from the API. You can do it manually: open the file as a BufferedInputStream, which supports mark/reset. You read block by block (byte[]) and you parse it as ASCII.

You accumulate the blocks in a buffer until you hit the marker. But before you start reading, you call mark. Once you believe you have read everything you needed in ASCII, you call reset and then read (or skip) to discard the ASCII part again, up to just past the marker.

And now you have a BufferedInputStream (which is an InputStream) ready for reading the binary part of the file.
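One way this could look, as a sketch under stated assumptions: the text part plus tag is assumed to fit within `maxHeaderBytes` (which is also used as the mark read limit), and the class name `MarkResetSplitter` and all parameter names are hypothetical. The search restarts a few bytes before each new block so that a tag straddling two blocks is still found.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class MarkResetSplitter {

    /** Searches data[from..length) for tag; returns its index or -1. */
    static int indexOf(byte[] data, int length, byte[] tag, int from) {
        outer:
        for (int i = from; i <= length - tag.length; i++) {
            for (int j = 0; j < tag.length; j++) {
                if (data[i + j] != tag[j]) continue outer;
            }
            return i;
        }
        return -1;
    }

    /** Skips exactly count bytes, or throws if the stream ends first. */
    static void skipFully(InputStream in, long count) throws IOException {
        while (count > 0) {
            long skipped = in.skip(count);
            if (skipped > 0) {
                count -= skipped;
            } else if (in.read() == -1) { // skip made no progress: fall back to read
                throw new EOFException("stream ended while skipping");
            } else {
                count--;
            }
        }
    }

    /**
     * Returns the ASCII text before the tag and leaves the stream positioned
     * on the first byte after the tag.
     */
    static String readTextPart(BufferedInputStream in, byte[] tag,
                               int maxHeaderBytes, int blockSize) throws IOException {
        byte[] header = new byte[maxHeaderBytes];
        int len = 0;
        in.mark(maxHeaderBytes); // never read more than this before reset()
        while (len < maxHeaderBytes) {
            int n = in.read(header, len, Math.min(blockSize, maxHeaderBytes - len));
            if (n == -1) break;
            // Back up so a tag spanning the previous and current block is seen.
            int from = Math.max(0, len - tag.length + 1);
            len += n;
            int idx = indexOf(header, len, tag, from);
            if (idx >= 0) {
                in.reset();                      // rewind to the mark
                skipFully(in, idx + tag.length); // discard text and tag
                return new String(header, 0, idx, StandardCharsets.US_ASCII);
            }
        }
        throw new IOException("end tag not found in the first " + maxHeaderBytes + " bytes");
    }

    public static void main(String[] args) throws IOException {
        byte[] demo = {'a', 'b', '\n', '<', 'E', 'N', 'D', '>', 7, 8, 9};
        BufferedInputStream in = new BufferedInputStream(new ByteArrayInputStream(demo));
        byte[] tag = "<END>".getBytes(StandardCharsets.US_ASCII); // hypothetical tag
        System.out.print(readTextPart(in, tag, 1024, 4096));
        System.out.println("first binary byte: " + in.read());
    }
}
```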

I don't know how far down the end tag is, so the only data structure I can think of is an ArrayList. Looking at a buffer, it seems I need to know how much to allocate for it, which I don't. Is an ArrayList the best way to deal with this? – NMoney Aug 27 '09 at 17:10

You read 100 bytes. Does it contain the end marker (easy to test, because of the ASCII encoding)? No; then it's part of the string. Remember it somewhere (to parse it as a string). You read the next block. Again, it doesn't contain the end marker, so you keep track of it. And so on. At some point, you read a block that has the end marker. You cut the first part (before the marker) and store it for String parsing. You rewind to the beginning of the block, read/skip bytes until after the marker, and you have the right binary input stream. You concatenate the accumulated pieces and use a Reader. TBC – Marian Aug 27 '09 at 18:25

You will need to be careful about the end marker spanning two consecutive blocks. You can store the bytes in a List before concatenation, to avoid repeated System.arraycopy calls. BTW, that 100 is bad. You should use something like 4096 or 16384. – Marian Aug 27 '09 at 18:26

I think the best idea would be to abandon the concept of "lines". To find the end tag, create a ring buffer that's just big enough to contain the end tag, read into it byte-by-byte, and after each byte check if it contains the tag. There are more sophisticated and efficient search algorithms, but the difference is only relevant with longer search terms (presumably your end tag is short).
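A rough sketch of the ring-buffer idea, assuming a non-empty tag. The class name `RingBufferTagFinder` is a hypothetical placeholder. Bytes that fall out of the window are collected so the text part is still available afterwards; wrapping the source in a BufferedInputStream keeps the byte-by-byte reads cheap.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class RingBufferTagFinder {

    /** Compares the ring contents, starting at its oldest slot, with the tag. */
    static boolean matches(byte[] ring, int oldest, byte[] tag) {
        for (int i = 0; i < tag.length; i++) {
            if (ring[(oldest + i) % ring.length] != tag[i]) return false;
        }
        return true;
    }

    /**
     * Reads byte by byte until the tag has been seen. Returns the bytes that
     * came before the tag, or null if the stream ended without it. On success
     * the stream is positioned on the first byte after the tag.
     */
    static byte[] readUntilTag(InputStream in, byte[] tag) throws IOException {
        if (tag.length == 0) throw new IllegalArgumentException("tag must not be empty");
        ByteArrayOutputStream before = new ByteArrayOutputStream();
        byte[] ring = new byte[tag.length]; // holds the most recent tag.length bytes
        long count = 0;                     // total bytes read so far
        int b;
        while ((b = in.read()) != -1) {
            int slot = (int) (count % tag.length);
            if (count >= tag.length) {
                before.write(ring[slot]);   // oldest byte leaves the window
            }
            ring[slot] = (byte) b;
            count++;
            if (count >= tag.length && matches(ring, (int) (count % tag.length), tag)) {
                return before.toByteArray();
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        byte[] demo = {'h', 'i', '\n', '<', 'E', 'N', 'D', '>', 42};
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(demo));
        byte[] tag = "<END>".getBytes(StandardCharsets.US_ASCII); // hypothetical tag
        byte[] text = readUntilTag(in, tag);
        System.out.print(new String(text, StandardCharsets.US_ASCII));
        System.out.println("first binary byte: " + in.read());
    }
}
```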

I don't think he can choose the file format. I have seen the kind of files he describes. For example, I believe that the Java2SE installation kit for Linux is stored in the same way. – Marian Aug 27 '09 at 14:27

I'm not saying he has to change the file format, just that he should try to read it one byte at a time rather than depending on the concept of "lines". – Michael Borgwardt Aug 27 '09 at 14:29

@michael: is there a standard Java class for a ring buffer? I couldn't find a corresponding Java site after googling "ring buffer java". – NMoney Aug 27 '09 at 14:33

Sorry, I read only the first sentence, I admit :-D – Marian Aug 27 '09 at 14:47

No, there's no implementation in the standard API. But it's a very simple data structure to implement yourself. Alternatively, you could abuse an ArrayDeque for this purpose by calling removeFirst() for each add() once its length equals the end tag's. – Michael Borgwardt Aug 27 '09 at 14:56

Yup, you're right about the byte-by-byte. Abstraction has its disadvantages. Crimson: AAAAAAARRRRRRRRRRRRGGGGGGGGGGGGGG – NMoney Aug 27 '09 at 14:22

Java's strong distinction between character and byte streams, while useful for ensuring that you are always dealing with data correctly and distinguishing between strings and their encodings, does make this a bit difficult. – Michael E Aug 27 '09 at 14:23

If it's static, see java.sun.com/javase/6/docs/api/java/nio/....

It's static, but I don't see how a MappedByteBuffer really offers me much more than a normal ByteBuffer for just reading all the bytes into arrays and such. – NMoney Aug 28 '09 at 8:52

