Buffered files (for faster disk access)?

Windows file caching is very effective, especially if you are using Vista or later. TFileStream is a loose wrapper around the Windows ReadFile() and WriteFile() API functions and for many use cases the only thing faster is a memory mapped file. However, there is one common scenario where TFileStream becomes a performance bottleneck.

The bottleneck arises when you read or write small amounts of data with each call to the stream's read or write functions. For example, if you read an array of integers one item at a time, you incur a significant overhead from calling ReadFile() for just 4 bytes at a time. Again, memory mapped files are an excellent way to solve this bottleneck, but the other commonly used approach is to read a much larger buffer, many kilobytes say, and then satisfy subsequent reads of the stream from this in-memory cache rather than through further calls to ReadFile().
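As a rough illustration of the larger-buffer idea, here is a minimal sketch that sums the 32-bit integers in a file by reading them in 16 KB blocks through a plain TFileStream, rather than issuing one read per integer. The function name SumIntegers and the block size are assumptions made for this example only.

```pascal
program BlockReadSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: sums all 32-bit integers in a file, reading them in
// blocks of BlockCount items instead of one stream read per integer.
function SumIntegers(const FileName: string): Int64;
const
  BlockCount = 4096; // 4096 * 4 bytes = 16 KB per read; an arbitrary choice
var
  Stream: TFileStream;
  Block: array[0..BlockCount-1] of Integer;
  BytesRead, i: Integer;
begin
  Result := 0;
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    repeat
      // One call fetches up to BlockCount integers.
      BytesRead := Stream.Read(Block, SizeOf(Block));
      // Any trailing partial integer (fewer than 4 bytes) is ignored.
      for i := 0 to (BytesRead div SizeOf(Integer)) - 1 do
        Inc(Result, Block[i]);
    until BytesRead < SizeOf(Block);
  finally
    Stream.Free;
  end;
end;

begin
  if ParamCount = 1 then
    Writeln(SumIntegers(ParamStr(1)));
end.
```

The caller does its own blocking here. A caching stream class moves the same logic behind the Read method so that calling code can keep reading one item at a time.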

This approach only really works for sequential access. From the use pattern described in your updated question, I think you may find the following classes would improve performance for you:

type
  IStreamErrorHandler = interface
    ['{B2A95D51-DD0D-49C2-9511-638EE4F911C8}']
    procedure HandleError(const Msg: string='');
  end;

  TBaseCachedFileStream = class(TStream, IStreamErrorHandler)
  private
    function QueryInterface(const IID: TGUID; out Obj): HResult; stdcall;
    function _AddRef: Integer; stdcall;
    function _Release: Integer; stdcall;
  protected
    FHandle: THandle;
    FOwnsHandle: Boolean;
    FCache: PByte;
    FCacheSize: Integer;
    FPosition: Int64;//the current position in the file (relative to the beginning of the file)
    FCacheStart: Int64;//the position in the file of the start of the cache (relative to the beginning of the file)
    FCacheEnd: Int64;//the position in the file of the end of the cache (relative to the beginning of the file)
    FFileName: string;
    FLastError: DWORD;
    procedure HandleError(const Msg: string);
    procedure RaiseSystemError(const Msg: string; LastError: DWORD); overload;
    procedure RaiseSystemError(const Msg: string); overload;
    procedure RaiseSystemErrorFmt(const Msg: string; const Args: array of const);
    function CreateHandle(FlagsAndAttributes: DWORD): THandle; virtual; abstract;
    function GetFileSize: Int64; virtual;
    procedure SetSize(NewSize: Longint); override;
    procedure SetSize(const NewSize: Int64); override;
    function FileRead(var Buffer; Count: Longword): Integer;
    function FileWrite(const Buffer; Count: Longword): Integer;
    function FileSeek(const Offset: Int64; Origin: TSeekOrigin): Int64;
  public
    constructor Create(const FileName: string); overload;
    constructor Create(const FileName: string; CacheSize: Integer); overload;
    constructor Create(const FileName: string; CacheSize: Integer; Handle: THandle); overload; virtual;
    destructor Destroy; override;
    property CacheSize: Integer read FCacheSize;
    function Read(var Buffer; Count: Longint): Longint; override;
    function Write(const Buffer; Count: Longint): Longint; override;
    function Seek(const Offset: Int64; Origin: TSeekOrigin): Int64; override;
  end;
  TBaseCachedFileStreamClass = class of TBaseCachedFileStream;

  IDisableStreamReadCache = interface
    ['{0B6D0004-88D1-42D5-BC0F-447911C0FC21}']
    procedure DisableStreamReadCache;
    procedure EnableStreamReadCache;
  end;

  TReadOnlyCachedFileStream = class(TBaseCachedFileStream, IDisableStreamReadCache)
  (* This class works by filling the cache each time a call to Read is made and
     FPosition is outside the existing cache.  By filling the cache we mean
     reading from the file into the temporary cache.  Calls to Read when
     FPosition is in the existing cache are then dealt with by filling the
     buffer with bytes from the cache.
  *)
  private
    FUseAlignedCache: Boolean;
    FViewStart: Int64;
    FViewLength: Int64;
    FDisableStreamReadCacheRefCount: Integer;
    procedure DisableStreamReadCache;
    procedure EnableStreamReadCache;
    procedure FlushCache;
  protected
    function CreateHandle(FlagsAndAttributes: DWORD): THandle; override;
    function GetFileSize: Int64; override;
  public
    constructor Create(const FileName: string; CacheSize: Integer; Handle: THandle); overload; override;
    property UseAlignedCache: Boolean read FUseAlignedCache write FUseAlignedCache;
    function Read(var Buffer; Count: Longint): Longint; override;
    procedure SetViewWindow(const ViewStart, ViewLength: Int64);
  end;

  TWriteCachedFileStream = class(TBaseCachedFileStream, IDisableStreamReadCache)
  (* This class works by caching calls to Write.  By this we mean temporarily
     storing the bytes to be written in the cache.  As each call to Write is
     processed the cache grows.  The cache is written to file when:
       1.  A call to Write is made when the cache is full.
       2.  A call to Write is made and FPosition is outside the cache (this
           must be as a result of a call to Seek).
       3.  The class is destroyed.

     Note that data can be read from these streams but the reading is not
     cached and in fact a read operation will flush the cache before
     attempting to read the data.
  *)
  private
    FFileSize: Int64;
    FReadStream: TReadOnlyCachedFileStream;
    FReadStreamCacheSize: Integer;
    FReadStreamUseAlignedCache: Boolean;
    procedure DisableStreamReadCache;
    procedure EnableStreamReadCache;
    procedure CreateReadStream;
    procedure FlushCache;
  protected
    function CreateHandle(FlagsAndAttributes: DWORD): THandle; override;
    function GetFileSize: Int64; override;
  public
    constructor Create(const FileName: string; CacheSize, ReadStreamCacheSize: Integer; ReadStreamUseAlignedCache: Boolean); overload;
    destructor Destroy; override;
    function Read(var Buffer; Count: Longint): Longint; override;
    function Write(const Buffer; Count: Longint): Longint; override;
  end;

{ TBaseCachedFileStream }

constructor TBaseCachedFileStream.Create(const FileName: string);
begin
  Create(FileName, 0);
end;

constructor TBaseCachedFileStream.Create(const FileName: string; CacheSize: Integer);
begin
  Create(FileName, CacheSize, 0);
end;

constructor TBaseCachedFileStream.Create(const FileName: string; CacheSize: Integer; Handle: THandle);
const
  DefaultCacheSize = 16*1024;
  //16kb - this was chosen empirically - don't make it too large otherwise the progress report is 'jerky'
begin
  inherited Create;
  FFileName := FileName;
  FOwnsHandle := Handle=0;
  if FOwnsHandle then begin
    FHandle := CreateHandle(FILE_ATTRIBUTE_NORMAL);
  end else begin
    FHandle := Handle;
  end;
  FCacheSize := CacheSize;
  if FCacheSize<=0 then begin
    FCacheSize := DefaultCacheSize;
  end;
  GetMem(FCache, FCacheSize);
end;

destructor TBaseCachedFileStream.Destroy;
begin
  FreeMem(FCache);
  if FOwnsHandle and (FHandle<>0) then begin
    CloseHandle(FHandle);
  end;
  inherited;
end;

function TBaseCachedFileStream.QueryInterface(const IID: TGUID; out Obj): HResult;
begin
  if GetInterface(IID, Obj) then begin
    Result := S_OK;
  end else begin
    Result := E_NOINTERFACE;
  end;
end;

function TBaseCachedFileStream._AddRef: Integer;
begin
  Result := -1;
end;

function TBaseCachedFileStream._Release: Integer;
begin
  Result := -1;
end;

procedure TBaseCachedFileStream.HandleError(const Msg: string);
begin
  if FLastError<>0 then begin
    RaiseSystemError(Msg, FLastError);
  end;
end;

procedure TBaseCachedFileStream.RaiseSystemError(const Msg: string; LastError: DWORD);
begin
  raise EStreamError.Create(Trim(Msg+'  ')+GetSystemErrorString(LastError));
end;

procedure TBaseCachedFileStream.RaiseSystemError(const Msg: string);
begin
  RaiseSystemError(Msg, GetLastError);
end;

procedure TBaseCachedFileStream.RaiseSystemErrorFmt(const Msg: string; const Args: array of const);
begin
  RaiseSystemError(Format(Msg, Args));
end;

function TBaseCachedFileStream.GetFileSize: Int64;
begin
  if not GetFileSizeEx(FHandle, Result) then begin
    RaiseSystemErrorFmt('GetFileSizeEx failed for %s.', [FFileName]);
  end;
end;

procedure TBaseCachedFileStream.SetSize(NewSize: Longint);
begin
  SetSize(Int64(NewSize));
end;

procedure TBaseCachedFileStream.SetSize(const NewSize: Int64);
begin
  Seek(NewSize, soBeginning);
  if not Windows.SetEndOfFile(FHandle) then begin
    RaiseSystemErrorFmt('SetEndOfFile for %s.', [FFileName]);
  end;
end;

function TBaseCachedFileStream.FileRead(var Buffer; Count: Longword): Integer;
begin
  if Windows.ReadFile(FHandle, Buffer, Count, LongWord(Result), nil) then begin
    FLastError := 0;
  end else begin
    FLastError := GetLastError;
    Result := -1;
  end;
end;

function TBaseCachedFileStream.FileWrite(const Buffer; Count: Longword): Integer;
begin
  if Windows.WriteFile(FHandle, Buffer, Count, LongWord(Result), nil) then begin
    FLastError := 0;
  end else begin
    FLastError := GetLastError;
    Result := -1;
  end;
end;

function TBaseCachedFileStream.FileSeek(const Offset: Int64; Origin: TSeekOrigin): Int64;
begin
  if not SetFilePointerEx(FHandle, Offset, @Result, ord(Origin)) then begin
    RaiseSystemErrorFmt('SetFilePointerEx failed for %s.', [FFileName]);
  end;
end;

function TBaseCachedFileStream.Read(var Buffer; Count: Integer): Longint;
begin
  RaiseAssertionFailed(Result);
end;

function TBaseCachedFileStream.Write(const Buffer; Count: Integer): Longint;
begin
  RaiseAssertionFailed(Result);
end;

function TBaseCachedFileStream.Seek(const Offset: Int64; Origin: TSeekOrigin): Int64;
//Set FPosition to the value specified - if this has implications for the
//cache then overridden Write and Read methods must deal with those.
begin
  case Origin of
  soBeginning:
    FPosition := Offset;
  soEnd:
    FPosition := GetFileSize+Offset;
  soCurrent:
    inc(FPosition, Offset);
  else
    RaiseAssertionFailed;
  end;
  Result := FPosition;
end;

{ TReadOnlyCachedFileStream }

constructor TReadOnlyCachedFileStream.Create(const FileName: string; CacheSize: Integer; Handle: THandle);
begin
  inherited;
  SetViewWindow(0, inherited GetFileSize);
end;

function TReadOnlyCachedFileStream.CreateHandle(FlagsAndAttributes: DWORD): THandle;
begin
  Result := Windows.CreateFile(
    PChar(FFileName),
    GENERIC_READ,
    FILE_SHARE_READ,
    nil,
    OPEN_EXISTING,
    FlagsAndAttributes,
    0
  );
  if Result=INVALID_HANDLE_VALUE then begin
    RaiseSystemErrorFmt('Cannot open %s.', [FFileName]);
  end;
end;

procedure TReadOnlyCachedFileStream.DisableStreamReadCache;
begin
  inc(FDisableStreamReadCacheRefCount);
end;

procedure TReadOnlyCachedFileStream.EnableStreamReadCache;
begin
  dec(FDisableStreamReadCacheRefCount);
end;

procedure TReadOnlyCachedFileStream.FlushCache;
begin
  FCacheStart := 0;
  FCacheEnd := 0;
end;

function TReadOnlyCachedFileStream.GetFileSize: Int64;
begin
  Result := FViewLength;
end;

procedure TReadOnlyCachedFileStream.SetViewWindow(const ViewStart, ViewLength: Int64);
begin
  if ViewStart<0 then begin
    RaiseAssertionFailed;
  end;
  if (ViewStart+ViewLength)>inherited GetFileSize then begin
    RaiseAssertionFailed;
  end;
  FViewStart := ViewStart;
  FViewLength := ViewLength;
  FPosition := 0;
  FCacheStart := 0;
  FCacheEnd := 0;
end;

function TReadOnlyCachedFileStream.Read(var Buffer; Count: Longint): Longint;
var
  NumOfBytesToCopy, NumOfBytesLeft, NumOfBytesRead: Longint;
  CachePtr, BufferPtr: PByte;
begin
  if FDisableStreamReadCacheRefCount>0 then begin
    FileSeek(FPosition+FViewStart, soBeginning);
    Result := FileRead(Buffer, Count);
    if Result=-1 then begin
      Result := 0;//contract is to return number of bytes that were read
    end;
    inc(FPosition, Result);
  end else begin
    Result := 0;
    NumOfBytesLeft := Count;
    BufferPtr := @Buffer;
    while NumOfBytesLeft>0 do begin
      if (FPosition<FCacheStart) or (FPosition>=FCacheEnd) then begin
        //the current position is not available in the cache so we need to re-fill the cache
        FCacheStart := FPosition;
        if UseAlignedCache then begin
          FCacheStart := FCacheStart - (FCacheStart mod CacheSize);
        end;
        FileSeek(FCacheStart+FViewStart, soBeginning);
        NumOfBytesRead := FileRead(FCache^, CacheSize);
        if NumOfBytesRead=-1 then begin
          exit;
        end;
        Assert(NumOfBytesRead>=0);
        FCacheEnd := FCacheStart+NumOfBytesRead;
        if NumOfBytesRead=0 then begin
          FLastError := ERROR_HANDLE_EOF;//must be at the end of the file
          break;
        end;
      end;

      //read from cache to Buffer
      NumOfBytesToCopy := Min(FCacheEnd-FPosition, NumOfBytesLeft);
      CachePtr := FCache;
      inc(CachePtr, FPosition-FCacheStart);
      Move(CachePtr^, BufferPtr^, NumOfBytesToCopy);
      inc(Result, NumOfBytesToCopy);
      inc(FPosition, NumOfBytesToCopy);
      inc(BufferPtr, NumOfBytesToCopy);
      dec(NumOfBytesLeft, NumOfBytesToCopy);
    end;
  end;
end;

{ TWriteCachedFileStream }

constructor TWriteCachedFileStream.Create(const FileName: string; CacheSize, ReadStreamCacheSize: Integer; ReadStreamUseAlignedCache: Boolean);
begin
  inherited Create(FileName, CacheSize);
  FReadStreamCacheSize := ReadStreamCacheSize;
  FReadStreamUseAlignedCache := ReadStreamUseAlignedCache;
end;

destructor TWriteCachedFileStream.Destroy;
begin
  FlushCache;//make sure that the final calls to Write get recorded in the file
  FreeAndNil(FReadStream);
  inherited;
end;

function TWriteCachedFileStream.CreateHandle(FlagsAndAttributes: DWORD): THandle;
begin
  Result := Windows.CreateFile(
    PChar(FFileName),
    GENERIC_READ or GENERIC_WRITE,
    0,
    nil,
    CREATE_ALWAYS,
    FlagsAndAttributes,
    0
  );
  if Result=INVALID_HANDLE_VALUE then begin
    RaiseSystemErrorFmt('Cannot create %s.', [FFileName]);
  end;
end;

procedure TWriteCachedFileStream.DisableStreamReadCache;
begin
  CreateReadStream;
  FReadStream.DisableStreamReadCache;
end;

procedure TWriteCachedFileStream.EnableStreamReadCache;
begin
  Assert(Assigned(FReadStream));
  FReadStream.EnableStreamReadCache;
end;

function TWriteCachedFileStream.GetFileSize: Int64;
begin
  Result := FFileSize;
end;

procedure TWriteCachedFileStream.CreateReadStream;
begin
  if not Assigned(FReadStream) then begin
    FReadStream := TReadOnlyCachedFileStream.Create(FFileName, FReadStreamCacheSize, FHandle);
    FReadStream.UseAlignedCache := FReadStreamUseAlignedCache;
  end;
end;

procedure TWriteCachedFileStream.FlushCache;
var
  NumOfBytesToWrite: Longint;
begin
  if Assigned(FCache) then begin
    NumOfBytesToWrite := FCacheEnd-FCacheStart;
    if NumOfBytesToWrite>0 then begin
      FileSeek(FCacheStart, soBeginning);
      if FileWrite(FCache^, NumOfBytesToWrite)<>NumOfBytesToWrite then begin
        RaiseSystemErrorFmt('FileWrite failed for %s.', [FFileName]);
      end;
      if Assigned(FReadStream) then begin
        FReadStream.FlushCache;
      end;
    end;
    FCacheStart := FPosition;
    FCacheEnd := FPosition;
  end;
end;

function TWriteCachedFileStream.Read(var Buffer; Count: Integer): Longint;
begin
  FlushCache;
  CreateReadStream;
  Assert(FReadStream.FViewStart=0);
  if FReadStream.FViewLength<>FFileSize then begin
    FReadStream.SetViewWindow(0, FFileSize);
  end;
  FReadStream.Position := FPosition;
  Result := FReadStream.Read(Buffer, Count);
  inc(FPosition, Result);
end;

function TWriteCachedFileStream.Write(const Buffer; Count: Longint): Longint;
var
  NumOfBytesToCopy, NumOfBytesLeft: Longint;
  CachePtr, BufferPtr: PByte;
begin
  Result := 0;
  NumOfBytesLeft := Count;
  BufferPtr := @Buffer;
  while NumOfBytesLeft>0 do begin
    if ((FPosition<FCacheStart) or (FPosition>FCacheEnd))//the current position is outside the cache
    or (FPosition-FCacheStart=FCacheSize)//the cache is full
    then begin
      FlushCache;
      Assert(FCacheStart=FPosition);
    end;

    //write from Buffer to the cache
    NumOfBytesToCopy := Min(FCacheSize-(FPosition-FCacheStart), NumOfBytesLeft);
    CachePtr := FCache;
    inc(CachePtr, FPosition-FCacheStart);
    Move(BufferPtr^, CachePtr^, NumOfBytesToCopy);
    inc(Result, NumOfBytesToCopy);
    inc(FPosition, NumOfBytesToCopy);
    FCacheEnd := Max(FCacheEnd, FPosition);
    inc(BufferPtr, NumOfBytesToCopy);
    dec(NumOfBytesLeft, NumOfBytesToCopy);
  end;
  FFileSize := Max(FFileSize, FPosition);
end;
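The listing above calls two routines that it does not define and that are not part of the Delphi RTL: RaiseAssertionFailed (used both with no argument and with the function result as an argument) and GetSystemErrorString. The unit below is a minimal sketch of stand-ins that should let the classes compile. The unit name, the signatures, and the behaviour are all assumptions inferred from the call sites, not the answer author's own helpers.

```pascal
unit CachedStreamHelpers; // hypothetical unit name

interface

uses
  Windows, SysUtils;

// Stand-in for the undefined RaiseAssertionFailed; always raises.
procedure RaiseAssertionFailed; overload;
// Variant taking the caller's Result, so that a function whose whole body is
// this call still has its return value assigned (assumed purpose).
procedure RaiseAssertionFailed(out Res: Longint); overload;
// Stand-in for the undefined GetSystemErrorString.
function GetSystemErrorString(ErrorCode: DWORD): string;

implementation

procedure RaiseAssertionFailed;
begin
  raise EAssertionFailed.Create('Assertion failed');
end;

procedure RaiseAssertionFailed(out Res: Longint);
begin
  Res := 0;
  RaiseAssertionFailed;
end;

function GetSystemErrorString(ErrorCode: DWORD): string;
begin
  // SysErrorMessage wraps FormatMessage for Windows error codes.
  Result := SysErrorMessage(ErrorCode);
end;

end.
```

The unit holding the stream classes would presumably also need Windows, SysUtils, Classes and Math (for Min and Max) in its uses clause. Because the two-argument constructor is inherited as an overload, a read stream with a 64 KB cache could then be opened with something like TReadOnlyCachedFileStream.Create(FileName, 64*1024) and read with ordinary Read calls.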

I do millions of reads in a sequential manner. So, it looks like a buffered file will indeed help me. – Altar Apr 12 at 18:56

Thanks a lot for the code. I will try it now. +1 – Altar Apr 12 at 19:03

@Remy thanks for the edit, I always get the names of those functions wrong! – David Heffernan Apr 12 at 19:16

@David. There is a call to a function or procedure called 'RaiseAssertionFailed'. I cannot find it in Delphi. Without it the code won't compile. – Altar Apr 12 at 19:18

Lord sweet Jesus. Delphi class: 11.2 seconds. Your class: 1.6 seconds. – Altar Apr 12 at 19:41

The TFileStream class internally uses the CreateFile function, which always uses a buffer to manage the file unless you specify the FILE_FLAG_NO_BUFFERING flag (be aware that you can't specify this flag directly using TFileStream). For more information you can check these links: CreateFile Function, Windows File Buffering. You can also try TGpHugeFileStream, which is part of the GpHugeFile unit from Primoz Gabrijelcic.
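Since TFileStream does not expose the CreateFile flags, one way to pass them is to call CreateFile yourself and wrap the handle in a THandleStream. The sketch below assumes a read-only, sequentially scanned file and uses FILE_FLAG_SEQUENTIAL_SCAN as the example flag. FILE_FLAG_NO_BUFFERING could be passed the same way, but it imposes sector-alignment requirements on every read that this sketch does not handle. The function name OpenSequential is hypothetical, and in older Delphi versions the THandleStream constructor takes an Integer rather than a THandle.

```pascal
program HandleStreamSketch;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils, Classes;

// Hypothetical helper: opens a file for reading with an explicit CreateFile
// flag and returns a stream over the raw handle.
function OpenSequential(const FileName: string): THandleStream;
var
  Handle: THandle;
begin
  Handle := CreateFile(PChar(FileName), GENERIC_READ, FILE_SHARE_READ, nil,
    OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL or FILE_FLAG_SEQUENTIAL_SCAN, 0);
  if Handle = INVALID_HANDLE_VALUE then
    RaiseLastOSError;
  Result := THandleStream.Create(Handle);
end;

var
  Stream: THandleStream;
begin
  if ParamCount <> 1 then
    Exit;
  Stream := OpenSequential(ParamStr(1));
  try
    Writeln(Stream.Size);
  finally
    // THandleStream does not close the handle it wraps.
    CloseHandle(Stream.Handle);
    Stream.Free;
  end;
end.
```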

That's what RRUZ is implying, but in many cases it is simply not true. There is an overhead to calling ReadFile that becomes significant if you read small pieces at a time. – David Heffernan Apr 12 at 18:19

@Altar, no, I am not saying that. I am saying that TFileStream uses a buffer to hold the data, which works fine in most cases. Now, if you want to improve performance, you can write an object (class) from scratch that reads and writes the file using a bigger buffer, or use a class like TGpHugeFileStream. – RRUZ Apr 12 at 18:25

TFileStream doesn't use a buffer. It's just a lightweight wrapper around ReadFile/WriteFile. Windows has file caches which these API routines benefit from. – David Heffernan Apr 12 at 18:30

@RRUZ I don't think that counts as a buffer in this discussion! – David Heffernan Apr 12 at 18:35

@Altar in a now deleted comment you state that you need read/write access. Is that correct? – David Heffernan Apr 12 at 18:47

If you have this kind of code: while Stream.Position < Stream.Size do, cache Stream.Size in a variable and it will speed up. Stream.Size uses three virtual function calls to find out the actual size. Cheers.
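The loop-condition advice above can be sketched as follows. This is a minimal example, assuming the stream size does not change while the loop runs. The function name SumIntegers and the in-memory test stream are made up for illustration.

```pascal
program CacheSizeSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: sums the 32-bit integers in a stream, evaluating
// Stream.Size once instead of on every loop iteration.
function SumIntegers(Stream: TStream): Int64;
var
  Size: Int64;
  Value: Integer;
begin
  Result := 0;
  Size := Stream.Size; // cached; assumes nothing else resizes the stream
  while Stream.Position + SizeOf(Value) <= Size do
  begin
    Stream.ReadBuffer(Value, SizeOf(Value));
    Inc(Result, Value);
  end;
end;

var
  Stream: TMemoryStream;
  i, Value: Integer;
begin
  Stream := TMemoryStream.Create;
  try
    // Fill the stream with the integers 1 to 10 as test data.
    for i := 1 to 10 do
    begin
      Value := i;
      Stream.WriteBuffer(Value, SizeOf(Value));
    end;
    Stream.Position := 0;
    Writeln(SumIntegers(Stream));
  finally
    Stream.Free;
  end;
end.
```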

Thanks. Great idea. – Altar Apr 14 at 8:46.
