What kind of disks, network etc.? So many questions...

There's a really good blog post by Paul Randal on SQL Server 2008: FILESTREAM Performance - check it out. There's also a 25-page whitepaper on FILESTREAM available, which also covers some performance tuning tips.

But also check out the Microsoft Research TechReport "To BLOB or Not To BLOB" at: research.microsoft.com/apps/pubs/default... It's a very thorough, well-grounded article that puts all those questions through their paces. Their conclusion:

The study indicates that if objects are larger than one megabyte on average, NTFS has a clear advantage over SQL Server. If the objects are under 256 kilobytes, the database has a clear advantage. Inside this range, it depends on how write-intensive the workload is, and the storage age of a typical replica in the system.

So judging from that: if your blobs are typically less than 1 MB, just store them as VARBINARY(MAX) in the database. If they're typically larger, use the FILESTREAM feature.

I wouldn't worry so much about performance as about the other benefits of FILESTREAM over "unmanaged" storage in an NTFS folder. If you store files outside the database without FILESTREAM, you have no control over them:

- no access control provided by the database
- the files aren't part of your SQL Server backup
- the files aren't handled transactionally, e.g. you could end up with "zombie" files which aren't referenced from the database anymore, or "skeleton" entries in the database without the corresponding file on disk

Those features alone make it absolutely worthwhile to use FILESTREAM.
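The study's thresholds can be captured as a tiny decision helper. This is a minimal sketch: the 256 KB and 1 MB cut-offs come straight from the paper quoted above; the function and return values are my own naming, not anything from the paper or from SQL Server.

```python
# Decision rule from the "To BLOB or Not To BLOB" study:
#   under 256 KB -> store in the database (VARBINARY(MAX))
#   over 1 MB    -> store in the filesystem (FILESTREAM / NTFS)
#   in between   -> depends on write intensity and object lifetime

KB = 1024
MB = 1024 * KB

def storage_recommendation(avg_object_bytes: int) -> str:
    """Recommend a BLOB storage strategy based on average object size."""
    if avg_object_bytes < 256 * KB:
        return "database"      # DB has the clear advantage
    if avg_object_bytes > 1 * MB:
        return "filesystem"    # NTFS has the clear advantage
    return "depends"           # workload- and age-dependent gray zone

print(storage_recommendation(100 * KB))  # small objects -> "database"
print(storage_recommendation(5 * MB))    # large objects -> "filesystem"
```

Note the gray zone in the middle: the paper deliberately refuses a single answer there, because write churn and fragmentation over time shift the balance.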
That's the white paper I was trying to remember: "FILESTREAM Storage in SQL Server 2008" – Remus Rusanu Dec 29 '09 at 21:59

Thanks for the response. If a web site accesses FILESTREAM files via the streaming API, what configuration must be done at the firewall to enable that traffic? Right now we open port 1433, but that is it. – John Dec 30 '09 at 23:38
+1, you more vividly paint the picture of the difference the underlying hardware will make :-) – marc_s Dec 29 '09 at 22:00
Reading a FILESTREAM over Win32 is quite fast. See Managing FILESTREAM Data by Using Win32. You should follow the FILESTREAM best practices though.
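The Win32 access pattern described there boils down to two T-SQL calls before the client opens the file: fetch the BLOB's logical UNC path with `PathName()`, and obtain a transaction token with `GET_FILESTREAM_TRANSACTION_CONTEXT()`, both of which are then handed to the `OpenSqlFilestream` Win32 API. A minimal sketch of the T-SQL side, assuming a hypothetical `Documents` table with a FILESTREAM column `FileData` (the table, column, and key names are mine; the two T-SQL functions are the real ones documented for FILESTREAM):

```python
# Sketch: the T-SQL statements a client issues before calling the
# Win32 OpenSqlFilestream API. Table/column names (Documents, FileData,
# DocId) are hypothetical; PathName() and
# GET_FILESTREAM_TRANSACTION_CONTEXT() are the documented T-SQL functions.

def filestream_access_sql(table: str, column: str, key_column: str) -> dict:
    """Build the queries needed to open a FILESTREAM value over Win32."""
    return {
        # Logical UNC path of the BLOB; only valid inside a transaction.
        "path": f"SELECT {column}.PathName() FROM {table} "
                f"WHERE {key_column} = ?",
        # Token tying the Win32 file handle to the current transaction.
        "txn_context": "SELECT GET_FILESTREAM_TRANSACTION_CONTEXT()",
    }

queries = filestream_access_sql("Documents", "FileData", "DocId")
print(queries["path"])
```

Both statements must run inside the same explicit transaction that stays open while the Win32 handle is used; that is what gives FILESTREAM its transactional consistency over plain NTFS files.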
After all, this is what powers SharePoint, and Microsoft would not bet something as important as Office (== SharePoint) on underperforming storage. There are some case studies and white papers around FILESTREAM; I could only dig out "McLaren Electronics Fuels Analysis of Formula One Racing Data with SQL Server", but I know there are more with more detailed numeric data. If I recall correctly, it shows that FILESTREAM in general shadows SMB performance by a factor of about 90-95%, above a certain file size. For small files, the overhead of obtaining the FILESTREAM API handle starts to show up.

I'd also second Marc in recommending the research paper on the topic (there is also a Channel 9 interview with Catharine van Ingen, available on iTunes podcasts too, where she speaks about this work), but bear in mind that the paper was published in 2006, before FILESTREAM was officially released, so it does not consider the FILESTREAM specifics.

As for your second question: asking about performance by specifying only the load and not the capacity of the system is nonsense. A 128-CPU Superdome with a mountain of storage SANs won't even notice your load. A SQL Server instance running on a 256 MB laptop with a mountain of spyware won't even get to see your load...