Signature files for document retrieval?

You might look at Semantic Hacker or Yahoo Term Extraction.

Firstly, let's clarify some terminology. A Digital Signature is intended to be equivalent to a handwritten signature (see en.wikipedia.org/wiki/Digital_signature for a better description and overview). When a digital signature is applied to a document, you get a higher level of assurance of the document's authenticity (you have a better idea of whether the document was forged or not).

The answers from Adam and Robert both refer to methods for verifying document integrity (that the document is unchanged). While a digital signature also provides this, a checksum (hash) does not provide authenticity. So it's important that we establish the needs of your "Signature file".
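To make that distinction concrete, here is a minimal Python sketch. The document contents and the helper name `checksum` are made up for illustration. It shows that a checksum catches a change only if the published checksum itself is trusted; anyone who alters the document can simply recompute the checksum, so a matching hash says nothing about who produced the file.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-1 hex digest of data (any hash would make the same point)."""
    return hashlib.sha1(data).hexdigest()


# Hypothetical document contents, for illustration only.
original = b"Pay Alice 10 dollars"
tampered = b"Pay Mallory 1000 dollars"

# Integrity: if the reader already trusts this published checksum,
# the altered document no longer matches it.
published = checksum(original)
detects_change = checksum(tampered) != published

# No authenticity: a forger who can replace the document can also
# replace the checksum, and the forged pair verifies just as well.
forged_checksum = checksum(tampered)
forgery_passes = checksum(tampered) == forged_checksum

print(detects_change, forgery_passes)
```

A digital signature closes this gap because producing a valid signature requires a private key the forger does not have.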

I will assume that you are talking about Digital Signatures rather than checksums, as the other answers address checksums. You will want to compose a PKCS#7 detached signature (jargon: a standard-format signature that does not contain the data, so it can be stored separately). To achieve this, I recommend you use a standard library such as OpenSSL (which is portable).
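As a rough sketch of what this could look like, the Python below drives the `openssl smime` command-line tool rather than the C library. The file names (`report.pdf`, `signer.pem`, and so on) and the helper names are assumptions for illustration, and it presumes you already have a signing certificate and private key in PEM format.

```python
import os
import subprocess


def sign_command(document, cert, key, signature):
    """Build an openssl command that writes a detached PKCS#7 signature (DER).

    'openssl smime -sign' produces a detached signature unless -nodetach
    is given; -binary stops openssl from rewriting line endings.
    """
    return ["openssl", "smime", "-sign", "-binary",
            "-in", document, "-signer", cert, "-inkey", key,
            "-outform", "DER", "-out", signature]


def verify_command(document, signature, ca_file):
    """Build an openssl command that checks a detached signature against a document."""
    return ["openssl", "smime", "-verify", "-binary",
            "-inform", "DER", "-in", signature, "-content", document,
            "-CAfile", ca_file, "-out", os.devnull]


def run(command):
    """Run the command; raises CalledProcessError if openssl reports failure."""
    subprocess.run(command, check=True)


# Example usage (hypothetical file names):
# run(sign_command("report.pdf", "signer.pem", "signer.key", "report.pdf.p7s"))
# run(verify_command("report.pdf", "report.pdf.p7s", "ca.pem"))
```

The signature file can then be stored or indexed separately from the document itself, which is the point of using the detached form.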

For more information on PKCS#7, see http://www.rsa.com/rsalabs/node.asp?id=2129. For more information on OpenSSL, see http://www.openssl.org.

Md5sum might be what you are looking for. Source code for generating MD5 signatures is available if you Google around. From Wikipedia: Because almost any change to a file will cause its MD5 hash to also change, the MD5 hash is commonly used to verify the integrity of files (i.e., to verify that a file has not changed as a result of file transfer, disk error, meddling, etc.). The md5sum program is installed by default in most Unix, Linux, and Unix-like operating systems or compatibility layers. BSD variants (including Mac OS X) have a similar utility called md5.

Versions for Microsoft Windows do exist.
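If you would rather compute the hash in code than shell out to md5sum, a minimal Python sketch using the standard `hashlib` module could look like the following. The function name `md5_of_file` and the chunk size are arbitrary choices for illustration; the hex string it returns should match the first column of md5sum's output for the same file.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the MD5 hex digest of the file at path.

    The file is read in chunks so large documents do not have to
    fit in memory all at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Example usage (hypothetical file name):
# print(md5_of_file("report.pdf"))
```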

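Similarly to Adam's suggestion, if you're working with a very large number of documents, it might be a good idea to check out SHA1 and sha1sum. SHA1 produces a longer digest than MD5 (160 bits versus 128), so accidental collisions are less likely. Note that both are hash functions, not encryption.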
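For a large collection, one way to use such a hash is as a lookup key: compute the SHA1 of every document once, then group paths by digest. The sketch below is only an illustration under that assumption; the function names are made up, and it uses Python's standard `hashlib` rather than the sha1sum program.

```python
import hashlib
from collections import defaultdict


def sha1_of_file(path, chunk_size=65536):
    """Return the SHA-1 hex digest of the file at path, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_digest(paths):
    """Map each SHA-1 digest to the list of paths whose contents produce it.

    Paths that end up in the same list have byte-identical contents,
    barring a hash collision.
    """
    groups = defaultdict(list)
    for path in paths:
        groups[sha1_of_file(path)].append(path)
    return dict(groups)


# Example usage (hypothetical file names):
# index = group_by_digest(["a.txt", "b.txt", "copy_of_a.txt"])
```

Keep in mind that this only finds exact duplicates; changing a single byte of a document gives a completely different digest.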

I can't really give you an answer, but what I can give you is a way to a solution: you have to find the angle that you relate to or that piques your interest. A good paper is one that people get drawn into because it reaches them in some way. As for me, when I think of WWII, I think of the Holocaust and the effect it had on the survivors, their families, and those who stood by and did nothing until it was too late.
