Embedding source into pdb, and having debugger(s) use it?

I've read over this and wanted to summarize my understanding for clarity.

Today the debugger uses the PDB to get the disk path and checksum of the file which was compiled to create a given section of an executable. The debugger then attempts to load the file using both the local disk and any available symbol server. Under this proposal we would skip the middleman by just embedding the file itself into the PDB.

Eureka, no more searching for source! As someone who's done their fair share of digging for source code in this manner I like the idea of having one package for all your debugging needs. There are a couple of facets to consider about this proposal though.

The first is the actual embedding of the source code into the PDB. This is very doable. The PDB is essentially a lightweight file database. There is structure to what it encodes, but AFAIK you can put whatever you want into certain slots (local variable values / types, for example). There may be size limitations for certain slots, but I'm sure you could invent an encoding scheme to break large files up into chunks.

The second facet is having the debugger actually load the file from the PDB vs. searching for it on disk. I'm not as familiar with that part of the debugger, but from what I understand it only uses 2 pieces of information to locate the file:

- The path to the file on disk
- The checksum of said file (used to disambiguate files with the same name)

I'm fairly certain this is the only information it passes on to a symbol server. This makes it unfeasible to implement this via a symbol server, because the server won't have access to the PDB (assuming of course I'm right). I dug around hoping there was a VS COM component you could override which would allow you to intercept the loading of the file for a given path, but I couldn't find one.
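To make the "path plus checksum" lookup concrete, here is a minimal Python sketch of the idea. It is not how the Visual Studio debugger is implemented. The function names, the choice of SHA-1, and the directory search order are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path
from typing import Iterable, Optional


def file_checksum(path: Path, algorithm: str = "sha1") -> str:
    """Return the hex digest of a file's bytes (algorithm is an assumption)."""
    return hashlib.new(algorithm, path.read_bytes()).hexdigest()


def locate_source(recorded_path: str, recorded_checksum: str,
                  search_dirs: Iterable[Path],
                  algorithm: str = "sha1") -> Optional[Path]:
    """Hypothetical lookup: try the recorded path first, then same-named
    files under each search directory, accepting only a checksum match."""
    candidates = [Path(recorded_path)]
    # Take the file name from either a Windows-style or POSIX-style path.
    name = recorded_path.replace("\\", "/").rsplit("/", 1)[-1]
    for directory in search_dirs:
        candidates.extend(sorted(Path(directory).rglob(name)))
    for candidate in candidates:
        # The checksum disambiguates files that share a name.
        if candidate.is_file() and file_checksum(candidate, algorithm) == recorded_checksum:
            return candidate
    return None
```

The point of the sketch is that nothing in the lookup refers back to the PDB itself: only the path and checksum are carried forward, which is why a server that receives just those two values has no way to read embedded source.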

One approach I think would be feasible though would be:

- Embed the source in the PDB
- Have a tool which can both extract the source to a known location and rewrite the PDB to point to that place

This wouldn't be quite what you want though.
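The extract-and-rewrite approach above could look roughly like the following Python sketch. It does not read or write a real PDB. `EmbeddedSource` is a hypothetical stand-in for whatever record the embedding step would store, and the function only produces the old-path to new-path mapping that a PDB-rewriting tool (not shown) would then apply.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Iterable


@dataclass(frozen=True)
class EmbeddedSource:
    """Hypothetical record for one source file stored in a PDB."""
    original_path: str  # path recorded at compile time
    content: bytes      # raw bytes of the source file


def extract_and_remap(sources: Iterable[EmbeddedSource],
                      dest_root: Path) -> Dict[str, str]:
    """Write each embedded file under dest_root and return old path -> new path."""
    mapping: Dict[str, str] = {}
    for source in sources:
        digest = hashlib.sha1(source.content).hexdigest()
        # Take the file name from either a Windows-style or POSIX-style path.
        name = source.original_path.replace("\\", "/").rsplit("/", 1)[-1]
        # Keying the directory on the checksum keeps same-named files apart.
        target = dest_root / digest / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(source.content)
        mapping[source.original_path] = str(target)
    return mapping
```

Because the bytes are written unchanged, the checksum recorded in the PDB would still match the extracted file, so only the path entries would need rewriting.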

If I write the source files into the pdb, all I need to do is also include sufficient metadata (checksum, filepath, encoding, etc). Also, it seems like there must be more information than that written (or at least that can be written) since source server seems to work with a full server path and version number. The 'step 2' was a source server locally (in-proc as extension) that could read from pdb. – James Manning Sep 8 at 4:10

BTW, it didn't occur to me at the time, but if the pdb format allows (or doesn't have a problem with) writing files to alternate data streams, and if the APIs used by the debugger are ones that support ADS, then it seems like one approach that might Just Work would be rewriting the file paths in the pdb to point to the pdb itself with the necessary ADS identifier. IOW, it starts out as foo/bar/baz/SomeClass.cs in the pdb, but the rewrite makes it path/on/disk/SomeAssembly.pdb:foo/bar/baz/SomeClass.cs - no idea if that would work, though. – James Manning Sep 8 at 4:18

In terms of the 'extract source, rewrite PDB', if the extensibility API exists such that an extension could get notified when a particular assembly's symbols are about to get loaded, it could at least do this on a 'just in time' / as-needed basis. That would even potentially be better since it could even notice if the original source file has the same checksum as what would get extracted and just leave it alone (potentially allowing things like Edit-and-Continue to work, which I'm guessing the 'extracted to temp' source would not). – James Manning Sep 8 at 4:23

Your summary is exactly right, although the 'using both the local disk and available symbol server' is confusing to me, as I would have expected it to be 'available source server' instead. My mental model (which may be wildly inaccurate) is that the loading of the symbols (\\symbols\symbols!) is one thing but then loading the source is completely decoupled from where/how the pdb was loaded. Aside from that confusion on my part, it's dead on. :) – James Manning Sep 8 at 4:34

@James the idea of multiple streams may work. I can't think of a reason it wouldn't but I'm not very knowledgeable about streams either. – JaredPar Sep 8 at 15:16
