Non-Contiguous Struct (PDB format)

Alloth · January 20, 2025, 10:12pm

Hello,

I’m trying to write a binary template that parses Microsoft’s PDB debug files. However, the format is paged, meaning a data stream of structures can be split across non-contiguous, fixed-size pages. I can coalesce the pages into a contiguous local uchar array, but I can’t do anything else with that.

I know you can FSeek inside a structure definition, but in this case I don’t know where exactly it will be split. Is there a way to get around this limitation?

Cheers,
Alloth

sweetscape · January 21, 2025, 3:27pm

Unfortunately, there is no easy way to handle this type of data in Binary Templates right now. We are planning on adding some extensions in the future to handle these types of files, but we are still working out how they will operate. There is a template for Microsoft DOC files in the repository that works by assuming the file has first been defragmented (all the blocks put in order); I’m not sure whether PDB could use a similar technique. Another option would be to use a short script to write all the blocks in the PDB file to a temporary file in order, and then run a template on the temporary file. This is of course not ideal, and we hope to have something to handle these types of files before too long.

Graeme
SweetScape Software

Alloth · January 21, 2025, 6:04pm

It’s good to know that I wasn’t missing anything obvious and I look forward to this new functionality. Any time I write an encoder or decoder for a file format I like to create a corresponding template because 010’s interface is much nicer to look at than text dumps, and this is the first time I couldn’t figure out how to accomplish that. As for the defragmented temporary file, I think that’s my best option, but I’ll do that outside the editor as a pre-processing step.
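For anyone taking the same pre-processing route, here is a minimal Python sketch of the defragmentation step. It assumes the page size, the ordered list of page indices for a stream, and the stream's byte length have already been read from the file's own header and directory; the function name `coalesce_pages` and its parameters are hypothetical, and the sketch does not parse the PDB layout itself.

```python
from typing import Sequence


def coalesce_pages(data: bytes, page_size: int,
                   page_indices: Sequence[int], stream_length: int) -> bytes:
    """Concatenate the listed fixed-size pages in order, trimmed to stream_length.

    All four arguments are assumptions supplied by the caller; nothing here
    is read from the PDB header or directory.
    """
    if stream_length > len(page_indices) * page_size:
        raise ValueError("stream_length exceeds the capacity of the listed pages")
    out = bytearray()
    for index in page_indices:
        start = index * page_size
        page = data[start:start + page_size]
        if len(page) < page_size:
            # The sketch assumes the file is a whole number of pages.
            raise ValueError(f"page {index} lies outside the file")
        out += page
    # The last page is usually only partly used, so drop the padding.
    return bytes(out[:stream_length])


if __name__ == "__main__":
    # Synthetic file: four 4-byte pages; the stream occupies pages 3 then 1.
    data = b"AAAABBBBCCCCDDDD"
    print(coalesce_pages(data, 4, [3, 1], 6))  # b'DDDDBB'
```

Writing the returned bytes to a temporary file gives a contiguous stream that an ordinary template can parse from offset 0.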

k77 · May 4, 2025, 12:47pm

Hello - I ran into this problem recently. For now I am using a script to pre-process the file. The script appends all the non-contiguous segments onto the end of the file in a single contiguous block (same as the temp file approach, really).
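A rough Python sketch of that append approach, done outside the editor, might look like the following. The function name `append_stream` and its parameters are hypothetical, and the page size, page order, and stream length are assumed to be known already. It writes to a copy rather than modifying the original file.

```python
import tempfile
from pathlib import Path
from typing import Sequence


def append_stream(src: Path, dst: Path, page_size: int,
                  page_indices: Sequence[int], stream_length: int) -> int:
    """Copy src to dst with the reassembled stream appended.

    Returns the offset in dst where the contiguous stream begins.
    The page layout arguments are assumptions supplied by the caller.
    """
    data = src.read_bytes()
    # Gather the pages in stream order, then trim the padding of the last page.
    stream = b"".join(
        data[i * page_size:(i + 1) * page_size] for i in page_indices
    )[:stream_length]
    if len(stream) != stream_length:
        raise ValueError("listed pages do not cover stream_length bytes")
    dst.write_bytes(data + stream)
    return len(data)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp) / "in.bin", Path(tmp) / "out.bin"
        src.write_bytes(b"AAAABBBBCCCCDDDD")
        offset = append_stream(src, dst, 4, [2, 0], 5)
        print(offset, dst.read_bytes()[offset:])  # 16 b'CCCCA'
```

A template run on the output file can then FSeek to the returned offset and declare the structures there as if they were contiguous.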

Initially I was hoping I could read these blocks into a byte array and then assign that memory to a struct… chat gippity thought it was possible, but I couldn’t get it to work.

Cheers

sweetscape · May 6, 2025, 2:03pm

There is currently no way to do this, but we are planning on adding one, hopefully sometime later this year. Cheers!

Graeme
SweetScape Software

GLT · May 10, 2025, 2:36pm

I’ve been working with an object file format where non-contiguous structs or arrays might be useful for collecting disjoint data into a unified view in the template results. I considered writing section descriptors and content to a temp file, but after sketching out some code for that, I decided that I could get by with the fragmented but in-file-order template results.

Welcome to the forums, and thanks for helping to bring attention to this concept.