The live back-and-forth is the main point of what I'm asking for — I tried your cpdf (thanks for the mention; will add it to my list) and it too doesn't help; all it does is, somewhere 9000-odd lines into the JSON file, turn the part of the content stream corresponding to what I mentioned in the earlier comment into:
This is just a more verbose restatement of what's in the PDF file; the real questions I'm asking are:
- How can a user get to this part, from viewing the PDF file? (Note that the PDF page objects are not necessarily a flat list; they are often nested at different levels of “kids”.)
- How can a user understand these instructions, and “see” how they correspond to what is visually displayed on the PDF file?
This might actually be something very valuable to me.
I have a bunch of documents right now that are annual statutory and financial disclosures of a large institute, and they are just barely differently organized from each year to the next to make it too tedious to cross compare them manually. I've been looking around for a tool that could break out the content and let me reorder it so that the same section is on the same page for every report.