If by "checking" you mean examining the archive structure to determine whether it's corrupt, and by "extracting" you mean writing the archive's contents to disk, then they are fundamentally the same thing. The only difference is that checking sends the content to /dev/null instead of a file.
As to why they're writing the contents to disk, I can only speculate. Perhaps they're using a library that doesn't expose an "extract to memory" feature, or maybe it's an anti-zipbomb measure to avoid out-of-memory/denial-of-service attacks.
Have you ever done any work in this area? Because it sounds like you know what you’re talking about, except it’s all nonsense.
Zip format can be de/compressed progressively, which is one reason why it’s nice for HTTP transport encoding. The file format is decompressed one record at a time and many or most libraries can give you this as a stream, so it never has to hit disk or be “sent to dev/null”.
If you take responsibility for streaming the records to disk (trivial), then you can check the canonical path before writing, and any other filesystem sanity tests you want to do.
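A minimal sketch of that idea in Python, assuming the standard `zipfile` module; the function name `safe_extract` and the choice to raise `ValueError` are mine, not anything from the software being discussed. Each record is streamed to disk by hand, and the resolved target path is checked against the destination root before anything is written.

```python
import os
import shutil
import zipfile

def safe_extract(archive, dest_dir):
    """Stream each member to disk, rejecting paths that escape dest_dir.

    `archive` may be a filename or a seekable file-like object.
    (Hypothetical helper, for illustration only.)
    """
    dest_root = os.path.realpath(dest_dir)
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            # Resolve "..", absolute names and symlinks before comparing.
            target = os.path.realpath(os.path.join(dest_root, info.filename))
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError(f"member escapes destination: {info.filename!r}")
            if info.is_dir():
                os.makedirs(target, exist_ok=True)
                continue
            os.makedirs(os.path.dirname(target), exist_ok=True)
            # Decompress as a stream; the whole member is never held in memory.
            with zf.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
```

Any other filesystem sanity test (size limits, name filters) would go next to the path check, before the file is opened for writing.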
Last year I implemented zip reading and zip writing in a hobby project of mine. I'm not an expert, but I know enough to write a working zip reader/writer.
> Zip format can be de/compressed progressively, which is one reason why it’s nice for HTTP transport encoding.
Do you mean HTTP transfer encoding? If so, then it's not the zip archive format that's used, but rather the deflate compression algorithm (which zip also uses).
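To illustrate the distinction, here is a rough sketch using Python's `zipfile`, `struct` and `zlib` modules. The helper name `raw_member_bytes` is made up for this example. It pulls the compressed bytes of one member straight out of the archive by reading the name and extra-field lengths from the 30-byte fixed local file header, and those bytes turn out to be a plain raw deflate stream that `zlib` can inflate on its own.

```python
import io
import struct
import zipfile
import zlib

def raw_member_bytes(archive_bytes, name):
    """Return the still-compressed bytes of one zip member.

    (Hypothetical helper; assumes the whole archive is in memory.)
    """
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        info = zf.getinfo(name)
    # Name length and extra-field length sit at offsets 26 and 28
    # of the fixed 30-byte local file header.
    name_len, extra_len = struct.unpack_from(
        "<HH", archive_bytes, info.header_offset + 26
    )
    start = info.header_offset + 30 + name_len + extra_len
    return archive_bytes[start:start + info.compress_size]

# Usage: build a small deflate-compressed archive, then inflate by hand.
payload = b"abc" * 1000
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("a.txt", payload)

raw = raw_member_bytes(buf.getvalue(), "a.txt")
# wbits=-15 means "raw deflate, no zlib or gzip wrapper".
assert zlib.decompress(raw, -15) == payload
```

The zip container (local headers, central directory) is the part that HTTP does not use; only the compression algorithm is shared.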
> The file format is decompressed one record at a time
But not necessarily in the order they appear.
> many or most libraries can give you this as a stream, so it never has to hit disk or be “sent to dev/null”.
My point is that the compressed bytes have to be decompressed and checksummed in both extraction and checking, but after that the bytes may either be written or discarded.
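Roughly what I mean, sketched with Python's `zipfile` module (the function `drain_archive` and its `open_sink` callback are names I made up for this example). The same loop decompresses every member, and `zipfile` verifies the CRC-32 when the end of each member is reached. The only thing that differs between "check" and "extract" is whether a sink is supplied.

```python
import io
import zipfile

def drain_archive(archive_bytes, open_sink=None, chunk_size=64 * 1024):
    """Decompress every member and return the total uncompressed size.

    With open_sink=None the bytes are discarded ("checking").
    Otherwise open_sink(name) must return a writable object ("extracting").
    A CRC mismatch surfaces as zipfile.BadZipFile while reading.
    """
    total = 0
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            sink = open_sink(info.filename) if open_sink else None
            with zf.open(info) as src:
                while True:
                    chunk = src.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
                    if sink is not None:
                        sink.write(chunk)  # the only "extract"-specific step
    return total
```

Calling `drain_archive(data)` is the check; calling it with a callback that opens files (or in-memory buffers) is the extraction.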
> If you take responsibility for streaming the records to disk (trivial), then you can check the canonical path before writing, and any other filesystem sanity tests you want to do.
That's true, but there's nothing wrong with the paths in this case.