Dllicious – shared object usage analysis on Linux

matthias509 · on Feb 6, 2022

Saving disk space isn’t nearly as exciting as saving CPU cache space which is much more precious. That means that something like libc will almost always be in cache from something calling it.

5a04fcd55b8 · on Feb 6, 2022

I think dynamic linking is great for saving (DRAM) memory, but for caches I am having trouble imagining a case in which this would make a practical difference.

You would need two distinct, long-lived dynamically-linked executables, both being hot at the same time, and both having libc (or another common library) in the hotpath at the same time.

To elaborate,

- If two hot processes are the same executable, static linking would yield the same results: They will be mmap()ed to the same physical addresses (and I'm assuming physical indexing or equivalent here, otherwise ASLR really negates any dynamic linking advantage wrt caches).

- If the processes are not hot or not long-lived, everything will be dominated (by orders of magnitude) by media access, context switches, and virtual page allocation.

- Sure, we could imagine a non-hot process trashing the cache for a hot one, and doing it less so with dynamic linking. But again, context switch is the bigger concern. A complete cache flush would make little difference here.

- And I'm not even going to touch on the various cache-timing vuln mitigations out there.

That being said, I am a staunch proponent of dynamic linking being the general case, for the memory and storage savings. Security is a common argument too, and although I'm not expert enough to have an opinion on this one, I do love the traditional distro approach to packages.

azalemeth · on Feb 6, 2022

> You would need two distinct, long-lived dynamically-linked executables, both being hot at the same time, and both having libc (or another common library) in the hotpath at the same time.

Like, for example, bash and a process you are executing in parallel in a loop?

5a04fcd55b8 · on Feb 6, 2022

I may be missing something... Is bash exec()ing something in the loop? If so, caches will be the least of our concerns; there will be context switches, system calls, and even I/O, all orders of magnitude slower than even DRAM access. Otherwise, what is bash doing in the loop? When is bash ever CPU-bound?... in a library function call... that is also called by the other process?

matthias509 · on Feb 6, 2022

Imagine the other extreme where every single executable on the system statically linked libc. Assuming libc calls are pretty common(which I think is safe) I would expect the cache to get trashed.

vvanders · on Feb 6, 2022

The upper bound is interesting but my limited intuition is it's a very upper bound. LTO/LTCG will drive that down a lot. Last time I was looking at Rust binary sizes things like building the stdlib with heavy LTO/opt size had some pretty big influences on final binary size.

gbin · on Feb 6, 2022

The size computation is overblown no? A static link would probably remove most of the symbols as they are probably unused?

zeroping · on Feb 6, 2022

The author specifically called that out. You are correct.

They also didn't look at DLLs used by other DLLs.

This is like a back-of-the-envelope calculation. A quick first pass to get an order of magnitude.

Vogtinator · on Feb 6, 2022

> They also didn't look at DLLs used by other DLLs.

If so, then the data is too wrong to be of any value.

However, ldd lists also indirect dependencies so I think this was already accounted for (though that might've been by accident):

~> readelf -d /bin/bash | grep NEEDED 0x0000000000000001 (NEEDED) Shared library: [libreadline.so.8] 0x0000000000000001 (NEEDED) Shared library: [libc.so.6] ~> ldd /bin/bash linux-vdso.so.1 (0x00007fff15191000) libreadline.so.8 => /lib64/libreadline.so.8 (0x00007f2a24818000) libc.so.6 => /lib64/libc.so.6 (0x00007f2a2460e000) libtinfo.so.6 => /lib64/libtinfo.so.6 (0x00007f2a245db000) /lib64/ld-linux-x86-64.so.2 (0x00007f2a249b1000)

mid-kid · on Feb 7, 2022

One thing I noticed is that you're running `ldd` on every executable file on the system. Incidentally, `ldd` returns valid output when ran on .so files (which are always executable). This means you're counting an extra use of certain libraries not just for the executables that use them but also the libraries that depend on them.

Also it seems like you're also scanning a bunch of flatpak apps and steam apps, which provide an environment for common libraries, but expect each program to statically link anything not shipped within these common environments.

Both of these skew the graph of most used libraries, and the former skews the additional space required for static linking a little.

It's also worth mentioning that inlining, the lack of a requirement for symbol names/version info/dependency info, and dead code stripping, make static binaries significantly smaller than the size of program + so.

I'm a huge proponent of shared libraries, but these numbers wouldn't really mean much in reality, unfortunately. Though for me, the main benefit of shared libraries is being to individually patch and upgrade them, without a care for how many apps actually use them nor how. This has been a godsent in terms of adding functionality and making my computer behave like I want it to.

fabianhjr · on Feb 6, 2022

On my NixOS computer filtering for the most used:

  10093 libcap.so.2
  49344 libc.so.6
  27555 libdl.so.2
  13296 libffi.so.8
  16112 libgcc_s.so.1
  10086 libgcrypt.so.20
  11352 libglib-2.0.so.0
  10128 libgpg-error.so.0
  13015 liblzma.so.5
  22664 libm.so.6
  11337 libpcre.so.1
  29942 libpthread.so.0
  11036 libresolv.so.2
  19510 librt.so.1
  17287 libz.so.1
  11362 libzstd.so.1

Though, some usages are not the same as others even if they have the same ABI so probably hurting performance a bit. :P

aij · on Feb 6, 2022

Is that having GCed all but the current generation of packages? Or filtering for a single nixpkgs version some other way?

Even then, I would expect NixOS to do some unsharing of shared libraries, so it would probably be best to use the full path to determine how much each shared object is actually shared.

For example, I have 87 different `libzstd.so.1` in my `/nix/store` right now.

mprovost · on Feb 6, 2022

Sorted:

  49344 libc.so.6
  29942 libpthread.so.0
  27555 libdl.so.2
  22664 libm.so.6
  19510 librt.so.1
  17287 libz.so.1
  16112 libgcc_s.so.1
  13296 libffi.so.8
  13015 liblzma.so.5
  11362 libzstd.so.1
  11352 libglib-2.0.so.0
  11337 libpcre.so.1
  11036 libresolv.so.2
  10128 libgpg-error.so.0
  10093 libcap.so.2
  10086 libgcrypt.so.20

choeger · on Feb 6, 2022

> DLL’s add a level of complexity to writing and using software, and newer languages like Rust and Go have eschewed them,

Yeah, no. Every sane compiler developer would always prefer DLLs over static linking because of modularity. It makes the compiler much smaller if you can defer the combination of modules to a linker and thus have proper separate compilation.

Unfortunately, separate compilation with a C-like ABI only works well with a relatively limited monomorphic language. All other languages would have to leave quite a few optimizations on the table or make some suboptimal choices for uniform object representation (compare OCaml's ABI with Rust or Haskell).

Consequently, a modern language would face the tremendous (but I think interesting) task to extend the C-ABI in a way that allows these optimizations and separate compilation. Understandably, developers then play down the importance of separate compilation and choose a global approach.

adrian_b · on Feb 6, 2022

I do not understand what you mean.

There is no relationship between whether static or dynamic linking is used and modularity.

When a program is decomposed in modules, those are compiled separately and the complete executable program is made by linking, either statically or dynamically.

The static vs. dynamic option does not influence the semantics of the program regarding modularity.

Dynamic linking offers a few extra features, e.g. delaying the linking of a library to some time after a program starts and choosing one between more libraries at that time, but exactly the same functionality can be implemented in a statically linked program (using pointers to functions), with no difference in behavior (but with different costs in memory space and execution time; which costs are larger will be different for each particular case).

Actually a compiler that targets only static linkers will be slightly smaller, because many of the standards for dynamic linking, e.g. the UNIX SVR4/ELF ABI, require the compiler to emit additional instructions and data structures whenever external variables or functions are accessed, in comparison with the case when only static linking is used.

Dynamic linking is an additional complication for the compiler, not a simplification. Proper separate compilation of modules is the easiest with only static linking, when the compiler just has to emit appropriate relocation and linking data (which was actually the job of an assembler for the traditional UNIX compilers, which generated an assembly program, not an ELF object file), besides what it needs to do for compiling a monolithic program.

choeger · on Feb 6, 2022

> When a program is decomposed in modules, those are compiled separately

No, they're not. At least not in languages with nontrivial features and optimizations. Consider the identity function id = \\x.x in some module A and it's usage A.id 42 in some module B. Unless you commit to uniform object representation, excluding several optimizations, there's no way to compile A separately from it's use in B (because specialization of A.id is required). That fact excludes the option of creating dynamic libraries because you would expect the dynamic library compiled from A to be used in stead of A (with the necessary type interface data). Similar problems occur with polymorphic data structures.

Separate compilation, modularity, and dynamic linking are all aspects of the same problem.

I think that's why Rust uses a global compilation approach and if I am not completely mistaken, only Swift tries to have dynamic linking with polymorphism.

gspr · on Feb 6, 2022

I don't think the parent meant to use "modularity" to mean "pertaining to a programming language's concept of modules", but rather something along the lines of "pertaining to the act of splitting something into self-contained, clearly delineated, interoperating pieces".

choeger · on Feb 6, 2022

Exactly. Modularity on the frontend is dead simple. Modularity all the way down to executing machine code is hard.

jcranmer · on Feb 6, 2022

As a compiler developer, yeah, no, you're wrong. Dynamic linking creates several more challenges for compilation than static linking.

If you're trying to be cross-platform, you have the major issue that dynamic libraries have very different models on different platforms. Windows DLLs require all of the exported--and imported--names to be explicitly identified at compile time; MACH-O and ELF files don't require that. On the other hand, ELF files have symbol visibility which can be toggled to do something similar, but the standard expectations in C are completely different: every symbol is capable of being exported (or overridden--thanks symbol preemption!). (I don't have much familiarity with MACH-O's two-namespace system, so I won't talk about it further).

There are other issues. DSOs make it hard to use linker features such as arranging things in a section to make a distributed static list (e.g., for registering reflection stuff). Even doing static constructors in dynamic libraries is hard. Arranging for single static addresses (as C/C++ require) can be challenging. On Windows, malloc in one DLL can't be freed by free in another... sometimes. It also requires defining a stable ABI, since you have no reasonable expectation that the other side of the dynamic library is using the same compiler version you are. In general, crossing the dynamic library boundary is like doing FFI, given how little control you have over the process.

No, static linking is much easier. You don't have to hold in your head so much more crazy semantics with dynamic linking.

aij · on Feb 6, 2022

Static linking does not preclude separate compilation though. Go and Rust both use separate compilation with static linking by default (at least last I used them... it's been a while). I know Rust at least used to support dynamic linking too -- it just wasn't the default. And C also supports static linking of course.

choeger · on Feb 6, 2022

I don't think Rust uses separate compilation. And go just seems to not care about a stable ABI.

pjmlp · on Feb 6, 2022

It surely does, you can pack crates as rlibs.

throwaway984393 · on Feb 6, 2022

Excellent trolling. My only gripe:

> So huzzah, you now have some real data for your next Internet Argument

You don't need real data for internet arguments. Make something up and sound convincing, it works at least 60% of the time according to studies by a guy whose wiki blog I read once.

thayne · on Feb 6, 2022

This includes libc in the analysis, but even rust and go link against libc.