DOCUMENT ID: 1551-02
SYNOPSIS: Solaris shared library Q&A
OS RELEASE:
PRODUCT: Solaris
KEYWORDS: shared library libraries object
DESCRIPTION: What are shared libraries in Solaris?
SOLUTION:

Copyright 1995 Phillip Vandry
Distribute freely

------------------------------------------------------------------

Definitions:

Application binary:
static linking:
dynamic linking:

Q. What is a shared library?

A. A shared library, also called a shared object file or dynamic library, is a file in the ELF file format (see elf(3E)) that contains data and usually code needed by application binaries. Together with any other shared libraries the application binary depends on, it is transparently loaded into memory when the binary is executed, making itself available to the binary.

Q. What are the advantages of shared libraries?

A. The major advantage of a shared library is that it may contain code that more than one binary depends on. This code is stored only once, in the shared library, and not in each individual binary. This saves storage space on disk, especially for very common code that nearly every program needs. But the benefits go further than the space savings: if the common code needs to be updated (for example, a patch to fix a bug), this need only be done once, in the shared library. (There are more reasons, covered below.)

Another important reason to use shared libraries is that programs which use the non-shared versions of the system-supplied libraries, such as libc, are not supported and are not ABI compliant. See "Why does my program, statically linked under a different version of Solaris, dump core?" and "Why do I get unresolved references when trying to link statically?"

Q. What are the disadvantages of shared libraries?

A. It takes time for a program to load all of the shared library files it needs and to link them to the main program and to other shared libraries (a process called dynamic linking). The program will therefore take longer to start up.

Programs that depend on shared libraries will not work if any of the libraries they depend upon are unavailable. For people who distribute software, this often means restricting themselves to the shared libraries that are sure to be present on every system. Unfortunately, nothing is for sure, and according to Murphy's law there will always be someone who lacks a given library, or who has a different version than the one the developers used, and that version doesn't work properly.

Arguably, it is cleaner (more elegant) to have a program that is self-contained and requires nothing but itself.

Finally, the system must be able to locate the required libraries wherever they may be, which is, again, error prone. (See "What is LD_LIBRARY_PATH?")

Q. What systems can use shared libraries?

A. The model of shared libraries described in this document comes from AT&T's System V UNIX operating system and is used in derivatives of this OS, including Solaris. Other systems may implement shared libraries slightly differently, but the concept is always the same. Shared libraries resemble Microsoft Windows .DLL (Dynamic Link Library) files.

Q. What does a shared object file look like?

A. Shared object files have the .so (for shared object) extension, often followed by a numerical extension. The file(1) command will tell you whether a file is a shared library (it will say "dynamic lib") by examining its contents.

Q. How are shared libraries used in Solaris?

A. Almost all of the binaries that come with Solaris (such as the ones found in the /usr/bin directory) are dynamically linked, which means they depend on shared libraries to execute. The only exceptions are a few critical system programs that may need to (or must) run at times when the shared libraries cannot be used, such as very early in the boot process. Note that technically, the kernel is such an exceptional program.

Solaris also uses shared libraries in a way that goes beyond loading a set of required libraries at startup, as described in the first question. By their nature, shared libraries can be opened and imported into a running program at any time.

As an example of this, take the getpwnam(3C) function. This function returns information about a user, including her user id and the name of her home directory. It looks in the /etc/nsswitch.conf file (see nsswitch.conf(4)) to determine what access method to use to find this information. It might look in files (the /etc/passwd file), NIS (Network Information Service, aka yp), or NIS+ (the new version of NIS). (It would also be possible to define another access method, but nobody has done this.)

After having looked in the nsswitch.conf file, the getpwnam function loads a shared library with the name nss_database.so.1, where "database" is the name of whichever access method was found in nsswitch.conf. Each access method has a corresponding file: nss_files.so.1, nss_nis.so.1, nss_nisplus.so.1. (You can see these files by typing "ls /usr/lib".) Each shared library file contains the same set of functions, so that functions like getpwnam are able to call them all the same way, but each version of the functions searches a different database.

To summarize, it is possible to select one or another version of a set of alternative functions by loading one or another shared library containing the same functions, depending on values found in a configuration file, or even depending on user input.

The above "trick" is used not only to select among multiple databases for a database lookup, but also to select among multiple similar network protocols in much the same way.

Q. What is LD_LIBRARY_PATH?

A. LD_LIBRARY_PATH is an environment variable that tells the system where to find shared libraries when they are needed, if the program itself does not specify this information. You can view the current value of this variable by typing "echo $LD_LIBRARY_PATH" in a shell.
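
The value of LD_LIBRARY_PATH is a colon-separated list of directories that are tried in order. As a rough illustration of that kind of lookup (not the dynamic linker's actual code), the C sketch below walks such a list and stops at the first directory that contains the named file. The function name find_in_path_list and its arguments are hypothetical, and the sketch simply skips empty entries, which the real dynamic linker may treat differently.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Search each directory of a colon-separated list, in order, for a
 * file called `name`.  On success the full path is written to `out`
 * and 1 is returned; otherwise `out` is emptied and 0 is returned.
 * (Hypothetical helper, for illustration only.)
 */
int find_in_path_list(const char *list, const char *name,
                      char *out, size_t outlen)
{
    const char *p = list;

    while (p != NULL && *p != '\0') {
        const char *end = strchr(p, ':');
        size_t len = end ? (size_t)(end - p) : strlen(p);

        if (len > 0) {   /* simplification: empty entries are skipped */
            int n = snprintf(out, outlen, "%.*s/%s", (int)len, p, name);

            /* Accept the first candidate that fits and exists. */
            if (n > 0 && (size_t)n < outlen && access(out, F_OK) == 0)
                return 1;
        }
        p = end ? end + 1 : NULL;
    }
    if (outlen > 0)
        out[0] = '\0';
    return 0;
}
```

A caller might pass getenv("LD_LIBRARY_PATH") as the list and a name such as "libfoo.so.1" (a made-up library name) to see which directory, if any, would satisfy the lookup.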

The contents consist of a colon-separated list of directories to search sequentially. If the variable is not set, the default is "/usr/lib".

If LD_LIBRARY_PATH does not list the directory where a particular required library is found, and the program which requires this library does not specify which directory to find it in either, then the program will abort. For this reason, it is considered good practice, when making programs, to have the program tell the dynamic linker where its libraries may be found if they are not in the standard directory /usr/lib. In this way, no program should depend on the correct setting of LD_LIBRARY_PATH. A program can be considered broken, or at the very least lacking in robustness, if it doesn't work without LD_LIBRARY_PATH set. (See also "Why should I use -R when linking programs?")

Q. So should I use shared libraries?

A. For the system libraries, the answer is certainly yes. You should not link with the static versions of libraries like libc. (See "Why does my program, statically linked under a different version of Solaris, dump core?" and "Why do I get unresolved references when trying to link statically?")

For libraries you create yourself, or which are otherwise not part of Solaris, it will depend. A few examples of public domain or shareware packages which have libraries are in order:

pbmplus: pbmplus, which is a suite of graphics format converters plus a few other goodies, includes over 100 binaries (e.g. giftoppm, pnmtotiff). All of them share a substantial amount of common code for dealing with the package's internal representation of graphics. A lot of space could be wasted by storing this code 100 times on your disk. The common code can be placed in a shared library. When this is done, each individual binary can be smaller than 10K, while it might otherwise have been over 20K with all the common code statically linked in. The tradeoff is that pbmplus does not support the creation of shared libraries for this purpose.
It takes a fair amount of Makefile editing to get this to work.

elm: elm, a popular Mail User Agent, includes as part of the source a (static) library, libutil.a. This library is used only a few times, always in programs that are part of the ELM package. Few of the advantages of shared libraries exist in this case. Given the extra overhead of loading shared libraries at run time and the need to compile them as position independent code, it probably isn't worth making a shared library out of this.

Q. How are shared libraries implemented at a low level?

A. Programs are either dynamically linked or statically linked. Statically linked programs cannot load shared libraries. Dynamically linked programs usually do.

When a statically linked program is run, it is loaded into memory, and the processor jumps to a set location in the program which performs program initialization. For C programs, this function sets things up and then calls the program's main() function.

When a dynamically linked program is run, it is also loaded into memory, but control is not transferred to a set location in the program. Instead, the program contains the name of an interpreter under which it should run. In practice, there is only one interpreter, called /usr/lib/ld.so.1. This file is known as the dynamic linker and is a *VERY* critical file for the operation of a system!

The dynamic linker reads the information that was recorded in the program when it was made and loads the appropriate shared libraries. It uses the LD_LIBRARY_PATH environment variable (described above) to locate them. It also uses the run path recorded in the program when it was linked (set with the -R option, or taken from the LD_RUN_PATH environment variable at link time). This parameter serves a similar function, but it comes from the program, not the run-time environment. Thus the program can itself specify directories where the dynamic linker should look, without absolutely depending on the proper setting of LD_LIBRARY_PATH.

Shared libraries are mapped into virtual memory using the mmap(2) system call.
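
To make the mapping step a little more concrete, here is a minimal C sketch that maps a file read-only with mmap(2) and checks whether it begins with the four-byte ELF magic number. It is only an illustration of mapping a file into memory: the real dynamic linker does considerably more, and the function name starts_with_elf_magic is made up for this example.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Map `path` into memory and inspect its first four bytes.
 * Returns 1 if they are the ELF magic number, 0 if they are not
 * (or the file is too short), and -1 on any error.
 * (Hypothetical helper, for illustration only.)
 */
int starts_with_elf_magic(const char *path)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };
    struct stat st;
    void *base;
    int result;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size < (off_t)sizeof magic) {
        close(fd);
        return 0;
    }

    /* A private, read-only mapping: the file contents appear in our
     * address space without being copied through read(2). */
    base = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);               /* the mapping remains valid after close */
    if (base == MAP_FAILED)
        return -1;

    result = (memcmp(base, magic, sizeof magic) == 0);
    munmap(base, (size_t)st.st_size);
    return result;
}
```

Called on a shared object such as /usr/lib/libc.so.1, a function like this would be expected to return 1, since shared libraries are ELF files.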

Once the loading process is complete, control is transferred to a special function in the main program and execution continues as with a statically linked program.

Q. How do I make a shared library?

A. Making a shared library is a lot like making a regular library, but there are a few differences. First of all, you should probably compile your code in a position independent fashion. Here's why:

Normally, compilers make relocatable code. A set of machine instructions (a program) usually needs to know where in memory it is executing, but when a program is created it cannot be predicted where in memory it will be loaded when it is run. Compilers therefore generate almost all of the code for a program and leave the rest to be filled in at run time, once it is known where the program has been loaded. For this purpose, they prepare and include in the program tables of the locations that need adjusting.

If a shared library uses relocatable code, then this relocation must be performed every time the shared library is loaded into a new program, at a new place in memory. Thus each copy of the shared library in memory is slightly different after the relocation has taken place. If relocation weren't required, then all copies of a shared object in memory would be identical, and it would be possible for all programs simultaneously requiring the same shared library to share the same copy in memory (thus the name "shared" library). This is a major advantage of shared libraries, and it becomes possible when position independent code is generated.

Position Independent Code (PIC) is less efficient than normal relocatable code because programs must sometimes use alternate methods of performing the same tasks. However, no code needs to be modified at run time, as indicated above.

With the Sun C compiler, supply the "-K pic" option on the cc command line. With the GNU compiler, supply the "-fPIC" option on the gcc command line.
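
For illustration, here is a small C source file of the kind one might compile as position independent code for a shared library. The file name counter.c and the function names are hypothetical; the compile options in the comment are the ones named above.

```c
/*
 * counter.c: hypothetical source file for a small shared library.
 *
 * Compile to position independent object code with one of:
 *     cc  -K pic -c counter.c      (Sun C compiler)
 *     gcc -fPIC  -c counter.c      (GNU compiler)
 */

/* Global data.  References to data like this are among the places
 * where position independent code must avoid embedding an absolute
 * address in the instructions. */
int call_count = 0;

/* Increment the counter and return its new value. */
int counter_next(void)
{
    call_count += 1;
    return call_count;
}

/* Return the current value without changing it. */
int counter_value(void)
{
    return call_count;
}

/* Set the counter back to zero. */
void counter_reset(void)
{
    call_count = 0;
}
```

Nothing in the source itself changes for PIC; the difference lies entirely in the code the compiler generates for it.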

Finally, once your code is compiled, you can create a shared object file as follows. ("-z text" is optional, but usually a good idea. It ensures that your code is really position independent.)

    ld -G -z text -o outputfile.so

Q. Why should I use -R when linking programs?

A. The -R option to the linker allows you to specify directories (one per occurrence of -R) where the program being linked should look for its shared libraries when it is run. It should be used whenever a program requires libraries stored in directories other than /usr/lib, so that the program will find all the libraries it needs.

Note that a similar effect can be achieved by using the LD_LIBRARY_PATH variable (see "What is LD_LIBRARY_PATH?"), but depending on this is very strongly discouraged, because the program will always depend on this variable being set correctly and will fail to run whenever it isn't! On a properly configured system, all programs that need libraries from somewhere other than /usr/lib have been linked with -R, and LD_LIBRARY_PATH need not be used.

Q. What are the numerical extensions on shared objects for?

A. The numerical extensions (e.g. libc.so.1) indicate version numbers. Each time a new and incompatible version of a shared library is created, the version is incremented. The old version should be kept around for the sake of programs that were linked with the previous version.

When making a program, the linker will look for shared libraries by the extension ".so". Thus, for libraries which have numerical extensions, there should be a symbolic link ending in .so pointing to the desired version of the library (the most recent), e.g. libc.so --> libc.so.1.

Note that the version number of a shared library should only be changed when the library's interface changes.
Changes in the implementation do not matter to programs using the library, so it is possible to, say, optimize the code in a library, thereby optimizing every program that uses the library, without changing the version number.

Also, when a shared library is produced and the above numerical version and symbolic link convention is to be used, the -h option needs to be supplied to the linker. The argument to -h is the filename which programs linked with the new library should look for to find the library. It should be the name the library will be installed under.

If a library was not made using -h, then the name under which programs using that library will look for it is the name under which the linker found the library when those programs were produced. This is the .so file (since the linker only scans for .so files), which is a symbolic link to the most recent version. This is bad because the most recent version may not be the same one the program was linked against, and if it is not, the program will end up trying to use an incompatible version. Using -h with the canonical name of the library forces the program to always load the same version of the library.

Q. Why does my program, statically linked under a different version of Solaris, dump core?

A. Statically linked programs do not comply with the Solaris Application Binary Interface (ABI). This means that they are not supported. The C library and a number of the other libraries that come with the OS contain code that assumes things which may change from one version of Solaris to another. Normally, this is reasonable, because programs are dynamically linked and always fetch the appropriate code from the libraries that exist on the system where the program is run. But this doesn't work for statically linked programs, because they use whatever code they were statically linked with.

Q. Why do I get unresolved references when trying to link statically?
Q. Why is there no "libdl.a"?

A.
As described in "How are shared libraries used in Solaris?", functions like getpwnam() dynamically link code when they are called by programs, depending on which sources the system is configured to use. In statically linked programs, there is no dynamic linker available, yet these functions try to use it. This is what the linker complains about.

The recommended way to circumvent this is, of course, to link dynamically. However, here is another workaround if you must use it (don't distribute programs you make like this unless you enjoy headaches!). A command line like this will make a dynamically linked program that nevertheless uses functions from the static system libraries:

    cc -Bstatic .... -Bdynamic -ldl -Bstatic

Q. Where can I find X shared library?

A. The standard shared libraries that come with Solaris follow. There should never be a problem locating the ones in /usr/lib. The others may not be found if $LD_LIBRARY_PATH does not contain their directory and the program does not suggest a search location itself. For these cases, you can add the appropriate directory to LD_LIBRARY_PATH (see "What is LD_LIBRARY_PATH?").

You may, of course, have additional libraries if you've installed software other than the basic Solaris environment.

/usr/lib/libC.so.3
/usr/lib/libC.so.5
/usr/lib/libadm.so
/usr/lib/libadm.so.1
/usr/lib/libadmagt.so
/usr/lib/libadmagt.so.1
/usr/lib/libadmapm.so
/usr/lib/libadmapm.so.1
/usr/lib/libadmcom.so
/usr/lib/libadmcom.so.1
/usr/lib/libadmsec.so
/usr/lib/libadmsec.so.1
/usr/lib/libaio.so
/usr/lib/libaio.so.1
/usr/lib/libauth.so
/usr/lib/libauth.so.1
/usr/lib/libbsm.so
/usr/lib/libbsm.so.1
/usr/lib/libc.so
/usr/lib/libc.so.1
/usr/lib/libc2.so
/usr/lib/libc2.so.1
/usr/lib/libc2stubs.so
/usr/lib/libc2stubs.so.1
/usr/lib/libdl.so
/usr/lib/libdl.so.1
/usr/lib/libelf.so
/usr/lib/libelf.so.1
/usr/lib/libintl.so
/usr/lib/libintl.so.1
/usr/lib/libkrb.so
/usr/lib/libkrb.so.1
/usr/lib/libkstat.so
/usr/lib/libkstat.so.1
/usr/lib/libkvm.so
/usr/lib/libkvm.so.1
/usr/lib/libld.so.1
/usr/lib/liblddbg.so.2
/usr/lib/libm.so
/usr/lib/libm.so.1
/usr/lib/libmapmalloc.so
/usr/lib/libmapmalloc.so.1
/usr/lib/libnisdb.so
/usr/lib/libnisdb.so.2
/usr/lib/libnsl.so
/usr/lib/libnsl.so.1
/usr/lib/libposix4.so
/usr/lib/libposix4.so.1
/usr/lib/librac.so
/usr/lib/librac.so.1
/usr/lib/libresolv.so.1
/usr/lib/librpcsvc.so
/usr/lib/librpcsvc.so.1
/usr/lib/libsocket.so
/usr/lib/libsocket.so.1
/usr/lib/libsys.so
/usr/lib/libsys.so.1
/usr/lib/libthread.so
/usr/lib/libthread.so.1
/usr/lib/libthread_db.so
/usr/lib/libthread_db.so.0
/usr/lib/libvolmgt.so
/usr/lib/libvolmgt.so.1
/usr/lib/libw.so
/usr/lib/libw.so.1
/usr/lib/nss_compat.so.1
/usr/lib/nss_dns.so.1
/usr/lib/nss_files.so.1
/usr/lib/nss_nis.so.1
/usr/lib/nss_nisplus.so.1
/usr/lib/straddr.so
/usr/lib/straddr.so.2

/usr/dt/lib/libDtHelp.so
/usr/dt/lib/libDtHelp.so.1
/usr/dt/lib/libDtSvc.so
/usr/dt/lib/libDtSvc.so.1
/usr/dt/lib/libDtTerm.so
/usr/dt/lib/libDtTerm.so.1
/usr/dt/lib/libDtWidget.so
/usr/dt/lib/libDtWidget.so.1
/usr/dt/lib/libMrm.so
/usr/dt/lib/libMrm.so.3
/usr/dt/lib/libUil.so
/usr/dt/lib/libUil.so.3
/usr/dt/lib/libXm.so
/usr/dt/lib/libXm.so.3
/usr/dt/lib/libcsa.so
/usr/dt/lib/libcsa.so.0
/usr/dt/lib/libtt.so
/usr/dt/lib/libtt.so.2

/usr/openwin/lib/libX.so
/usr/openwin/lib/libX.so.4
/usr/openwin/lib/libX11.so
/usr/openwin/lib/libX11.so.4
/usr/openwin/lib/libXaw.so
/usr/openwin/lib/libXaw.so.4
/usr/openwin/lib/libXaw.so.5
/usr/openwin/lib/libXext.so
/usr/openwin/lib/libXext.so.0
/usr/openwin/lib/libXi.so
/usr/openwin/lib/libXi.so.5
/usr/openwin/lib/libXinput.so
/usr/openwin/lib/libXinput.so.0
/usr/openwin/lib/libXmu.so
/usr/openwin/lib/libXmu.so.4
/usr/openwin/lib/libXol.so
/usr/openwin/lib/libXol.so.3
/usr/openwin/lib/libXt.so
/usr/openwin/lib/libXt.so.4
/usr/openwin/lib/libXtst.so
/usr/openwin/lib/libXtst.so.1
/usr/openwin/lib/libce.so
/usr/openwin/lib/libce.so.0
/usr/openwin/lib/libdeskset.so
/usr/openwin/lib/libdeskset.so.0
/usr/openwin/lib/libdga.so
/usr/openwin/lib/libdga.so.1
/usr/openwin/lib/libdps.so
/usr/openwin/lib/libdps.so.5
/usr/openwin/lib/libdpstk.so
/usr/openwin/lib/libdpstk.so.5
/usr/openwin/lib/libdstt.so
/usr/openwin/lib/libdstt.so.0
/usr/openwin/lib/libolgx.so
/usr/openwin/lib/libolgx.so.3
/usr/openwin/lib/libowconfig.so
/usr/openwin/lib/libowconfig.so.0
/usr/openwin/lib/libpsres.so
/usr/openwin/lib/libpsres.so.5
/usr/openwin/lib/libtiff.so
/usr/openwin/lib/libtiff.so.3
/usr/openwin/lib/libtt.so
/usr/openwin/lib/libtt.so.1
/usr/openwin/lib/libxil.so
/usr/openwin/lib/libxil.so.1
/usr/openwin/lib/libxview.so
/usr/openwin/lib/libxview.so.3

DATE APPROVED: 11/27/95