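Chapter 3. The system initialization

Table of Contents

3.1. An overview of the boot strap process
3.1.1. Stage 1: the BIOS
3.1.2. Stage 2: the boot loader
3.1.3. Stage 3: the mini-Debian system
3.1.4. Stage 4: the normal Debian system
3.2. Systemd init
3.2.1. The hostname
3.2.2. The filesystem
3.2.3. Network interface initialization
3.2.4. The kernel message
3.2.5. The system message
3.2.6. System management under systemd
3.2.7. Customizing systemd
3.3. The udev system
3.3.1. The kernel module initialization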

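As a system administrator, it is wise to know roughly how the Debian system is started and configured. The exact details live in the installed packages and their documentation, but a working overview is something most of us need.

I did my best to provide a quick overview of the key points of the Debian system and their configuration for your reference, based on the current and previous knowledge of mine and others. Since the Debian system is a moving target, the situation may have changed. Before making any changes to the system, refer to the latest documentation of each package.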

[Tip]

bootup(7) describes the system bootup process based on systemd (recent Debian).

[Tip]

boot(7) describes the system bootup process based on UNIX System V Release 4 (older Debian).

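The computer system undergoes several phases of boot strap processes from the power-on event until it offers the fully functional operating system (OS) to the user.

For simplicity, I limit the discussion to the typical PC platform with the default installation.

The typical boot strap process is like a four-stage rocket. Each stage hands over the system control to the next one.

Of course, these stages can be configured differently. For example, if you compiled your own kernel, you may be skipping the step with the mini-Debian system. So please do not assume this is the case for your system until you check it yourself.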

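[Note]

For non-legacy PC platforms such as the SUN or the Macintosh system, the BIOS on ROM and the partitions on the disk may be quite different (Section 9.5.2, "Disk partition configuration"). Please seek the platform-specific documentation elsewhere for such a case.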

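The BIOS is the 1st stage of the boot process and is started by the power-on event. The program counter of the CPU is initialized to a specific memory address by the power-on event, and the BIOS, which resides in read-only memory (ROM), is executed from that address.

The BIOS performs the basic initialization of the hardware (POST: power-on self-test) and hands the system control to the next step, which you select. The BIOS is usually provided with the hardware.

The BIOS startup screen usually indicates which key(s) to press to enter the BIOS setup screen. Popular keys are F1, F2, F10, Esc, Ins, and Del. If your BIOS startup screen is hidden by a nice graphics screen, you may press some key such as Esc to disable it. These keys are highly dependent on the hardware.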

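The hardware location and the priority of the code started by the BIOS can be selected from the BIOS setup screen. Typically, the first few sectors of the first device found among the selected devices (hard disk, floppy disk, CD-ROM, ...) are loaded into memory and this initial code is executed. This initial code can be any one of the following.

  • The boot loader code

  • The kernel code of a stepping-stone OS such as FreeDOS

  • The kernel code of the target OS if it fits in this small space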

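Typically, the system is booted from a specified partition of the primary hard disk. The first sector of the hard disk on a legacy PC contains the master boot record (MBR). The disk partition information, including the boot selection, is recorded at the end of this MBR. The first boot loader code executed from the BIOS occupies the rest of the MBR.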
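The layout described above can be inspected with standard tools. The following is a minimal sketch, assuming the classic MBR layout (a 64-byte partition table at byte offset 446 and the two-byte boot signature 0x55 0xAA at offset 510), which the text above does not spell out. It works on a scratch file rather than a real disk; the names "mbr_signature", "mbr_partition_table" and "sample" are made up for this illustration.

```shell
# Print the two bytes at offset 510 of a sector image as hex digits.
mbr_signature() {
    od -An -tx1 -j510 -N2 "$1" | tr -d ' \n'
}

# Dump the 64-byte partition table (4 entries of 16 bytes) at offset 446.
# The -v option keeps od from collapsing identical lines.
mbr_partition_table() {
    od -An -v -tx1 -j446 -N64 "$1"
}

# Build a scratch 512-byte sector: 510 zero bytes plus the boot signature.
sample=$(mktemp)
head -c 510 /dev/zero > "$sample"
printf '\125\252' >> "$sample"   # octal escapes for 0x55 0xAA

mbr_signature "$sample"; echo
mbr_partition_table "$sample"
```

The same functions could presumably be pointed at a real disk device with root privilege, since they only read from it; the device name depends on your hardware.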

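The boot loader is the 2nd stage of the boot process and is started by the BIOS. It loads the system kernel image and the initrd image into memory and hands control over to them. The initrd image is the root filesystem image, and its support depends on the boot loader used.

The Debian system normally uses the Linux kernel as its default system kernel. The initrd image for the current Linux kernel 2.6/3.x is technically an initramfs (initial RAM filesystem) image. The initramfs image is a gzipped cpio archive of the files in the root filesystem.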

[Warning]

The above statement is no longer true with the new multi-segment initramfs. See bug #790100.

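The default installation of the Debian system places the first stage code of the GRUB boot loader into the MBR on the PC platform. Other boot loaders and configuration options are also available.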


[Warning]

Do not play with boot loaders without having bootable rescue media (USB memory stick, CD or floppy) created from images in the grub-rescue-pc package. Such media let you boot your system even without a functioning boot loader on the hard disk.

For GRUB legacy, the menu configuration file is located at "/boot/grub/menu.lst". For example, it has entries such as the following.

title           Debian GNU/Linux
root            (hd0,2)
kernel          /vmlinuz root=/dev/hda3 ro
initrd          /initrd.img

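For GRUB version 2, the menu configuration file is located at "/boot/grub/grub.cfg". It is generated automatically by "/usr/sbin/update-grub" from the templates in "/etc/grub.d/*" and the settings in "/etc/default/grub". For example, it has entries such as the following.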

menuentry "Debian GNU/Linux" {
        set root=(hd0,3)
        linux /vmlinuz root=/dev/hda3
        initrd /initrd.img
}

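In these examples, the GRUB parameters mean the following. "root" selects the 3rd partition on the primary disk, written as "(hd0,2)" in GRUB legacy and as "(hd0,3)" in GRUB version 2. "kernel" (GRUB legacy) or "linux" (GRUB version 2) names the kernel image "/vmlinuz" together with its kernel boot parameters. "initrd" names the initrd/initramfs image "/initrd.img".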


[Note]

The partition number used by GRUB legacy is one less than the number used by the Linux kernel and utility tools. GRUB version 2 fixes this problem.

[Tip]

You may wish to use a UUID (see Section 9.5.3, "Accessing partition using UUID") to identify a block device instead of a file name such as "/dev/hda3", e.g. "root=UUID=81b289d5-4341-4003-9602-e254a17ac232 ro".

[Tip]

If GRUB is used, the kernel boot parameters are set in "/boot/grub/grub.cfg". On a Debian system, you should not edit "/boot/grub/grub.cfg" directly. Instead, edit the GRUB_CMDLINE_LINUX_DEFAULT value in "/etc/default/grub" and run update-grub(8) to regenerate "/boot/grub/grub.cfg".
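Before editing, it can help to check the value currently configured. The following is a minimal sketch, assuming that "/etc/default/grub" consists of plain shell variable assignments. It reads a scratch stand-in file with made-up contents, not the real configuration, and "grub_cmdline_default" is a hypothetical helper name.

```shell
# Hypothetical stand-in for /etc/default/grub; real files carry more settings.
grub_default=$(mktemp)
cat > "$grub_default" <<'EOF'
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_CMDLINE_LINUX_DEFAULT="quiet"
GRUB_CMDLINE_LINUX=""
EOF

# The file uses shell variable syntax, so source it in a subshell and print
# one value without touching the current environment.
grub_cmdline_default() {
    ( . "$1" && printf '%s\n' "$GRUB_CMDLINE_LINUX_DEFAULT" )
}

grub_cmdline_default "$grub_default"
```

On a real system the helper would be called with "/etc/default/grub"; after changing the value there, update-grub(8) still has to be run for the change to reach "/boot/grub/grub.cfg".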

[Tip]

You can start a boot loader from another boot loader using a technique called chain loading.

See "info grub" and grub-install(8).

The mini-Debian system is the 3rd stage of the boot process and is started by the boot loader. It runs the system kernel with its root filesystem in memory. This is an optional preparatory stage of the boot process.

[Note]

The term "mini-Debian system" is coined by the author to describe this 3rd stage of the boot process in this document. This system is commonly referred to as the initrd or initramfs system. A similar in-memory system is used by the Debian Installer.

The "/init" program is executed as the first program in this root filesystem in memory. It initializes the kernel in user space and hands control over to the next stage. This mini-Debian system adds flexibility to the boot process, such as loading kernel modules before the main boot process or mounting an encrypted root filesystem.

  • The "/init" program is a shell script if the initramfs was created by initramfs-tools. You can interrupt this part of the boot process to gain a root shell by adding "break=init" etc. to the kernel boot parameters. See the "/init" script for more break conditions. This shell environment is sophisticated enough to make a good inspection of your machine's hardware. The commands available in this mini-Debian system are stripped-down ones, mainly provided by busybox(1).

  • The "/init" program is a binary systemd program if the initramfs was created by dracut. The commands available in this mini-Debian system are those of a stripped-down systemd(1) environment.

[Caution]

You need to use the "-n" option with the mount command when you are on a read-only root filesystem.

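The normal Debian system is the 4th stage of the boot process and is started by the mini-Debian system. The system kernel of the mini-Debian system continues to run in this environment. The root filesystem is switched from the one in memory to the one on the real hard disk filesystem.

The init program is executed as the first program with PID=1 and performs the main boot process by starting many other programs. The default path of the init program is "/sbin/init", but it can be changed with a kernel boot parameter such as "init=/path/to/init_program".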
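How the "init=" boot parameter selects the program can be sketched in a few lines of shell. This is a simplified illustration, not the kernel's actual logic: it assumes that the last "init=" word on the command line wins and that "/sbin/init" is the only fallback. The function name and the sample command lines are hypothetical.

```shell
# Print the init program a kernel command line selects: the value of the
# last "init=" word, or the default "/sbin/init" when none is given.
# The body runs in a subshell so that "set -f" does not leak to the caller.
init_from_cmdline() (
    set -f                      # no filename expansion while splitting words
    init=/sbin/init
    for word in $1; do
        case $word in
            init=*) init=${word#init=} ;;
        esac
    done
    printf '%s\n' "$init"
)

# Hypothetical command lines for illustration.
init_from_cmdline "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet"
init_from_cmdline "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro init=/bin/sh"
```

On a running Linux system, the command line of the current boot is available in "/proc/cmdline" and can be passed to the function as its single argument.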

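The default init program has been changing:

  • Debian before squeeze uses the simple SysV-style init.

  • Debian wheezy improves the SysV-style init by ordering the boot sequence with LSB headers and starting boot scripts in parallel.

  • Debian jessie switches its default init to systemd for event-driven and parallel initialization.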

[Tip]

The actual init command on your system can be verified with the "ps --pid 1 -f" command.

[Tip]

"/sbin/init" is symlinked to "/lib/systemd/systemd" since Debian jessie.


[Tip]

See Debian wiki: BootProcessSpeedup for the latest tips to speed up the boot process.

This section describes how the system is started by the systemd(1) program with PID=1 (i.e., the init process).

The systemd init process spawns processes in parallel based on the unit configuration files (see systemd.unit(5)) which are written in declarative style instead of SysV-like procedural style. These are loaded from a set of paths (see systemd-system.conf(5)) as follows:

  • "/lib/systemd/system": OS default configuration files

  • "/etc/systemd/system": system administrator configuration files which override the OS default configuration files

  • "/run/systemd/system": run-time generated configuration files which override the installed configuration files

Their inter-dependencies are specified by the directives "Wants=", "Requires=", "Before=", "After=", … (see "MAPPING OF UNIT PROPERTIES TO THEIR INVERSES" in systemd.unit(5)). The resource controls are also defined (see systemd.resource-control(5)).
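The override order of the three directories listed above can be sketched as a simple first-match lookup. The following is a minimal illustration under stated assumptions: it checks only those three directories in the order "/etc", "/run", "/lib", and it ignores drop-in directories and any other search paths that systemd may consult. The function name, the optional root prefix and the unit name "demo.service" are made up for this example.

```shell
# Print the path of the unit file that would win for a unit name, checking
# the three directories from highest to lowest priority.
# $1 = unit name, $2 = optional root prefix (used here for a scratch tree).
effective_unit_file() {
    unit_name=$1
    prefix=${2:-}
    for dir in /etc/systemd/system /run/systemd/system /lib/systemd/system; do
        if [ -e "$prefix$dir/$unit_name" ]; then
            printf '%s\n' "$dir/$unit_name"
            return 0
        fi
    done
    return 1
}

# Demonstration in a scratch tree instead of the real filesystem.
scratch=$(mktemp -d)
mkdir -p "$scratch/etc/systemd/system" "$scratch/lib/systemd/system"
: > "$scratch/lib/systemd/system/demo.service"
effective_unit_file demo.service "$scratch"   # only the OS default exists
: > "$scratch/etc/systemd/system/demo.service"
effective_unit_file demo.service "$scratch"   # the administrator copy wins
```

On a real system, "systemctl cat $unit" reports the files systemd actually uses, which is the authoritative answer.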

The suffix of a unit configuration file encodes its type:

  • *.service describes the process controlled and supervised by systemd. See systemd.service(5).

  • *.device describes the device exposed in the sysfs(5) as udev(7) device tree. See systemd.device(5).

  • *.mount describes the file system mount point controlled and supervised by systemd. See systemd.mount(5).

  • *.automount describes the file system auto mount point controlled and supervised by systemd. See systemd.automount(5).

  • *.swap describes the swap device or file controlled and supervised by systemd. See systemd.swap(5).

  • *.path describes the path monitored by systemd for path-based activation. See systemd.path(5).

  • *.socket describes the socket controlled and supervised by systemd for socket-based activation. See systemd.socket(5).

  • *.timer describes the timer controlled and supervised by systemd for timer-based activation. See systemd.timer(5).

  • *.slice manages resources with the cgroups(7). See systemd.slice(5).

  • *.scope is created programmatically using the bus interfaces of systemd to manage a set of system processes. See systemd.scope(5).

  • *.target groups other unit configuration files to create the synchronization point during start-up. See systemd.target(5).

Upon system start up (i.e., init), the systemd process tries to start "/lib/systemd/system/default.target" (normally symlinked to "graphical.target"). First, some special target units (see systemd.special(7)) such as "local-fs.target", "swap.target" and "cryptsetup.target" are pulled in to mount the filesystems. Then, other target units are also pulled in by the target unit dependencies. For details, read bootup(7).

systemd offers backward compatibility features. SysV-style boot scripts in "/etc/init.d/" linked from "/etc/rc[0123456S].d/[KS]<name>" are still parsed, and telinit(8) requests are translated into systemd unit activation requests.

[Caution]

Emulated runlevels 2 to 4 are all symlinked to the same "multi-user.target".
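The runlevel emulation can be summarized as a lookup table. In the sketch below, only the mapping of runlevels 2 to 4 comes from the text above; the targets for runlevels 0, 1, 5 and 6 are given as I understand systemd.special(7) and should be treated as an assumption to verify on your system. The function name is hypothetical.

```shell
# Map a SysV runlevel number to the systemd target that emulates it.
runlevel_to_target() {
    case $1 in
        0)     echo poweroff.target ;;
        1)     echo rescue.target ;;
        2|3|4) echo multi-user.target ;;
        5)     echo graphical.target ;;
        6)     echo reboot.target ;;
        *)     echo "unknown runlevel: $1" >&2; return 1 ;;
    esac
}

for level in 0 1 2 3 4 5 6; do
    printf 'runlevel%s.target -> %s\n' "$level" "$(runlevel_to_target "$level")"
done
```

The actual symlinks can be checked with "ls -l /lib/systemd/system/runlevel*.target".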

The mount options of normal disk and network filesystems are set in "/etc/fstab". See fstab(5) and Section 9.5.7, "Optimization of filesystem by mount options".

The configuration of the encrypted filesystem is set in "/etc/crypttab". See crypttab(5).

The configuration of software RAID with mdadm(8) is set in "/etc/mdadm/mdadm.conf". See mdadm.conf(5).

[Warning]

After all the filesystems are mounted, temporary files in "/tmp", "/var/lock", and "/var/run" are cleaned on each boot.

systemd offers not only an init system but also generic system management functionality such as journal logging, login management, time management, and network management.

systemd(1) is managed by several commands:

  • the systemctl(1) command controls the systemd system and service manager (CLI),

  • the systemadm(1) command controls the systemd system and service manager (GUI),

  • the journalctl(1) command queries the systemd journal,

  • the loginctl(1) command controls the systemd login manager, and

  • the systemd-analyze(1) command analyzes system boot-up performance.

Here is a list of typical systemd management command snippets. For the exact meanings, please read the pertinent manpages.

Table 3.5. List of typical systemd management command snippets

Operation | Type | Command snippets
GUI for service manager | GUI | "systemadm" (systemd-ui package)
List all target unit configuration | Unit | "systemctl list-units --type=target"
List all service unit configuration | Unit | "systemctl list-units --type=service"
List all unit configuration types | Unit | "systemctl list-units --type=help"
List all socket units in memory | Unit | "systemctl list-sockets"
List all timer units in memory | Unit | "systemctl list-timers"
Start "$unit" | Unit | "systemctl start $unit"
Stop "$unit" | Unit | "systemctl stop $unit"
Reload service-specific configuration | Unit | "systemctl reload $unit"
Stop and start all "$unit" | Unit | "systemctl restart $unit"
Start "$unit" and stop all others | Unit | "systemctl isolate $unit"
Switch to "graphical" (GUI system) | Unit | "systemctl isolate graphical"
Switch to "multi-user" (CLI system) | Unit | "systemctl isolate multi-user"
Switch to "rescue" (single user CLI system) | Unit | "systemctl isolate rescue"
Send kill signal to "$unit" | Unit | "systemctl kill $unit"
Check if "$unit" service is active | Unit | "systemctl is-active $unit"
Check if "$unit" service is failed | Unit | "systemctl is-failed $unit"
Check status of "$unit|$PID|$device" | Unit | "systemctl status $unit|$PID|$device"
Show properties of "$unit|$job" | Unit | "systemctl show $unit|$job"
Reset failed "$unit" | Unit | "systemctl reset-failed $unit"
List dependency of all unit services | Unit | "systemctl list-dependencies --all"
List unit files installed on the system | Unit file | "systemctl list-unit-files"
Enable "$unit" (add symlink) | Unit file | "systemctl enable $unit"
Disable "$unit" (remove symlink) | Unit file | "systemctl disable $unit"
Unmask "$unit" (remove symlink to "/dev/null") | Unit file | "systemctl unmask $unit"
Mask "$unit" (add symlink to "/dev/null") | Unit file | "systemctl mask $unit"
Get default-target setting | Unit file | "systemctl get-default"
Set default-target to "graphical" (GUI system) | Unit file | "systemctl set-default graphical"
Set default-target to "multi-user" (CLI system) | Unit file | "systemctl set-default multi-user"
Show job environment | Environment | "systemctl show-environment"
Set job environment "variable" to "value" | Environment | "systemctl set-environment variable=value"
Unset job environment "variable" | Environment | "systemctl unset-environment variable"
Reload all unit files and daemons | Lifecycle | "systemctl daemon-reload"
Shut down the system | System | "systemctl poweroff"
Shut down and reboot the system | System | "systemctl reboot"
Suspend the system | System | "systemctl suspend"
Hibernate the system | System | "systemctl hibernate"
View job log of "$unit" | Journal | "journalctl -u $unit"
View job log of "$unit" ("tail -f" style) | Journal | "journalctl -u $unit -f"
Show time spent for each initialization step | Analyze | "systemd-analyze time"
List all units by the time to initialize | Analyze | "systemd-analyze blame"
Load and detect errors in "$unit" file | Analyze | "systemd-analyze verify $unit"
Track boot process by the cgroups(7) | Cgroup | "systemd-cgls"
Track boot process by the cgroups(7) | Cgroup | "ps xawf -eo pid,user,cgroup,args"
Track boot process by the cgroups(7) | Cgroup | Read sysfs under "/sys/fs/cgroup/systemd/"

Here, "$unit" in the above examples may be a single unit name (suffixes such as ".service" and ".target" are optional) or, in many cases, multiple unit specifications (shell-style globs "*", "?", "[]" using fnmatch(3), which are matched against the primary names of all units currently in memory).

System state changing commands in the above examples are typically preceded by "sudo" to attain the required administrative privilege.

The output of "systemctl status $unit|$PID|$device" uses the color of the dot ("●") to summarize the unit state at a glance.

  • White "●" indicates an "inactive" or "deactivating" state.

  • Red "●" indicates a "failed" or "error" state.

  • Green "●" indicates an "active", "reloading" or "activating" state.

With the default installation, many network services (see Chapter 6, "Network applications") are started as daemon processes after "network.target" at boot time by systemd. The "sshd" daemon is no exception. As a customization example, let's change this to on-demand start of "sshd".

First, disable the system-installed service unit.

 $ sudo systemctl stop sshd.service
 $ sudo systemctl mask sshd.service

The on-demand socket activation of the classic Unix services was handled by the inetd superserver. Under systemd, the equivalent can be enabled by adding *.socket and *.service unit configuration files.

sshd.socket for specifying a socket to listen on

[Unit]
Description=SSH Socket for Per-Connection Servers

[Socket]
ListenStream=22
Accept=yes

[Install]
WantedBy=sockets.target

sshd@.service as the matching service file of sshd.socket

[Unit]
Description=SSH Per-Connection Server

[Service]
ExecStart=-/usr/sbin/sshd -i
StandardInput=socket

Then reload.

 $ sudo systemctl daemon-reload

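For Linux kernel 2.6 and newer, the udev system provides a mechanism for automatic hardware discovery and initialization (see udev(7)). Upon discovery of each device by the kernel, the udev system starts a user process which uses information from the sysfs filesystem (see Section 1.2.12, "procfs and sysfs"), loads the kernel modules required to support the device using the modprobe(8) program (see Section 3.3.1, "The kernel module initialization"), and creates the corresponding device nodes.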

[Tip]

If "/lib/modules/<kernel-version>/modules.dep" was not generated properly by depmod(8) for some reason, modules may not be loaded as expected by the udev system. Execute "depmod -a" to fix it.

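The names of device nodes can be configured by udev rule files in "/etc/udev/rules.d/". The current default rules tend to create dynamically generated names, resulting in non-static device names except for CD and network devices. By adding custom rules similar to those used for CD and network devices, you can generate static device names for other devices such as USB memory sticks, too. See "Writing udev rules" or "/usr/share/doc/udev/writing_udev_rules/index.html".

Since the udev system is somewhat of a moving target, I leave the details to other documentation and describe only the minimum information here.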

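[Tip]

Device nodes do not need to be static for the mount rules in "/etc/fstab". You can use a UUID to mount devices instead of device names such as "/dev/sda". See Section 9.5.3, "Accessing partition using UUID".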

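The modprobe(8) program enables us to configure the running Linux kernel from a user process by adding and removing kernel modules. The udev system (see Section 3.3, "The udev system") automates its invocation to help the kernel module initialization.

Some non-hardware modules and special hardware driver modules need to be pre-loaded by listing them in the "/etc/modules" file (see modules(5)).

The configuration files for the modprobe(8) program are located under the "/etc/modprobe.d/" directory as explained in modprobe.conf(5). (If you want to avoid some kernel modules being auto-loaded, consider blacklisting them in a file such as "/etc/modprobe.d/blacklist.conf".)

The "/lib/modules/<version>/modules.dep" file generated by the depmod(8) program describes the module dependencies used by the modprobe(8) program.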

[Note]

If you experience module loading issues at boot time or with modprobe(8), "depmod -a" may resolve them by reconstructing "modules.dep".
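To make the role of "modules.dep" concrete, the following sketch looks up the dependencies recorded for one module. It assumes the usual line format "path/to/module.ko: dependency paths separated by spaces" and runs against a scratch file; the sample lines are made up and do not reflect the real dependencies of these modules. "module_deps" is a hypothetical helper name.

```shell
# Print the dependency list recorded for one module in a modules.dep-style
# file. $1 = module name without ".ko", $2 = path to the file.
module_deps() {
    awk -v name="$1" '
        {
            file = $1
            sub(/:$/, "", file)            # drop the trailing colon
            sub(/.*\//, "", file)          # drop the directory part
            sub(/\.ko(\..*)?$/, "", file)  # drop ".ko" or ".ko.xz" etc.
            if (file == name) {
                $1 = ""                    # keep only the dependency fields
                sub(/^ +/, "")
                print
            }
        }
    ' "$2"
}

# Hypothetical sample in the modules.dep line format.
deps=$(mktemp)
cat > "$deps" <<'EOF'
kernel/drivers/net/tun.ko:
kernel/fs/lockd/lockd.ko: kernel/net/sunrpc/sunrpc.ko
kernel/fs/nfs/nfs.ko: kernel/fs/lockd/lockd.ko kernel/net/sunrpc/sunrpc.ko
EOF

module_deps nfs "$deps"
```

On a real system, "modprobe --show-depends <module>" gives the resolved load order without any hand parsing.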

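The modinfo(8) program shows information about a Linux kernel module.

The lsmod(8) program nicely formats the contents of "/proc/modules", showing which kernel modules are currently loaded.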
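What lsmod(8) does can be approximated with a few lines of awk. This is a rough sketch, not the real implementation: it assumes the "/proc/modules" field order of name, size, use count, users, state and address, and its column widths are arbitrary. The sample file contents and the function name "lsmod_like" are made up for this illustration.

```shell
# Reformat a /proc/modules-style file into lsmod-like columns.
# $1 = file to read (defaults to /proc/modules on a running Linux system).
lsmod_like() {
    awk '
        BEGIN { printf "%-20s %8s  %s\n", "Module", "Size", "Used by" }
        {
            users = ($4 == "-") ? "" : $4  # "-" means no module uses it
            sub(/,$/, "", users)           # drop the trailing comma
            printf "%-20s %8s  %s %s\n", $1, $2, $3, users
        }
    ' "${1:-/proc/modules}"
}

# Hypothetical sample in the /proc/modules field order.
sample_modules=$(mktemp)
cat > "$sample_modules" <<'EOF'
nfs 425984 0 - Live 0x0000000000000000
sunrpc 692224 2 nfs,lockd, Live 0x0000000000000000
EOF

lsmod_like "$sample_modules"
```

Calling "lsmod_like" without an argument reads the live "/proc/modules" instead of the sample.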

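[Tip]

You can precisely identify the hardware on your system. See Section 9.4.3, "Hardware identification".

[Tip]

You can configure hardware at boot time to activate the expected hardware features. See Section 9.4.4, "Hardware configuration".

[Tip]

You can add support for your special device by recompiling the kernel. See Section 9.9, "The kernel".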