Category Archives: kernel

Debugging Linux Kernel on QEMU Using GDB

Kernel is magic. Well, not really. All my experience so far involves programming in userland. Could I step up and enter the land of the kernel? Sure, but before that I need to arm myself with knowledge, especially about kernel debugging.

This article discusses kernel debugging. At the time of writing, I use:

  • Linux Kernel 4.5
  • GDB 7.7.1
  • Qemu 2.5.0
  • Busybox 1.24.2
  • ParrotSec Linux for host

Although in the end we can run a minimalistic kernel, this setup is still not close to a “real world” system.

Preparation

Download the kernel source code from https://www.kernel.org/ and extract it to /usr/src.

wget https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.5.tar.xz
mv linux-4.5.tar.xz /usr/src
cd /usr/src
tar -xf linux-4.5.tar.xz
mv linux-4.5 linux

Download BusyBox and extract it to /usr/src. We will use it to create the initramfs.

wget https://busybox.net/downloads/busybox-1.24.2.tar.bz2
mv busybox-1.24.2.tar.bz2 /usr/src
cd /usr/src
tar -xf busybox-1.24.2.tar.bz2
mv busybox-1.24.2 busybox

ParrotSec is a Debian derivative.

I use the latest QEMU; building it is covered in a separate article.

Compile Linux Kernel

It’s a bit different from the usual routine: we need to enable debug info.

cd /usr/src/linux
mkdir build
make menuconfig O=build

Select “Kernel hacking” menu.

Go to “Compile-time checks and compiler options”.

  • Enable the “Compile the kernel with debug info”
  • Enable the “Compile the kernel with frame pointers”
  • Enable the “Provide GDB scripts for kernel debugging”.

Search for “KGDB: kernel debugger” and make sure it is checked.

Go to the build directory and build from there:

cd build
make bzImage -j $(grep -c ^processor /proc/cpuinfo)

Creating Initramfs

We need some kind of environment with basic command line tools, like those provided by coreutils: ls, cat, mkdir, etc. For that we use an initramfs (initial RAM file system). The idea is to provide a minimal root file system as a temporary file system, so the kernel can do all its preparation before mounting the real file system. We will build it with BusyBox.

cd /usr/src/busybox
mkdir build
make menuconfig O=build

Select “Busybox Settings” > “General Configuration” and uncheck “Enable options for full-blown desktop systems”. Go back, select “Build Options”, and check “Build BusyBox as a static binary (no shared libs)”.

cd build
make && make install

This will create a new directory, “_install”, and install BusyBox there. This _install directory will be the base of our initramfs.

Next we create a new “initramfs” directory and add some configuration files inside.

mkdir /usr/src/initramfs
cd /usr/src/initramfs
cp -R /usr/src/busybox/build/_install rootfs
cd rootfs
rm linuxrc
mkdir -p dev etc newroot proc src sys tmp

We don’t need “linuxrc” since we are going to use the initramfs boot scheme.

Create a file etc/wall.txt and fill it with:

######################################
#                                    #
#      Kernel Debug Environment      #
#                                    #
######################################

Remember init? Once our kernel is up, we need init to spawn everything necessary. In our minimalistic system, however, init only needs to spawn a tty. Now create an init file and populate it with the following content.

#!/bin/sh
# Lower the console log level so kernel messages don't flood the terminal
dmesg -n 1

# Mount the pseudo file systems
mount -t devtmpfs none /dev
mount -t proc none /proc
mount -t sysfs none /sys

cat /etc/wall.txt

# Respawn a shell attached to a controlling tty whenever it exits
while true; do
   setsid cttyhack /bin/sh
done

cttyhack is a small utility that attaches the shell to a controlling tty. This way we can rest assured that when we execute the “exit” command, a new shell will be started automatically.

We need to make the init file executable.

chmod +x init

Next we need to pack the initramfs.

cd /usr/src/initramfs/rootfs
find . | cpio -H newc -o | gzip > ../rootfs.igz

Running Kernel on Qemu

Next thing to do is to launch the kernel inside Qemu.

qemu-system-x86_64 -no-kvm -s -S \
    -kernel /usr/src/linux/build/arch/x86/boot/bzImage \
    -initrd /usr/src/initramfs/rootfs.igz

At this point, we will see a blank QEMU terminal window.

The -s option is a shorthand for -gdb tcp::1234, i.e. open a gdbserver on TCP port 1234.

The -S option freezes the CPU at startup, so the guest does not start running until we tell it to. QEMU is now waiting for a debugger to attach and start the kernel.

Running GDB

QEMU has loaded the kernel and is waiting for a debugger. The next step is to launch GDB and start debugging.

gdb

On the host, we need to load the symbols of the same kernel that QEMU runs (the uncompressed vmlinux in the build directory) and point GDB’s target at QEMU.

file /usr/src/linux/build/vmlinux
set architecture i386:x86-64:intel
set remote interrupt-sequence Ctrl-C
target remote :1234

Let’s try it, using GDB:

continue
bt

For now, GDB does not cope with the size of the registers changing at runtime. During boot, our kernel switches from 16-bit to 32-bit (and then 64-bit) mode. Because we passed -S, QEMU stopped at startup and GDB attached while the CPU was still in the early mode; once Linux switches to the full 64-bit (or 32-bit) kernel, GDB will keep complaining about “Remote 'g' packet reply is too long” unless we do something.

To circumvent it, we can just disconnect and then reconnect.

disconnect
set architecture i386:x86-64:intel
target remote :1234



Linux Kernel Source & Versioning

Kernel Versioning

Anyone can build the Linux kernel. The Linux kernel is provided freely at http://www.kernel.org/, from the earliest versions to the latest. Kernels are released regularly and use a versioning scheme to distinguish earlier and later releases. To find out the running Linux kernel version, the simple command uname can be used. For example, I invoke it and get:

# uname -r
3.7.8-gnx-z30a

In that output you can see the dotted decimal string 3.7.8: this is the Linux kernel version. The first value, 3, denotes the major release number; the second, 7, denotes the minor release; and the third, 8, is called the revision number. The major release combined with the minor release is called the kernel series, so I am using a 3.7 series kernel.

The string after 3.7.8 is gnx-z30a. I use a self-compiled kernel and add -gnx-z30a as a signature to my kernel version. Distributions such as Ubuntu, Fedora, and Red Hat also append their own signatures after the version.
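
As a sketch of how such a signature gets into the version string (the paths and the tag here are only illustrative, and an existing .config in the source tree is assumed), the local version can be set before building:

cd /usr/src/linux
./scripts/config --set-str LOCALVERSION "-gnx-z30a"   # append the signature to the version
make -j"$(nproc)" bzImage
# after booting this kernel, uname -r reports something like 3.7.8-gnx-z30a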

An example of building a kernel can be found in the debugging article above.

Kernel Source Exploration

To build the Linux kernel you will need the latest, or any other stable, kernel sources. As an example we take the sources of stable kernel release 3.8.2. Different versions of the Linux kernel sources can be found at http://www.kernel.org; get the latest or any stable release from there.

Assuming you have downloaded the stable kernel release source to your machine, extract it and put it into the /usr/src directory.
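
A rough sketch (the exact URL and directory name depend on the release you pick):

wget https://cdn.kernel.org/pub/linux/kernel/v3.x/linux-3.8.2.tar.xz
tar -xf linux-3.8.2.tar.xz -C /usr/src
cd /usr/src/linux-3.8.2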

Most of the kernel source is written in C. It is organized into various directories and subdirectories, and each directory is named after what it contains. The directory structure of the kernel looks roughly like the diagram below.

(Figure: Linux kernel source tree)

Now let’s dive deeper into each directory.

arch/

The Linux kernel can be installed on anything from a handheld device to a huge server. It supports Intel, Alpha, MIPS, ARM, SPARC, and other processor architectures. This arch directory contains a subdirectory for each supported processor architecture, and each subdirectory contains the architecture-dependent code. For example, for a PC the code is under arch/x86, while for ARM processors it is under arch/arm (or arch/arm64 for 64-bit ARM).

init/

LILO, or another Linux loader, loads the kernel into memory, and control is then passed to an assembler routine, arch/x86/kernel/head_32.S or head_64.S. This routine is responsible for hardware initialization and hence is architecture specific. Once hardware initialization is done, control is passed to the start_kernel() routine defined in init/main.c. This routine is analogous to the main() function of a C program: it is the starting point of the kernel code. After the architecture-specific setup is done, kernel initialization starts, and that initialization code is kept under the init directory. The code in this directory is responsible for proper kernel initialization, which includes setting up page addresses, the scheduler, traps, IRQs, signals, timers, the console, and so on. It is also responsible for processing the boot-time command-line arguments.
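
For instance, the boot-time command line that this code parses can be inspected on any running system:

cat /proc/cmdline   # kernel parameters passed in by the boot loader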

crypto/

This directory contains the source code of different cipher and hash algorithms, e.g. MD5, SHA-1, Blowfish, Serpent, and many more. All these algorithms are implemented as kernel modules; they can be loaded and unloaded at run time. We will talk about kernel modules in subsequent chapters.
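
A quick way to see this on a running system (the module name is only an example and may be built in on your kernel):

grep -A2 'name *: sha1' /proc/crypto   # algorithms currently registered
sudo modprobe blowfish_generic         # load one algorithm as a module
lsmod | grep blowfish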

Documentation/

This directory contains documentation of kernel sources.

drivers/

Device driver code can be seen as split into two parts. One part communicates with the user: it takes commands from the user, displays output to the user, and so on. The other part communicates with the device: controlling it, and sending commands to and receiving data from it. The part of the device driver that communicates with the user is hardware independent and resides under this drivers directory, which contains the source code of the various device drivers. Device drivers are implemented as kernel modules. As a matter of fact, the majority of the Linux kernel code consists of device driver code, so the majority of our discussion will revolve around device drivers.

This directory is further divided into subdirectories according to the class of devices whose driver code they contain, for example:

  • drivers/block/ – drivers for block devices, e.g. hard disks.
  • drivers/cdrom/ – drivers for proprietary CD-ROM drives.
  • drivers/char/ – drivers for character devices, e.g. terminals, serial ports, mice.
  • drivers/isdn/ – ISDN drivers.
  • drivers/net/ – drivers for network cards.
  • drivers/pci/ – drivers for PCI bus access and control.
  • drivers/scsi/ – drivers for SCSI devices.
  • drivers/ide/ – drivers for IDE devices.
  • drivers/sound/ – drivers for various sound cards.

The other part of a device driver, the one that communicates with the device, is hardware dependent, more specifically bus dependent: it depends on the type of bus the device uses for communication. This bus-specific code resides under the arch/ directory.

fs/

Linux has support for a lot of file systems, e.g. ext2, ext3, FAT, VFAT, NTFS, NFS, JFFS, and more. The source code for each supported file system is kept in this directory under a file-system-specific subdirectory, e.g. fs/ext2, fs/ext3, etc.

Linux also provides a virtual file system (VFS) that acts as a wrapper around these different file systems. The VFS interface enables the user to use different file systems under one single root (/). The code for the VFS resides here as well. Data structures related to the VFS are defined in include/linux/fs.h; please take note, it is a very important header file for kernel development.
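
A small illustration of the idea (device names and mount points here are hypothetical):

mount -t ext3 /dev/sda1 /mnt/disk            # an ext3 volume...
mount -t vfat /dev/sdb1 /mnt/usb             # ...and a FAT volume, both under the same root
mount | awk '{print $5 " mounted on " $3}'   # file system types the VFS is serving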

kernel/

This is one of the most important directories in the kernel. It contains the generic code of the core kernel subsystems, i.e. the code for system calls, timers, the scheduler, DMA, interrupt handling, and signal handling. The architecture-specific kernel code is kept under arch/*/kernel.

include/

Along with the kernel/ directory, this include/ directory is very important for kernel development. It contains the generic kernel headers, organized into many subdirectories (include/linux, include/net, and so on); the architecture-specific header files live under arch/*/include.

ipc/

The code for all three System V IPC mechanisms (semaphores, shared memory, and message queues) resides here.

lib/

Kernel’s library code is kept under this directory. The architecture specific library’s code resides under arch/*/lib.

mm/

This too is a very important directory from the kernel development perspective. It contains the generic code for memory management and the virtual memory subsystem; again, the architecture-specific code is under arch/*/mm/. This part of the kernel is responsible for requesting and releasing memory, paging, page fault handling, memory mapping, the various caches, etc.

net/

The code for the kernel’s networking subsystem resides here. It includes the code for various protocols such as TCP/IP, ARP, Ethernet, ATM, and Bluetooth, as well as the socket implementation; it is quite an interesting directory to look into for networking geeks.

scripts/

This directory contains the kernel build and configuration subsystem: the scripts and code used to configure and build the kernel.

security/

This directory includes security functions and SELinux code, implemented as kernel modules.

sound/

This directory contains the code for the sound subsystem.

module/

When the kernel is compiled, a lot of code is compiled as modules, which are added to the kernel later, at runtime. This directory holds those modules; it will be empty until the kernel has been built at least once.

Apart from these important directories, there are also a few files under the root of the kernel sources.

  • COPYING – the copyright and licensing terms (GNU GPL v2).
  • CREDITS – a partial credits file listing people who have contributed to the Linux project.
  • MAINTAINERS – the list of maintainers of kernel subsystems and drivers; it also describes how to submit kernel changes.
  • Makefile – the kernel’s main (root) makefile.
  • README – the release notes for the Linux kernel; they explain how to install and patch the kernel, and what to do if something goes wrong.

Documentation

We can use the make documentation targets to generate the Linux kernel documentation. By running these targets we can build the documents in several formats: PDF, HTML, man pages, PostScript, etc.

To generate the kernel documentation, give any of the following commands from the root of your kernel sources:

make pdfdocs
make htmldocs
make mandocs
make psdocs

Source Browsing

Browsing the source code of a large project like the Linux kernel can be very tedious and time consuming. Unix systems provide two tools, ctags and cscope, for browsing the codebase of large projects, and source code browsing becomes very convenient with them. The Linux kernel has built-in support for cscope; a usage sketch follows the list below.

Using cscope, we can:

  • Find all references of a symbol
  • Find a function’s definition
  • Find the caller graph of a function
  • Find a particular text string
  • Change a particular text string
  • Find a particular file
  • Find all the files that include a particular file
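
A minimal sketch, assuming the 3.8.2 tree from the earlier example:

cd /usr/src/linux-3.8.2
make cscope                    # kernel make target: builds the cscope database
cscope -d -q                   # browse the database interactively
cscope -d -L1 start_kernel     # non-interactive query: find the definition of start_kernel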


Kernel Mode and Context

User Mode and Kernel Mode

In Linux, software falls into two categories: user programs and the kernel. The Linux kernel runs in a special privileged mode compared to user applications: in this mode the kernel runs in a protected memory space and has access to the entire hardware. This memory space and privileged state are collectively known as kernel space or kernel mode.

In contrast, user applications run in user space and have limited access to resources and hardware. A user-space application cannot directly access hardware or kernel-space memory, while the kernel has access to the entire memory space. To communicate with the hardware, a user application needs to make a system call and ask the kernel for the service.

Different Contexts of Kernel Code

The entire kernel code can be divided into three categories:

  1. Process Context
  2. Interrupt Context
  3. Kernel Context

Process Context

User applications cannot access kernel space directly, but there is an interface through which they can call functions defined in kernel space. This interface is known as a system call; a user application can request kernel services using a system call.

The read() and write() calls are examples of system calls. A user application calls read()/write(), which in turn invokes sys_read()/sys_write() in kernel space. In this case, kernel code executes a request of a user-space application. Kernel code that executes on the request, or on behalf, of a user application is called process context code. All system calls fall into this category.
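
A simple way to watch this boundary crossing from user space is strace, which prints every system call a program makes; each line is a moment when the kernel runs process-context code on behalf of the program:

strace -e trace=open,read,write,close cat /etc/hostname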

Interrupt Context

Whenever a device wants to communicate with the kernel, it sends an interrupt signal to the kernel. The moment the kernel receives an interrupt request from the hardware, it starts executing some routine in response to that request. This response routine is called an interrupt service routine, or interrupt handler. The interrupt handler routine is said to execute in interrupt context.
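
The effect of interrupt handlers can be observed from user space; each counter below increases every time the corresponding handler runs in interrupt context:

cat /proc/interrupts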

Kernel Context

There is some code in the Linux kernel that is invoked neither by a user application nor by an interrupt. This code is integral to the kernel and keeps running all the time: memory management, process management, and the I/O schedulers all fall into this category. This code is said to execute in kernel context.
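
Much of this kernel-context work is carried out by kernel threads, which are visible from user space as children of kthreadd (PID 2):

ps --ppid 2 -o pid,comm | head   # a few of the kernel threads doing kernel-context work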


Linux Kernel, Components, and Integration

Kernel and Linux Kernel

In computer science terms, the kernel is the core of an operating system. A machine (for example, a personal computer) can use various hardware produced by different vendors, all assembled into a single machine. Hardware such as the processor, RAM, and hard disk are the components that make up a computer; but once the computer is built, we need an operating system to make all of this hardware usable. That is the kernel’s job.

The operating system receives requests from the user and processes them on the user’s behalf. Requests are received through a command shell, or some other kind of user interface, and are processed by the kernel. So the kernel acts like the engine of the operating system, enabling a user to use the computer system, while the shell is the outer part of the operating system that provides the interface for communicating with the kernel.

(Figure: the kernel bridges applications and hardware)

Linux is one such kernel. It is a UNIX-like kernel created by Linus Torvalds in 1991. Linux is open source, which means everyone can contribute, develop, and build their own kernel on top of Linus’s kernel. Nowadays every smart system uses a kernel to operate, and many of those systems use Linux (or a subset of it).

Components

If we look closer, the kernel can be divided into several components. The major components forming a kernel are:

  • Low Level Drivers : architecture-specific drivers, responsible for CPU, MMU, and on-board device initialization.
  • Process Scheduler : responsible for fair allocation of CPU time slices to different processes. Imagine you have a limited resource and must ensure every application gets a fair share of it.
  • Memory Manager : responsible for allocating and sharing memory among different processes.
  • File System : Linux supports many file system types, e.g. FAT, NTFS, JFFS, and many more. The user doesn’t have to worry about the complexities of the underlying file system type; for this Linux provides a single interface, the virtual file system (VFS). Through this single interface users can use the services of the different underlying file systems, whose complexities are abstracted away.
  • Network Interface : this component of the Linux kernel provides access to and control of the different networking devices.
  • Device Drivers : these are the high level drivers.
  • IPC : Inter Process Communication; the IPC subsystem allows different processes to share data among themselves.

(Figure: kernel components)

Integrations

We have seen that a kernel consists of different components. The integration design tells how these different components are integrated to create the kernel’s binary image.

There are mainly two integration designs used for operating system kernels: monolithic and micro. Although there are more than two, we will limit our discussion to the two most widely used.

In the monolithic design, all the kernel components are built together into a single static binary image. At boot-up time the entire kernel gets loaded and then runs as a single process in a single address space. All the kernel components and services exist in that static kernel image, and all of them are running and available all the time.

Since everything inside the kernel resides in a single address space, no IPC-like mechanism is needed for communication between kernel services. For these reasons monolithic kernels offer high performance. Most Unix kernels are monolithic.

The disadvantage of this static kernel is the lack of modularity and hot-swap ability. Once the static kernel image is loaded, we can’t add or remove any component or service from the kernel; our only option is to change the hard-coded kernel and rebuild it. Such a kernel also uses more memory, so resource consumption is higher for monolithic kernels.

The second kind of kernel is the microkernel. In a microkernel, a single static kernel image is not built; instead, the kernel is broken down into different small services.

At boot-up time only the core kernel services are loaded, and they run in privileged mode. Whenever some other service is required, it has to be loaded before it can run.

Unlike in a monolithic kernel, not all services are up and running all the time; they run as and when requested. Also unlike monolithic kernels, services in microkernels run in separate address spaces, so communication between two different services requires an IPC mechanism. For all these reasons microkernels are not high-performance kernels, but they require fewer resources to run.

The Linux kernel takes the best of both designs. Fundamentally it is a monolithic kernel: the entire Linux kernel and all its services run as a single process, in a single address space, achieving very high performance. But it also has the capability to load and unload services at run time in the form of kernel modules.
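
A small sketch of that modularity in action (assuming the loop driver is built as a module on your system):

lsmod | head            # modules currently loaded into the running kernel
sudo modprobe loop      # load a service (the loop block driver) at run time
sudo modprobe -r loop   # and unload it again (fails if the module is in use)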
