Device Drivers, Part 9: I/O Control in Linux


This article, which is part of the series on Linux device drivers, talks about the typical ioctl() implementation and usage in Linux.

“Get me a laptop, and tell me about the x86 hardware interfacing experiments in the last Linux device drivers’ lab session, and also about what’s planned for the next session,” cried Shweta, exasperated at being confined to bed due to food poisoning at a friend’s party.

Shweta’s friends summarised the session, and told her that they didn’t know what the upcoming sessions, though related to hardware, would be about. When the doctor requested them to leave, they took the opportunity to plan and talk about the most common hardware-controlling operation, ioctl().

Introducing ioctl()

Input/Output Control (ioctl, in short) is a common operation, or system call, available in most driver categories. It is a one-size-fits-all kind of system call. If there is no other system call that meets a particular requirement, then ioctl() is the one to use.

Practical examples include volume control for an audio device, display configuration for a video device, and reading device registers. In short, it covers anything to do with device input/output or device-specific operations, and it is versatile enough for almost any other operation as well (for example, debugging a driver by querying its data structures).

The question is: how can all this be achieved by a single function prototype? The trick lies in using its two key parameters: command and argument. The command is a number representing an operation. The argument is the corresponding parameter for that operation. The ioctl() function implementation does a switch … case over the command to implement the corresponding functionality. The following has been its prototype in the Linux kernel for quite some time:

int ioctl(struct inode *i, struct file *f, unsigned int cmd, unsigned long arg);

However, from kernel 2.6.35, it changed to:

long ioctl(struct file *f, unsigned int cmd, unsigned long arg);

If there is a need for more arguments, all of them are put in a structure, and a pointer to the structure becomes the ‘one’ command argument. Whether integer or pointer, the argument is received as an unsigned long in kernel-space, and is type-cast and processed accordingly.

ioctl() is typically implemented as part of the corresponding driver, and then an appropriate function pointer is initialised with it, exactly as in other system calls like open(), read(), etc. For example, in character drivers, it is the ioctl or unlocked_ioctl (since kernel 2.6.35) function pointer field in the struct file_operations that is to be initialised.

Again, like other system calls, it can be equivalently invoked from user-space using the ioctl() system call, prototyped in <sys/ioctl.h> as:

int ioctl(int fd, int cmd, ...);

Here, cmd is the same as what is implemented in the driver’s ioctl(), and the variable argument construct (...) is a hack to be able to pass any type of argument (though only one) to the driver’s ioctl(). Other parameters will be ignored.

Note that both the command and command argument type definitions need to be shared across the driver (in kernel-space) and the application (in user-space). Thus, these definitions are commonly put into a header file that both sides include.

Querying driver-internal variables

To better understand the boring theory explained above, here’s the code set for the “debugging a driver” example mentioned earlier. This driver has three static global variables: status, dignity, and ego, which need to be queried and possibly modified from an application. The header file query_ioctl.h defines the corresponding commands and command argument type. A listing follows:

#ifndef QUERY_IOCTL_H
#define QUERY_IOCTL_H
#include <linux/ioctl.h>

typedef struct
{
	int status, dignity, ego;
} query_arg_t;

/* The third macro argument is the argument type itself; its size gets encoded */
#define QUERY_GET_VARIABLES _IOR('q', 1, query_arg_t)
#define QUERY_CLR_VARIABLES _IO('q', 2)
#define QUERY_SET_VARIABLES _IOW('q', 3, query_arg_t)

#endif

Using these, the driver’s ioctl() implementation in query_ioctl.c would be as follows:

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/version.h>
#include <linux/fs.h>
#include <linux/cdev.h>
#include <linux/device.h>
#include <linux/errno.h>
#include <linux/uaccess.h>

#include "query_ioctl.h"

#define FIRST_MINOR 0
#define MINOR_CNT 1

static dev_t dev;
static struct cdev c_dev;
static struct class *cl;
static int status = 1, dignity = 3, ego = 5;

static int my_open(struct inode *i, struct file *f)
{
	return 0;
}
static int my_close(struct inode *i, struct file *f)
{
	return 0;
}
#if (LINUX_VERSION_CODE < KERNEL_VERSION(2,6,35))
static int my_ioctl(struct inode *i, struct file *f, unsigned int cmd, unsigned long arg)
#else
static long my_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
#endif
{
	query_arg_t q;

	switch (cmd)
	{
		case QUERY_GET_VARIABLES:
			q.status = status;
			q.dignity = dignity;
			q.ego = ego;
			if (copy_to_user((query_arg_t *)arg, &q, sizeof(query_arg_t)))
			{
				return -EFAULT;
			}
			break;
		case QUERY_CLR_VARIABLES:
			status = 0;
			dignity = 0;
			ego = 0;
			break;
		case QUERY_SET_VARIABLES:
			if (copy_from_user(&q, (query_arg_t *)arg, sizeof(query_arg_t)))
			{
				return -EFAULT;
			}
			status = q.status;
			dignity = q.dignity;
			ego = q.ego;
			break;
		default:
			return -EINVAL;
	}

	return 0;
}

static struct file_operations query_fops =
{
	.owner = THIS_MODULE,
	.open = my_open,
	.release = my_close,
#if (LINUX_VERSION_CODE < KERNEL_VERSION(2,6,35))
	.ioctl = my_ioctl
#else
	.unlocked_ioctl = my_ioctl
#endif
};

static int __init query_ioctl_init(void)
{
	int ret;
	struct device *dev_ret;


	if ((ret = alloc_chrdev_region(&dev, FIRST_MINOR, MINOR_CNT, "query_ioctl")) < 0)
	{
		return ret;
	}

	cdev_init(&c_dev, &query_fops);

	if ((ret = cdev_add(&c_dev, dev, MINOR_CNT)) < 0)
	{
		unregister_chrdev_region(dev, MINOR_CNT);
		return ret;
	}
	
	if (IS_ERR(cl = class_create(THIS_MODULE, "char")))
	{
		cdev_del(&c_dev);
		unregister_chrdev_region(dev, MINOR_CNT);
		return PTR_ERR(cl);
	}
	if (IS_ERR(dev_ret = device_create(cl, NULL, dev, NULL, "query")))
	{
		class_destroy(cl);
		cdev_del(&c_dev);
		unregister_chrdev_region(dev, MINOR_CNT);
		return PTR_ERR(dev_ret);
	}

	return 0;
}

static void __exit query_ioctl_exit(void)
{
	device_destroy(cl, dev);
	class_destroy(cl);
	cdev_del(&c_dev);
	unregister_chrdev_region(dev, MINOR_CNT);
}

module_init(query_ioctl_init);
module_exit(query_ioctl_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("Query ioctl() Char Driver");

And finally, the corresponding invocation functions from the application query_app.c would be as follows:

#include <stdio.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <sys/ioctl.h>

#include "query_ioctl.h"

void get_vars(int fd)
{
	query_arg_t q;

	if (ioctl(fd, QUERY_GET_VARIABLES, &q) == -1)
	{
		perror("query_apps ioctl get");
	}
	else
	{
		printf("Status : %d\n", q.status);
		printf("Dignity: %d\n", q.dignity);
		printf("Ego    : %d\n", q.ego);
	}
}
void clr_vars(int fd)
{
	if (ioctl(fd, QUERY_CLR_VARIABLES) == -1)
	{
		perror("query_apps ioctl clr");
	}
}
void set_vars(int fd)
{
	int v;
	query_arg_t q;

	printf("Enter Status: ");
	scanf("%d", &v);
	getchar();
	q.status = v;
	printf("Enter Dignity: ");
	scanf("%d", &v);
	getchar();
	q.dignity = v;
	printf("Enter Ego: ");
	scanf("%d", &v);
	getchar();
	q.ego = v;

	if (ioctl(fd, QUERY_SET_VARIABLES, &q) == -1)
	{
		perror("query_apps ioctl set");
	}
}

int main(int argc, char *argv[])
{
	char *file_name = "/dev/query";
	int fd;
	enum
	{
		e_get,
		e_clr,
		e_set
	} option;

	if (argc == 1)
	{
		option = e_get;
	}
	else if (argc == 2)
	{
		if (strcmp(argv[1], "-g") == 0)
		{
			option = e_get;
		}
		else if (strcmp(argv[1], "-c") == 0)
		{
			option = e_clr;
		}
		else if (strcmp(argv[1], "-s") == 0)
		{
			option = e_set;
		}
		else
		{
			fprintf(stderr, "Usage: %s [-g | -c | -s]\n", argv[0]);
			return 1;
		}
	}
	else
	{
		fprintf(stderr, "Usage: %s [-g | -c | -s]\n", argv[0]);
		return 1;
	}
	fd = open(file_name, O_RDWR);
	if (fd == -1)
	{
		perror("query_apps open");
		return 2;
	}

	switch (option)
	{
		case e_get:
			get_vars(fd);
			break;
		case e_clr:
			clr_vars(fd);
			break;
		case e_set:
			set_vars(fd);
			break;
		default:
			break;
	}

	close (fd);

	return 0;
}

Now try out query_app.c and query_ioctl.c with the following operations:

  • Build the query_ioctl driver (query_ioctl.ko file) and the application (query_app file) by running make, using the following Makefile:
    # If called directly from the command line, invoke the kernel build system.
    ifeq ($(KERNELRELEASE),)
    
    	KERNEL_SOURCE := /usr/src/linux
    	PWD := $(shell pwd)
    default: module query_app
    
    module:
    	$(MAKE) -C $(KERNEL_SOURCE) M=$(PWD) modules
    
    clean:
    	$(MAKE) -C $(KERNEL_SOURCE) M=$(PWD) clean
    	${RM} query_app
    
    # Otherwise KERNELRELEASE is defined; we've been invoked from the
    # kernel build system and can use its language.
    else
    
    	obj-m := query_ioctl.o
    
    endif
  • Load the driver using insmod query_ioctl.ko.
  • With appropriate privileges and command-line arguments, run the application query_app:
    • ./query_app — to display the driver variables
    • ./query_app -c — to clear the driver variables
    • ./query_app -g — to display the driver variables
    • ./query_app -s — to set the driver variables (not mentioned above)
  • Unload the driver using rmmod query_ioctl.

Defining the ioctl() commands

"Visiting time is over," yelled the security guard. Shweta thanked her friends, since she could now understand most of the code, including the need for copy_to_user(), as learnt earlier. But she wondered about _IOR, _IO, etc., which were used in defining the commands in query_ioctl.h. These are still ordinary numbers, as mentioned earlier for an ioctl() command. The only difference is that some useful command-related information is now also encoded into these numbers by the macros, following the Linux convention for ioctl command numbers. Under this convention (as laid out on most architectures), a 32-bit command number is formed of four components embedded into bits [31:0]:

  1. The direction of command operation [bits 31:30] -- read, write, both, or none -- filled by the corresponding macro (_IOR, _IOW, _IOWR, _IO).
  2. The size of the command argument [bits 29:16] -- computed using sizeof() with the command argument's type -- the third argument to these macros.
  3. The 8-bit magic number [bits 15:8] -- to render the commands unique enough -- typically an ASCII character (the first argument to these macros).
  4. The original command number [bits 7:0] -- the actual command number (1, 2, 3, ...), defined as per our requirement -- the second argument to these macros.

Check out the header <asm-generic/ioctl.h> for implementation details.

  • dastagir

    Sir, u are great. U helped me a lot. pls keep going

    • anil_pugalia

      Thanks for your appreciation. I am happy, it helped you.

  • http://twitter.com/anil_pugalia Anil Pugalia

    Thanks for appreciating.

  • rakesh

    I became fan of you for your generosity to take time and explain so greatly.
    Thank you

    • http://www.facebook.com/anil.pugalia Anil Pugalia

      Thanks for becoming my fan.

  • Senthil

    Really excellent work ,i have read so many books for understanding “ioctl” but everybody explains it as a toughest thing in Device drivers but you are outstanding in explaining it in a simple way.

    • http://twitter.com/anil_pugalia Anil Pugalia

      Thanks for reading & writing the feedback.

  • focus

    I enjoy reading your tutorial.. Thanks for keeping it simple and clear. Keep up the good work.

    • http://twitter.com/anil_pugalia Anil Pugalia

      Thanks for reading & appreciating.

  • Rama

    Hi Anil,

    Really these articles are very useful to start up. I have one query on open call.

    In the open system call, if we give a device file name as input, it finally interacts with the device_open function in the driver. If we look at the device_open syntax, the first parameter is struct inode *. The struct inode internally has i_cdev, and by using macros we can get the major and minor numbers. The question is: how is the device file name linked to the driver's open function? How does the device file name in the open system call identify the major number? After success, it returns a file descriptor. I know the sequence once we have the file descriptor and how it interacts with the file structure. Could you please explain how the open system call works internally from user space through to kernel space, not with respect to general system call execution, but with respect to the flow from the device file name to driver_open?

    • http://twitter.com/anil_pugalia Anil Pugalia

      user space open call with the file name goes to VFS, which checks out the inode of the file, based on its absolute path, thus populating the struct inode for it. And then creates a struct file based on that, and then it invokes the driver_open (the actual system call), with both of these as the parameters. On success of the same, a per-process free file descriptor (integer) is allotted and returned back to the user. Hope that clarifies your doubt. In short, VFS is the key translator.

  • John

    Hi sir,

    I want to write a module that returns inode of specific file ,so how can i code it to get inode of particular file?

    • http://twitter.com/anil_pugalia Anil Pugalia

      You do not really need to write a module for that. You may use the already available system call “lstat”.

      • John

        Thanks for the reply Sir,

        Now I want to trace the code of the inode and ioctl files of ext4 using printk statements, to better understand their working. How can I do that? Do I need to recompile the kernel?

        • http://twitter.com/anil_pugalia Anil Pugalia

          If ext4 is compiled as a module, you do not need to recompile the kernel, but just recompile the module; otherwise recompile the kernel by making ext4 as module and then follow the above step.

  • Anand

    Great article..Understood lots of things..simple prgrm explanation,now evn i can start to write my own in better manner..thanks..i would like to get these type of DD explanation more n more..:-)

    • anil_pugalia

      Thanks for your appreciation.

  • http://www.facebook.com/bicepjai Jayaram Prabhu

    amazing materials you have here ! keep up the gud work !

    • http://twitter.com/anil_pugalia Anil Pugalia

      Thanks for the motivational words.

  • dhanamjaya

    what is the use of magicnumber

    • anil_pugalia

      To make it unique enough.

  • Eshwar

    For Latest Kernels use the following Makefile

    #3.7.10-1.16
    # If called directly from the command line, invoke the kernel build system.
    ifeq ($(KERNELRELEASE),)

    # KERNEL_SOURCE := /usr/src/linux
    KERNEL_SOURCE := /lib/modules/$(shell uname -r)/build
    PWD := $(shell pwd)

    default: module query_app

    #make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
    module:
    	$(MAKE) -C $(KERNEL_SOURCE) SUBDIRS=$(PWD) modules

    clean:
    	$(MAKE) -C $(KERNEL_SOURCE) SUBDIRS=$(PWD) clean
    	${RM} query_app

    # Otherwise KERNELRELEASE is defined; we've been invoked from the
    # kernel build system and can use its language.
    else

    obj-m := query_ioctl.o

    endif

    • anil_pugalia

      It has nothing to do with the latest kernels. It’s how the kernel source/headers are organized in your particular distro. In fact, this file would work with previous versions of the kernel as well.

  • Arvind

    I have one query, What is the use of encoding command? why they have not used just command number?

    • anil_pugalia

      The intention behind this encoding is to be able to verify the parameters, direction, etc. related to the command, even before it is passed to the driver, say by VFS. It is just that Linux has not yet implemented the verification part.

  • vlad

    Hi Anil
    Thanks for these articles.
    I have one question. I built and ran the query_ioctl.ko and query_app. When I had insmoded the module and then I ran the query_app -g I obtained the result status: 1, diginity: 3, ego: 5. Please, can you explain where and how are these variables initialized to these values?

    • anil_pugalia

      They are defined as static global variables in the driver. Check out the variable definitions in query_ioctl.c

      • vlad

        Yeah, I can see it now.
        I think I should analyze your examples in more details line by line, not only copying & compiling.
        Thanks.

        • anil_pugalia

          :) Sure. If you want to get the real hang of it, you should analyze, at least after you have copied, compiled, and executed.

  • Peter

    Thanks for this article. It helped me understand the basics. However, when I asked for advice about a driver-related kernel crash, a friend who develops kernel drivers for Windows and Linux said that user memory must be locked (e.g. via mlock()), otherwise it cannot be correct, as the memory may have been swapped out. Any comment? Thanks.

    • anil_pugalia

      You need not do that explicitly, as that is taken care of by the copy_from_user() & copy_to_user() functions in the above driver.

  • Tulga Khosbayar

    I’m getting the error( *** missing separator. Stop.)

    I think it’s because of the makefile. Can you help me?

    • pjmpjm

      @tulgakhosbayar:disqus

      replace spaces at beginning of lines in Makefile with tab
      hope this helps

      pjm

    • Prakash S

      Try changing line no. 2 from ‘ifeq’ to ‘ifneq’ and delete ‘else’ on line 17.
      If that doesn't work, comment your error here.

  • Sunitha Parunandi

    Hi Sir,

    can i test this step by step using gdb , if so please how can set up to test this driver.

    • anil_pugalia

      Not directly, as gdb is convenient for user space app debugging. Though, it can be used for remote debugging, if you have enabled the kgdb into kernel. Or, you may use kdb, again if it is enabled in the kernel.

  • sumik

    can you please provide a proper make file

  • Newt

    Wow. This tutorial is amazing! Thank you so much! :)

    • anil_pugalia

      Thanks for your appreciation.

  • thatskriptkid

    Error recovery is sometimes best handled with the goto statement. (c) Linux Device Drivers, 3rd edition, Chapter 2.

    You should not handle errors like you did, instead use GOTO (kernel space style). For example:

    static int chrdev_init(void)
    {
    	device_class = class_create(THIS_MODULE, DEV_NAME);
    	if (IS_ERR(device_class)) {
    		printk(KERN_WARNING "class_create() failed\n");
    		return 1;
    	}

    	/* allocate char device number dynamically */
    	if (alloc_chrdev_region(&DEV_NUM, 0, 1, DEV_NAME) != SUCCESS) {
    		printk(KERN_WARNING "alloc_chrdev_region() failed\n");
    		goto class_destroy;
    	}

    	if (cdev_create()) {
    		printk(KERN_WARNING "cdev_create() failed\n");
    		goto unreg;
    	}

    	/* creates a device and registers it with sysfs */
    	device = device_create(device_class, NULL, DEV_NUM, NULL, DEV_NAME);
    	if (IS_ERR(device)) {
    		printk(KERN_WARNING "device_create() failed\n");
    		goto cdev_del;
    	}

    	return SUCCESS; /* chrdev_init() success */

    /* Unwind in the reverse order of setup; each label falls through to the next. */
    cdev_del:
    	cdev_del(&my_cdev);
    unreg:
    	unregister_chrdev_region(DEV_NUM, 1);
    class_destroy:
    	class_destroy(device_class);
    	return 1;
    }

    • anil_pugalia

      You are perfectly correct. It is just because of the habit, I developed during my early days of programming.

All published articles are released under Creative Commons Attribution-NonCommercial 3.0 Unported License, unless otherwise noted.