Have you ever experienced a hard lockup and found no trace of the cause in your log files? Those situations are even more of a pain if you do not have physical access to the machine, since you cannot look for a kernel oops on the console. You could buy a serial console or an IP KVM, but if you don’t need remote control and just want to be able to debug without being physically present, you should check out netconsole. Netconsole sends printk messages over UDP.

Setting up netconsole is not difficult but the syntax can be a bit tiresome. Netconsole needs several bits of information in order to function properly.

  • dev_name – Local network interface name
  • local_port – Source UDP port to use
  • remote_port – Remote agent’s UDP port
  • local_ip – Source IP address to use
  • remote_ip – Remote agent’s IP address
  • local_mac – Local interface’s MAC address
  • remote_mac – Remote agent’s MAC address

Of those, remote_mac tends to be the tricky one. Not because it’s hard to get, but because the name is slightly misleading. If the remote agent is in the same subnet, it is the MAC of the remote agent itself. If the remote agent is not in the same subnet (think logging over the internet), you need the MAC address of the gateway that will handle the traffic, which matters if you have multiple WANs. Typically you’re looking for the MAC of your default gateway.
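One way to decide which of the two cases applies is to ask the kernel how it would route a packet to the remote agent. The sketch below assumes the iproute2 `ip` command is installed (the rest of this post uses netstat and arp) and that `ip route get` prints its usual `ADDR via GATEWAY dev IFACE ...` line. The function name next_hop is made up for illustration, and special cases such as local or broadcast routes are not handled.

```shell
#!/bin/sh
# next_hop (hypothetical helper): read the output of `ip route get <addr>`
# on stdin and print the address whose MAC netconsole needs.
# If the route goes "via" a gateway, that gateway is printed; otherwise the
# destination itself is printed, because it is reachable on the local subnet.
next_hop() {
    awk '{
        hop = $1                          # default: the destination address
        for (i = 1; i < NF; i++)
            if ($i == "via") hop = $(i + 1)
        print hop
        exit                              # only the first line matters
    }'
}

# Usage (REMOTE_AGENT holds the IP of the logging host):
#   ip route get "$REMOTE_AGENT" | next_hop
```

Whatever address this prints is the one to feed into the ping and arp lookups that follow.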

Find MAC of remote agent in same subnet

MAC=$(ping -c 1 $REMOTE_AGENT > /dev/null ; arp -n $REMOTE_AGENT | grep ^$REMOTE_AGENT | awk '{print $3}')
echo Remote MAC: $MAC

Find MAC of default gw

GATEWAY=$(netstat -rn | awk '/^0\.0\.0\.0/ {print $2}')
MAC=$(ping -c 1 $GATEWAY > /dev/null ; arp -n $GATEWAY | grep ^$GATEWAY | awk '{print $3}')
echo Remote MAC: $MAC

Initialize netconsole

Now you should have enough information to go ahead and initialize netconsole, so let’s give it a test.

modprobe netconsole netconsole=local_port@local_ip/dev_name,remote_port@remote_ip/remote_mac
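Since the parameter string is easy to get wrong, it can help to assemble it from its six parts. The following is a minimal sketch: the helper name netconsole_param and all of the addresses, ports, and the MAC in the usage comment are placeholders, not values from a real setup.

```shell
#!/bin/sh
# netconsole_param (hypothetical helper): print the value for the
# netconsole= module parameter.
# Arguments, in order:
#   local_port local_ip dev_name remote_port remote_ip remote_mac
netconsole_param() {
    if [ "$#" -ne 6 ]; then
        echo "usage: netconsole_param local_port local_ip dev_name remote_port remote_ip remote_mac" >&2
        return 1
    fi
    # Layout: local_port@local_ip/dev_name,remote_port@remote_ip/remote_mac
    printf '%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Example with made-up values (replace every one of them with your own):
#   modprobe netconsole "netconsole=$(netconsole_param 6665 192.168.1.10 eth0 6666 192.168.1.20 00:11:22:33:44:55)"
```

With those placeholder values the helper prints 6665@192.168.1.10/eth0,6666@192.168.1.20/00:11:22:33:44:55.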

Now we still need to get something listening on the remote and test if it actually works. Log into your remote machine and run

nc -l -p remote_port -u | tee somelogfile.log

For a more permanent setup you might want to use syslog but this will suffice for now. If it’s a short term but long running test you might be well advised to run that from a screen session.

Good, now we have the remote listening on UDP with netcat. We should make sure that the messages are getting logged. Log back into the machine that’s running netconsole (local_ip) and run the following.

dmesg -n 8

This raises the console log level so that kernel messages of every priority, including debug messages, are sent to the console and therefore to netconsole.
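To confirm that the change took effect, you can read the current console log level back from /proc/sys/kernel/printk, whose first field is the console log level. This is a small sketch; the function name console_loglevel is made up for illustration.

```shell
#!/bin/sh
# console_loglevel (hypothetical helper): read the contents of
# /proc/sys/kernel/printk on stdin and print the first field, which is the
# current console log level.
console_loglevel() {
    awk '{ print $1; exit }'
}

# Usage:
#   console_loglevel < /proc/sys/kernel/printk
```

After running dmesg -n 8 this should report 8.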

Now find an innocuous kernel module that you can load and unload (I like to use floppy).

rmmod floppy    # in case it is already loaded
modprobe floppy

You should have seen some output on your remote machine that looks something like

Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077

Great, now you have netconsole working! If you get a kernel oops, your remote box should display it and log it to a file as well.
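If you have no harmless module to reload, another way to produce a test message is to write a line into the kernel log yourself through /dev/kmsg, which needs root. This is a sketch under that assumption: the function name send_test_message is made up, and the optional file argument exists only so the function can be tried without touching the real kernel log.

```shell
#!/bin/sh
# send_test_message (hypothetical helper): write one test line to the kernel
# log. The <4> prefix marks the message as warning priority.
# Optional argument: a file to write to instead of /dev/kmsg.
send_test_message() {
    target=${1:-/dev/kmsg}
    printf '<4>netconsole test from %s\n' "$(uname -n)" > "$target"
}

# Usage (as root, on the machine running netconsole):
#   send_test_message
```

The line should then show up in the netcat output on the remote machine.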

Want netconsole to stay active across reboots? No problem, we just need to edit a few files.

First let’s get netconsole loading on boot by adding the module to /etc/modules.

echo "netconsole" >> /etc/modules

That was easy enough, but we need to make sure it has the proper options as well, so let’s add the module options to /etc/modprobe.d/netconsole.

echo "options netconsole netconsole=local_port@local_ip/dev_name,remote_port@remote_ip/remote_mac" > /etc/modprobe.d/netconsole

That should do it. Go ahead and try rebooting the machine running netconsole, and watch your remote to see the boot messages that appear after netconsole loads.

Note: there is a dynamic way to specify how netconsole is configured, but you need to have CONFIG_NETCONSOLE_DYNAMIC in your kernel, and since Debian etch does not have this by default I won’t cover it here. For more information check out the netconsole doc in the kernel source, /usr/src/linux/Documentation/networking/netconsole.txt.

Now if you would like to make the remote side a bit more permanent, that’s pretty easy as well. Let’s install and configure syslog-ng.

aptitude install syslog-ng

Append the following to your /etc/syslog-ng/syslog-ng.conf

Note: make sure you set remote_port as you did above

source net { udp(ip("0.0.0.0") port(remote_port)); };
destination netconsole { file("/var/log/$HOST/netconsole.log"); };
log { source(net); destination(netconsole); };

Now restart syslog-ng

/etc/init.d/syslog-ng restart

Now you should be able to find the logs in /var/log/local_ip/netconsole.log on your remote machine. Note: local_ip is the IP of the machine that was running netconsole.