Appendix F - Troubleshooting
Viewing error messages at boot-time
Much information can be gained by reviewing the messages displayed when the system boots. Unfortunately, many of them scroll by too quickly to read. To view them after the system has booted, press [Shift]-[PgUp] and [Shift]-[PgDn] to scroll up and down through the system console.
ping
Use ping to try to contact and get a response from each interface. Note that, from the inside, you should be able to get a response from both interfaces on the firewall. From the outside, you should not get any responses from either interface (unless you change the default firewall rules in /etc/ipfilter.conf). Ping also works differently in Linux than in Windows. Consider the following ping command in Windows:
ping 165.138.255.10 [Enter]
Assuming there is a device at this address, and it is configured to reply to ping requests, four packets will be sent to the device at this address, and the device will send four reply packets back, at which point, you will be returned to the command prompt:
C:\WINDOWS>ping 165.138.255.10
Pinging 165.138.255.10 with 32 bytes of data:
Reply from 165.138.255.10: bytes=32 time=10ms TTL=255
Reply from 165.138.255.10: bytes=32 time=1ms TTL=255
Reply from 165.138.255.10: bytes=32 time<10ms TTL=255
Reply from 165.138.255.10: bytes=32 time<10ms TTL=255
Ping statistics for 165.138.255.10:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 10ms, Average = 2ms
C:\WINDOWS>_
In Linux, you will not see anything appear on the screen when issuing this command. This is because in Linux, you must specify the number of packets to send. If you don't, your station will continue to ping the target host until [Ctrl]-[c] is pressed to stop pinging (or [Ctrl]-[z] is pressed, which suspends the ping process). To ping a host four times from a Linux box and display the results, issue the following command:
ping -c 4 165.138.255.10 [Enter]
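To check several addresses in one pass, the ping -c 4 form above can be wrapped in a small loop. This is only a sketch: the function name ping_report is made up for illustration, and it assumes a POSIX-style shell and a ping that accepts -c and returns a non-zero exit status when no reply arrives (most implementations do).

```shell
# ping_report: hypothetical helper, not part of the firewall distribution.
# Sends four packets to each address given and prints one summary line per host.
ping_report() {
    for host in "$@"; do
        # Assumption: ping exits with status 0 only if a reply came back.
        if ping -c 4 "$host" > /dev/null 2>&1; then
            echo "$host: reachable"
        else
            echo "$host: no reply"
        fi
    done
}
```

Run from an inside machine, ping_report followed by the addresses of both firewall interfaces should report each one as reachable; run from the outside with the default rules, both should show no reply.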
ip
ip is a command that reports much-needed information about the state of your interfaces. It requires additional parameters to tell it which task to carry out. If you need to see your route table, typing ip route will display information similar to:
# ip route
165.138.255.0/25 dev eth0 proto kernel scope link src 165.138.255.1
192.168.2.0/24 via 192.168.1.254 dev eth1
192.168.3.0/24 via 192.168.1.254 dev eth1
192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.3
default via 165.138.255.126 dev eth0
This table indicates that two additional networks (192.168.2.0/24 and 192.168.3.0/24) sit behind the firewall and are routed through it; to reach them, the firewall must send packets to the router located at 192.168.1.254. The network 192.168.1.0/24 is attached directly to the interface at 192.168.1.3 (eth1), so packets for that network are sent straight out that interface. The last line indicates that 165.138.255.126 is the default gateway to the Internet for this firewall.
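When the route table is long, the two facts discussed above (the default gateway, and the networks reached through another router) can be pulled out with awk. This is a rough sketch; the function names default_gateway and routed_networks are invented here, and both assume ip route output in the layout shown above.

```shell
# Both helpers are hypothetical and read `ip route` output on standard input.

# default_gateway: print the address of the default gateway, if there is one.
default_gateway() {
    awk '$1 == "default" && $2 == "via" { print $3 }'
}

# routed_networks: print "network gateway" for each network reached via a router.
routed_networks() {
    awk '$1 != "default" && $2 == "via" { print $1, $3 }'
}
```

With the table above, ip route | default_gateway prints 165.138.255.126, and ip route | routed_networks prints the 192.168.2.0/24 and 192.168.3.0/24 lines, each followed by 192.168.1.254.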
For status information on the NICs, typing
ip addr [Enter]
will produce output similar to the following:
# ip addr
1: lo: <LOOPBACK,UP> mtu 3924 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 brd 127.255.255.255 scope global lo
2: brg0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop
link/ether fe:fd:06:c7:99:8d brd ff:ff:ff:ff:ff:ff
3: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
link/ether 00:0d:f0:ba:ce:12 brd ff:ff:ff:ff:ff:ff
inet 165.183.252.1/25 brd 165.183.252.127 scope global eth0
4: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
link/ether 00:0c:f0:bb:96:4b brd ff:ff:ff:ff:ff:ff
inet 192.168.1.3/24 brd 192.168.1.255 scope global eth1
In this example, ignore interfaces 1 and 2. Interface 3 has the name "eth0". It is operational (look for the "UP" inside the '<' and '>' symbols). Its MAC address is 00:0d:f0:ba:ce:12, and its IP address is 165.183.252.1. Its broadcast address is 165.183.252.127. Similarly, interface 4 is named "eth1", is up, has a MAC address of 00:0c:f0:bb:96:4b, an IP address of 192.168.1.3, and a broadcast address of 192.168.1.255.
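The same reading of the ip addr listing can be condensed to one line per address. The helper below is a sketch under the assumption that the output follows the layout shown above; iface_summary is a hypothetical name, not an existing tool.

```shell
# iface_summary: hypothetical helper that reads `ip addr` output on standard
# input and prints "name state MAC address/prefix" for each IPv4 address.
iface_summary() {
    awk '
        /^[0-9]+: / {
            # Header line, e.g. "3: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 ..."
            name = $2
            sub(/:$/, "", name)
            state = ($3 ~ /[<,]UP[,>]/) ? "up" : "down"
            mac = "-"
        }
        $1 ~ /^link\// { mac = $2 }      # link/ether or link/loopback line
        $1 == "inet"   { print name, state, mac, $2 }
    '
}
```

For the listing above, ip addr | iface_summary prints three lines: one for lo, then "eth0 up 00:0d:f0:ba:ce:12 165.183.252.1/25" and "eth1 up 00:0c:f0:bb:96:4b 192.168.1.3/24". The brg0 interface is omitted because it has no IPv4 address.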
There are a number of files that will be important to know about when troubleshooting. These are listed in the table below.
| File | Purpose |
| --- | --- |
| /etc/ipfilter.conf | This is the list of firewall rules that get applied to the firewall. This file can be a bit daunting. The IPCHAINS HOWTO should provide some assistance in understanding this file (see the Resources page for a link to the IPCHAINS HOWTO). |
| /etc/network.conf | This is the file that is used to configure networking on the LRP firewall. See Charles Steinkuehler's network.conf readme file for instructions on what the options in this file do. |
| /etc/hosts.allow | This file determines which hosts can establish a connection directly with the firewall (e.g., a secure shell connection with PuTTY, or viewing the status web pages on the firewall). |
| /etc/dnscache.conf | This, obviously, is the dnscache configuration file. |
| /etc/inetd.conf | This is the services database. It lists each service made available on the firewall. If you're having trouble establishing a secure shell session, or viewing the web status pages on the firewall, this is one place you can look (in addition to the hosts.allow file). |
| /etc/modules | This file manages the NIC modules and the special modules necessary for passing certain types of traffic through the firewall. If a NIC is giving you trouble, or you're having problems passing certain types of traffic (FTP, IRC, RealAudio, MS NetMeeting, etc.), look here. |
| /etc/seawall.conf | This file is used to set options for Seattle Firewall. |
| /etc/seawall/servers | This file describes which systems in the private network host which services. |
| /etc/seawall/nat | This file provides for network address translation between additional IP addresses set up on the external interface. |
| /etc/init.d/network | This is the collection of functions that actually apply many of the configuration settings in the /etc/network.conf file. |
| /etc/init.d/block.sh | If you configured traffic blocking using the block.sh file, this is where you can adjust those rules. These are firewall rules just like in the ipfilter.conf file, and as such, can be understood by reading the IPCHAINS HOWTO. |
| /var/sh-www/Netmon.html | If you are using the lrpStat Java applet, this file is used to configure the appearance of the graphs that are displayed in the browser window. |
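Before digging into any one of these files, it can save time to confirm that the ones you expect are actually present. The following is a minimal sketch; report_missing is a hypothetical helper, and which files should exist depends on the packages you installed.

```shell
# report_missing: hypothetical helper that prints every path in its argument
# list that does not exist as a regular file.
report_missing() {
    for f in "$@"; do
        [ -f "$f" ] || echo "missing: $f"
    done
}
```

For example, report_missing /etc/ipfilter.conf /etc/network.conf /etc/hosts.allow /etc/inetd.conf /etc/modules prints nothing if all five files are in place.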
SPECIFIC ISSUES
Problem 1: System hangs at boot
Symptoms:
The firewall hangs up when booting. The last thing to show up on the screen is:
Loading root.lrp.....
Solution:
Whenever I have run into this, the root.lrp file had become corrupted one way or another. Copy a new root.lrp onto the floppy and try to boot again.
Problem 2: Error during backup
Symptoms:
When trying to perform a backup, I get the error "Could not mount backup device."
Solution:
This is often the result of a write-protected floppy, or of the system thinking that a floppy is already mounted. The floppy drive must be unmounted prior to performing a backup.
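One way to see whether the system believes a floppy is mounted is to look for the device in the mount table. The sketch below assumes the kernel exposes /proc/mounts; is_mounted is a hypothetical helper, and /dev/fd0 in the usage note is only the conventional name of the first floppy drive, so substitute the device your system actually uses.

```shell
# is_mounted: hypothetical helper. Succeeds if the given device appears in the
# mount table. The table defaults to /proc/mounts; a second argument can point
# at another file with the same layout.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}
```

For example, "if is_mounted /dev/fd0; then umount /dev/fd0; fi" unmounts the floppy only when it is listed as mounted, after which the backup can be retried.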
Problem 3: Error during backup
Symptoms:
When trying to back up, I received a message saying that there was insufficient space. It appeared as if root.lrp was backed up, but now the firewall won't boot.
Solution:
When I experienced this, I determined that log files on my firewall had grown so large that they took up most of the ramdisk, leaving little space for backing up to the floppy. As a result, the root.lrp that was written was corrupted. See Problem 1 above for how to recover.
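To catch oversized log files before attempting a backup, something along these lines may help. It is a sketch only: large_files is a hypothetical helper, it assumes a find that supports -type and -size (a stripped-down firewall image may not provide one), and the /var/log path in the usage note is an assumption about where your logs live.

```shell
# large_files: hypothetical helper that lists regular files under a directory
# ($1) that are larger than a given number of kilobytes ($2).
large_files() {
    find "$1" -type f -size +"$2"k
}
```

For example, large_files /var/log 100 lists log files over 100 KB. Running df first shows how much free space the ramdisk has left.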
Problem 4: 'do_try_to_free_pages...' errors
Symptoms:
After my firewall boots, I see the following messages:
VM: do_try_to_free_pages failed for <process_name>
...
VM: killing process <process_name>
Solution:
You do not have enough RAM. Add more RAM to your system. EigerStein should have at least 12 MB of RAM, though 16 MB is better.
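To see how much memory the kernel actually found, you can read the MemTotal line from /proc/meminfo. A minimal sketch, assuming your kernel provides that line; mem_total_kb is a hypothetical helper name.

```shell
# mem_total_kb: hypothetical helper that prints the MemTotal figure (in kB)
# from /proc/meminfo, or from another file given as the first argument.
mem_total_kb() {
    awk '$1 == "MemTotal:" { print $2 }' "${1:-/proc/meminfo}"
}
```

The figure is usually somewhat lower than the installed RAM because the kernel reserves part of it for itself, so compare it loosely against the 12 MB and 16 MB guidelines above.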
Problem 5: 'not found' errors
Symptoms:
When executing a script file, I get a bunch of "not found" errors.
Solution:
This is due to the difference in how *nix and DOS/Windows systems handle new lines. If you wrote the script in Edit or Notepad, this symptom will result. Check the results of the script to see if it did what it was supposed to. If it did, the errors are essentially benign. If you want to get rid of them, however, rewrite the script using ae on the firewall.
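If retyping the script is impractical, the extra carriage-return character that DOS/Windows editors put at the end of each line can be stripped instead. This is a sketch that assumes tr is available on the machine where you run it (it may not be on the firewall itself); strip_cr is a hypothetical helper name.

```shell
# strip_cr: hypothetical helper that writes a copy of a script ($1) to standard
# output with every carriage return (the extra DOS/Windows line-end byte) removed.
strip_cr() {
    tr -d '\r' < "$1"
}
```

Write the result to a new file (for example, strip_cr myscript > myscript.fixed, where both names are placeholders) and then replace the original. Redirecting straight back onto the input file would empty it.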
Problem 6: CPU Information dumped to the screen
Symptoms:
When I boot the firewall, I get the following errors:
Unable to handle kernel paging request at virtual address 00003e8b
current->tss.cr3 = 01971000, %cr3=01971000
*pde = 00000000
Oops: 0002
CPU: 0
EIP: 0040:[<00002238>]
EFLAGS: 00010046
(followed by values stored in the CPU registers, etc.)
Solution:
While the specifics of your situation may be slightly different (e.g., the addresses), this is typically a memory problem. Try replacing the memory in the machine.
Problem 7: Identical MAC addresses on NICs
Symptoms:
The MAC addresses of the PCMCIA NICs in my firewall are identical, and neither matches the actual MAC address printed on the NIC.
Solution:
Somehow, the pcmcia modules are unable to acquire the MAC address from the NIC. When this happens, you will need to manually assign MAC addresses. Do so with the following line:
ip link set ethN address XX:XX:XX:XX:XX:XX
where N is 0, 1, 2, etc., and XX:XX:XX:XX:XX:XX is the actual MAC address of your NIC. Note that issuing this command in the wrong place could cause the setting to be overridden (if it is placed too early in the boot sequence) or never applied (if it is placed too late). I have found that placing it in the /etc/init.d/network file, in the iface_up() function, directly after the 'vb echo -n...' line works quite nicely.
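Because a mistyped address is easy to miss, it may be worth checking the format before handing it to ip link set. The wrapper below is a sketch, not part of the LRP scripts: valid_mac and set_mac are hypothetical names, and it assumes a grep that supports -E and -q.

```shell
# valid_mac: succeeds if the argument looks like six colon-separated hex pairs.
valid_mac() {
    echo "$1" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'
}

# set_mac: hypothetical wrapper around the ip link set command shown above.
# Usage: set_mac eth0 00:0d:f0:ba:ce:12
set_mac() {
    if valid_mac "$2"; then
        ip link set "$1" address "$2"
    else
        echo "set_mac: '$2' is not a valid MAC address" >&2
        return 1
    fi
}
```

The check only confirms the format; it cannot tell whether the address is the one printed on the card.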