热门关键字:  ubuntu  分区  Fedora  linux系统进程  函数

当前位置 :| 主页>Linux教程>内核研究>

kmdb和mdb调试Solaris内核

来源: 作者: 时间:2008-08-27 Tag: 点击:


操作系统版本:

> ::showrev
Hostname: palace
Release: 5.11
Kernel architecture: i86pc
Application architecture: amd64
Kernel version: SunOS 5.11 i86pc snv_57
Platform: i86pc
我们知道,这台机器是amd64 CPU, 操作系统是Solaris 11 build 57.

因为bcme是硬件驱动,因此下面的信息有助于我们了解驱动的基本情况;

网卡信息(通过vendor id和device id: pciex14e4,167a, 可以查到网卡具体型号):

> ::prtconf ! grep bcme
fffffffec01f09b0 pciex14e4,167a, instance #0 (driver name: bcme)
网卡驱动版本(v10.0.3):

> ::modinfo ! grep bcme
140 fffffffff7fd2000 1d3d0 1 bcme (Broadcom GbE Driver v10.0.3)
系统panic的消息已经告诉我们这是一个bad trap:

> ::status
debugging crash dump vmcore.0 (64-bit) from palace
operating system: 5.11 snv_57 (i86pc)
panic message:
BAD TRAP: type=e (#pf Page fault) rp=ffffff0003cb9000 addr=86 occurred in module "genunix" due to a NULL pointer dereference
dump content: kernel pages only
这个bad trap实际上就是Page fault时,访问了一个NULL指针,而且addr=86,已经很明显不是一个合法的地址。

让我们验证一下,首先,查看调用栈的详细信息:

> ::stackregs
ffffff0003cb9100 ddi_dma_unbind_handle+0xf(66)
ffffff0003cb91a0 bcme_tx_sctgth+0x305()
ffffff0003cb91f0 UM_SendPacketIntel+0x56()
ffffff0003cb9230 UM_SendMBLKPacket+0xf2()
ffffff0003cb9280 UM_SendPacketsMP+0x91()
ffffff0003cb92a0 UM_ProcessSendPackets+0x66()
ffffff0003cb92d0 UM_SendPacket+0x9e()
ffffff0003cb92f0 t3TxPacket+0xbb()
ffffff0003cb9320 bcme_wput+0x77()
ffffff0003cb9390 putnext+0x22b(fffffffec1954638, fffffffee22bf300)
ffffff0003cb9470 tcp_send_data+0x72a(fffffffec82452c0, fffffffee21e3650, fffffffee22bf300)
ffffff0003cb95a0 tcp_send+0xa7b(fffffffee21e3650, fffffffec82452c0, 4ec, 28, 14, 0, ffffff0003cb965c, ffffff0003cb9660,
ffffff0003cb9664, ffffff0003cb9618, 13a00be, 7fffffff)
ffffff0003cb9680 tcp_wput_data+0x77f(fffffffec82452c0, 0, 0)
ffffff0003cb97e0 tcp_rput_data+0x2b99(fffffffec82450c0, fffffffee2240f80, fffffffec1563f00)
ffffff0003cb9820 tcp_input+0x4a(fffffffec82450c0, fffffffee2240f80, fffffffec1563f00)
ffffff0003cb98a0 squeue_enter_chain+0x11d(fffffffec1563f00, fffffffee2240f80, fffffffee2240f80, 1, 1)
ffffff0003cb9990 ip_input+0x96f(fffffffec1b9f428, 0, fffffffee2240f80, 0)
ffffff0003cb99e0 ip_rput+0x119(fffffffec1954540, fffffffee2240f80)
ffffff0003cb9a50 putnext+0x22b(fffffffec19547d0, fffffffee2240f80)
ffffff0003cb9ac0 t3SendUp+0x2dd()
ffffff0003cb9ae0 t3ProcessRxPacket+0x98()
ffffff0003cb9b10 bcme_recv+0x87()
ffffff0003cb9b40 LM_RxPackets_Service+0x56()
ffffff0003cb9b60 LM_ISR_ServiceSoftInt+0x29()
ffffff0003cb9bc0 bcme_intr+0x30e()
ffffff0003cb9c20 av_dispatch_autovect+0x78(10)
ffffff0003cb9c60 dispatch_hardint+0x2f(10, 0)
ffffff0003c05ac0 switch_sp_and_call+0x13()
ffffff0003c05b10 do_interrupt+0x9b(ffffff0003c05b20, 1)
ffffff0003c05b20 _interrupt+0xba()
ffffff0003c05c10 mach_cpu_idle+6()
ffffff0003c05c40 cpu_idle+0xc8()
ffffff0003c05c60 idle+0x10e()
ffffff0003c05c70 thread_start+8()
mdb不但打出了函数名,而且有些函数的参数都得到了,其中ddi_dma_unbind_handle的参数是0x66,panic时,系统正在执行的指令时ddi_dma_unbind_handle+0xf,我们可以反汇编这个函数看看该指令是什么?


> ddi_dma_unbind_handle::dis
ddi_dma_unbind_handle: pushq %rbp
ddi_dma_unbind_handle+1: movq %rsp,%rbp
ddi_dma_unbind_handle+4: subq $0x10,%rsp
ddi_dma_unbind_handle+8: movq %rdi,-0x8(%rbp)
ddi_dma_unbind_handle+0xc: movq %rdi,%rdx
ddi_dma_unbind_handle+0xf: movq 0x20(%rdx),%rsi
ddi_dma_unbind_handle+0x13: movq 0x98(%rsi),%rdi
ddi_dma_unbind_handle+0x1a: xorl %eax,%eax
ddi_dma_unbind_handle+0x1c: call *0xe8(%rsi)
ddi_dma_unbind_handle+0x22: leave
ddi_dma_unbind_handle+0x23: ret

反汇编的结果有时会比较难看懂,好在OpenSolaris已经开放源代码了,我们可以对照代码:

http://cvs.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/os/sunddi.c#ddi_dma_unbind_handle

int
ddi_dma_unbind_handle(ddi_dma_handle_t h)
{
ddi_dma_impl_t *hp = (ddi_dma_impl_t *)h;
dev_info_t *hdip, *dip;
int (*funcp)(dev_info_t *, dev_info_t *, ddi_dma_handle_t);

dip = hp->dmai_rdip;
hdip = (dev_info_t *)DEVI(dip)->devi_bus_dma_unbindhdl;
funcp = DEVI(dip)->devi_bus_dma_unbindfunc;
return ((*funcp)(hdip, dip, h));
}
不难看出,ddi_dma_unbind_handle+0xf指令正是以下这行,访问dmai_rdip成员的语句:

dip = hp->dmai_rdip;

可以用mdb验证一下:

> ::offsetof ddi_dma_impl_t dmai_rdip
offsetof (ddi_dma_impl_t, dmai_rdip) = 0x20
ddi_dma_unbind_handle+0xf对应的指令是:

ddi_dma_unbind_handle+0xf: movq 0x20(%rdx),%rsi

%rdx寄存器的值可以从以下命令得到:

> ::regs
%rax = 0x0000000000000000 %r9 = 0xfffffffec02023c8
%rbx = 0x0000000000000018 %r10 = 0x0000000000002000
%rcx = 0x0000000000000009 %r11 = 0x0000000000000001
%rdx = 0x0000000000000066 %r12 = 0xfffffffec1dc8fb0
%rsi = 0x0000000000000009 %r13 = 0x0000000000000018
%rdi = 0x0000000000000066 %r14 = 0xfffffffec1dc9298
%r8 = 0x0000000000000180 %r15 = 0xfffffffec1dc9288

%rip = 0xfffffffffba189ff ddi_dma_unbind_handle+0xf
%rbp = 0xffffff0003cb9100
%rsp = 0xffffff0003cb90f0
%rflags = 0x00010286
id=0 vip=0 vif=0 ac=0 vm=0 rf=1 nt=0 iopl=0x0
status=<of,df,IF,tf,SF,zf,af,PF,cf>

%cs = 0x0030 %ds = 0x004b %es = 0x004b
%trapno = 0xe %fs = 0x0000 %gs = 0x01c3
%err = 0x0
所以ddi_dma_unbind_handle+0xf指令实际上是去访问0x20(%rdx)对应的地址而发生的bad trap, 即addr=88:

> 0x66+0x20/X
mdb: failed to read data from target: no mapping for address
0x86:

再看一下ddi_dma_unbind_handle(9F)里的说明,这个函数几乎是所有的使用dma的驱动都需要用到的:

Kernel Functions for Drivers ddi_dma_unbind_handle(9F)

NAME
ddi_dma_unbind_handle - unbinds the address in a DMA handle

SYNOPSIS
#include <sys/ddi.h>
#include <sys/sunddi.h>

int ddi_dma_unbind_handle(ddi_dma_handle_t handle);

PARAMETERS
handle The DMA handle previously allocated by a call to
ddi_dma_alloc_handle(9F).


相关文章:
精通initramfs构建step by step
Linux利用kexec迅速切换内核
进程上下文VS中断上下文
内核通知链 学习笔记
linux spi子系统驱动分析
menuconfig 配置选项
《Linux操作系统内核实习》之练习一
udev详解
什么叫微内核,宏内核?
Linux 信号signal处理机制
开发简单的 Linux2.6 内核模块
删除内核的perl脚本
Linux2.6内核usb gadget驱动移植
GCC hacks in the Linux kernel
iomem
kernel学习的想法
让自己的驱动支持udev
linux内核编译步骤
内核的等待队列
Linux内核wait_queue深入分析
升级和删除内核
SD卡驱动分析2
Linux Kernel VDSO本地权限提升漏洞
内核中的TCP的追踪分析-15-TCP(IPV4)的客户端与
linux 2.6内核可加载模块的编译
内核模块HelloWorld
在环回接口上发送一个数据报
ARP初始化
1分钟编译FreeBSD内核
linux设备模型之uart驱动架构分析