VirtualBox causes kernel panic when starting VM
Jason Herring
2011-04-28 07:54:55 UTC
This all worked fine until I upgraded to snv_134 from 09/06

I was running a 3.x version of VirtualBox. The VirtualBox interface starts OK, but when I start a VM the Solaris kernel panics:

Apr 28 00:09:04 atlantis ^Mpanic[cpu0]/thread=ffffff00064cbc60:
Apr 28 00:09:04 atlantis genunix: [ID 103648 kern.notice] recursive mutex_enter, lp=ffffff01a3c447e8 owner=ffffff00064cbc60 thread=ffffff00064cbc60
Apr 28 00:09:04 atlantis unix: [ID 100000 kern.notice]
Apr 28 00:09:04 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb560 unix:mutex_panic+73 ()
Apr 28 00:09:04 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb5c0 unix:mutex_vector_enter+190 ()
Apr 28 00:09:04 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb690 si3124:si_mop_commands+6e ()
Apr 28 00:09:04 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb700 si3124:si_reject_all_reset_pkts+7c ()
Apr 28 00:09:04 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb760 si3124:si_tran_reset_dport+9b ()
Apr 28 00:09:04 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb7d0 sata:sata_scsi_reset+ab ()
Apr 28 00:09:04 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb800 scsi:scsi_reset+52 ()
Apr 28 00:09:04 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb870 sd:sd_sense_key_medium_or_hardware_error+fb ()
Apr 28 00:09:04 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb8d0 sd:sd_decode_sense+e5 ()
Apr 28 00:09:04 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb930 sd:sd_handle_auto_request_sense+100 ()
Apr 28 00:09:04 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb980 sd:sdintr+145 ()
Apr 28 00:09:04 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb9b0 scsi:scsi_hba_pkt_comp+15c ()
Apr 28 00:09:04 atlantis genunix: [ID 655072 kern.notice] ffffff00064cba00 sata:sata_txlt_rw_completion+1d3 ()
Apr 28 00:09:04 atlantis genunix: [ID 655072 kern.notice] ffffff00064cbad0 si3124:si_mop_commands+401 ()
Apr 28 00:09:04 atlantis genunix: [ID 655072 kern.notice] ffffff00064cbb40 si3124:si_intr_command_error+f7 ()
Apr 28 00:09:04 atlantis genunix: [ID 655072 kern.notice] ffffff00064cbbb0 si3124:si_intr+227 ()
Apr 28 00:09:04 atlantis genunix: [ID 655072 kern.notice] ffffff00064cbc00 unix:av_dispatch_autovect+7c ()
Apr 28 00:09:04 atlantis genunix: [ID 655072 kern.notice] ffffff00064cbc40 unix:dispatch_hardint+33 ()
Apr 28 00:09:04 atlantis genunix: [ID 655072 kern.notice] ffffff0006405aa0 unix:switch_sp_and_call+13 ()
Apr 28 00:09:04 atlantis genunix: [ID 655072 kern.notice] ffffff0006405af0 unix:do_interrupt+114 ()
Apr 28 00:09:04 atlantis genunix: [ID 655072 kern.notice] ffffff0006405b10 pcplusmp:apic_setspl+5c ()
Apr 28 00:09:04 atlantis genunix: [ID 655072 kern.notice] ffffff0006405b50 unix:splr+55 ()
Apr 28 00:09:04 atlantis genunix: [ID 802836 kern.notice] ffffff0006405c60 4156044f0f0 ()
Apr 28 00:09:04 atlantis unix: [ID 100000 kern.notice]
Apr 28 00:09:04 atlantis genunix: [ID 672855 kern.notice] syncing file systems...
Apr 28 00:09:04 atlantis genunix: [ID 904073 kern.notice] done
Apr 28 00:09:05 atlantis genunix: [ID 111219 kern.notice] dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
Apr 28 00:09:14 atlantis genunix: [ID 100000 kern.notice]
Apr 28 00:09:14 atlantis genunix: [ID 665016 kern.notice] ^M100% done: 147441 pages dumped,
Apr 28 00:09:14 atlantis genunix: [ID 851671 kern.notice] dump succeeded

I upgraded to VBox 4.0.4 and then 4.0.6 and got the same result.

uname -a
SunOS atlantis 5.11 snv_134b i86pc i386 i86pc Solaris

Any thoughts?
--
This message posted from opensolaris.org
Oscar del Rio
2011-04-28 15:52:02 UTC
Post by Jason Herring
This all worked fine until I upgraded to snv_134 from 09/06
I upgraded to VBox 4.0.4 and then 4.0.6 and got the same result.
Did you remove the old virtualbox 3.x "kernel interface" before upgrading?

From the Manual:
http://www.virtualbox.org/manual/ch02.html#idp5657568
Uninstallation of VirtualBox on Solaris requires root permissions. To
perform the uninstallation, start a root terminal session and execute:
pkgrm SUNWvbox
After confirmation, this will remove VirtualBox from your system.
If you are uninstalling VirtualBox version 3.0 or lower, you need to
remove the VirtualBox kernel interface package, execute:
pkgrm SUNWvboxkern
Jason Herring
2011-04-28 16:00:51 UTC
More fun. I NFS mounted the VM from a Linux box and went to start the VM - this also panicked the *opensolaris* kernel! Maybe this is something to do with the network stack? Though I see a lot of references to sata/scsi and si3124 (sata controller card in machine) in the panic:

Apr 28 08:52:54 atlantis unix: [ID 836849 kern.notice]
Apr 28 08:52:54 atlantis ^Mpanic[cpu0]/thread=ffffff00064cbc60:
Apr 28 08:52:54 atlantis genunix: [ID 103648 kern.notice] recursive mutex_enter, lp=ffffff019e1fc1e8 owner=ffffff00064cbc60 thread=ffffff00064cbc60
Apr 28 08:52:54 atlantis unix: [ID 100000 kern.notice]
Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb560 unix:mutex_panic+73 ()
Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb5c0 unix:mutex_vector_enter+190 ()
Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb690 si3124:si_mop_commands+6e ()
Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb700 si3124:si_reject_all_reset_pkts+7c ()
Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb760 si3124:si_tran_reset_dport+9b ()
Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb7d0 sata:sata_scsi_reset+ab ()
Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb800 scsi:scsi_reset+52 ()
Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb870 sd:sd_sense_key_medium_or_hardware_error+fb ()
Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb8d0 sd:sd_decode_sense+e5 ()
Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb930 sd:sd_handle_auto_request_sense+100 ()
Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb980 sd:sdintr+145 ()
Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cb9b0 scsi:scsi_hba_pkt_comp+15c ()
Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cba00 sata:sata_txlt_rw_completion+1d3 ()
Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cbad0 si3124:si_mop_commands+401 ()
Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cbb40 si3124:si_intr_command_error+f7 ()
Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cbbb0 si3124:si_intr+227 ()
Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cbc00 unix:av_dispatch_autovect+7c ()
Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff00064cbc40 unix:dispatch_hardint+33 ()
Apr 28 08:52:54 atlantis genunix: [ID 655072 kern.notice] ffffff0006405aa0 unix:switch_sp_and_call+13 ()
Apr 28 08:52:54 atlantis unix: [ID 100000 kern.notice]
Apr 28 08:52:54 atlantis genunix: [ID 672855 kern.notice] syncing file systems...
Apr 28 08:52:54 atlantis genunix: [ID 904073 kern.notice] done
Apr 28 08:52:55 atlantis genunix: [ID 111219 kern.notice] dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
Apr 28 08:53:03 atlantis genunix: [ID 100000 kern.notice]
Apr 28 08:53:03 atlantis genunix: [ID 665016 kern.notice] ^M100% done: 129259 pages dumped,
Apr 28 08:53:03 atlantis genunix: [ID 851671 kern.notice] dump succeeded

I need to get this VM up and running - any thoughts? I might have to go to Linux for this server if I can't get this figured out and I'd rather not do that.
Dmitry G. Kozhinov
2011-04-28 16:39:53 UTC
Post by Jason Herring
I was running 3.x version of Vbox
Have you tried creating a new virtual machine in the latest version of VirtualBox, rather than running the possibly buggy VM instance created with the older version?

This will require installing and configuring OpenSolaris in the guest again, but that should not be hard to do.
Michael Schuster
2011-04-28 17:12:05 UTC
Post by Dmitry G. Kozhinov
Post by Jason Herring
I was running 3.x version of Vbox
Have you tried creating a new virtual machine in the latest version of VirtualBox, rather than running the possibly buggy VM instance created with the older version?
He did say that he had the same results with 4.0.4.

The stack trace seems to indicate that VBox is merely the trigger, not
the cause. Find out what piece of software si3124 is, and take it from
there. This may even be a known problem (hint: google "si3124
recursive mutex enter" or something like that).
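
The reading of the stack trace above can be checked mechanically. The following is only a rough sketch: it parses the `module:function+offset ()` frames from an excerpt of the first panic and reports which functions appear more than once on the same stack. The helper names (`parse_frames`, `reentered_functions`) and the regular expression are illustrative assumptions, not part of any Solaris tool.

```python
import re
from collections import Counter

# Matches the "module:function+offset ()" part of a panic stack frame line.
FRAME_RE = re.compile(r"\b([A-Za-z0-9_]+):([A-Za-z0-9_]+)\+([0-9a-f]+) \(\)")

# Excerpt of the first panic in this thread (frame address and symbol only).
SAMPLE = """\
ffffff00064cb560 unix:mutex_panic+73 ()
ffffff00064cb5c0 unix:mutex_vector_enter+190 ()
ffffff00064cb690 si3124:si_mop_commands+6e ()
ffffff00064cb700 si3124:si_reject_all_reset_pkts+7c ()
ffffff00064cb760 si3124:si_tran_reset_dport+9b ()
ffffff00064cb7d0 sata:sata_scsi_reset+ab ()
ffffff00064cb800 scsi:scsi_reset+52 ()
ffffff00064cb870 sd:sd_sense_key_medium_or_hardware_error+fb ()
ffffff00064cb8d0 sd:sd_decode_sense+e5 ()
ffffff00064cb930 sd:sd_handle_auto_request_sense+100 ()
ffffff00064cb980 sd:sdintr+145 ()
ffffff00064cb9b0 scsi:scsi_hba_pkt_comp+15c ()
ffffff00064cba00 sata:sata_txlt_rw_completion+1d3 ()
ffffff00064cbad0 si3124:si_mop_commands+401 ()
ffffff00064cbb40 si3124:si_intr_command_error+f7 ()
ffffff00064cbbb0 si3124:si_intr+227 ()
"""


def parse_frames(log_text):
    """Return (module, function) pairs in log order (innermost frame first)."""
    return [(m.group(1), m.group(2)) for m in FRAME_RE.finditer(log_text)]


def reentered_functions(frames):
    """Return the frames that occur more than once on the same stack."""
    counts = Counter(frames)
    return [frame for frame, n in counts.items() if n > 1]


frames = parse_frames(SAMPLE)
modules = Counter(module for module, _ in frames)

print("frames per module:", dict(modules))
print("re-entered:", reentered_functions(frames))
print("any vbox module on the stack:", any(m.startswith("vbox") for m in modules))
```

On this excerpt the only function that shows up twice is `si3124:si_mop_commands`, and no VirtualBox module is on the stack, which is consistent with the driver re-entering its own mutex while handling a command error.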

HTH
Michael
--
regards/mit freundlichen Grüssen
Michael Schuster
Paul Griffith
2011-04-28 17:43:40 UTC
Post by Jason Herring
This all worked fine until I upgraded to snv_134 from 09/06
Apr 28 00:09:04 atlantis genunix: [ID 103648 kern.notice] recursive mutex_enter, lp=ffffff01a3c447e8 owner=ffffff00064cbc60 thread=ffffff00064cbc60
I upgraded to VBox 4.0.4 and then 4.0.6 and got the same result.
uname -a
SunOS atlantis 5.11 snv_134b i86pc i386 i86pc Solaris
Any thoughts?
Could this be the issue?

http://dlc.sun.com/osol/on/downloads/b136/on-changelog-b136.html

Issues Resolved:
BUG/RFE: 6786704 recursive mutex_enter from si3124:si_tran_reset_dport caused by bus reset
BUG/RFE: 6927730 si_mop_commands panics when called from si_watchdog_handler when the disk doesn't respond
Files Changed:
update: usr/src/uts/common/io/sata/adapters/si3124/si3124.c
update: usr/src/uts/common/sys/sata/adapters/si3124/si3124var.h
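
A quick way to see whether a given system predates that changelog is to compare the build number in its `uname -a` output against 136. This is a minimal sketch under two assumptions the thread does not confirm: that fixes listed in the b136 changelog first shipped in build 136, and that the panic reported here is in fact one of those two bugs. The function names are hypothetical.

```python
import re

# Assumption: changes listed in the b136 changelog first appear in build 136.
FIXED_IN_BUILD = 136


def build_number(release):
    """Extract the numeric build from a string containing e.g. 'snv_134b'."""
    match = re.search(r"snv_(\d+)", release)
    return int(match.group(1)) if match else None


def predates_fix(release, fixed_in=FIXED_IN_BUILD):
    """Return True if the build in `release` is older than `fixed_in`."""
    number = build_number(release)
    if number is None:
        raise ValueError("no snv_<build> token found in %r" % release)
    return number < fixed_in


# The uname -a output quoted earlier in this thread.
uname_line = "SunOS atlantis 5.11 snv_134b i86pc i386 i86pc Solaris"
print(build_number(uname_line), predates_fix(uname_line))
```

For the `snv_134b` system in this thread the build number is 134, so under the assumptions above it would not yet contain the si3124 changes.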
--
Paul Griffith |CSE Technical Team
Dept. of Computer Science and Engineering - York University
Tel: 416-736-2100 x70258|Fax: 416-736-5872