PVE: External USB3 Storage Disconnects Randomly
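Background
On a PVE system, external storage attached over USB3 frequently drops off for no apparent reason. The visible symptom is I/O errors when accessing the mounted directory, and the system has to be rebooted before the disk is recognized again.
PS: Mounting the disk by UUID does not avoid the I/O errors.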
Problem
journalctl -r | grep 'USB disconnect'
Nov 15 22:53:04 xxxxx kernel: usb 4-4: USB disconnect, device number 2
Nov 15 20:42:20 xxxxx kernel: usb 4-4: USB disconnect, device number 1
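To see which port is affected and how often, the disconnect events can be tallied per device. This is a minimal sketch that assumes the journal line format shown above; `count_usb_disconnects` is a hypothetical helper name, not part of any existing tool.

```shell
#!/bin/sh
# Tally "USB disconnect" kernel messages per USB device path (e.g. 4-4).
# Reads journal lines on stdin and prints "<device> <count>" per device.
count_usb_disconnects() {
    awk '/USB disconnect/ {
             # The device path is the field right after the literal "usb".
             for (i = 1; i < NF; i++)
                 if ($i == "usb") {
                     dev = $(i + 1)
                     sub(/:$/, "", dev)   # strip the trailing colon
                     n[dev]++
                     break
                 }
         }
         END { for (d in n) print d, n[d] }' | sort
}

# Usage:
#   journalctl | count_usb_disconnects
```

If only one device path shows up, as in the log above, the problem is likely tied to that port or to the device behind it rather than to the whole controller.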
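Troubleshooting commands and attempts
- Check the disks
After a disconnect and reconnect, sdb is re-enumerated as sdc.
PS: sda is the internal disk and is not affected.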
lsblk -o NAME,SIZE,MOUNTPOINTS | grep 'sd'
sda 1931.5G
└─sda1 1931.5G /mnt/ssd
sdb 1931.5G /mnt/hd
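Because the kernel name can change after a reconnect, scripts are safer looking the disk up by UUID than by `sdX` name. The sketch below resolves a UUID to whatever device node currently backs it; `resolve_by_uuid` is a hypothetical helper, and the UUID in the usage line is a placeholder. As noted in the background, this only gives a stable name and does not prevent the I/O errors themselves.

```shell
#!/bin/sh
# Resolve a filesystem UUID to its current kernel device node.
# $1 = UUID, $2 = lookup directory (defaults to /dev/disk/by-uuid).
# The body runs in a subshell, so variables stay local and `exit`
# only leaves the function.
resolve_by_uuid() (
    uuid="$1"
    dir="${2:-/dev/disk/by-uuid}"
    if [ ! -e "$dir/$uuid" ]; then
        echo "UUID not present: $uuid" >&2
        exit 1
    fi
    readlink -f "$dir/$uuid"
)

# Usage (placeholder UUID, not taken from this system):
#   resolve_by_uuid 00000000-0000-0000-0000-000000000000
```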
- Check USB device information
# lsusb
Bus 004 Device 002: ID 15xx:05xx JXXXX Technology Corp. / JXXXX USA Technology Corp. Gen1 SATA 6Gb/s Bridge
# dmesg | grep -i usb
[ 0.011546] ACPI: SSDT 0x000000004333F000 001919 (v02 ALASKA UsbCTabl 00001000 INTL 20200717)
[ 0.240638] ACPI: USB4 _OSC: OS supports USB3+ DisplayPort+ PCIe+ XDomain+
[ 0.240642] ACPI: USB4 _OSC: OS controls USB3+ DisplayPort+ PCIe+ XDomain+
[ 0.728778] ACPI: bus type USB registered
[ 0.728778] usbcore: registered new interface driver usbfs
[ 0.728778] usbcore: registered new interface driver hub
[ 0.728778] usbcore: registered new device driver usb
[ 1.733518] xhci_hcd 0000:00:0d.0: new USB bus registered, assigned bus number 1
[ 1.734869] xhci_hcd 0000:00:0d.0: new USB bus registered, assigned bus number 2
[ 1.734871] xhci_hcd 0000:00:0d.0: Host supports USB 3.2 Enhanced SuperSpeed
.....
[ 1.742673] hub 4-0:1.0: USB hub found
.....
[ 2.220868] usb 4-4: new SuperSpeed USB device number 2 using xhci_hcd
[ 2.234093] usb 4-4: New USB device found, idVendor=XXXX, idProduct=XXXX, bcdDevice=43.04
[ 2.234096] usb 4-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 2.234097] usb 4-4: Product: JXXXX
[ 2.234097] usb 4-4: Manufacturer: JXXXX
[ 2.234098] usb 4-4: SerialNumber: 000000000XXX
[ 2.238015] usb-storage 4-4:1.0: USB Mass Storage device detected
[ 2.238164] scsi host1: usb-storage 4-4:1.0
# lsusb -v -d <VendorID>:<ProductID>
# lsusb -v -d XXXX:XXXX
# lsusb -t
/: Bus 04.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 10000M
|__ Port 4: Dev 2, If 0, Class=Mass Storage, Driver=usb-storage, 5000M
- Check the power management mode
# cat /sys/bus/usb/devices/4-4/power/control
on
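In sysfs, `on` means the kernel keeps the device powered and will not autosuspend it, while `auto` allows runtime autosuspend, so the reading above suggests autosuspend was not the trigger here. To compare every USB device at once, something like the following could be used; `list_usb_power_control` is a hypothetical helper, and the directory argument exists only so the function can be pointed at a test tree.

```shell
#!/bin/sh
# Print "<device> <power/control value>" for every USB device.
# $1 = devices directory (defaults to /sys/bus/usb/devices).
# The body runs in a subshell so its variables stay local.
list_usb_power_control() (
    root="${1:-/sys/bus/usb/devices}"
    for f in "$root"/*/power/control; do
        [ -r "$f" ] || continue          # skip unmatched globs and unreadable files
        dev="${f%/power/control}"        # drop the file part of the path
        printf '%s %s\n' "${dev##*/}" "$(cat "$f")"
    done
)

# Usage:
#   list_usb_power_control
# To pin one device to "on" (needs root; 4-4 is the port from this post):
#   echo on > /sys/bus/usb/devices/4-4/power/control
```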
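- Simulate unplugging and replugging the USB device
This avoids repeated reboots and physically replugging the cable.
# 4-4 is the USB device identifier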
echo -n "4-4" > /sys/bus/usb/drivers/usb/unbind
echo -n "4-4" > /sys/bus/usb/drivers/usb/bind
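The two writes above can be wrapped in a small function so the reset is repeatable. This is a sketch only: `usb_rebind` is a hypothetical name, the 2-second default pause is an arbitrary choice, and the driver directory is a parameter purely so the function can be exercised against a scratch directory.

```shell
#!/bin/sh
# Unbind and re-bind a USB device from the "usb" driver to simulate a replug.
# $1 = device id (e.g. 4-4)
# $2 = driver directory (defaults to /sys/bus/usb/drivers/usb)
# $3 = seconds to wait between unbind and bind (default 2, arbitrary)
# The body runs in a subshell, so `exit` only leaves the function.
usb_rebind() (
    dev="$1"
    drv="${2:-/sys/bus/usb/drivers/usb}"
    pause="${3:-2}"
    if [ ! -w "$drv/unbind" ] || [ ! -w "$drv/bind" ]; then
        echo "cannot write to $drv (run as root?)" >&2
        exit 1
    fi
    printf '%s' "$dev" > "$drv/unbind" || exit 1
    sleep "$pause"
    printf '%s' "$dev" > "$drv/bind"
)

# Usage (as root):
#   usb_rebind 4-4
```

Unmounting the filesystem first is advisable, since unbinding a device with a mounted filesystem produces the same I/O errors as a real disconnect.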
Modify kernel parameters (GRUB configuration)
Configuration
# nano /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on usbcore.quirks=VID:PID:k"
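VID:PID is the device's vendor ID and product ID (4-digit hex each).
The trailing letters are quirk flags. According to the kernel's usbcore.quirks documentation, each letter toggles one built-in quirk, for example:
k: USB_QUIRK_NO_LPM, the device cannot handle Link Power Management.
n: USB_QUIRK_DELAY_CTRL_MSG, the device needs a pause after every control message.
d: USB_QUIRK_CONFIG_INTF_STRINGS, the device cannot handle its Configuration or Interface strings.
m: USB_QUIRK_DISCONNECT_SUSPEND, the device must be disconnected before suspend to prevent spurious wakeups.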
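A malformed entry is easy to miss, so it can help to build the parameter with a small validator. The sketch below is illustrative: `make_usb_quirk` is a hypothetical helper, and the IDs in the usage line are the example from the kernel documentation, not the device in this post. A change to /etc/default/grub only takes effect after regenerating the GRUB config and rebooting. On PVE hosts that boot through systemd-boot rather than GRUB, the command line is expected to live in /etc/kernel/cmdline and be applied with `proxmox-boot-tool refresh`.

```shell
#!/bin/sh
# Build a usbcore.quirks kernel parameter from vendor ID, product ID and flags.
# IDs must be 4-digit hex; flags must be one or more lowercase letters.

# Succeeds only if $1 is exactly four hex digits.
is_hex4() {
    case "$1" in
        [0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]) return 0 ;;
        *) return 1 ;;
    esac
}

# The body runs in a subshell, so `exit` only leaves the function.
make_usb_quirk() (
    vid="$1"; pid="$2"; flags="$3"
    if ! is_hex4 "$vid" || ! is_hex4 "$pid"; then
        echo "VID and PID must be 4-digit hex" >&2
        exit 1
    fi
    case "$flags" in
        ''|*[!a-z]*) echo "flags must be lowercase letters" >&2; exit 1 ;;
    esac
    printf 'usbcore.quirks=%s:%s:%s\n' "$vid" "$pid" "$flags"
)

# Usage (0781:5580 is the kernel docs example ID):
#   make_usb_quirk 0781 5580 k
# After editing /etc/default/grub:
#   update-grub && reboot
# After the reboot, confirm the parameter was picked up:
#   cat /proc/cmdline
#   cat /sys/module/usbcore/parameters/quirks
```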
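Check the BIOS configuration
However, even after everything above and several weeks of tinkering, the disk still disconnected at random.
Solution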
Fall back to a USB 2.0 port for the storage. This costs transfer speed: throughput dropped from 300+ MB/s to 20+ MB/s.
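To confirm the disk really negotiated USB 2.0 after moving it, the kernel exposes the link speed in Mbit/s under each device's `speed` file in sysfs. The sketch below maps that value to a readable label. `usb_speed_label` is a hypothetical helper, and the device path will differ from 4-4 once the disk sits on another port.

```shell
#!/bin/sh
# Map the value of /sys/bus/usb/devices/<dev>/speed (Mbit/s) to a label.
usb_speed_label() {
    case "$1" in
        1.5)   echo "Low Speed (USB 1.x)" ;;
        12)    echo "Full Speed (USB 1.x)" ;;
        480)   echo "High Speed (USB 2.0)" ;;
        5000)  echo "SuperSpeed (USB 3.x Gen 1)" ;;
        10000) echo "SuperSpeed+ (USB 3.x Gen 2)" ;;
        *)     echo "unknown ($1 Mbit/s)" ;;
    esac
}

# Usage (replace <dev> with the device path shown by `lsusb -t`):
#   usb_speed_label "$(cat /sys/bus/usb/devices/<dev>/speed)"
```

A reading of 480 corresponds to a theoretical ceiling of 60 MB/s before protocol overhead, which is consistent with real-world throughput in the tens of MB/s.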
Follow-up
All the symptoms point to xhci_hcd:
[ 490.597895] usb 4-4: reset SuperSpeed USB device number 2 using xhci_hcd
I will wait for a kernel update and then try again.
# uname -r
6.8.12-3-pve
# pveversion --verbose
proxmox-ve: 8.2.0 (running kernel: 6.8.12-3-pve)
pve-manager: 8.2.7 (running version: 8.2.7/3e0176e6bb2ade3b)
proxmox-kernel-helper: 8.1.0
pve-kernel-6.2: 8.0.5
proxmox-kernel-6.8: 6.8.12-3