Troubleshooting Memory Problems on PVE
Original, December 15, 2025, about 3 minutes to read
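Background
After a series of optimizations on the PVE host, the CPU temperature dropped to 55 °C ~ 75 °C. However, I then noticed that only 16 GB of my 32 GB of memory was showing up; the second module had stopped being recognized a long time ago.
Environment
PVE version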
root@:~# pveversion -v
proxmox-ve: 8.4.0 (running kernel: 6.8.12-15-pve)
pve-manager: 8.4.14 (running version: 8.4.14/b502d23c55afcba1)
proxmox-kernel-helper: 8.1.4
proxmox-kernel-6.8: 6.8.12-15
Memory information (displays correctly after the fix)
root@:~# dmidecode -t memory
# dmidecode 3.4
Getting SMBIOS data from sysfs.
SMBIOS 3.5.0 present.
Handle 0x0027, DMI type 16, 23 bytes
Physical Memory Array
Location: System Board Or Motherboard
Use: System Memory
Error Correction Type: None
Maximum Capacity: 64 GB
Error Information Handle: Not Provided
Number Of Devices: 2
Handle 0x0028, DMI type 17, 92 bytes
Memory Device
Array Handle: 0x0027
Error Information Handle: Not Provided
Total Width: 64 bits
Data Width: 64 bits
Size: 16 GB
Form Factor: SODIMM
Set: None
Locator: Controller0-ChannelA-DIMM0
Bank Locator: BANK 0
Type: DDR4
Type Detail: Synchronous
Speed: 3200 MT/s
Manufacturer: Crucial Technology
Serial Number: E663E9A5
Asset Tag: 6676543210
Part Number: CT36G7SFRA32A.C25FT
Rank: 2
Configured Memory Speed: 3200 MT/s
Minimum Voltage: 1.2 V
Maximum Voltage: 1.2 V
Configured Voltage: 1.2 V
Memory Technology: DRAM
Memory Operating Mode Capability: Volatile memory
Firmware Version: Not Specified
Module Manufacturer ID: Bank 6, Hex 0x9B
Module Product ID: Unknown
Memory Subsystem Controller Manufacturer ID: Unknown
Memory Subsystem Controller Product ID: Unknown
Non-Volatile Size: None
Volatile Size: 16 GB
Cache Size: None
Logical Size: None
Handle 0x0029, DMI type 17, 92 bytes
Memory Device
Array Handle: 0x0027
Error Information Handle: Not Provided
Total Width: 64 bits
Data Width: 64 bits
Size: 16 GB
Form Factor: SODIMM
Set: None
Locator: Controller1-ChannelA-DIMM0
Bank Locator: BANK 0
Type: DDR4
Type Detail: Synchronous
Speed: 3200 MT/s
Manufacturer: Crucial Technology
Serial Number: E11A95DC
Asset Tag: 98555543210
Part Number: CT55G8SFRA32A.C9FF
Rank: 1
Configured Memory Speed: 3200 MT/s
Minimum Voltage: 1.2 V
Maximum Voltage: 1.2 V
Configured Voltage: 1.2 V
Memory Technology: DRAM
Memory Operating Mode Capability: Volatile memory
Firmware Version: Not Specified
Module Manufacturer ID: Bank 6, Hex 0x9B
Module Product ID: Unknown
Memory Subsystem Controller Manufacturer ID: Unknown
Memory Subsystem Controller Product ID: Unknown
Non-Volatile Size: None
Volatile Size: 16 GB
Cache Size: None
Logical Size: None
Common commands
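To check quickly whether every installed module is being detected (the symptom described in the background), the `Size:` fields in the `dmidecode -t memory` output can be added up. The following is a minimal sketch, not part of the original troubleshooting session: the function name `sum_dimm_gb` is hypothetical, and it assumes sizes are reported in GB or MB, as in the output above.

```shell
# Sum the "Size:" fields of populated slots in `dmidecode -t memory` output.
# Reads the dmidecode text on stdin. Assumption: sizes are given in GB or MB.
sum_dimm_gb() {
    awk '
        # Match only the per-module "Size:" line, not "Volatile Size:" etc.
        /^[[:space:]]*Size:/ {
            if ($3 == "GB")      { total += $2; n++ }
            else if ($3 == "MB") { total += $2 / 1024; n++ }
            # "Size: No Module Installed" matches neither unit and is skipped
        }
        END { printf "%d module(s), %g GB total\n", n, total }
    '
}

# Usage (needs root, as dmidecode does):
#   dmidecode -t memory | sum_dimm_gb
```

If the reported total is lower than what is physically installed, one module is probably not being detected, which is the situation this post started from.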
root@:~# journalctl | grep -Ei "EDAC|MCE|DIMM|memory error"
Dec 14 12:06:18 mac-pro kernel: EDAC MC: Ver: 3.0.0
Dec 14 12:06:19 mac-pro kernel: caller igen6_probe+0x193/0x8b0 [igen6_edac] mapping multiple BARs
Dec 14 12:06:19 mac-pro kernel: EDAC MC0: Giving out device to module igen6_edac controller Intel_client_SoC MC#0: DEV 0000:00:00.0 (INTERRUPT)
Dec 14 12:06:19 mac-pro kernel: EDAC MC1: Giving out device to module igen6_edac controller Intel_client_SoC MC#1: DEV 0000:00:00.0 (INTERRUPT)
Dec 14 12:06:19 mac-pro kernel: EDAC igen6: v2.5.1
Dec 14 12:06:19 mac-pro kernel: EDAC igen6 MC1: HANDLING IBECC MEMORY ERROR
Dec 14 12:06:19 mac-pro kernel: EDAC igen6 MC1: ADDR 0x7fffffffe0
Dec 14 12:06:19 mac-pro kernel: EDAC igen6 MC0: HANDLING IBECC MEMORY ERROR
Dec 14 12:06:19 mac-pro kernel: EDAC igen6 MC0: ADDR 0x7fffffffe0
Dec 14 12:25:59 mac-pro kernel: EDAC MC: Ver: 3.0.0
Dec 14 12:26:00 mac-pro kernel: caller igen6_probe+0x193/0x8b0 [igen6_edac] mapping multiple BARs
Dec 14 12:26:00 mac-pro kernel: EDAC MC0: Giving out device to module igen6_edac controller Intel_client_SoC MC#0: DEV 0000:00:00.0 (INTERRUPT)
Dec 14 12:26:00 mac-pro kernel: EDAC MC1: Giving out device to module igen6_edac controller Intel_client_SoC MC#1: DEV 0000:00:00.0 (INTERRUPT)
Dec 14 12:26:00 mac-pro kernel: EDAC igen6: v2.5.1
Dec 14 12:26:00 mac-pro kernel: EDAC igen6 MC1: HANDLING IBECC MEMORY ERROR
Dec 14 12:26:00 mac-pro kernel: EDAC igen6 MC1: ADDR 0x7fffffffe0
Dec 14 12:26:00 mac-pro kernel: EDAC igen6 MC0: HANDLING IBECC MEMORY ERROR
Dec 14 12:26:00 mac-pro kernel: EDAC igen6 MC0: ADDR 0x7fffffffe0
## Logs after repeatedly reseating the memory modules
Dec 14 19:07:23 mac-pro kernel: EDAC MC: Ver: 3.0.0
Dec 14 19:07:24 mac-pro kernel: caller igen6_probe+0x193/0x8b0 [igen6_edac] mapping multiple BARs
Dec 14 19:07:24 mac-pro kernel: EDAC MC0: Giving out device to module igen6_edac controller Intel_client_SoC MC#0: DEV 0000:00:00.0 (INTERRUPT)
Dec 14 19:07:24 mac-pro kernel: EDAC MC1: Giving out device to module igen6_edac controller Intel_client_SoC MC#1: DEV 0000:00:00.0 (INTERRUPT)
Dec 14 19:07:24 mac-pro kernel: EDAC igen6 MC1: HANDLING IBECC MEMORY ERROR
Dec 14 19:07:24 mac-pro kernel: EDAC igen6 MC1: ADDR 0x7fffffffe0
Dec 14 19:07:24 mac-pro kernel: EDAC igen6 MC0: HANDLING IBECC MEMORY ERROR
Dec 14 19:07:24 mac-pro kernel: EDAC igen6 MC0: ADDR 0x7fffffffe0
Dec 14 19:07:24 mac-pro kernel: EDAC igen6: v2.5.1
The memory modules do not support ECC, and the logs show no serious errors.
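Besides grepping the journal, the kernel's EDAC subsystem exposes per-controller error counters in sysfs (`ce_count` for corrected and `ue_count` for uncorrected errors under `/sys/devices/system/edac/mc/mcN/`). The sketch below prints them. It assumes an EDAC driver is loaded, as `igen6_edac` is in the log above; the function name `edac_counts` and its optional directory argument are hypothetical and exist only for illustration.

```shell
# Print corrected (ce) and uncorrected (ue) error counts per memory controller.
# $1 optionally overrides the sysfs directory (handy for testing the function).
edac_counts() {
    root="${1:-/sys/devices/system/edac/mc}"
    for mc in "$root"/mc[0-9]*; do
        # Skip the unexpanded glob when no controller directories exist
        [ -d "$mc" ] || continue
        ce=$(cat "$mc/ce_count" 2>/dev/null || echo "n/a")
        ue=$(cat "$mc/ue_count" 2>/dev/null || echo "n/a")
        echo "$(basename "$mc"): ce=$ce ue=$ue"
    done
}

# Prints nothing if no EDAC memory controllers are registered.
edac_counts
```

Counters that stay at zero across reboots would be consistent with the conclusion above; a growing `ue_count` would be a reason to look more closely at the module or the slot.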