I. Background

1. The story

A friend contacted me on WeChat: his company's program had a memory surge, he had analyzed it himself without finding the cause, and he asked me to take a look. Since we both have a background in dump analysis, communication went smoothly. On to the dump.

II. Analyzing the memory surge

1. Why did memory surge

The usual routine first: use `!address -summary` to look at the memory distribution. The output is as follows:

```
0:000> !address -summary

--- Usage Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
Free                                    363     7dfde87c7000 ( 125.992 TB)           98.43%
<unknown>                              9276      201e5858000 (   2.007 TB)  99.96%    1.57%
Heap                                     65        02547f000 ( 596.496 MB)   0.03%    0.00%
Image                                  1855        009d35000 ( 157.207 MB)   0.01%    0.00%
Stack                                    93        002c00000 (  44.000 MB)   0.00%    0.00%
Other                                     9        0001de000 (   1.867 MB)   0.00%    0.00%
TEB                                      31        00003e000 ( 248.000 kB)   0.00%    0.00%
PEB                                       1        000001000 (   4.000 kB)   0.00%    0.00%

--- State Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_FREE                                363     7dfde87c7000 ( 125.992 TB)           98.43%
MEM_RESERVE                             690      2012b6d4000 (   2.005 TB)  99.82%    1.57%
MEM_COMMIT                            10640        0ec155000 (   3.689 GB)   0.18%    0.00%
```

The output shows roughly 3.6 GB of committed memory in total (MEM_COMMIT), and it appears to sit almost entirely in the `<unknown>` region. Hopefully the managed layer is what consumed it, otherwise things get much harder. Next, use `!dumpheap -stat`:

```
0:000> !dumpheap -stat
Statistics:
          MT       Count     TotalSize Class Name
...
0179c7715cb0   1,847,901   451,265,880 Free
7ffc6e0a2888           2   536,870,960 System.WeakReference<Microsoft.Extensions.DependencyInjection.ServiceProvider>[]
7ffc6e0a2260  60,873,978 1,460,975,472 System.WeakReference<Microsoft.Extensions.DependencyInjection.ServiceProvider>
Total 63,333,893 objects, 2,494,520,292 bytes
```

The output shows about 60.87 million weak references in the program. At 24 bytes each they account for roughly 1.46 GB, and their backing array adds another 512 MB. Next, use `!dumpheap -mt 7ffc6e0a2260` to list the instances, then `!gcroot` to look at what roots them:

```
0:000> !dumpheap -mt 7ffc6e0a2260
         Address               MT     Size
017988001000     7ffc6e0a2260       24
017988001018     7ffc6e0a2260       24
017988001030     7ffc6e0a2260       24
017988001048     7ffc6e0a2260       24
017988001060     7ffc6e0a2260       24
017988001078     7ffc6e0a2260       24
017988001090     7ffc6e0a2260       24
0179880010a8     7ffc6e0a2260       24
...
017a405f1020     7ffc6e0a2260       24

0:000> !gcroot 0179880010a8
Caching GC roots, this may take a while.
Subsequent runs of this command will be faster.
```

After more than 20 minutes there was still no result. Walking the roots of some 60 million objects was probably more than WinDbg could handle. With `!gcroot` out of the picture, the fallback is the memory search method: search memory for pointers to the object in order to find its owner. Here I picked the object at 017a405f1020 to start with.

```
0:000> !dumpobj /d 17a405f1020
Name:        System.WeakReference`1[[Microsoft.Extensions.DependencyInjection.ServiceProvider, Microsoft.Extensions.DependencyInjection]][]
MethodTable: 00007ffc6e0a2888
EEClass:     00007ffc6dbeb4f8
Tracked Type: false
Size:        536870936(0x20000018) bytes
Array:       Rank 1, Number of elements 67108864, Type CLASS (Print Array)
Fields:
None

0:000> s-q 0 L?0xffffffffffffffff 17a405f1020
00000179c95861d0  0000017a405f1020 03a0dcfa03a0dcfa

0:000> !lno 0000017a405f1020
Before:  017a405f1000 32 (0x20) Free
Current: 017a405f1020 24 (0x18) System.WeakReference<Microsoft.Extensions.DependencyInjection.ServiceProvider>[]
Error Detected: Object 17a405f1020 has a bad member at offset 12054c00: ??? [verify heap]
Could not find object after 17a405f1020
Heap local consistency not confirmed.

0:000> !lno 00000179c95861d0
Before:  0179c95861c8 32 (0x20) System.Collections.Generic.List<System.WeakReference<Microsoft.Extensions.DependencyInjection.ServiceProvider>>
Next:    0179c95861e8 24 (0x18) System.WeakReference<Microsoft.Extensions.DependencyInjection.ServiceProvider>[]
Heap local consistency confirmed.
```
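The pointer to the array lives inside the `List<T>` object at 0179c95861c8, so dump that object and repeat the search one level up:

```
0:000> !dumpobj /d 179c95861c8
Name:        System.Collections.Generic.List`1[[System.WeakReference`1[[Microsoft.Extensions.DependencyInjection.ServiceProvider, Microsoft.Extensions.DependencyInjection]], System.Private.CoreLib]]
MethodTable: 00007ffc6e0a2340
EEClass:     00007ffc6dce0000
Tracked Type: false
Size:        32(0x20) bytes
File:        D:\xxx\A_api\System.Private.CoreLib.dll
Fields:
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ffc6de328f0  400209f        8     System.__Canon[]  0 instance 0000017a405f1020 _items
00007ffc6dc894b0  40020a0       10         System.Int32  1 instance         60873978 _size
00007ffc6dc894b0  40020a1       14         System.Int32  1 instance         60873978 _version
00007ffc6de328f0  40020a2        8     System.__Canon[]  0   static dynamic statics NYI                 s_emptyArray

0:000> s-q 0 L?0xffffffffffffffff 179c95861c8
00000179c77571d8  00000179c95861c8 0000000000000000
00000179c95861b8  00000179c95861c8 0800004e00000000

0:000> !lno 00000179c77571d8
Failed to find the segment of the managed heap where the object 179c77571d8 resides

0:000> !lno 00000179c95861b8
Before:  0179c9586108 192 (0xc0) Microsoft.Extensions.DependencyInjection.DependencyInjectionEventSource
Next:    0179c95861c8 32 (0x20) System.Collections.Generic.List<System.WeakReference<Microsoft.Extensions.DependencyInjection.ServiceProvider>>
Heap local consistency confirmed.
```

From the output above, the owner is finally found: `DependencyInjectionEventSource._providers` holds all of these references. The focus now shifts to DependencyInjectionEventSource.

2. What is xxxEventSource

Judging by the name, it is related to ETW events. Next, use `!eeversion` to check the .NET version so the matching C# source code can be located.

```
0:000> !eeversion
6.0.3624.51421 free
6,0,3624,51421 Commit: f1dd57165bfd91875761329ac3a8b17f6606ad18
Workstation mode
SOS Version: 9.0.13.2701 retail build
```

Reading that source code by itself did not tell me much, since I am not familiar with this part of the underlying architecture. Assuming I was not the first person to hit this, I searched the web for the key terms. Sure enough, stephentoub had already found this problem in April of last year and fixed it in .NET 10. Going by the description, it is an optimization-level bug. Official link: https://github.com/dotnet/runtime/issues/114599

The modified code is shown below. A fair amount of logic was added to handle the case:

```
[NonEvent]
public void ServiceProviderBuilt(ServiceProvider provider)
{
    lock (_providers)
    {
        int providersCount = _providers.Count;

        // Prune dead entries when the list has at least doubled since the last
        // cleanup, or, before any cleanup has happened, when it is about to grow.
        if (providersCount > 0 &&
            (_survivingProvidersCount is int spc ?
                (uint)providersCount >= 2 * (uint)spc :
                providersCount == _providers.Capacity))
        {
            _providers.RemoveAll(static p => !p.TryGetTarget(out _));
            _survivingProvidersCount = _providers.Count;
        }

        _providers.Add(new WeakReference<ServiceProvider>(provider));
    }

    WriteServiceProviderBuilt(provider);
}
```

According to the official description, the trigger is code that creates a scope and never calls Dispose on it afterwards. The WeakReference entries the framework keeps for tracking are then never removed, they pile up, and memory surges. Both sides share some of the blame.

The fix is simple, and there are two options:

- Check every place in the code that calls BuildServiceProvider and make sure the result is disposed promptly.
- Upgrade to .NET 10, which is the simplest and bluntest approach.

I passed the conclusion on to my friend, and two days later he came back with good news. He was clearly delighted.

III. Summary

The dump journey is a repairman's continuous process of self-cultivation. You have to learn to find hope in the middle of despair.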
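For the first fix listed above (disposing whatever BuildServiceProvider returns), here is a minimal sketch of what the change can look like. It assumes the Microsoft.Extensions.DependencyInjection NuGet package is referenced. `IGreeter`, `Greeter`, and `ProviderUsage` are hypothetical names made up for illustration and are not taken from the friend's code base.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical service used only to give the container something to resolve.
public interface IGreeter
{
    string Greet(string name);
}

public sealed class Greeter : IGreeter
{
    public string Greet(string name) => $"Hello, {name}";
}

public static class ProviderUsage
{
    // Problem pattern: a new provider is built on every call and never disposed.
    // Per the analysis in this post, each build leaves a WeakReference entry
    // behind in DependencyInjectionEventSource._providers.
    public static string HandleLeaky(string name)
    {
        var services = new ServiceCollection();
        services.AddTransient<IGreeter, Greeter>();
        ServiceProvider provider = services.BuildServiceProvider();
        return provider.GetRequiredService<IGreeter>().Greet(name);
    }

    // Fixed pattern: the using declaration disposes the provider when the
    // method returns, so the container is told that it is going away.
    public static string HandleDisposed(string name)
    {
        var services = new ServiceCollection();
        services.AddTransient<IGreeter, Greeter>();
        using ServiceProvider provider = services.BuildServiceProvider();
        return provider.GetRequiredService<IGreeter>().Greet(name);
    }
}
```

Where the design allows it, building one provider at startup and reusing it for the lifetime of the process avoids the repeated builds altogether. The per-call version above is only meant to show where the missing Dispose belongs.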