记一次 Visual Studio 2022 卡死分析

一:背景

1. 讲故事

最近不知道咋了,各种程序有问题都寻上我了,你说 .NET 程序有问题找我能理解,Windows 崩溃找我,我也可以试试看,毕竟对 Windows 内核也知道一丢丢,那 Visual Studio 有问题找我就说不过去了,但又不好拒绝,就让朋友发下卡死的 dump 我看一看。.

二:WinDbg 分析

1. 到底是哪里的卡死

因为 VS 是窗体程序,所以在卡死的时候看下主线程便知,使用 ~0s;!clrstack 即可。

0:000> k
 # Child-SP          RetAddr               Call Site
00 0000004b`acaf9b90 000001ed`309f0f28     0x00007ffb`1b77bfe8
01 0000004b`acaf9b98 00007ffb`4a03e397     0x000001ed`309f0f28
02 0000004b`acaf9ba0 00007ffb`4a04c08e     PresentationFramework_ni!System.Windows.Controls.ItemContainerGenerator.DoLinearSearch+0x1d7
03 0000004b`acaf9c70 00007ffb`4ab3bd36     PresentationFramework_ni!System.Windows.Controls.ItemContainerGenerator.ContainerFromItem+0x8e
04 0000004b`acaf9ce0 00007ffb`4ab3bd6e     PresentationFramework_ni!System.Windows.Automation.Peers.ItemAutomationPeer.GetWrapper+0xc6
05 0000004b`acaf9d20 00007ffb`4ab3c94f     PresentationFramework_ni!System.Windows.Automation.Peers.ItemAutomationPeer.GetWrapperPeer+0xe
06 0000004b`acaf9d60 00007ffb`4ba3f72c     PresentationFramework_ni!System.Windows.Automation.Peers.ItemAutomationPeer.IsControlElementCore+0xf
07 0000004b`acaf9d90 00007ffb`4ba42026     PresentationCore_ni!System.Windows.Automation.Peers.AutomationPeer.IsControlElement+0x3c
08 0000004b`acaf9de0 00007ffb`4ba41e8b     PresentationCore_ni!System.Windows.Automation.Peers.AutomationPeer.IsControlElement+0x36
09 0000004b`acaf9e20 00007ffb`4bcc5632     PresentationCore_ni!System.Windows.Automation.Peers.AutomationPeer.GetPropertyValue+0x7b
0a 0000004b`acaf9e70 00007ffb`4c182cf8     PresentationCore_ni!MS.Internal.Automation.ElementUtil.<>c__DisplayClass11_0.<Invoke>b__0+0x32
0b 0000004b`acaf9eb0 00007ffb`4c182bf6     WindowsBase_ni!System.Windows.Threading.ExceptionWrapper.InternalRealCall+0x68
0c 0000004b`acaf9f20 00007ffb`4c180202     WindowsBase_ni!System.Windows.Threading.ExceptionWrapper.TryCatchWhen+0x36
0d 0000004b`acaf9f70 00007ffb`4bca4423     WindowsBase_ni!System.Windows.Threading.Dispatcher.LegacyInvokeImpl+0x172
0e 0000004b`acafa010 00007ffb`480629e1     PresentationCore_ni!MS.Internal.Automation.ElementUtil.Invoke+0xb3
0f 0000004b`acafa070 00007ffb`7af71059     UIAutomationTypes_ni+0x729e1
10 0000004b`acafa0e0 00007ffb`7ae13eba     clr!COMToCLRDispatchHelper+0x39
11 0000004b`acafa110 00007ffb`7af70fb7     clr!COMToCLRWorker+0x1ea
12 0000004b`acafa1b0 00007ffb`625b4cc9     clr!GenericComCallStub+0x57
...

我丢,这线程栈一看有意外发现哈,这 PresentationCore_ni 不是 WPF 的专用库嘛,下面还有 clr ,看样子 VS 的UI是 WPF 写的,顿时有一种亲切感,那既然是 .NET 程序我还是可以分析的。

进一步观察线程栈,可以看到它没有非托管的部分,诸如:user32.dllntdll.dll,也就说明此时的卡死只是托管层面,接下来使用 sos 专有的 !clrstack 观察,删减后如下:

0:000> !clrstack
OS Thread Id: 0x8144 (0)
        Child SP               IP Call Site
0000004bacafa220 00007ffb1b77bfe8 [ComMethodFrame: 0000004bacafa220] 
0000004bacafb0d8 00007ffb1b77bfe8 [InlinedCallFrame: 0000004bacafb0d8] MS.Win32.UnsafeNativeMethods.OleSetClipboard(System.Runtime.InteropServices.ComTypes.IDataObject)
0000004bacafb0d8 00007ffb4c357e22 [InlinedCallFrame: 0000004bacafb0d8] MS.Win32.UnsafeNativeMethods.OleSetClipboard(System.Runtime.InteropServices.ComTypes.IDataObject)
0000004bacafb0a0 00007ffb4c357e22 DomainNeutralILStubClass.IL_STUB_PInvoke(System.Runtime.InteropServices.ComTypes.IDataObject)
0000004bacafb180 00007ffb4ba0f5ed System.Windows.Clipboard.CriticalSetDataObject(System.Object, Boolean)
0000004bacafb1c0 00007ffae3b1bd6a Microsoft.VisualStudio.Text.Utilities.WpfClipboardService.SetData(System.String, System.String, System.String, Boolean, System.String, Boolean, Boolean)
0000004bacafb210 00007ffae3b1bc1c Microsoft.VisualStudio.Text.Operations.Implementation.EditorOperations.CopyToClipboard(System.String, System.String, Boolean, Boolean)
0000004bacafb270 00007ffae3b199a2 Microsoft.VisualStudio.Text.Operations.Implementation.EditorOperations+c__DisplayClass176_0.b__0()
0000004bacafb2a0 00007ffae3b196ca Microsoft.VisualStudio.Text.Operations.Implementation.EditorOperations.CopySelection()
...
0000004bacafcc20 00007ffb2edd765b JetBrains.ReSharper.Feature.Services.Clipboard.CopyPasteAssistManager.DoCopyOrCut(JetBrains.Application.DataContext.IDataContext, JetBrains.Application.UI.Actions.DelegateExecute)
0000004bacafcc70 00007ffb2edd7628 JetBrains.ReSharper.Feature.Services.Clipboard.CopyPasteAssistManager.DoCopyOrCut(JetBrains.Application.DataContext.IDataContext, JetBrains.Application.UI.Actions.DelegateExecute)
0000004bacafcd00 00007ffb2edd5407 JetBrains.ReSharper.InplaceRefactorings.CutCopyPaste.CopyPasteManager.DoCopyOrCut(JetBrains.Application.DataContext.IDataContext, Boolean, JetBrains.Application.UI.Actions.DelegateExecute)
0000004bacafcd70 00007ffb2edd53c4 JetBrains.ReSharper.InplaceRefactorings.CutCopyPaste.CopyPasteManager.DoCopyOrCut(JetBrains.Application.DataContext.IDataContext, Boolean, JetBrains.Application.UI.Actions.DelegateExecute)
0000004bacafce50 00007ffb2edd4d88 JetBrains.ReSharper.Feature.Services.Clipboard.ClipboardActionHandler.Execute(JetBrains.Application.DataContext.IDataContext, JetBrains.Application.UI.Actions.DelegateExecute)
...
0000004bacafdc40 00007ffb2268a2da DomainNeutralILStubClass.IL_STUB_COMtoCLR(IntPtr, Int32, Int32, Int64, Int64)
0000004bacafde00 00007ffb7af71011 [ComMethodFrame: 0000004bacafde00] 

从卦中看,是在处理剪贴板OleSetClipboard的逻辑中一直出不来,而且还有一个外来的 JetBrains.ReSharper 插件,看样子朋友的某些操作让 Resharper 介入了。

为了进一步验证是不是 Resharper 导致的,可以根据 ip 找到所属的模块。

0:000> !ip2md 00007ffb2edd5407
MethodDesc:   00007ffb28e984b0
Method Name:  JetBrains.ReSharper.InplaceRefactorings.CutCopyPaste.CopyPasteManager.DoCopyOrCut(JetBrains.Application.DataContext.IDataContext, Boolean, JetBrains.Application.UI.Actions.DelegateExecute)
Class:        00007ffb28ea1b40
MethodTable:  00007ffb28e98578
mdToken:      00000000060000aa
Module:       00007ffb201c48b8
IsJitted:     yes
CodeAddr:     00007ffb2edd4f60
Transparency: Critical
0:000> !DumpModule /d 00007ffb201c48b8
Name:       C:\Users\Administrator\AppData\Local\JetBrains\Installations\ReSharperPlatformVs17_265273ed_001\JetBrains.ReSharper.InplaceRefactorings.dll
Attributes: PEFile SupportsUpdateableMethods
Assembly:   000001edb0cce780
LoaderHeap:              0000000000000000
TypeDefToMethodTableMap: 00007ffb201d0020
TypeRefToMethodTableMap: 00007ffb201d03a8
MethodDefToDescMap:      00007ffb201d10a0
FieldDefToDescMap:       00007ffb201d21c0
MemberRefToDescMap:      0000000000000000
FileReferencesMap:       00007ffb201d2ae0
AssemblyReferencesMap:   00007ffb201d2ae8
MetaData start address:  000001edb6b8a960 (103320 bytes)

从卦中的 Name 来看,再一次确认了 ReSharper 的问题。

2. ReSharper 是阻塞还是死锁

本着 4S 店只换不修的思路,让朋友直接卸载掉VS中的 ReSharper 肯定是没问题的,但为了兴趣继续探究下 Resharper 正在做什么?

记一次 Visual Studio 2022 卡死分析

从汇编代码看,当前正准备做两个 if 判断,而且都是 true,最后跳转到 IGeneratorHost.View 属性中,不管怎么说,这里还是不断的处理,所以我觉得这里的阻塞要么是 死循环 出不来,要么还需要再等等。

可能有些朋友好奇,Resharper 塞入到剪贴板中到底是什么数据,要想挖这个信息,可以看汇编从线程栈提取,这次就不搞这么复杂了,换个思路吧,先看 CriticalSetDataObject 方法源码,输出如下:

internal static void CriticalSetDataObject(object data, bool copy)
{

    IComDataObject dataObject;

    if (data is DataObject)
    {
        dataObject = (DataObject)data;
    }
    else if (data is IComDataObject)
    {
        SecurityHelper.DemandUnmanagedCode();
        dataObject = (IComDataObject)data;
    }
    else
    {
        dataObject = new DataObject(data);
    }
    ...
}

接下来用 !dso 看下有没有类似的 DataObject 和 IComDataObject 对象,输出如下:

0:000> !dso
OS Thread Id: 0x8144 (0)
RSP/REG          Object           Name
...
0000004BACAFB138 000001ed401e9ec8 System.Windows.DataObject
...

0:000> !mdt 000001ed401e9ec8
000001ed401e9ec8 (System.Windows.DataObject)
    _innerData:000001ed401e9ee0 (System.Windows.DataObject+DataStore)
0:000> !mdt 000001ed401e9ee0
000001ed401e9ee0 (System.Windows.DataObject+DataStore)
    _data:000001ed401e9ef8 (System.Collections.Hashtable)
0:000> !mdt 000001ed401e9ef8
000001ed401e9ef8 (System.Collections.Hashtable)
    buckets:000001ed401e9f48 (System.Collections.Hashtable+bucket[], Elements: 3, ElementMT=00007ffb79123af8)
    count:0x2 (System.Int32)
    occupancy:0x1 (System.Int32)
    loadsize:0x2 (System.Int32)
    loadFactor:0.720000 (System.Single)
    version:0x2 (System.Int32)
    isWriterInProgress:false (System.Boolean)
    keys:NULL (System.Collections.ICollection)
    values:NULL (System.Collections.ICollection)
    _keycomparer:NULL (System.Collections.IEqualityComparer)
    _syncRoot:NULL (System.Object)
expand all 2 items   
0:000> !mdt 000001ed401e9f48
000001ed401e9f48 (System.Collections.Hashtable+bucket[], Elements: 3, ElementMT=00007ffb79123af8)
expand all 3 items   
0:000> !mdt -e:2 000001ed401e9f48
000001ed401e9f48 (System.Collections.Hashtable+bucket[], Elements: 3, ElementMT=00007ffb79123af8)
[0] (System.Collections.Hashtable+bucket) VALTYPE (MT=00007ffb79123af8, ADDR=000001ed401e9f58)
    key:000001ed310ea6f0 (System.String) Length=11, String="UnicodeText"
    val:000001ed401e9fd0 (System.Windows.DataObject+DataStore+DataStoreEntry[], Elements: 1)
    hash_coll:0xf337c502 (System.Int32)
[1] (System.Collections.Hashtable+bucket) VALTYPE (MT=00007ffb79123af8, ADDR=000001ed401e9f70)
    key:000001ed310eaa08 (System.String) Length=16, String="Rich Text Format"
    val:000001ed401ea018 (System.Windows.DataObject+DataStore+DataStoreEntry[], Elements: 1)
    hash_coll:0x30818946 (System.Int32)
[2] (System.Collections.Hashtable+bucket) VALTYPE (MT=00007ffb79123af8, ADDR=000001ed401e9f88)
    key:NULL (System.Object)
    val:NULL (System.Object)
    hash_coll:0x0 (System.Int32)
increase depth
0:000> !mdt 000001ed401ea018
000001ed401ea018 (System.Windows.DataObject+DataStore+DataStoreEntry[], Elements: 1)
expand all 1 items   
0:000> !mdt -e:2 000001ed401ea018
000001ed401ea018 (System.Windows.DataObject+DataStore+DataStoreEntry[], Elements: 1)
[0] 000001ed401e9ff0 (System.Windows.DataObject+DataStore+DataStoreEntry)
    _data:000001ed401635e8 (System.String) Length=1165, String="{\rtf\ansi{\fonttbl{\f0 NSimSun;}}{\colortbl;\red0\green0\blue255;\red0\green0\blue0;\red0\green128\blue0;}\f0 \fs19 \cf1 \cb0 \highlight0 var\cf2  hostname = System.Net.Dns.GetHostName();\par                 System.Net.IPAddress[] hostaddrs = System.Net.Dns.GetHostAddresses(hostname);\par \par                 \cf1 var\cf2  localIPList = \cf1 new\cf2  List<IPAddress>();\par                 \cf1 for\cf2  (\cf1 int\cf2  i = 0; i < hostaddrs.Length; i++)\par                 \{\par                     \cf1 if\cf2  (System.Net.Sockets.AddressFamily.InterNetwork == hostaddrs[i].AddressFamily)\par                     \{\par                         \cf1 if\cf2  (hostaddrs[i].ToString().Equals(_localIP))\par                         \{\par                             localIPList.Insert(0, hostaddrs[i]);\cf3 //\uc1\u20248?\uc1\u20808?\uc1\u20351?\uc1\u29992?\uc1\u19978?\uc1\u27425?\uc1\u-28706?\uc1\u25509?IP\cf2 \par                         \}\par                         \cf1 else\cf2 \par                         \{\par                             localIPList.Add(hostaddrs[i]);\par                         \}\par                     \}\par                 \}}"
    _autoConvert:true (System.Boolean)
    _aspect:0x1 (System.Runtime.InteropServices.ComTypes.DVASPECT)
    _index:0x0 (System.Int32)
increase depth

简单整理了下大概是这样的代码。

            var hostname = System.Net.Dns.GetHostName();
            System.Net.IPAddress[] hostaddrs = System.Net.Dns.GetHostAddresses(hostname);

            var localIPList = new List<IPAddress>();
            for (int i = 0; i < hostaddrs.Length; i++)
            {
                if (System.Net.Sockets.AddressFamily.InterNetwork == hostaddrs[i].AddressFamily)
                {
                    if (hostaddrs[i].ToString().Equals(_localIP))
                    {
                        localIPList.Insert(0, hostaddrs[i]);//优先使用上次连接IP
                    }
                    else
                    {
                        localIPList.Add(hostaddrs[i]);
                    }
                }
            }

有了这些信息,还是先让朋友把 Reshaper 卸载掉看看,据朋友反馈在 Resharper 官方的 issue 里找到了解决方案,禁用了如下选项,暂时没有出现任何问题,截图如下:

记一次 Visual Studio 2022 卡死分析

三:总结

综合朋友的反馈,这次VS的卡死就是他按下了 Ctrl+C 复制这段代码的时候,Resharper 插件介入,然后在处理富文本时出问题了,不知道大家可踩过类似的坑,算是给后来人一点定位经验吧。