CVE-2021-40444 CyberChef Recipe
This is a quick CyberChef recipe to extract defanged URLs from the maldocs used as the first stage of CVE-2021-40444 exploitation.
https://gchq.github.io/CyberChef/#recipe=Unzip('',false)Regular_expression('User%20defined','Target%3D%22mhtml:%5B%5E!%5D%2B!x-usc:%5B%5E!%20%22%5D%2B',true,true,false,false,false,false,'List%20matches')Find_/_Replace(%7B'option':'Regex','string':'Target%3D%22mhtml:'%7D,'',true,false,true,false)Find_/_Replace(%7B'option':'Regex','string':'!x-usc:'%7D,'%5C%5Cn',true,false,true,false)Unique('Line%20feed')Defang_URL(true,true,true,'Valid%20domains%20and%20full%20URLs')
If you would rather not paste the link into a browser, the recipe is listed below. Copy it, click Load recipe in CyberChef, paste it into the recipe field, and click Load.
Unzip('',false)
Regular_expression('User defined','Target="mhtml:[^!]+!x-usc:[^! "]+',true,true,false,false,false,false,'List matches')
Find_/_Replace({'option':'Regex','string':'Target="mhtml:'},'',true,false,true,false)
Find_/_Replace({'option':'Regex','string':'!x-usc:'},'\\n',true,false,true,false)
Unique('Line feed')
Defang_URL(true,true,true,'Valid domains and full URLs')
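For readers who want to run the same steps outside CyberChef, here is a minimal Python sketch of the recipe. It is an approximation, not a port: the function names extract_targets and defang are our own, and defang only mimics the usual CyberChef output (hxxp, [://], [.]) rather than reproducing the Defang URL operation exactly.

```python
import io
import re
import zipfile
from typing import List

# Same pattern as the Regular_expression step, case-insensitive like the recipe.
TARGET_RE = re.compile(r'Target="mhtml:([^!]+)!x-usc:([^! "]+)', re.IGNORECASE)


def defang(url: str) -> str:
    """Rough stand-in for CyberChef's Defang URL (assumption, not the real op)."""
    url = re.sub(r'^http', 'hxxp', url, flags=re.IGNORECASE)
    return url.replace('://', '[://]').replace('.', '[.]')


def extract_targets(doc_bytes: bytes) -> List[str]:
    """Unzip the document, pull the mhtml / x-usc targets, dedupe, defang."""
    found: List[str] = []
    with zipfile.ZipFile(io.BytesIO(doc_bytes)) as archive:
        for name in archive.namelist():
            text = archive.read(name).decode('utf-8', errors='replace')
            for mhtml_part, usc_part in TARGET_RE.findall(text):
                # Mirrors the two Find / Replace steps plus Unique.
                for url in (mhtml_part, usc_part):
                    if url not in found:
                        found.append(url)
    return [defang(url) for url in found]
```

Calling extract_targets with the raw bytes of a sample (for example, the result of open(path, "rb").read()) returns the list of defanged targets, one entry per unique URL.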
An older version of the recipe is below; see the update section for why we moved away from it.
Unzip('',false)
Extract_URLs(false)
Regular_expression('User defined','(https?:\\/\\/(?!.*\\.?(microsoft|openxmlformats|purl|w3)).*)',true,true,false,false,false,false,'List matches')
Defang_URL(true,true,true,'Valid domains and full URLs')
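To make the filtering step of this older recipe easier to follow, here is a small Python sketch of the same regular expression applied to a list of already extracted URLs. The helper name filter_urls and the sample URLs are ours, chosen only for illustration.

```python
import re

# Same negative lookahead as the recipe: drop any URL whose remainder
# contains microsoft, openxmlformats, purl or w3.
OLD_FILTER = re.compile(
    r'(https?://(?!.*\.?(?:microsoft|openxmlformats|purl|w3)).*)',
    re.IGNORECASE | re.MULTILINE,
)


def filter_urls(urls):
    """Keep only URLs that do not look like standard Office schema references."""
    text = "\n".join(urls)  # one URL per line, like the Extract URLs output
    return [match.group(1) for match in OLD_FILTER.finditer(text)]


if __name__ == "__main__":
    sample = [
        "http://schemas.openxmlformats.org/package/2006/relationships",
        "http://www.w3.org/2001/XMLSchema",
        "http://example.test/side.html",  # hypothetical attacker URL
        "http://purl.org/dc/terms/",
    ]
    print(filter_urls(sample))
```

Note that the lookahead scans the whole line, so a malicious URL that happens to contain one of the excluded words anywhere (for example a path segment named w3) would also be dropped.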
Test on known CVE-2021-40444 maldoc
This was tested using a doc file with a SHA-256 hash of 938545f7bbe40738908a95da8cdeabb2a11ce2ca36b0f6a74deda9378d380a52. The sample was downloaded from MalwareBazaar.
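Before running the recipe, it is worth confirming that the downloaded file really is this sample. A minimal sketch using Python's hashlib follows; the helper names sha256_of and is_expected_sample are ours, and the file path is whatever you saved the sample as.

```python
import hashlib

# Hash of the sample referenced above.
EXPECTED_SHA256 = "938545f7bbe40738908a95da8cdeabb2a11ce2ca36b0f6a74deda9378d380a52"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_expected_sample(path):
    """True if the file at path matches the hash of the tested maldoc."""
    return sha256_of(path) == EXPECTED_SHA256
```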
Update: 2021-09-20 19:23 EST
We noticed that some adversaries did not use the http/https prefix, so we switched to a different approach that matches on the strings mhtml and !x-usc instead.
Unzip('',false)
Regular_expression('User defined','Target="mhtml:[^!]+!x-usc:[^! "]+',true,true,false,false,false,false,'List matches')
Find_/_Replace({'option':'Regex','string':'Target="mhtml:'},'',true,false,true,false)
Find_/_Replace({'option':'Regex','string':'!x-usc:'},'\\n',true,false,true,false)
Unique('Line feed')
Defang_URL(true,true,true,'Valid domains and full URLs')
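The difference between the two approaches can be seen with a short Python comparison. The relationship string below is a made-up example of a target without an http/https prefix, not taken from a real sample, and the function name compare is ours.

```python
import re

# Pattern from the older recipe: depends on an http:// or https:// prefix.
OLD_RE = re.compile(
    r'https?://(?!.*\.?(?:microsoft|openxmlformats|purl|w3)).*',
    re.IGNORECASE | re.MULTILINE,
)
# Pattern from the updated recipe: keyed on mhtml: and !x-usc: instead.
NEW_RE = re.compile(r'Target="mhtml:([^!]+)!x-usc:([^! "]+)', re.IGNORECASE)


def compare(rels_xml):
    """Return what each pattern finds in the same piece of XML."""
    return {"old": OLD_RE.findall(rels_xml), "new": NEW_RE.findall(rels_xml)}


if __name__ == "__main__":
    # Hypothetical relationship entry with no scheme in front of the host.
    sample = (
        '<Relationship Id="rId1" '
        'Target="mhtml:example.test/side.html!x-usc:example.test/side.html" '
        'TargetMode="External"/>'
    )
    print(compare(sample))
```

On this input the older pattern returns nothing, while the updated pattern still captures both halves of the target.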
After reading more about the CVE, it appears that the strings mhtml and !x-usc are not actually needed. For more details, see the article by InQuest.