All document samples are pulled from Hybrid Analysis - a free malware analysis service for the community that detects and analyzes unknown threats using a unique Hybrid Analysis technology.
To analyse PDF files, open them in a hex editor and look for signs of malicious content.
/Page gives an indication of the number of pages in the PDF document. Most malicious PDF documents have only one page.
These keywords will help you decide whether a PDF file is potentially malicious and requires further analysis, or is benign and requires none. The keywords PDFiD looks for include:
• /Colors with a value larger than 2^24
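The keyword counting idea can be sketched in a few lines of Python. This is only a rough approximation of what pdfid.py does, not its actual implementation: the function name count_keywords and the short keyword list are assumptions for illustration, and the sketch does not handle names obfuscated with hex escapes such as /J#53.

```python
import re

# A small, illustrative subset of names that PDFiD reports.
# The list and the function name are assumptions for this sketch.
KEYWORDS = [b"/Page", b"/JS", b"/JavaScript", b"/OpenAction",
            b"/AA", b"/Launch", b"/EmbeddedFile"]


def count_keywords(data: bytes) -> dict:
    """Count occurrences of each keyword in the raw (undecoded) PDF bytes."""
    counts = {}
    for kw in KEYWORDS:
        # Reject matches followed by another name character,
        # so that /Page does not also count /Pages.
        pattern = re.escape(kw) + rb"(?![A-Za-z0-9])"
        counts[kw.decode()] = len(re.findall(pattern, data))
    return counts
```

Calling count_keywords(open(path, "rb").read()) on a sample gives a quick first impression: a single /Page together with /OpenAction, /Launch or /JavaScript is a reason to look closer.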
pdfid.py will not look inside the compressed data of stream objects; you will need other tools to do this. The main purpose of pdfid.py is to help you decide whether a PDF file requires further analysis (especially if it comes from an untrusted source).
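To see what hides inside compressed streams, the stream bodies have to be inflated first. A minimal sketch, assuming the streams use FlateDecode (zlib) and ignoring the /Length entry, /Filter chains and object streams; the helper name inflate_streams is hypothetical:

```python
import re
import zlib

# Capture everything between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def inflate_streams(data: bytes) -> list:
    """Return the body of every stream, inflated where zlib accepts it."""
    bodies = []
    for match in STREAM_RE.finditer(data):
        raw = match.group(1)
        try:
            # decompressobj tolerates the end-of-line bytes that
            # trail the compressed data before "endstream".
            bodies.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            # Not Flate-compressed (or another filter is used): keep raw bytes.
            bodies.append(raw)
    return bodies
```

The inflated bodies can then be searched for the same keywords as the raw file. Dedicated tools such as pdf-parser.py and peepdf.py handle far more cases than this sketch.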
Check the file properties and integrity of the PDF sample
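The same check can be scripted. The sketch below is an assumption about what "properties and integrity" covers here: size, hashes, and whether the PDF header and %%EOF marker are present. The function name file_fingerprint and the 1024-byte search windows are illustrative choices, not something taken from the original workflow.

```python
import hashlib


def file_fingerprint(data: bytes) -> dict:
    """Collect basic properties of a sample without parsing or opening it."""
    return {
        "size": len(data),
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
        # Readers are lenient about where the header sits, so search
        # a window at the start instead of requiring offset 0.
        "has_pdf_header": b"%PDF-" in data[:1024],
        # The end-of-file marker should appear near the end of the file.
        "has_eof_marker": b"%%EOF" in data[-1024:],
    }
```

For example, file_fingerprint(open("STN-ORDER4487599.pdf", "rb").read()) yields hashes that can be compared against the values recorded for the sample.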
Run pdfid.py to identify suspicious indirect objects
It was found that the document dew008.docx will be launched once the PDF is opened (the other indirect objects were analysed as well). pdf-parser.py can do a lot of good stuff here, but peepdf.py is preferred for showing and extracting streams in their native format.
It is an interactive tool for working with individual indirect objects and streams, and it can also dump streams by redirecting the output.
peepdf.py can also check the reputation of the PDF file on VirusTotal, which gives an initial impression of the nature of the file. Interactive mode is selected with the -i option and is used to inspect, decode and analyse streams.
The command object <object_number> shows information about an object.
Observe stream 5 and stream 8 (the peepdf.py output already showed that objects 5 and 8 contain streams). Stream 8 starts with the PK file header of a ZIP container, and it was found to be an Office Word document.
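The PK observation can be turned into a small check on the dumped stream bytes. This is a sketch under the assumption that a Word document in the OOXML format is a ZIP archive containing [Content_Types].xml and word/document.xml; the function name looks_like_docx is hypothetical.

```python
import io
import zipfile


def looks_like_docx(blob: bytes) -> bool:
    """Heuristic: ZIP magic plus the parts a .docx package normally has."""
    # "PK\x03\x04" is the local file header signature of a ZIP archive.
    if not blob.startswith(b"PK\x03\x04"):
        return False
    try:
        with zipfile.ZipFile(io.BytesIO(blob)) as zf:
            names = set(zf.namelist())
    except zipfile.BadZipFile:
        return False
    return "[Content_Types].xml" in names and "word/document.xml" in names
```

Nothing in the archive is executed or rendered here; only the list of member names is read.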
Stream 8 is dumped and its metadata and file properties are checked with the file command. The hash could be looked up on VT, or the document run in a sandbox, to get more insight, but this analysis is all about examining the document without running it in a sandbox.
The dumped Word file is unzipped to look for external relationships of type oleObject; one is found, and its target is an RTF file.
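Instead of unzipping by hand, the relationship parts can be read directly from the archive. A minimal sketch, assuming the standard OOXML package layout where relationships live in .rels parts and remote targets carry TargetMode="External"; the function name external_relationships is hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by OOXML package relationship parts.
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def external_relationships(docx_bytes: bytes) -> list:
    """Return (part name, relationship type, target) for external targets."""
    found = []
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        for name in zf.namelist():
            if not name.endswith(".rels"):
                continue
            root = ET.fromstring(zf.read(name))
            for rel in root.iter(REL_NS + "Relationship"):
                if rel.get("TargetMode") == "External":
                    found.append((name, rel.get("Type"), rel.get("Target")))
    return found
```

A relationship whose type ends in /oleObject and whose target is a remote URL is the pattern described above. The sketch only lists targets; it does not fetch them.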
Wow! wget the linked file and analyse it
It was up luckily :)
ptceg.doc was downloaded and its file properties were checked
Cool! Found to be RTF! CVE-2017-11882?? Let's run rtfobj
Probably CVE-2018-0802, the Equation Editor vulnerability that followed CVE-2017-11882
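A quick triage alongside rtfobj can be sketched as follows. It is a rough heuristic, not what rtfobj does internally: it checks the RTF magic and then looks for the Equation Editor ProgID "Equation.3", either in plain text or hex-encoded as it would appear inside \objdata. Both function names are hypothetical, and obfuscated samples that omit the ProgID or break up the hex with control words will be missed.

```python
import binascii
import re


def is_rtf(data: bytes) -> bool:
    """True if the file starts with the RTF magic."""
    return data.startswith(b"{\\rtf")


def mentions_equation_editor(data: bytes) -> bool:
    """Heuristic search for the Equation Editor 3.0 ProgID."""
    # RTF allows whitespace between hex digits in \objdata, so drop it.
    compact = re.sub(rb"\s+", b"", data).lower()
    # hexlify returns lowercase hex, matching the lowercased data.
    needle = binascii.hexlify(b"Equation.3")
    return needle in compact or b"equation.3" in compact
```

A hit only says that an Equation Editor object is probably embedded; telling CVE-2017-11882 from CVE-2018-0802 needs a look at the object data itself.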
Further analysis needs to be done!
085488d3bbb8d79bbbc3a75e5c0497 e915 STN-ORDER4487599.pdf
3218e08884126cc6777dec32a870c1 1ec3 worddoc.zip
f8a8457f082c0ea2016b2e2ffd831a f46f ptceg.doc (RTF)