Learning Python for Forensics

Our final iteration – setupapi_parser.py

In our final iteration, we will continue to improve the script by adding deduplication of processed entries and improving upon the output. Although the second iteration introduced the logic for filtering out non-USB devices, it does not deduplicate the responsive hits. We will deduplicate on the device name to ensure that there is only a single entry per device. In addition, we will integrate our usb_lookup.py script from Chapter 2, Python Fundamentals, to improve the utility of our script by resolving USB VIDs and PIDs to names for known devices.
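As a rough illustration of deduplicating on the device name, the following sketch keeps only the first hit seen for each name. The dedupe_by_name() helper, the dictionary keys, and the sample values are hypothetical and are not taken from setupapi_parser.py; they only show one way the idea could be expressed:

```python
def dedupe_by_name(entries):
    """Keep the first entry seen for each device name, preserving order."""
    seen = set()
    unique = []
    for entry in entries:
        name = entry["name"]
        if name not in seen:
            seen.add(name)
            unique.append(entry)
    return unique


# Made-up hits for illustration only; the first two share a device name.
hits = [
    {"name": r"USB\VID_0781&PID_5571\0001", "date": "2018/10/27 09:15:01"},
    {"name": r"USB\VID_0781&PID_5571\0001", "date": "2018/10/27 09:15:02"},
    {"name": r"USB\VID_046D&PID_C52B\0002", "date": "2018/10/27 09:20:44"},
]
unique_hits = dedupe_by_name(hits)
```

Tracking seen names in a set keeps the check fast while the list preserves the order in which devices first appeared in the log.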

We had to modify the code in the usb_lookup.py script to properly integrate it with the setupapi script. The differences between the two versions are subtle and are focused on reducing the number of function calls and improving the quality of the returned data. Throughout this iteration, we will discuss how we have implemented our custom USB VID/PID lookup library to resolve USB device names. On line 7, we import the usb_lookup script, as follows:

001 """Third iteration of the setupapi.dev.log parser."""
002 from __future__ import print_function
003 import argparse
004 from io import open
005 import os
006 import sys
007 import usb_lookup
...
037 __authors__ = ["Chapin Bryce", "Preston Miller"]
038 __date__ = 20181027
039 __description__ = """This script reads a Windows 7 Setup API
040 log and prints USB Devices to the user"""

As we can see in the following code block, we have added three new functions. Our prior functions have undergone minor modifications to accommodate new features. The majority of the modifications are in our new functions:

  • The parse_device_info() function is responsible for splitting out the necessary information to look up the VID/PID values online and formatting the raw strings into a standard format for comparison
  • The next function, prep_usb_lookup(), prepares and parses the database into a format that supports querying
  • The get_device_names() function correlates matching device information with the database
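To make the first of these steps more concrete, here is a minimal sketch of pulling the VID and PID out of a raw device string and normalizing them for comparison. The extract_vid_pid() helper, the regular expression, and the sample string are our own assumptions for illustration and may differ from how parse_device_info() is written:

```python
import re

# Hypothetical pattern: four hex digits after VID_ and after PID_.
VID_PID_RE = re.compile(r"VID_([0-9A-F]{4})&PID_([0-9A-F]{4})", re.IGNORECASE)


def extract_vid_pid(device_string):
    """Return (vid, pid) as lowercase hex strings, or (None, None)."""
    match = VID_PID_RE.search(device_string)
    if match is None:
        return None, None
    # Lowercase both values so they compare cleanly against a lookup table.
    return match.group(1).lower(), match.group(2).lower()


# Made-up device string for illustration only.
sample = r"USB\VID_0781&PID_5571\4C530001"
vid, pid = extract_vid_pid(sample)
```

Returning a pair of None values for strings without a VID/PID lets the caller decide how to label devices that cannot be looked up.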

With these new functions, we provide additional context for our investigators:

042 def main():
...
068 def parse_setupapi():
...
092 def parse_device_info():
...
137 def prep_usb_lookup():
...
151 def get_device_names():
...
171 def print_output():
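The division of labor between preparing the database and querying it can be sketched as follows. This is not the code from usb_lookup.py; build_usb_dict() and lookup_names() are hypothetical names, and the sample lines are invented values laid out in the usb.ids style (vendor lines start with a four-digit hex ID, device lines are indented with one tab, and comments start with #):

```python
HEX_DIGITS = set("0123456789abcdef")


def build_usb_dict(lines):
    """Parse usb.ids-style lines into {vid: {"name": str, "pids": {pid: name}}}."""
    usb_dict = {}
    current_vid = None
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("\t\t"):
            continue  # interface entries are not needed for this lookup
        if line.startswith("\t"):
            if current_vid is not None:
                pid, _, name = line.strip().partition("  ")
                usb_dict[current_vid]["pids"][pid.lower()] = name.strip()
            continue
        vid, _, name = line.partition("  ")
        if len(vid) == 4 and set(vid.lower()) <= HEX_DIGITS:
            current_vid = vid.lower()
            usb_dict[current_vid] = {"name": name.strip(), "pids": {}}
        else:
            current_vid = None  # a non-vendor section, such as class codes
    return usb_dict


def lookup_names(usb_dict, vid, pid):
    """Return (vendor_name, product_name), falling back to 'Unknown' labels."""
    vendor = usb_dict.get(vid)
    if vendor is None:
        return "Unknown Vendor", "Unknown Product"
    return vendor["name"], vendor["pids"].get(pid, "Unknown Product")


# Invented entries for illustration only.
sample_lines = [
    "# Sample lines in the usb.ids layout",
    "abcd  Example Vendor",
    "\t1234  Example Device",
    "C 00  (Defined at Interface level)",
]
usb = build_usb_dict(sample_lines)
names = lookup_names(usb, "abcd", "1234")
```

Parsing the file once into a nested dictionary means each later lookup is a pair of dictionary accesses rather than another pass over the file.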

We will add one argument to our parser before calling the main() function. The --local argument defined on lines 198 and 199 allows us to specify a local usb.ids file that we can use for parsing in an offline environment. The following code block shows our implementation of the arguments, spaced out over several lines to make it easier to read:

187 if __name__ == '__main__':
188     # Run this code if the script is run from the command line.
189     parser = argparse.ArgumentParser(
190         description=__description__,
191         epilog='Built by {}. Version {}'.format(
192             ", ".join(__authors__), __date__),
193         formatter_class=argparse.ArgumentDefaultsHelpFormatter
194     )
195
196     parser.add_argument('IN_FILE',
197         help='Windows 7 SetupAPI file')
198     parser.add_argument('--local',
199         help='Path to local usb.ids file')
200
201     args = parser.parse_args()
202
203     # Run main program
204     main(args.IN_FILE, args.local)
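Because --local is an optional argument with no default, argparse sets args.local to None when it is omitted. The following standalone sketch demonstrates this by passing argument lists directly to parse_args(); the build_parser() wrapper, the description text, and the file names are placeholders of our own rather than part of the script:

```python
import argparse


def build_parser():
    """Build a parser with the same two arguments as the listing above."""
    parser = argparse.ArgumentParser(
        description="Parse a SetupAPI log for USB devices",
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument("IN_FILE", help="Windows 7 SetupAPI file")
    parser.add_argument("--local", help="Path to local usb.ids file")
    return parser


parser = build_parser()

# Without --local, the attribute is None.
online_args = parser.parse_args(["setupapi.dev.log"])

# With --local, the attribute holds the supplied path.
offline_args = parser.parse_args(["setupapi.dev.log", "--local", "usb.ids"])
```

A downstream function can therefore test whether the value is None to choose between an online lookup and the local file.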

As with our prior iterations, we have generated a flow chart to map the logical course of our script. Please note that it uses the same legend as our other flow charts, though we omitted the legend due to the width of the graphic. Our main() function is executed and makes direct calls to five other functions. This layout builds upon the nonlinear design from the second iteration. In each iteration, we are continuing to add more control within the main() function. This function leans on others to perform tasks and return data rather than doing the work itself. This offers a form of high-level organization within our script and helps keep things simple by executing one function after another in a linear fashion: