Fortunately, if your project is managed by a fine-grained build system such as SCons, and your build scripts do not glob too much, there is a good chance you can find files that do not participate in the build.
Here is how to do this on Windows using Process Monitor, a tool that records file system, registry, and process activity.
While build systems are most common in C/C++ and Java projects, fine-grained control over file usage can be added to any project. For example, SCons itself is written entirely in Python, so it could run directly from a source checkout or build its distributions straight from it. Instead, it uses a build procedure that copies all necessary files from the checkout into a separate directory and works from there.
Thanks to that, it is possible to see which files are no longer relevant. While it would be possible to simply compare the checkout source tree with the copied directory tree, I'll go the hard way and monitor file access in the source tree during the build with Process Monitor (formerly FileMon). Linux should have similar tools too; let me know what they are called.
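For reference, the simpler tree comparison mentioned above could look roughly like the sketch below. It assumes the build copies files without renaming them, so that relative paths match between the two trees. The function names are made up for illustration.

```python
import os


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, dirs, files in os.walk(root):
        if '.svn' in dirs:
            dirs.remove('.svn')  # skip version control metadata
        for name in files:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def not_copied(source_root, build_root):
    """List files present in the checkout but absent from the build copy."""
    return sorted(relative_files(source_root) - relative_files(build_root))
```

Calling `not_copied` with the checkout's source directory and the directory the build copies into would print nothing by itself; it returns a sorted list that can be printed or saved. Unlike the monitoring approach, it cannot tell whether a copied file was ever actually read.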
The process is the following:
- Start Process Monitor
- Stop the incoming flood of events by toggling the Capture (Ctrl-E) button
- Open the Filter (Ctrl-L) dialog to add some filters
- Resume capture (Ctrl-E), run the build, then stop capture again
- Go to Tools -> File Summary...
- Export to CSV using Save...
The SCons build is started by the bootstrap.py script from the root of the SCons source checkout. The script is executed by the python executable, so I add the python.exe process name to the filter. I know that bootstrap.py copies files from the src/ subdirectory, so that is the directory I need to monitor, and I add it to the filters too.
The File Summary dialog lists the paths captured by Process Monitor while it was listening to system calls. They are already filtered, but additional filters can be applied using the bottom-left button to make the information even more useful.
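The same kind of narrowing can also be done after the fact, on an exported event log, instead of in the dialog. The sketch below is only an illustration: it assumes the CSV has columns named "Process Name" and "Path", as in Process Monitor's default event view, and the function name is hypothetical.

```python
import csv
import ntpath


def accessed_paths(csv_lines, process_name, directory):
    """Return paths under directory that were touched by process_name.

    csv_lines is any iterable of CSV text lines, for example an open file.
    """
    # Windows paths are case-insensitive, so compare normalized forms
    prefix = ntpath.normcase(directory).rstrip('\\') + '\\'
    wanted = process_name.lower()
    paths = set()
    for row in csv.DictReader(csv_lines):
        # Column names are an assumption about the export format
        if row["Process Name"].lower() != wanted:
            continue
        if ntpath.normcase(row["Path"]).startswith(prefix):
            paths.add(row["Path"])
    return paths
```

With an exported log opened as a file, `accessed_paths(f, "python.exe", "C:\\p\\python\\scons\\src")` would keep only what the bootstrap script touched under src/.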
The exported CSV is not very useful without some postprocessing. I used the following script to compare the list of paths in the CSV with the actual contents of the src/ directory. This gives me the names of files that were not touched during the build at all.

SRCDIR = "C:\\p\\python\\scons\\src"
CSVLIST = 'accessed_bootstrap_files.CSV'

import csv
import os

# Collect every path that Process Monitor recorded during the build
with open(CSVLIST, newline='') as csvfile:
    reader = csv.reader(csvfile)
    header = next(reader)
    pathidx = header.index("Path")
    pathset = set(row[pathidx] for row in reader)

#for row in pathset:
#    print(row)

# Collect every file that exists in the source tree
fileset = set()
for root, dirs, files in os.walk(SRCDIR):
    fileset.update(os.path.join(root, f) for f in files)
    if '.svn' in dirs:
        dirs.remove('.svn')  # don't visit .svn directories

if not (pathset & fileset):
    print('Error: File sets do not intersect at all')

print("Files not found in source directory tree:")
for f in (pathset - fileset):
    if not os.path.isdir(f):
        print(f)
print()
print("Untouched files in source directory tree:")
for f in sorted(fileset - pathset):
    if not os.path.isdir(f):
        print(f)
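One caveat: the set difference compares path strings exactly, while Windows paths are case-insensitive. If Process Monitor ever reported a path with different letter case than os.walk does, the file would wrongly show up as untouched. A possible guard, sketched here with a hypothetical helper name, is to compare normalized forms and still print the original spelling:

```python
import ntpath


def untouched(fileset, pathset):
    """Return files in fileset with no case-insensitive match in pathset."""
    seen = {ntpath.normcase(p) for p in pathset}
    return sorted(f for f in fileset if ntpath.normcase(f) not in seen)
```

It could replace `sorted(fileset - pathset)` in the last loop of the script.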
I've found a few interesting things about SCons. Core tests are mixed with source files in the repository checkout, and they are not copied during the bootstrap build. There are also a few setup.py files, a post-install script, and an announcement that don't participate in the build.

Here is the output of the above script:
Files not found in source directory tree:
<Total>
Untouched files in source directory tree:
C:\p\python\scons\src\.aeignore
C:\p\python\scons\src\Announce.txt
C:\p\python\scons\src\engine\.aeignore
C:\p\python\scons\src\engine\SCons\.aeignore
C:\p\python\scons\src\engine\SCons\ActionTests.py
C:\p\python\scons\src\engine\SCons\BuilderTests.py
C:\p\python\scons\src\engine\SCons\CacheDirTests.py
C:\p\python\scons\src\engine\SCons\DefaultsTests.py
C:\p\python\scons\src\engine\SCons\EnvironmentTests.py
C:\p\python\scons\src\engine\SCons\ErrorsTests.py
C:\p\python\scons\src\engine\SCons\ExecutorTests.py
C:\p\python\scons\src\engine\SCons\JobTests.py
C:\p\python\scons\src\engine\SCons\MemoizeTests.py
C:\p\python\scons\src\engine\SCons\Node\.aeignore
C:\p\python\scons\src\engine\SCons\Node\AliasTests.py
C:\p\python\scons\src\engine\SCons\Node\FSTests.py
C:\p\python\scons\src\engine\SCons\Node\NodeTests.py
C:\p\python\scons\src\engine\SCons\Node\PythonTests.py
C:\p\python\scons\src\engine\SCons\Optik\.aeignore
C:\p\python\scons\src\engine\SCons\PathListTests.py
C:\p\python\scons\src\engine\SCons\Platform\.aeignore
C:\p\python\scons\src\engine\SCons\Platform\PlatformTests.py
C:\p\python\scons\src\engine\SCons\SConfTests.py
C:\p\python\scons\src\engine\SCons\SConsignTests.py
C:\p\python\scons\src\engine\SCons\Scanner\.aeignore
C:\p\python\scons\src\engine\SCons\Scanner\CTests.py
C:\p\python\scons\src\engine\SCons\Scanner\DirTests.py
C:\p\python\scons\src\engine\SCons\Scanner\FortranTests.py
C:\p\python\scons\src\engine\SCons\Scanner\IDLTests.py
C:\p\python\scons\src\engine\SCons\Scanner\LaTeXTests.py
C:\p\python\scons\src\engine\SCons\Scanner\ProgTests.py
C:\p\python\scons\src\engine\SCons\Scanner\RCTests.py
C:\p\python\scons\src\engine\SCons\Scanner\ScannerTests.py
C:\p\python\scons\src\engine\SCons\Script\.aeignore
C:\p\python\scons\src\engine\SCons\Script\MainTests.py
C:\p\python\scons\src\engine\SCons\Script\SConscriptTests.py
C:\p\python\scons\src\engine\SCons\SubstTests.py
C:\p\python\scons\src\engine\SCons\TaskmasterTests.py
C:\p\python\scons\src\engine\SCons\Tool\.aeignore
C:\p\python\scons\src\engine\SCons\Tool\JavaCommonTests.py
C:\p\python\scons\src\engine\SCons\Tool\PharLapCommonTests.py
C:\p\python\scons\src\engine\SCons\Tool\ToolTests.py
C:\p\python\scons\src\engine\SCons\Tool\f03.xml
C:\p\python\scons\src\engine\SCons\Tool\msvsTests.py
C:\p\python\scons\src\engine\SCons\UtilTests.py
C:\p\python\scons\src\engine\SCons\Variables\BoolVariableTests.py
C:\p\python\scons\src\engine\SCons\Variables\EnumVariableTests.py
C:\p\python\scons\src\engine\SCons\Variables\ListVariableTests.py
C:\p\python\scons\src\engine\SCons\Variables\PackageVariableTests.py
C:\p\python\scons\src\engine\SCons\Variables\PathVariableTests.py
C:\p\python\scons\src\engine\SCons\Variables\VariablesTests.py
C:\p\python\scons\src\engine\SCons\WarningsTests.py
C:\p\python\scons\src\engine\SCons\cppTests.py
C:\p\python\scons\src\engine\setup.cfg
C:\p\python\scons\src\engine\setup.py
C:\p\python\scons\src\script\.aeignore
C:\p\python\scons\src\script\scons-post-install.py
C:\p\python\scons\src\script\setup.cfg
C:\p\python\scons\src\script\setup.py
C:\p\python\scons\src\test_aegistests.py
C:\p\python\scons\src\test_files.py
C:\p\python\scons\src\test_interrupts.py
C:\p\python\scons\src\test_pychecker.py
C:\p\python\scons\src\test_setup.py
C:\p\python\scons\src\test_strings.py
Hope this helps clean up your projects too.
P.S. I wish there was a Python script replacement for Process Monitor, or at least that it could be controlled from the command line.
procmon can be automated from the command-line like so:
set PM=C:\path\to\procmon.exe
start %PM% /quiet /minimized /backingfile C:\path\to\pytest.pml
%PM% /waitforidle
start /wait C:\path\to\python.exe myscript.py
%PM% /terminate
start %PM% /quiet /minimized /openlog C:\path\to\mydump.pml /SaveAs C:\path\to\mydata.csv
The downside is that the dumps can be huge, so this may not work so well with multi-hour builds of native code.
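The same sequence could probably be driven from Python with subprocess. This is an untested sketch that reuses only the flags from the batch commands above; the function names are invented, and it assumes the log opened for export is the backing file written during capture.

```python
import subprocess


def procmon_commands(procmon, build_cmd, backing_file, csv_file):
    """Return the command lines for one capture, in execution order."""
    return [
        # Start capturing into the backing file
        [procmon, "/quiet", "/minimized", "/backingfile", backing_file],
        # Block until the capturing instance is ready
        [procmon, "/waitforidle"],
        # The build to observe, e.g. ["python.exe", "bootstrap.py"]
        list(build_cmd),
        # Ask the capturing instance to exit
        [procmon, "/terminate"],
        # Reopen the log and export it as CSV
        [procmon, "/quiet", "/minimized", "/openlog", backing_file,
         "/SaveAs", csv_file],
    ]


def run_capture(procmon, build_cmd, backing_file, csv_file):
    """Run the whole sequence; assumes Process Monitor is installed."""
    start, wait_idle, build, stop, export = procmon_commands(
        procmon, build_cmd, backing_file, csv_file)
    subprocess.Popen(start)           # like `start`: do not wait for exit
    subprocess.check_call(wait_idle)
    subprocess.check_call(build)
    subprocess.check_call(stop)
    subprocess.check_call(export)     # assumed to exit once the CSV is saved
```

The resulting CSV could then be fed to the comparison script from the post.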
Thanks. That's a good start at least. Unfortunately, the PM configuration is in some binary format, so there is still no way to set up filters from Python.