A recent thread on the SPSS listserv begins with: "One continuing problem everywhere I've worked is trying to go back to an analysis someone else did 2 years ago and reconstructing what they did, especially regarding data cleaning and other changes to the data files." This is a real problem with immediate practical implications (e.g., for replication). The thread has already generated a few tips, some more useful than others, which I pass along here in summarized form.
General:
1. "File system." For each project, set up four subfolders (setup, data, raw data, and analyses).
a. Setup -- syntax file, plus other project documentation (e.g., codebook).
b. Data -- master data file along with subfiles.
c. Raw Data -- original raw data.
d. Analyses -- analysis syntax files (output files need not be saved if disk space is an issue).
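The folder layout above can be set up the same way for every project with a short script. The following is only a minimal sketch in Python (not something suggested in the thread); the function name create_project_skeleton and the folder spellings such as "raw_data" are my own assumptions.

```python
from pathlib import Path

# The four subfolders described above. The exact spellings
# (e.g. "raw_data") are illustrative assumptions.
SUBFOLDERS = ("setup", "data", "raw_data", "analyses")


def create_project_skeleton(root):
    """Create the project folder and its four subfolders; return their paths."""
    root = Path(root)
    created = []
    for name in SUBFOLDERS:
        folder = root / name
        # exist_ok=True makes the call safe to repeat on an existing project.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

Calling create_project_skeleton("my_project") (a hypothetical project name) creates any folders that are missing and leaves existing ones untouched.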
Syntax:
1. Save and date all syntax files. Annotate syntax files with comments, space out the code neatly, and make sure the log file is turned on.
2. Create a syntax file for your data setup (i.e., reading, converting, tidying, labeling, setting missing values, formatting).
3. Create separate syntax files for separate analyses.
Data:
1. Create a new data file to reflect any major change to data.
2. Give data and related output files distinctive and sequential names.
3. Link the names of syntax and output files to the name of the source data file.
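One way to make the naming tips above concrete is to generate file names rather than type them by hand. This is a rough sketch in Python under my own assumptions: the helper names data_file_name and linked_name and the "_v02" version pattern are hypothetical, and the extensions are the usual SPSS ones (.sav for data, .sps for syntax, .spv for output).

```python
def data_file_name(project, version, ext="sav"):
    """Build a sequential data file name, zero-padded so files sort in order."""
    return f"{project}_v{version:02d}.{ext}"


def linked_name(data_file, analysis, ext):
    """Name a syntax or output file after the data file it was run on."""
    # Drop the data file's extension, keep the rest as the shared stem.
    stem = data_file.rsplit(".", 1)[0]
    return f"{stem}_{analysis}.{ext}"


# Hypothetical project "survey", second version of the data file.
source = data_file_name("survey", 2)
syntax_file = linked_name(source, "regression", "sps")
output_file = linked_name(source, "regression", "spv")
```

Because the syntax and output names share the data file's stem, sorting the folder by name groups each data version with the analyses that were run on it.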
I welcome other suggestions.