Research Data Management: File Organization and Formats

This guide provides information about managing research data in any discipline.

File Organization

A logical and organized folder structure can make it easier to keep track of project information. 

  • Develop a directory hierarchy that takes into account that folder names sort alphabetically.
  • Avoid keeping duplicate working copies (backup copies are not considered duplicates in this context).

File Formats

Retain the original, unedited outputs from software and hardware to preserve source data.

  • Do not edit or alter the raw data file.  Keep it in its native format and create a copy for editing or further manipulation.
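
One way to follow this practice is to copy the raw file into a working folder and then mark the original read-only. The following is a minimal Python sketch, not a prescribed workflow; the helper name make_working_copy and the folder layout are assumptions made for illustration.

```python
import shutil
import stat
from pathlib import Path


def make_working_copy(raw_file, working_dir):
    """Copy a raw data file into working_dir and mark the original read-only.

    Hypothetical helper for illustration; adapt the paths to your project.
    """
    raw_path = Path(raw_file)
    work_dir = Path(working_dir)
    work_dir.mkdir(parents=True, exist_ok=True)

    # copy2 copies the file contents and, where possible, its timestamps.
    working_copy = Path(shutil.copy2(raw_path, work_dir / raw_path.name))

    # Remove write permission from the original so it is not edited by accident.
    raw_path.chmod(stat.S_IREAD | stat.S_IRGRP | stat.S_IROTH)
    return working_copy
```

All editing then happens on the returned copy, while the raw file stays in its native format. A read-only flag only guards against accidental edits; it does not replace backups.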

Ensure future access to your data files by using standard, stable, commonly used file formats.

  • Non-proprietary formats are preferred (particularly for final versions).
  • Be aware of what software is required to view and process data files, and be wary of software lifespans.

 

Library of Congress recommended format specifications
A guide to file formats recommended for long-term access and preservation

File Naming Conventions

Develop a file and folder naming convention and document it so all team members can follow it.

Good practices in choosing file and folder names:

  • Uniquely name each file.
  • Be consistent and include similar information in all file names of the same file type.
  • Consider sorting order (usually lexicographic) and logical hierarchies in file directories.
  • Avoid ambiguous and confusing names, such as 'MyData' or 'sample'.
  • Give derivatives and versions similar but distinct names, so that they stay co-located yet remain uniquely identified.
  • Names should reflect the contents of the file and/or the stage of development.
  • To make files sort chronologically, write dates with the year first, followed by a two-digit month and a two-digit day (YYYY-MM-DD).  (Example: March 7, 2004 would be written '2004-03-07'.)
  • Use only alphanumeric characters, with dashes (-) or underscores (_) in place of spaces; avoid special characters such as colons (:) and slashes (/).
  • Avoid using case differences to distinguish between files: ‘Record’, ‘record’, and ‘RECORD’ may be three different file names or the same file name, depending on the operating system.
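
Several of the practices above (year-first dates, dashes in place of spaces, no special characters, no reliance on letter case) can be checked or applied with a short script. The Python sketch below is one possible approach; the function names make_file_name and find_case_collisions, and the date_description.extension pattern, are assumptions made for illustration rather than a required convention.

```python
import re
from collections import defaultdict
from datetime import date


def make_file_name(when, description, extension):
    """Build a name such as '2004-03-07_soil-samples.csv' (hypothetical pattern)."""
    # isoformat() yields YYYY-MM-DD, so names sort chronologically.
    prefix = when.isoformat()
    # Replace each run of whitespace with a single dash.
    cleaned = re.sub(r"\s+", "-", description.strip())
    # Drop anything that is not a letter, digit, dash, or underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_-]", "", cleaned)
    return f"{prefix}_{cleaned}.{extension.lstrip('.')}"


def find_case_collisions(names):
    """Return groups of names that differ only by letter case."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    return [group for group in groups.values() if len(group) > 1]
```

For example, make_file_name(date(2004, 3, 7), "soil samples: site/A", "csv") returns '2004-03-07_soil-samples-siteA.csv', and find_case_collisions(["Record", "record", "RECORD", "notes"]) returns [['Record', 'record', 'RECORD']], which flags names that may clash on a case-insensitive file system.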

 

File Naming Best Practices

Assign descriptive file names
DataONE's best practices guide for file naming

File Naming & Tracking Changes (Version Control)
University of Oregon's guidance for file naming best practices

 

File Renaming Tools

Bulk Rename Utility
A free tool for Windows

den4b ReNamer
A free online file renaming tool

PSRenamer
A free tool for Linux, Mac, or Windows

Renamer
A free batch renaming tool for Mac