Tuesday, February 9, 2016

Love Your Data Week, Day 2: Data Organization

Today the focus of Love Your Data Week is data organization. We'll have two posts today, the first on the topic of naming, the second reviewing Bulk Rename Utility.

So, first: Data librarians at Penn State have written two blog posts on The Art of Naming Things, one of which focuses on the practice of creating logical element names in a dataset and functions in your code (among other things). The second post deals with naming schemes for files and directories.

Part of #LYD16 is a daily activity designed both to illustrate the concepts being discussed and to give data users a place to start. Today's activity is to come up with a folder structure and/or naming plan. Tips from #LYD16 folks are:

If you don’t already have a folder structure and/or file naming plan, come up with one and share it. Some good practices for naming files are described below:

  • Be Clear, Concise, Consistent, and Correct
  • Make it meaningful (to you and anyone else who is working on the project) 
  • Provide enough context that the file name stays unique and recognizable if the file is moved to another location.
  • For sequential numbering, use leading zeros.
    • For example, a sequence of 1-10 should be numbered 01-10; a sequence of 1-100 should be numbered 001-100 (001, 010, 100).
  • Do not use special characters: & , * % # ; ( ) ! @ $ ^ ~ ‘ { } [ ] ? < >
    • Some people like to use a dash ( - ) to separate words
    • Others like to separate words by capitalizing the first letter of each (e.g., DST_FileNamingScheme_20151216)
  • Dates should be formatted like this: YYYYMMDD (e.g., 20150209)
    • Put dates at the beginning or the end of your files, not in the middle, to make it easy to sort files by name
    • OK: DST_FileNamingScheme_20151216
    • OK: 20151216_DST_FileNamingScheme
    • AVOID: DST_20151216_FileNamingScheme
  • Use only one period, placed directly before the file extension (e.g., name_paper.doc, NOT name.paper.doc OR name_paper..doc)
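
If you'd like to try these tips out in code, here is a minimal Python sketch of one way to build and check file names along these lines. The function names (build_name, check_name), the underscore separator, and the choice to also flag spaces are assumptions made for illustration; they are not part of the #LYD16 tips.

```python
import re
from datetime import date

# Characters the tips above say to avoid. Flagging whitespace as well
# is an added assumption, not something the list states.
FORBIDDEN = re.compile(r"[&,*%#;()!@$^~'‘’{}\[\]?<>\s]")

# An 8-digit run that is not part of a longer number, e.g. 20151216.
DATE_STAMP = re.compile(r"(?<!\d)\d{8}(?!\d)")


def build_name(project, description, day, extension, sequence=None, width=3):
    """Assemble Project_Description[_NNN]_YYYYMMDD.ext (hypothetical layout)."""
    parts = [project, description]
    if sequence is not None:
        # Leading zeros keep numbered files in order when sorted by name.
        parts.append(str(sequence).zfill(width))
    parts.append(day.strftime("%Y%m%d"))
    return "_".join(parts) + "." + extension.lstrip(".")


def check_name(filename):
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    if FORBIDDEN.search(filename):
        problems.append("contains a special character or space")
    if filename.count(".") != 1:
        problems.append("should have exactly one period, before the extension")
    stem = filename.rsplit(".", 1)[0]
    match = DATE_STAMP.search(stem)
    if match and match.start() != 0 and match.end() != len(stem):
        problems.append("date is in the middle of the name")
    return problems


print(build_name("DST", "FileNamingScheme", date(2015, 12, 16), "docx"))
print(check_name("DST_20151216_FileNamingScheme.doc"))
```

The first line printed is DST_FileNamingScheme_20151216.docx, and the second reports that the date sits in the middle of the name. A check like this only covers the mechanical rules; whether a name is meaningful is still up to you and your collaborators.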

There are generally two approaches to folder structures. The first is filing, which uses a hierarchical folder structure. The second is piling, which relies on fewer folders and uses the search, sort, and tagging functions of your operating system or cloud storage tools like Box.
