Tuesday, December 9, 2014

DRUW Gets Going

As mentioned in our previous blog post, we are developing an institutional data repository here at UW. We are joining the Hydra community and building our digital repository using the Hydra framework, which pulls together various components and platforms, including Blacklight, Solr and Fedora. More about the technologies in a later post! This project is a partnership between the Libraries and UWIT, and the data will live on UWIT’s lolo filesystem.

How did we get here? A few years ago, the Data Services team conducted a survey of 323 campus researchers and found that a strong need on our campus was for a place where researchers could store their data for the long term. This, coupled with funder mandates for providing public access to data, meant that providing a data repository service at UW just made sense. Luckily, the Libraries administration agreed with us!

Since getting the go-ahead on the project, we have spent most of our time (other than sorting out technologies, to be discussed later) ensuring that we build a system that people are going to want to use and that meets their needs. The best way to do this, of course, is to figure out what those wants and needs might be. For this, we used a couple of approaches. First, we had two different standing library committees, the Data Services Committee and the Metadata Interest Group, create user stories. User stories are a technique from agile software development for defining system requirements from the perspective of the people who will use the system. There are different ways to write them; we chose to create each of ours from the skeleton sentence: “(a user type) wants to (their want) so that (why they want it)”. An example user story that was generated from this exercise: “A data depositor wants to not have to contact a librarian to upload a dataset, so that depositing can be done when they want to.” This particular story led to a desired system feature for self-deposit of datasets. Our most common user types were “Data depositor,” “Researcher” and “Librarian.”
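For readers who like to see structure, the skeleton sentence can be sketched as a small record with three parts. This is only an illustration in Python, not part of the DRUW system: the class name `UserStory` and its field names are hypothetical, chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserStory:
    """One user story: (a user type) wants to (their want) so that (why)."""

    user_type: str  # hypothetical field name, e.g. "data depositor"
    want: str       # what the user wants to do
    reason: str     # why they want it

    def sentence(self) -> str:
        # Fill in the skeleton sentence described above.
        return f"A {self.user_type} wants to {self.want}, so that {self.reason}."


# The example story from the committee exercise.
story = UserStory(
    user_type="data depositor",
    want="not have to contact a librarian to upload a dataset",
    reason="depositing can be done when they want to",
)
print(story.sentence())
```

Keeping the three parts separate, rather than storing one free-text sentence, would make it simple to group stories by user type when sorting them into system features.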

While these were being developed, we held focus groups that brought together researchers from across campus to discuss what they would want out of a data repository. Specifically, the questions asked of the focus groups were intended to identify potential barriers to use, so that we can be aware of those from the beginning and do our best to minimize or eliminate them. These conversations were summarized and then further distilled into the user story format. In total, 71 unique system features were identified. We are currently working on prioritizing the different features, determining which features we have the capability to include now, and what we can perhaps work towards in a future development phase of the repository project.

Wednesday, November 12, 2014

UW Libraries Forms Team to Develop Data Repository

The University of Washington Libraries is excited to announce the formation of a team to develop a Data Repository. The Data Repository at UW (DRUW, pronounced droo) will provide a secure, long-term location for UW faculty to store and share their research datasets. This repository will support UW researchers in meeting federal and private funder data management mandates and will promote the principles of open access and data sharing, while providing a convenient place to archive and discover datasets from research done at UW.

DRUW builds upon the Libraries' existing Data Services offerings, which include assistance with data management plans, locating and acquiring research data, data curation and archiving, and data reference assistance. It also joins our ResearchWorks Services, which has been providing curation and archiving of digital research outputs for more than a decade. DRUW will allow campus researchers to archive their research data in a secure, reliable digital repository and allow users from off-campus to discover their work.

Data Repository Librarian Mahria Lebow has recently joined the Data Services Unit at the UW Libraries, and carries project management responsibilities for the repository. Updates and information about the project will be available here on the Data Services Blog.

For more information, contact Mahria Lebow, mahria at uw dot edu.

Wednesday, November 5, 2014

Two Weeks Until UW GIS Day!

On Nov. 19th, the UW Libraries will be hosting the 4th annual UW GIS Day in the Research Commons, Allen Library South, from 10am-3:45pm.

The event will highlight and celebrate the transformational role of Geographic Information Systems (GIS) and Remote Sensing Technologies. We hope you will join us to:
  • Connect with others working on GIS research
  • Hear presentations on LiDAR and Unmanned Aerial Vehicles (UAVs)
  • Enjoy lightning talks about GIS projects going on around campus
  • View UAVs, 3D visualization and 3D printing technology demonstrations
  • Learn about GIS-related resources available to the UW community
Speakers include: Prof. David Montgomery (ESS), Ralph Haugerud (USGS), David Shean (Ph.D. student, ESS), Aaron Cheuvront (UW Capital Projects), and Aaron Racicot (CUGOS). More information is available at:
This event is open to the public. We hope you will join us!

Friday, October 24, 2014

Student lightning talks needed for UW GIS Day

UW students: Interested in presenting at GIS Day? We're looking for UW students to give lightning talks. This year, we will be awarding a $100 and a $75 UW Bookstore gift card to the top two student lightning talkers!

For more information and to submit a talk proposal, see:

UW GIS Day info:
Date: Nov. 19, 2014
Time: 10am -- 3:45pm (you can stay for as little or as long as you like)
Place: UW Libraries' Research Commons

UW faculty and staff: please share this call for presenters with your students and encourage them to present. Additionally, you too can give a lightning talk (sorry, no prizes, though).

We hope you will join us!

Tuesday, October 21, 2014

Guide to Research Data Management Curriculum

In many of my workshops and class sessions, especially for librarians new to data management, I've been asked about existing research data management (RDM) curriculum and resources. People new to the field want to be able to browse online learning courses, join mailing lists to see what's going on, follow a blog or two. After compiling a few of these lists to serve as handouts, I decided to put them online in a LibGuide for Research Data Management Curriculum. In a way a curated list of these things feels very Yahoo 1990s, but it allows me a place to keep tabs on new curricula, blogs and other resources that I or my coworkers use on a regular basis.

Got any favorite sites/classes/blogs/lists you'd like to see on the list? Put it in the comments and I'll take a look.

Monday, October 13, 2014

UW GIS Day 2014

Campus GIS users:

Save the date!

November 19th is GIS Day and the University of Washington will highlight and celebrate the transformational role of Geographic Information Science (GIS) by hosting a day-long event in the UW Libraries' Research Commons.

Event information is available on the UW GIS Day website:

Interested in presenting at GIS Day? We're looking for people to give lightning talks. There are going to be two sessions: one for student presenters and one for non-students. Submit a talk proposal here:

We hope you will join us!

--UW GIS Day Planning Group

Thursday, September 25, 2014

Data + Libraries + Twitter

I was just adding to my Twitter list of data-related-library-type-people, and though I know there are several of these lists out there, I thought I'd throw this one out to see if anyone has a feed they'd like to add. Drop it in the comments and I'll add it to the list.

Wednesday, August 6, 2014

Data Librarianship Workshops

Beginning in September, Data Services Curriculum and Communications Librarian Jenny Muilenburg will be offering three Data Librarianship Workshops to librarians and staff who work with people using data. Designed to teach some of the essential concepts necessary for those working in a data support role, these workshops will cover data management plans, data librarian skills, repositories and archives, and data best practices (among others!).

Prior to each workshop there will be a small amount of readings/videos/research for attendees to complete. This will allow workshop attendees to begin with the same background knowledge, as well as enable us to share what we all bring from our different fields/positions.

Session 1, Monday, September 8, 2-3:30pm
Data Librarianship: Skills and Definitions
Curious what it means to be a “data librarian”? This workshop will look at the skills and traits necessary to work as a data librarian, whether or not your current position includes that description, as well as what other libraries and institutions are providing in the way of data support and guidance. We’ll also talk about how to gain various kinds of education and training to better support data-intensive research.

Session 2, Monday, January 26, 2-3:30pm
Data Librarianship: Archives & Repositories
There are so many archives and repositories out there it can be difficult to know where to start looking to help someone in your field (or especially a field you’re not familiar with). This workshop will look at some of the categories of archives and repositories, and we’ll have time to share some of the similarities and differences across disciplines. We’ll also talk about some of the usage and ethics considerations that come into play when researchers share their data.

Session 3, Monday, April 20, 2-3:30pm
Data Management Plans: Reading, Writing, and Sharing
The words “data management plan” get tossed around a lot in certain circles. In this session, we’ll spend time learning about the different disciplinary and/or agency requirements for DMPs. We’ll also have an opportunity to read case studies from several disciplines, and learn how to help someone create their own DMP using various resources and tools from the UW Libraries.

For more information, contact us at uwlib-data at uw dot edu.

Thursday, April 10, 2014

UW Libraries hosting virtual conference on "Dealing with the Data Deluge"

The UW Libraries is excited to announce that it will be hosting an upcoming NISO (National Information Standards Organization) virtual conference on "Dealing with the Data Deluge: Successful Techniques for Scientific Data Management." Attendance is open to the UW community.
  • Where: Allen Auditorium, UW Libraries
  • When: Wednesday, April 23, 8am-2pm
  • What: "Dealing with the Data Deluge: Successful Techniques for Scientific Data Management," a NISO virtual conference

With funding support from the Libraries Organization Development & Training to cover our site registration, this virtual conference will "explore in greater depth than traditional webinars some of the practical lessons from those who have implemented data management and developed best practices, as well as provide some insight into the evolving issues the community faces." More details from the event page:
With the expansion of digital data collection and the increased expectations of data sharing, researchers are turning to their libraries or institutional repositories as a place to store and preserve that data. Many institutions have created such data management services and see the data curation role as a growing and important element of their service portfolio. While some of the experience in managing other types of digital resources is transferrable, the management of large-scale scientific data has many special requirements and challenges. From metadata collection and cataloging data sources, to identification, discovery, and preservation, best practices and standards are still in their infancy.
Topics and Speakers include:
  • Keynote Speaker – Jan Brase, Ph.D., German National Library of Science and Technology; Managing Agent of DataCite, Chair of the International DOI Foundation (IDF), Vice-President of the International Council for Scientific and Technical Information (ICSTI), and Co-Chair of the CODATA Data Citation task group
  • Guidelines and Resources for Office of Science and Technology Policy (OSTP) Data Access Plans – Jared Lyle, Director of Data Curation Services, Interuniversity Consortium for Political and Social Research (ICPSR), University of Michigan
  • Joint Declaration of Data Citation Principles: Implementation of the Principles in the Harvard Dataverse Repository – Mercè Crosas, Ph.D., Director of Data Science, Institute for Quantitative Social Science (IQSS), Harvard University
  • Purdue University Research Repository (PURR): A Commitment to Supporting Researchers – Michael Witt, Head, Distributed Data Curation Center (D2C2); Associate Professor of Library Science, Purdue University Research Repository (PURR)
  • Is This Data Fit for My Use? The Challenges and Opportunities Data Provenance Presents – Adriane Chapman, MITRE
  • A Durable Space: Technologies for Accessing Our Collective Digital Heritage – David Wilcox, Product Manager, DuraSpace
  • The SHared Access Research Ecosystem (SHARE) Project: A Joint Initiative of ARL, AAU, and APLU – Judy Ruttenberg, Program Director for Transforming Research Libraries, Association of Research Libraries (ARL)

More information about the event, including the schedule, is online at

Wednesday, April 2, 2014

Research Data Management Workshops: Lessons Learned

From January 22 to March 5, 2014, three University of Washington librarians offered a seven-week course in research data management. As a complement to the two workshops we offered in 2013, which were geared toward research data management basics for librarians, this series was aimed at graduate students, primarily in the School of Environmental and Forest Sciences, Biology, Engineering, the iSchool and Health Sciences.

We ran the course as a pilot of the New England Collaborative Data Management Curriculum. Jenny Muilenburg, Mahria Lebow and Joanne Rich worked together to offer the class, which was designed as a one-hour meeting with lecture and exercises, meeting once a week for seven weeks. Students were asked to register, but the classes were not required, and no credit was given. Each of the weeks touched on one concept from the NECDMC curriculum, with one change made mid-course that combined two concepts into one lecture. We took each lecture module from NECDMC and modified it to suit our personal and institutional preferences, as well as adding UW-specific information. One primary lecture room was used on the main campus, where we both recorded the lecture and streamed it to a second location in Health Sciences, where Joanne Rich facilitated the streamed lecture and ran the exercises off-air.

Each class consisted of about 30 minutes of lecture, and 30 minutes of exercise and discussion. Module topics included types, stages and formats of data; metadata; storage, backup and security; legal and ethical considerations; sharing and reuse; and archiving and preservation. Experts from on campus were asked to contribute opinions and UW-specific information, specifically on metadata, storage and security, and legal and ethical information. Overall class evaluations after each lecture were positive, with good feedback about what to include in future iterations of the class.

It was a great first foray into RDM curriculum for non-librarians, although attendance, after attrition, was not as good as we'd hoped. Our next move will be to take this experience and the curriculum, modify and shorten it, and present it to subject-specific librarian groups on campus. I'd like to see one for STEM disciplines and the social sciences, and the health science librarians are working on one for their disciplines. More information will be forthcoming as we pull together what we've learned and where we'll go from here.

Wednesday, February 19, 2014

New Data Policies from AGU and PLOS

Both the American Geophysical Union and PLOS have recently announced new policies dealing with the data that underlies published research. The PLOS statement recognized that having open access literature without providing the underlying data was incongruous, and both policies provide recommendations for what to share, and how to share it.

In both statements, the requirement is not necessarily for all research data to be shared, but for the critical parts of data that were used to complete the published research and analysis. The two groups also require that data be made easily accessible, within legal constraints, and not held by the authors as the sole gatekeepers to the data. The expectation is for the data to be stored in trusted repositories: AGU provides a link to suggested repositories; PLOS suggests authors use repositories verified with criteria such as those provided by the Centre for Research Libraries or Data Seal of Approval (perhaps recognizing that when public archives aren't used, access to the data rapidly diminishes over time).

Recognizing that there are a variety of reasons all data may not be able to be shared publicly, each policy provides options for privacy or legal concerns, but requires that a statement be made to that effect, explaining why part or all of the data cannot be shared.

With funding agencies and the federal government moving swiftly toward open access for all research outputs, and the proliferation of tools making it easy to share data, more groups and publishers will no doubt be adopting similar policies in the very near future.