Monday, January 12, 2015

Special Journal Issue Focuses on Data Literacy and Librarians

Time for some reading: the latest issue of the Journal of eScience Librarianship focuses on the role of librarians in data literacy. Included are articles on data management education initiatives, designing RDM curriculum for librarians and graduate students, as well as some case studies from different institutions that used the New England Collaborative Data Management Curriculum in order to teach RDM to various constituencies.

Also featured is an "eScience in Action" piece titled Lessons Learned from a Research Data Management Pilot Course at an Academic Library, from the UW's own Mahria Lebow, Jennifer Muilenburg, and Joanne Rich, detailing their experience teaching a research data management course to graduate students in early 2014.

We're hoping to set aside some time to read through these articles in the next few weeks, and we hope to share some reactions here. Stay tuned!

Friday, January 9, 2015

DRUW: a glance under the hood

As promised, here is the blog post about the technologies we are going to be playing with to build our data repository. When we decided we wanted to pursue developing an institutional data repository, we evaluated different pieces of software, weighing variables like the maturity of the system, the presence and type of community behind the system, flexibility for handling different object types and general future-proofedness. There isn’t much of a dramatic pause for me to insert here, as we’ve already written in previous posts that the outcome of this analysis was going with Hydra.

But what is Hydra? Hydra isn’t a single thing or an out-of-the-box solution (though the community around it has set this as a future goal). Rather, it’s a framework of different pieces of software that come together to create an institutional repository. A Hydra installation can be used as a single interface to many different repositories, if we wanted to expand beyond the current scope of research data. Hydra is based on Fedora, the repository platform from DuraSpace, a nonprofit that supports a number of open source technologies related to digital assets (like DSpace and VIVO). Fedora is shorthand for Flexible Extensible Digital Object Repository Architecture and, as its long-form name implies, Fedora is a digital asset management system capable of handling content regardless of type (GIS, A/V, images, text, data, etc.). Of note, DuraSpace has recently released Fedora 4, which has some significant changes from Fedora 3, including being happier about ingesting larger files and providing RDF representations of content and relationships by default. The Hydra community is energetically working away at getting all of the pieces of the Hydra environment to play nicely with Fedora 4, and has advised new adopters of Hydra to plan on using Fedora 4 from the get-go, rather than create a situation that requires migration at a later date. So, we’ve had a bit of good luck here on our timing for jumping in!

So, Fedora is in charge of managing the objects. The other core components of a Hydra build are Solr and Blacklight. Solr is an open source search platform from Apache that indexes the repository content. Blacklight is the discovery interface that plugs into Solr and provides features like (customizable) faceted browsing, exporting results and saving search history. Now, those are just the core technologies; there are many other packages of code (referred to as gems in the world of Ruby - the programming language behind Hydra) necessary to get an instance of Hydra up and running. The community has developed several different flavors of Hydra that leverage this framework of technologies in deployable web applications (technically, Rails engines). The one we’ve elected to go with is Sufia.
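
To make the faceted browsing idea a little more concrete, here is a minimal sketch of the kind of request a discovery layer can send to Solr. It is written in Python only for illustration (Hydra itself is Ruby), and the host, the core name "druw" and the field names "creator" and "file_format" are all assumptions, not our actual configuration. The q, rows, wt, facet and facet.field parameters are standard Solr query parameters.

```python
from urllib.parse import urlencode


def build_facet_query(base_url, text, facet_fields, rows=10):
    """Build a Solr select URL that asks for matches plus facet counts."""
    params = [
        ("q", text),        # the search terms
        ("rows", rows),     # how many results to return
        ("wt", "json"),     # response format
        ("facet", "true"),  # turn faceting on
    ]
    # One facet.field parameter per field we want counts for.
    params += [("facet.field", field) for field in facet_fields]
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical host, core and field names, for illustration only.
url = build_facet_query(
    "http://localhost:8983/solr/druw",
    "salmon",
    ["creator", "file_format"],
)
print(url)
```

Each facet.field asks Solr to return counts of matching records per value of that field, which is roughly what a browsing sidebar needs in order to show how many results fall under each creator or file format.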

We’ve been working on use cases for our repository and our next steps are to define project phases, with realistic timelines and set milestones for each of these phases.

Tuesday, January 6, 2015

Data Librarianship Workshop for UW Libraries staff: Archives & Repositories

There are so many archives and repositories out there it can be difficult to know where to start looking to help someone in your field (or especially a field you’re not familiar with). This workshop, to be held Wednesday, January 28 from 2-3:30pm in the Allen Auditorium, will look at some of the categories of archives and repositories, and we’ll have time to share some of the similarities and differences across disciplines. We’ll also talk about some of the usage and ethics considerations that come into play when researchers share their data.

The workshop is open to all Libraries staff. Prior to the workshop, please identify 1-2 repositories in your subject area. Take 5-10 minutes and explore:
  • How easy it is to search for data
  • How easy it is to deposit data
  • What the depositor policies are
  • What kind of metadata the repository collects
  • Other general impressions

A good place to start (other than google) is

This workshop is the second of three workshops on data librarianship. The third will be held Wednesday, April 29th from 2-3:30pm in Allen Auditorium, and will focus on data management plans.

Questions can be left in the comments below.

Tuesday, December 9, 2014

DRUW Gets Going

As mentioned in our previous blog post, we are developing an institutional data repository here at UW. We are joining the Hydra community and building our digital repository using the Hydra framework, which pulls together various components and platforms, including Blacklight, Solr and Fedora. More about the technologies in a later post! This project is a partnership between the Libraries and UWIT, and the data will live on UWIT’s lolo filesystem.

How did we get here? A few years ago, the Data Services team conducted a survey of 323 campus researchers and found that a strong need on our campus was for a place where researchers could store their data for the long term. This, coupled with funder mandates for providing public access to data, meant that providing a data repository service at UW just made sense. Luckily, the Libraries administration agreed with us!

Since getting the go-ahead on the project, the majority of our time (other than sorting out technologies - to be discussed later) has been spent on ensuring that we make a system that people are going to want to use and that meets their needs. The best way to do this, of course, is to figure out what those wants and needs might be. For this, we used a couple of approaches. First, we had two different standing library committees, the Data Services Committee and the Metadata Interest Group, create user stories. User stories are a technique from agile software development for defining system requirements from the perspective of the people who will use the system. There are different ways to write them; we chose to create each of ours from the skeleton sentence: “(a user type) wants to (their want) so that (why they want it)”. An example user story that was generated from this exercise: “A data depositor wants to not have to contact a librarian to upload a dataset, so that depositing can be done when they want to.” This particular story led to a desired system feature for self-deposit of datasets. Our most common user types were “Data depositor,” “Researcher” and “Librarian.”
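
For anyone who wants to keep user stories in a structured form, here is a minimal Python sketch of the skeleton sentence. It is only an illustration, not the tool we used: the UserStory class and its field names are made up for this example, and the second story is a hypothetical one added to show the tally.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class UserStory:
    user_type: str  # who wants it
    want: str       # what they want to do
    reason: str     # why they want it

    def sentence(self):
        """Fill in the skeleton: (user type) wants to (want) so that (why)."""
        return f"{self.user_type} wants to {self.want} so that {self.reason}."


stories = [
    # The example story from the committees' exercise.
    UserStory(
        "A data depositor",
        "not have to contact a librarian to upload a dataset",
        "depositing can be done when they want to",
    ),
    # Hypothetical story, invented for this sketch.
    UserStory(
        "A researcher",
        "search datasets by keyword",
        "relevant data is easy to find",
    ),
]

# Tally stories per user type to see whose needs are best represented.
counts = Counter(story.user_type for story in stories)

for story in stories:
    print(story.sentence())
```

Keeping the three parts separate makes it easy to group stories by user type or to spot several stories that point at the same system feature.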

While these were being developed, focus groups were held, which brought together researchers from across campus to discuss what they would want out of a data repository. Specifically, the questions asked of the focus groups were intended to identify potential barriers to use, so that we can be aware of those from the beginning and do our best to minimize or eliminate them. These conversations were summarized and then further distilled into the user story format. In total, 71 unique system features were identified. We are currently working on prioritizing the different features, determining what features we have the capability to include now, and what we can perhaps work towards in a future development phase of the repository project.

Wednesday, November 12, 2014

UW Libraries Forms Team to Develop Data Repository

The University of Washington Libraries is excited to announce the formation of a team to develop a Data Repository. The Data Repository at UW (DRUW, pronounced droo) will provide a secure, long-term location for UW faculty to store and share their research datasets. This repository will support UW researchers in meeting federal and private funder data management mandates and will promote the principles of open access and data sharing, while providing a convenient place to archive and discover datasets from research done at UW.

DRUW builds upon the Libraries' existing Data Services offerings, which include assistance with data management plans, locating and acquiring research data, data curation and archiving, and data reference assistance. It also joins our ResearchWorks Services, which has been providing curation and archiving of digital research outputs for more than a decade. DRUW will allow campus researchers to archive their research data in a secure, reliable digital repository and allow users from off-campus to discover their work.

Data Repository Librarian Mahria Lebow has recently joined the Data Services Unit at the UW Libraries, and carries project management responsibilities for the repository. Updates and information about the project will be available here on the Data Services Blog.

For more information, contact Mahria Lebow, mahria at uw dot edu.

Wednesday, November 5, 2014

Two Weeks Until UW GIS Day!

On Nov. 19th, the UW Libraries will be hosting the 4th annual UW GIS Day in the Research Commons, Allen Library South, from 10am-3:45pm.

The event will highlight and celebrate the transformational role of Geographic Information Systems (GIS) and Remote Sensing Technologies. We hope you will join us to:
  • Connect with others working on GIS research
  • Hear presentations on LiDAR and Unmanned Aerial Vehicles (UAVs)
  • Enjoy lightning talks about GIS projects going on around campus
  • View UAVs, 3D visualization and 3D printing technology demonstrations
  • Learn about GIS-related resources available to the UW community

Speakers include: Prof. David Montgomery (ESS), Ralph Haugerud (USGS), David Shean (Ph.D. student, ESS), Aaron Cheuvront (UW Capital Projects), and Aaron Racicot (CUGOS). More information is available at:

This event is open to the public. We hope you will join us!

Friday, October 24, 2014

Student lightning talks needed for UW GIS Day

UW students: Interested in presenting at GIS Day? We're looking for UW students to give lightning talks. This year, we will be awarding a $100 and a $75 UW Bookstore gift card to the top two student lightning talkers!

For more information and to submit a talk proposal, see:

UW GIS Day info:
Date: Nov. 19, 2014
Time: 10am -- 3:45pm (you can stay for as little or as long as you like)
Place: UW Libraries' Research Commons

UW faculty and staff: please share this call for presenters with your students and encourage them to present. Additionally, you too can give a lightning talk (sorry, no prizes, though).

We hope you will join us!

Tuesday, October 21, 2014

Guide to Research Data Management Curriculum

In many of my workshops and class sessions, especially for librarians new to data management, I've been asked about existing research data management (RDM) curriculum and resources. People new to the field want to be able to browse online learning courses, join mailing lists to see what's going on, follow a blog or two. After compiling a few of these lists to serve as handouts, I decided to put them online in a LibGuide for Research Data Management Curriculum. In a way a curated list of these things feels very Yahoo 1990s, but it allows me a place to keep tabs on new curricula, blogs and other resources that I or my coworkers use on a regular basis.

Got any favorite sites/classes/blogs/lists you'd like to see on the list? Put it in the comments and I'll take a look.

Monday, October 13, 2014

UW GIS Day 2014

Campus GIS users:

Save the date!

November 19th is GIS Day and the University of Washington will highlight and celebrate the transformational role of Geographic Information Science (GIS) by hosting a day-long event in the UW Libraries' Research Commons.

Event information is available on the UW GIS Day website:

Interested in presenting at GIS Day? We're looking for people to give lightning talks. There are going to be two sessions: one for student presenters and one for non-students. Submit a talk proposal here:

We hope you will join us!

--UW GIS Day Planning Group

Thursday, September 25, 2014

Data + Libraries + Twitter

I was just adding to my Twitter list of data-related-library-type-people, and though I know there are several of these lists out there, I thought I'd throw this one out to see if anyone has a feed they'd like to add. Drop it in the comments and I'll add it to the list.