Thursday, September 25, 2014

Data + Libraries + Twitter

I was just adding to my Twitter list of data-related-library-type-people, and though I know there are several of these lists out there, I thought I'd throw this one out to see if anyone has a feed they'd like to add. Drop it in the comments and I'll add it to the list.

Wednesday, August 6, 2014

Data Librarianship Workshops

Beginning in September, Data Services Curriculum and Communications Librarian Jenny Muilenburg will be offering three Data Librarianship Workshops to librarians and staff who work with people using data. Designed to teach some of the essential concepts necessary for those working in a data support role, these workshops will cover data management plans, data librarian skills, repositories and archives, and data best practices (among others!).

Prior to each workshop there will be a small amount of readings/videos/research for attendees to complete. This will allow workshop attendees to begin with the same background knowledge, as well as enable us to share what we all bring from our different fields/positions.

Session 1, Monday, September 8, 2-3:30pm
Data Librarianship: Skills and Definitions
Curious what it means to be a “data librarian”? This workshop will look at the skills and traits necessary to work as a data librarian, whether or not your current position includes that description, as well as what other libraries and institutions are providing in the way of data support and guidance. We’ll also talk about how to gain various kinds of education and training to better support data-intensive research.

Session 2, Monday, January 26, 2-3:30pm
Data Librarianship: Archives & Repositories
There are so many archives and repositories out there it can be difficult to know where to start looking to help someone in your field (or especially a field you’re not familiar with). This workshop will look at some of the categories of archives and repositories, and we’ll have time to share some of the similarities and differences across disciplines. We’ll also talk about some of the usage and ethics considerations that come into play when researchers share their data.

Session 3, Monday, April 20, 2-3:30pm
Data Management Plans: Reading, Writing, and Sharing
The words “data management plan” get tossed around a lot in certain circles. In this session, we’ll spend time learning about the different disciplinary and/or agency requirements for DMPs. We’ll also have an opportunity to read case studies from several disciplines, and learn how to help someone create their own DMP using various resources and tools from the UW Libraries.

For more information, contact us at uwlib-data at uw dot edu.

Thursday, April 10, 2014

UW Libraries hosting virtual conference on "Dealing with the Data Deluge"


The UW Libraries is excited to announce that it will be hosting an upcoming NISO (National Information Standards Organization) virtual conference on "Dealing with the Data Deluge: Successful Techniques for Scientific Data Management." Attendance is open to the UW community.
  • Where: Allen Auditorium, UW Libraries
  • When: Wednesday, April 23, 8am-2pm
  • What: "Dealing with the Data Deluge: Successful Techniques for Scientific Data Management," a NISO virtual conference

With funding support from the Libraries Organization Development & Training to cover our site registration, this virtual conference will "explore in greater depth than traditional webinars some of the practical lessons from those who have implemented data management and developed best practices, as well as provide some insight into the evolving issues the community faces." More details from the event page:
With the expansion of digital data collection and the increased expectations of data sharing, researchers are turning to their libraries or institutional repositories as a place to store and preserve that data. Many institutions have created such data management services and see the data curation role as a growing and important element of their service portfolio. While some of the experience in managing other types of digital resources is transferrable, the management of large-scale scientific data has many special requirements and challenges. From metadata collection and cataloging data sources, to identification, discovery, and preservation, best practices and standards are still in their infancy.
Topics and Speakers include:
  • Keynote Speaker – Jan Brase, Ph.D., German National Library of Science and Technology; Managing Agent of DataCite, Chair of the International DOI Foundation (IDF), Vice-President of the International Council for Scientific and Technical Information (ICSTI), and Co-Chair of the CODATA Data Citation task group
  • Guidelines and Resources for Office of Science and Technology Policy (OSTP) Data Access Plans – Jared Lyle, Director of Data Curation Services, Interuniversity Consortium for Political and Social Research (ICPSR), University of Michigan
  • Joint Declaration of Data Citation Principles: Implementation of the Principles in the Harvard Dataverse Repository – Mercè Crosas, Ph.D., Director of Data Science, Institute for Quantitative Social Science (IQSS), Harvard University
  • Purdue University Research Repository (PURR): A Commitment to Supporting Researchers – Michael Witt, Head, Distributed Data Curation Center (D2C2); Associate Professor of Library Science, Purdue University Research Repository (PURR)
  • Is This Data Fit for My Use? The Challenges and Opportunities Data Provenance Presents – Adriane Chapman, MITRE
  • A Durable Space: Technologies for Accessing Our Collective Digital Heritage – David Wilcox, Product Manager, DuraSpace
  • The SHared Access Research Ecosystem (SHARE) Project: A Joint Initiative of ARL, AAU, and APLU – Judy Ruttenberg, Program Director for Transforming Research Libraries, Association of Research Libraries (ARL)


More information about the event, including the schedule, is online at www.niso.org/news/events/2014/virtual/data_deluge/


Wednesday, April 2, 2014

Research Data Management Workshops: Lessons Learned

From January 22 to March 5, 2014, three University of Washington librarians offered a seven-week course in research data management. As a complement to the two workshops we offered in 2013, which were geared toward research data management basics for librarians, this series was aimed at graduate students, primarily in the School of Forest and Environmental Sciences, Biology, Engineering, the iSchool and Health Sciences.

We ran the course as a pilot of the New England Collaborative Data Management Curriculum. Jenny Muilenburg, Mahria Lebow and Joanne Rich worked together to offer the class, which was designed as a one-hour meeting with lecture and exercises, meeting once a week for seven weeks. Students were asked to register, but the classes were not required, and no credit was given. Each of the weeks touched on one concept from the NECDMC curriculum, with one change made mid-course that combined two concepts into one lecture. We took each lecture module from NECDMC and modified it to suit our personal and institutional preferences, as well as adding UW-specific information. One primary lecture room was used on the main campus, where we both recorded the lecture and streamed it to a second location in Health Sciences, where Joanne Rich facilitated the streamed lecture and ran the exercises off-air.

Each class consisted of about 30 minutes of lecture, and 30 minutes of exercise and discussion. Module topics included types, stages and formats of data; metadata; storage, backup and security; legal and ethical considerations; sharing and reuse; and archiving and preservation. Experts from on campus were asked to contribute opinions and UW-specific information, specifically on metadata, storage and security, and legal and ethical information. Overall class evaluations after each lecture were positive, with good feedback about what to include in future iterations of the class.

It was a great first foray into RDM curriculum for non-librarians, although attendance dropped off over the seven weeks and was not as good as we'd hoped. Our next move will be to take this experience and the curriculum, modify and shorten it, and present it to subject-specific librarian groups on campus. I'd like to see one for STEM disciplines and the social sciences, and the health science librarians are working on one for their disciplines. More information will be forthcoming as we pull together what we've learned and where we'll go from here.

Wednesday, February 19, 2014

New Data Policies from AGU and PLOS

Both the American Geophysical Union and PLOS have recently announced new policies dealing with the data that underlies published research. The PLOS statement recognized that having open access literature without providing the underlying data was incongruous, and both policies provide recommendations for what to share, and how to share it.

In both statements, the requirement is not necessarily for all research data to be shared, but for the critical parts of data that were used to complete the published research and analysis. The two groups also require that data is made easily accessible, within legal constraints, and not held by the authors as the sole gatekeeper to the data. The expectation is for the data to be stored in trusted repositories: AGU provides a link to suggested repositories; PLOS suggests authors use repositories verified with criteria such as that provided by the Centre for Research Libraries or Data Seal of Approval (perhaps recognizing that when public archives aren't used, access to the data rapidly diminishes over time).

Recognizing that there are a variety of reasons why not all data can be shared publicly, each policy provides options for privacy or legal concerns, but requires that a statement be made to that effect, explaining why part or all of the data cannot be shared.

With funding agencies and the federal government moving swiftly toward open access for all research outputs, and the proliferation of tools making it easy to share data, more groups and publishers will no doubt be adopting similar policies in the very near future.



Monday, December 2, 2013

UW Collaborating on $37M Data Science Initiative

Big news for the University of Washington: UW, along with NYU and Berkeley, has been given a 5-year, $37.8M award from the Gordon and Betty Moore Foundation and the Alfred P. Sloan Foundation to advance the growth of data-intensive discovery across a broad range of fields. It's a huge, cross-institutional and multi-disciplinary effort that will build and explore new data science challenges and environments.

The UW team includes more than a dozen faculty, and is led by Ed Lazowska, Director of the UW eScience Institute. Berkeley's team is led by Saul Perlmutter and NYU's by Yann LeCun.

Fernando Perez from Berkeley has written a good description of his hopes for the project, what he thinks it means and what he thinks it might help solve (hint: it involves more than just data science). 

Monday, September 16, 2013

New Reports on Data Archiving and Citation


Two new reports have been published that deal with data issues in research, from proper documentation and archiving, through use of data in research and publication, down to citation. The first is Lost Science: Protecting Data Through Improved Archiving by Karen E. Simmons (http://onlinelibrary.wiley.com/doi/10.1002/2013EO370006/abstract). This short but on-point article uses concrete examples from NASA data to show what can happen when digital data isn't properly documented, when documentation and formatting standards aren't followed or change rapidly, and the potential loss to science and society at large when bountiful, important, and historic information is lost.

The second report is from the U.S. CODATA and the Board on Research Data and Information (BRDI): Out of Cite, Out of Mind: The Current State of Practice, Policy and Technology for the Citation of Data (https://www.jstage.jst.go.jp/article/dsj/12/0/12_OSOM13-043/_article). From the abstract: "This report discusses the current state of data citation practices, its supporting infrastructure, a set of guiding principles for implementing data citation, challenges to implementation of good data citation practices, and open research questions." This is the second report on data citation issues from this group: the first, For Attribution-Developing Data Attribution and Citation Practices and Standards (2012), is available from the National Academies Press online at: http://www.nap.edu/catalog.php?record_id=13564



Wednesday, September 4, 2013

Upcoming Data-related Webinars

There are three upcoming webinars that may be of interest to data-minded folks:

  • DuraSpace is hosting Stewarding Research Data with Fedora and Islandora, September 10, 2013, 11am-12pm Eastern. Mark Leggott from the University of Prince Edward Island and founder of the open source Islandora project will be speaking. From the blurb: “In one example at UPEI, Islandora tools are being built to sync data from systems like DropBox and Google Drive to Fedora, providing immediate preservation services for any arbitrary collection of data. This Physical Data Model is intended to provide a quick and seamless integration with Islandora where the researchers can subsequently enrich and optionally choose to share their data with others. In another example the Smithsonian is applying a set of Intellectual Data Models to steward research output from a variety of projects. In this case data is ingested into Islandora against a domain-specific data model that applies specific metadata forms, data transformations and data viewers to make the data more accessible immediately on ingest.” Register here: http://events.r20.constantcontact.com/register/event?oeidk=a07e7yc78n7872b1e63&llr=5iy95gcab.
  • NISO is hosting a two-part webinar on Research Data Curation. Part 1 is on E-Science Librarianship (September 11, 1-2:30pm Eastern), and will discuss “new librarian strategies, tools, and technologies developed to support the lifecycle of scholarly production and data curation. Specific challenges that face research libraries will be described and potential responses will be explored, along with a discussion of the types of skills and services that will be required for librarians to effectively curate research output.” Registration is here: http://www.niso.org/news/events/2013/webinars/escience. Part 2 is on Libraries and Big Data (September 18, 1-2:30pm Eastern), and will explore librarians and their role in data curation: “There are many challenges to effectively manage and curate this data—challenges that are both similar and different to managing document archives. Libraries can and are assuming a key role in making this information more useful, visible, and accessible, such as creating taxonomies, designing metadata schemes, and systematizing retrieval methods. Our panelists will talk about their experience with big data curation, best practices for research data management, and the tools used by libraries as they take on this evolving role.” Registration is here: http://www.niso.org/news/events/2013/webinars/data_curation.
  • The National Research Council's Board on Research Data and Information will be hosting a public symposium titled Privacy in a Big Data World, September 23, 3-5:30pm Eastern. The symposium will discuss such issues as providing adequate privacy protection for individuals without impeding research and innovation, how different regulatory approaches to privacy impact national and transnational research, and how society’s perspective on privacy is evolving. More detail can be found here: http://sites.nationalacademies.org/PGA/brdi/PGA_084312


Wednesday, July 24, 2013

Reports from Open Repositories 2013 and IASSIST2013


Two UW Libraries Data Services Team members were able to attend recent conferences related to data services: Open Repositories 2013, and the International Association for Social Science Information Services and Technology 2013. 

OR2013 was held on Prince Edward Island from July 8th-11th. Stephanie Wright attended two workshops looking at the future of institutional repositories and how institutional repositories deal with data. Plenaries by Victoria Stodden and Jean-Claude Guedon were inspiring, and both focused (in different ways) on research reproducibility, scholarly communications, and altmetrics. The tweets were flowing, so if you'd like to read the thoughts of attendees, check out the Twitter hashtag #OR2013, or you can check out the summarized version on Storify, and a summary of the conference from an IASSIST perspective at http://www.iassistdata.org/blog/or2013-open-repositories-confront-research-data

Jennifer Muilenburg attended IASSIST2013 in Cologne, Germany, from May 28-31. Presentations included several on training researchers on research data management, different approaches to institutional repositories, issues around data collection in university libraries, access to restricted data, and lots more. A brief summary of the conference can be found here: http://www.iassistdata.org/blog/ich-bin-ein-iassister. Due to spotty wifi, the tweets were lacking, but some of what made it through includes links to presentations.

Wednesday, July 10, 2013

Upcoming Conferences and Sessions and Workshops, Oh My!

Since beginning to follow groups and conferences relevant to data management issues for information professionals in November 2012, I've known that there is always something upcoming in the not-too-distant future that looks fascinating and informative. There is a panoply happening right now, though, that could have us all booked out of the office for the bulk of 2013 (given inexhaustible travel budgets, that is). Here are a few upcoming events that have caught my eye:


  • Happening right now is Open Repositories 2013 (#or2013) in Prince Edward Island, Canada, July 8-12. I've been following the Twitter feed via the hashtag and Storify; lots of interesting talk going on around data policies, data curation methods and technologies, the research lifecycle...
  • The University of East London has been training various types of staff on research data management over the last year. They're summarizing some of their work at a daylong workshop, "Support for support: training those in RDM support roles," July 16, London, UK. I'm currently working my way through some of UEL's online curriculum offerings for librarians, and very much wish I could be there for this session.
  • For those interested in the metadata side of scientific data, Camp-4-Data in Lisbon, Portugal, on September 6, will be exploring many facets of metadata standards used to manage scientific data. This is being held just before iPres, the 10th International Conference on the Preservation of Digital Objects, and DCMI, the International Conference on Dublin Core and Metadata Applications. Head in a whirl yet?
  • The HathiTrust Research Center UnCamp 2013, September 8-9, Urbana, IL, is targeted to digital humanities tool developers, researchers and librarians of HathiTrust institutions, and will include hands-on coding and demonstration, use cases, and community building in an un-conference programming format. Register early to help form the program.
  • Data Information Literacy Symposium, West Lafayette, IN, September 23-24. This workshop will "explore roles for practicing librarians in teaching competencies in data management and curation to graduate students." Registration for this is currently full, but following via twitter should be interesting.
  • The Digital Humanities Data Curation Workshop is being held in College Park, MD, October 16-18. Their resource guide is a great place to start if you can't attend one of their workshops.
  • The 2013 Digital Library Foundation Forum, November 4-6 in Austin, TX. Proposed sessions include one on using a CRM tool to track data management services in an academic library, one on the influence of faculty rank on attitudes toward research data management, several presentations on encouraging better and more specific use of metadata, fostering a culture of data sharing among researchers, data management education for librarians and researchers...

I'm sure there are others out there that I missed; if you have a suggestion, please add it in the comments below.