News and information related to data management provided by the University of Washington Libraries.
Monday, April 24, 2017
Monday, April 17, 2017
Guest Seminar: Andrew Hufton
Tuesday, Apr. 18, 1:30 p.m., Smith Hall 105
The UW eScience Institute's Reproducibility and Open Science Group is hosting Andrew Hufton, managing editor of Scientific Data at Nature Research. Hufton's talk is entitled "Beyond supplementary material: Sharing data effectively through repositories and data journals."
ABSTRACT
The Nature Research journals understand that effective data sharing supports reproducibility and can increase the impact of published works. Indeed, our policies have long recognized that data sharing is a fundamental part of research publication. The increasing complexity and size of research datasets, however, poses challenges for scientists who wish to share their data in a reusable and transparent manner. Based on my experience at Scientific Data, an open-access data-focused journal from Nature Research, I will provide tips on how researchers can share their data in an effective manner that promotes reuse, supports the credibility of their research, and ensures they get proper credit. This will include advice on writing better data-rich papers, the basics of presenting datasets in a useful manner, and tips on how to find the right repository for your data. I will also explain Scientific Data's editorial policies and share some of our experiences peer-reviewing and publishing data so far.
BIO
Andrew is responsible for the editorial policies of Scientific Data, in consultation with the Honorary Editor and Advisory Panel, and works with the Editorial Board to ensure a fair and thorough peer-review process for all submissions. Andrew received his PhD from Stanford University in 2006, and did postdoctoral work at the Max Planck Institute for Molecular Genetics in Berlin. His research included topics in developmental genetics, computational biology and genome evolution. Before joining Scientific Data, Andrew worked as an Editor at Molecular Systems Biology.
Wednesday, March 29, 2017
UW Data Science Seminar: Sir Philip Campbell
Wednesday, April 5, 3:30 p.m. in Physics/Astronomy Auditorium A118
The role of PIs in sustaining the progress and robustness of research is critically important, and yet the pressures on them - some well advised, some not - seem to keep growing. To help Nature's future coverage of these issues, I will present an overview of some of the key pressures on PIs and invite insights and proposals into how funders, universities and journals might best mitigate them.
Sir Philip Campbell, editor-in-chief of Nature, will be presenting “Pressures on principal investigators and their need of support: A consultation” at next week's Data Science Seminar. The Data Science Seminar is free and open to the public.
Wednesday, March 22, 2017
New Tools for Data Exploration
Data-Planet
The University of Washington Libraries now subscribe to Data-Planet, an "interactive database [that] allows you to create tables, maps, and figures from a variety of licensed and public data sources" (find it anytime in the A-Z list of databases here). Access the database on campus, or log in with your NetID for off-campus access. For information about the datasets included in the repository or for an introductory video, visit the Data-Planet libguide.
PolicyMap
The UW Libraries are in the trial phase of PolicyMap, an online U.S. national data, mapping, and analytics platform that does not require any software download (find it in the A-Z list of databases here). Users can interact with data available on PolicyMap or upload their own spreadsheets to map data using just their browser. The trial period ends April 24, 2017. Currently, access is available only on campus.
Tuesday, February 21, 2017
UW Data Science Seminar: Kelsey Jordahl
Wednesday, February 22, 3:30 p.m. in Johnson Hall 102
Planet Labs currently operates about 60 Earth observation satellites imaging 50 million square kilometers of land area per day. We plan on tripling those figures in coming months, fulfilling our Mission 1 to image the surface of the Earth every day. Global mosaics are created from these images at regular intervals (quarterly, monthly, and weekly) by selecting the best quality scenes (e.g. cloud- and haze-free), color balancing, and seamlessly compositing millions of scenes to create continuous maps of the Earth for each time slice. As our data rate increases, we plan on scaling up the cadence of our mosaics, including building a continuously updated "dynamic" mosaic of the most recent cloud-free images of the Earth. Daily data at 5-meter spatial resolution will open up new analysis techniques previously limited by the temporal or spatial resolution of existing instruments.
Kelsey Jordahl, Mosaics Team Lead at Planet Labs, will be presenting “Mosaicking the Earth Every Day” at tomorrow's Data Science Seminar. The Data Science Seminar is free and open to the public.
Friday, February 17, 2017
Love Your Data Week, Day 5: Rescuing Unloved Data
How do data become unloved? We data users don’t love data that are messy, poorly documented, incomplete, or unwieldy, to name just a few frustrations. However, one important way that data become unloved is that they are just plain old. Older data tend not to be machine-readable, which can pretty much be the kiss of death. Digitization, while it’s improving, is still somewhat labor-intensive and costly, so unless a data set is obviously worth the trouble, it may languish.
However, researchers are starting to explore whether there may be some hidden gems worth rescuing. One area in which this is happening is climate data, and a great example is the Glacier Photograph Collection from the National Snow and Ice Data Center (NSIDC). Before this collection was digitized, users had to travel to the NSIDC in Colorado, ask staff to find physical images or microfilm for them in the collection, and then deal with those physical artefacts. Not surprisingly, the collection had few users. However, digitizing these photographs -- which can be considered data sources, as they contain information that can be analyzed -- has made them not only accessible, but an important resource for documenting changes in glacier size and coverage. Digitizing some of the old photographs also suggests locations for repeat photographs from the same vantage point, which can indicate changes across time periods.
PHOTO: Left: William O. Field, 1941; Right: Bruce F. Molnia, 2004. Muir Glacier: From the Glacier Photograph Collection. Boulder, Colorado USA: National Snow and Ice Data Center. Digital media.
But using the above example is cheating a little bit; these photographs were unloved because they were undigitized, but it was clear that they were worth digitizing. In fact, it was so clear that NSIDC was able to get funding and enter into partnerships to get that work done. So what if a researcher has a great idea, but needs sheer person-power to bring it to fruition? These days, crowd-sourcing may do the trick! Check out the Swiss project Data Rescue @ Home, in which citizen-volunteers are entering German climate data collected during WWII, and also have completed entering data from a weather station in the Solomon Islands collected in the early to mid-1900s. By January 2014, they reported having digitized 1.3 million values! They note: “The old data are expected to be very useful for different international research and reanalysis projects…[for example,] historical weather data from the Azores Islands are particularly valuable since the islands are located at the southern node of the most important climatic variability mode in the North Atlantic-European region, the so-called North Atlantic Oscillation (NAO), and there are not much other historical data available from the larger region.”
PHOTO: Example of data collected in the Solomon Islands, entered electronically by citizen-volunteers of the Data Rescue @ Home project (Accessed 2-13-17).
Interested in getting involved in a citizen-science project yourself? Here’s a list of possibilities! And if you really get hooked, you may want to dive into some collections of older non-digitized data and consider starting your own project, to rescue the unloved data and give them new life.
OK, I’m off now to figure out how to get on the project where I can hang out on the beach in New Jersey and count horseshoe crabs!
Ann Glusker PhD MPH MLIS
Research and Data Coordinator
National Network of Libraries of Medicine, Pacific NW Region
University of Washington Health Sciences Library
Thursday, February 16, 2017
Love Your Data Week, Day 4: Finding the Right Data
Welcome to Love Your Data Week, Day 4: Finding the Right Data. Today's theme is about asking the right questions, finding the right sources, and citing accordingly -- all of which will help you locate the right data and help your audience see why you chose the data you did.
Our friends at the National Network of Libraries of Medicine/Pacific Northwest Region have taken today to highlight the new DataLumos initiative from ICPSR at the University of Michigan. This project aims to archive government datasets to ensure their preservation into the future. Check out their post on the Dragonfly blog describing this and other data archiving work happening around the country.