Moon Zoo: Counting lunar craters with “citizen science”

Title: The Moon Zoo citizen science project: Preliminary results for the Apollo 17 landing site
Authors: Roberto Bugiolacchi et al.
First Author's Affiliation: Centre for Planetary Sciences at UCL/Birkbeck.

Scientific projects with large datasets increasingly rely on crowd sourcing to analyse their data. The Moon Zoo project (part of the same suite of projects as Galaxy Zoo, which is devoted to classifying galaxy morphologies) relies on this kind of “citizen science” to examine images of the lunar surface taken by the Lunar Reconnaissance Orbiter Camera. A census of craters and their sizes allows for the determination of the cratering rate on the Moon’s surface, the level of crater erosion and degradation (measured by looking at the variability of the circle sizes and locations), and estimates of the regolith depth. However, the effects of erosion and illumination make accurately identifying craters difficult for computers, and human eyes are generally superior for these sorts of tasks. The limiting factor then becomes the sheer number of craters (thousands upon thousands) relative to the number of researchers available to examine the data.

The Moon Zoo interface is fairly easy and intuitive to use. After a brief training tutorial, users are presented with images of the lunar surface, on which they can place markers to indicate the size and position of craters (Fig. 1). To validate the accuracy and reliability of the crowd sourced measurements, the authors of this paper conduct their own crater count on a set of images, and compare this to the data generated by the users.


Fig. 1: The Moon Zoo interface allows users to mark the location and size of craters by drawing circles. Additional options are available for the user to mark any other interesting features in the image. On average, each image is examined six times by different users.

While crowd sourcing is a cheap and efficient way to analyze large data sets, it is not without its shortcomings. Of the more than 9000 Moon Zoo users, each user provides on average 14 annotations per crater. However, almost 75% of all users identified fewer than 10 craters, which indicates a relatively low level of commitment. This brings into question whether the data generated by users are actually reliable. The authors set a minimum threshold of 20 crater annotations per user for the data to be used in any scientific analysis, which eliminates a significant fraction of users. Because most users lack experience, there is a lot of variation in the estimates of crater boundaries and locations. For example, one systematic error arises because a significant number of users mark the smallest craters with the smallest default crater marker instead of adjusting it, an effect that is clearly visible as irregularities in the crater size distribution (Fig. 2). The authors also conclude that users should be sufficiently trained (i.e. through a tutorial) to ensure that their responses are reliable.
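The per-user threshold described above is easy to picture as a filtering step. The following is only a minimal sketch, not the authors' pipeline: the record layout (`user`, `x`, `y`, `diameter_m`), the function name `filter_by_user_activity`, and the toy data are all assumptions made for illustration. Only the cut at 20 annotations comes from the text.

```python
from collections import Counter

# Threshold quoted in the text; everything else here is hypothetical.
MIN_ANNOTATIONS = 20


def filter_by_user_activity(annotations, min_count=MIN_ANNOTATIONS):
    """Keep only annotations made by users with at least min_count annotations."""
    # Count how many annotations each user contributed in total.
    counts = Counter(a["user"] for a in annotations)
    return [a for a in annotations if counts[a["user"]] >= min_count]


# Toy data: one committed user (25 annotations) and one casual user (3).
annotations = (
    [{"user": "alice", "x": i, "y": i, "diameter_m": 30.0} for i in range(25)]
    + [{"user": "bob", "x": i, "y": i, "diameter_m": 12.0} for i in range(3)]
)

kept = filter_by_user_activity(annotations)
kept_users = sorted({a["user"] for a in kept})
print(len(kept), kept_users)
```

In this toy case the casual user's three annotations are dropped and the committed user's 25 are kept, which mirrors how the cut removes a large fraction of users while retaining most of the annotations from active contributors.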


Fig. 2: Histogram showing the distribution of crater sizes (the lower panel shows the percent deviation from the power law fit). The red spikes are a result of users selecting the smallest available crater marker size for a given level of zoom on an image, instead of measuring the true crater size.
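To make the "percent deviation from the power law fit" in Fig. 2 concrete, here is a rough sketch of how such a residual could be computed. It assumes an ordinary least-squares fit in log-log space, which is a common choice but not necessarily the method used in the paper. The function names and the binned counts are invented for illustration: the counts follow an exact power law, with one bin inflated by 50% to mimic a default-marker spike.

```python
import math


def fit_power_law(diameters, counts):
    """Fit counts = a * diameters**b by least squares on log10 values."""
    xs = [math.log10(d) for d in diameters]
    ys = [math.log10(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # power-law exponent (slope in log-log)
    a = 10 ** (mean_y - b * mean_x)    # normalisation (intercept in log-log)
    return a, b


def percent_deviation(diameters, counts, a, b):
    """Percent difference between each observed count and the fitted value."""
    return [100.0 * (c - a * d ** b) / (a * d ** b)
            for d, c in zip(diameters, counts)]


# Hypothetical bin centres (metres) and counts following N = 1e6 * D**-2.
diameters = [10.0, 20.0, 40.0, 80.0, 160.0]
counts = [1e6 * d ** -2 for d in diameters]
counts[1] *= 1.5  # inflate one bin to imitate a default-marker spike

a, b = fit_power_law(diameters, counts)
dev = percent_deviation(diameters, counts, a, b)
for d, p in zip(diameters, dev):
    print(f"D = {d:6.1f} m  deviation = {p:+6.1f}%")
```

The inflated bin shows the largest positive deviation, which is the kind of signature the red spikes in the figure represent. Note that the spike also pulls the fit itself slightly, so the neighbouring bins pick up small negative deviations.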

To improve the quality of responses and increase the retention rate of users, researchers involved in similar “citizen science” projects have proposed various incentives for the public to volunteer their time to these efforts. Many projects offer acknowledgements in scientific publications to users involved in significant discoveries, as well as gamification incentives that make the data analysis experience more fun and rewarding. Even if it’s not a perfect system, crowd sourced science still offers a huge leap in productivity given the limited number of specialists and experts available to work on these projects.

About Anson Lam

I am a graduate student at UCLA, where I am working with Steve Furlanetto on models of galaxy clustering and their applications to the reionization era. My main interests involve high redshift cosmology, dark matter, and structure formation.

Previously, I was an undergraduate at Caltech, where I did my BS in astrophysics. When I’m not doing astronomy, I enjoy engaging in some linear combination of swimming/biking/running.
