Classifying Holes in the Sun

Machine learning, or teaching computers to teach themselves, is becoming insanely popular throughout science and industry. This is largely driven by the continuous onslaught of new and extensive datasets that often require machine learning to fully understand. A recent Astrobite summarized some of the massive datasets that astronomy will soon face in the high energy, transient sky; however, we already have a wealth of data from a source much closer to home: the Sun.

The data streaming down from our Sun is unique in astronomy for many reasons, including its size and resolution. For example, let’s compare it to the average dataset of a supernova you might get from LSST. The supernova would likely be no more than a few pixels wide, from which you would extract the supernova’s brightness and color. You would get a new image of the supernova every few days and eventually make a lightcurve with your data. That’s not a lot of data for a single object. In contrast, multiwavelength data from our Sun is being uploaded every few minutes, with millions of pixels and high resolution. We’re literally able to watch a video of the Sun and of solar features that are only hundreds of kilometers wide! Astronomers have to sort through all of this data and pull out meaningful observations.

Figure 1. Schematic of a coronal hole, demonstrating that the magnetic field lines all point outward. This allows for the solar wind to escape.

In today’s paper, the authors use machine learning to classify two important phenomena on the Sun which both contribute to space weather: coronal holes and filament channels. Coronal holes are regions where the Sun’s corona is colder and has a single magnetic polarity, which allows the solar wind to escape, as shown in Figure 1. These high-speed solar wind streams help shape the solar wind distribution in the solar system. Filament channels are elongated areas of the Sun where filaments, or arcs of hot plasma, form. Like coronal holes, filament channels often appear as dark spots in the Sun’s corona. Coronal holes and filament channels are both shown in Figure 2; it’s probably easy to spot the differences in this image!

Traditionally, coronal holes and filament channels are identified by eye or by very basic image-processing techniques, which often confuse the two phenomena. Harnessing the power of machine learning, we can train a computer to understand the subtle differences between the two. To complete this challenge, the first step is to prepare a dataset in which all of the holes and channels have already been labeled. The computer can then use this as a “training set” to see how humans normally classify the phenomena and learn to classify them on its own.

 

Figure 2. Images of the Sun in three different wavelengths, with highlighted areas around coronal holes and filament channels. The holes are outlined in blue while the channels are highlighted in red and orange. Original image from here.

The computer can’t magically learn how to classify using just the images, so the astronomer needs to provide features of each phenomenon which the computer can use to differentiate between the two. In addition to features which are typically used in image classification (such as a mean or contrast), the authors use some physical intuition to help them classify. For one, the filament channels are clearly elongated while the holes tend to be symmetric. Additionally, the holes typically have a single polarity (which allows solar wind to flow outwards), while the filament channels are often magnetically neutral (having both negative and positive polarity).
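To make the idea of "features" concrete, here is a minimal sketch of how a few of them might be computed for one dark region. It is not the authors' code: the function name region_features, the choice of standard deviation as "contrast", the second-moment definition of elongation, and the polarity-imbalance ratio are all assumptions made for illustration, and the inputs are toy grids rather than real solar images.

```python
import math


def region_features(intensity, magnetogram, mask):
    """Summarize one dark region with a few illustrative features.

    All three arguments are equal-shaped 2D lists (rows of numbers).
    mask is 1 inside the region and 0 outside (assumed non-empty).
    """
    pixels = [(r, c) for r, row in enumerate(mask)
              for c, inside in enumerate(row) if inside]
    n = len(pixels)

    # Brightness statistics; "contrast" here is just the standard deviation.
    values = [intensity[r][c] for r, c in pixels]
    mean = sum(values) / n
    contrast = math.sqrt(sum((v - mean) ** 2 for v in values) / n)

    # Elongation: ratio of major to minor axis, taken from the second
    # moments (covariance matrix) of the pixel coordinates.
    r0 = sum(r for r, _ in pixels) / n
    c0 = sum(c for _, c in pixels) / n
    srr = sum((r - r0) ** 2 for r, _ in pixels) / n
    scc = sum((c - c0) ** 2 for _, c in pixels) / n
    src = sum((r - r0) * (c - c0) for r, c in pixels) / n
    half_trace = (srr + scc) / 2
    root = math.sqrt(((srr - scc) / 2) ** 2 + src ** 2)
    major, minor = half_trace + root, half_trace - root
    elongation = math.sqrt(major / minor) if minor > 0 else float("inf")

    # Polarity imbalance: |net field| / total unsigned field.
    # Near 1 means one dominant polarity; near 0 means mixed polarity.
    fields = [magnetogram[r][c] for r, c in pixels]
    unsigned = sum(abs(b) for b in fields)
    imbalance = abs(sum(fields)) / unsigned if unsigned > 0 else 0.0

    return {"mean": mean, "contrast": contrast,
            "elongation": elongation, "polarity_imbalance": imbalance}


# Toy inputs: a 3x3 single-polarity blob and a 2x6 mixed-polarity strip.
blob = region_features([[10] * 3] * 3, [[5] * 3] * 3, [[1] * 3] * 3)
strip = region_features([[10] * 6] * 2, [[5, -5] * 3] * 2, [[1] * 6] * 2)
```

In this toy setup the symmetric, single-polarity blob plays the role of a coronal hole and the stretched, mixed-polarity strip plays the role of a filament channel, so the elongation and polarity features pull the two apart even though their brightness is identical.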

The authors tested a number of common classification algorithms, but they found that one of the best methods for solar classification was a support vector machine (SVM) algorithm. An SVM classifies objects by combining features in such a way that the two phenomena become as distinct as possible. For example, if I am trying to classify whether an animal is or is not a house cat, I might ask whether that animal lives in a house or whether it meows. Many animals can either live in a house as a pet or meow in the wild, but only house cats will have both features. Similarly, in our astronomical scenario, the SVM combines the various features of channels and holes to best distinguish between the two. You can watch a visualization of SVM in this video.
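The house cat example can be turned into a tiny working classifier. The sketch below is a toy linear SVM written from scratch, trained by subgradient descent on the hinge loss; it is not the authors' implementation, and the function names, learning rate, regularization strength, and number of passes are all assumed values chosen for illustration. In practice you would more likely reach for an existing library such as scikit-learn.

```python
def train_linear_svm(X, y, lr=0.01, reg=0.01, epochs=5000):
    """Fit a linear SVM by subgradient descent on the hinge loss.

    X is a list of feature vectors; y holds labels +1 or -1.
    Returns the weight vector w and bias b of the separating boundary.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # Regularization shrinks w, which favors a wide margin.
            w = [wj * (1 - lr * reg) for wj in w]
            # Points inside the margin (or misclassified) push the
            # boundary away from themselves.
            if margin < 1:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on which side of the boundary x falls."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Features: (lives in a house, meows). Label +1 = house cat, -1 = not.
animals = [(0, 0), (1, 0), (0, 1), (1, 1)]
labels = [-1, -1, -1, 1]
w, b = train_linear_svm(animals, labels)
predictions = [predict(w, b, x) for x in animals]
```

Neither feature alone identifies a house cat, but the learned weights add the two together so that only an animal with both lands on the positive side of the boundary. Swapping the two toy features for quantities like elongation and polarity gives the flavor of the solar case.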

In addition to being exceptionally accurate, the classification is fast, taking only a few minutes. These techniques can thus be used for real-time analysis of the Sun. The authors are hopeful that this analysis will be incorporated into publicly available data reduction pipelines in order to help solar physicists make sense of the petabytes of data that the Sun produces. Machine learning will undoubtedly become an essential tool both for solar astronomers, as our solar database continues to grow, and for other astronomers, as missions become more complex and data-intensive.

About Ashley Villar

I am a third year PhD student at Harvard University. I'm generally interested in optical transients, or the dramatic aftermaths of stellar eruptions, collisions and explosions. I'm also broadly interested in how astronomers can efficiently use large datasets produced in future missions. When I'm not working, I bake, exercise and try to enjoy Boston.
