The Do’s and Don’ts of Incorporating LLMs into your Scientific Workflow 

From undergraduate classes to cutting-edge research, Large Language Models (LLMs) such as ChatGPT are becoming increasingly present in everyday life in academia. While certainly useful and convenient in the right context, relying too heavily on LLMs at every level of science can be dangerous. In today’s Beyond post, we’ll take a look at the best (and worst) practices for incorporating them in your scientific workflow.

LLMs as Coding Copilots 

Day-to-day life as an astronomer often requires lots of coding. Scientific programming can be frustrating at times, especially if you’re on day three of ‘tracking down that one bug that’s breaking everything’. Naturally, many astronomers now use LLMs such as ChatGPT to alleviate some of this stress. Companies such as GitHub even offer LLM-based tools designed to actively assist you while you code. But where should you draw the line when it comes to AI-generated code? 


Generally, you should never copy and paste code from LLMs without thoroughly understanding and testing it. As scientists, we need to understand our work at every step. Blindly trusting LLMs to write your code undermines that understanding, whether the model is generating code from scratch, fixing a bug in existing code, or simply cleaning up code that has grown unruly.

Don’t…

  • Use LLMs to write all your code for you 
  • Use code generated by LLMs without thoroughly understanding and testing it 

Do…

  • Use LLMs to help narrow down bugs in your code, carefully evaluating their suggestions 
  • Generate chunks of code that you test and understand thoroughly 
  • Always understand every step of your code 

In short, LLMs can be incredibly useful and efficient coding assistants, but the person in charge (you!) shouldn’t blindly trust the code they give you.
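As a concrete (and entirely hypothetical) illustration of why testing matters, here is a sketch of the kind of small helper an LLM might draft for an astronomy pipeline. The function name, zero point, and guard clause below are my own assumptions, not output from any real assistant; the point is that the guard against non-positive flux is exactly the sort of edge case a generated draft can silently omit, and only testing would catch it:

```python
import math

def flux_to_mag(flux, zero_point=25.0):
    """Convert a flux measurement to an astronomical magnitude.

    Hypothetical helper for illustration only. Without the guard
    below, a masked or negative pixel (flux <= 0) would make
    math.log10 raise a ValueError partway through a pipeline.
    """
    if flux <= 0:
        # Return NaN for unphysical fluxes instead of crashing.
        return float("nan")
    return zero_point - 2.5 * math.log10(flux)

# A quick sanity check on an edge case, the kind of test you
# should run on any generated code before trusting it:
print(flux_to_mag(100.0))   # a well-behaved flux
print(flux_to_mag(0.0))     # a masked pixel: NaN, not a crash
```

A few asserts like these take seconds to write and immediately reveal whether a generated snippet handles the edge cases your real data will contain.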

Should I use LLMs for Scientific Writing? 

Whether it’s grants, telescope time proposals, or papers, writing is a key, often grueling, aspect of the scientific process. LLMs are already being used to help write scientific papers. So to what degree should you trust LLMs to write about high-level, technical scientific work? 

For starters, never copy and paste AI-generated text! This may seem obvious, but an increasing number of scientific papers appear to do this one way or another. Does this mean you should never use LLMs to help write? Absolutely not! Some of my favorite ways to use LLMs are generating lists of antonyms or synonyms for a given word, or suggesting possible structural changes to my writing.

Finding relevant citations and reading the (usually vast) literature on a topic is generally a grueling process as a junior scientist. This has led many to use LLMs as citation generators, or to synthesize high-level research. In fact, an LLM designed for astronomers in this context is publicly available. But ChatGPT is no substitute for thoroughly reading up on your research topic.

Even models designed for retrieving and synthesizing information can confidently fabricate sources. For example, when I ask ChatGPT to ‘Provide me with some relevant citations describing the hierarchical star formation process’, two of the papers it suggests don’t exist! (see Fig 1.)

Figure 1: An example of ChatGPT giving citations which do not exist. While both of the authors ChatGPT lists have worked in the relevant field, neither of the linked papers belongs to them!

Don’t…

  • Have LLMs write things for you (this is plagiarism!) 
  • Cite sources provided by LLMs without thoroughly checking and reading them

Do…

  • Use LLMs to generate possible synonyms/antonyms for words/phrases
  • Feed LLMs sentences you have written to gain ideas on restructuring them (without copying and pasting the results) 

LLMs for Course Work

It’s worth mentioning that you should never use ChatGPT to do your homework or write that essay you’ve been putting off. Not only does this violate standard academic integrity, but it also robs you of the opportunity to learn. Reading and writing are where we do the majority of our critical thinking, so letting LLMs write for us deprives us of the chance to learn and grow! Doing homework also often teaches important skills beyond the scope of the subject you’re studying. 

Don’t be lazy – write that essay!

Other Considerations 

One of the biggest drawbacks to using LLMs, like ChatGPT, is their impact on the environment. LLMs require immense amounts of data and computation to train, which has led to a sharp uptick in the construction of data centers – huge facilities packed with power-hungry servers. Data centers consume extreme amounts of energy (citation), producing a non-negligible carbon footprint. For instance, a 100-word email generated by an AI chatbot using GPT-4 requires a little over one bottle of water and uses enough energy to power 14 LED light bulbs for one hour, The Washington Post reports.

So, the next time you want to offload that five minute task to ChatGPT, it might be better to just do it yourself.

Some Broad Conclusions 

For our own personal growth and to uphold academic integrity, don’t offload your thinking to LLMs! As academics, our job mostly boils down to using our brains. Carelessly offloading all your work onto LLMs like ChatGPT doesn’t just result in badly written code and incorrect citations, but also removes a core piece of what being an academic is all about! While these tools can be incredibly useful when applied appropriately, always use LLMs with caution!

Astrobite edited by Ryan White

Featured image credit: OpenAI/ChatGPT

Author

  • Drew Lapeer

    Drew is a first-year PhD student at the University of Massachusetts Amherst. They are broadly interested in the evolution of galaxies, with a focus on the impact of cosmic feedback on the galactic ecosystem. In their free time, they enjoy reading, rock climbing, hiking, and baking!

