Overview of the data flow in ASTRA¶
Goals of this notebook:
- A brief overview of the basic user interface for loading data from disk:
- Load data from .fits files
- Configure the "instrument"
- Reject observations based on different conditions (e.g. HEADER values)
- Reject wavelength regions
Loading data from disk¶
In this section we look at how to load spectral data from disk, which is done in a general way through the DataClass object:
from ASTRA.data_objects import DataClass
This object ingests a list of observations, assigns each one an ID (based on the hash of the filename), and divides them into different sub-Instruments. Furthermore, it only loads the spectra into memory when they are needed.
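The two ideas in the paragraph above (an ID derived from the filename, and data that is only read when requested) can be illustrated with a small, self-contained sketch. This is not ASTRA's implementation: the class `LazyObservation`, its attributes, and the choice of SHA-1 are all hypothetical and only serve to show the pattern.

```python
import hashlib
from pathlib import Path


class LazyObservation:
    """Hypothetical record: ID from the filename, payload read on first access."""

    def __init__(self, path):
        self.path = Path(path)
        # Stable ID built from the filename only (assumption: ASTRA's actual
        # hashing scheme may differ).
        digest = hashlib.sha1(self.path.name.encode("utf-8")).hexdigest()
        self.obs_id = int(digest[:16], 16)
        self._data = None  # nothing is read from disk at construction time

    @property
    def is_loaded(self):
        return self._data is not None

    def read(self):
        # The file is opened only the first time the data is requested;
        # later calls reuse the cached bytes.
        if self._data is None:
            self._data = self.path.read_bytes()
        return self._data
```

Creating many such records is cheap, since only the observations that are actually used pay the cost of being read from disk.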
How to set up our instrument¶
We can configure ASTRA to load files in two different ways:
- Through a path to a file that contains, one per line, the full path to each desired fits file
- Through an iterable Python object (e.g., a list or tuple) where each entry is the path to a fits file
from pathlib import Path
data_in_path = list(Path("/home/amiguel/spectra_collection/ESPRESSO/proxima").glob("*.fits"))
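The cell above uses the second option (a list of paths). For the first option, the file with one full path per line could be prepared as in the sketch below. The helper names `write_path_list` and `read_path_list` are hypothetical and not part of ASTRA; they only use the standard library.

```python
from pathlib import Path


def write_path_list(fits_dir, list_file):
    """Write the full path of every .fits file in fits_dir, one per line."""
    paths = sorted(Path(fits_dir).resolve().glob("*.fits"))
    Path(list_file).write_text("".join(f"{p}\n" for p in paths))
    return paths


def read_path_list(list_file):
    """Read the file back into a list of paths, skipping blank lines."""
    lines = Path(list_file).read_text().splitlines()
    return [Path(line.strip()) for line in lines if line.strip()]
```

The path to the resulting file would then take the place of the list when configuring the data loading, following the first option described above.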
Selection and configuration of the Instrument¶
After generating the paths of the observations, the next step is to configure the instrument that we are using. The current version of ASTRA has two limitations:
- We can't mix data from multiple instruments in the same DataClass object
- It is not able to automatically determine the instrument associated with a given file.
This means that the user must manually define the instrument that is in use. Then, similarly to all other ASTRA objects, we can configure multiple parameters to fine-tune the data pre-processing.
from ASTRA.Instruments import ESPRESSO
instrument = ESPRESSO
inst_options = {
    "minimum_order_SNR": 10,
}
Loading the data from disk¶
There are two ways of loading the data from disk (both work in the same fashion):
- A) Load the data as an independent process (through DataClassManager)
- B) Load the data in the main Python process (through DataClass)
Note: Option A) makes use of Python's proxy objects, serializing all communication. This means that we can use option A) to open all observations in a single Python process and share that data with multiple processes without re-opening it.
from ASTRA.data_objects import DataClassManager
from ASTRA.data_objects.DataClass import DataClass
load_independent_process = False
if load_independent_process:  # Option A)
    manager = DataClassManager()
    manager.start()
    # This makes available the same functions as the usual DataClass object
    data: DataClass = manager.DataClass(data_in_path, instrument=ESPRESSO, instrument_options=inst_options, storage_path="")
else:  # Option B)
    data = DataClass(data_in_path, instrument=instrument, instrument_options=inst_options, storage_path="")
2025-04-14 21:09:31.120 | DEBUG | ASTRA.utils.UserConfigs:receive_user_inputs:216 - Generating internal configs of -
2025-04-14 21:09:31.123 | INFO | ASTRA.utils.UserConfigs:receive_user_inputs:221 - Checking for any parameter that will take default value
2025-04-14 21:09:31.124 | DEBUG | ASTRA.utils.UserConfigs:receive_user_inputs:228 - Configuration <SAVE_DISK_SPACE> using the default value: DISK_SAVE_MODE.DISABLED
2025-04-14 21:09:31.126 | DEBUG | ASTRA.utils.UserConfigs:receive_user_inputs:228 - Configuration <WORKING_MODE> using the default value: WORKING_MODE.ONE_SHOT
2025-04-14 21:09:31.126 | INFO | ASTRA.data_objects.DataClass:__init__:126 - DataClass opening 3 files from a list/tuple
2025-04-14 21:09:31.128 | INFO | ASTRA.base_models.Frame:__init__:253 - Creating frame from: /home/amiguel/spectra_collection/ESPRESSO/proxima/r.ESPRE.2019-07-03T01:43:39.634_S2D_A.fits
2025-04-14 21:09:31.129 | WARNING | ASTRA.Components.SpectrumComponent:regenerate_order_status:96 - Resetting order status of Frame - ESPRESSO
2025-04-14 21:09:31.147 | DEBUG | ASTRA.base_models.Frame:assess_bad_orders:711 - Rejecting spectral orders
2025-04-14 21:09:31.148 | INFO | ASTRA.base_models.Frame:assess_bad_orders:741 - Frame 9066568252996992604 rejected 48 orders for having SNR smaller than 10: 0-47
2025-04-14 21:09:31.150 | INFO | ASTRA.base_models.Frame:__init__:253 - Creating frame from: /home/amiguel/spectra_collection/ESPRESSO/proxima/r.ESPRE.2019-07-14T02:07:49.063_S2D_A.fits
2025-04-14 21:09:31.151 | WARNING | ASTRA.Components.SpectrumComponent:regenerate_order_status:96 - Resetting order status of Frame - ESPRESSO
2025-04-14 21:09:31.167 | DEBUG | ASTRA.base_models.Frame:assess_bad_orders:711 - Rejecting spectral orders
2025-04-14 21:09:31.170 | INFO | ASTRA.base_models.Frame:assess_bad_orders:741 - Frame -2928113502234045974 rejected 38 orders for having SNR smaller than 10: 0-37
2025-04-14 21:09:31.171 | INFO | ASTRA.base_models.Frame:__init__:253 - Creating frame from: /home/amiguel/spectra_collection/ESPRESSO/proxima/r.ESPRE.2019-07-20T01:43:40.032_S2D_A.fits
2025-04-14 21:09:31.173 | WARNING | ASTRA.Components.SpectrumComponent:regenerate_order_status:96 - Resetting order status of Frame - ESPRESSO
2025-04-14 21:09:31.190 | DEBUG | ASTRA.base_models.Frame:assess_bad_orders:711 - Rejecting spectral orders
2025-04-14 21:09:31.193 | INFO | ASTRA.base_models.Frame:assess_bad_orders:741 - Frame 8549670138794176738 rejected 38 orders for having SNR smaller than 10: 0-37
2025-04-14 21:09:31.195 | DEBUG | ASTRA.data_objects.DataClass:__init__:154 - Selected 3 observations from disk
2025-04-14 21:09:31.196 | INFO | ASTRA.data_objects.DataClass:_collect_MetaData:369 - Collecting MetaData from the observations
2025-04-14 21:09:31.197 | WARNING | ASTRA.data_objects.Target:__init__:73 - Target dictionary not found in <None>
2025-04-14 21:09:31.198 | DEBUG | ASTRA.data_objects.Target:clean_targ_list:98 - Parsing through loaded OBJECTs
2025-04-14 21:09:31.199 | INFO | ASTRA.data_objects.Target:__init__:92 - Validated target to be V V645 Cen
2025-04-14 21:09:31.199 | INFO | ASTRA.data_objects.DataClass:show_loadedData_table:885 -
--------------------------------------------------------------------
--------------------------------------------------------------------
subInstrument    Total OBS    Valid OBS [warnings]    INVALID OBS
--------------------------------------------------------------------
ESPRESSO18       0            0 [0]                   0
ESPRESSO19       3            3 [0]                   0
Total            3            3 [0]                   0
--------------------------------------------------------------------
2025-04-14 21:09:31.200 | INFO | ASTRA.data_objects.DataClass:load_instrument_extra_information:894 - Checking if the instrument has extra data to load
2025-04-14 21:09:31.201 | INFO | ASTRA.data_objects.DataClass:load_instrument_extra_information:901 - Current instrument does not need to load anything from the outside
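The proxy mechanism behind Option A) can be illustrated with Python's standard `multiprocessing.managers` module. The sketch below is not ASTRA code: `SpectraStore`, `StoreManager` and `run_demo` are hypothetical names, and whether DataClassManager is built exactly this way is an assumption. It only shows that an object created through a manager lives in a separate server process, while the caller holds a proxy that forwards method calls.

```python
import multiprocessing as mp
import os
from multiprocessing.managers import BaseManager


class SpectraStore:
    """Hypothetical stand-in for an object that holds loaded observations."""

    def __init__(self, paths):
        self._paths = list(paths)
        self._owner_pid = os.getpid()  # process that actually holds the data

    def count(self):
        return len(self._paths)

    def owner_pid(self):
        return self._owner_pid


class StoreManager(BaseManager):
    """Manager whose server process owns the SpectraStore instances."""


StoreManager.register("SpectraStore", SpectraStore)


def run_demo():
    # The "fork" start method keeps the sketch short; it is POSIX-only.
    manager = StoreManager(ctx=mp.get_context("fork"))
    manager.start()
    try:
        store = manager.SpectraStore(["a.fits", "b.fits"])  # returns a proxy
        lives_elsewhere = store.owner_pid() != os.getpid()
        return store.count(), lives_elsewhere
    finally:
        manager.shutdown()
```

Every call on the proxy is serialized and sent to the server process, so the data is opened once and can be reached from several worker processes, at the cost of some communication overhead per call.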
Removing activity indicators (Optional)¶
- ASTRA allows the rejection of specific wavelength intervals that are known to be more sensitive to activity.
- By default, we remove lines that are typically used as activity indicators (in the optical domain; the NIR is not yet included)
- This interface can also be used to manually remove other wavelength regions, as long as it is configured to do so
from ASTRA.Quality_Control.activity_indicators import Indicators
inds = Indicators()
Removing extra regions¶
- We must define a unique name (i.e. no repetitions, even among the default "features")
- We must define a region that will be removed from all observations that have been loaded from disk
- By default we assume that the region is defined in air. Change to vacuum by passing vacuum_wavelengths=True
inds.add_feature(name="feature_1", region=[5000, 5500], vacuum_wavelengths=True)
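To see why the air/vacuum choice matters, the sketch below converts the limits of the region above from vacuum to air. It assumes the wavelengths are in Angstrom and uses the Morton (2000) refractive-index relation, which is a common choice for optical data; the conversion that ASTRA applies internally is not stated here and may differ. The function name `vacuum_to_air` is hypothetical.

```python
def vacuum_to_air(wavelength_vac):
    """Convert a vacuum wavelength in Angstrom to air (Morton 2000 relation).

    The relation is intended for wavelengths above roughly 2000 Angstrom.
    """
    s2 = (1.0e4 / wavelength_vac) ** 2
    n = 1.0 + 0.0000834254 + 0.02406147 / (130.0 - s2) + 0.00015998 / (38.9 - s2)
    return wavelength_vac / n


# Same limits as the region defined above, interpreted as vacuum wavelengths.
region_vac = [5000.0, 5500.0]
region_air = [vacuum_to_air(w) for w in region_vac]
```

In this range the shift is between one and two Angstrom, which is far larger than the width of a typical spectral line, so a region defined in the wrong reference frame would mask the wrong pixels.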
Applying the selected region¶
Lastly, we have to ingest this object in our DataClass object, so that the rejected wavelengths are included in the spectral mask.
data.remove_activity_lines(inds)
2025-04-14 17:16:32.407 | INFO | ASTRA.data_objects.DataClass:remove_activity_lines:216 - Computing activity windows for each RV measurements
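Conceptually, including rejected wavelengths in a spectral mask means flagging every pixel whose wavelength falls inside one of the rejected regions. The pure-Python sketch below shows that idea on a toy wavelength grid; `build_mask` is a hypothetical helper and does not reflect how ASTRA stores or applies its masks.

```python
def build_mask(wavelengths, rejected_regions):
    """Return True for every pixel that falls inside any rejected region."""
    return [
        any(low <= wl <= high for low, high in rejected_regions)
        for wl in wavelengths
    ]


# Toy wavelength grid and the region used in the example above.
wavelengths = [4990.0, 5000.0, 5250.0, 5500.0, 5510.0]
mask = build_mask(wavelengths, [(5000.0, 5500.0)])

# Pixels that survive the rejection.
kept = [wl for wl, bad in zip(wavelengths, mask) if not bad]
```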