Tutorials
Manual Annotation of Physiological Data Using MATLAB GUI
This guide shows you how to use the QA_app, a MATLAB tool for visualizing and manually assessing the quality of physiological data. By following these instructions, you'll be able to label data quality, add comments, and generate a summary CSV file with your annotations.
Step 1: Download the MATLAB Application
Download the application files:
Clone the GitHub repository containing the application. If you have Git installed, open your terminal or command prompt and run `git clone` with the repository URL shown on the GitHub page.
Alternatively, download the ZIP file from the GitHub page and extract it to your desired location.
Step 2: Prepare Your Data
Access example data files:
- The repository includes a `Data` folder with four example `.mat` files. Each file represents a different subject with various physiological measures.
Example folder structure:
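The exact `.mat` file names depend on your download; the names below are placeholders, but the layout should look roughly like this:

```
physio_QA_manual/
├── QA_App_v101.m
└── Data/
    ├── subject1.mat
    ├── subject2.mat
    ├── subject3.mat
    └── subject4.mat
```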
Step 3: Launch the Application
- Open MATLAB and set your current directory to the root of the `physio_QA_manual` folder.
- Execute the `QA_App_v101.m` script by typing the script's name in the Command Window and pressing Enter. This will open the Quality Assessment GUI.
Step 4: Using the App
Load Data:
- Click on the `Load` button to populate the interface with fields from your `.mat` files.
- Enter the name of a physiological measure to display its data.
Note
You can load two physiological variables for QA simultaneously, along with an auxiliary waveform to aid your decision-making process.
View and Assess Data:
- Use the `Previous` and `Next` buttons to browse through different subjects.
- Rate each data segment's quality using categories such as `Great`, `Good`, `Fixable`, or `Bad`.
Add Comments:
- Provide comments in the provided text box, especially if you mark data as `Fixable`. This helps in later reviews or corrections.
Step 5: Save and Review Results
Check and save your assessment results:
- The app automatically saves your ratings and comments into a CSV file named based on your inputs during the initial setup.
- This file is stored in the root directory of the repository, e.g. as `result.csv` or whichever name you provided.
Important Notes
- You can close and reopen the app at any time; your assessments are saved continuously.
- Do NOT modify the structure of the CSV file or the `Data` folder after starting your assessments, to avoid inconsistencies or data loss.
For further assistance or troubleshooting, feel free to open an issue on our GitHub repository.
Simple Automated Quality Assessment for Cardiac Data
Step 1: Setting Up Your Environment
Before diving into the code, ensure your Python environment is properly set up to handle the required tasks. You will need specific libraries to run the provided code. Here's how to set them up:
Create a virtual environment (optional but recommended):
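The exact commands are not given in the source; on macOS/Linux with a standard Python install, a virtual environment named `venv` (an arbitrary choice) can be created and activated like this:

```shell
# Create a virtual environment in the "venv" directory
python -m venv venv

# Activate it (on Windows, use: venv\Scripts\activate)
source venv/bin/activate
```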
Create a `requirements.txt` file with the following content:
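The original file contents are not shown; a minimal, unpinned set matching the libraries imported in the steps below would be:

```
numpy
scipy
matplotlib
tensorflow
scikit-learn
requests
```

Install everything with `pip install -r requirements.txt`. Pin versions (e.g. `tensorflow==2.15.0`) if you need a reproducible environment.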
Step 2: Import Necessary Libraries
Open your Python script or notebook and import the necessary modules:
import numpy as np
import pickle
import requests  # needed for the optional download step below
import matplotlib.pyplot as plt
from scipy.signal import find_peaks
from scipy.stats import mode
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras import layers
from sklearn.model_selection import train_test_split, KFold
from sklearn.preprocessing import StandardScaler
Step 3: Load and Prepare the Data
For this tutorial, you will use a dataset containing cardiac signals.
Download the dataset: Ensure that the files `cardiac_input.pkl` and `cardiac_label.pkl` are placed in your working directory. These files can be downloaded from the provided OSF link: OSF Dataset.
(Optional) Download the data directly from the URL to your local file path:
def download_file(url, file_path):
    """Download a file from a web URL to a local file path."""
    response = requests.get(url, stream=True)
    if response.status_code == 200:
        with open(file_path, 'wb') as file:
            for chunk in response.iter_content(chunk_size=8192):
                file.write(chunk)
        print(f"Downloaded {file_path}")
    else:
        print(f"Failed to download from {url}")

# URLs to the dataset files
base_url = "https://osf.io/z8yph/download"
input_url = f"{base_url}?filename=cardiac_input.pkl"
label_url = f"{base_url}?filename=cardiac_label.pkl"

# File paths where the data will be saved
input_file_path = "cardiac_input.pkl"
label_file_path = "cardiac_label.pkl"

# Download the files
download_file(input_url, input_file_path)
download_file(label_url, label_file_path)
Load the data:
with open('cardiac_input.pkl', 'rb') as file:
    x = pickle.load(file)
with open('cardiac_label.pkl', 'rb') as file:
    y = pickle.load(file)
Clean the data: Remove any samples with NaN labels, which represent missing or undefined labels.
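The cleaning step can be sketched with a boolean mask; the toy arrays below stand in for the loaded `x` and `y` (the real arrays have different shapes):

```python
import numpy as np

# Toy stand-ins for the loaded data: x holds signal segments,
# y holds quality labels with some missing values (NaN).
x = np.random.rand(5, 100)
y = np.array([1.0, np.nan, 0.0, 1.0, np.nan])

# Keep only the samples whose label is defined.
mask = ~np.isnan(y)
x, y = x[mask], y[mask]
print(x.shape, y.shape)  # (3, 100) (3,)
```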
Step 4: Define and Train the Model
You will use a 1D CNN, which is suitable for time-series and sequence data like cardiac signals.
Set up cross-validation: Use 5-fold cross-validation to ensure the model generalizes well across different subsets of your data.
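The 5-fold split can be sketched with scikit-learn's `KFold`; toy arrays stand in for the cleaned `x` and `y` here:

```python
import numpy as np
from sklearn.model_selection import KFold

x = np.random.rand(20, 100)           # 20 toy "signal" samples
y = np.random.randint(0, 2, size=20)  # binary quality labels

kf = KFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, test_idx) in enumerate(kf.split(x)):
    X_train, X_test = x[train_idx], x[test_idx]
    y_train, y_test = y[train_idx], y[test_idx]
    print(f"Fold {fold}: {len(X_train)} train / {len(X_test)} test samples")
```

Each iteration holds out a different 20% of the data as the test set; the model defined below would be built and trained fresh inside the loop.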
Define the model: Use multiple convolutional layers to capture the hierarchical pattern in the data.
model = Sequential([
layers.Conv1D(64, kernel_size=5, strides=3, activation='relu', input_shape=(X_train.shape[1], 1)),
layers.Conv1D(32, 5, 3, activation='relu'),
layers.Conv1D(16, 5, 3, activation='relu'),
layers.Conv1D(8, 5, 3, activation='relu'),
layers.Flatten(),
layers.Dense(1, activation='sigmoid')
])
Compile and fit the model: Use the Adam optimizer and binary cross-entropy loss.
Evaluate the model: After training, evaluate the model on the test set.
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
history = model.fit(X_train, y_train, epochs=25, validation_split=0.2, batch_size=2, callbacks=[EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)])
Step 5: Results and Visualization
Plot the training history and display the accuracy and loss metrics to evaluate the model performance:
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()
This tutorial outlines how to set up your environment, prepare the data, and train and evaluate a machine learning model using a 1D CNN. Adjust the kernel size, stride, and model architecture based on your specific requirements and dataset characteristics to optimize performance.