Data processing

Hello, I’m quite new to scRNA-seq data analysis. I sequenced 18 samples (2 conditions, 3 time points, and 3 biological replicates per condition and per time point) and uploaded the data to the platform. My question is: during the data processing steps, is each sample processed individually and then integrated afterwards, e.g. using Harmony? Am I right?

Hi, yes, you are correct! For more on analyzing multiple samples together, you might find this detailed answer helpful: “Is it possible to analyze multiple samples together as integrated/merged?”
I hope this helps. Let me know if you have any other questions.

Hi Sara,
Thanks for your answer! I just have one more question. I uploaded my data as 3 separate “experiments”, split by the parameter “timepoint”, so I have 3 experiments with 6 samples each (3 replicates per condition). For two of the time points everything looks fine, but in the last time point there seems to be one sample that likely contains some empty droplets with ambient RNA (no clear transcriptomic profile), and these are retained after all the filtering steps. Is there a way to remove them?

Best,
Kevin

Hi Kevin,

I understand your concern about the ambient RNA affecting your analysis. Unfortunately, there isn’t a direct option to specifically remove ambient RNA. However, you can address this issue by manually adjusting the filters during the first two steps of Data Processing.

In step 1, the “Classifier filter”, which uses the “emptyDrops” method, is designed to identify cell barcodes that arise from the background medium (which contains ambient RNA). These barcodes are filtered based on the false discovery rate (FDR); the FDR threshold is shown as the red line on the density plot. This step also identifies a “mixed” population, shown in grey on the knee plot, which includes both barcodes that are filtered out and barcodes that are kept. Some of the kept ones can be removed in the next step.
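
To make the FDR idea concrete, here is a rough Python sketch of what thresholding on FDR amounts to. It is not the platform's code: the function name `filter_by_fdr`, the barcodes and the FDR values are all made up for illustration, and dropping barcodes with a missing FDR is an assumption of this sketch.

```python
import math


def filter_by_fdr(fdr_by_barcode, max_fdr=0.01):
    """Return the barcodes whose FDR is at or below max_fdr.

    Barcodes with a missing FDR (None or NaN) are treated as not called
    and are dropped; that handling is an assumption of this sketch.
    """
    kept = []
    for barcode, fdr in fdr_by_barcode.items():
        if fdr is None or math.isnan(fdr):
            continue
        if fdr <= max_fdr:
            kept.append(barcode)
    return kept


# Made-up FDR values for five hypothetical barcodes.
example_fdr = {
    "AAAC": 0.0001,
    "AAAG": 0.2,
    "AAAT": 0.009,
    "AACA": None,
    "AACC": 0.6,
}

# A stricter (lower) threshold keeps fewer barcodes.
strict = filter_by_fdr(example_fdr, max_fdr=0.001)
default = filter_by_fdr(example_fdr, max_fdr=0.01)
print("strict:", strict)
print("default:", default)
```

Lowering the threshold in the sketch corresponds to moving the red line so that fewer borderline barcodes pass the filter.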

Moving on to step 2, the “Cell Size Distribution filter” helps distinguish actual cells (characterized by a high number of unique molecular identifiers, #UMIs) from background medium or cellular fragments (which typically show a low number of UMIs). By applying this filter in addition to the Classifier filter, you can further refine your dataset by removing cell barcodes with a low UMI count, thereby reducing the impact of ambient RNA on your analysis.
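
As a rough illustration of how raising the minimum UMI cutoff changes what is kept, here is a small Python sketch. Again, this is not the platform's implementation: `filter_by_min_umis`, the barcodes, the counts and the cutoffs are hypothetical, and a sensible cutoff for your sample should be read off its own cell size distribution plot.

```python
def filter_by_min_umis(umis_by_barcode, min_umis):
    """Keep only barcodes whose total UMI count is at least min_umis."""
    return {bc: n for bc, n in umis_by_barcode.items() if n >= min_umis}


# Made-up total UMI counts for five hypothetical barcodes.
example_umis = {
    "AAAC": 5200,
    "AAAG": 310,
    "AAAT": 1800,
    "AACA": 95,
    "AACC": 640,
}

# Try a few candidate cutoffs and see how many barcodes survive each one.
kept_counts = {}
for cutoff in (200, 500, 1000):
    kept = filter_by_min_umis(example_umis, cutoff)
    kept_counts[cutoff] = len(kept)
    print(f"min_umis={cutoff}: {len(kept)} barcodes kept: {sorted(kept)}")
```

Checking how many barcodes survive at a few cutoffs is a simple way to see whether the low-UMI population in the problematic sample is removed without losing too many real cells.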

So you should be able to mitigate the issue of ambient RNA in your samples by adjusting these filters.
If you need further assistance or have more questions, feel free to ask!