The upload limit is 2.5 GB, and our computation RAM limit is 4 GB per core. You will get an out-of-memory error if you try to denoise a single file that exceeds these limits.
Why do I have to create an account? Is there a limit to how much a single account can use the portal?
Creating an account is one of the ways we keep the portal secure, and it also allows us to gauge how many users we gain in the coming months. You can denoise as many files and use the gateway as often as you’d like. The only limit will be your curiosity, and funding $$$ of course.
How long will it take to get back denoised results?
That depends on the file size. Typically, an experiment is launched immediately and spends no more than a few minutes in the queue. You can monitor the computational progress on the gateway.
If I’m uploading my data to the gateway, am I by default providing you consent to use my data?
No! We do not access your data in any way or use it for any internal purpose. You can delete your data file from the gateway immediately after retrieving the denoised data, and data on our cloud computation center is cleaned out every 3 months.
What is the recommended file format, again?
Your data must be uploaded as a gene x cell matrix, stored in a .csv, .txt or .rds file with gene names. For the .rds file format, the data can be stored either as a regular matrix or as a sparse matrix (provided by the R package Matrix).
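As an illustration of the layout described above, here is a minimal Python sketch that writes a small gene x cell matrix to a .csv file and runs a few sanity checks before upload. The function names, the empty top-left header cell, the duplicate-gene check, and the reading of the upload limit as 2.5 decimal gigabytes are all assumptions made for this example; the portal may be more or less strict.

```python
import csv
import os

# Assumption: the 2.5 GB upload limit is read as decimal gigabytes.
UPLOAD_LIMIT_BYTES = int(2.5 * 1000**3)


def write_gene_by_cell_csv(path, gene_names, cell_names, counts):
    """Write counts with one row per gene and one column per cell."""
    if len(counts) != len(gene_names):
        raise ValueError("need exactly one row of counts per gene")
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        # Header row: empty corner cell, then one ID per cell (assumed layout).
        writer.writerow([""] + list(cell_names))
        for gene, row in zip(gene_names, counts):
            if len(row) != len(cell_names):
                raise ValueError(f"row for {gene} does not match the number of cells")
            writer.writerow([gene] + list(row))


def check_upload(path, limit_bytes=UPLOAD_LIMIT_BYTES):
    """Return (n_genes, n_cells) if the file passes basic size and shape checks."""
    if os.path.getsize(path) > limit_bytes:
        raise ValueError("file is larger than the upload limit")
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)
        if header is None:
            raise ValueError("file is empty")
        n_cells = len(header) - 1
        seen = set()
        for row in reader:
            if len(row) != n_cells + 1:
                raise ValueError("row length does not match the header")
            gene = row[0]
            if not gene or gene in seen:
                raise ValueError("missing or duplicated gene name")
            seen.add(gene)
    return len(seen), n_cells
```

Calling write_gene_by_cell_csv("counts.csv", genes, cells, counts) followed by check_upload("counts.csv") returns the number of genes and cells found, which is a quick way to confirm that the matrix is not transposed.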
Why do you have both Immune cells and T cells under the Human model?
The “Immune cells” option is a general SAVER-X model trained on virtually all subtypes of innate and adaptive immune cells. The “T cells” model is more specific, so if your study used FACS sorting or looks only at T cells, the latter model might be more appropriate. Otherwise, the general model should also be able to robustly denoise your dataset.
What if I have performed scRNA-seq on an organism (who doesn’t want to sequence the shark rectal gland?) that you don’t currently have a pre-trained model for?
Well, you’re still in luck. While we may not have every single species and organ listed at the moment, you can do one of two things. First, use the “No Pretraining” model. This would still give you improved, denoised values; it just won’t have existing data to draw on while denoising yours. Second, explore related organs/species. To do so, map the genes that are shared between sharks and mice or humans. You can then denoise the homologous genes using an existing mouse or human pretrained SAVER-X model. We imagine that you likely sequenced the rectal gland to study osmoregulation, or how tubules work to transport ions and water. If so, you could certainly experiment with a model pre-trained on the human kidney or the intestine, and explore how informative the results are. After all, isn’t this the true power of transfer learning?
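The gene-mapping step can be sketched roughly as follows. This is only an illustration: the function name, the dictionary-based inputs, and the choice to drop genes whose ortholog is missing or shared by several source genes are assumptions for this example, and the ortholog table itself would have to come from an external homology resource.

```python
def map_to_reference_genes(counts_by_gene, ortholog_map):
    """Keep genes with a one-to-one ortholog and rename them to the reference name.

    counts_by_gene: dict of source gene name -> list of per-cell counts
    ortholog_map:   dict of source gene name -> mouse or human gene name
    """
    # Count how many source genes point at each reference gene, so that
    # many-to-one mappings are dropped rather than silently overwritten.
    hits = {}
    for gene in counts_by_gene:
        ref = ortholog_map.get(gene)
        if ref is not None:
            hits[ref] = hits.get(ref, 0) + 1

    mapped = {}
    for gene, row in counts_by_gene.items():
        ref = ortholog_map.get(gene)
        if ref is not None and hits[ref] == 1:
            mapped[ref] = list(row)
    return mapped


# Hypothetical placeholder names, two cells per gene.
shark_counts = {"shark_a": [1, 0], "shark_b": [2, 3], "shark_c": [0, 5], "shark_d": [4, 4]}
orthologs = {"shark_a": "REF1", "shark_b": "REF2", "shark_c": "REF2"}
mapped_counts = map_to_reference_genes(shark_counts, orthologs)
```

In this toy input only shark_a survives: shark_b and shark_c both map to REF2, and shark_d has no ortholog. The renamed matrix can then be saved in the gene x cell format and denoised with a mouse or human model.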
Are some pre-trained models better at denoising than others? If so, why?
We tried to bring out the best in every model, but some pre-trained models start out with an advantage if several large studies have already sequenced the organism or tissue of interest. Models that are pre-trained on enormous, high-quality datasets tend to slightly edge out those trained on fewer cells. For instance, our Human Immune Cells model is pre-trained on ~1 million cells, a combination of datasets obtained from the Human Cell Atlas and 10x Genomics, whereas the Mouse Rib model is trained on fewer than 25,000 cells. We would therefore expect the former to denoise more effectively than the latter. We do strive, however, to update each model as newer public datasets become available.
I have more questions and/or want to provide feedback. Who do I contact?
Please email email@example.com. We strive to resolve all queries within 72 hours. We would also love to hear about your experience using our web portal.