New Metadata Management Capabilities...

Metadata Management Interface

One of our key lessons from supporting The Cancer Genome Atlas and Pan-Cancer consortia with our GNOS file management software was the need to store complex metadata. We learned that metadata will inevitably vary between different research organizations, but within a single team, standard nomenclatures for fields like sex and age at diagnosis should be well-maintained.

Accordingly, one of STARInsight's core features lets data administrators define a custom metadata schema to suit their research. STARInsight has no required metadata fields other than a unique ID and an assigned project for each sample, and each STARInsight domain may contain multiple projects, each with its own custom metadata schema.

To make it easier for data administrators to add, update or delete metadata for an existing project, we've rolled out a new metadata management interface.

To support metadata schemas with complex, hierarchical relationships, this screen accepts metadata in JSON format. Once you've validated your JSON entry, STARInsight checks that the entry conforms to a controlled vocabulary, which data admins can work with Annai Systems to define.
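As a rough illustration of the two checks described above (well-formed JSON first, then conformance to a controlled vocabulary), here is a minimal Python sketch. The field names, the allowed values, and the validate_entry helper are all hypothetical. The real vocabulary is whatever your data admins define with Annai Systems, and STARInsight's own validation may work differently.

```python
import json

# Hypothetical controlled vocabulary: field name -> allowed values.
# These fields are illustrative assumptions, not STARInsight's actual schema.
CONTROLLED_VOCABULARY = {
    "sex": {"female", "male", "unknown"},
    "tissue_type": {"tumor", "normal"},
}

# The article names a unique ID and an assigned project as the only required
# fields; the key names used here are assumptions.
REQUIRED_FIELDS = ("sample_id", "project")


def validate_entry(raw_json):
    """Return a list of problems found in one metadata entry (empty if valid)."""
    # Step 1: the entry must be well-formed JSON.
    try:
        entry = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(entry, dict):
        return ["entry must be a JSON object"]

    # Step 2: required fields and controlled-vocabulary checks.
    problems = []
    for name in REQUIRED_FIELDS:
        if name not in entry:
            problems.append(f"missing required field: {name}")
    for name, allowed in CONTROLLED_VOCABULARY.items():
        if name in entry:
            value = entry[name]
            if not isinstance(value, str) or value not in allowed:
                problems.append(f"{name}: {value!r} is not in the controlled vocabulary")
    return problems


if __name__ == "__main__":
    sample = '{"sample_id": "S1", "project": "P1", "sex": "F"}'
    for problem in validate_entry(sample):
        print(problem)
```

Nested objects (for example a "diagnosis" object holding an age at diagnosis) pass through this sketch unchecked. A fuller version would walk the hierarchy and apply the vocabulary at each level.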

STARInsight will ingest valid entries, and the metadata will appear on the Search and Analysis Set Details pages and elsewhere throughout the product.

Users who prefer the flexibility of working with STARInsight programmatically can perform the same operations through a new REST API endpoint. Check out this article to learn more.

Principal Component Analysis

STARInsight offers several statistical analysis apps for performing quick quality control and initial exploration of new data sets. These apps are optimized to run fast on STARInsight's computing cluster and don't require programming expertise to execute. 

We're pleased to announce the release of a new app based on the MLlib principal component analysis algorithm. This new tool is available on all domains from the apps page; click here to learn more.
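For readers new to the technique, the sketch below shows what principal component analysis computes: the direction along which the data varies most. It is a plain-Python illustration using power iteration, not the MLlib algorithm and not the app's implementation. The function name and the sample data are made up for the example.

```python
import math


def first_principal_component(rows, iterations=200):
    """Estimate the top principal component of `rows` by power iteration.

    `rows` is a list of equal-length numeric sequences (one per sample).
    Returns (unit_vector, variance_along_that_vector).
    """
    n, d = len(rows), len(rows[0])
    # Center each column so the component describes variance, not the mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Repeatedly multiply a start vector by the covariance matrix and normalize.
    # Caveat: a start vector orthogonal to the true component would not converge.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Variance along v (the Rayleigh quotient, since v has unit length).
    variance = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
    return v, variance


if __name__ == "__main__":
    # Made-up data lying exactly on the line y = 2x.
    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
    direction, variance = first_principal_component(data)
    print(direction, variance)
```

For the sample data, which lies on a single line, the returned direction is parallel to (1, 2) and all of the variance falls along it.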

Scalable Dimension Reduction

We've revamped our Scalable Dimension Reduction app (formerly known as Probabilistic Principal Component Analysis) to return more accurate results while scaling to a larger number of samples and variants. The latest version of the SDR app includes new summary statistics that you can use to evaluate your results.

Notebook Improvements

The notebook interface is designed for performing custom analysis on genomic data. You can use code you have written in R, Python or Scala to analyze variants we store on our platform. Based on feedback from our users, we've incorporated two small improvements that should make working with notebooks a little bit easier. 

Clone Notebooks

A "clone" option will allow any researcher to copy a notebook for which they have permissions. Cloning a notebook preserves all of the notebook's paragraphs. This is an easy way to create variations on the same basic set of calculations.

A researcher who clones a notebook will automatically have "Owner" permissions for the clone. Each admin user in the account will also have access to the clone. However, researcher users who had permissions to the original notebook will not have access to the clone.
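The permission rules above can be summarized in a small Python model. This is only a sketch of the behavior as described in this article. The Notebook class, clone_notebook, and can_access are hypothetical names, not part of any STARInsight API.

```python
from dataclasses import dataclass, field


@dataclass
class Notebook:
    # Hypothetical model of a notebook and its permissions.
    name: str
    paragraphs: list
    owner: str
    researchers: set = field(default_factory=set)  # researchers granted access


def can_access(notebook, user, admins):
    """Admins always have access; others must be the owner or be listed."""
    return user in admins or user == notebook.owner or user in notebook.researchers


def clone_notebook(original, cloned_by, admins, new_name):
    """Copy a notebook: paragraphs are kept, researcher permissions are not."""
    if not can_access(original, cloned_by, admins):
        raise PermissionError(f"{cloned_by} has no permissions on {original.name}")
    return Notebook(
        name=new_name,
        paragraphs=list(original.paragraphs),  # all paragraphs are preserved
        owner=cloned_by,                       # the cloner becomes Owner
        researchers=set(),                     # original researchers are not carried over
    )


if __name__ == "__main__":
    admins = {"alice"}
    original = Notebook("qc", ["load data", "plot"], owner="bob", researchers={"carol"})
    clone = clone_notebook(original, "carol", admins, "qc-variant")
    for user in ("alice", "bob", "carol"):
        print(user, can_access(clone, user, admins))
```

In this model, carol (who cloned the notebook) and alice (an admin) can open the clone, while bob, who only had permissions on the original, cannot.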

Launch Blank Notebook

You may want to use the notebook to analyze data (RNA-seq, microbiome taxa counts, clinical metadata, etc.) which you store in the STARInsight Vault. If you don't want to select genomic data before launching a notebook, just leave the "Analysis Set" and "Filter Set" dropdown menus blank and click "CREATE."
