Maximally Mine
Proteomics Gold

from Data Independent Acquisition
Mass Spectrometry (DIA-MS) Data Files

Proteoforms Determine Phenotype

DIA-MS Does Discovery Proteomics

Yet, the Bulk of Peptides in
DIA-MS Data Files Are Ignored

(despite their immense biological value)

We Quantify the Dark Proteome

Until now, most of the peptides in DIA-MS data files could not be harnessed: between ~75% and ~95% of the invaluable data contained in DIA-MS data files, the so-called "dark proteome", was ignored.

Over the last ten years of research, and while working with several leading DIA-MS labs, we have developed and extensively tested an algorithm that can accurately quantify nearly all the peptides in DIA-MS data files, i.e., the dark proteome. In a preprint and several conference proceedings this year (January - November 2025), we have shared our initial results.

We now open up our algorithm and engineers to labs around the world who wish to maximize the value of their DIA-MS data. We hope to act as a specialized informatics team: we only do DIA-MS informatics, and we do not own mass spectrometers or any other wet-lab components (nor are we biologists or MDs). Specifically, we hope to use our algorithm and engineers to maximally separate study conditions using peptides quantified from the dark proteome, and thus revolutionize human health. We hope to do so through a productive, high-trust partnership with DIA-MS wet labs.

Please book a free consultation + pilot

Peptides containing unpredicted sequences or unexpected post-translational modifications (PTMs), which together make up the dark proteome, are up to ~2000% more numerous than peptides identified from a FASTA library search space.

Not only are they more numerous, but these types of peptides are more likely to create a parsimonious panel that maximally separates study conditions*.

Please view our preprint / posters or

*For a theoretical discussion, please see reference to Alzheimer study on top of page three of preprint as well as supplemental note #S5; for an illustration of a real DIA-MS project's results, please see our HUPO 2025 poster.

Our Innovations:

The most effective way to learn about our innovations is through a free consultation + pilot. The preprint / posters are a good resource as well. Further, we have visually summarized the original algorithm through the four figures below.

Global XIC Deconvolution

Please see figures below

01

PROBLEM

DIA Produces Chimeric XICs

Two co-eluting peptides from a single sample

02

PROPOSED SOLUTION

LC has Natural Variance

Pairs of peptides that co-elute in one subset of samples do not exactly co-elute in another subset of samples

03

MULTIPARTITE MATCHING

Deconvolute *MS2* Fragments Computationally

Match fragments across samples to create one clean spectrum per peptide

04

QUANTIFICATION & AI

Quantify & Create Predictive Panel

Quantify all analytes in the MS data and use AI to create a predictive panel
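To make steps 01 to 03 concrete, here is a toy sketch of the underlying idea only; it is not GoldenHaystack Lab's algorithm. It assumes each fragment ion has already been reduced to one apex retention time per sample, and it groups fragments whose apex times agree in every sample. The function name `group_fragments`, the tolerance `tol`, and all m/z and retention-time values are hypothetical.

```python
from typing import Dict, List


def group_fragments(apex_rt: Dict[float, List[float]], tol: float = 0.05) -> List[List[float]]:
    """Group fragment m/z values whose apex retention times agree in every sample.

    apex_rt maps a fragment m/z to its apex retention time (minutes) in each sample.
    Fragments of the same peptide should share an apex in all samples; fragments of
    different peptides may coincide in some samples but not in all of them.
    """
    groups: List[List[float]] = []
    for mz in sorted(apex_rt):
        for group in groups:
            reference = apex_rt[group[0]]  # first member represents the group
            if all(abs(a - b) <= tol for a, b in zip(apex_rt[mz], reference)):
                group.append(mz)
                break
        else:
            # No existing group matched in every sample: start a new one.
            groups.append([mz])
    return groups


# Hypothetical data: four fragment ions observed in three samples.
# In the first sample all four co-elute (a chimeric signal); in the other
# two samples, natural LC variance pulls the two peptides apart.
apex_rt = {
    301.2: [20.00, 20.10, 19.95],
    415.3: [20.01, 20.11, 19.96],
    512.4: [20.00, 20.40, 20.30],
    628.5: [20.02, 20.41, 20.31],
}

groups = group_fragments(apex_rt)
print(groups)

# Using only the first sample, the fragments cannot be told apart.
single_sample = group_fragments({mz: rts[:1] for mz, rts in apex_rt.items()})
print(single_sample)
```

With all three samples the fragments fall into two groups, one per peptide, whereas the first sample alone yields a single mixed group. Real data would require peak detection, retention-time alignment, and a proper matching across many samples, none of which this sketch attempts.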

Ready to quantify the dark proteome?

The Power Of One

Although we have observed up to ~2000% more quantified peptides by quantifying the dark proteome, it matters little whether this increase is ~20%, ~200%, ~2000%, or ~20,000%.

Instead, the key argument is this: the peptides constituting the dark proteome are biologically valuable, yet they are no more and no less likely to be seen by the MS than generic FASTA sequences (i.e., the MS is unbiased). So, if there is a reasonable chance that those biologically valuable but unexpected sequences or PTMs exist in your DIA-MS data files, we at GoldenHaystack Lab can (a) quantify them and then (b) analyze all the quantified peptides, whether present in the known protein libraries or in the dark proteome, through a sophisticated AI/ML routine that creates a small (typically << 200 peptides), parsimonious peptide panel that maximally separates study conditions (e.g., please see our HUPO 2025 poster).

Finally, for any desired follow-up work, such as targeted MS validation or manual/semi-automated de novo sequencing, we can provide all the required information (e.g., each peptide's elution times and MS1/MS2 m/z values) and otherwise help make that follow-up work almost trivially easy.
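As a rough illustration of what a "parsimonious panel that separates study conditions" means, the sketch below ranks peptides by a simple separation score and keeps the top few. This is a deliberately naive stand-in, not the AI/ML routine described above. The names `separation_score` and `select_panel`, the peptide labels, and all intensity values are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List


def separation_score(a: List[float], b: List[float]) -> float:
    """Absolute difference of condition means, scaled by the within-condition spread."""
    spread = pstdev(a) + pstdev(b)
    return abs(mean(a) - mean(b)) / (spread + 1e-9)  # small constant avoids division by zero


def select_panel(
    condition_a: Dict[str, List[float]],
    condition_b: Dict[str, List[float]],
    size: int,
) -> List[str]:
    """Return the `size` peptides whose quantities best separate the two conditions."""
    scores = {
        peptide: separation_score(condition_a[peptide], condition_b[peptide])
        for peptide in condition_a
    }
    return sorted(scores, key=scores.get, reverse=True)[:size]


# Hypothetical quantities for three peptides, three samples per condition.
condition_a = {
    "pep_1": [10.0, 11.0, 10.5],
    "pep_2": [5.0, 9.0, 7.0],
    "pep_3": [3.0, 3.2, 3.1],
}
condition_b = {
    "pep_1": [10.2, 10.8, 10.6],
    "pep_2": [6.0, 8.0, 7.5],
    "pep_3": [6.0, 6.3, 6.1],
}

panel = select_panel(condition_a, condition_b, size=1)
print(panel)
```

Here only `pep_3` differs clearly between the two conditions, so it is the one selected. A real panel search would also need to handle correlated peptides, more than two conditions, and validation on held-out samples.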

We Extract Value From DIA-MS Data Files

We are a specialized DIA-MS data processing team for DIA-MS wet labs: we combine our novel algorithm with our software engineering expertise to maximally mine DIA-MS data files.

Specifically, we quantify peptides found both in known protein libraries and in the dark proteome*, combine those results with sophisticated AI/ML, and ultimately aim to maximally separate your study conditions and yield biological insights.

Please Book a Free Consultation + Pilot

*For a theoretical discussion of the potential value of the dark proteome, please see reference to Alzheimer study on top of page three of preprint as well as supplemental note #S5 and this web page; for an illustration of its value on a real DIA-MS project, please see our HUPO 2025 poster.

(C) Copyright GoldenHaystack Lab 2025. All Rights Reserved.
