Next Generation Sequencing

Our bioinformatics pipelines and tools facilitate precise detection of single nucleotide polymorphisms (SNPs), insertions/deletions (InDels), and copy number variants (CNVs). We provide variant analysis that includes high-quality mapping of short- and long-read data, in silico mutation prioritization, and a summary of the SNPs, InDels, and CNVs in your sample. We analyse your sequence data in state-of-the-art pipelines to produce publication-ready results.

Whole-genome and targeted sequencing analysis data service

Whole-genome sequencing (WGS) is ushering in a new era in genomic healthcare and personalized medicine. Variant calling (VC) is an important aspect of genomic studies, because accurate, high-resolution discovery of genetic variation in humans and model organisms can be used to inform clinical decisions. We analyse your sequencing data in state-of-the-art pipelines to produce publication-ready results.

Understand disease etiology and phenotypes through genomic variation and mutations.

WHOLE-GENOME, whole-exome and targeted DNA sequencing allows identifying and studying genetic variants or mutations. Our genome variation analysis identifies SNPs, indels, gene copy numbers, and genomic rearrangements from the various types of DNA-sequencing and microarray data.


Annotating the variants with population allele frequencies, pathogenicity predictions and known clinical associations allows us to focus on the variants that matter. Tailored downstream bioinformatics analysis of variants together with phenotypic data enables the discovery of novel associations and biomarkers.


For microbes and non-model organisms, we produce annotated, quality-controlled genome assemblies in order to ensure the best possible starting point for future studies. 


Understand the effects of genomic variation and mutations with DNA sequencing data analysis


WE ROUTINELY ANALYZE whole genome, whole exome and targeted re-sequencing data of well-characterized organisms, such as humans, with the aim of mapping reads and identifying small genetic variants (single nucleotide polymorphisms and short insertions and deletions, or indels) using established best practice methods. To help our clients interpret their variant data, we continuously develop our DNA sequencing data analysis and variant annotation pipeline to include more information on each identified variant. For example, pathogenicity predictions and minor allele frequencies from databases such as 1000 Genomes and gnomAD provide an excellent way of filtering irrelevant variation from your results.


OUR DNA SEQUENCING DATA ANALYSIS WORKFLOW also adapts to assembling genomes of non-model organisms. We produce genome assemblies based on WGS data that are then computationally post-processed to achieve the best possible quality. The assembled genomes are then annotated using gene prediction, automated homology searches using genome databases, and gene annotation transfer from closely related organisms. Thorough annotation of novel genomes ensures the best possible starting point for transcriptome studies on these organisms. 

Read more about typical DNA sequencing data analyses:

Our statistical approaches to variant calling employ current best practices that result in a reliable set of variants. Natural variants and single nucleotide polymorphisms can be called against any reference genome in any organism or even a genomic ensemble compiled from individual genomes from sequencing projects for better representation. In addition to high confidence variants, we report regions of low coverage where the variant caller was not able to determine the sequence of samples. Whole genome, whole exome or targeted DNA-sequencing all enable variant calling equally well. The lists of variants can be further combined, compared and filtered in order to find disease-causing de novo germ line variants in trio studies, for example.
  • Full variant lists for all samples, with supporting evidence from the data
  • Filtered variant lists based on any criteria (e.g. using a germline control sample to filter mutations)
  • Low-coverage regions where variants could not be called
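
As a rough illustration of the trio filtering mentioned above, the sketch below flags candidate de novo variants: sites where the child carries a non-reference allele and both parents are homozygous reference. The genotype strings, the dictionary layout and the function name are assumptions made for this example and do not describe a production pipeline.

```python
def is_candidate_de_novo(child_gt, mother_gt, father_gt):
    """Return True if the child carries an alternate allele absent from both parents."""
    hom_ref = {"0/0", "0|0"}
    missing = {"./.", ".|."}
    if child_gt in missing or mother_gt in missing or father_gt in missing:
        # Sites without a call in every family member cannot be judged.
        return False
    child_has_alt = child_gt not in hom_ref
    return child_has_alt and mother_gt in hom_ref and father_gt in hom_ref


# Hypothetical trio genotypes keyed by (chromosome, position, ref, alt);
# each value is (child, mother, father).
trio_calls = {
    ("chr1", 12345, "A", "G"): ("0/1", "0/0", "0/0"),
    ("chr2", 67890, "C", "T"): ("0/1", "0/1", "0/0"),
    ("chr3", 13579, "G", "A"): ("0/1", "./.", "0/0"),
}

de_novo_candidates = [
    site
    for site, (child, mother, father) in trio_calls.items()
    if is_candidate_de_novo(child, mother, father)
]
print(de_novo_candidates)
```

In practice such candidates would also be checked against read depth and the low-coverage regions reported alongside the variant lists, since a missed parental allele looks identical to a de novo event.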

Genetic variants are annotated with information regarding their location in the genome, zygosity (homozygous/heterozygous), evidence from data (supporting reads), functional classification for exonic variants, amino acid changes in all isoforms, database identifiers for known variants, and observed minor allele frequencies in several genome databases or even in your own data. We also provide pathogenicity predictions for each exonic variant using several types of prediction software. Flexible ranking and filtering of the variants based on these annotations enables easy interpretation of complex genomic data for a geneticist or a physician.


  • Functional and location annotation for every variant
  • Minor allele frequencies in relevant databases
  • Database identifiers for known variants
  • Pathogenicity predictions
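
The ranking and filtering described above can be sketched as follows. This is a minimal example under stated assumptions: the `AnnotatedVariant` fields, the 0 to 1 pathogenicity scale and the default thresholds are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class AnnotatedVariant:
    variant_id: str
    max_population_af: float    # highest minor allele frequency across databases
    pathogenicity_score: float  # hypothetical score in [0, 1]; higher = more damaging


def prioritize(variants, max_af=0.01, min_score=0.5):
    """Keep rare, predicted-damaging variants and rank them by score."""
    kept = [
        v for v in variants
        if v.max_population_af <= max_af and v.pathogenicity_score >= min_score
    ]
    return sorted(kept, key=lambda v: v.pathogenicity_score, reverse=True)


# Hypothetical annotated variants.
variants = [
    AnnotatedVariant("var_1", 0.20, 0.90),    # common in the population
    AnnotatedVariant("var_2", 0.001, 0.95),
    AnnotatedVariant("var_3", 0.0005, 0.30),  # rare but predicted benign
    AnnotatedVariant("var_4", 0.002, 0.70),
]

shortlist = prioritize(variants)
print([v.variant_id for v in shortlist])
```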

Gene copy numbers can be deduced from sequencing data using our statistical approaches for analysing both coverage information and allele frequency information. The analysis yields copy numbers for each chromosome-scale segment, gene, and exon independently. Gene copy numbers can be further integrated with expression data, for example, in order to find significant gene dosage effects.


  • Copy number for each chromosome
  • Gene copy number for each gene
  • Copy number for each exon
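
To illustrate the coverage-based part of the idea, here is a simplified sketch that estimates copy number from read depth in a sample relative to a control. It assumes a diploid baseline, median normalization and non-zero control depth, and it ignores the allele frequency information and segmentation that a real analysis would use. All region names and depths are made up.

```python
import math
from statistics import median


def copy_number_estimates(sample_depth, control_depth, ploidy=2):
    """Estimate per-region copy number from depth ratios (control depth must be > 0)."""
    sample_norm = median(sample_depth.values())
    control_norm = median(control_depth.values())
    estimates = {}
    for region, depth in sample_depth.items():
        # Normalize each library by its median depth, then compare sample to control.
        ratio = (depth / sample_norm) / (control_depth[region] / control_norm)
        log2_ratio = math.log2(ratio) if ratio > 0 else float("-inf")
        estimates[region] = {"log2_ratio": log2_ratio, "copy_number": ploidy * ratio}
    return estimates


# Hypothetical mean depths per gene.
sample = {"GENE_A": 100, "GENE_B": 100, "GENE_C": 150, "GENE_D": 50, "GENE_E": 100}
control = {"GENE_A": 100, "GENE_B": 100, "GENE_C": 100, "GENE_D": 100, "GENE_E": 100}

estimates = copy_number_estimates(sample, control)
for gene, values in estimates.items():
    print(gene, round(values["copy_number"], 2), round(values["log2_ratio"], 2))
```

The same function applies unchanged whether the keys are chromosome-scale segments, genes or exons.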

Whole genome sequencing data coupled with mate pair information from paired-end sequencing can be used to study copy number-neutral genomic rearrangements such as inversions and translocations. These can result in fusion genes that are critically linked to formation of cancer, for example. We deliver a report of the altered genome structure with ranked fusion genes that can be validated with RNA-sequencing data.


  • List of potential fusion genes
  • List of all rearrangements
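
As a simplified illustration of how read-pair information hints at rearrangements, the sketch below labels a read pair by chromosome and strand. It assumes a standard paired-end library in which properly mapped mates lie on opposite strands of the same chromosome. The labels and function name are hypothetical, and real callers also use insert size, split reads and clustering of many supporting pairs.

```python
def classify_read_pair(chrom1, strand1, chrom2, strand2):
    """Label a read pair by the kind of rearrangement it might support."""
    if chrom1 != chrom2:
        # Mates on different chromosomes are consistent with a translocation.
        return "translocation_candidate"
    if strand1 == strand2:
        # Mates on the same strand are consistent with an inversion breakpoint.
        return "inversion_candidate"
    return "concordant_orientation"


# Hypothetical read pairs: (chromosome, strand) for each mate.
pairs = [
    ("chr9", "+", "chr22", "-"),
    ("chr3", "+", "chr3", "+"),
    ("chr1", "+", "chr1", "-"),
]
labels = [classify_read_pair(*pair) for pair in pairs]
print(labels)
```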

For simpler organisms, we offer assembly of their genomes de novo based on DNA-sequencing data. Our approach is based on building a consensus assembly from outputs of several assembly tools, and then running computational post-assembly improvement software. If a draft genome exists, we can refine it computationally by joining contigs and resolving errors using improvement tools or additional DNA-seq or RNA-seq data.


  • Assembled contigs in FASTA format
  • Computationally-refined genome assembly
  • Quality estimation scores
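
One widely used assembly quality statistic is N50: the contig length at which contigs of that length or longer contain at least half of the assembled bases. The sketch below computes it from a list of contig lengths. Whether N50 is among the scores in a given report is an assumption here, and the example lengths are invented.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs supplied")


# Hypothetical contig lengths in kilobases: total 290, half 145.
contigs_kb = [80, 70, 50, 40, 30, 20]
print(n50(contigs_kb))  # 80 + 70 = 150 >= 145, so N50 is 70
```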

Assembled genomes can always be annotated using gene and oriC prediction software and/or based on RNA-seq data. We predict gene identities for all putative genes by comparing their sequence to several genome databases. For genes with less sequence similarity, functions can be predicted by identified functional domains. If annotated genomes for close relatives exist, we can improve the annotation by transferring gene information to the unannotated genome using sequence alignment-based approaches. The result is a comprehensive list of genes with their specific coordinates in the genome.



  • Loci of predicted genes
  • Fully annotated genes based on homology searches
  • Validated genes based on RNA-seq data

Circulating cell-free DNA has potential uses in non-invasive genomic biomarkers, in particular for prenatal diagnosis and oncology. The mere presence of certain DNA sequences in plasma can reveal a tumor undetected by other means. Furthermore, mutations detected in circulating DNA can be used as markers in personalizing treatment and prognosis. Our pipeline for cell-free DNA-based biomarker discovery starts with a full quality control of the data followed by a statistical comparison of pathological and control groups in order to reveal biomarkers with the optimal combination of sensitivity and specificity. Considering biological factors along with clinical feasibility, we summarize the analysis by highlighting the most promising biomarker candidates.



  • List of biomarker candidates from cell-free DNA
  • Sensitivity and specificity estimations for each candidate
  • Database identifiers for known mutations and pathogenicity predictions
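
The sensitivity and specificity reported for each candidate follow the standard definitions: the fraction of pathological samples in which the marker is detected, and the fraction of control samples in which it is absent. A minimal sketch, with made-up detection calls and hypothetical names:

```python
def sensitivity_specificity(marker_detected, is_case):
    """Compute (sensitivity, specificity) from paired boolean lists."""
    pairs = list(zip(marker_detected, is_case))
    true_pos = sum(1 for detected, case in pairs if detected and case)
    false_neg = sum(1 for detected, case in pairs if not detected and case)
    true_neg = sum(1 for detected, case in pairs if not detected and not case)
    false_pos = sum(1 for detected, case in pairs if detected and not case)
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one case and one control")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical cohort: four cases followed by four controls.
is_case = [True, True, True, True, False, False, False, False]
marker_detected = [True, True, True, False, False, False, False, True]

sens, spec = sensitivity_specificity(marker_detected, is_case)
print(sens, spec)
```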

RNA sequencing data analysis

RNA sequencing data analysis brings to light the intricate mechanisms of gene regulation. We analyse your RNA sequencing data in state-of-the-art pipelines to produce publication-ready results.

TRANSCRIPTOME-WIDE ANALYSES of gene expression are currently extremely popular among researchers studying gene regulation in biological systems ranging from single cells to tissues and complex microbiomes. Typically, our customers are interested in differential gene expression based on RNA sequencing, single-cell RNA sequencing or microRNA sequencing measurements, followed by pathway analysis and integration with other omics modalities, such as epigenomics.

FOR NON-MODEL ORGANISMS, and those with very dynamic genomes, i.e. microbes, we typically start RNA sequencing data analysis with assembling a transcriptome de novo and annotating it using homologues of related species and computational gene predictions. A new reference transcriptome is an invaluable resource for your further research, and that of the entire research community. Once a high-quality reference transcriptome has been established, the door opens to most downstream analyses which are routinely used with model organisms.

Epigenomics data analysis

See behind expression patterns with genome-wide epigenomics data analysis. We analyse your epigenomics data in state-of-the-art pipelines to produce publication-ready results.

RESEARCHERS VENTURING into epigenomics often aim to map the dynamic state of the DNA in order to explain phenomena observed via gene expression studies. Our ChIP-sequencing data analysis pipeline has been optimized to identify both narrow transcription factor binding sites and wider histone binding sites. Clients interested in the binding motifs can opt for our motif discovery analysis, which is based on sequences where these molecules bind.

Read more about our epigenomics data analysis:

Chromatin immunoprecipitation of a DNA-binding protein coupled with next-generation sequencing (ChIP-seq) is one of the most widely used high-throughput epigenomics measurement methods. From such data, we can identify protein binding sites throughout the genome. We deliver a list of significant peaks that are annotated with the genomic location and statistical information, such as width, number of reads, significance p-values, location relative to the nearest genes (distance to TSS), location within genes (exon, intron, UTR), and the binding motif found within the peak. The binding sites are often studied in parallel with transcriptomics data in order to reveal the genes that are likely to be regulated by the DNA-binding protein of interest. If expression data exist, the expression of the nearest gene is also included to make the association easier.


  • Statistically significant binding locations
  • Functional annotation of the binding loci
  • Comparison of binding events between samples
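
The "distance to TSS" annotation can be sketched as a nearest-neighbour lookup between a peak and the transcription start sites on the same chromosome. The example below uses the peak midpoint as a stand-in for the summit and an invented TSS table; both are simplifying assumptions, as are the function and gene names.

```python
def nearest_tss(peak_chrom, peak_start, peak_end, tss_by_chrom):
    """Return (gene, absolute distance) for the TSS closest to the peak midpoint."""
    midpoint = (peak_start + peak_end) // 2
    candidates = tss_by_chrom.get(peak_chrom, [])
    if not candidates:
        return None  # no annotated TSS on this chromosome
    gene, position = min(candidates, key=lambda item: abs(item[1] - midpoint))
    return gene, abs(position - midpoint)


# Hypothetical TSS positions per chromosome.
tss_by_chrom = {
    "chr1": [("GENE_A", 1000), ("GENE_B", 5000)],
    "chr2": [("GENE_C", 2500)],
}

annotation = nearest_tss("chr1", 4200, 4400, tss_by_chrom)
print(annotation)
```

For genome-scale peak lists, a sorted TSS array with binary search or an interval library would replace the linear scan used here.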

Using antibodies that target specific histone modifications, then pulling down and sequencing the DNA results in genome-wide epigenomics data indicating the positions of modified histones. Targeting multiple different markers and integrating the data can reveal a map of chromatin state that indicates promoters, active and inactive enhancers and actively transcribed genes. In addition to the fully annotated locations of histones with specific modifications, we can also deliver an interpretation of the chromatin state, given enough measurements of different modifications.


  • Statistically significant binding locations with functional annotation
  • Comparison of binding events between samples
  • Chromatin state interpretation based on combinations of histone modifications
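
To give a flavour of how combinations of histone marks map to chromatin states, here is a deliberately simplified rule-based sketch. It relies on commonly cited associations (H3K4me3 with promoters, H3K4me1 with enhancers, H3K27ac with active elements, H3K36me3 with transcribed gene bodies). The rules, their order and the labels are assumptions for illustration; real chromatin state interpretation is typically done with statistical models trained on the data.

```python
def interpret_chromatin_state(marks):
    """Assign a coarse chromatin state label from the set of marks at a locus."""
    marks = set(marks)
    if "H3K4me3" in marks:
        return "promoter"
    if "H3K4me1" in marks and "H3K27ac" in marks:
        return "active enhancer"
    if "H3K4me1" in marks:
        return "inactive enhancer"
    if "H3K36me3" in marks:
        return "actively transcribed gene"
    return "unassigned"


# Hypothetical loci with the histone modifications detected at each.
loci = {
    "locus_1": ["H3K4me3", "H3K27ac"],
    "locus_2": ["H3K4me1", "H3K27ac"],
    "locus_3": ["H3K4me1"],
    "locus_4": ["H3K36me3"],
    "locus_5": [],
}
states = {name: interpret_chromatin_state(marks) for name, marks in loci.items()}
print(states)
```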

Addition of methyl groups to cytosines in DNA modifies the expression levels of nearby genes. Methylated DNA can either be pulled down and sequenced (MeDIP-seq) or unmethylated cytosines can be converted to uracil and sequenced (bisulphite sequencing). We can map, annotate and compare the methylated CpG islands using a range of protocols in order to make it easier for you to interpret your results. Methylation profiles can also be analysed integratively with other epigenomics or transcriptomics measurements.


  • Quantification of methylation for all CpG islands
  • Functional annotation for differentially methylated CpG islands
  • Methylated individual cytosines (for BS- and RRBS-seq)
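
For bisulphite-based data, the methylation level of a cytosine is commonly expressed as the fraction of reads that still show a C (methylated, protected from conversion) out of all reads covering that position. The sketch below computes this per cytosine and averages it over a CpG island. The minimum coverage threshold, the unweighted averaging and the read counts are assumptions chosen for the example.

```python
def methylation_level(methylated_reads, unmethylated_reads, min_coverage=5):
    """Fraction of methylated reads at one cytosine, or None if coverage is too low."""
    coverage = methylated_reads + unmethylated_reads
    if coverage < min_coverage:
        return None
    return methylated_reads / coverage


def island_methylation(cytosine_counts, min_coverage=5):
    """Average methylation over the sufficiently covered cytosines of one CpG island."""
    levels = [
        methylation_level(meth, unmeth, min_coverage)
        for meth, unmeth in cytosine_counts
    ]
    levels = [level for level in levels if level is not None]
    return sum(levels) / len(levels) if levels else None


# Hypothetical (methylated, unmethylated) read counts for three cytosines.
island_counts = [(8, 2), (9, 1), (1, 1)]  # the last site has too little coverage
island_level = island_methylation(island_counts)
print(island_level)
```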

The dynamic state of chromatin is a central focus within epigenomics and can be studied by measuring the openness of chromatin throughout the genome. Segments of genome that are not tightly packed can be mapped using ATAC sequencing. Open chromatin is connected to active regions regarding gene expression or regulation of expression. Therefore, transcriptomics data are usually integrated with open chromatin information. Our analysis will indicate regions of open chromatin, with annotations regarding what genes are within that region and how the regions may have changed between your samples.


  • Loci of open chromatin
  • Differentially open chromatin between samples
  • Functional annotation of open chromatin loci

Microbiome profiling and functional analysis

We analyse your microbiome data in state-of-the-art pipelines to produce publication-ready results.

Professional microbiome profiling service

Omics Data Solutions offers a wide range of microbiome diagnostics. Our certified partner laboratories employ highly standardized NGS methods to enable accurate and precise determination of microbial community composition from a wide variety of sources.

We are proud to be entrusted with the analysis of microbial samples by a large number of public and private institutions.


We offer highly standardized NGS microbiome analyses for:

  • Human health
  • Agriculture
  • Environmental monitoring

See detailed description below.

We support your ecological microbial research based on next-generation sequencing. We deliver a comprehensive microbiome analysis service for data derived from environmental samples, in particular:

  • 16S rDNA/rRNA amplicons
  • Shotgun metagenomics
  • Shotgun metatranscriptomics
  • Phylogenetic analyses

We use model-based multivariate statistics to determine the response of microorganisms to environmental factors. Moreover, we apply network analysis to unravel microbe-microbe interactions. This will help you to identify:

  • Habitat-specific indicator taxa
  • Association patterns
  • Environmental factors governing microbial distribution

Microbial Human Health:

The microbiome is becoming the starting point for a new generation of probiotic and therapeutically effective bacteria. Our services identify candidate bacteria and support the development of new product lines.

Antibiotics (AB)
The use of AB changes the human intestinal microbiome. We can monitor the effect of AB over time and provide information about the resilience of human gut microbiota after completion of AB administration.

Pathogen Monitoring
NGS is used as a diagnostic tool in an increasing number of clinical applications. It enables standardized and accurate identification of pathogenic bacteria in patients as well as in the clinical environment as possible sources of infection for patients.

We also offer accurate and standardized microbiome profiling service for clinical trials.


Microbial Ecology:

  • Microbes play key roles in biogeochemical cycles and as symbionts of animals and plants. Describing their distribution and functional repertoire is key to understanding these roles.
    • Environmental Monitoring & Ecology
      Microbial communities are a critical component of every ecosystem. Our assays and analyses of samples from a wide variety of environmental sources allow accurate determination of the microbial composition as well as an assessment of diversity and function.
    • Soil Amendment
      Different fertilization techniques have different effects on the soil microbiome and the resulting crop yield. Microbiome analysis offers the possibility to develop novel soil modifications and to take more targeted measures based on the relationship between microbial community composition and soil quality.
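
As an example of the diversity assessment mentioned above, the Shannon index summarizes both the number of taxa in a sample and how evenly reads are spread across them. The sketch below computes it from per-taxon read counts using natural logarithms. The taxon counts are invented, and the choice of this index over other diversity measures is an assumption made for illustration.

```python
import math


def shannon_diversity(taxon_counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(taxon_counts)
    if total == 0:
        raise ValueError("sample contains no reads")
    proportions = [count / total for count in taxon_counts if count > 0]
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical read counts per taxon in two samples.
even_sample = [25, 25, 25, 25]    # four equally abundant taxa
dominated_sample = [97, 1, 1, 1]  # one taxon dominates

even_h = shannon_diversity(even_sample)
dominated_h = shannon_diversity(dominated_sample)
print(round(even_h, 3), round(dominated_h, 3))
```

The evenly distributed sample reaches the maximum possible value for four taxa, ln(4), while the dominated sample scores much lower despite containing the same number of taxa.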

Let us assist you with your ecological project. For academic clients, we help procure sequencing of amplicon or genomic libraries on Illumina® MiSeq™ and NextSeq™ platforms through our expert partners. Our scientists have long-term experience in ecological microbial research, including:

  • Conception of study designs
  • Marine microbiology
  • Algal microbiota (e.g., Fucus vesiculosus)
  • Microbiota of extreme habitats (e.g., oxygen minimum zones)
  • Cnidaria-associated microbes (e.g., corals, jellyfish)


Next-generation sequencing has raised virological studies to a new level. Viral metagenomics (viromics) reveals critical information on the genetic diversity of a viral community, without the need for isolating and cultivating viral species. We analyse your viral metagenomics data in state-of-the-art pipelines to produce publication-ready results.


We employ dedicated bioinformatic tools for viral profiling to provide:

  • Viral genes and contigs (possibly whole genomes) in environmental metagenomes
  • Cleaning of viral metagenomes of contaminating host DNA
  • Taxonomic annotation
  • Matches between viral metagenomes and host (meta)genomes
  • Viral distribution patterns in relation to environmental or clinical conditions

Health Prediction

Omics Data Solutions utilizes the latest machine learning algorithms to produce robust risk prediction models that combine genetics with lifestyle and environment to predict important health outcomes. We provide our clients with an automated, state-of-the-art app for affordable personalized health tracking.

We aim to deliver personalized health services that are accessible, equitable and affordable to all, reaching those who need them, when and where they need them, within available digital and portable resources:

Omics Data Solutions utilizes the latest machine learning algorithms, together with safeguards for the security and privacy of user information, to provide robust personalized health and disease risk prediction on mobile devices as a service. The service combines genomics with lifestyle and environment to predict important health outcomes and to provide personalized tracking and decision support for wellbeing, fitness and healthy eating.

Our technology uses world-class datasets and combines the best algorithms to generate polygenic risk scores (PRSs) with the highest predictive power. By integrating these state-of-the-art PRSs with clinical risk factors and cumulative daily lifestyle data, we provide personalized absolute health and disease risk prediction.
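
At its core, a polygenic risk score is a weighted sum: for each variant, the number of effect alleles a person carries (0, 1 or 2) is multiplied by that variant's effect size, and the products are added up. The sketch below shows only this basic calculation. The variant identifiers, weights and dosages are invented, and the steps that determine predictive power in practice (choosing variants and weights, population calibration, combining with clinical factors) are not shown.

```python
def polygenic_risk_score(effect_weights, allele_dosages):
    """Weighted sum of effect-allele dosages over variants present in both inputs."""
    shared = effect_weights.keys() & allele_dosages.keys()
    return sum(effect_weights[variant] * allele_dosages[variant] for variant in shared)


# Hypothetical per-variant effect sizes (e.g. log odds ratios).
effect_weights = {"variant_a": 0.20, "variant_b": -0.10, "variant_c": 0.05}

# Hypothetical effect-allele counts for one individual.
allele_dosages = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = polygenic_risk_score(effect_weights, allele_dosages)
print(round(score, 3))
```

A raw score like this is only meaningful relative to a reference distribution, which is why scores are usually standardized against a matched population before being reported as risk.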

Embrace data-driven preventive healthcare. Offer world-class PRSs for clinical and healthcare-grade risk prediction.

Generate the best performing PRS using cutting-edge algorithms in parallel. Validate PRS on multiple populations with just two clicks.

Improve the power of clinical trials by stratifying participants based on genomic diversity.

Explore opportunities for drug repurposing.

Our Core Vision

Data Driven

Our products are developed and validated using world class datasets and state-of-the-art technology to ensure the highest level of quality.

Research Rigor

We are a group of international scientists and researchers committed to maintaining standards of academic excellence every step of the way.


Everything we do works towards our common goal of providing tangible improvement in preventive healthcare to help humans live a longer, healthier life.

What we offer

Omics Data Solutions' Polygenic Risk Scores for common diseases have the highest predictive power on the market, allowing physicians to more effectively help patients lower their risk of life-threatening diseases.

Our secure digital platform enables researchers to construct and validate new Polygenic Risk Scores on multiple populations using the top algorithms in parallel to reveal the best Polygenic Risk Scores.

Sport Omics

Omics Data Solutions utilizes the latest machine learning algorithms to facilitate discovery of the genetic and lifestyle influence on sporting performance, training response, injury predisposition, and other potential determinants of successful human performance. We provide our clients with an automated, state-of-the-art app for affordable, personalized prediction and tracking of athletic performance.

Sports genetics has had limited success in identifying genes associated with athletic performance.

LifeTracker is an intelligent application that utilizes the latest machine learning algorithms, together with safeguards for the security and privacy of user information, to provide robust, personalized discovery of the genetic and lifestyle influence on sporting performance, training response, injury predisposition, and other potential determinants of successful human performance.

Ancestry Discovery

Discover how your unique DNA may reveal your origins, shape who you are and trace your family's genetic history, using a dedicated package and software tool. We analyse your DNA data in state-of-the-art pipelines to produce publication-ready results.

Omics Data Solutions offers Software as a Service that analyses your DNA to identify your ancestral origins, map them to some of your traits, and estimate the approximate time in your family lineage. This method summarizes the contribution of anywhere from hundreds to millions of genetic variants to determine your origins back in time and your genetic risk of developing diseases.

This approach guarantees the maximum level of personalization and effectiveness of the results provided.
