Mentor Areas

  • Computational Genomics & Algorithms
    • Methods development for high-throughput genomics, including sequence mappers and variant callers (short- and long-read).
    • Algorithm design for sensitive and scalable alignment, split-read and discordant-pair evidence integration, and error-modeling.
  • Variant Discovery & Interpretation
    • Comprehensive detection and genotyping of structural variants (SVs) across population and disease cohorts.
    • Integration of SVs with small variants and haplotypes for downstream association and functional analyses.
  • Medical & Neurodegenerative Genomics
    • Study design, QC, joint calling, and association testing for disease genomics.
    • Participation in the Alzheimer’s Disease Sequencing Project (ADSP) working groups; close collaboration on UPenn Alzheimer’s disease (AD) datasets.
  • Software Engineering for Bioinformatics
    • Creator of widely used open-source mappers and variant detectors; emphasis on performance optimization, reproducibility, documentation, and community support.
    • Pipeline development and workflow orchestration for large-scale WGS/WES/long-read datasets.

Description:

My research focuses on algorithm and methodology development for genomics data analysis. I have developed several widely used software tools, including sequence mappers and variant detectors. These tools are used in projects ranging from population resequencing to medical sequencing studies, and they have made significant contributions to the community. My recent research focuses on discovering human genomic variation, especially structural variations such as deletions, insertions, tandem duplications, inversions, and translocations. Structural variations are major contributors to genetic diversity and are frequently associated with human diseases. I have been an active member of the structural variation group of the 1000 Genomes Project since 2009 and of the Human Genome Structural Variation Consortium since 2014. My primary contributions were to alignment and structural variant detection. Since joining UPenn, I have worked closely on Alzheimer’s disease genomics data analysis and have participated in working groups of the Alzheimer’s Disease Sequencing Project (ADSP). I will contribute my bioinformatics analysis skills and software development experience to the proposed study.

Preferred Qualifications

  • Linux skills
  • Knowledge of whole genome sequencing data

Details:

Preferred Student Year

Junior, Senior

Academic Term

Fall, Spring, Summer

I prefer to have students start during the above term(s).

Volunteer

Yes

Yes indicates that faculty are open to volunteers.

Paid

Yes

Yes indicates that faculty are open to paying students they engage in their research, regardless of their work-study eligibility.

Work Study

Yes

Yes indicates that faculty are open to hiring work-study-eligible students.