Boston, MA, USA
2026-03-06
Xellar Biosystems
North America
Data Science Co-op/Summer Intern – Biological Data Mining
Role Description
**Position Overview**
We are seeking a highly motivated graduate student to join our team as a Co-op or Summer Intern, focusing on multi-modal data integration and advanced data mining for drug discovery. This role is part of our Data Science research program, where you will leverage cutting-edge AI technologies and data mining techniques to accelerate the discovery of drug targets and new opportunities for drug repurposing. You will work closely with cross-functional teams of data scientists, computational biologists, and drug discovery researchers, gaining hands-on experience in translating data-driven insights into actionable drug discovery strategies.
**Key Responsibilities**
* Participate in the design and implementation of advanced data mining workflows to support novel drug target discovery and drug repurposing initiatives.
* Preprocess, clean, and analyze large-scale biological and chemical datasets, ensuring data quality and integrity for research applications.
* Integrate diverse datasets, including multi-omic profiling (genomics, transcriptomics, proteomics) for disease molecular signature discovery, structured drug data (molecular structures, PK/PD, toxicity) for druggability prioritization, and real-world data for target validation and repurposing opportunity identification.
* Assist in developing and optimizing models to predict target-disease associations, evaluate druggability of potential targets, and validate repurposing candidates.
* Leverage cutting-edge AI algorithms and statistical models to extract insights from scientific literature, patents, and multi-omic datasets.
* Collaborate with team members to document research findings, prepare technical reports, and present results in team meetings.
**Required Qualifications**
* Currently enrolled in a graduate program (Master’s or PhD) in Computational Biology, Bioinformatics, Data Science, Computer Science, Statistics, Biostatistics, Biomedical Engineering, or a related field.
* Familiarity with multi-omic data (genomics, transcriptomics, proteomics), drug databases (e.g., PubChem, ChEMBL), scientific literature mining, or knowledge graph construction is highly desirable.
* Strong foundational knowledge in data mining, machine learning, and AI, with practical experience in applying these techniques to biological or chemical datasets.
* Proficiency in programming languages commonly used in data science and bioinformatics, such as Python (preferred) or R, with experience in libraries/tools including Pandas, NumPy, Scikit-learn, TensorFlow/PyTorch, or similar.
* Basic understanding of drug discovery processes, target-disease associations, and druggability criteria is a plus.
* Ability to work independently and collaboratively in a fast-paced research environment, with strong problem-solving skills and attention to detail.
**Preferred Qualifications**
* Prior experience in a research lab or industry setting related to drug discovery, computational biology, or data science.
* Knowledge of molecular biology, biochemistry, or pharmacology fundamentals.
* Experience with LLM-powered text mining or knowledge graph embedding techniques.
* Being a Massachusetts resident or a student currently attending a Massachusetts institution of higher education is a plus.
Pay: $25.00 - $40.00 per hour
Work Location: In person