Salomon Kabongo
PhD in NLP · Lead Software Engineer · Masakhane Board Member
Experience
Lead Software Engineer
State Farm — Innovation Group
Feb 2022 – Present
Bloomington-Normal, IL
4 Patent Filings
- Designed automated pre-labeling pipelines using embedding-based retrieval to accelerate data annotation.
- Architected a proprietary document deduplication system utilizing visual similarity and hashing algorithms to identify near-duplicates, significantly streamlining business workflows.
- Led R&D initiatives on Synthetic Media (Deepfake) detection and Video Understanding; benchmarked Visual Language Models (VLMs) against vendor solutions.
- Invented novel computer vision applications for the insurance domain, resulting in 1 issued AI/ML patent and 3+ additional filings pending.
Board Member
Masakhane Research Foundation
2021 – May 2026
Global
~$9M Research Funding
- Spearheaded the strategic formation of the Masakhane AI Hub, defining the 2025–2029 roadmap to build digital public infrastructure for 1 billion+ African language speakers.
- Secured and oversaw the execution of ~$9M USD in research funding (including $5M from the Bill & Melinda Gates Foundation and $4M from IDRC) to democratize AI access.
- Led high-level collaborations with strategic partners including Google.org, Lacuna Fund, and UNESCO, scaling the community's impact across 50+ African languages.
Research Assistant
L3S / Leibniz Information Center for Science & Technology (TIB)
Nov 2020 – Nov 2022
Hannover, Germany
- Engineered the core "Leaderboards" feature for the Open Research Knowledge Graph (ORKG), utilizing Knowledge Graphs to automatically track and visualize state-of-the-art (SOTA) progress across scientific publications.
- Collaborated with Hannover Medical School (MHH) on personalized medicine research, applying machine learning techniques to analyze large-scale genetic datasets for predictive healthcare outcomes.
- Conducted research on Scholarly Information Extraction, developing novel NLP pipelines to extract metric data from unstructured text for knowledge graph construction.
Education
PhD in Computer Science — AI / Natural Language Processing (LLMs)
Leibniz Universität Hannover
Nov 2020 – Nov 2025
Hannover, Germany
Master's in Machine Intelligence
African Master's in Machine Intelligence (AMMI)
Sponsored by Google and Facebook through AIMS
Oct 2019 – Nov 2020
Accra, Ghana
Master's in Mathematical Sciences
University of the Western Cape
African Institute for Mathematical Sciences (AIMS South Africa)
Aug 2018 – Jun 2019
Cape Town, South Africa
BSc (Honours) in Mathematics & Computer Science
Université de Lubumbashi
Oct 2014 – Jul 2017
Lubumbashi, DRC
Selected Publications & Patents
Systems and Methods for Advanced Duplicate Image Search and Analysis
2024 · US Patent
App. 18/652,500, No. US20240411724A1 · Assignee: State Farm
🏆 US Patent
BibleTTS & LiSTra: African Speech Corpora
2022 · Interspeech, NeurIPS Workshops
High-fidelity multilingual speech corpus; first English-to-Lingala speech translation baseline
Automated Mining of Leaderboards for Empirical AI Research
2021 · ICADL · International Journal on Digital Libraries
30+ citations. SOTA metric extraction from scientific text.
🏆 ICADL 2021 Best Paper Award
Participatory Research for Low-Resourced Machine Translation
2020 · EMNLP Findings
280+ citations. Standard benchmark for African Language NLP.
Technical Skills
Languages
Deep Learning & AI
Cloud & MLOps
Research Areas
Awards & Honors
US Patent Issued (AI/ML) + 3 Pending
State Farm — Innovation Group
ICADL Best Paper Award
International Conference on Asian Digital Libraries
DLRL Summer School
CIFAR / Mila, Montreal
Google Hash Code — Ranked 1747/10724
ACM Future of Computing Academy (FCA) Member
Association for Computing Machinery — 36 selected globally