Benjamin Bengfort

Benjamin Bengfort is an experienced data scientist and software engineer who focuses on implementing data products that can learn from real-time streaming data.


Benjamin is the program director of the Georgetown Data Science Certificate programs, where he also teaches Machine Learning and courses in Advanced Data Science. He is currently a research scientist at Wright State Research Institute, where he explores applied multi-agent artificial intelligence in a distributed environment. His background includes a wide variety of professional, military, and academic experience in software engineering, distributed systems, and machine learning. From developing natural language systems that can match students to their reading level to real-time event detection and classification on the electric transmission grid, Benjamin’s focus has been on the direct application of machine learning to solve real-world problems.

Benjamin is also deeply involved with the data science community. He is the founder and maintainer of scikit-yellowbrick, an open source visual steering and diagnostic library for machine learning, and contributes to a number of other open source libraries. He is an emeritus board member and active participant in Data Community DC and a contributor at District Data Labs. He has mentored students as part of the Google Summer of Code and leads research labs and code sprints to introduce others to open source.

His main research interests include distributed storage systems, natural language processing, machine and statistical learning, distributed computation, and multi-agent systems. Benjamin has published articles at the ACM PODC, IEEE ICDCS, SSCI, and WCNC conferences, has written several O’Reilly books, and is a frequent speaker at events including Strata + Hadoop World, PyCon, and the NumFOCUS PyData series. His primary publication topics are big data, distributed analytics, graph analytics, and natural language processing.
His books include The Practical Data Science Cookbook (Packt), Data Analytics with Hadoop: An Introduction for Data Scientists (O’Reilly), and Applied Text Analytics with Python: Enabling Language Aware Data Products with Machine Learning (O’Reilly).