Benjamin Bengfort

Benjamin Bengfort is an experienced data scientist and software engineer who focuses on implementing data products that can learn from real-time streaming data.

Benjamin is currently concluding his dissertation in Computer Science at the University of Maryland, College Park, where he conducts research on applied machine learning and distributed systems. His background includes significant professional and military experience in addition to his academic work; he has been a Python software developer for 11 years and a data scientist since earning his Master of Science in Computer Science from North Dakota State University in 2008.

He is an emeritus board member of Data Community DC and head faculty member at District Data Labs. In his role at District Data Labs, he collaborates with local developers on inclusive, high-impact open source software. He is one of the core developers of Scikit-Yellowbrick, a visual steering library for machine learning with Scikit-Learn. His main research interests include distributed storage systems, natural language processing, machine and statistical learning, distributed computation, and multi-agent systems.

Benjamin has published articles in the ACM PODC, IEEE ICDCS, SSCI, and WCNC conferences, as well as several O'Reilly books, and is a frequent speaker at events including Strata + Hadoop World, PyCon, and the NumFOCUS PyData series. His primary publication topics are big data, distributed analytics, graph analytics, and natural language processing. His books include The Practical Data Science Cookbook (Packt), Data Analytics with Hadoop: An Introduction for Data Scientists (O'Reilly), and the forthcoming Applied Text Analytics with Python (O'Reilly).