

Matt Jansen spoke on the development of “On the Books: Jim Crow and Algorithms of Resistance,” a collections-as-data and machine learning project of the University of North Carolina at Chapel Hill Libraries. The project's goal is to discover Jim Crow and racially based legislation signed into law in North Carolina between Reconstruction and the Civil Rights Movement (1866/67-1967). The project team intends to create a comprehensive list of Jim Crow laws using text analysis, building on earlier work by subject matter experts Pauli Murray and Richard A. Paschal. Jansen explained that the team used digitized North Carolina Session Laws from 1866/67-1967 as raw material and had to take the historical practices of the period into account when determining the scope of the project, what metadata was needed, and how to make the units of analysis useful later on. Automated pattern matching and manual clean-up were used together to identify publishing oddities as well as Optical Character Recognition (OCR) errors. The first phase of the project resulted in a corpus of nearly 2,000 identified Jim Crow laws, over 1,300 of which were identified by modeling. Jansen finished his talk with an overview of the second phase of funding, which aims to identify more Jim Crow laws and examine how laws moved between states, and with a call for audience members to apply to become teaching or research fellows. Project team members Amanda Henley, Lorin Bruckner, and Brianna Nuñez joined Jansen during the Q&A portion of the event to provide more context on the project overall.


Matt Jansen, Data Analysis Librarian (Digital Research Services)

Department: University Libraries | Faculty Profile

Featured on: February 24, 2022 (Event Page)

Session Title: Tackling Underrepresentation with Data Science (Event Recap)

Tools, Information, and Resources: