OCR sometimes misses really important context.
Make use of derived data, e.g. age at marriage, gender differences, mortality rates, migration maps, last names.
Don’t spend all your time cleaning data.
Try a sequence of things, not one big idea.
Don’t expect existing tools to do the whole job. At the same time, building new tools is not straightforward either.
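As a rough illustration of the derived-data tip above, here is a minimal sketch that computes age at marriage from two dates. The record layout, field names, and dates are invented for the example and do not come from any real dataset.

```python
from datetime import date


def age_at(event: date, birth: date) -> int:
    """Whole years elapsed between birth and the given event."""
    years = event.year - birth.year
    # Subtract one if the birthday has not yet occurred in the event year.
    if (event.month, event.day) < (birth.month, birth.day):
        years -= 1
    return years


# Hypothetical records, made up for illustration only.
records = [
    {"name": "A", "born": date(1850, 6, 15), "married": date(1872, 5, 1)},
    {"name": "B", "born": date(1848, 1, 3), "married": date(1872, 5, 1)},
]

# The derived column: one age at marriage per record.
ages_at_marriage = [age_at(r["married"], r["born"]) for r in records]
```

Once a derived column like this exists, questions such as gender differences in marriage age become simple aggregations over it.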
Sample Digital Humanities Projects

ORBIS by Stanford ....
CityFlo Reduces Google Maps API Bill by 94%

Google limited the free tier of its Directions API. CityFlo uses this API to calculate ETAs. Initially, CityFlo queried the API every time a stop was registered.
This was the problem. The number of API calls should depend on the stops themselves, not on the buses. For an MVP, sure, make that query, but if you’re running a growing business, those calls add up!...
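One way to make the call count depend on stops rather than buses is to cache ETAs per pair of stops. The sketch below is only an assumption about how that could look, not a description of what CityFlo actually built; `EtaCache` and `fake_fetch_eta` are hypothetical names, and the fetcher is a stub standing in for the real Directions API request.

```python
from typing import Callable, Dict, Tuple

StopPair = Tuple[str, str]


class EtaCache:
    """Memoises ETAs per (origin stop, destination stop) pair."""

    def __init__(self, fetch_eta: Callable[[str, str], float]) -> None:
        # fetch_eta is a hypothetical stand-in for the real Directions API call.
        self._fetch_eta = fetch_eta
        self._etas: Dict[StopPair, float] = {}
        self.api_calls = 0

    def eta(self, origin: str, destination: str) -> float:
        key = (origin, destination)
        if key not in self._etas:
            # Only the first request for a stop pair reaches the API.
            self.api_calls += 1
            self._etas[key] = self._fetch_eta(origin, destination)
        return self._etas[key]


def fake_fetch_eta(origin: str, destination: str) -> float:
    # Placeholder: a fixed value in minutes instead of a network request.
    return 12.0


cache = EtaCache(fake_fetch_eta)
# Three buses register the same leg of a route.
for _bus in ("bus-1", "bus-2", "bus-3"):
    cache.eta("stop-A", "stop-B")
```

Here three buses on the same leg trigger a single fetch. A real version would also need to expire entries, since travel times change with traffic.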
On Abstractions

If you find yourself adding more parameters and if-statements to an existing abstraction, is the abstraction still apt? Why not remove the old abstraction and re-extract a new, more apt one? Devs frequently succumb to the sunk cost fallacy, assuming there must have been a good reason the code was written a certain way.
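A small, made-up illustration of the idea: `format_name` below has grown one flag per caller, and the two functions after it show what re-extracting might look like once the old helper is inlined. All names here are hypothetical.

```python
# Before: one helper that gained a parameter and a branch for each new caller.
def format_name(first: str, last: str, last_first: bool = False, upper: bool = False) -> str:
    if last_first:
        name = f"{last}, {first}"
    else:
        name = f"{first} {last}"
    if upper:
        name = name.upper()
    return name


# After: the helper is inlined at its call sites, and each caller's real
# need is extracted as its own small function with no flags.
def display_name(first: str, last: str) -> str:
    return f"{first} {last}"


def sort_key_name(first: str, last: str) -> str:
    return f"{last}, {first}".upper()
```

Neither new function needs a conditional, which is a reasonable sign that the original helper was hiding two different ideas behind one name.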
References

The Wrong Abstraction. Sandi Metz. www.sandimetz.com ....