Miscellaneous
These documents don’t fit anywhere, but they’re related to technology.
Dated Apr 1, 2019; last modified on Sun, 28 May 2023
Nov 27, 2024 | » | Consistent Hashing
2 min; updated Nov 27, 2024
Consistent hashing makes me think of hashing without randomization. Why isn’t every hash consistent by definition? For example, a map implementation needs consistent hashing, or else searches for stored values would be inaccurate. Or is consistent hashing a tradeoff between collision-resistance and speed? Web Caching: web caching was the original motivation for consistent hashing. With a web cache, if a browser requests a URL that is not in the cache, the page is downloaded from the server, and the result is sent to both the browser and the cache.... |
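One way to make the questions above concrete is a small hash ring. In the usual sense of the term, "consistent" is less about a hash function returning the same value twice and more about how few keys change owner when a cache node joins or leaves. The sketch below is a minimal illustration under stated assumptions, not how any particular web cache does it: `HashRing`, `stable_hash`, the `cache-a`-style node names, and the replica count of 100 are all hypothetical.

```python
import bisect
import hashlib


def stable_hash(key: str) -> int:
    """Hash a string to a 64-bit integer that is the same in every process.

    Python's built-in hash() is randomized per process for strings,
    so a cryptographic digest is used here instead (an assumption of this sketch).
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual points per node (assumed value)
        self._points = []         # sorted positions on the ring
        self._owner = {}          # position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        for i in range(self.replicas):
            point = stable_hash(f"{node}#{i}")
            if point in self._owner:
                continue  # extremely unlikely collision: keep the existing owner
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove(self, node: str) -> None:
        for i in range(self.replicas):
            point = stable_hash(f"{node}#{i}")
            if self._owner.get(point) == node:
                del self._owner[point]
                self._points.remove(point)

    def lookup(self, key: str) -> str:
        """Return the node owning the first ring position after the key's hash."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._points, stable_hash(key))
        return self._owner[self._points[idx % len(self._points)]]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
urls = [f"/page/{i}" for i in range(1000)]
before = {url: ring.lookup(url) for url in urls}
ring.add("cache-d")
after = {url: ring.lookup(url) for url in urls}
moved = [url for url in urls if before[url] != after[url]]
```

With plain `hash(url) % number_of_caches`, adding a fourth cache would remap most URLs. On the ring, every URL in `moved` lands on the new node and the rest stay where they were.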
Jul 5, 2020 | » | Of Code Smells and Hygiene
10 min; updated Nov 30, 2022
#code-hygiene Pick up from: IEEE’s International Conference on Software Maintenance; Improving Code: The (Mis)perception of Quality Metrics; Springer’s Software Quality Journal. On Abstractions: If you find yourself adding more parameters and if-statements to an existing abstraction, is the abstraction still apt? Why not remove the old abstraction and re-extract a new [more apt] abstraction? Devs frequently succumb to the sunk cost fallacy, thinking that there must have been a reason that the code was written in a certain way.... |
Apr 24, 2020 | » | On Data Science
1 min; updated Sep 18, 2022
Data Science: Reality Doesn’t Meet Expectations. Execs frequently ignore data science research when making decisions. Data is often dirty, or insufficient to make decisions about the majority of users. Sometimes the infrastructure is poor - SQL queries take hours. Data Scientists are usually the only ‘data person’ on the team. Tons of requests from teams, and most of the work is repetitive and ‘easy’. Measuring impact is hard - especially on dollars that were hypothetically saved but never spent.... |
Apr 11, 2020 | » | APIs Are Not In the Efficiency Business
3 min; updated Sep 5, 2022
CityFlo Reduces Google Maps API Bill by 94%: Google’s Directions API limited its free tier. CityFlo uses this API to calculate ETAs. Initially, CityFlo queried the API every time a stop was registered. This was the problem. The number of API calls should depend on the stops themselves, not the buses. For an MVP, sure - make that query - but if you’re running a growing business, those calls add up!... |
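The point that call volume should track stops rather than buses can be sketched as a cache keyed on the pair of stops. This is only an illustration of the idea, not CityFlo's actual fix: `StopEtaCache`, the injected `fetch_eta` callable (a stand-in for whatever client makes the paid API call), and the 5-minute time-to-live are all assumptions.

```python
import time
from typing import Callable, Dict, Tuple


class StopEtaCache:
    """Hypothetical cache: at most one paid lookup per (origin, destination) per TTL."""

    def __init__(
        self,
        fetch_eta: Callable[[str, str], float],
        ttl_seconds: float = 300.0,                # assumed freshness window
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_eta = fetch_eta  # stand-in for the real (billed) API call
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Tuple[str, str], Tuple[float, float]] = {}

    def eta(self, origin_stop: str, destination_stop: str) -> float:
        key = (origin_stop, destination_stop)
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # any bus on this leg reuses the stored ETA
        value = self._fetch_eta(origin_stop, destination_stop)
        self._entries[key] = (now, value)
        return value


# Usage with a fake fetcher that counts how often the "API" is hit.
calls = []


def fake_fetch(origin: str, destination: str) -> float:
    calls.append((origin, destination))
    return 12.5  # minutes, made up for the example


fake_now = [0.0]
cache = StopEtaCache(fake_fetch, ttl_seconds=60.0, clock=lambda: fake_now[0])
first = cache.eta("stop-1", "stop-2")   # triggers one fetch
second = cache.eta("stop-1", "stop-2")  # served from the cache
```

However many buses pass between the same two stops inside the window, the billed call happens once. The trade-off is staleness: a shorter TTL follows traffic more closely and costs more calls.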
Oct 1, 2018 | » | Kernighan's Remarks on the Digital Humanities
1 min; updated Feb 5, 2022
OCR sometimes misses really important context. Make use of derived data, e.g. age at marriage, gender differences, mortality rates, mapping migrations, last names. Don’t spend all your time cleaning data. Try a sequence of things, not one big idea. Don’t expect existing tools to do the whole job. At the same time, building new tools is not straightforward either. Sample Digital Humanities Projects: ORBIS by Stanford .... |