• 5 Posts
  • 1.36K Comments
Joined 11 months ago
cake
Cake day: August 8th, 2023

help-circle





  • It does sound to me like ingesting all these different formats into a normalized database (aka data warehousing) and then building your tools to report from that centralized warehouse is the way to go. Your warehouse could also track ingestion dates, original format converted from, etc. and then your tools only need to know that one source of truth.

    Is there any reason not to build this as a two-step process of 1) ingestion to a central database and 2) reporting from said database?














  • 4am@lemm.eetoScience Memes@mander.xyzElsevier
    link
    fedilink
    English
    arrow-up
    6
    ·
    8 days ago

    Those printer instructions are called Postscript and they’re the basis of PDF.

    You are thinking that the printing process will rasterize the PDF and then essentially OCR/vector map it back. It’s (usually) not that complicated.