Okay, so quick status update: there are two ways to build the book into a PDF: via LaTeX and via HTML. Each option currently has some problems.
problems with LaTeX
there's a specific SVG file in the book (book/figures/pd/economic_opt.svg) that makes imagemagick (somewhat) choke. imagemagick manages to process it, but returns exit code 1, which sphinx really doesn't like.
the underlying issue is that the old version of the librsvg library (which is used by imagemagick) is unable to handle the SVG in question. I've recompiled ImageMagick, and the SVG no longer causes problems.
latexmk refuses to output a PDF if there are any errors in the produced LaTeX. turns out, there are a lot of them:
missing pictures (referenced in text, not present in directory structure - mainly in the old/ directory)
path issues with README: the readme file references a few images in ./book/figures/.... All fine and good, except that book/intro.md contains the `{include} ../README.md` directive, and the LaTeX compiler is now unable to find the pictures, as it searches for them in ./book/book/figures.
There's a bunch of invalid references. latexmk refuses to compile.
There's a bunch of invalid characters. latexmk refuses to compile. I'm guessing that this problem could be solved by installing some font package, but didn't get that far into the research yet.
Perhaps there is a way to make latexmk ignore all the errors and spit something out, and we can proceed from there, once we see if the overall format is even good enough for us to consider fixing errors.
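The missing-picture and README path items above could be caught before latexmk ever runs. Here is a rough sketch of such a check, not something that exists in the repo: `missing_images` and its `resolve_from` argument are made-up names, and it only looks at plain Markdown `![alt](path)` images. Passing `book` as `resolve_from` is an assumption that mimics what seems to happen when README.md is included from book/intro.md.

```python
import re
from pathlib import Path

# Plain Markdown images only: ![alt](path). Other image syntaxes are not covered.
IMAGE_RE = re.compile(r"!\[[^\]]*\]\(([^)\s]+)")


def missing_images(md_file: Path, resolve_from: Path) -> list[str]:
    """Return image paths referenced in md_file that do not exist
    when resolved relative to the directory resolve_from."""
    text = md_file.read_text(encoding="utf-8")
    missing = []
    for ref in IMAGE_RE.findall(text):
        # Remote images cannot be checked on disk, so skip them.
        if ref.startswith(("http://", "https://")):
            continue
        if not (resolve_from / ref).exists():
            missing.append(ref)
    return missing
```

Running it as `missing_images(Path("README.md"), Path("."))` checks the README as seen from the repo root, while `missing_images(Path("README.md"), Path("book"))` checks the same references as seen from the including file's directory, which is where the ./book/book/figures lookups would show up.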
problems with HTML
HTML -> PDF conversion is done via chromium "print to file" functionality (well, at least I think so. I'm 100% sure that chromium is being used, not sure what happens internally). Chromium doesn't have all necessary dependencies installed out of the box, so we need to figure out which dependencies are needed. Perhaps we can install chromium manually, and hope that apt will download everything, and when we run the htmlpdf builder, the chromium instance downloaded by jupyter book will just use the installed libraries.
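One way to find out which dependencies are needed is to run `ldd` on the Chromium binary and list the libraries it reports as "not found". A minimal sketch, assuming a Linux machine with `ldd` available; the function names are made up, and the path to the Chromium binary that jupyter book downloads is not known here, so it has to be passed in by hand.

```python
import subprocess


def parse_missing_libs(ldd_output: str) -> list[str]:
    """Return the library names that ldd marks as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved libraries appear as "libfoo.so.1 => not found".
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def missing_libs(binary: str) -> list[str]:
    """Run ldd on a binary (path supplied by the caller) and parse its output."""
    result = subprocess.run(
        ["ldd", binary], capture_output=True, text=True, check=False
    )
    return parse_missing_libs(result.stdout)
```

The resulting names still have to be mapped to apt packages by hand, so this only narrows down what to install.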
conclusion
Harder than I thought it would be. Seems like using HTML will be less painful, but there's a chance that the formatting of the latex document will be better. We'll see. My plan for now is to proceed down the HTML path, get something to render, then try and get something out of the LaTeX renderer, and compare the results.
interesting! i think that on the one hand, this is another good reason to clean up the way we build the book (e.g., broken or poorly stored figures), which can be handled in the CI/CD tests. On the other hand, we need to think of how we will use the PDF. There are two reasons I can think of: offline viewing and editing/review. I think for students it is much better to just view the HTML offline - no extra maintenance necessary except for providing the build folder somehow via the website (make this a separate issue?). For editing/review, it doesn't need to be beautiful, and in fact, printing the HTML view to PDF has the advantage that teachers can comment directly on the appearance. so maybe better to use the easiest print to PDF option?
So in summary, i agree with proceeding down the HTML path. My question would be how "easy" is it to have something that still makes the book structures somewhat logical? Will it print sequentially page by page, then we could slap a printout of the ToC on page 1?
i can't recall how the SVG in question was created, and I can't find any code for it in the repo. it was converted from an old lecture notes file, maybe with an online conversion app from a raster format, though some figures were made with code or perhaps with Xournal++. i think @cjungbacker did it - do you remember how?
I'll check if chrome renders them correctly. If it does, the origin of the pictures doesn't really matter, as both LaTeX and HTML builders handle them properly.
this looks pretty good! Looks like there are some badly formatted pages, but it's limited to the nb's that were not developed to be included in a JB so it's not a problem at all. The JB pages look fine. This should definitely not be prioritized over other tasks like mudlisher, but a few things that (hopefully) are easily incorporated:
text at bottom is good (page numbers and date) but the time can be left out (and should fit on the page)
I bet if you can force a new page to start every time there is a top level header # ... or new file (should be the same thing), it will make the PDF more readable - easier to tell the chapters apart
also, it looks like this is currently a script on your computer, is the idea to put it on GitLab somewhere? If so, might be good to put into the interactivetextbooks-CITG Group as a utility---i bet there would be other people interested in using it and developing it further.
add cover page that includes MUDE logo, MUDE-CEG@tudelft.nl, TUD logo, URL, date and time of PDF creation
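For the "new page at every top level header" idea, a print-only CSS rule might be enough, since the PDF comes from Chromium's print output. A rough sketch, assuming each chapter title ends up as an `<h1>` in the built HTML (not verified for this book); `add_print_css` is a hypothetical helper, not part of the existing script.

```python
# Print-only rule: start a new page before every top-level heading.
PRINT_CSS = """
@media print {
  h1 { break-before: page; }
}
"""


def add_print_css(html: str, css: str = PRINT_CSS) -> str:
    """Insert a <style> block just before </head>.
    The HTML is returned unchanged if it has no </head> tag."""
    head_end = html.find("</head>")
    if head_end == -1:
        return html
    return html[:head_end] + f"<style>{css}</style>" + html[head_end:]
```

The same rule could probably live in a custom stylesheet of the book instead of being injected afterwards; the injection variant just keeps the change inside the PDF script.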