mirror of
				https://github.com/lleene/hugo-site.git
				synced 2025-10-26 17:59:03 +01:00 
			
		
		
		
	WIP: formatting publications on personal website.
		| @ -1,9 +1,9 @@ | ||||
| --- | ||||
| title: "Hugo Short Codes" | ||||
| date: 2022-06-14T19:36:18+02:00 | ||||
| draft: false | ||||
| draft: true | ||||
| toc: false | ||||
| tags:  | ||||
| tags: | ||||
|   - hugo | ||||
|   - code | ||||
| --- | ||||
| @ -73,4 +73,3 @@ railroad.Diagram("foo", railroad.Choice(0, "bar", "baz"), css=style) | ||||
| {{< python-svg dest="/images/posts/test.svg" title="This is a python-svg example." >}} | ||||
| railroad.Diagram("foo", railroad.Choice(0, "bar", "baz"), css=style) | ||||
| {{< /python-svg >}} | ||||
|  | ||||
|  | ||||
| @ -13,9 +13,9 @@ tags: | ||||
| Recently I started porting some of my latex articles to markdown as they would | ||||
| make a fine contribution to this website in a simpler format. Making a simple | ||||
| parser in python isn't that bad and I could have used [Pandoc](https://pandoc.org/index.html) | ||||
| but I wanted a particular format for rendering a hugo markdown page. So I | ||||
| prepared several regex-based functions in python to dereference and construct | ||||
| a hugo-compatible markdown file. | ||||
| but I wanted to keep formatting as simple as possible when rendering a hugo | ||||
| markdown page. So I prepared several regex-based functions in python to | ||||
| dereference and construct a hugo-compatible markdown file. | ||||
|  | ||||
| ``` python3 | ||||
| class LatexFile: | ||||
| @ -39,16 +39,18 @@ class LatexFile: | ||||
| ``` | ||||
|  | ||||
| The general process for converting a Latex document is outlined above. The | ||||
| principle here is to create a flat text source which we then incrementally | ||||
| format such that Latex components are translated correctly. | ||||
| principle here is to process a flat text source incrementally, such that each | ||||
| Latex component is translated and replaced by plain text with markdown | ||||
| syntax. | ||||
|  | ||||
|  | ||||
| ## Latex Components | ||||
|  | ||||
| In order to structure the python code I created several named-tuples for | ||||
| self-contained Latex contexts such as figures, tables, equations, etc. then | ||||
| by adding a `markdown` property we can replace these sections with hugo | ||||
| friendly syntax using short-codes where appropriate. | ||||
| self-contained Latex contexts such as figures, tables, equations, etc. Then | ||||
| by adding a `markdown` property we can create a collection of objects | ||||
| with which we can simply replace the corresponding latex code in a | ||||
| predictable manner. | ||||
|  | ||||
| ``` python3 | ||||
| class Figure(NamedTuple): | ||||
| @ -68,8 +70,85 @@ class Figure(NamedTuple): | ||||
|             fig_str += "{{" + f'< figure src="{file}" width="500" >' + "}}\n" | ||||
|         fig_str += ( | ||||
|             "{{" | ||||
|             + f'< figure src="{self.files[-1] if self.files else ""}" title="Figure {self.index}: {self.caption}" width="500" >' | ||||
|             + f'< figure src="{self.files[-1] if self.files else ""}" ' | ||||
|             + f'title="Figure {self.index}: {self.caption}" width="500" >' | ||||
|             + "}}\n" | ||||
|         ) | ||||
|         return fig_str | ||||
| ``` | ||||
|  | ||||
| Notice that here we use a hugo short-code when representing the figure in | ||||
| markdown. This lets us set width and other properties in a simpler and more | ||||
| systematic way. | ||||
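
As a self-contained illustration of the idea, the sketch below defines a trimmed-down named-tuple with a `markdown` property. The field list mirrors the fields used elsewhere in this post, but the loop over `self.files[:-1]` and the sample values are assumptions, since the full class is not shown in this hunk.

```python
from typing import List, NamedTuple, Tuple


class Figure(NamedTuple):
    """Hypothetical, trimmed-down stand-in for the Figure tuple in the post."""

    span: Tuple[int, int]  # (start, end) offsets in the latex source
    index: int  # 1-based figure number
    files: List[str]  # image paths found inside the figure
    caption: str
    label: str

    @property
    def markdown(self) -> str:
        """Render each image as a hugo figure short-code, caption on the last."""
        # Assumption: every image except the last is emitted without a title.
        lines = [
            "{{" + f'< figure src="{file}" width="500" >' + "}}"
            for file in self.files[:-1]
        ]
        last = self.files[-1] if self.files else ""
        lines.append(
            "{{"
            + f'< figure src="{last}" '
            + f'title="Figure {self.index}: {self.caption}" width="500" >'
            + "}}"
        )
        return "\n".join(lines) + "\n"


# Made-up sample values, purely for illustration.
fig = Figure(
    span=(0, 0),
    index=1,
    files=["a.svg", "b.svg"],
    caption="Demo",
    label="fig:demo",
)
print(fig.markdown)
```

Building the short-code from separate `"{{"` and `"}}"` strings follows the snippet above and keeps hugo from interpreting the example itself as a short-code.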
|  | ||||
| ## Replacement Procedure | ||||
|  | ||||
| As mentioned before the replacement simply looks for sections in the source and | ||||
| directly replaces them with appropriate markdown text. In order to do this it | ||||
| is important to process the source code in reverse order such that the text | ||||
| location references remain correct as the replacement occurs. | ||||
|  | ||||
| ``` python3 | ||||
| def replace_figures(self) -> None: | ||||
|     """Dereference and replace all figures with markdown formatting.""" | ||||
|     fig_list = self.figures | ||||
|     fig_list.reverse() | ||||
|     for figure in fig_list: | ||||
|         self.tex_src = ( | ||||
|             self.tex_src[: figure.span[0]] | ||||
|             + figure.markdown | ||||
|             + self.tex_src[figure.span[1] :] | ||||
|         ) | ||||
|     for figure in fig_list: | ||||
|         self.tex_src = re.sub( | ||||
|             r"\\ref\{" + re.escape(figure.label) + r"\}", | ||||
|             str(figure.index), | ||||
|             self.tex_src, | ||||
|         ) | ||||
| ``` | ||||
|  | ||||
| Secondly we also replace the latex references with plain text references. This | ||||
| means that instead of using labels that are translated during compilation into | ||||
| numbers we directly reference the figure number. | ||||
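
To see why the reverse order matters, here is a minimal stand-alone sketch of both steps on a toy string. The helper names `replace_spans` and `replace_refs`, the `[FIG-A]` placeholders and the sample text are all made up for illustration and are not part of the converter itself.

```python
import re
from typing import Dict, List, Tuple

Span = Tuple[int, int]


def replace_spans(text: str, edits: List[Tuple[Span, str]]) -> str:
    """Replace each (start, end) span, working from the end of the text backwards."""
    # Later spans are handled first, so the offsets of earlier spans stay valid.
    for (start, end), new in sorted(edits, key=lambda e: e[0][0], reverse=True):
        text = text[:start] + new + text[end:]
    return text


def replace_refs(text: str, numbers: Dict[str, int]) -> str:
    """Swap every \\ref{label} for the plain figure number."""
    for label, number in numbers.items():
        # re.escape guards against labels containing regex metacharacters.
        text = re.sub(r"\\ref\{" + re.escape(label) + r"\}", str(number), text)
    return text


# Toy source: the bracketed markers stand in for whole figure environments.
source = r"one [FIG-A] two [FIG-B] three, see \ref{fig:a} and \ref{fig:b}"
a = source.index("[FIG-A]")
b = source.index("[FIG-B]")
edits = [
    ((a, a + len("[FIG-A]")), "<figure 1>"),
    ((b, b + len("[FIG-B]")), "<figure 2>"),
]

result = replace_refs(replace_spans(source, edits), {"fig:a": 1, "fig:b": 2})
print(result)
```

Because the replacement text is longer than the placeholder, applying the edits front to back would shift the second span by three characters and cut into the wrong part of the text.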
|  | ||||
| ``` python3 | ||||
| @property | ||||
| def figures(self) -> List[Figure]: | ||||
|     """Parse TEX contents for figure environments.""" | ||||
|     return [ | ||||
|         Figure( | ||||
|             span=(begin.start(), stop.end()), | ||||
|             index=index + 1, | ||||
|             files=[ | ||||
|                 elem[1] | ||||
|                 for elem in re.findall( | ||||
|                     r"\\includegraphics(.*)\{(.*)\}", | ||||
|                     self.tex_src[begin.start() : stop.end()], | ||||
|                 ) | ||||
|             ], | ||||
|             caption=self.first( | ||||
|                 re.findall( | ||||
|                     r"\\caption\{(.*)\}", | ||||
|                     self.tex_src[begin.start() : stop.end()], | ||||
|                 ) | ||||
|             ), | ||||
|             label=self.first( | ||||
|                 re.findall( | ||||
|                     r"\\label\{(.*)\}", | ||||
|                     self.tex_src[begin.start() : stop.end()], | ||||
|                 ) | ||||
|             ), | ||||
|         ) | ||||
|         for index, (begin, stop) in enumerate( | ||||
|             zip( | ||||
|                 re.finditer(r"\\begin\{figure\*?\}", self.tex_src), | ||||
|                 re.finditer(r"\\end\{figure\*?\}", self.tex_src), | ||||
|             ) | ||||
|         ) | ||||
|     ] | ||||
| ``` | ||||
|  | ||||
| The piece of python code above exemplifies how we capture all figures found in | ||||
| the latex source code and aggregate them in a list of named-tuples. Naturally | ||||
| this is dependent on the style used when writing latex but I generally try | ||||
| to keep latex-code as simple and systematic as possible. | ||||
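
For experimenting outside the class, the same capture step can be sketched as a stand-alone function. This is a variant rather than the code above: it uses one non-greedy pattern per figure environment instead of zipping two `finditer` calls, and the names `FigureInfo`, `find_figures` and `first`, as well as the sample latex, are assumptions made for this example.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class FigureInfo(NamedTuple):
    """Hypothetical record for one figure environment."""

    span: Tuple[int, int]
    number: int
    files: List[str]
    caption: Optional[str]
    label: Optional[str]


# One pattern for the whole environment; DOTALL lets "." cross line breaks.
FIGURE_RE = re.compile(r"\\begin\{figure\*?\}(.*?)\\end\{figure\*?\}", re.DOTALL)


def first(pattern: str, text: str) -> Optional[str]:
    """Return the first capture group of the first match, or None."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


def find_figures(tex: str) -> List[FigureInfo]:
    """Collect all figure environments in document order."""
    return [
        FigureInfo(
            span=match.span(),
            number=number,
            files=re.findall(
                r"\\includegraphics(?:\[[^\]]*\])?\{([^}]*)\}", match.group(1)
            ),
            # [^}]* stops at the first closing brace, so nested braces in a
            # caption are not handled by this sketch.
            caption=first(r"\\caption\{([^}]*)\}", match.group(1)),
            label=first(r"\\label\{([^}]*)\}", match.group(1)),
        )
        for number, match in enumerate(FIGURE_RE.finditer(tex), start=1)
    ]


sample = r"""
Intro text.
\begin{figure}
  \includegraphics[width=\textwidth]{images/plot.svg}
  \caption{A simple plot.}
  \label{fig:plot}
\end{figure}
See Figure \ref{fig:plot}.
"""

figures = find_figures(sample)
print(figures)
```

Matching begin and end in a single pattern avoids mispairing when a source mixes `figure` and `figure*`, at the cost of not supporting nested figure environments.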
|  | ||||