A follow-up: a streaming #xmlstarlet solution for #ods to #csv that doesn't read everything into memory. `table:` is the XML namespace prefix; I simply hard-coded it.

```sh
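# Stream content.xml out of the archive and convert it to PYX (one node per line).
7z x -so my.ods content.xml |
xmlstarlet pyx | awk '
# end of a row: terminate the CSV line
/^\)table:table-row/ { printf "\n" }
# start of a row: no cell seen yet
/^\(table:table-row/ { mcell = 0 }
# start of a cell: emit a separator before every cell but the first
/^\(table:table-cell/ { if (mcell == 1) printf ","; mcell = 1 }
# text inside a cell: drop escaped newlines, commas, backslashes and quotes
(mcell && /^-/) {
    gsub(/\\n|,|\\|"/, "")
    printf "%s", substr($0, 2)
}'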
```
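
To see what the awk rules react to, here is a minimal sketch that feeds them a hand-written PYX fragment instead of a real spreadsheet. The sample lines are an assumption about what `xmlstarlet pyx` emits for a two-cell row, and the function names `pyx_sample` and `pyx_rows_to_csv` are made up for illustration.

```sh
# Hand-written PYX for one spreadsheet row with two cells. This is an assumed
# shape of the `xmlstarlet pyx` output, not captured from a real file.
pyx_sample() {
    cat <<'EOF'
(table:table-row
(table:table-cell
Aoffice:value-type string
(text:p
-name
)text:p
)table:table-cell
(table:table-cell
Aoffice:value-type string
(text:p
-a "quoted", value
)text:p
)table:table-cell
)table:table-row
EOF
}

# The same awk rules as in the pipeline, wrapped in a function (hypothetical name).
pyx_rows_to_csv() {
    awk '
    /^\)table:table-row/  { printf "\n" }
    /^\(table:table-row/  { mcell = 0 }
    /^\(table:table-cell/ { if (mcell == 1) printf ","; mcell = 1 }
    (mcell && /^-/) {
        gsub(/\\n|,|\\|"/, "")
        printf "%s", substr($0, 2)
    }'
}

pyx_sample | pyx_rows_to_csv   # prints: name,a quoted value
```

Commas and quotes inside a cell are dropped rather than escaped, so the result stays one field per cell but is lossy for such values.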

#libreoffice

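I wanted to handle the ODF spreadsheet format from node.js without installing all of LibreOffice just to convert ods to csv. In the end I found #xmlstarlet still does the job, so I paired it with awk and solved it that way.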

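After a long while I remembered my old problem of wanting to specify an XPath default namespace. Digging through the code, I found that past me had actually finished it, although the code is a bit dirty; libxml doesn't support XSLT 2.0, so a hack was the only option.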
https://sourceforge.net/u/gholk/xmlstar/

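Back then I also merged someone else's HTML support patch. But xmlstarlet is a project its author stopped updating long ago, with a pile of abandoned patches and no active fork. Even so, it is still the go-to tool for processing XML on Linux today.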

gholk / XMLStarlet command line XML toolkit - Code / [fc0ae8]
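
For context, a minimal sketch of the usual workaround in unpatched xmlstarlet (not the fork's approach): bind an arbitrary prefix to the default namespace URI with `-N` and use that prefix in the XPath. The helper name `xhtml_title` and the sample document are made up for illustration, and the binary is assumed to be installed as `xmlstarlet`.

```sh
# Hypothetical helper: print the <title> of an XHTML document read from stdin.
# The prefix "x" is arbitrary; it only has to be bound to the namespace URI
# that the document declares as its default namespace.
xhtml_title() {
    xmlstarlet sel -N x='http://www.w3.org/1999/xhtml' -t -v '//x:title' -n -
}

printf '%s\n' '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Hello</title></head></html>' |
    xhtml_title
```

Without the prefix, `//title` would select nothing here, because an unprefixed name in XPath 1.0 always refers to no namespace.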

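#xmlstarlet: two ways to insert text at the start of an XML element.
Text nodes in the SGML family really are second-class citizens: something you can see and yet cannot see.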

```bash
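# 1. insert a new text node before the first text node of <title>
#    (selects nothing if <title> has no text node yet)
xml ed -i '/head/title/text()[1]' -t text -n _ -v prefix-
# 2. replace the content of <title> with the result of an XPath expression
#    (child elements are flattened to their string value)
xml ed -u '/head/title' -x 'concat("prefix-", .)'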
```

@jacobydave @qmacro @minego Re: current 21st century data, I think #jq is the #JSON-driven spiritual successor to #awk’s streams of records and fields: https://jqlang.github.io/jq/

For early 21st century #XML, I remember #xmlstarlet, but it’s fallen into disrepair: https://xmlstar.sourceforge.net

@nabijaczleweli a bit late, but I did it.

xmlstarlet sel -E UTF-8 -T -t --var lc=\'qwertyuiopasdfghjklzxcvbnm\' --var uc=\'QWERTYUIOPASDFGHJKLZXCVBNM\' --var cur="translate('$cur', \$lc, \$uc)" --var z="$z" -m '//pozycja[kod_waluty = $cur]' --var 'mul=przelicznik' --var 'cost=translate(kurs_sredni, ",", ".")' --var 'tocur=$z * $mul div $cost' --var 'topln=$z * $cost div $mul' --var 'lcur=string-length(format-number($tocur, "#"))' --var 'lpln=string-length(format-number($topln, "#"))' -v '$z' -o ' zł = ' -v 'str:padding($lpln - $lcur, " ")' -v 'format-number($tocur, "#.0000")' -o ' ' -v '$cur' -n -v '$z' -o ' ' -v '$cur' -o ' = ' -v 'str:padding($lcur - $lpln, " ")' -v 'format-number($topln, "0.0000")' -o ' zł' -n -b "$a" "$b"

Requires $a, $b (files), $cur (currency, even the automated uppercasing is implemented) and $z to be set in the surrounding shell execution environment, like yours. Everything else is done in EXSLT 1.0 and XPath. Outputs nothing if the currency is not found, two lines if it shows up in both lists.

Enjoy!

#xmlstarlet #XSLT #EXSLT #XPath #shell #TextTools
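
The arithmetic behind `$tocur` and `$topln` can be checked outside XSLT. This is a minimal sketch with made-up numbers; the function name `convert` is hypothetical, and the three arguments stand in for `$z`, `przelicznik` and `kurs_sredni`.

```sh
# usage: convert AMOUNT PRZELICZNIK KURS_SREDNI   (all sample values are invented)
convert() {
    # same job as translate(kurs_sredni, ",", "."): decimal comma to decimal point
    cost=$(printf '%s' "$3" | tr ',' '.')
    LC_ALL=C awk -v z="$1" -v mul="$2" -v cost="$cost" 'BEGIN {
        printf "%.4f\n", z * mul / cost   # $tocur: z zł expressed in the currency
        printf "%.4f\n", z * cost / mul   # $topln: z units of the currency in zł
    }'
}

convert 100 1 '4,50'
# prints:
# 22.2222
# 450.0000
```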

quick word count of presenter notes in #libreoffice #impress with #cli tools #xmlstarlet ✨ :

unzip -p yourpresentation.odp content.xml | xmlstarlet sel -t -v '//presentation:notes//text:span' - | wc -w

New Episode: hpr3962 :: It's your data

Hosted by Ken Fallon on 2023-10-10, flagged as Clean, and released under a CC-BY-SA license.

Tags: #response, #bash, #rss, #xml, #xmlstarlet

https://hackerpublicradio.org/eps/hpr3962/index.html

#xmlstarlet
Create a list of links contained on each slide of a LibreOffice Impress presentation.
#xmlstarlet extract text from ALTO-XML v. 4.3 observing the reading order specified in the BASEDIRECTION argument
Extract text lines from ALTO-XML files with ✨ #xmlstarlet