Last edited by hfxiang on 2023-2-8 20:18
Save the following as a.awk:
- BEGIN {
-     # Split the input into one record per JSON object at each "},{"
-     RS = "},{"
-     # CSV header row, one quoted name per column
-     print "\"Title\",\"Summary\",\"PageSum\",\"Id\",\"Pid\""
- }
- {
-     # gensub() is gawk-specific: replace the whole record with the captured value, wrapped in quotes
-     Title = gensub(/^.*"Title":"([^"]+)",".*$/, "\"\\1\"", "g", $0)
-     Summary = gensub(/^.*"Summary":"([^"]+)",".*$/, "\"\\1\"", "g", $0)
-     PageSum = gensub(/^.*"PageSum":"([^"]+)",".*$/, "\"\\1\"", "g", $0)
-     Id = gensub(/^.*"Id":"([^"]+)",".*$/, "\"\\1\"", "g", $0)
-     Pid = gensub(/^.*"Pid":"([^"]+)".*$/, "\"\\1\"", "g", $0)
-     print Title "," Summary "," PageSum "," Id "," Pid
- }
Download gawk ( http://bcn.bathome.net/tool/4.1.0/gawk.exe ), then run:
- gawk -f .\a.awk input.txt > output.txt
Result:
- "Title","Summary","PageSum","Id","Pid"
- "敏儿演剧史:剧本类","上海:商务印书馆,** 二十二年十月 [1933.10] 印行:王云五发行","101页","4b1e7731b001378fe85e3470b23e19ab","f721283beed81168a7f56770c2e37f42"
- "上海新学会社图书目录","上海:上海新学会社,[1936]","1 册","18046902990da9f8c0e3b7081c62b7d9","e24b0fcfdd69228163ee2a40179b3a10"
- "京师译学馆生理卫生学讲义","[出版地不详] : [出版者不详], [19??]","1册 ","56373fe68f3834c486e0edfd3682ed96","81d7172b2f060a842cc78a0486c16647"
- "中国中部奥陶纪头足类化石","[北京] : 农矿部地质调查所:国立北平研究院地学研究,** 十九年七月 [1930.7] 印行","18,101页","0b6ea856ab6a52c74415ced060d1028c","7c902869e5461cc9bdb309aa3a0fd9a7"
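For anyone without gawk, here is a rough sketch of the same extraction that avoids gensub() and a multi-character RS, so it should run under any POSIX awk. It is shown as a POSIX shell function; on Windows, save the awk program text to a file and run it with -f as above. The names json_to_csv and field are hypothetical helpers made up for this example. The input layout is also an assumption, since the thread does not show the input file: a JSON array of flat objects with string values and no escaped quotes inside them. Unlike the gensub() version, a missing key yields an empty field rather than the whole record.

```sh
# Hypothetical helper: reads the JSON-like text on stdin, writes CSV to stdout.
json_to_csv() {
  awk '
    # Return the value of "name":"value" inside rec, or "" when the key is absent.
    function field(rec, name,    pat) {
      pat = "\"" name "\":\"[^\"]*\""
      if (!match(rec, pat)) return ""
      # Skip the leading "name":" part and drop the closing quote.
      return substr(rec, RSTART + length(name) + 4, RLENGTH - length(name) - 5)
    }
    BEGIN { print "\"Title\",\"Summary\",\"PageSum\",\"Id\",\"Pid\"" }
    {
      # Split each input line into objects at every "},{" boundary.
      n = split($0, recs, /[}],[{]/)
      for (i = 1; i <= n; i++)
        printf "\"%s\",\"%s\",\"%s\",\"%s\",\"%s\"\n",
          field(recs[i], "Title"), field(recs[i], "Summary"),
          field(recs[i], "PageSum"), field(recs[i], "Id"),
          field(recs[i], "Pid")
    }
  '
}

# Usage with made-up sample data (assumed input layout):
printf '%s\n' '[{"Title":"A","Summary":"B, C","PageSum":"1","Id":"x1","Pid":"y1"},{"Title":"D","Summary":"E","PageSum":"2","Id":"x2","Pid":"y2"}]' | json_to_csv
# "Title","Summary","PageSum","Id","Pid"
# "A","B, C","1","x1","y1"
# "D","E","2","x2","y2"
```

Because every value is wrapped in quotes, commas inside a value (as in the Summary column) do not break the CSV. A value that itself contains a double quote would still need extra handling in both versions.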