Title: [Text Processing] [Solved] Help: batch script to extract keywords from a set of XML files and write them to a CSV
Author: zhengwei007 Time: 2024-3-27 19:45
Last edited by zhengwei007 on 2024-3-27 23:38
I have a number of XML files with content like the following:
    <?xml version="1.0" encoding="UTF-8"?>
    <list xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="../../xsd/multisell.xsd">
        <npcs>
            <npc>32615</npc> <!-- Ishuma (Maestro) -->
        </npcs>
        <item>
            <!-- Vesper Cutter -->
            <ingredient count="1" id="13457" />
            <!-- Sirra's Blade -->
            <ingredient count="1" id="8678" />
            <!-- Dualsword Craft Stamp -->
            <ingredient count="1" id="5126" />
            <!-- Adena -->
            <ingredient count="7168000" id="57" />
            <!-- Vesper Dual Sword -->
            <production count="1" id="52" />
        </item>
        <item>
            <!-- Stormbringer -->
            <ingredient count="1" id="72" />
            <!-- Caliburs -->
            <ingredient count="1" id="75" />
            <!-- Dualsword Craft Stamp -->
            <ingredient count="1" id="5126" />
            <!-- Crystal (C-Grade) -->
            <ingredient count="183" id="1459" />
            <!-- Adena -->
            <ingredient count="548100" id="57" />
            <!-- Stormbringer*Caliburs -->
            <production count="1" id="2566" />
        </item>
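I would like a batch script that writes the contents of all of these files to a single CSV file. I will write the header row myself. The format is as follows:
    ingredientID count ingredientID count ingredientID count ingredientID count ingredientID count productionID count
    13457 1 8678 1 5126 1 57 7168000 52 1
    72 1 75 1 5126 1 1459 183 57 548100 2566 1
That is the final format. Thanks!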
Author: ppll2030 Time: 2024-3-27 22:27
Last edited by ppll2030 on 2024-3-27 22:30
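This only processes the XML files in the same directory as the script:
    @echo off & setlocal enabledelayedexpansion
    rem Loop over every XML file in the current directory.
    for /f "delims=" %%f in ('dir /b /a-d "*.xml"') do (
        rem Split each matching line on the angle bracket, equals sign and double quote. Leading indentation becomes token 1.
        for /f tokens^=1^-5^ delims^=^<^=^" %%1 in ('findstr /ic:"item" /ic:"count=" "%%f"') do (
            rem An opening item tag starts a new row.
            if "%%2" == "item>" set "v="
            rem Token 5 is the id and token 3 is the count.
            if "%%2" == "ingredient count" set "v=!v!, %%5, %%3"
            rem A production line completes the row and prints it.
            if "%%2" == "production count" echo, !v:~1!, %%5, %%3
        )
    )>>res.csv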
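To see why the script above reads tokens 2, 3 and 5, it can help to mimic how `for /f` splits one line when the delimiters are `<`, `=` and `"`. The Python sketch below is only an illustration and not part of the thread: the helper name `split_like_for_f` is made up here, and it approximates `for /f` by treating a run of delimiters as one separator and dropping empty tokens. It assumes the line is indented, so the leading whitespace becomes token 1.

```python
import re

def split_like_for_f(line, delims='<="', max_tokens=5):
    """Approximate `for /f` tokenizing: a run of delimiters acts as one separator."""
    pattern = "[" + re.escape(delims) + "]+"
    # Drop empty strings so that leading or trailing delimiters do not create tokens.
    tokens = [t for t in re.split(pattern, line) if t != ""]
    return tokens[:max_tokens]

line = '    <ingredient count="1" id="13457" />'
tokens = split_like_for_f(line)
# tokens[0] is the indentation (token 1), tokens[1] the tag text (token 2),
# tokens[2] the count (token 3) and tokens[4] the id (token 5).
print(tokens)
```

If a file had no indentation, every token would shift down by one position, which is why the batch comparison against token 2 relies on indented input.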
Author: zhengwei007 Time: 2024-3-27 23:37
This only processes the XML files in the same directory as the script
Posted by ppll2030 on 2024-3-27 22:27
Amazing, thank you so much!
Author: hfxiang Time: 2024-3-28 11:35
Reply to 1# zhengwei007
An implementation using the third-party tool gawk ( http://bcn.bathome.net/tool/4.1.0/gawk.exe ) is as follows:
    gawk -v"FS=\042" "BEGIN{print \"ingredientID\tcount\tingredientID\tcount\tingredientID\tcount\tingredientID\tcount\tingredientID\tcount\tproductionID\tcount\"}/^[\t ]*<item>$/,/^[\t ]*<\/item>$/{if(/^[\t ]*<ingredient count=.+id=.+>$/){A[++i]=$4;B[i]=$2}if(/^[\t ]*<production count=.+id=.+>$/){A[6]=$4;B[6]=$2;C=A[1]\"\t\"B[1];for(i=2;i<7;i++)C=C\"\t\"A[i]\"\t\"B[i];print C;delete A;delete B;i=0}}" *.xml>sour.csv
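For readers without a Windows shell or gawk at hand, the same extraction can be sketched with Python's standard-library XML parser. This is not from the thread. The names `rows_from_xml` and `write_csv`, the output file `res.csv` and the comma-separated output are assumptions chosen for illustration. It also assumes each file is well-formed XML and that `<production>` follows the `<ingredient>` elements inside each `<item>`, as in the sample. No header row is written, since the original poster planned to add it by hand.

```python
import csv
import glob
import xml.etree.ElementTree as ET

def rows_from_xml(source):
    """Yield one flat row per <item>: id and count of each ingredient, then of the production."""
    root = ET.parse(source).getroot()
    for item in root.iter("item"):
        row = []
        # Children are visited in document order; XML comments are skipped by the parser.
        for child in item:
            if child.tag in ("ingredient", "production"):
                row += [child.get("id"), child.get("count")]
        yield row

def write_csv(pattern="*.xml", out_path="res.csv"):
    """Write the rows of every file matching `pattern` to a single CSV file."""
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        for path in sorted(glob.glob(pattern)):
            writer.writerows(rows_from_xml(path))

if __name__ == "__main__":
    write_csv()
```

Because it uses a real XML parser, this version does not depend on indentation or on each tag sitting on its own line, unlike the line-based approaches.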
Welcome to 批处理之家 (http://www.bathome.net/)