Title: [Text Processing] [Solved] How can I get past the token-count limit of For /F tokens in batch files?

Author: hnfeng    Time: 2024-5-25 14:44     Title: [Solved] How can I get past the token-count limit of For /F tokens in batch files?

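I have a comma-delimited CSV file with 47 columns.
I want to export columns 1, 14, 26, 38, 40 and 42 into a new CSV file, but for /f "skip=1 tokens=1,14,26,38,40,42 delims=," %%a in (test.csv) do ... does not work, apparently because tokens can address at most 31 tokens.

My current idea is to grab columns 1 and 14 plus everything from column 26 onward, then process that remainder a second time, which feels rather cumbersome. I also searched the third-party tool library but found no suitable program.

Could the experts here please advise? Thanks.

Here are the first few lines of the file; more rows follow: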
  ,/lpc/it8620e/control/0,/lpc/it8620e/control/1,/lpc/it8620e/control/2,/lpc/it8620e/voltage/0,/lpc/it8620e/voltage/1,/lpc/it8620e/voltage/2,/lpc/it8620e/voltage/3,/lpc/it8620e/voltage/4,/lpc/it8620e/voltage/5,/lpc/it8620e/voltage/6,/lpc/it8620e/voltage/7,/lpc/it8620e/voltage/8,/lpc/it8620e/temperature/0,/lpc/it8620e/temperature/2,/lpc/it8620e/fan/0,/intelcpu/0/load/1,/intelcpu/0/load/2,/intelcpu/0/load/3,/intelcpu/0/load/4,/intelcpu/0/load/0,/intelcpu/0/temperature/0,/intelcpu/0/temperature/1,/intelcpu/0/temperature/2,/intelcpu/0/temperature/3,/intelcpu/0/temperature/4,/intelcpu/0/clock/1,/intelcpu/0/clock/2,/intelcpu/0/clock/3,/intelcpu/0/clock/4,/intelcpu/0/power/0,/intelcpu/0/power/1,/intelcpu/0/power/2,/intelcpu/0/power/3,/intelcpu/0/clock/0,/hdd/0/temperature/0,/hdd/0/load/0,/hdd/1/temperature/0,/hdd/1/load/0,/hdd/2/temperature/0,/hdd/2/load/0,/hdd/3/temperature/0,/hdd/3/load/0,/hdd/4/temperature/0,/hdd/4/load/0,/hdd/5/temperature/0,/hdd/5/load/0
  Time,Fan Control #1,Fan Control #2,Fan Control #3,Voltage #1,Voltage #2,Voltage #3,Voltage #4,Voltage #5,Voltage #6,Voltage #7,Standby +3.3V,VBat,Temperature #1,Temperature #3,Fan #1,CPU Core #1,CPU Core #2,CPU Core #3,CPU Core #4,CPU Total,CPU Core #1,CPU Core #2,CPU Core #3,CPU Core #4,CPU Package,CPU Core #1,CPU Core #2,CPU Core #3,CPU Core #4,CPU Package,CPU Cores,CPU Graphics,CPU DRAM,Bus Speed,Temperature,Used Space,Temperature,Used Space,Temperature,Used Space,Temperature,Used Space,Temperature,Used Space,Temperature,Used Space
  05/24/2024 00:03:59,,,,0.156,2.04,2.052,2.04,0.012,1.728,1.512,3.384,3.048,28,23,1912.18127,7.69230747,41.53846,16.9230766,6.153846,18.07692,31,28,21,26,31,800.0051,800.0051,800.0051,800.0051,7.07779026,0.8458846,0.002768853,2.3195765,100.000641,27,28.538063,42,97.98176,39,49.0412674,43,97.9916458,30,88.13136,32,90.882576
  05/24/2024 00:08:59,,,,0.420000017,2.04,2.052,2.04,0.012,1.728,1.512,3.384,3.048,28,23,1912.18127,32.30769,7.69230747,16.9230766,3.076923,14.9999981,31,29,21,25,31,800.004761,800.004761,800.004761,800.004761,6.94404268,0.880736053,0.002829046,2.232779,100.000595,27,28.5380554,42,97.98176,39,49.0412674,43,97.9916458,30,88.13136,32,90.882576
  05/24/2024 00:13:59,,,,0.744,2.04,2.052,2.04,0.012,1.728,1.512,3.384,3.048,28,23,1917.61365,16.9230766,0,0,13.8461533,7.69230747,32,28,20,25,31,800.005,800.005,800.005,800.005,5.744312,0.430073529,0.00227422,2.047413,100.000626,27,28.5380554,42,97.98176,40,49.0412674,43,97.9916458,30,88.13136,32,90.882576

Author: newswan    Time: 2024-5-25 15:20

Last edited by newswan on 2024-5-25 18:29

Reply to 1# hnfeng


  Use awk:
  awk -F',' 'BEGIN{OFS=FS} {print $1,$14,$26,$38,$40,$42}' %filename%
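As a quick check of the awk approach, here is a minimal sketch for a POSIX shell (for example the MSYS2 environment). It builds one synthetic 47-column row in which columns 2-4 are empty, like the sample rows in post 1, and picks the six requested columns. The helper names `make_row` and `pick_cols` and the `cN` cell values are made up for illustration.

```shell
# Build one synthetic 47-column row: column N holds "cN", except columns 2-4,
# which are left empty to mimic the sample rows in post 1.
make_row() {
  awk 'BEGIN {
    for (i = 1; i <= 47; i++) {
      v = (i >= 2 && i <= 4) ? "" : "c" i
      printf "%s%s", v, (i < 47 ? "," : "\n")
    }
  }'
}

# Pick columns by position. With a single-character field separator,
# awk treats every comma as a boundary, so empty columns keep their position.
pick_cols() {
  awk -F',' 'BEGIN{OFS=FS} {print $1,$14,$26,$38,$40,$42}'
}

make_row | pick_cols
# prints: c1,c14,c26,c38,c40,c42
```

Setting OFS=FS in the BEGIN block makes the output comma-separated as well, so the result is again a CSV line.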

Author: hnfeng    Time: 2024-5-25 16:03

Reply to hnfeng


  Use awk
newswan posted on 2024-5-25 15:20



   Where can I download awk? I found gawk here: http://www.bathome.net/s/tool/?key=awk but the command above fails with an error.
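Author: newswan    Time: 2024-5-25 16:14

Last edited by newswan on 2024-5-25 18:42

Reply to 3# hnfeng


MSYS2 is a fairly large download, but you could try it:
https://www.msys2.org/

With gawk64.exe, change the single quotes to double quotes (cmd.exe does not treat single quotes as quoting characters):
  gawk64.exe -F"," "BEGIN{OFS=FS} {print $1,$14,$26,$38,$40,$42}" "%filename%"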

Author: 娜美    Time: 2024-5-25 16:15

  cut -d , -f 1-32 file

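Author: aloha20200628    Time: 2024-5-25 17:08

Last edited by aloha20200628 on 2024-5-25 17:41

Reply to 1# hnfeng

Here is a pure batch version. Save the code below as test.bat and run it; the extracted result is written to test.new.csv.
It assumes the source *.csv file is ANSI-encoded (i.e. Simplified Chinese) and that no data line contains tabs or equals signs.
The columns to extract can be customized on the second line (set "lst=..."): just wrap each column number in slashes.
  @echo off &setlocal enabledelayedexpansion
  set "lst=/1/14/26/38/40/42/"
  rem Read each line, split it with a plain for loop, and keep the columns listed in lst
  (for /f "delims=" %%a in (test.csv) do (
    set "a=%%~a"&set "a=!a: =┴!"&set "a=!a:;=┬!"&set "n=0"&set "v="
    for %%b in (!a!) do (
      set/a "n+=1"
      for %%i in (!n!) do if "!lst:/%%i/=!" neq "!lst!" (set "v=!v!,%%b")
    )
    set "v=!v:┴= !"&set "v=!v:┬=;!"&echo,!v:~1!
  ))>"test.new.csv"
  endlocal&pause&exit/b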

Author: hnfeng    Time: 2024-5-25 19:12

Reply to hnfeng


gawk64.exe -F"," "BEGIN{OFS=FS} {print $1,$14,$26,$38,$40,$42}" "%filename%"


This command works perfectly. Many thanks.
Author: hnfeng    Time: 2024-5-25 19:13

娜美 posted on 2024-5-25 16:15



    I downloaded cut.exe and tried it, but got no output. I am probably just using it wrong.
Author: hnfeng    Time: 2024-5-25 19:15

Reply to hnfeng

Here is a pure batch version. Save the code below as test.bat and run it; the extracted result is written to test.new.csv.
It assumes the source * ...
aloha20200628 posted on 2024-5-25 17:08


Thanks, but the result is wrong.
The correct output should be:
  Time,Temperature #1,CPU Package,Temperature,Temperature,Temperature
  05/24/2024 00:03:59,28,31,42,39,43
  05/24/2024 00:08:59,28,31,42,39,43
  05/24/2024 00:13:59,28,31,42,40,43
The batch file actually produced:
  Time,Temperature #1,CPU Package,Temperature,Temperature,Temperature
  05/24/2024 00:03:59,7.69230747,800.0051,49.0412674,97.9916458,88.13136
  05/24/2024 00:08:59,32.30769,800.004761,49.0412674,97.9916458,88.13136
  05/24/2024 00:13:59,16.9230766,800.005,49.0412674,97.9916458,88.13136

Author: 娜美    Time: 2024-5-25 20:26

I downloaded cut.exe and tried it, but got no output. I am probably just using it wrong.
hnfeng posted on 2024-5-25 19:13
  cut -d , -f 1,14,26,38,40,42 file
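A small sketch of how cut treats the field list, assuming a POSIX shell with the standard cut utility. The shortened six-column row and the helper name `pick_with_cut` are made up for illustration. Two points are visible: empty columns still count, and cut always emits fields in file order, whatever order the list is written in.

```shell
# Shortened synthetic row: columns 2-4 are empty, as in the sample rows of post 1.
row='05/24/2024 00:03:59,,,,0.156,2.04'

# cut counts every delimiter, so empty columns keep their position.
pick_with_cut() {
  cut -d , -f "$1"
}

printf '%s\n' "$row" | pick_with_cut 1,5,6
# prints: 05/24/2024 00:03:59,0.156,2.04

printf '%s\n' "$row" | pick_with_cut 6,1
# prints: 05/24/2024 00:03:59,2.04   (file order, not list order)
```

If the output columns need to be reordered, the awk command from earlier in the thread is the better fit, since its print statement controls the order.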

Author: aloha20200628    Time: 2024-5-25 20:44

Reply to 9# hnfeng

I rechecked the sample file in post 1: rows 3-5 contain a run of consecutive empty-value delimiters (,,,,) near the start. The for loop's splitter silently skips these at run time, so the columns shift. Here is a patch for the code in post 6; the new code is below, and its output matches exactly what the OP expects.
  @echo off &setlocal enabledelayedexpansion
  set "lst=/1/14/26/38/40/42/"
  rem Prefix every field with a marker so that no field is empty, split, then strip the marker
  (for /f "delims=" %%a in (test.csv) do (
    set "a=◆%%~a"&set "a=!a:,=,◆!"&set "a=!a: =┴!"&set "a=!a:;=┬!"&set "n=0"&set "v="
    for %%b in (!a!) do (
      set/a "n+=1"
      for %%i in (!n!) do if "!lst:/%%i/=!" neq "!lst!" (set "v=!v!,%%b")
    )
    set "v=!v:◆=!"&set "v=!v:┴= !"&set "v=!v:┬=;!"&echo,!v:~1!
  ))>"test.new.csv"
  endlocal&pause&exit/b
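The column shift and the marker fix can be reproduced outside cmd.exe. The sketch below is only an emulation under stated assumptions: an awk regex separator `,+` stands in for the for loop's habit of merging runs of commas, `@` is a hypothetical marker playing the role of ◆ (assumed not to occur in the data), and the helper names are made up.

```shell
# Shortened synthetic row: columns 2-4 are empty, as in the sample rows of post 1.
row='05/24/2024 00:03:59,,,,0.156,2.04'

# Emulate a splitter that merges runs of commas into one delimiter.
collapsed_field() {   # $1 = field number
  awk -F',+' -v n="$1" '{print $n}'
}

# The patch idea: prefix every field with a marker so no field is empty,
# split with the same merging splitter, then strip the marker again.
padded_field() {      # $1 = field number
  sed -e 's/^/@/' -e 's/,/,@/g' | awk -F',+' -v n="$1" '{print $n}' | sed 's/^@//'
}

printf '%s\n' "$row" | collapsed_field 2   # prints: 0.156  (really column 5)
printf '%s\n' "$row" | padded_field 2      # prints an empty line: column 2 is empty
printf '%s\n' "$row" | padded_field 5      # prints: 0.156
```

With the marker in place, the field number seen by the splitter equals the real CSV column number again, which is why the patched batch file lines up with the expected output.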

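Author: hnfeng    Time: 2024-5-25 21:42

娜美 posted on 2024-5-25 20:26



    Many thanks, this works perfectly.
Author: hnfeng    Time: 2024-5-25 21:43

Reply to hnfeng

I rechecked the sample file in post 1: rows 3-5 contain a run of consecutive empty-value delimiters (,,,,) near the start, which the for loop's splitter ...
aloha20200628 posted on 2024-5-25 20:44



    Many thanks, this one works perfectly now.

Thanks to everyone above for the help.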
Author: Five66    Time: 2024-5-25 22:29

Last edited by Five66 on 2024-5-25 22:37
  @echo off
  rem Append a marker before every comma so empty fields survive, then split in two for /f passes
  (
  for /f "skip=1 eol=□ delims=" %%a in (test.csv) do (
    set "@line=%%a"
    setlocal enabledelayedexpansion
    for /f "tokens=1,14,26* delims=," %%b in ("!@line:,=□,!") do (
      endlocal
      set "@out=%%b,%%c,%%d"
      for /f "tokens=12,14,16 delims=," %%f in ("%%e") do (
        set "@out2=%%f,%%g,%%h"
      )
    )
    setlocal enabledelayedexpansion
    set "str=!@out:□=!,!@out2:□=!"
    echo(!str!|| ^)?
    endlocal
  ))>the_new_test.csv
  echo,done
  pause&exit/b

Author: 77七    Time: 2024-5-26 00:43

The OP's problem is already solved.
This version takes the content of post 1 as the text file 1.txt (with no special handling of the commas).
  @echo off
  cd /d "%~dp0"
  for /f "useback delims=" %%x in ("1.txt") do (
    setlocal
    call :getTokens "%%x" "," "1 14 26 38 40 42"
    setlocal enabledelayedexpansion
    echo=[!`str!]
    endlocal
    endlocal
  )
  pause
  exit
  :getTokens <string> <delimiter> <columns>
  if "%~2" neq "" (
    set "`d=%~2"
    for %%a in (%~3) do (
      set "`%%a=1"
    )
  )
  set /a tk+=1
  for /f "tokens=1* delims=%`d%" %%a in ("%~1") do (
    if defined `%tk% (
      if not defined `str (
        set "`str=%%a"
      ) else (
        rem Results are joined with ","
        set "`str=%`str%,%%a"
      )
    )
    if "%%b" neq "" (
      call :getTokens "%%b"
    ) else (
      exit /b
    )
  )
  exit /b

Author: newswan    Time: 2024-5-26 08:04

The problem with for is that empty values are skipped; it is not a matter of the number of tokens.
That differs from CSV semantics, so the batch for statement is a poor fit here; use awk or PowerShell instead.
  set _str_=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48
  for /f "tokens=1,14,26,* delims=," %%a in ( "%_str_%" ) do (
    for /f "tokens=12,14,16 delims=," %%e in ( "%%d" ) do (
      echo %%a,%%b,%%c,%%e,%%f,%%g
    )
  )
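The token numbers 12, 14 and 16 in the inner loop come from simple offset arithmetic: once the first pass has consumed columns 1-26, the remainder starts at column 27, so column k of the original line is token k - 26 of the remainder (38 - 26 = 12, 40 - 26 = 14, 42 - 26 = 16). The sketch below checks that arithmetic in a POSIX shell with cut standing in for the two for /f passes; the helper name `two_stage` is made up, and the mapping only holds when no field is empty, for the reason given in the post above.

```shell
# Same kind of test string as in the post: value N sits in column N (1..48).
str=$(awk 'BEGIN { for (i = 1; i <= 48; i++) printf "%s%s", i, (i < 48 ? "," : "\n") }')

# Two-pass extraction: take columns 1,14,26 and the remainder from column 27,
# then take tokens 12,14,16 of the remainder (original columns 38,40,42).
two_stage() {
  first=$(printf '%s\n' "$1" | cut -d , -f 1,14,26)
  rest=$(printf '%s\n' "$1" | cut -d , -f 27-)
  second=$(printf '%s\n' "$rest" | cut -d , -f 12,14,16)
  printf '%s,%s\n' "$first" "$second"
}

two_stage "$str"
# prints: 1,14,26,38,40,42
```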





Welcome to 批处理之家 (http://www.bathome.net/) Powered by Discuz! 7.2