
Title: [Text processing] [Solved] How can a batch script bulk-delete the last paragraph of TXT files?

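Author: zxzl    Time: 2010-5-21 15:27     Title: [Solved] How can a batch script bulk-delete the last paragraph of TXT files?

I have a batch of articles in TXT format, and most of them end with a junk final paragraph. Can a batch script do the following?

1. Iterate over all TXT files in the current folder. For each file, first check whether it ends with line breaks or spaces, and if so, delete them.

2. Delete the last paragraph of each TXT file and overwrite the original file in place.

How would I write a batch script for this? Thanks.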

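[ Last edited by zxzl on 2010-5-22 11:54 ]
Author: hanyeguxing    Time: 2010-5-21 16:37

What marks a paragraph?
1. Does a line that starts with “  ” begin a new paragraph?
2. Is each line a paragraph?
3. Are paragraphs separated by blank lines?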
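Author: zxzl    Time: 2010-5-21 16:59

Originally posted by hanyeguxing at 2010-5-21 16:37
What marks a paragraph?
1. Does a line that starts with “  ” begin a new paragraph?
2. Is each line a paragraph?
3. Are paragraphs separated by blank lines?


I want to delete the last natural paragraph of every article, which I think amounts to deleting the last line. Take this TXT file as an example: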

Drs Secret A Trusted Range To Repair And Renew Your Skin

If you are bothered by any of the common skin problems like acne, fine lines, wrinkles, or pigmentation, Drs Secret skin care range is a safe and effective way of banishing your woes. This revolutionary skin care program is rich in natural vitamins and minerals that will help to repair, rejuvenate and renew your skin from within its deepest layers.

Skin ageing is a result of the daily dose you receive of sun, pollutants, free radicals and other such factors that damage it. As you grow older, your metabolism slows down. The collagen in the skin takes longer to be replaced. Hormonal and other changes taking place within your body also affect your skin. All this shows up as fine lines, pigmentation, blemishes, open pores and other skin problems. Your complexion may look dull and unhealthy due to decreased cell renewal, dead cell build up, exposure and oxidation.

There is a bewildering array of creams that claim to work in the market. But Drs secret has shown proven results for thousands of women to help to repair and renew their skin. You will be delighted to find a cream that finally does what it claims to do!

Laura Lin is a skincare consultant for DR's Secret and care to share her experience to bring forth beautiful life and hope for those who aspires. Want to have beautiful skin, flawless complexion? Visit Laura's DR's Secret Review Site for the best skincare product you need to make you look beautiful and young. DR's Secret - Simply The Best. Seeing is Believing. Click Here


Here I want to delete the part below. I am not sure whether "last line" and "last paragraph" mean different things in this case :)
Laura Lin is a skincare consultant for DR's Secret and care to share her experience to bring forth beautiful life and hope for those who aspires. Want to have beautiful skin, flawless complexion? Visit Laura's DR's Secret Review Site for the best skincare product you need to make you look beautiful and young. DR's Secret - Simply The Best. Seeing is Believing. Click Here


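How should the batch script be written?

[ Last edited by zxzl on 2010-5-21 17:00 ]
Author: hanyeguxing    Time: 2010-5-21 20:14

Note: this only processes txt files in the current directory, skips files with the system or hidden attribute, and preserves blank lines.
1. If no line begins with the character ":", you can use: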
    @echo off&setlocal enabledelayedexpansion
    for %%a in (*.txt) do (
    for /f "delims=" %%b in ('find /c /v "" ^<"%%a"') do set b=%%b
    (for /f "tokens=1* delims=:" %%b in ('findstr /n .* "%%a"') do if %%b lss !b! echo.%%c)>$
    move $ "%%a")
2. If no line begins with the characters "[" or "]", you can use:
    @echo off&setlocal enabledelayedexpansion
    for %%a in (*.txt) do (
    for /f "delims=" %%b in ('find /c /v "" ^<"%%a"') do set b=%%b
    (for /f "skip=2 tokens=1* delims=[]" %%b in ('find /n /v "" "%%a"') do if %%b lss !b! echo.%%c)>$
    move $ "%%a")

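[ Last edited by hanyeguxing on 2010-5-21 20:23 ]
Author: zxzl    Time: 2010-5-21 20:27

Thanks for the reply. I tried it, but both scripts delete the entire content of every file and pop up a CMD window. After I close the CMD window, an empty file named $ is left behind. Could you take another look at what is going wrong?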
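Author: hanyeguxing    Time: 2010-5-21 20:37

Originally posted by zxzl at 2010-5-21 20:27
Thanks for the reply. I tried it, but both scripts delete the entire content of every file and pop up a CMD window. After I close the CMD window, an empty file named $ is left behind. Could you take another look at what is going wrong?

What encoding are your text files in? ANSI?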
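Author: zxzl    Time: 2010-5-21 20:59

Yes, they are all ANSI-encoded.

I tried again and got the same result: the content of every TXT file is deleted, and the script runs extremely slowly. That shouldn't happen, should it?

[ Last edited by zxzl on 2010-5-21 21:48 ]
Author: sgaizxt001    Time: 2010-5-21 23:53

I am not sure whether this will do. I only tested it on two text files, and the results were correct.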
    @echo off
    setlocal enabledelayedexpansion
    for %%m in (*.txt) do (
      set /a m+=1
      for /f "delims=" %%a in ('find /c /v "" %%m') do set n=%%a
      for /f "tokens=1,* delims=:" %%i in ('findstr /n .* %%m') do (
         set n=!n:~-2,1!
         if %%i lss !n! echo.%%j>>tmp_!m!.txt
      )
    del %%m
    ren tmp_!m!.txt %%m
    )
    endlocal
    pause

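Author: zxzl    Time: 2010-5-22 08:42

The code in the post above has a problem: when run, it reports that a file cannot be found, and it deletes all the TXT files in the current directory. Could you take another look?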
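
The symptoms reported here are consistent with how `find /c` prints its result. Given a file name argument, it prints a header such as `---------- NAME.TXT: 5` rather than a bare number, so `n` probably never holds a usable line count, the comparison never succeeds, and `del %%m` still runs even though no temporary file was written. A minimal sketch that avoids this, assuming one paragraph per line: it reads the count through redirected input and only replaces the original after the temporary copy exists. The `.tmp` suffix and the use of `dir /b` to fix the file list up front are choices made for this sketch, not taken from the thread.

```batch
@echo off
setlocal enabledelayedexpansion
rem Sketch: drop the last line of every TXT file in the current folder.
rem Known limits: leading colons on a line are lost, and exclamation
rem marks in file names or text are eaten by delayed expansion.
for /f "delims=" %%m in ('dir /b /a-d *.txt') do (
    rem With redirected input, find prints only the number of lines.
    for /f %%n in ('find /c /v "" ^< "%%m"') do set "n=%%n"
    (
        for /f "tokens=1* delims=:" %%i in ('findstr /n "^" "%%m"') do (
            if %%i lss !n! echo(%%j
        )
    ) > "%%m.tmp"
    rem Overwrite the original only after the filtered copy was written.
    move /y "%%m.tmp" "%%m" > nul
)
endlocal
```

If a file ends with blank lines, this removes only the final blank line, so trailing blank lines would need to be trimmed first.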
Author: del    Time: 2010-5-22 10:31

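    @echo off
    rem Count the TXT files so the progress message can show a total.
    dir /b *.txt | find /v /c "" > .tmp
    set /p Total=< .tmp
    rem find prints one "---------- NAME.TXT: N" line per file, where N is the line count.
    find /v /c "" *.txt > .tmp
    for /f "delims=: tokens=1,2" %%a in (.tmp) do call :1 "%%a" %%b
    del .tmp
    exit /b

    :1
    rem The second argument is the line count. Last2 is the number of the second-to-last line.
    set /a Last2 = %2 - 1, Count += 1
    set "File=%~1"
    rem Skipping the first 11 characters of File, the dashes and a space, leaves the file name.
    findstr /n .* "%File:~11%" > .tmp
    (
        for /f "delims=" %%a in (.tmp) do (
            set Var=%%a
            rem set /a reads the leading line number of each numbered line.
            set /a Line = Var
            SetLocal EnableDelayedExpansion
            if !Line! lss %Last2% (
                echo,!Var:*:=!
            ) else (
                rem Keep the second-to-last line only when it is not blank. The last line is dropped.
                if !Line! equ %Last2% if "!Var:*:=!" neq "" echo,!Var:*:=!
            )
            EndLocal
        )
    ) > "%File:~11%"
    cls
    echo Processed %Count% of %Total% files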
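
Two idioms carry most of the weight in the script above: `set /a Line = Var` pulls the leading line number out of an `N:text` string, and `!Var:*:=!` strips everything up to the first colon, so the original text survives even when it contains colons or exclamation marks. A minimal sketch showing both on a single sample line follows. The sample text is made up for illustration and does not come from the thread.

```batch
@echo off
rem Sample value shaped like one line of findstr /n output. The text is made up.
set "Var=12:Visit us: seeing is believing!"
rem set /a looks up Var and uses the leading digits of its value.
set /a Line = Var
setlocal EnableDelayedExpansion
rem The *: pattern removes everything up to and including the first colon.
echo Line number: !Line!
echo Line text: !Var:*:=!
endlocal
```

The value is assigned while delayed expansion is still off and only expanded after it is switched on, which is the same ordering del's script uses to keep exclamation marks in the text intact.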

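Author: zxzl    Time: 2010-5-22 11:13

Thanks, del. This code runs fast and does the job perfectly :)

A different question: to delete every paragraph that contains the string "http://" from all TXT files in the current directory, how should this code be modified?

[ Last edited by zxzl on 2010-5-22 11:20 ]
Author: del    Time: 2010-5-22 11:32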

    @echo off
    for /f "delims=" %%a in ('findstr /m "http://" *.txt') do (
        findstr /v "http://" "%%a" > .tmp
        move .tmp "%%a"
    )
To match "http://" case-insensitively, change /m and /v to /mi and /vi respectively.
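
A sketch of the case-insensitive variant described above. The `/l` and `/c:` switches are additions not in the original post; they make findstr treat `http://` as a literal string. The temporary file name `%%a.tmp` is also an assumption, chosen so that the copy sits next to the file being rewritten.

```batch
@echo off
rem List the TXT files that contain the string, ignoring case.
for /f "delims=" %%a in ('findstr /m /i /l /c:"http://" *.txt') do (
    rem Copy every line that does not contain the string to a temporary file.
    findstr /v /i /l /c:"http://" "%%a" > "%%a.tmp"
    rem Replace the original with the filtered copy.
    move /y "%%a.tmp" "%%a" > nul
)
```

As in the original, a "paragraph" here means a single line, so this only works when each paragraph sits on one line.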

[ Last edited by del on 2010-5-22 11:35 ]
Author: zxzl    Time: 2010-5-22 11:33

Many thanks, del. Very powerful :)




Welcome to 批处理之家 (http://www.bathome.net/) Powered by Discuz! 7.2