标题: [技术讨论] PowerShell下载 双色球 分布图数据【直接写入EXCEL】 [打印本页]
作者: xczxczxcz 时间: 2019-1-9 20:19 标题: PowerShell下载 双色球 分布图数据【直接写入EXCEL】
本帖最后由 xczxczxcz 于 2019-1-10 13:08 编辑
纯为下载数据。非为购彩而写。天朝彩票都是坑。世界所有程序猿一起写,都中不了天朝彩票。
一句话,想让你中就能中,不让你中只要不买完所有注数你就中不了。除非你的运势很强悍。
===非 HTML 节点法====PS3.0以上.- Function DownLotteryData {
- Param ( [Array]$DATA, [string]$year, [int]$n )
-
- $Ball=$RedBlue=$arr=$array= New-Object "System.Collections.ArrayList"
- (1..52) |ForEach-Object {$Ball += 'A'} #期号+日期+33+16
-
- for ( $i = 0; $i -lt $DATA.count; $i++ ) {
- if ( $n -le 8 ) {
- $arr += $DATA[$i] #开奖数据
- if ($n -le 2) { $num = $n } else { $num = [int](([string]($DATA[$i])).TrimStart('0')) + 2 } #分布图
- $Ball[$num] = $DATA[$i] #分布图
- $n++
- } else {
- $arr += $DATA[$i] #开奖数据
- $num = [int](([string]($DATA[$i])).TrimStart('0')) + 35 #分布图
- $Ball[$num] = $DATA[$i] #分布图
- $RedBlue += ,$Ball #33+16分布图
- $array += ,$arr #开奖数据
- $arr = $Ball = New-Object "System.Collections.ArrayList" #开奖数据 分布图
- (1..52) |ForEach-Object {$Ball += 'A'} #33+16+日期
- $n = 1
- }
- }
- $content = $AllBall = New-Object "System.Collections.ArrayList"
- for ( $i = 0; $i -lt ( $DATA.count /9 ); $i++ ) {
- $Content += ( $array[$i] -join ' ' ).Replace('.','/') #开奖数据
- $AllBall += ( $RedBlue[$i] -join ',' ).Replace('.','_').Replace('A',' ') -replace '^(\s+)?,?','' #33+16分布图
- }
- $Content | Set-Content .\开奖数据\简略数据\$year.txt -enc Default -force #开奖数据
- $AllBall | Set-Content .\开奖数据\分布图\$($year + '.csv') -enc Default -force #分布图 EXCEL
- }
- #下载数据
- Remove-Item .\开奖数据 -Rec -force -ea 0
- new-item .\开奖数据\简略数据 -type Directory -force
- new-item .\开奖数据\分布图 -type Directory -force
- for ( $year = 2003; $year -le 2019; $year++ ) {
- $url = "https://kjh.55128.cn/ssq-history-$year.htm"
- Invoke-WebRequest -uri $url -Outfile "$env:temp\Downh.log"
- $PageData = (( Get-Content "$env:temp\Downh.log" -ReadCount 0 -enc utf8 ) `
- -match '^(\s+)?<td>(\d{7}|\d{4}(\.\d{2}){2})</td>$|<li(\s+)?class.*>\d{2}</li>$' ) `
- -replace "(\s+)?</?(td)?(li)?((\s+)class='ball.*-24')?>",''
- DownLotteryData $PageData $year 1
- }
- Remove-Item "$env:temp\Downh.log" -force
- pause
复制代码
作者: 523066680 时间: 2019-1-9 21:17
本帖最后由 523066680 于 2019-1-9 22:16 编辑
写过抓双色球往期结果的,多线程,以前写的丑(发的代码不完整,好像还有另一个脚本负责导出EXCEL,一时没找着)- use Encode;
- use Modern::Perl;
- use Time::HiRes qw/time sleep/;
- use threads;
- use threads::shared;
- use Try::Tiny;
- use Mojo::UserAgent;
-
- use File::Basename;
- use File::Path qw/make_path/;
- use File::Slurp;
- STDOUT->autoflush(1);
-
- our $workdir = "D:\\Temp\\Double_Ball_Lottery";
- make_path $workdir unless -e $workdir;
- #chdir $workdir;
-
- our $ua;
- our $main = "http://kaijiang.500.com";
- our @links :shared;
- our @ths;
-
- $ua = Mojo::UserAgent->new();
- $ua = $ua->max_redirects(5);
-
- print "Getting Links ... ";
- get_links( \@links );
- say "Done";
-
- #创建线程
- grep { push @ths, threads->create( \&thread_func, $_ ) } ( 0 .. 3 );
-
- #等待运行结束
- while ( threads->list(threads::running) ) { sleep 0.2 };
-
- #线程分离/结束
- grep { $_->detach() } threads->list(threads::all);
-
- quit();
-
- sub thread_func
- {
- our (@links, $workdir);
- my ( $id ) = @_;
- my $ua = Mojo::UserAgent->new();
- my ($link, $file, $res, $times);
-
- while ( $#links > 0 )
- {
- $link = shift @links;
- $file = $workdir ."\\". basename($link);
-
- if ( -e $file ) { say "$id - $link already exists"; next };
- say "$id - $link";
-
- $times = 0;
- while (1)
- {
- try { $res = $ua->get($link)->result }
- catch
- {
- printf "[%d] getting %s, retry: %d\n", $id, basename($link), $times++;
- sleep 3.0;
- };
- last if ( defined $res and $res->is_success );
- return if ( $times > 10 );
- }
-
- write_file( $workdir ."\\". basename($link), $res->body );
- }
- }
-
- sub get_links
- {
- my ($aref) = @_;
- my $html = read_file("simple.htm");
- my $dom = Mojo::DOM->new( $html );
-
- for my $e ( $dom->at(".iSelectList")->find("a")->each )
- {
- push @$aref, $e->attr("href");
- }
- }
-
- sub quit { system("pause"); }
复制代码
原来是18年2月写的,之前抓了03年到18年1月的html,导出excel(没显示日期,就是搞个分布图),
https://share.weiyun.com/5V1MvmE
作者: xczxczxcz 时间: 2019-1-10 12:33
回复 2# 523066680
谢谢! 补充一个 直接写 EXCEL 的。*.xlsx格式。
从2003 -2019 最新约 180秒。写入EXCEL 速度感觉也可以。优化一下会更好。基本上是比较完整的 EXCEL操作。
所有的数据保存在 一个文档内,每年一个标签。闲时自用。如果只用分布图可以删除掉简略数据。- $time=get-date
- # EXCEL 对象
- $excel = New-Object -ComObject Excel.Application
- Start-Sleep -Seconds 2
- $Excel.Visible = $true
- if ( [Io.File]::Exists("$Pwd\Lottery.xlsx") ) { $Exist = $true } else { $Exist =$Null }
- if ( $Exist ) {
- $workbook = $excel.Workbooks.Open("$Pwd\Lottery.xlsx")
- $NewestName = $workbook.worksheets.Item(1).name
- $workbook.worksheets.Application.DisplayAlerts = $false
- $workbook.worksheets.Item(1).delete() #删除最近一年 追加下载
- $sheet = $workbook.worksheets.add()
- $sheet = $workbook.worksheets.Item(1)
- } else {
- $NewestName = 2003 #找不到数据则全新下载
- $workbook = $excel.Workbooks.add()
- $sheet = $workbook.worksheets.Item(1)
- }
- $borderWeight = "microsoft.office.interop.excel.xlBorderWeight" -as [type]
-
- Function ExcelMerge {
- Param ( [Array]$CSVData, [string]$year )
-
- $workbook.worksheets.Item(1).Name = $year
- $sheet = $workbook.worksheets.Item($year)
- #全局
- $sheet.Rows.HorizontalAlignment = 3
- $sheet.Rows.VerticalAlignment = 2
- $sheet.Columns.RowHeight = 16
- $RangeHeight = $sheet.Range("A1:B1")
- $RangeHeight.RowHeight = 24
- #合并单元格
- $CELL1 = $sheet.Cells.item(1,3)
- $CELL2 = $sheet.Cells.item(1,35)
- $CELLRange = $sheet.Range( $CELL1 , $CELL2 )
- $CELLRange.Merge()
- $CELL1 = $sheet.Cells.item(1,36)
- $CELL2 = $sheet.Cells.item(1,51)
- $CELLRange = $sheet.Range( $CELL1 , $CELL2 )
- $CELLRange.Merge()
- #标题
- $CELL1 = $sheet.Cells.item(1,1)
- $CELL2 = $sheet.Cells.item(1,51)
- $CELLRange = $sheet.Range( $CELL1 , $CELL2 )
- $CELLRange.Borders.Weight = $borderWeight::xlThin
- $CELLRange.Font.bold = $true
- $CELLRange.Font.Size = 13
- $CELLRange.Font.ColorIndex = 1
- $sheet.cells.item(1,1) = '期号'
- $sheet.cells.item(1,2) = '开奖日期'
- $sheet.cells.item(1,3) = '红球'
- $sheet.cells.item(1,36) = '兰球'
- #红球
- $CELL1 = $sheet.Cells.item(2,3)
- $CELL2 = $sheet.Cells.item($($CSVData.Count + 1),35)
- $CELLRange = $sheet.Range( $CELL1 , $CELL2 )
- $CELLRange.Borders.Weight=$borderWeight::xlHairLine
- $CELLRange.Font.ColorIndex = 3
- $CELLRange.Font.Size = 12
- $CELLRange.Columns.ColumnWidth = 2.5
- #兰球
- $CELL1 = $sheet.Cells.item(2,36)
- $CELL2 = $sheet.Cells.item($($CSVData.Count + 1),51)
- $CELLRange = $sheet.Range( $CELL1 , $CELL2 )
- $CELLRange.Borders.Weight=$borderWeight::xlHairLine
- $CELLRange.Font.ColorIndex = 5
- $CELLRange.Font.Size = 12
- $CELLRange.Font.bold = $true
- $CELLRange.Columns.ColumnWidth = 2.5
- #日期
- $CELL1 = $sheet.Cells.item(2,1)
- $CELL2 = $sheet.Cells.item($($CSVData.Count + 1),2)
- $CELLRange = $sheet.Range( $CELL1 , $CELL2 )
- $CELLRange.Borders.Weight = $borderWeight::xlThin
- $CELLRange.Font.Size = 12
- $CELLRange.Font.ColorIndex = 1
- $RangeWidth = $sheet.Range("A1:A2")
- $RangeWidth.EntireColumn.ColumnWidth = 8
- $RangeWidth = $sheet.Range("B1:B2")
- $RangeWidth.EntireColumn.ColumnWidth = 11
- #处理开奖数据
- $StartLine = 2
- for ( $line =0; $line -lt $CSVData.Count; $line++ ) {
- $EachCell = @($CSVData[$line].Split(','))
- for ( $v=1; $v -le $EachCell.Count; $v++ ) { $sheet.Cells.Item($StartLine,$v) = $EachCell[$($v -1)] }
- $StartLine++
- }
- if ( [int]$year -ne 2019 ) { $sheet = $workbook.worksheets.add() }
- }
-
- # 提取开奖数据
- Function DownLotteryData {
- Param ( [Array]$DATA, [string]$year, [int]$n )
-
- $Ball=$RedBlue=$arr=$array= New-Object "System.Collections.ArrayList"
- (1..52) |ForEach-Object {$Ball += 'A'} #期号+日期+33+16
-
- for ( $i = 0; $i -lt $DATA.count; $i++ ) {
- if ( $n -le 8 ) {
- $arr += $DATA[$i] #开奖数据
- if ($n -le 2) { $num = $n } else { $num = [int](([string]($DATA[$i])).TrimStart('0')) + 2 } #分布图
- $Ball[$num] = $DATA[$i] #分布图
- $n++
- } else {
- $arr += $DATA[$i] #开奖数据
- $num = [int](([string]($DATA[$i])).TrimStart('0')) + 35 #分布图
- $Ball[$num] = $DATA[$i] #分布图
- $RedBlue += ,$Ball #33+16分布图
- $array += ,$arr #开奖数据
- $arr = $Ball = New-Object "System.Collections.ArrayList" #开奖数据 分布图
- (1..52) |ForEach-Object {$Ball += 'A'} #33+16+日期
- $n = 1
- }
- }
- $content = $AllBall = New-Object "System.Collections.ArrayList"
- for ( $i = 0; $i -lt ( $DATA.count /9 ); $i++ ) {
- $Content += ( $array[$i] -join ' ' ).Replace('.','/') #开奖数据
- $AllBall += ( $RedBlue[$i] -join ',' ).Replace('.','_').Replace('A',' ') -replace '^(\s+)?,?','' #33+16分布图
- }
- $Content | Set-Content .\开奖数据\简略数据\$year.txt -enc Default -force #开奖数据
- ExcelMerge $AllBall $year
- }
- #下载数据
- new-item .\开奖数据\简略数据 -type Directory -force
- for ( $year = [int]$NewestName; $year -le 2019; $year++ ) {
- $url = "https://kjh.55128.cn/ssq-history-$year.htm"
- Invoke-WebRequest -uri $url -Outfile "$env:temp\Downh.log"
- $PageData = (( Get-Content "$env:temp\Downh.log" -ReadCount 0 -enc utf8 ) `
- -match '^(\s+)?<td>(\d{7}|\d{4}(\.\d{2}){2})</td>$|<li(\s+)?class.*>\d{2}</li>$' ) `
- -replace "(\s+)?</?(td)?(li)?((\s+)class='ball.*-24')?>",''
- DownLotteryData $PageData $year 1
- }
- Remove-Item "$env:temp\Downh.log" -force
-
- if ( $Exist ) { $workbook.Save() } else { $workbook.SaveAs("$Pwd\Lottery.xlsx") }
- $workbook.Close()
- $Excel.quit()
- $excel = $null
- [GC]::Collect()
- ([datetime]::Now -$time).totalmilliseconds
- pause
复制代码
欢迎光临 批处理之家 (http://www.bathome.net/) |
Powered by Discuz! 7.2 |