This post was last edited by lxzzr on 2014-7-12 19:54
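The LD algorithm (Levenshtein Distance) is also known as the edit distance algorithm (Edit Distance). It turns string A into another string B by inserting, deleting, and substituting characters; the minimum number of such operations measures how different the two strings are.
In plain terms, it compares how similar two strings are.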
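The description above can be written as a recurrence. The notation here is an assumption for illustration and does not come from the original post: $a_i$ is the $i$-th character of A, $b_j$ is the $j$-th character of B, and $D_{i,j}$ is the distance between the first $i$ characters of A and the first $j$ characters of B.

```latex
\begin{aligned}
D_{i,0} &= i, \qquad D_{0,j} = j, \\
D_{i,j} &= \min
\begin{cases}
D_{i-1,j} + 1 & \text{(delete } a_i\text{)} \\
D_{i,j-1} + 1 & \text{(insert } b_j\text{)} \\
D_{i-1,j-1} + [a_i \neq b_j] & \text{(substitute, free if the characters match)}
\end{cases}
\end{aligned}
```

For A = "test" and B = "est", deleting the leading "t" is enough, so the distance is 1.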
http://www.enun.net/?p=2442

' Matrix for A = "test" (columns) and B = "est" (rows)
'======================
'      A   t e s t
'  B   0   1 2 3 4
'
'  e   1   1 1 2 3
'
'  s   2   2 2 1 2
'
'  t   3   2 3 2 1
'======================

' Prints the similarity 1 - 1/4 = 0.75, formatted to three decimals
Wscript.Echo GetLevenshteinDistince("test", "est")

'==============================================================
' Copyright (c) enun-net. All rights reserved.
' ScriptName: GetStrLD.vbs
' Creation Date: 10/11/2013
' Last Modified: 10/11/2013
' Author: 0x22e09
' Homepage: www.enun.net
' E-mail: 0x22e09@sina.com
' Description: Levenshtein Distance.
'==============================================================

Function GetLevenshteinDistince(str1, str2)
    Dim x, y, A, B, C, K
    Dim Matrix()
    ReDim Matrix(Len(str2), Len(str1))

    ' Initialize the first row and the first column
    For x = 0 To UBound(Matrix, 1)
        Matrix(x, 0) = x
    Next
    For y = 0 To UBound(Matrix, 2)
        Matrix(0, y) = y
    Next

    ' Fill the matrix
    For x = 1 To UBound(Matrix, 1)
        For y = 1 To UBound(Matrix, 2)
            ' C: substitution cost (free when the two characters match)
            If Mid(str1, y, 1) = Mid(str2, x, 1) Then
                C = Matrix(x - 1, y - 1)
            Else
                C = Matrix(x - 1, y - 1) + 1
            End If

            ' A and B: insertion or deletion of one character
            A = Matrix(x - 1, y) + 1
            B = Matrix(x, y - 1) + 1

            ' Keep the smallest of the three costs
            If (A <= B And A <= C) Then Matrix(x, y) = A
            If (B <= C And B <= A) Then Matrix(x, y) = B
            If (C <= A And C <= B) Then Matrix(x, y) = C
        Next
    Next

    ' Convert the distance into a similarity:
    ' 1 - distance / length of the longer string
    If (Len(str1) > Len(str2)) Then
        K = Len(str1)
    Else
        K = Len(str2)
    End If

    ' Two empty strings are identical; this also avoids dividing by zero
    If K = 0 Then
        GetLevenshteinDistince = FormatNumber(1, 3, True)
        Exit Function
    End If

    GetLevenshteinDistince = FormatNumber(1 - (Matrix(Len(str2), Len(str1)) / K), 3, True)
End Function
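To cross-check the VBScript function on a machine without Windows Script Host, here is a minimal Python sketch of the same matrix fill. The names `levenshtein` and `similarity` are hypothetical and not part of the original script. The raw distance and the similarity score are kept separate because the VBScript function returns only the similarity, 1 - distance / length of the longer string. Returning 1.0 for two empty strings is an assumption of this sketch.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance between a and b using a full matrix."""
    rows, cols = len(b) + 1, len(a) + 1
    matrix = [[0] * cols for _ in range(rows)]

    # First column and first row: distance to or from the empty string.
    for x in range(rows):
        matrix[x][0] = x
    for y in range(cols):
        matrix[0][y] = y

    for x in range(1, rows):
        for y in range(1, cols):
            # Substitution is free when the two characters match.
            cost = 0 if a[y - 1] == b[x - 1] else 1
            matrix[x][y] = min(
                matrix[x - 1][y] + 1,         # one extra character of b
                matrix[x][y - 1] + 1,         # one extra character of a
                matrix[x - 1][y - 1] + cost,  # substitution or match
            )
    return matrix[rows - 1][cols - 1]


def similarity(a: str, b: str) -> float:
    """Return 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        # Assumption: two empty strings count as identical.
        return 1.0
    return 1 - levenshtein(a, b) / longest


if __name__ == "__main__":
    print(levenshtein("test", "est"))            # 1
    print(f"{similarity('test', 'est'):.3f}")    # 0.750
```

The distance of 1 for "test" and "est" matches the bottom-right cell of the matrix in the post.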