블로그 이미지
훅크선장

카테고리

분류 전체보기 (362)
사진이야기 (23)
펭귄컴퓨팅 (121)
컴퓨터보안 (84)
절름발이 프로그래머 (59)
하드웨어개조 (23)
멀알려줄까 (35)
홈베이킹&홈쿠킹 (2)
잡다한것들 (15)
Total
Today
Yesterday

달력

« » 2024.5
1 2 3 4
5 6 7 8 9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29 30 31

공지사항

태그목록

최근에 올라온 글

http://www.matsgefvert.se/blog/archives/651
에서 가져온 글입니다.

StringList상에 있는 동일한 문자열 개수 세기 샘플 코드와
StringList의 검색속도를 증가시키는 방법이 나와있습니다.
----------------------------------------------------------------------------------

Speeding Up Delphi’s TStringList.IndexOf
1 June 2009, 15:33 ? Music, Software Development
Delphi’s TStringList is one of the objects I love the most. If it’s sorted (StringList.Sorted := true) then you can use it to parse huge chunks of data quickly.

For instance, looping through an enormous amount of IP addresses and keeping count of how many times they appeared, is easily done using the following code (not compiled or checked for syntax errors):

ls := TStringList.Create;
ls.Sorted := true;
for ip in ipAddresses do begin
  n := ls.IndexOf(ip);
  if n = -1 then
    ls.AddObject(ip, TObject(1))
  else
    ls.Objects[n] := TObject(Integer(ls.Objects[n]) + 1);
end;
It’s very efficient. Since TStringList.IndexOf always does a binary search, it operates in log2(n) time, and using Objects as integers allows us to keep track of count without messing with the string data.

But there are things we can do to speed it up. For instance, TStringList.IndexOf relies on TStringList.Find, which itself uses AnsiCompareStr, which is a slow Windows call, taking locale and its mother into consideration. Overriding this with our own method should be worthwhile. (The code below is adapted straight from the Classes unit.)

type
  TStringListEx = class(TStringList)
  public
    function Find(const S: string; var Index: Integer): Boolean; override;
  end;

function TStringListEx.Find(const S: string; var Index: Integer): Boolean;
var
  L, H, I, C: Integer;
begin
  Result := False;
  L := 0;
  H := Count - 1;
  while L <= H do
  begin
    I := (L + H) shr 1;
    C := CompareStr(Get(I), S);
    if C < 0 then L := I + 1 else
    begin
      H := I - 1;
      if C = 0 then
      begin
        Result := True;
        if Duplicates <> dupAccept then L := I;
      end;
    end;
  end;
  Index := L;
end;
We’ve replaced AnsiCompareStr with Delphi’s own CompareStr, which is a highly optimized FastCode routine. There are some drawbacks ? things will always be sorted in byte order and no case-sensitivity is done. But we don’t care about this ? it can always be done afterwards; right now, speed is the main importance.

And it turns out that using the above code, in pure examples, can slash execution time with up to about 80%. Dramatic savings, indeed. In my own example, where I analyze ftp log data, I managed to cut execution time on 122 MB of data down from 7 seconds down to 3.1 seconds.

Best of all, since TStringList.Find is declared virtual, we don’t need to change any types anywhere, just do a TStringListEx.Create instead of a TStringList.Create and you’re good to go.




Posted by 훅크선장
, |